1. Introduction
2. Methodology of ANN-based prediction models
2.1. Dataset generation
which is a translation by an amount of the profile proposed by Liu et al. [20], and a soliton-type profile
recently proposed by Aydın [42]. It is noted that both and are expressed in terms of the distorted spatial variable introduced by Liu et al. [20]. Here is the aspect ratio of the landslide and is the beach slope angle, see the definition sketch provided in the Appendix (Fig. 10). The hyperbolic bottom profile in Eq. (2) is the well-known solitary wave and its shape is very similar to that of the Gaussian profile (1); both profiles assume the same maximum amplitude, but the hyperbolic profile is broader, occupying a larger area by approximately 10% (Fig. 1).
Fig. 1 Comparison of the bottom profiles given in Eqs. (1) and (2), respectively, at initial time , (a) on the slope as functions of , and (b) on a horizontal plane as functions of . The initial slide submergence is chosen as .
Fig. 2 Dependence of run-up on (left) landslide horizontal length for a fixed slide thickness of m, and (right) slide thickness for a fixed slide length of m. Results for three different values of the initial slide submergence ( , 3, 6) are presented and the maximum run-ups are also indicated by colored circles. The slope angle is chosen as .
2.2. Fundamentals of ANNs
2.2.1. General structure
Fig. 3 Architecture of an MLP network model.
2.2.2. Feature selection
2.2.3. K-fold cross-validation
Fig. 4 Implementation of the 10-fold cross-validation algorithm. For each of the 10 test sets, 10-fold cross-validation is employed on the training set. The accuracy of each prediction is the average over the folds.
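The nested procedure described in the caption of Fig. 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: a least-squares linear fit stands in for the MLP so that the example stays self-contained, the data are synthetic, and the names `kfold_indices`, `nested_cv`, `outer_k` and `inner_k` are hypothetical. The number of inner folds is left as a parameter.

```python
import numpy as np

def kfold_indices(n_samples, k, rng):
    """Shuffle the indices 0..n_samples-1 and split them into k nearly equal folds."""
    return np.array_split(rng.permutation(n_samples), k)

def fit_linear(X, y):
    """Stand-in model (assumption, NOT the paper's MLP): least-squares fit with intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def nested_cv(X, y, outer_k=10, inner_k=10, seed=0):
    """Return (mean inner validation MSE, mean outer test MSE)."""
    rng = np.random.default_rng(seed)
    outer_folds = kfold_indices(len(X), outer_k, rng)
    val_scores, test_scores = [], []
    for i, test_idx in enumerate(outer_folds):
        # All samples outside the current test fold form the training set.
        train_idx = np.concatenate([f for j, f in enumerate(outer_folds) if j != i])
        # Inner cross-validation on the training set only.
        inner_folds = kfold_indices(len(train_idx), inner_k, rng)
        inner_scores = []
        for m, val_pos in enumerate(inner_folds):
            fit_pos = np.concatenate([f for n, f in enumerate(inner_folds) if n != m])
            fit_idx, val_idx = train_idx[fit_pos], train_idx[val_pos]
            coef = fit_linear(X[fit_idx], y[fit_idx])
            inner_scores.append(mse(y[val_idx], predict_linear(coef, X[val_idx])))
        val_scores.append(np.mean(inner_scores))  # average over the inner folds
        # Refit on the full training set and score the held-out test fold.
        coef = fit_linear(X[train_idx], y[train_idx])
        test_scores.append(mse(y[test_idx], predict_linear(coef, X[test_idx])))
    return float(np.mean(val_scores)), float(np.mean(test_scores))

# Synthetic demonstration data (hypothetical, unrelated to the landslide dataset).
rng_demo = np.random.default_rng(42)
X_demo = rng_demo.normal(size=(200, 3))
y_demo = X_demo @ np.array([1.0, 2.0, 3.0]) + 0.5 + rng_demo.normal(scale=0.01, size=200)
val_mse, test_mse = nested_cv(X_demo, y_demo)
print(f"inner validation MSE: {val_mse:.2e}, outer test MSE: {test_mse:.2e}")
```

Keeping the test fold out of the inner loop means that the reported test error is never used for model selection, which is the purpose of nesting the two loops.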
2.2.4. Performance metrics
2.3. Implementation of prediction models
Fig. 5 (a) Flowchart showing implementation steps of the MLP-based prediction models. (b) Flowchart showing implementation steps of the feature selection algorithm.
3. Results and discussion
Fig. 6 Variation of average MSE with the dataset size.
Fig. 7 Scatter plots of the output variable ( ) for (a) the whole dataset of 54,952 rows, and (b) the 9K dataset. The predictors from top to bottom are the initial slide submergence, the maximum slide thickness, the maximum slide length, and the time of the maximum run-up.
Table 1 Descriptive statistics of the variables in the 9K dataset except for the bottom profile code. The means and the standard deviations for the whole dataset are also provided.
Variable | Minimum (9K) | Maximum (9K) | Mean (9K) | Mean (whole data) | Standard deviation (9K) | Standard deviation (whole data) |
---|---|---|---|---|---|---|
(m) | 15.29 | 2999.84 | 836.48 | 841.70 | 680.30 | 684.05 |
(degrees) | 2 | 20 | 9.86 | 9.84 | 5.87 | 5.88 |
(m) | 0.76 | 65.57 | 18.33 | 18.42 | 12.58 | 12.60 |
(m) | 10.14 | 499.99 | 218.94 | 220.17 | 131.23 | 131.29 |
(sec) | 7.42 | 292.51 | 72.63 | 73.03 | 39.25 | 39.43 |
(m) | 0.50 | 24.99 | 9.64 | 9.69 | 6.84 | 6.85 |
Table 2 Predictors of the 9K dataset are ranked according to their ReliefF scores (left), based on which six models (M6-M1) are constructed using backward feature elimination (right).
Predictor | ReliefF score | Model | Features |
---|---|---|---|
-0.0042 | M6 | , , , , , | |
-0.0016 | M5 | , , , , | |
-0.0010 | M4 | , , , | |
0 | M3 | , , | |
0.0011 | M2 | , | |
0.0268 | M1 |
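The construction of the nested feature sets in Table 2 can be illustrated with a short sketch. The ReliefF scores are taken from Table 2. The predictor names `p1` to `p6` are hypothetical placeholders. The sketch also assumes that the lowest-scored remaining predictor is dropped at each elimination step, which the table itself does not confirm.

```python
# ReliefF scores from Table 2; the keys p1..p6 are hypothetical placeholder names.
relieff_scores = {
    "p1": -0.0042,
    "p2": -0.0016,
    "p3": -0.0010,
    "p4": 0.0,
    "p5": 0.0011,
    "p6": 0.0268,
}

def backward_elimination_sets(scores):
    """Build nested feature sets M_n, ..., M_1.

    Assumption: at each step the predictor with the lowest score among
    those remaining is removed, so M_k keeps the k highest-scored predictors.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # highest score first
    return {f"M{n}": ranked[:n] for n in range(len(ranked), 0, -1)}

models = backward_elimination_sets(relieff_scores)
for name, features in models.items():
    print(name, features)
```

Each model is then trained and evaluated separately, so the loss in accuracy caused by removing a predictor can be read off by comparing consecutive models.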
Fig. 8 Box-whiskers plot for the descriptive statistics of the 9K dataset.
Table 3 The performance metrics (mean ± standard deviation of the 10 folds, 4 subfolds and 50 runs) of the models M6-M1 with predictors given in Table 2.
Model | RMSE (m) | MAE (m) | MAPE (%) | R |
---|---|---|---|---|
M6 | 0.051 ± 0.005 | 0.037 ± 0.004 | 1.109 ± 0.095 | 0.999 |
M5 | 0.055 ± 0.005 | 0.039 ± 0.004 | 1.194 ± 0.115 | 0.999 |
M4 | 1.425 ± 0.009 | 0.986 ± 0.009 | 14.082 ± 0.179 | 0.978 |
M3 | 1.581 ± 0.002 | 1.123 ± 0.002 | 14.300 ± 0.091 | 0.973 |
M2 | 1.774 ± 0.002 | 1.249 ± 0.002 | 15.380 ± 0.093 | 0.966 |
M1 | 1.851 ± 0.002 | 1.292 ± 0.003 | 15.817 ± 0.145 | 0.963 |
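For reference, the four metrics reported in the table can be computed as in the sketch below. The standard textbook definitions are assumed here, and R is assumed to denote the Pearson correlation coefficient between observed and predicted values. The function names `regression_metrics` and `summarize_runs` are hypothetical, and the sample values are illustrative only.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """RMSE, MAE, MAPE (%) and Pearson R, using standard definitions (assumed)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        # MAPE is undefined when an observed value is zero.
        "MAPE": float(100.0 * np.mean(np.abs(err / y_true))),
        "R": float(np.corrcoef(y_true, y_pred)[0, 1]),
    }

def summarize_runs(run_metrics):
    """Mean and population standard deviation (ddof=0) of each metric over runs."""
    keys = run_metrics[0].keys()
    return {
        k: (float(np.mean([m[k] for m in run_metrics])),
            float(np.std([m[k] for m in run_metrics])))
        for k in keys
    }

# Illustrative values only, not data from this study.
example = regression_metrics([1.0, 2.0, 4.0], [1.1, 1.9, 4.2])
summary = summarize_runs([{"RMSE": 1.0}, {"RMSE": 3.0}])
print(example)
print(summary)
```

RMSE and MAE carry the unit of the output variable (m), whereas MAPE and R are dimensionless, which is why the two groups respond differently to the scale of the run-up values.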
Fig. 9 Performance metrics of the models as tabulated in Tables 3 and 4.
Table 4 The performance metrics (mean ± standard deviation of the 50 runs) of the models M6-M1 for the 1K independent dataset.
Model | RMSE (m) | MAE (m) | MAPE (%) | R |
---|---|---|---|---|
M6 | 0.052 ± 0.004 | 0.037 ± 0.003 | 1.155 ± 0.087 | 1.000 |
M5 | 0.059 ± 0.004 | 0.040 ± 0.003 | 1.360 ± 0.094 | 0.999 |
M4 | 1.380 ± 0.007 | 0.952 ± 0.007 | 14.016 ± 0.171 | 0.980 |
M3 | 1.520 ± 0.002 | 1.083 ± 0.001 | 14.341 ± 0.067 | 0.976 |
M2 | 1.743 ± 0.002 | 1.218 ± 0.001 | 15.570 ± 0.086 | 0.968 |
M1 | 1.837 ± 0.003 | 1.262 ± 0.001 | 16.096 ± 0.098 | 0.964 |
4. Conclusions
Declaration of Competing Interest
Appendix A. The analytical model and its limitations
Fig. 10 Definition sketch for the landslide problem (not to scale). and respectively indicate the maximum vertical thickness and the maximum horizontal length of the sliding mass at its initial location ( ). The undisturbed water depth is , where is the beach angle with the horizontal. SWL stands for the still water level. (Modified from [42]).