Hydrological disasters account for the largest share of disasters worldwide, and in 2016 Asia accounted for 41% of global flood disasters. Jammu and Kashmir is one of the most flood-prone regions of the Indian Himalayas. In the 2014 floods, approximately 268 people died and 168,004 houses were damaged. Pulwama, Srinagar, and Bandipora districts were severely affected, with 102, 100, and 148 km² submerged, respectively. Early Warning Systems (EWS) were developed to predict such events and warn people before they occur. An EWS improves community preparedness; it does not prevent floods, but it substantially reduces the loss of life and property. This research work proposes a flood monitoring and early warning system composed of base stations and a control center. Each base station comprises a sensing module and a processing module; it makes a localised prediction of the water level and transmits the predicted results and measured data to the control center. The control center uses a hybrid of an Adaptive Neuro-Fuzzy Inference System (ANFIS) model and a supervised machine learning technique, the Linear Multiple Regression (LMR) model, for water level prediction. The hybrid system achieved an accuracy of 93.53% for daily predictions and 99.91% for hourly predictions.

Floods are the most damaging disasters in terms of property and life, and they occur more frequently worldwide than any other type of natural disaster (Ahern et al. 2005; Kiran et al. 2019). Floods are influenced by many factors, such as precipitation, snowmelt, land use/land cover, and built-up area (Kim et al. 2009). Asia is the most affected of all continents (Cavallo & Noy 2011;

Table 1. Large-scale flood events in India over the last two decades

Event Year | Region | Casualties | Source |
---|---|---|---|
2004 | Bihar | 885 | Chandran et al. (2006) |
2005 | Maharashtra | 1,187 | Singh, O. & Kumar, M. (2013) |
2007 | Bihar | 1,287 | Kumar et al. (2013) |
2008 | Bihar | 434 | Bhatt et al. (2010) |
2010 | Leh | 255 | Ashrit R. (2010) |
2012 | Assam | 36 | Pal et al. (2013) |
2013 | Bihar | 201 | Kansal et al. (2016) |
2013 | Uttarakhand | 5,700 | Rafiq et al. (2019) |
2014 | Jammu & Kashmir | 268 | Mishra et al. (2015) |
2015 | Chennai | 400 | Seenirajan et al. (2017) |
2017 | Bihar | 294 | Singh S. (2018) |
2017 | West Bengal | 39 | Singh S. (2018) |
2018 | Kerala | 445 | Vishnu et al. (2019) |
2019 | Bihar | 139 | Mishra et al. (2019b) |

The Jhelum River is the main river of the Kashmir division and runs along its entire length of 140 km. Most towns and villages are located on its banks. The width of the Jhelum varies between 69 and 113 m from Sangam to Ram Munshibagh (Romshoo et al. 2018). The most flood-affected districts of Kashmir are Anantnag, Pulwama, Srinagar, and Bandipora. The valley currently has no early flood warning system, and flood monitoring is done manually (Fig. 1). The study area covers 8,603 km² and contains 14 catchments; tributaries draining from the Pir Panjal range join the river on the left bank, and tributaries from the Himalayan range join it on the right bank (Bhatt et al. 2017), as shown in Fig. 2. The Sandran, Bringi, Arapath, Lidder, Vaishow, Rambiara, Watalara, Aripal, Sasara, and Romushi tributaries join the Jhelum in Anantnag and Pulwama districts and contribute substantially to its flow (Fig. 2). For this reason, the Sangam gauge station was taken into consideration, because after Kakapora village no other tributary joins the Jhelum up to Srinagar.

Fig. 1. Manual water level monitoring on river Jhelum

Fig. 2. Watersheds of Kashmir valley with tributaries draining into river Jhelum

Daily precipitation and temperature data for the 30 years from 1980 to 2010 were obtained from the India Meteorological Department for three meteorological stations: Pahalgam, Kokernag, and Qazigund. Daily water-level data from 1980 to 2019 for the Jhelum at three gauging stations, Sangam, Ram Munshibagh, and Asham, were acquired from the Department of Irrigation and Flood Control, Jammu and Kashmir. The watershed of the study area was generated from the SRTM Digital Elevation Model (DEM) using ArcGIS software.

The system has four base stations. Three of them are equipped with wireless sensors to measure different parameters; at the fourth, Sangam, no sensors are used and data are acquired from the meteorological station. All three WSN-equipped base stations have the same architecture (Fig. 3). The system relies on wireless communication rather than the internet for data transmission because of the volatile situation in the Kashmir valley, which experiences frequent internet blockades imposed by the government for security reasons (Iqbal 2017).

Fig. 3. Architecture of base station

ANFIS is a multilayer feed-forward network in which each node (neuron) performs a specific operation on its incoming signals. It is a Takagi-Sugeno model with five layers whose structure is determined by the inputs, the membership functions, and the derived rules. The efficiency of the model depends on the number of epochs (iterations in the learning phase) and the type of membership function. ANFIS employs "if-then" rules (Jang 1993), described below for a first-order model with two fuzzy rules (Younes et al. 2015).

Rule 1: If $x_1$ is $A_1$ and $x_2$ is $B_1$, then $f_1 = p_1 x_1 + q_1 x_2 + r_1$

Rule 2: If $x_1$ is $A_2$ and $x_2$ is $B_2$, then $f_2 = p_2 x_1 + q_2 x_2 + r_2$

where $A_i$ and $B_i$ denote linguistic grades such as "Low" or "Less", and $p_i$, $q_i$, and $r_i$ ($i$ = 1, 2) are the parameters of the rule outputs. The Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Determination (R²) were used to evaluate the performance of the model (Antanasijevi et al. 2013).
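The two rules above can be evaluated as in the following minimal sketch. It assumes triangular membership functions and a product operator for the "and" in each rule premise; the function names, the triangle corner points, and the consequent parameters in the usage example are hypothetical and are not taken from the trained model.

```python
def triangular(x, a, b, c):
    """Triangular membership grade with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def sugeno_two_rules(x1, x2, premises, consequents):
    """First-order Takagi-Sugeno output for the two rules.

    premises:    [(A1, B1), (A2, B2)], each label an (a, b, c) triangle.
    consequents: [(p1, q1, r1), (p2, q2, r2)].
    """
    weights, outputs = [], []
    for (a_label, b_label), (p, q, r) in zip(premises, consequents):
        # Firing strength of the rule: product of the two membership grades.
        weights.append(triangular(x1, *a_label) * triangular(x2, *b_label))
        # Rule output f_i = p_i * x1 + q_i * x2 + r_i.
        outputs.append(p * x1 + q * x2 + r)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    # Normalise the firing strengths and sum the weighted rule outputs.
    return sum(w * f for w, f in zip(weights, outputs)) / total


# Hypothetical labels and parameters, for illustration only.
premises = [((0, 5, 10), (0, 5, 10)), ((5, 10, 15), (5, 10, 15))]
consequents = [(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)]
print(sugeno_two_rules(7.5, 7.5, premises, consequents))  # 15.5
```

At the input (7.5, 7.5) both rules fire with equal strength, so the output is the average of the two rule outputs, 15 and 16.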

The first step is to select the input variables most relevant to the desired output. A total of 120 tests were run to find the optimal number of inputs, which turned out to be four. The best type of membership function was then determined by testing eight types of membership function with the hybrid training algorithm, keeping the number of epochs and the number of membership functions constant at 40 and 3, respectively. The triangular membership function gave the best results (Table 2) and was selected for further tuning. To finalize the model structure, the number of membership functions was varied from 3 to 6 with the epochs held constant at 40; the tests showed that the optimum number of membership functions is four. In each case, the best configuration was selected on the basis of the smallest training and testing RMSE.

Table 2. Membership functions with respective test values

Function Type | RMSE (train) | RMSE (test) | MAE (train) | MAE (test) | R-sq (train) | R-sq (test) |
---|---|---|---|---|---|---|
Triangular membership function (trimf) | 2.1357 | 3.3613 | 0.286 | 0.561 | 0.997 | 0.993 |
Trapezoidal membership function (trapmf) | 2.2750 | 3.6257 | 0.317 | 0.601 | 0.991 | 0.987 |
Generalized bell membership function (gbellmf) | 2.1485 | 7.4029 | 0.292 | 1.013 | 0.963 | 0.959 |
Gaussian membership function (gaussmf) | 2.1556 | 4.2192 | 0.307 | 0.843 | 0.981 | 0.976 |
Gaussian combination membership function (gauss2mf) | 2.2727 | 4.5048 | 0.319 | 0.775 | 0.978 | 0.969 |
Pi-shaped membership function (pimf) | 2.3343 | 3.8190 | 0.395 | 0.741 | 0.981 | 0.975 |
Difference between two sigmoidal membership functions (dsigmf) | 2.2188 | 7.9834 | 0.296 | 1.107 | 0.985 | 0.905 |
Product of two sigmoidal membership functions (psigmf) | 2.2201 | 7.8919 | 0.399 | 1.091 | 0.957 | 0.893 |
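The selection criterion can be checked directly against the RMSE columns of Table 2. The sketch below ranks the eight candidates by testing RMSE and uses training RMSE to break ties; that tie-breaking order and the helper name are assumptions made for illustration.

```python
# (function type, RMSE train, RMSE test), copied from Table 2.
TABLE_2 = [
    ("trimf", 2.1357, 3.3613),
    ("trapmf", 2.2750, 3.6257),
    ("gbellmf", 2.1485, 7.4029),
    ("gaussmf", 2.1556, 4.2192),
    ("gauss2mf", 2.2727, 4.5048),
    ("pimf", 2.3343, 3.8190),
    ("dsigmf", 2.2188, 7.9834),
    ("psigmf", 2.2201, 7.8919),
]


def best_membership_function(rows):
    """Return the function type with the smallest test RMSE (ties: smallest train RMSE)."""
    return min(rows, key=lambda row: (row[2], row[1]))[0]


print(best_membership_function(TABLE_2))  # trimf
```

The triangular function has the smallest RMSE in both columns, so the result does not depend on the order in which the two criteria are applied.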

Flood monitoring and early warning start at the sensing module. The CS475A radar sensor, which is suited to rough outdoor conditions, calculates the distance between the sensor and the water surface by measuring the time elapsed between the emission and return of pulses. These data, along with data from the self-emptying tipping-bucket rain gauge, the temperature sensor, and the magnetic Hall-effect water-flow sensor, are passed to the microcontroller. An Arduino Mega 2560 board, which has 54 digital I/O pins and an ATmega2560 microcontroller, sends the sensor data to the processing module of the base station via the HC-12 communication module. The processing module then makes a localised prediction of the future water level at the base station by finding a correlation between the rainfall and the water level.

Where L is the river water level, Q is the lag time, R is the rainfall, and α is the coefficient that describes the correlation between water level and rainfall. The rise in water level is directly proportional to the rainfall intensity. The data acquired through the sensors are transmitted to the control center via an SX1272 LoRa module, where they are received by another SX1272 LoRa module and stored in the database. These data are later used to retrain the model. For accurate water level prediction, the system uses weather forecasts from the IMD (India Meteorological Department). The trained ANFIS model at the control center takes the forecast values as inputs and outputs the predicted water level at Sangam. Because forecasts are available at both hourly and daily resolution, the model can predict water levels at either time scale. Once the water level predictions for Sangam, Kakapora, Pampore, and Ram Munshibagh are available, the Multiple Linear Regression model takes these predicted values as input and generates the future water level at Ram Munshibagh as output, which can be expressed as:

$$B_4 = \beta_0 + \beta_1 B_1 + \beta_2 B_2 + \beta_3 B_3$$

where $B_4$ is the response variable, $B_1$, $B_2$, and $B_3$ are the independent variables, and $\beta_i$ ($i$ = 0, 1, 2, 3) are the regression coefficients. The summary of the model is shown in Table 3.

Table 3. Summary of LMR model

S | R-sq | R-sq(adj) | R-sq(pred) |
---|---|---|---|
0.180179 | 98.82% | 98.78% | 98.68% |

The predicted water level is then compared against the five warning levels (Table 4) to determine the likelihood and intensity of a flood. The interval between sensor readings depends on the rainfall intensity and the water level. It starts at 15 minutes and decreases with each increase in warning level, as given by Eq. (7).

$$T_j = T_j - k_i\,\Delta t \qquad (7)$$

where $T_j$ is the time interval between two consecutive measurements, $\Delta t$ is the increment unit, and $k_i \in \{0, 1, 2, 3, 4, 5\}$ is the warning level. The interval between two measurements therefore decreases as the warning level increases, so measurements are more frequent, and predictions more accurate, when the flood risk is highest.

The system has five warning levels: Normal, High, Very High, Critical, and Flood. The measurement intervals for these warning levels are shown in Table 4.

Table 4. Warning levels and corresponding time intervals

Level 1 | Level 2 | Level 3 | Level 4 | Level 5 |
---|---|---|---|---|
Normal | High | Very High | Critical | Flood |
15 minutes | 10 minutes | 5 minutes | 1 minute | 1 minute |
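The scheduling logic implied by Table 4 can be sketched as a lookup from warning level to sampling interval. The intervals below are copied from Table 4; the water-level thresholds that separate the warning levels are not given in the text, so the values in `EXAMPLE_THRESHOLDS_M` are purely hypothetical placeholders, as are the function names.

```python
# Sampling interval in minutes for each warning level, copied from Table 4.
SAMPLING_INTERVAL_MIN = {
    "Normal": 15,
    "High": 10,
    "Very High": 5,
    "Critical": 1,
    "Flood": 1,
}

# HYPOTHETICAL thresholds (metres), sorted ascending; not taken from the paper.
EXAMPLE_THRESHOLDS_M = [
    ("Normal", 0.0),
    ("High", 3.0),
    ("Very High", 4.0),
    ("Critical", 5.0),
    ("Flood", 6.0),
]


def classify(level_m, thresholds):
    """Return the highest warning level whose threshold the water level has reached."""
    name = thresholds[0][0]
    for candidate, minimum in thresholds:
        if level_m >= minimum:
            name = candidate
    return name


def next_sampling_interval(level_m, thresholds):
    """Minutes to wait before the next sensor reading at this predicted level."""
    return SAMPLING_INTERVAL_MIN[classify(level_m, thresholds)]


print(next_sampling_interval(4.5, EXAMPLE_THRESHOLDS_M))  # 5
```

A table lookup is used here instead of evaluating an update formula because Table 4 fixes the interval at 1 minute for both of the two highest levels.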

The proposed system was used to predict the water level of the Jhelum at Srinagar. The ANFIS model was used to predict the water level at Sangam (base station B1) because all the data needed to develop the model were available there. The developed model has 256 rules, four membership functions for each input parameter, and one output function. Its performance was evaluated using RMSE, MAE, and R². The model achieved an RMSE of 0.4306 (training) and 0.6109 (testing), an MAE of 0.0623 (training) and 0.0783 (testing), and an R² of 0.972 (training) and 0.966 (testing). These results were reached at epoch 230; further epochs brought no improvement. The predicted and measured values were almost equal, with residuals falling within ±1. The accuracy was 93.53% for daily predictions and 99.91% for hourly predictions (Figs. 5 and 6).
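For reference, the three evaluation measures follow their standard definitions, sketched below in plain Python. The last helper reproduces the rule count under the assumption that the rule base is a full grid over the inputs, which is consistent with four inputs and four membership functions per input giving 256 rules. The function names and the sample values in the usage line are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean square error."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def grid_rule_count(n_inputs, n_membership_functions):
    """Number of rules when every combination of input labels forms one rule."""
    return n_membership_functions ** n_inputs


# Hypothetical water levels, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(rmse(measured, predicted), mae(measured, predicted))  # 0.5 0.25
print(grid_rule_count(4, 4))  # 256
```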

Fig. 4. Flowchart of the proposed system

Fig. 5. Hourly prediction results

Fig. 6. Daily prediction results for 12 days

The system is more accurate for short-term predictions than for long-term ones, and its accuracy depends on the accuracy of the forecast precipitation and temperature values.

Table 5. Coefficients of regression model

Term | Coef | SE Coef | T-Value | P-Value | VIF |
---|---|---|---|---|---|
Constant | 1.771 | 0.215 | 8.24 | 0.038 | |
B1 | 0.0448 | 0.0228 | 20.35 | 0.015 | 3.36 |
B2 | 0.1440 | 0.0791 | 22.57 | 0.014 | 3.39 |
B3 | 0.9260 | 0.0241 | 38.35 | 0.008 | 4.27 |
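As a worked illustration, the fitted coefficients in Table 5 can be applied as the constant plus a weighted sum of B1, B2, and B3. The coefficients below are copied from Table 5; the function name and the input water levels in the usage line are hypothetical.

```python
# Coefficients copied from Table 5.
CONSTANT = 1.771
COEFFICIENTS = (0.0448, 0.1440, 0.9260)  # for B1, B2, B3 respectively


def predict_b4(b1, b2, b3):
    """Predicted response B4 from the three independent variables."""
    return CONSTANT + sum(c * b for c, b in zip(COEFFICIENTS, (b1, b2, b3)))


# Hypothetical upstream predictions, for illustration only.
print(round(predict_b4(10.0, 10.0, 10.0), 3))  # 12.919
```

With these coefficients, B3 dominates the prediction: a unit change in B3 shifts the output by 0.926, compared with 0.0448 for B1 and 0.144 for B2.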

In Fig. 7, the probability plot of the residuals approximately follows a straight line with few outliers. The residuals-versus-fits plot shows no recognizable pattern: the residuals are randomly distributed on both sides of 0. The histogram shows the distribution of the residuals for all observations and reveals only two outliers. The residuals-versus-order plot displays the residuals in the order in which the data were collected; it shows no trends or patterns, indicating that the residuals are not correlated with one another.

Fig. 7. Residual plots