## Abstract

Sediment load prediction is very important in the planning, operation, and maintenance of water structures located on rivers. Sediment loads exhibit random characteristics due to the uncertain nature of sediment transportation in rivers. To predict suspended sediment load, stochastic processes, regression methods, neural network models, and fuzzy logic have been used in the literature so far. The purpose of this study was to develop a model that can make accurate predictions of suspended sediment loads. Here, a combination of wavelet and fuzzy logic techniques (WFL) is proposed as a new technique to model the behavior of sediment load. While the wavelet method is able to decompose the original series into its sub-bands, the fuzzy logic method can be used as a predictive model for each sub-band. Significant power at specific intervals can be detected from average wavelet spectra. The WFL was compared with the stand-alone fuzzy logic approach using the Nash–Sutcliffe Sufficiency Score (NSSS), mean absolute error (MAE), and square of the correlation coefficient (R^{2}) as performance indicators. The results of the study show that the WFL provides a considerable improvement over the stand-alone fuzzy logic approach in the prediction of sediment load.

- fuzzy logic
- hydraulic structures
- prediction
- suspended sediment load
- wavelets

## INTRODUCTION

The amount of sediment load is an important input to determine the dimensions of hydraulic structures such as dams, diversion weirs, and settling basins. To determine the sediment deposition properly, it is necessary to predict suspended sediment loads in rivers. The sediment transportation is also closely related to basin erosion.

Modeling suspended sediment load in rivers is one of the most complex problems in hydrology. The variables used in modeling involve various uncertainties. Several processes that generate suspended sediment load in rivers, such as basin erosion and river bed motion, increase the complexity of the sediment prediction problem. To predict the suspended sediment load in rivers, basin parameters such as area and slope are used together with meteorological and hydrological variables such as precipitation and discharge, respectively. Many sediment prediction models have been proposed in the literature. These include parametric models (Vansickle & Beschta 1983), regression models (Sinnakaudan *et al.* 2006), artificial neural network (ANN) models (Nagy *et al.* 2002), fuzzy logic models (Kisi *et al.* 2006; Rajaee *et al.* 2009), genetic-Kalman filtering (Altunkaynak 2010), and support vector machines (Kisi 2012; Lafdani *et al.* 2013).

Hydrological processes can be modeled using empirical (black-box), conceptual (grey-box), and physically based (white-box) methods. Black-box models involve mathematical equations derived not from the physical processes, but from the analyses of concurrent input and output time series. Black-box models can be further divided into three main groups: empirical hydrological methods, statistically based methods, and hydroinformatics-based methods (Abbott & Refsgaard 1996). Compared with physically based sediment models, statistically based models, such as time series modeling approaches, have been shown to be easy to use and to give satisfactory results. Rajaee *et al.* (2009) studied neural network, multiple regression, and sediment rating curve models for daily simulation of suspended sediment loads. Comparison of the results indicated that the neural network model outperformed the other models. Jain (2001) applied artificial neural networks to sediment rating curves. Abrahart & White (2001) proposed a neural network model for sediment transportation. In addition to those stand-alone approaches, hybrid models that use the wavelet technique along with stand-alone approaches have become widely used in sediment load predictions. A conjunction method (wavelet-ANN) for suspended sediment load prediction was proposed by Partal & Cigizlioglu (2008) to demonstrate the performance of the hybrid method over the stand-alone ANN method. In this wavelet-ANN method, the data were decomposed by discrete wavelet transformation and the newly generated series were used as inputs to the ANN model. The combined wavelet-ANN was shown to perform better than the stand-alone ANN method.

Mirbagheri *et al.* (2010) employed discrete wavelet transform to establish models with a combination of three different approaches, which are genetic, neural network, and fuzzy methods. They compared the combination model results with sediment rating curves and found that the wavelet-genetic model performed better than the wavelet-neuro-fuzzy (WNF) and wavelet-neural network (WNN) models. Alikhani (2009) illustrated the advantage of the WNF model over neuro-fuzzy (NF) approach in the simulation of suspended sediment load time series. Rajaee (2011) proposed a new wavelet artificial neural network (WANN) model for daily suspended sediment load (SSL) prediction in rivers.

Wavelet combination models have been used for predicting different hydrological variables. A WNF conjunction method for prediction of precipitation was used in Turkey. The daily precipitation data of three stations were used in the study. The observed daily precipitation values were decomposed into sub-series by using discrete wavelet transform and then selected sub-series were used as inputs to the neuro-fuzzy models. It is shown that the WNF model provided good results compared to other classical approaches (Partal 2009). Partal (2009) showed that the combined wavelet transform and neural network methods could be applied successfully for evapotranspiration modeling from climatic data. Wei *et al.* (2012) developed a wavelet-neural network hybrid modeling approach for the prediction of river discharge using monthly time series data. A discrete wavelet multi-resolution method was employed to decompose the river discharge time series data into its sub-series with low and high frequencies, and these sub-series were then used as input data for the artificial neural network. Tiwari & Chatterjee (2011) proposed a new hybrid model, the wavelet-bootstrap-ANN (WBANN), for daily discharge forecasting. They explored the potential of wavelet and bootstrapping techniques to develop an accurate and reliable ANN model. Yong & Zhi-Chun (2011) studied the performance of the combination of wavelet and fuzzy models for runoff predictions in Yamadu station at Sinkiang province and found that the prediction accuracy of the model is satisfactory. Özger *et al.* (2012) employed WNF and WNN models to predict drought occurrences. Maheswaran & Khosa (2013) applied a multi-scale non-linear model based on coupling a discrete wavelet transform (DWT) and the second-order Volterra model, WVC, for daily inflow forecasting. 
They found that the WVC model outperformed the ANN, WANN, and auto-regressive moving average with exogenous variables (ARMAX) models for lead times of 1–5 days.

There are two types of wavelet transform used for band separation: discrete and continuous. Most studies on wavelet combination models apply the discrete wavelet transform for band separation, whereas the use of the continuous wavelet transform is rare. So far, these two techniques have not been compared in terms of their performance. However, it is known that the continuous wavelet transform is able to represent sub-bands without loss of information, whereas the discrete type can lose information between dyadic bands. An application of the continuous wavelet transform to the suspended sediment prediction problem has not been reported thus far.

Since suspended sediment load time series exhibit random characteristics and have low serial correlations, it is very difficult to obtain accurate predictions with stand-alone models such as fuzzy logic. For this reason, hybrid models such as wavelet fuzzy logic can be employed to improve prediction performance. This improvement in prediction is the result of wavelet sub-banding properties that make it possible to generate easy-to-predict sub-bands and then reconstruct them for the final prediction results. The aim of this study was to propose an appropriate method for suspended sediment load prediction. The continuous wavelet transform was used for sediment load time series decomposition. The wavelet-fuzzy logic combination model was employed to predict the monthly suspended sediment loads. The predictions are made for a 1-month lead time.

## MATERIALS AND METHODS

### Data

The monthly measured suspended sediment data used in this study were collected from four different stations located in the Corukhi River and miscellaneous East Black Sea basins. Figure 1 depicts the locations of the stations taken into consideration. Suspended sediment samples were collected by using the depth integration method, which represents the entire depth and profile of a river cross-section. The US DH-48 type sampler was used for the collection of suspended-sediment samples. It is designed to sample isokinetically, meaning that water and sediment enter the nozzle at the same velocity during sampling, so that representative data are collected. After onsite sampling, sediment concentration and sediment grain size distribution were determined by the filtration technique in a laboratory study.

The study area has very rough topographic conditions. The amount of sediment carried by the rivers is very high due to the steep slopes of the rivers' watersheds. Snow melt and heavy precipitation are the two significant factors triggering the sediment yield in Corukhi River basin and miscellaneous East Black Sea basins. For this reason, predicting the sediment transportation rates becomes very important for water engineers, especially in flooding and snow melting season. The study covers a time period of 72 months, from 1999 to 2005. Statistical properties of the data are presented in Table 1, including mean, standard deviation, skewness coefficient, coefficient of variation, and the lag-1 autocorrelation coefficients.

The data from Coskunlar station have relatively smaller coefficient of variation and larger autocorrelation coefficient. The season covering the April–July period is the flood season. The amount of sediment discharge reaches its peak value during this season in all stations and it decreases to a normal amount in the rest of the year. Thus, the time series of sediment discharge includes sudden jumps from the normal values which makes prediction difficult.

### Description of the methods used

#### Fuzzy logic approach

Fuzzy logic is based on set theory. In the classical approach, an object is either in the set or not. Mathematically, in terms of membership, when the object is an element of the set, the value is accepted as ‘1’ and when it is not an element of the set, the value is accepted as ‘0’. Fuzzy sets are an extension of the classical notion of a set (Zadeh 1968). The basic principle of the classical approach, Aristotelian logic, is that a proposition is either true or false. In a fuzzy logic system, a proposition may be true or false, but it can also have an intermediate truth value. The classical approach allows only two quantifiers, ‘all’ and ‘some’. Fuzzy logic allows these quantifiers too, and in addition allows more quantifiers such as ‘most’, ‘many’, ‘several’, ‘few’, etc. In fuzzy logic, every logical unit can be fuzzified, and every object has a degree of membership (between 0 and 1). The steps of the fuzzy logic analysis and control method can be described as: (1) receiving one or a large number of measurements or other assessments of conditions existing in some system that will be analyzed or controlled; (2) processing all received inputs according to human-based, fuzzy ‘if-then’ rules, which can be expressed in simple language (words) and combined with traditional non-fuzzy processing; and (3) averaging and weighting the results from all the individual rules into one single output decision or signal, which decides what to do or tells a controlled system what to do. The resulting output signal is a precise defuzzified value. Detailed information about the modeling steps can be found in Özger & Sen (2007).

#### Continuous wavelet transform

Wavelet transform (WT) makes it possible to analyze a signal in both the time and scale domains, and it has recently been employed for investigating non-stationary time series. This approach is also known as multi-resolution analysis. WT can detect the time of occurrence of a particular event, which the Fourier transform (FT) cannot do. In other words, while FT decomposes a signal into sine waves of several frequencies, WT decomposes a signal into shifted and scaled versions of the original wavelet (Özger 2010). A wavelet is a small wave whose form tends to be asymmetric and irregular, unlike sine waves. The basic wavelet function, *ψ*(*t*), also called the mother wavelet, can be given in the following form:

$$\psi_{T,s}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t - T}{s}\right) \qquad (1)$$

The continuous wavelet transform is defined as:

$$\mathrm{CWT}(T, s) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} f(t)\,\psi^{*}\!\left(\frac{t - T}{s}\right) dt \qquad (2)$$

where *s* is the scale parameter (it indicates dilation if *s* > 1 or contraction if *s* < 1), *T* is the translation parameter interpreted as the shift of the wavelet function, *f*(*t*) is the signal being analyzed, and the asterisk denotes the complex conjugate.

The transformed signal CWT(*T*, *s*) is a function of the translation parameter, *T*, which specifies the location of the wavelet in time, and the scale parameter, *s*. The mother wavelet is represented by *ψ*(*t*). The elements in CWT(*T*, *s*) are called wavelet coefficients. Each wavelet coefficient is associated with a frequency and a point in the time domain. As a result of the CWT, many wavelet coefficients, *C*, are obtained.

The continuous wavelet transform algorithm is performed in five steps: (1) choose a wavelet and compare it with the corresponding part of the original signal at the starting point; (2) calculate the correlation parameter, *C*, between the wavelet and the part of the signal concerned (the greater the parameter, the greater the similarity); (3) shift the wavelet to the right, then repeat the first and second steps until the entire signal is evaluated; (4) scale the wavelet, then repeat the first, second, and third steps; and (5) repeat steps 1 to 4 for all scaling parameters (Misiti *et al.* 2009).
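The five steps above can be sketched as a direct, deliberately slow implementation. This is only an illustration under stated assumptions: the function names `morlet` and `cwt` are hypothetical, the Morlet wavelet with `omega0 = 6` is an assumed parameter choice, the sampling step is taken as one month, and the input is a synthetic series rather than the measured data.

```python
import cmath
import math


def morlet(t, omega0=6.0):
    """Complex Morlet mother wavelet (omega0 = 6 is an assumed, common choice)."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * t) * math.exp(-0.5 * t * t)


def cwt(signal, scales, dt=1.0):
    """Direct O(n^2) continuous wavelet transform.

    Returns coeffs[i][k]: the coefficient for scales[i] at translation k * dt.
    """
    n = len(signal)
    coeffs = []
    for s in scales:                  # steps 4-5: repeat for every scale
        row = []
        for k in range(n):            # step 3: shift the wavelet along the signal
            total = 0j
            for j in range(n):        # steps 1-2: correlate wavelet and signal
                u = (j - k) * dt / s
                total += signal[j] * morlet(u).conjugate()
            row.append(total * dt / math.sqrt(s))
        coeffs.append(row)
    return coeffs


# Synthetic monthly series with a 12-month cycle (illustrative data only).
series = [math.sin(2 * math.pi * m / 12) for m in range(72)]
scales = [3.0, 6.0, 12.0, 24.0]
W = cwt(series, scales)

# Time-averaged power per scale; the annual cycle should dominate near scale 12.
power = [sum(abs(c) ** 2 for c in row) / len(row) for row in W]
```

In practice a library routine based on the fast Fourier transform would be preferred; the triple loop is kept here only because it mirrors the five steps one to one.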

Boundary effects in wavelet transforms can influence the analysis results, and they become especially prominent at low frequencies. There are various algorithms to deal with boundary effects. The boundary should be treated differently from the other parts of the time series. Often, it is desirable to use simple schemes based on signal extension at the boundaries. This involves the computation of a few extra coefficients at each stage of the decomposition process to get a perfect reconstruction. In this study, a symmetrization method was employed to treat boundary effects. This method assumes that signals can be recovered outside their original support by symmetric boundary value replication. However, it is well known that all these methods are effective in theory, but are not entirely satisfactory from a practical viewpoint. In the present study, since relatively better prediction results were obtained, the selected algorithm was assumed to be satisfactory for the wavelet analysis.
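A minimal sketch of symmetric boundary extension is given below. It assumes a half-point mirror in which the edge sample itself is repeated; whether the study used exactly this variant is not stated, and the helper name `symmetric_extend` is hypothetical.

```python
def symmetric_extend(series, pad):
    """Mirror `pad` samples at each end of the series (edge value repeated)."""
    series = list(series)
    if pad > len(series):
        raise ValueError("pad must not exceed the series length")
    left = series[:pad][::-1]      # first `pad` samples, reversed
    right = series[::-1][:pad]     # last `pad` samples, reversed
    return left + series + right


extended = symmetric_extend([1, 2, 3, 4, 5], 2)
# extended == [2, 1, 1, 2, 3, 4, 5, 5, 4]
```

After the transform is applied to the extended series, the coefficients belonging to the padded samples would be discarded so that the result has the original length.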

#### The combined wavelet fuzzy logic model (WFL)

Wavelet transform is a technique that uses wavelets, small waves that grow and decay over a small distance, to implement a transformation. Geophysical time series include different patterns, such as periodicity, trend, and noise, which are the results of different mechanisms affecting the process. Filtering such patterns helps in understanding the behavior of time series. One of the latest techniques used for filtering time series in time and scale domains is the wavelet transform. There is a tendency to filter the data before their use, especially in prediction problems. Several researchers have proposed that it is better to make predictions after decomposing both predictors and predictand into several bands (Kim & Valdes 2003; Webster & Hoyos 2004; Özger 2010). Wavelet transform makes it possible to separate a time series into its sub-series. Here, the important question is how the significant bands can be selected. For this purpose, Webster & Hoyos (2004) proposed the use of average wavelet spectra obtained from the continuous wavelet transform of the variable of concern. The significant spectral bands can be selected based on the average wavelet spectra, which show the variation of power with scale. Thus, at the end, we have a number of different sub-series, each of which carries specific information about the process. Likewise, each predictor time series is separated into a number of sub-series using the same spectral bands as those of the predictand.

Subsequently, it is necessary to relate each band of predictors to the corresponding band of predictand with a statistical scheme. Here, we used a fuzzy logic model to establish a connection between predictors and the predictand band (Kabatas 2014). A number of fuzzy models would be needed to make predictions. Finally, all those predicted bands of the predictand variable are reconstructed to obtain the final series.

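Each band is modeled by a Takagi-Sugeno type fuzzy logic approach. Here, Gaussian membership functions were chosen for fuzzy inference. The proposed model has two inputs, each with two fuzzy sub-sets called ‘Low’ and ‘High’. Combining these sub-sets gives the four fuzzy rules used in this study, ‘Low-Low’, ‘Low-High’, ‘High-Low’, and ‘High-High’, which are presented in Equations (3)–(6).

$$R_1:\ \text{IF } Q_s(t-1) \text{ is Low AND } Q_s(t) \text{ is Low THEN } y_1 = a_1 Q_s(t-1) + b_1 Q_s(t) + c_1 \qquad (3)$$

$$R_2:\ \text{IF } Q_s(t-1) \text{ is Low AND } Q_s(t) \text{ is High THEN } y_2 = a_2 Q_s(t-1) + b_2 Q_s(t) + c_2 \qquad (4)$$

$$R_3:\ \text{IF } Q_s(t-1) \text{ is High AND } Q_s(t) \text{ is Low THEN } y_3 = a_3 Q_s(t-1) + b_3 Q_s(t) + c_3 \qquad (5)$$

$$R_4:\ \text{IF } Q_s(t-1) \text{ is High AND } Q_s(t) \text{ is High THEN } y_4 = a_4 Q_s(t-1) + b_4 Q_s(t) + c_4 \qquad (6)$$

Here, $Q_s(t)$ shows the sediment discharge at time $t$; $a_i$, $b_i$, and $c_i$ are the model parameters; and $y_i$ is the result of each rule. The final result is obtained as a weighted average using Equation (7) as a defuzzification method.

$$\hat{y} = \frac{\sum_{i=1}^{4} w_i\, y_i}{\sum_{i=1}^{4} w_i} \qquad (7)$$

where $w_i$ is the weighting factor of rule $i$ and $\hat{y}$ is the final defuzzified model result.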
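The four-rule Takagi-Sugeno scheme described above can be illustrated with the following sketch. Several details are assumptions made for the example only: the rule weight is taken as the product of the two Gaussian membership degrees, both inputs share the same ‘Low’ and ‘High’ sets, and all numerical parameters are placeholders rather than fitted values. The names `gauss` and `ts_predict` are hypothetical.

```python
import math


def gauss(x, center, sigma):
    """Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def ts_predict(x1, x2, sets, consequents):
    """Two-input, four-rule first-order Takagi-Sugeno inference.

    sets: {"Low": (center, sigma), "High": (center, sigma)}, shared by both inputs.
    consequents: {(label1, label2): (a, b, c)} with rule output y = a*x1 + b*x2 + c.
    """
    num = 0.0
    den = 0.0
    for (l1, l2), (a, b, c) in consequents.items():
        # Rule weight: product of membership degrees (assumed AND operator).
        w = gauss(x1, *sets[l1]) * gauss(x2, *sets[l2])
        y = a * x1 + b * x2 + c
        num += w * y
        den += w
    return num / den  # weighted average of the rule outputs


# Placeholder membership functions and rule parameters (not fitted values).
sets = {"Low": (0.0, 1.0), "High": (10.0, 1.0)}
consequents = {
    ("Low", "Low"): (0.0, 0.0, 1.0),
    ("Low", "High"): (0.0, 0.0, 2.0),
    ("High", "Low"): (0.0, 0.0, 3.0),
    ("High", "High"): (0.0, 0.0, 4.0),
}
prediction = ts_predict(0.0, 0.0, sets, consequents)  # dominated by the Low-Low rule
```

In the combined model, one such fuzzy model would be fitted per wavelet band, with the inputs being the band values of the previous sediment discharges.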

### Performance indicators

A number of statistical methods can be used to evaluate the performance of proposed models. In this study, the Nash–Sutcliffe Sufficiency Score (NSSS), mean absolute error (MAE), and square of the correlation coefficient (R^{2}) were used as performance indicators, as they are widely used in the literature.

$$\mathrm{NSSS} = 1 - \frac{\sum_{i=1}^{n} \left(S_{o,i} - S_{p,i}\right)^{2}}{\sum_{i=1}^{n} \left(S_{o,i} - \bar{S}_{o}\right)^{2}} \qquad (8)$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| S_{o,i} - S_{p,i} \right| \qquad (9)$$

$$R^{2} = \frac{\left[\sum_{i=1}^{n} \left(S_{o,i} - \bar{S}_{o}\right)\left(S_{p,i} - \bar{S}_{p}\right)\right]^{2}}{\sum_{i=1}^{n} \left(S_{o,i} - \bar{S}_{o}\right)^{2}\, \sum_{i=1}^{n} \left(S_{p,i} - \bar{S}_{p}\right)^{2}} \qquad (10)$$

where $n$ is the number of data points, and $S_o$ and $S_p$ are the observed and predicted suspended sediment loads, respectively. The bar denotes the mean of the variable.

For ideal models, NSSS and R^{2} should be 1 and MAE should be 0. NSSS takes values between −∞ and 1 and shows the proportion of variance explained: as it approaches 1, more of the variance is explained by the model. The correlation coefficient R ranges from −1 to 1, so R^{2} ranges from 0 to 1. If R is close to 1 or −1, there is a strong linear relationship between the variables. Positive values of R indicate that the two variables are directly proportional and negative values show the presence of an inverse relationship. If R is close to 0, there is no linear relationship between the variables. R = ±0.5 indicates a moderate relationship.
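The three indicators can be computed as in the sketch below, which uses the standard definitions of the Nash–Sutcliffe score, the mean absolute error, and the squared Pearson correlation. The function names and the sample values are illustrative only and are not taken from the study.

```python
def nsss(obs, pred):
    """Nash-Sutcliffe score: 1 minus error variance over observed variance."""
    mean_o = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - sse / sst


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Square of the Pearson correlation coefficient."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Made-up observed and predicted loads, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
scores = (nsss(observed, predicted), mae(observed, predicted), r_squared(observed, predicted))
# scores is approximately (0.9, 0.25, 0.914)
```

Note that NSSS penalizes bias while R^{2} does not, which is one reason to report both.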

## RESULTS AND DISCUSSION

The WFL model proposed in this study employs continuous wavelet transform to address the difficulties associated with conventional fuzzy logic model. The wavelet technique can divide the sediment discharge time series into several bands and can treat the non-stationary properties of time series. In addition, the embedded periodical characteristics of the sediment discharge time series could be detected using the WFL model.

Prior to the prediction of sediment load using the WFL model, it is necessary to make band selections. The important step in wavelet modeling is the selection of bands that have significant power. So far, there has not been any established decomposition rule that separates time series into their bands. The main approach is to use average wavelet spectra for band separation. The Morlet mother wavelet is employed (e.g. Torrence & Compo 1998) for the continuous wavelet transform. In this study, the bands are separated according to the average wavelet spectra of the sediment discharge time series. It is possible to see the significant power at specific intervals from the average wavelet spectra, and the sediment load time series is separated into its bands using these intervals. The average wavelet spectra of the four stations are shown in Figure 2. The power spectra of the stations show different patterns of significant power. While a dominant annual cycle can be seen clearly for all stations, inter-annual cycles of around 3–4 years are also present at Baglik and Coskunlar stations. Based on those power spectra, five significant bands were established for the time series data of each station. The selected bands are the same for all four stations (Baglik, Camlikaya, Esiroglu, and Coskunlar): 1–3, 3–6, 6–10, 10–15, and 15–60 months. The sample time series of wavelet bands for Baglik station are shown in Figure 3. It is possible to see inter-annual periodicities, such as 3–4 years, which correspond to the 15–60 months band. The annual cycles are represented by the 10–15 months band. The 6–10 months band shows the intra-annual cycles and the other bands represent noise. Band series of the other stations can be obtained similarly.
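As a rough illustration of how an average wavelet spectrum relates to the five bands, the sketch below averages the power of a coefficient matrix over time and then sums it within each band. It is not the selection procedure of the study, where bands were chosen by inspecting the spectra; the helper names are hypothetical, the half-open band limits are an assumption (the listed bands share their end points), and the coefficient values are made up.

```python
BANDS = [(1, 3), (3, 6), (6, 10), (10, 15), (15, 60)]  # months, as listed above


def average_spectrum(coeffs):
    """Time-averaged power |W|^2 for each scale (one row of coeffs per scale)."""
    return [sum(abs(c) ** 2 for c in row) / len(row) for row in coeffs]


def band_power(periods, spectrum, bands=BANDS):
    """Sum the average power of all periods falling in each [low, high) band."""
    totals = [0.0] * len(bands)
    for period, p in zip(periods, spectrum):
        for i, (low, high) in enumerate(bands):
            if low <= period < high:
                totals[i] += p
                break
    return totals


# Made-up coefficients for three periods (2, 12, and 30 months), two time steps each.
coeffs = [[1 + 0j, 1j], [2, 2], [0, 0]]
periods = [2, 12, 30]
spectrum = average_spectrum(coeffs)
totals = band_power(periods, spectrum)
# totals == [1.0, 0.0, 0.0, 4.0, 0.0]: the 10-15 month (annual) band carries most power
```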

The separated bands contain specific information embedded in the original time series. The bands are free from the effects of processes involved in the generation of time series and represent only one feature of the concerned series. For instance, a higher level band can explain the long time cycles of the concerned variable and eliminates other properties such as high frequency noise data. Conversely, short time cycles hold information about noisy data.

Two previous sediment discharge values are selected as input variables to predict the 1-month ahead sediment discharge for Model-1. The other model scenarios are presented in Table 2. In this study, two-thirds of the time series were selected as training data and the remaining part was left for testing the models. The prediction performances of the FL and WFL models are presented in Tables 3 and 4, respectively.
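The input construction for Model-1 and the two-thirds split can be sketched as follows. The helper names are hypothetical, and the split is assumed to be chronological (the first two-thirds for training), which the text does not state explicitly.

```python
def make_lagged(series, n_lags=2):
    """Build pairs ([Q(t-n_lags+1), ..., Q(t)], Q(t+1)) for 1-step-ahead prediction."""
    inputs, targets = [], []
    for t in range(n_lags - 1, len(series) - 1):
        inputs.append(list(series[t - n_lags + 1 : t + 1]))
        targets.append(series[t + 1])
    return inputs, targets


def chronological_split(inputs, targets, train_fraction=2 / 3):
    """Keep the first part for training and the remainder for testing."""
    cut = int(len(inputs) * train_fraction)
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])


# Toy series standing in for 10 months of sediment discharge.
series = list(range(10))
X, y = make_lagged(series, n_lags=2)
train, test = chronological_split(X, y)
# X[0] == [0, 1] predicts y[0] == 2; 8 samples split into 5 training and 3 testing
```

Other model scenarios would simply use a different `n_lags`.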

Overfitting is an important problem in fuzzy model training. When a model overfits the training data, its training correlation coefficient reaches the maximum value. However, such models usually give poor results for the test and validation samples. Therefore, the test and validation sample correlation coefficients should be taken into consideration when selecting the best model scenarios. The model scenarios show that using five or more previous sediment load values as input variables leads to overfitting for the fuzzy logic approach. However, the same is not true for wavelet fuzzy logic modeling, where overfitting is not a problem.

The superiority of the WFL model over FL can be seen from the comparison of Tables 3 and 4. While the FL model gives poor prediction results for all scenarios, the WFL models perform well in at least two scenarios. The model configuration and input combination vary with stations for the WFL model. For instance, the best model scenarios obtained for Baglik, Camlikaya, Esiroglu, and Coskunlar stations are Models-1, -5, -5, and -1, respectively. Test data MAE values of the four stations range between 172 and 683 t/day. The lowest MAE value was found at Coskunlar with Model-1. At Camlikaya and Esiroglu stations, Model-5 gave the best results. The NSSS values range between 0.82 and 0.99 for the testing data. In terms of NSSS, Model-1 employed at Baglik station is the best model.

The number of membership functions for fuzzy modeling was found by trial and error. The simulations showed that a wavelet-FL model structure with two Gaussian membership functions and four rules provides the best performance in terms of MAE and R^{2}. The monthly suspended sediment yield prediction results using the past sediment values as inputs are presented in Figure 4. While the WFL method is able to catch the peak values of observed sediment discharge, the FL method misses the peaks and gives average predictions for all stations. The prediction performance increases when the time series are divided into several series holding different types of information. Modeling time series that exhibit similar characteristics can be easier than modeling time series that include all types of information, such as noise, periodicity, trend, etc. Since the WFL method can generate several homogeneous time series from the original time series and then make predictions for each decomposed time series, the WFL outperforms FL.

These results showed that using the sub-series decomposed by the wavelet approach improves the model accuracies significantly. The WFL model can be employed for 1-month ahead sediment load predictions. In practice, it is important to predict peak values because sediment loads that occur with a flood can reduce the cross-sectional area of the river and largely obstruct its flow. In a management plan, the amount of sediment transported by floods should be known in order to make preparations before the flooding season. As can be seen from the comparisons of observed and predicted sediment load time series, the WFL model can capture the peak sediment values where the FL model fails.

The proposed approach for sediment load prediction improves the stand-alone fuzzy model by incorporating the wavelet approach. The improvement can be discussed in terms of physical system understanding and time series properties. The fuzzy rules used in the fuzzy modeling are a stationary set of rules over time, while the characteristics of the sediment load time series change seasonally. Using wavelets of different time scales, stationary rules can be implemented for a non-stationary system, and non-stationary prediction behavior can still be obtained. In addition, using reasonable spectral components removes noise and thus increases the prediction performance.

## CONCLUSIONS

Sediment load predictions are important for the planning, operation, and maintenance of water structures. In a river, sediment discharges exhibit random behavior that makes sediment load prediction very difficult. In this study, sediment loads were predicted by fuzzy logic (FL) and wavelet fuzzy logic (WFL) approaches. Monthly sediment loads measured at Baglik, Camlikaya, Esiroglu, and Coskunlar stations were employed to set up the predictive models. Previous sediment discharge values were used to predict 1-month ahead value.

Fuzzy logic models may become unable to predict suspended sediment time series due to high non-stationarity and non-linearity, if pre-processing of the input and/or output data is not performed. Wavelet technique was used to decompose sediment time series into its sub-series. Average wavelet spectra were used to decide the selection of bands and, then, each sub-series was modeled by the fuzzy logic approach. It is shown that WFL gives better results than single fuzzy logic modeling. As a result, it can be concluded that using the continuous wavelet transformation as a pre-processing technique can improve the model accuracies significantly.

The combined wavelet method can suffer from boundary effects, especially in the low frequency range. Various algorithms can be employed to minimize these boundary effects. In addition, wavelet transforms with down-sampling can give very different coefficients when the input signal is shifted. This can be the main limitation of wavelet transforms in sediment discharge time series analysis. To overcome this problem, the application of the complex discrete wavelet transform (CDWT) can be suggested.

As a future direction, the proposed approach can be tested for higher time resolutions, such as daily and hourly, and integrated into physically based hydrological models.

## ACKNOWLEDGEMENTS

This research was supported by the Graduate School of Science of Istanbul Technical University. We would also like to thank two anonymous reviewers for their comments that greatly improved the manuscript.

- First received 28 December 2014.
- Accepted in revised form 28 April 2015.

- © IWA Publishing 2015
