## Abstract

A reduction in the concentration of chlorine, which is used as a chemical disinfectant for water in drinking water distribution systems, can be considered to be an index of the progressive deterioration of water quality. In this work, attention is given to the spatial distribution of the residual chlorine in drinking water distribution systems. The criterion for grouping the water-quality parameters normally used is highly subjective and often based on data that are not correctly identified. In this paper, a cluster analysis based on fuzzy logic is applied. The advantage of the proposed procedure is that it allows a user to identify (in an automatic way and without any specific assumption) the zonation of the network and easily calibrate the unknown parameters. An analysis of the correlation between the sampling sites for the residual chlorine has been used to assess the applicability of the procedure.

- calibration
- clustering
- fuzzy logic
- networks
- reactions
- water quality

## INTRODUCTION

The quality of the water released in distribution systems can change, in space and time, because of various phenomena: reactions with organic or inorganic compounds in the bulk liquid phase and with the walls of pipes as a result of corrosion, material releases, and/or biofilm formation; and the mixing of water with different origins and/or characterized by different residence times in the network (LeChevallier *et al*. 1990; De Rosa *et al*. 1998; Munavalli & Kumar 2004; Environmental Protection Agency (EPA) 2005).

In particular, the bacterial population is generally increasing in the network, both because of the growth of the microorganisms that are initially present in the water or infiltrated from outside, and because of the re-growth phenomena in parts of the network where favorable conditions occur (long hydraulic detention times, nutrient availability, and high temperature) (Lund & Ormerod 1995; De Rosa *et al*. 1998).

Chlorine, which is often used as a water disinfectant, can be regarded as an index of the progressive deterioration of the water quality (Castro & Neves 2003; Ghinelli 2004). In fact, the concentration of chlorine released into the network from the treatment plant is reduced as a result of reactions with inorganic and organic compounds present in the bulk liquid and on the pipe walls (Van der Kooij 1992; Lu *et al*. 1999).

To ensure that a sufficient chlorine concentration (around 0.2 mg/l) is present in every part of the network, booster chlorination facilities can be installed within the network (EPA 2005). For this reason, it is important to estimate the correct residual chlorine content at nodes (or at specific ‘control nodes’) of the drinking water distribution network.

To identify locations where the chlorine concentration drops below a certain level or exceeds a given maximum level, the resource manager of the distribution network must be able to predict the spatial distribution of the residual chlorine.

For this purpose, the resource manager should acquire the necessary computational instruments to evaluate the amount of chlorine in every part of the network. To efficiently assist with this management, the simplest model of chlorine decay with sufficient accuracy should be identified (Fisher *et al.* 2011).

Most of the currently available models of chlorine decay (see as examples Biswas *et al*. 1993; Rossman *et al*. 1994; Ozdemir & Ger 1998; Ozdemir & Ucak 2002; Munavalli & Kumar 2004, 2005; Kohpaei & Sathasivan 2011) consider pipes as ‘chemical reactors’ where chlorine enters at one end, decays because of reactions with organisms in the water (bulk decay), is consumed around the pipe walls (wall decay) because of the biofilm attached to them, and is transported toward the other end of the pipe. Thus, the law of chlorine decay is described by a combination of first-order kinetics explaining the reactions occurring within the water and in the pipe's walls (Fisher *et al*. 2011). These kinetics include three parameters (*k*_{b}, *k*_{w}, and *k*_{f}, referred to hereafter as water-quality parameters), which have to be determined using experimental data (Rossman *et al*. 1994; Hallam *et al*. 2002).

It should be taken into account that the instantaneous decay rate models for the bulk and wall reactions should be integrated simultaneously (Fisher *et al.* 2011). Previous research (Powell & Hallam 2000; Ozdemir & Demir 2007) has demonstrated that the parameter determining the bulk reaction (*k*_{b}) does not vary with the pipe's diameter. Ozdemir & Demir (2007) instead observed a variability in parameter *k*_{b} with the Reynolds number, Re. However, the verification of the relation *k*_{b}–Re still requires more detailed experiments. Several experimental analyses indicate that *k*_{b} varies (in the range 0.3–1.398 d^{−1}) with an inverse proportionality with the initial chlorine concentration (see as an example Rossman *et al*. 1994; Furlani & Marinelli 1998). Other authors (Castro & Neves 2003; Lee *et al*. 2004) have observed a low variation of the chlorine concentration with the parameter *k*_{b}, so they suggest assuming a unique value of *k*_{b} for all pipes of the network.

In contrast, the literature (among others Yeh 1986; Yang & Ko 1996) shows that the value of the parameter that determines the wall reactions (*k*_{w}) depends both on the local geometrical characteristics of the pipes (such as the pipe diameter, pipe material, and pipe roughness) and on the flow characteristics (such as the flow velocity and the level of bacterial colonization).

Thus, the spatial variability of the water-quality parameters plays an important role in the accurate evaluation of the chlorine distribution in the network (Yang & Ko 1996; Lee *et al*. 2004).

The number of parameters included in the model could also be considered to be a reasonable measure of the model's complexity. On the other hand, because the number of unknown parameters affects the sensitivity in the prediction error of a model's results (Clark & Coyle 1990; Datta & Sridharan 1994; Munavalli & Kumar 2005), it is necessary to reduce the number of parameters as much as possible or to group them into ‘clusters’.

On the basis of the above, it is clear that the estimation of the unknown parameters requires an efficient criterion for grouping them and the application of calibration techniques to easily and automatically identify their values.

Some researchers (as examples, Garcia-Villanova *et al*. 1997; Clark & Sivagenesan 1998) have estimated the spatial distribution of chlorine in a network by applying regression and/or correlation analyses. They have obtained useful relationships between the water-quality parameters and other specific parameters such as the reaction time, pH, and water temperature. However, this approach may not be suitable to automatically determine the spatial distribution of the chlorine concentration in a network.

Various inverse-based procedures have also been developed to estimate water-quality parameters, and clustering techniques that were previously applied to find solutions both in water networks (Bascià *et al*. 1999) and aquifers (Yeh 1986) have also been applied (Lee *et al*. 2004; Morga *et al*. 2007) to solve the problem of the spatial distribution of chlorine in networks. Lee *et al*. (2004) suggested the classification of sampling sites into groups with similar levels of chlorine. However, this type of approach is limited because the results are directly dependent on measurements and their locations. More recently, Aubry *et al*. (2011) proposed an FEM (finite element method) approach to solve the inverse problem connected with estimations of the water-quality coefficients. Although the procedure appears promising, more applications and experiments are required to generalize it. In contrast, Morga *et al*. (2007) used fuzzy logic (Ross 1995; Yang & Ko 1996) as a clustering technique by assuming as an initial condition an empirical relationship between the residual chlorine and the flow velocity. This procedure is limited because it is only partially automatic, and the resulting zonation depends strongly on the values of the hydraulic variables (flow velocity and water pressure) obtained as outputs of the simulation process. However, the results obtained have substantially confirmed that, as noted by previous researchers (see as an example Silvert 2000; Li *et al*. 2013), fuzzy logic could be applied in controlling available information in water-quality management. Thus, Termini & Viviani (2012) recently improved the approach of Morga *et al*. (2007) by paying particular attention to the choice of the initial partition and the identification of the membership condition for the ‘cluster’.

In recent years, there have been numerous attempts at online monitoring in water distribution systems. These have been driven by technological innovation in data transfer and by falling sensor costs, which allow data from increasing numbers of sensors to be collected and transferred automatically to the point of use. ‘Online’ modeling can be used to continually assess standards-of-service compliance for pressure and provide timely warnings of changes in the water-quality conditions. As an example, Machell *et al*. (2010) presented an online modeling package (Aquis) that is driven by near real-time flow and pressure field data gathered using GPRS communication. Landa *et al*. (2010) presented a real-time monitoring system for water distribution networks based on a new formulation of second-order chlorine decay equations, in which the notion of the concentration of the ‘equivalent reducing reacting contaminant agent’ plays the same role as the concentration of chlorine.

The point is that, because the online simulation models require a continuous feed of flow and pressure data from the real networks being simulated, the proper design and management of sensor systems are critical to achieve the necessary level of efficient and reliable operation. Consequently, the limitation of online monitoring is that it depends on the security of the sensor system performance, and well-designed tools for supporting constant system monitoring are required (see as an example Ediriweera & Marshall 2010).

Thus, the aim of the present work is to present an integrated procedure to determine the spatial distribution of residual chlorine in a real-size network. This procedure is based on interfaced codes (commercial and/or purpose-built) and includes a fuzzy-clustering procedure to automatically determine the unknown parameters. The advantage of this procedure is that it allows users to automatically identify (without any specific assumption) the chlorine level at any location of the network. The procedure includes the following steps: (1) the simulation step to estimate the flow rate and flow direction in each pipe of the network and the water pressure (hydrodynamic module) and the chlorine concentration (water-quality module) at each node of the network; (2) the clustering step to estimate the spatial diversity of water-quality parameters; and (3) the calibration step to estimate the value of the unknown parameters. It should be noted that the procedure also has the option of bypassing the simulation process for the water pressure and flow rate estimation at ‘control nodes’ of the network and applying the calibration of water-quality parameters by using real-time measured data.

This paper should be viewed as an extension of the work by Termini & Viviani (2012) and is organized as follows. First, the simulation model is described. Then, the procedures for clustering and calibrating the water-quality parameters are explained. Finally, the procedure is applied to a sub-network in Palermo City (Italy), and a correlation analysis is used to assess the procedure's performance.

## THE SIMULATION MODEL

Simulations have been performed with the help of the EPANET modeling software package developed by the US Environmental Protection Agency's (EPA) Water Supply and Water Resources Division (Rossman *et al*. 1993, 1994) and extensively used in the literature (among others Furlani & Marinelli 1998; Ozdemir & Ucak 2002; Castro & Neves 2003; Ho & Khalsa 2008).

The EPANET code includes two modules: a hydrodynamic module, which estimates the flow rate and flow direction in each pipe and the water pressure at each node of the network, and the water-quality module, which simulates the movement and transport of substances in the water under varying conditions.

The hydrodynamic module (EPA 2005) solves the following system of equations:

– for each head tank of the network: (1)
– for each pipe *i–j* and for each node *k*: (2)

where *y*_{s} is the water level at node-tank *s*, *q*_{s} is the entering flow rate at node-tank *s*, *q*_{ij} is the flow rate at a pipe connecting node *i* to node *j*, *h*_{i} is the hydraulic head at node *i*, *A*_{s} is the transverse area of node-tank *s* (which is infinite for the sources), *E*_{s} is the topographic level of node-tank *s*, *Q*_{k} is the water demand at node *k*, and *φ* is a function that expresses the relation between the head loss and the flow rate at each pipe. In the present study, the Hazen–Williams equation has been used as follows: (3)

where *d*_{ij}, *L*_{ij}, and *ɛ*_{ij} are the diameter, length, and roughness coefficient of the pipes *i–j*, respectively.

The water-quality module (EPA 2005) solves the following equations:

– the transport equation for each pipe *i–j* of the network: (4)
– the mass conservation equation, for each node *k*: (5)

where *c*_{ij} is the substance concentration in pipes *i–j*, which is a function of distance *x*_{ij} along pipes *i–j* and of time *t*; *A*_{ij} is the cross-section of pipes *i–j*; *V*_{i} is the mass of the entering substance at node *i*; *Q*_{si} is the flow rate in the reach *si*; and *ϑ* is called the ‘reaction term’ because it combines the decay reactions of the considered substance inside pipes *i–j*.
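For orientation, the transport of a reactive substance along a pipe is commonly written in the EPANET literature as a one-dimensional advection–reaction balance. The form below is a sketch under that assumption, using the symbols defined above together with the pipe flow rate *q*_{ij} from the hydrodynamic module. It is not copied from Equation (4), and the nodal balance of Equation (5) is not reproduced here.

```latex
\frac{\partial c_{ij}}{\partial t}
  = -\frac{q_{ij}}{A_{ij}} \, \frac{\partial c_{ij}}{\partial x_{ij}}
  + \vartheta\!\left(c_{ij}\right)
```

The first term on the right-hand side moves the substance with the mean flow velocity *q*_{ij}/*A*_{ij}, and the second removes it through the decay reactions.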

In this work, attention is given only to the residual chlorine, which is an excellent parameter for studying the water quality in a distribution system. The EPANET code simulates the water reaction, which determines the chlorine decay caused by its reaction with organic and inorganic substances, neglecting the presence of the pipe; the wall reaction, which is caused by the pipe's deterioration; and the mass transport toward the walls, which depends on the reactions determined by the exchange of mass between the pipe's walls and the flow. The application of the superposition principle determines the following reaction term (Datta & Sridharan 1994; Rossman *et al*. 1994; Yang & Ko 1996), which is included in the code: (6)

where Sh is the non-dimensional Sherwood number, which depends on the flow regime, *D* is the molecular diffusivity (equal to 1.21 × 10^{−9} m^{2}/s), *r*_{h} is the hydraulic radius, and *k*_{b} and *k*_{w} are parameters that depend, respectively, on the kinetics developing within the water (bulk decay) and in the pipe's walls (wall decay). From Equation (6), it is clear that the parameter *k*_{f} is a function of the dimensionless Sherwood number, Sh (which depends exclusively on the hydraulic flow regime), the molecular diffusivity of chlorine in water, *D*, and the diameter of pipes *i–j*, *d*_{ij}. Consequently, *k*_{f} is uniquely determined once the simulation problem (1)–(3) is solved. In contrast, parameters *k*_{b} and *k*_{w} have to be estimated by using field data.

Thus, it can be concluded that the solution of the hydrodynamic module (1)–(3) depends on the estimation of parameter *ɛ*_{ij} (roughness coefficient of pipes *i–j*), while the solution of the water-quality module (4)–(6) depends on the estimation of parameters *k*_{b} and *k*_{w}.
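To make the roles of the three coefficients concrete, the sketch below assumes the first-order form commonly documented for EPANET, in which the wall reaction and the mass transfer toward the wall act in series. This is an assumed form, not a transcription of Equation (6). The function names, the units (meters and days), and the use of *r*_{h} = *d*/4 for a full circular pipe are choices made for this illustration, and Sh is supplied as an input rather than computed from the flow regime.

```python
SECONDS_PER_DAY = 86400.0
D_CHLORINE = 1.21e-9  # molecular diffusivity of chlorine in water, m^2/s


def mass_transfer_coefficient(sherwood, diameter, diffusivity=D_CHLORINE):
    """Mass-transfer coefficient k_f in m/d, assuming k_f = Sh * D / d."""
    return sherwood * diffusivity * SECONDS_PER_DAY / diameter


def overall_decay_rate(k_b, k_w, k_f, diameter):
    """Overall first-order rate (1/d) combining bulk and wall decay.

    k_b is in 1/d; k_w and k_f are in m/d; diameter is in m.
    """
    r_h = diameter / 4.0  # hydraulic radius of a full circular pipe
    # Wall reaction limited both by its own kinetics and by mass transfer.
    wall_rate = (k_w * k_f) / (r_h * (k_w + k_f))
    return k_b + wall_rate


def reaction_term(c, k_b, k_w, k_f, diameter):
    """Rate of change of concentration c (negative, since chlorine decays)."""
    return -overall_decay_rate(k_b, k_w, k_f, diameter) * c


# Hypothetical pipe: 200 mm diameter with an assumed Sherwood number of 100.
k_f = mass_transfer_coefficient(100.0, 0.2)
rate = overall_decay_rate(0.55, 0.625, k_f, 0.2)
```

In this form the wall contribution can never exceed *k*_{w}/*r*_{h}: when mass transfer is slow (small *k*_{f}), the wall term is controlled by *k*_{f} rather than by *k*_{w}, which is one reason the calibrated *k*_{w} interacts with the hydraulic regime.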

## SPATIAL DIVERSITY OF RESIDUAL CHLORINE

As mentioned in the ‘Introduction’, Morga *et al*. (2007) developed an empirical clustering technique based on the geometrical and hydraulic characteristics of the network. Lee *et al*. (2004) identified the clusters based on a distance matrix (Euclidean distance) determining the distance to a cluster of selected sampling sites. The point is, to enable the automatic application of the procedure, it would be necessary to identify the ‘clusters’ according to a criterion that is not directly dependent on the values of variables from the simulation process, but that is defined from an arbitrarily chosen number of parameters.

With the aim of satisfying this requirement, in this work, an integrated fuzzy logic technique has been applied. Fuzzy logic (Ross 1995) is, in fact, particularly suitable for analyzing data sets characterized by the uncertainty caused by random factors. Specifically, in the present paper, the fuzzy C-means algorithm in the MATLAB code has been used (Termini & Viviani 2012). The algorithm consists of choosing an arbitrary initial fuzzy partition (or an arbitrary number of centroid partitions) and then iteratively minimizing the difference between two successive partitions. Such a procedure consists of minimizing an objective function that represents the distance from any given data point to a cluster center weighted by that data point's membership grade. Thus, the algorithm requires the definition of a ‘membership’ function for each cluster and the choice of an initial partition. The choice of the initial condition is particularly important for efficient model calibration/scenario runs.
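The iteration described above can be sketched in a few lines. The version below is a plain-Python stand-in for the standard fuzzy C-means update, not the MATLAB routine used in this work; the function name, the fuzzifier `m = 2`, the random initial partition, and the stopping tolerance are all assumptions made for illustration.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Cluster `points` (list of equal-length lists) into fuzzy groups.

    Returns (centers, memberships, labels); memberships[i][c] lies in [0, 1].
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Arbitrary initial fuzzy partition: random rows normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    for _ in range(max_iter):
        # Cluster centers: means of the points weighted by membership**m.
        centers = []
        for c in range(n_clusters):
            w = [u[i][c] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / w_sum for k in range(dim)]
            )

        # Membership update from the Euclidean distance to every center.
        new_u = []
        for p in points:
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            inv_sum = sum(inv)
            new_u.append([v / inv_sum for v in inv])

        # Stop when two successive partitions differ by less than `tol`.
        change = max(
            abs(new_u[i][c] - u[i][c]) for i in range(n) for c in range(n_clusters)
        )
        u = new_u
        if change < tol:
            break

    # Hard assignment: each point goes to its maximum-membership cluster.
    labels = [row.index(max(row)) for row in u]
    return centers, u, labels


# Hypothetical residual-chlorine values (mg/l), one feature per pipe.
samples = [[0.10], [0.12], [0.11], [0.50], [0.52], [0.51]]
centers, memberships, labels = fuzzy_c_means(samples, 2)
```

Each row of `memberships` sums to one, and the final `labels` list applies the maximum-membership rule to turn the fuzzy partition into a zonation.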

In agreement with Fisher *et al.* (2011), the initial partition could be assumed on the basis of the chlorine concentration resulting from the initial chlorine dose. To apply a criterion that is not dependent on the initial chlorine dose, in this work, the ‘membership’ condition is defined by the matrix, *U*_{rc}, which consists of a number of rows, *r*, equal to the number of grouping parameters (i.e., the number of measured residual chlorine values at the consumption times *t*: *r* = 1, …, *t* × *n*; with *n* = the number of pipes) and a number of columns, *c*, equal to the number of initial parameters, *nP*, arbitrarily chosen (i.e., *c* = 1, …, *nP*). The elements of matrix *U*_{rc} have variable values between 0 and 1, depending on the degree of ‘membership’ of each pipe in each time considered (0 = no membership). Thus, once the matrix is defined, the ‘membership’ condition for a particular cluster is identified by the maximum value of the degree of ‘membership’ of each pipe of the network.

This fuzzy-clustering procedure would also allow a user to take into account both the invariance of parameter *k*_{b} over time and with demand and the introduction of a reacting parameter to estimate the variation of *k*_{b} caused by re-chlorination at every boosting location, according to specific second-order reaction equations (Kastl *et al*. 1999).

The advantage of the proposed cluster procedure is that it is a user-friendly, modular system for the dynamic simulation of complex multiple reactions, which may or may not be linked by hydraulic flows. Once the clusters are found, the optimal parameter values are derived by minimizing the sum of the squared difference between the measured chlorine concentrations and the model estimates at corresponding times during the simulation.
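The least-squares step can be illustrated with a deliberately simplified stand-in for the network model. The sketch below assumes plug flow with known travel times and a single first-order rate *k*_{b} + *k*_{w}/*r*_{h} (mass-transfer limitation is ignored), and it uses a golden-section search rather than the optimization code adopted in this work. All function names and the synthetic ‘measurements’ are hypothetical.

```python
import math


def simulate(c0, travel_times_d, k_b, k_w, r_h):
    """Chlorine at each site after first-order decay over its travel time (days)."""
    rate = k_b + k_w / r_h  # simplified overall rate, 1/d
    return [c0 * math.exp(-rate * t) for t in travel_times_d]


def sum_squared_error(measured, simulated):
    """Sum of squared differences between measured and simulated values."""
    return sum((m - s) ** 2 for m, s in zip(measured, simulated))


def calibrate_k_w(measured, c0, travel_times_d, k_b, r_h, lo=0.0, hi=2.0, tol=1e-7):
    """Golden-section search for the k_w (m/d) that minimizes the squared error."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0

    def objective(k_w):
        return sum_squared_error(measured, simulate(c0, travel_times_d, k_b, k_w, r_h))

    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if objective(c) < objective(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2.0


# Synthetic check: generate data with a known k_w, then try to recover it.
times = [0.01, 0.02, 0.05, 0.10]            # travel times in days (hypothetical)
observed = simulate(0.5, times, 0.55, 0.625, 0.05)
k_w_fit = calibrate_k_w(observed, 0.5, times, 0.55, 0.05)
```

In the zoned case the same objective would be minimized over one *k*_{w} value per cluster, which is where reducing the number of clusters keeps the search tractable.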

## PARAMETER CALIBRATION

### Roughness coefficient of pipes

The water pressure in each node and flow rate in each pipe of the network are estimated as the result of the simulation problem (1)–(3) (hydrodynamic module). To obtain reliable results, the parameter *ɛ _{ij}* has to be calibrated.

The calibration of parameter *ε* is performed by solving the inverse problem and imposing the following objective function (F.O.): (7)

where *h*_{i,m} and *h*_{i,s}, respectively, represent the measured and estimated pressure heads at node *i*, *σ*_{h} is the mean square error between the measured and simulated water heads, and *n*_{t} is the number of nodes controlled by a remote control system. The decision variable of Equation (7) is the unknown parameter *ε*, and the state variables are the measured pressure heads. The optimization problem in Equation (7) is solved by applying the numerical code previously elaborated by Tucciarelli & Termini (1998). For the simulations performed in the present work, the daily water consumption law for each node is determined by considering the daily variation coefficient reported in Figure 1.
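Returning to the objective function of Equation (7), one form consistent with the symbol definitions above is the minimization of the mean squared difference between measured and simulated heads. The expression below is a sketch under that assumption; in particular, the normalization by *n*_{t} and the absence of a square root are guesses, not a transcription of Equation (7).

```latex
\mathrm{F.O.} = \min_{\varepsilon} \, \sigma_h ,
\qquad
\sigma_h = \frac{1}{n_t} \sum_{i=1}^{n_t}
  \left( h_{i,m} - h_{i,s}(\varepsilon) \right)^{2}
```

Here the simulated head *h*_{i,s} depends on *ε* through the solution of the hydrodynamic module, so each evaluation of the objective requires one hydraulic simulation.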

The daily water consumption law of Figure 1 was determined by applying a stochastic model calibrated in previous studies conducted by Fontanazza *et al*. (2006a, 2006b) by using measured water consumption data collected in three residential buildings in the same network being considered in the present work. The water consumption data were obtained using a volumetric pulse flow meter installed at the building's service connection to the distribution network considered for the application.

However, by installing an automated meter reading (AMR) device on the user's property, it would be possible to track the user's water consumption up to four times daily and to use the real-time data instead of the water consumption law (Figure 1). Thus, the use of an AMR system would allow a network manager to perform simulations using real-time data on a daily basis.

### Water-quality parameters

The calibration of the water-quality parameters, *k*_{b} and *k*_{w}, is performed by solving the following optimization problem: (8)

where *C*_{i,m} and *C*_{i,s} are, respectively, the measured (for example, by a remote control system) and calculated chlorine concentrations, *σ*_{C} is the mean square error between the measured and calculated chlorine concentrations, and *n*_{m} is the number of measurements.

## CASE STUDY

### Characteristics of the network

An analysis has been conducted in the sub-network ‘Oreto-station’ in the city of Palermo (Italy). A plan view of the network is given in Figure 2. This network includes 394 HDPE (high-density polyethylene) pipes (with diameters ranging from 110 to 225 mm) and 263 nodes. The network is fed through two inflow pipes with diameters of 500 mm, connecting inflow nodes 1 and 107 to the head reservoir (S. Ciro Basso – 54 m a.s.l.). The sub-network is provided with a remote control system that makes it possible to monitor the water pressure (and flow rate) at eight internal nodes (nodes 1, 47, 57, 107, 134, 162, 210, and 248) of the network. The chlorine concentration is also recorded through a chlorine residual meter installed at these nodes. Both the water pressure and the chlorine concentration data were made available by the municipal water service management company (AMAP s.p.a.) for the network.

To increase the number and the spatial distribution of the field data, the residual chlorine was also measured at other suitably selected sites of the network. The sampling sites are also reported in Figure 2 (grey/red circles). The sampling sites were selected to include commercial users with direct links to the network and residential users without domestic storage tanks (the presence of such tanks could affect the results because of a stagnant water condition). The residual chlorine measurements were carried out using two multi-parameter photometers made by Hanna Instruments (mod. HI 9310). Three field measurement campaigns were conducted. The first (13 October 2004) included 13 measurement sites, while the second (8 February 2005) and third (14 March 2005) included 21 measurement sites each. During each campaign, four simultaneous series of measurements were performed, spaced 3 hours apart (i.e., at 09:00, 12:00, 15:00, and 18:00). In total, 220 measurements were obtained (13 × 4 + 21 × 4 + 21 × 4).

Because the examined network is composed of new pipes made of the same material (HDPE), a unique value for the roughness coefficient, *ε*, was assumed for the whole network.

As a result of the optimization problem in Equations (1) and (2), a Hazen–Williams coefficient equal to 140 was obtained.

### Application and results

To assess the goodness of the procedure, first, an empirical clustering technique based on the physical and hydraulic characteristics of the network and previously developed by Morga *et al*. (2007) was employed. Then, the results obtained were compared with those determined by applying the proposed procedure. In the following analysis, a constant value of *k*_{b} has been used. In agreement with other authors (Lee *et al*. 2004), it has been assumed that *k*_{b} = 0.55 d^{−1}.

#### Empirical clustering and calibration

The analyzed network consists of new pipes made of the same material. Thus, in accord with Morga *et al*. (2007), the empirical zonation of *k _{w}* was performed only by considering the pipe's diameter and the flow velocity.

Some studies (Levi & Mallevialle 1995; Lu *et al*. 1995; Kiènè *et al*. 1998) have shown an inverse proportionality between the chlorine concentration and the ratio between the wet surface and the water volume per unit pipe length. This result might indicate a high consumption of chlorine in pipes with a small diameter. On the other hand, it has been observed that a high consumption of chlorine occurs with low flow velocities.

Thus, for the examined network, the first zonation map was produced by considering the distribution of the diameters of the pipes, as reported in Figure 3(a).

Then, for each considered consumption time (at 09:00, 12:00, 15:00, and 18:00), the flow velocity in each pipe was estimated by solving the hydrodynamic module shown in Equations (1)–(3). Thus, the pipes were grouped by fixing the variability ranges for the flow velocity, and a second zonation map was produced. Finally, the solution of the water-quality module, Equations (4)–(6), under the assumption of a unique value of *k*_{w} = 0.625 m/d (i.e., the mean value of *k*_{w} obtained from Equation (8) – see previous section) gave the chlorine concentration in each pipe of the network. Thus, the pipes were again grouped by fixing the variability ranges of the chlorine concentration so as to obtain a third zonation map. A comparison between the flow velocity–zonation map and the chlorine–zonation map allowed us to verify that, in accord with the previous literature results (Levi & Mallevialle 1995; Lu *et al*. 1995), the flow velocity–zonation and chlorine–zonation are very similar. Thus, a final velocity/chlorine zonation map for the network was obtained for each considered consumption time. As an example, Figure 3(b) reports the velocity/chlorine zonation map obtained at 09:00.

The final zonation of the network was defined by comparing the velocity/chlorine zonation map with the distribution of the pipe diameters. Finally, five zones were identified for the examined network (zone 1 = Policlinico, zone 2 = Archirafi, zone 3 = Perez, zone 4 = Oreto, and zone 5 = Stazione), as Figure 4 shows; the corresponding variability ranges of the flow velocity and chlorine concentration are reported in Table 1. From Table 1, it should be noted that, although zones 2 and 3 are characterized by the same flow velocity and chlorine concentration variability ranges, two different zones were considered because they were characterized by different pipe diameter distributions.

Then, the calibration of the ‘zonated’ *k _{w}* parameters was performed by using the measured chlorine data and solving the optimization problem, Equation (8).

Finally, to investigate the relation between the considered variables (the pipe's diameter and pipe's flow velocity) and parameter *k*_{w}, the weighted averages of the diameters and flow velocities of the pipes included in the *p*th zone were estimated (the weight was defined on the basis of the water volume in each included pipe). The regression analysis, performed using these two weighted averages as independent variables, yielded the following relationship (with *R*^{2} = 0.998): (9)

where the coefficients *α*, *β*, *ξ*, *x*, and *y* have to be determined by using experimental data. For the considered study case: *α* = 35.419, *β* = –259.820, *ξ* = –18.515, *x* = 0.308, and *y* = 6.313.

The high value of the determination coefficient *R*^{2} confirms the strong relation between the parameter *k*_{w} and the aforementioned variables (pipe's diameter and flow velocity), in agreement with the literature findings (Levi & Mallevialle 1995; Lu *et al*. 1995; Kiènè *et al*. 1998). Equation (9) suggests that the spatial distribution of the unknown parameter, *k*_{w}, could be defined on the basis of this empirical relationship and by using simulated data. However, it should be clear that the use of Equation (9) is restricted to the data considered in the present work; further data would be necessary to generalize the expression.

#### Clustering by fuzzy logic and calibration

The empirical clustering procedure previously applied presents two main limitations: (1) the resulting zonation depends on the initial condition (determined by assuming a unique value of *k _{w}* for the entire network), and (2) it does not allow the user to work with a sufficient level of automation. Thus, clustering by fuzzy logic has been applied.

The algorithm requires the definition of a ‘membership’ function for each cluster and the choice of an initial partition. To assess the advantages of the proposed procedure, two initial partitions were considered for the application. First, the initial partition of Figure 4 (empirical initial partition) was applied, and the membership condition was defined on the basis of the estimated chlorine and flow velocity values. Then, the procedure was applied by considering an initial partition based on the definition of a ‘membership’ function.

As a result of the first application (i.e., by considering the initial empirical partition), four zones for parameter *k*_{w} were determined. The values of *k*_{w}, calibrated as the solution of Equation (8), are reported in Table 2.

Then, the procedure was applied again by considering, as the initial partition, the one determined through a specific ‘membership’ condition for each pipe of the network. The ‘membership’ condition was defined by the matrix, *U*_{rc} (here *t* = 4, so that *r* = 1, …, 4 × *n*, and *c* = 1, …, *nP*). For the examined network, it was assumed that *nP* = 10. Thus, once the matrix is defined, the ‘membership’ condition for a particular cluster is identified by the maximum value of the degree of ‘membership’ of each pipe of the network.

As a result of the algorithm application, four zones were identified by grouping clusters 3, 8, and 9; clusters 4, 5, and 6; and clusters 1 and 10 of the initial partition. Then, the calibration of parameter *k*_{w} (indicated as *k*_{w,c}) was performed using Equation (8). Table 2 also reports the final zonation of the network and the value of parameter *k*_{w,c} determined for each zone.

Table 2 highlights that the values of parameter *k*_{w,c} obtained by applying the membership condition are equal to those obtained by applying the empirical zonation.

### Sensitivity analysis of chlorine concentration with water-quality parameters

#### Parameter *k*_{b}

To verify the sensitivity of the chlorine concentration to parameter *k*_{b}, Equation (8) was solved by assuming a constant value for parameter *k*_{w} and a value of *k*_{b} that varied in the range of 0.1–1.398 d^{−1} (Furlani & Marinelli 1998). As a result, very similar values for *σ*_{C} were obtained for different values of *k*_{b}. This can be observed from Table 3, which reports the values of *σ*_{C} obtained for different values of *k*_{b}. Thus, a low sensitivity of the chlorine concentrations to *k*_{b} was observed.

#### Parameter *k*_{w}

To verify the sensitivity of the residual chlorine to the spatial variability of parameter *k*_{w}, Equation (8) was solved separately for the data measured during each field campaign, assuming a constant value for coefficient *k*_{b} and varying the value of *k*_{w} (which was the same for all pipes of the network). In agreement with the results obtained in previous works (Rossman *et al*. 1994; Furlani & Marinelli 1998), the variation in *k*_{w} was in the range 0.1–1.15 m/d. Table 4 reports the values of *k*_{w} and the corresponding values of *σ*_{C} obtained, for each case, as a result of the optimization problem of Equation (8). It can be noted from Table 4 that *k*_{w} assumes very similar values (it varies between 0.6 and 0.65 m/d), but the mean squared error assumes increasing values. This confirms that the spatial variability of *k*_{w} can produce different values of residual chlorine in a network.
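The idea of calibrating a single network-wide *k*_{w} by minimizing the error between measured and simulated chlorine can be sketched as a one-dimensional grid search. This is only an illustration under strong assumptions: the toy model `simulate` (first-order decay with a bulk rate plus a wall rate scaled by a hydraulic radius) stands in for the EPANET-based simulation and is not Equation (8). The grid search is a swapped-in technique, not the optimization algorithm used in this work. All names and numerical values are hypothetical, and the "measurements" are synthetic.

```python
import math


def simulate(k_w, k_b, c0, tau, r_h):
    """Toy stand-in for the network simulation (assumption): first-order decay
    with bulk rate k_b (1/d) and wall rate k_w / r_h (1/d) over time tau (d)."""
    return c0 * math.exp(-(k_b + k_w / r_h) * tau)


def rmse(measured, simulated):
    """Root mean squared error between two equally long sequences."""
    n = len(measured)
    return math.sqrt(sum((m - s) ** 2 for m, s in zip(measured, simulated)) / n)


def calibrate_k_w(measured, taus, k_b, c0, r_h, k_min=0.1, k_max=1.15, steps=106):
    """Grid search over [k_min, k_max] (m/d) for the k_w that minimizes the RMSE.
    With the defaults, the grid spacing is 0.01 m/d."""
    best = None
    for j in range(steps):
        k_w = k_min + (k_max - k_min) * j / (steps - 1)
        sim = [simulate(k_w, k_b, c0, t, r_h) for t in taus]
        err = rmse(measured, sim)
        if best is None or err < best[1]:
            best = (k_w, err)
    return best


# Synthetic data (hypothetical): residence times in days, and concentrations
# generated by the toy model itself with k_w = 0.6 m/d.
taus = [0.02, 0.05, 0.10]
measured = [simulate(0.6, 0.5, 0.5, t, 0.05) for t in taus]

k_w_best, sigma_best = calibrate_k_w(measured, taus, k_b=0.5, c0=0.5, r_h=0.05)
print(k_w_best, sigma_best)
```

Because the synthetic data were generated with the same toy model, the search recovers the generating value. With field data the minimum error would not be zero, and its size would indicate how well a single *k*_{w} describes the whole network.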

## CORRELATION ANALYSIS FOR SAMPLING SITES AND VALIDATION

To further verify the zonation determined by applying the aforementioned procedure, the corresponding associations among sampling sites were checked using a correlation analysis.

In fact, the sampling sites with a high correlation coefficient should be included in the same zone of the network (Lee *et al*. 2004). For this purpose, a correlation matrix was generated for all the sampling sites. Each term of the correlation matrix is the correlation coefficient of the chlorine concentration between one sampling site and another. The calculated correlation matrix is shown in Table 5. It is a square matrix (21 × 21) with diagonal terms equal to 1.0; the maximum value (less than 1.0) of the correlation matrix has been boxed. Table 5 shows that the sampling sites with a high (boxed) correlation coefficient belong to the same zone of the network. Figure 5 reports the final zonation and the positions of the sampling sites. From Figure 5, it can be seen that the sampling sites that belong to the same zone are characterized by high values of the correlation coefficient. As an example, sampling sites P5 and P13 have a correlation coefficient equal to 0.98 and belong to the same zone.
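A correlation matrix of this kind can be built as in the following minimal sketch. It assumes the Pearson coefficient, which the text does not state explicitly, and it uses made-up chlorine series for three hypothetical sites (`A`, `B`, `C`) rather than the data of Table 5.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(series):
    """Return the site names and the square matrix of pairwise coefficients,
    with diagonal terms set to 1.0."""
    names = list(series)
    R = [[1.0 if a == b else pearson(series[a], series[b]) for b in names]
         for a in names]
    return names, R


def best_partner(names, R):
    """For each site, the other site with the highest off-diagonal coefficient."""
    partners = {}
    for i, name in enumerate(names):
        j = max((k for k in range(len(names)) if k != i), key=lambda k: R[i][k])
        partners[name] = names[j]
    return partners


# Hypothetical chlorine concentrations (mg/L) over four campaigns.
series = {
    "A": [0.30, 0.25, 0.20, 0.28],
    "B": [0.31, 0.26, 0.19, 0.27],
    "C": [0.10, 0.18, 0.22, 0.12],
}

names, R = correlation_matrix(series)
partners = best_partner(names, R)
print(partners["A"], partners["B"])  # B A
```

In this toy data, sites `A` and `B` rise and fall together, so each is the other's best partner. Under the criterion above, they would be expected to fall in the same zone.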

Then, the goodness of fit between the measured and estimated values of chlorine concentration was verified, using the root mean squared error (*σ*_{c,t}) as the indicator:

$$\sigma_{c,t} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(C_{i,m} - C_{i,s}\right)^{2}} \tag{10}$$

where *N* is the total number of measurements considered. Figure 6 reports the pairs of values (*C*_{i,m} and *C*_{i,s}) and the line of perfect agreement (bisector line). The error bar is defined by the value of *σ*_{c,t}. As shown in Figure 6, with a few exceptions, the points lie close to the bisector line, and *σ*_{c,t} is quite low in comparison to the magnitude of the measured values.
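The check against the line of perfect agreement can be sketched as follows. The example computes the root mean squared error over paired measured and simulated values, then lists the points whose deviation from the bisector line exceeds that error. Treating the error bar as a ±σ band around the bisector is an assumption made for illustration, and the function names and data are hypothetical.

```python
import math


def rmse(measured, simulated):
    """Root mean squared error over N paired values."""
    if not measured or len(measured) != len(simulated):
        raise ValueError("need two non-empty sequences of equal length")
    n = len(measured)
    return math.sqrt(sum((m - s) ** 2 for m, s in zip(measured, simulated)) / n)


def outside_band(measured, simulated, sigma):
    """Indices of points farther than sigma from the line C_m = C_s."""
    return [i for i, (m, s) in enumerate(zip(measured, simulated))
            if abs(m - s) > sigma]


# Hypothetical paired concentrations (mg/L).
measured = [0.30, 0.22, 0.15, 0.40]
simulated = [0.28, 0.25, 0.15, 0.30]

sigma = rmse(measured, simulated)
exceptions = outside_band(measured, simulated, sigma)
print(exceptions)  # [3]
```

Only the last pair deviates by more than the band width here, which mirrors the "few exceptions" pattern described for Figure 6.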

## CONCLUSION

This work considered the simulation of the decay law of chlorine in drinking water systems. In particular, attention was given to the identification of the spatial diversity of the water-quality parameters. The criterion for grouping the water-quality parameters was based on fuzzy logic. An integrated numerical technique, including interfaced codes (commercial and/or purpose-built), was applied to determine the spatial diversity of the residual chlorine in a sub-network in Palermo City (Italy). The procedure included the EPANET code for the estimation of flow rates, water pressure, and the transport of substances in the water of the network (hydrodynamic and water-quality modules); a fuzzy C-means algorithm for clustering the pipes; and a previously developed optimization algorithm (Tucciarelli & Termini 1998) for calibrating the unknown parameters. The advantage of the proposed procedure in comparison with others proposed in the literature is that it allows the user to automatically identify the zonation parameters of the network without any arbitrary assumption. The initial partition of the fuzzy-clustering algorithm is based on the identification of the membership condition for a ‘cluster’.

The procedure has the option of by-passing the simulation process to determine the water pressure and flow rate at ‘control nodes’ of the network and applying the water-quality parameter calibration by using real-time measured data. Thus, the advantage of the proposed clustering procedure is that it is a user-friendly, modular system for the dynamic simulation of complex multiple reactions, which may or may not be linked by hydraulic flows.

The application of this procedure to a real-scale network yielded a zonation of the quality parameter *k*_{w} that was essentially equal to that obtained by considering, as the initial partition, the empirical zonation determined by the empirical relationship between the residual chlorine and the flow velocity.

To assess the procedure, the associations among sampling sites were checked using a correlation analysis. As a result, it was shown that the sampling sites with high correlation coefficients belonged to the same defined zone of the network.

- First received 8 August 2013.
- Accepted in revised form 11 October 2014.

- © IWA Publishing 2015
