Data-driven water distribution system analysis – exploring challenges and potentials from smart meters and beyond
Abstract
The availability of clean drinking water at any time is often taken for granted. However, the reliable supply of drinking water is challenged by various threats, such as global warming and ageing infrastructure, which put immense pressure on the drinking water systems as we know them today. Consequently, utilities, technology providers and researchers seek optimised and new approaches to maintaining and improving the quality of the delivered water. In an age of digitalisation, data-driven approaches are becoming increasingly important, as they have demonstrated various benefits for the operation and design of water distribution networks. However, the increased collection and application of data also pose a major challenge to the water sector.

This PhD project developed methods to help utilities validate and apply their data in novel ways by analysing ‘real world’ data obtained from five Danish utilities, applied in six case studies. The study was structured into three parts: 1) data collection; 2) data validation and reconstruction; and 3) data application.

Data-collection devices such as smart meters are increasingly deployed throughout water distribution systems. When utilities introduce smart meters, the choice of sampling resolution involves a trade-off between the applicability of the collected data and transmission costs. Analysis of a district metered area with smart meters installed revealed that common sampling resolutions between 1 and 24 hours are sufficient for water loss assessments, provided utilities have representative demand patterns of their network available. Sampling resolutions finer than 1 hour, however, are potentially important for obtaining reliable water quality simulations.

Automatic validation and reconstruction of the collected data are of paramount importance for utilities. The PhD project developed a systematic approach for categorising anomalies. Four categories were introduced: Type 0 describes system anomalies; Types 1 and 2 describe sensor data of low quality; and Type 3 covers sensor data anomalies that capture actual, though unusual, events in the water distribution network. To identify anomalies of Types 1 and 2, seven validation tests were developed. Analysis of pressure and flow data sets from three Danish utilities revealed a large proportion of anomalies, with an average of 10% missing data and up to 35% anomalies of Types 1 and 2 in a utility’s pressure data sets. These high numbers also emphasised the need for reconstruction processes to generate the reliable data streams required by data applications. An example was presented in which artificial neural networks were used to fill in missing data and to further validate dubious observations.

The collection of data from water distribution systems is not a new concept, but large amounts of data (such as temperature data) are often left unused due to a lack of evidence of successful applications. To demonstrate the benefits of temperature data, a temperature model and a hydraulic model were combined to identify the status and location of valves in the network. This novel approach, together with field tests in the network, unexpectedly revealed various anomalies of Type 0 in the utility’s asset database, ultimately casting doubt on the validity of the hydraulic model. As long as such anomalies prevail in the data sets, advanced data-driven applications cannot be applied successfully.
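To give a flavour of the validation tests mentioned above, a minimal sketch of two common rule-based checks, an out-of-range test and a flat-line test, is shown below. The thesis’s seven tests are not reproduced here; the function name, bounds and window length are hypothetical and would be tuned per sensor in practice.

```python
import numpy as np

def flag_anomalies(values, lower=0.0, upper=10.0, flat_window=12):
    """Flag candidate Type 1/2 anomalies in a pressure series (bar).

    Illustrative only: the bounds and window length are hypothetical.
    Returns a boolean mask that is True wherever a sample is dubious.
    """
    values = np.asarray(values, dtype=float)
    flags = np.isnan(values)  # missing samples are anomalous by definition

    # Out-of-range test: physically implausible readings.
    flags |= (values < lower) | (values > upper)

    # Flat-line test: a frozen sensor repeats the exact same value.
    for start in range(len(values) - flat_window + 1):
        window = values[start:start + flat_window]
        if np.all(window == window[0]):
            flags[start:start + flat_window] = True

    return flags
```

In practice, such rule-based tests would be complemented by statistical or model-based tests, with flagged values passed on to a reconstruction step such as the neural network approach described above.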
Another issue in that case study was the low quantity of applicable temperature data. Smart meter temperature data, potentially available in every household, can be used to overcome this challenge. In another case study, the simulated temperature throughout a district metered area showed satisfactory resemblance to smart meter temperature data (average root mean square error of 0.9 °C). This highlights the potential of using smart meter temperature data for more advanced applications, such as leakage detection and valve status detection.

Silo thinking has traditionally been a common feature of the water sector, and the value of water supply data is thus often overlooked in external applications. In one case study, the effect of deploying heat pumps on the water distribution network mains was assessed as a supplement to the district heating system of Copenhagen. A net heat extraction potential of 20.7 MW was estimated. Moreover, the heat extraction increased the share of users complying with an upper temperature limit of 12 °C from 41% to 81% during August. In another case study, smart meter water consumption data were linked to an urban drainage model to compare simulated wastewater flows with in-sewer observations. The in-sewer observations were found to be erroneous, and the smart meter data were deemed the more valid basis for estimating dry weather flow.

Overall, the project showed that many anomalies prevailing in utilities’ asset databases and sensor data are first discovered through the application of the data. As long as utilities cannot maintain a high level of data reliability, it is doubtful whether more sensors will increase their understanding of their systems; the true potential of the data will not be unlocked until a high level of data reliability is secured. In the coming years, utilities, technology providers and researchers should jointly identify methods for reducing the uncertainties prevailing in asset and sensor data, enabling the sector to reach higher levels of digital maturity.
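For reference, the root mean square error used to compare simulated and observed temperatures above is the standard metric; a minimal sketch is given below. The names are illustrative, and averaging the per-meter RMSE across the district metered area is an assumption about how the reported 0.9 °C figure was aggregated.

```python
import numpy as np

def rmse(simulated, observed):
    """Root mean square error between paired temperature series (°C)."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    valid = ~np.isnan(simulated) & ~np.isnan(observed)  # skip data gaps
    return float(np.sqrt(np.mean((simulated[valid] - observed[valid]) ** 2)))

# Hypothetical usage: average the per-meter RMSE over all smart meters in
# the area (sim and obs map meter id -> time-aligned temperature series).
# area_rmse = np.mean([rmse(sim[m], obs[m]) for m in sim])
```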