Data-Driven Financing

June 9, 2020
Machine learning helps set stormwater utility rates

In the rolling hills 45 minutes west of the Dallas/Fort Worth metroplex sits the City of Mineral Wells. A small community of about 15,000 people, the city has suffered more than its fair share of flood-related damage over the last decade. The community’s stormwater infrastructure has long been in need of expansion, repair, and upgrades. To help finance these improvements, the city contracted Lockwood, Andrews & Newnam Inc. (LAN), a national planning, engineering, and program management firm, and NewGen Solutions, a financial management consulting firm, to undertake a stormwater utility fee study. Using the results of the study, the city would be able to institute a fee schedule that would capture the funds necessary to make desperately needed stormwater infrastructure improvements.

Stormwater utility fees have become an increasingly popular method of capturing funding for stormwater upgrades and maintenance. A stormwater utility fee is much like the fee for other public utilities. Consumers pay a fee that corresponds to the amount of stormwater service they use. A consumer who sends a large amount of runoff to the stormwater system is a heavier user of the system than a consumer who sends little. However, unlike other public utilities, there is no metering device that captures utility use at the consumer level. To calculate stormwater utility consumption, another method must be used.

Impervious surface coverage is a frequently used proxy for stormwater utility consumption. Areas of impervious surface, such as parking lots, driveways, and roofs, do not absorb water. Stormwater quickly runs off impervious surfaces into the stormwater system. In contrast, areas of pervious surface like grass, dirt, or forested lands absorb water. By using impervious surface coverage to determine relative contributions, a community can establish an impartial, quantifiable system for assessing a stormwater utility fee. The impervious surface dataset forms the backbone of the stormwater utility fee.

Impervious Surface & Machine Learning
To generate this impervious data set, the project team used supervised machine learning. Machine learning is an emerging technology that shows tremendous promise in the field of civil engineering. We enjoy the benefits of machine learning on a daily basis. Things like GPS directions, Facebook’s “People You May Know,” and even email spam filters take advantage of machine learning to deliver a better user experience. All of them examine large quantities of data and then identify patterns in the data: the fastest route, an old friend from high school, or an email that can be ignored.

Machine learning algorithms can be grouped into two classes: supervised and unsupervised learning. In supervised learning, training input data is coupled with an output. An email spam filter provides a good example. When an email is deleted without having been read, it provides an example of the type of content that the user is not interested in. The training data (the inbox) is coupled with instruction (read or delete). With enough training, the spam filter will learn to distinguish spam emails from non-spam emails.

The impervious surface dataset was derived from National Agriculture Imagery Program (NAIP) aerial imagery. NAIP imagery was chosen for its reasonably high resolution (it can resolve areas less than 3 square feet) and for having four-band imagery. Four-band imagery includes the standard three bands (red, green, and blue) plus a fourth band in the near-infrared. This fourth band is important because it allows for a fourth dimension of discrimination when comparing land cover classes.

Several different machine learning training datasets were assembled. The datasets were intended to complement each other, so that weaknesses in one were offset by the strengths of the others. In all cases, the algorithms were trained using supervised machine learning. The algorithms were provided operator-defined examples of different land cover classes (e.g., dirt, pavement, and grass). From those examples, the algorithms were able to develop statistical definitions of each land cover class based on the red, green, blue, and infrared values in the imagery.
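
A minimal sketch of this kind of supervised pixel classification follows. The article does not name the algorithm used, so a nearest-centroid (minimum-distance-to-mean) classifier stands in for it here. The function names and the sample band values are invented for illustration and are not project data.

```python
from math import dist
from statistics import mean

def train_centroids(samples):
    """Build a per-class mean (R, G, B, NIR) vector from operator-defined examples.

    samples maps a land cover class name to a list of 4-band pixel tuples.
    """
    return {
        label: tuple(mean(band) for band in zip(*pixels))
        for label, pixels in samples.items()
    }

def classify(pixel, centroids):
    """Assign the class whose mean vector is closest in 4-band space."""
    return min(centroids, key=lambda label: dist(pixel, centroids[label]))

# Hypothetical training pixels (R, G, B, NIR); values are made up.
training = {
    "Grass":    [(60, 110, 50, 200), (70, 120, 60, 210)],
    "Pavement": [(120, 120, 120, 90), (130, 128, 125, 100)],
    "Dirt":     [(140, 110, 80, 140), (150, 120, 90, 150)],
}

centroids = train_centroids(training)
label = classify((128, 125, 120, 98), centroids)  # "Pavement"
```

In this toy data the near-infrared value is what most strongly separates the classes, which is the role the fourth band plays in the description above.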

The first impervious surface dataset created, the Simple Dataset, used seven different operator-defined land cover classifications. The seven different classes ultimately were reclassified into either impervious or pervious land cover (shown in Table 1). 

This method produced an impervious surface dataset that was low in uncertainty but high in false positives for impervious cover. The algorithm assigned a land cover classification in almost all areas, leaving very few areas of indeterminate classification. While nearly adequate, the assignment of land cover classes was not perfect; the algorithm incorrectly labeled some areas as impervious cover. In this case, uncertainty does not refer to the likelihood that a classification is correct, but rather to whether a classification has been made at all.

To complement the Simple Dataset, a refined dataset was developed. This second dataset, the Composite Dataset, was developed by combining three “component datasets.” The three component datasets were assembled to minimize incorrect classification by taking pairs of spectrally similar land cover classes and training algorithms to identify those in isolation. For example, dirt and pavement can appear spectrally similar; both are often shades of grey or brown. While spectrally similar, one is pervious and one is impervious, so it is critical to distinguish the two. This comparison provides a good example of the benefit provided by the near-infrared data. If two classes are statistically indistinguishable in the visible spectrum, the near-infrared band may provide the divergence needed to discriminate between the two classes.

Rather than using seven classes in the training data, these component datasets used only three classes each: the two paired classes of interest and a third class containing all other land cover classes. For example, when isolating areas of pavement from areas of dirt, the three classes used were Dirt, Pavement, and everything that was neither dirt nor pavement. In this manner, the algorithms can isolate dirt and pavement without the need to identify five additional land cover classes. The five-in-one class simply provides a background against which the classes of interest can be contrasted. The three component datasets were assembled as described in Table 2.

In the component datasets, it was assumed that there would be low incorrect classification in the classes of interest, while there would be significant error outside the classes of interest. For example, in component set 1, areas of dirt and pavement (as observed in aerial imagery) were correctly identified as such while many grassy areas (part of the background class) were incorrectly identified as dirt. This expected relationship between error and land cover classes was observed and was present in all three component datasets. While this may sound problematic, this relationship allowed for a simple and effective classification system for the ultimate pervious/impervious classification.

The three component datasets were combined into the Composite Dataset in such a manner that the contribution from each component dataset was carried over. Each pixel in the Composite Dataset carried three values, one from each component set. These three values were examined to produce an impervious/pervious classification. Depending on the combination, the pixels were reclassified as impervious, pervious, or undefined. For example, if a pixel was identified as Dirt in Set 1 and as background in Sets 2 and 3, it was reclassified as pervious. In any case where a single component set returned a positive identification and the other two component sets classified the area as “background,” it was taken to represent a positive identification and the classification scheme shown in Table 1 was followed.

In many cases, areas were simultaneously classified as two or three different land cover classes. To resolve these conflicts, the particular combination of results was examined. In theory, 27 different combinations of results were possible (three possible outcomes in each of three component sets). In practice, the Composite Dataset returned only 21 combinations. Table 3 provides an example of some combinations and their resolutions.

The majority of the Composite Dataset had no conflict and was easily classified. Areas that were simultaneously classified as two or more pervious land cover classes were reclassified as pervious. Similarly, areas that were simultaneously classified as two or more impervious land cover classes were reclassified as impervious. Finally, any areas with disagreeing classifications or no positive identifications were classified as undefined.
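
The resolution rules above can be expressed compactly if each component set's result is first reduced to pervious, impervious, or background using the Table 1 scheme. The sketch below is one assumed encoding of that logic, not the project's implementation; the function and label names are hypothetical. It also enumerates the 27 theoretical combinations.

```python
from itertools import product

PERVIOUS = "pervious"
IMPERVIOUS = "impervious"
BACKGROUND = "background"
UNDEFINED = "undefined"

def resolve(set1, set2, set3):
    """Combine three component-set results into one classification.

    Each argument is PERVIOUS, IMPERVIOUS, or BACKGROUND, i.e. the
    component set's land cover class already mapped to its surface type.
    """
    positives = {v for v in (set1, set2, set3) if v != BACKGROUND}
    if len(positives) == 1:
        # One positive identification, or several that agree.
        return positives.pop()
    # No positive identification, or disagreeing identifications.
    return UNDEFINED

# Every theoretical combination: three outcomes in each of three sets.
states = (PERVIOUS, IMPERVIOUS, BACKGROUND)
all_combinations = list(product(states, repeat=3))
resolved = {combo: resolve(*combo) for combo in all_combinations}
```

For example, `resolve(PERVIOUS, BACKGROUND, BACKGROUND)` follows the single positive identification, while `resolve(PERVIOUS, IMPERVIOUS, BACKGROUND)` is left undefined for a later step to fill.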

The resultant Composite Dataset exhibited complementary characteristics when compared to the Simple Dataset described earlier. The Composite Dataset exhibits comparatively high uncertainty (the undefined areas) but comparatively little incorrect classification. To address this, in any location where the Composite Dataset was undefined, the classification from the Simple Dataset was inserted. In this manner, the best part of each dataset, the classification strength of the Composite Dataset and the low uncertainty of the Simple Dataset, was used to produce a useful product. In the few cases where a pixel was unresolved in both the Composite and Simple Datasets, the classification (either pervious or impervious) of the majority of the surrounding eight pixels was used.
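
The two-stage gap filling described above might look like the following sketch. Several details are assumptions the article does not specify: undefined pixels are marked with `None`, edge pixels vote with whatever neighbors they have, and ties or pixels with no classified neighbors are left undefined. The function name is hypothetical.

```python
from collections import Counter

UNDEFINED = None  # assumed marker for an unresolved pixel

def merge_datasets(composite, simple):
    """Fill undefined Composite pixels from Simple, then by neighbor majority."""
    rows, cols = len(composite), len(composite[0])

    # Stage 1: wherever the Composite Dataset is undefined, use the Simple Dataset.
    merged = [
        [c if c is not UNDEFINED else s for c, s in zip(crow, srow)]
        for crow, srow in zip(composite, simple)
    ]

    # Stage 2: pixels unresolved in both take the majority of their 8 neighbors.
    result = [row[:] for row in merged]
    for r in range(rows):
        for c in range(cols):
            if merged[r][c] is not UNDEFINED:
                continue
            votes = Counter(
                merged[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c) and merged[rr][cc] is not UNDEFINED
            )
            ranked = votes.most_common(2)
            # Assign only on a clear majority; ties stay undefined (assumption).
            if ranked and (len(ranked) == 1 or ranked[0][1] > ranked[1][1]):
                result[r][c] = ranked[0][0]
    return result

# "I" = impervious, "P" = pervious; grids are illustrative only.
composite = [["I", None, "P"],
             [None, None, "P"],
             ["P", "P", "P"]]
simple = [["I", "I", "P"],
          ["P", None, "P"],
          ["P", "P", "P"]]
filled = merge_datasets(composite, simple)
```

Votes are read from the stage 1 grid rather than from pixels filled during stage 2, so the result does not depend on the order in which pixels are visited.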

Setting Utility Rates
Once the impervious surface dataset was created, NewGen Solutions ran financial models to determine the appropriate monthly billing amounts for existing utility billing customers. The monthly stormwater utility fee for most consumers increased. This is not surprising; the community had a shortfall in stormwater utility expansion and maintenance funding. The existing stormwater utility billing scheme charges a flat rate of $2.50 for all utility billing customers, regardless of property type, size, or impervious surface coverage. The proposed stormwater utility fee will switch to an Equivalent Residential Unit (ERU) based scheme. The ERU is a unit of area measurement equivalent to the amount of impervious surface found on a typical single-family residential property in the area. In Mineral Wells, LAN found the average single-family residential property had 2,600 square feet of impervious surface on the lot. Accordingly, the ERU was set to 2,600 square feet.

In an ERU-based scheme, the fee is proportional to the number of ERUs on a property. To fully fund the stormwater utility, a per-ERU fee of $3.89 per month will be required. This rate is within the range of $2.50 to $6.50 per ERU per month seen in similar Texas communities. In this ERU-based scheme, all single-family residential utility billing customers will pay a flat rate of $3.89 per month. Charging single-family residential customers a flat rate, rather than a true impervious surface-based fee, is common practice and significantly reduces the administrative burden. All non-single-family residential consumers will be billed on an ERU-based fee schedule.
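
As a rough illustration of the billing arithmetic, the sketch below computes a monthly fee from a property's impervious area using the 2,600-square-foot ERU and the $3.89 rate given in the article. The function name is hypothetical, and billing fractional ERUs without rounding to whole units is an assumption; the article does not say how partial ERUs are handled.

```python
ERU_SQFT = 2600        # impervious area of one ERU in Mineral Wells
FULL_RATE = 3.89       # dollars per ERU per month for full funding

def monthly_fee(impervious_sqft, single_family, rate=FULL_RATE):
    """Return the monthly stormwater fee in dollars.

    Single-family residential customers pay a flat one-ERU fee.
    Other customers pay in proportion to their impervious area
    (fractional ERUs are assumed to be billed as-is).
    """
    if single_family:
        return round(rate, 2)
    erus = impervious_sqft / ERU_SQFT
    return round(erus * rate, 2)

house_fee = monthly_fee(3100, single_family=True)    # flat rate regardless of area
store_fee = monthly_fee(26000, single_family=False)  # 10 ERUs
```

Passing `rate=2.50` models the possible introductory rate while leaving the ERU definition unchanged.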

A handful of very small non-single-family residential properties (those with less than 0.64 ERU of impervious surface) can expect a reduction in stormwater utility fees. All other utility billing customers will see an increase in their fee. To prepare the community for ERU-based billing, the rate may initially be set at $2.50 per ERU. At this rate, single-family residential consumers will not see an increase in their stormwater utility fee. All other consumers would experience the shift to ERU-based billing. Over time, using a publicly announced schedule, the City of Mineral Wells may slowly increase the rate until it reaches the $3.89 rate required for full utility funding. This slowly ramping schedule would allow utility billing customers to make plans and ensure all financial considerations are made with the new utility rates in mind. The City Council of Mineral Wells will have the authority to deliberate and set utility rates.
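
The 0.64 ERU threshold appears to follow from comparing the existing $2.50 flat rate with the full $3.89 per-ERU rate. This derivation is an inference from the figures in the article rather than a calculation the study reports. Here n is a property's impervious surface in ERUs and A is the same quantity in square feet:

```latex
\[
3.89\,n < 2.50
\;\Longrightarrow\;
n < \frac{2.50}{3.89} \approx 0.64\ \text{ERU},
\qquad
A = 2{,}600\,n < \frac{2.50}{3.89} \times 2{,}600 \approx 1{,}671\ \text{sq ft}
\]
```

Under that reading, a non-single-family property with less than roughly 1,671 square feet of impervious surface would pay less than it does today.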

About the Author

Tak Makino

Tak M. Makino, CFM, is a Flood Mitigation Manager at Lockwood, Andrews & Newnam Inc., a national planning, engineering, and program management firm.