The International Stormwater Best Management Practices (BMP) Database is and will continue to be a valuable resource for the stormwater management profession. Its breadth and usefulness will increase over time. An important consideration is how the database should and should not be used. I hope the suggestions in this article will stimulate further observations by others.
Of concern is the manner in which the performance data may be used to draw conclusions about the performance of stormwater treatment systems commonly called structural BMPs. The database authors have prepared several summaries of performance data, such as Table 1 and Figure 1 (Geosyntec and Wright Water Engineers 2007a and 2007b). The database is organized into 14 categories of treatment system types. Examples are biofilters, media filters, retention (wet) ponds, wetlands, and hydrodynamic devices.
This engineer presents several observations related to the structure of the database relevant to its use in generating summaries like Table 1 and Figure 1. They overlap to some extent but are discussed separately. These observations are grouped as:
- Categories containing two or more types of treatment systems that are substantially different
- Facilities of the same type of treatment system designed to substantially different engineering criteria
- Facilities of the same treatment type but with significant differences in site characteristics that may affect performance
- Inclusion of field studies in which the performance is substandard
- Use of percent removal as an indicator of performance
- Use of effluent concentration as an indicator of performance
Let’s cover each of the above issues.
1. Categories containing two or more types of treatment systems that are substantially different. The authors of the database speak to this issue with particular attention to the category of hydrodynamic devices (Geosyntec 2007a, b) stating: “BMPs have been grouped into broad categories. These categories may mask distinctive differences in design and performance in subcategories for multiple BMP types. This is particularly true for the Hydrodynamic Device category, which represents a wide range of various proprietary and non-proprietary device types.”
The category of hydrodynamic devices is a good example of this issue, as noted by the authors of the database. The category currently contains about a dozen different treatment types, of which about two-thirds are manufactured systems. These include swirl concentrators (also known as vortex separators); similarly sized vaults that do not possess swirl motion (e.g., Stormceptor); baffle boxes; oil/grit separators; and oil/water separators. It is reasonable to split the current category into three categories as commonly used by stormwater engineers at this time: oil/water separators, oil/grit separators, and manufactured vaults. I have already discussed elsewhere why the term hydrodynamic separator should be dropped (Minton 2007). The term has never been defined and is a distinction without merit. All of the devices placed in the current category of hydrodynamic devices are simply wet vaults and should be identified as such.
I make a distinction between oil/water and oil/grit separators for sound engineering reasons. Oil/water separators have been uniquely sized for many decades following an established method that predates the use of separators in stormwater treatment (API 1990). There are two types: a large baffled vault commonly referred to as an API separator (API for American Petroleum Institute), and the coalescing plate separator. They are sized to obtain high removals of oil, grease, and total petroleum hydrocarbons (TPH), and with respect to stormwater treatment are suggested for limited applications (Washington 2005). In contrast, what has been historically called the oil/grit separator, a small vault with baffles, is substantially smaller, on the order of one-fifth the volume of an API separator. Yes, they remove some oil and TPH. But then so do grass swales and wet ponds, and they are not called oil/water separators. The baffle box is essentially an oil/grit separator, perhaps sized differently. Regardless, what is called a baffle box should be placed with what we have been calling the oil/grit separator.
Because both the sizing method (model selection) and the application (that is, sediment removal) of manufactured vaults differ from those of oil/water separators, manufactured vaults should be placed in their own category and evaluated separately as to performance. However, it is not unreasonable to consider these as oil/grit separators. It has been concluded by most knowledgeable state agencies that these devices provide a lower level of treatment, perhaps half the common performance goal for total suspended solids (TSS) (for example, Technology Assessment Protocol-Ecology [TAPE] and Technology Acceptance and Reciprocity Partnership [TARP] certification process decisions). Their unit volumes (cubic feet per acre treated) are similar to those for what we have been calling oil/grit (baffle box) separators.
The issue acknowledged by the database authors applies to three additional categories as well–biofilters, media filters, and detention basins. The category of biofilters contains strips and swales, which are designed quite differently. Stormwater enters each in a distinctly different manner: one as sheet flow of modest depth and flow velocity, and one as concentrated flow of greater depth and velocity. This difference likely affects performance per unit area of treatment. The two are as distinctly different as wet ponds and wetlands, which are placed in separate categories. Also in this category are facilities that are an amalgamation of strip and swale, commonly grassed freeway medians. They are essentially very long swales, beyond the usual length of swales designed by standard procedures (using Manning’s equation). There are also two facilities identified as “unimproved ditches,” also with lengths substantially beyond swales of the usual length. Grouping what are essentially four different treatment types into generalized analysis is not appropriate. Each type should either be placed in a separate category or in a separate subcategory within the current category.
In the category of media filters, we find sand filters grouped with filters whose media contain various amendments. Sand filters are generally viewed as incapable of removing dissolved pollutants, although there are data indicating removal of dissolved zinc and copper by mechanisms not yet understood (Caltrans 2004, Portland 2007). Regardless, sand filters should be placed in a separate category given their widespread use, common design criteria, and little expectation for the removal of dissolved pollutants. We also find two vertical gravel filters (one is called a stone swale, illustrating the confusion of terminology), whose data indicate that they do not perform as well as sand filters, which is to be expected. Inclusion with sand filters, therefore, inappropriately skews performance statistics.
Within the category of media filters are essentially amended sand filters: peat filters (two) and bioretention filters (one). Certainly the number of studies of bioretention filters placed in the database will rapidly increase in contrast to those for peat filters (called organic filters by some), an uncommon and what appears to be a little-used treatment system. Given the intense interest in bioretention filters, they deserve their own category. As it is likely the performance of peat (organic) filters will not differ significantly from that of bioretention filters, the former can be placed with bioretention filters. As dry swales are essentially sloped bioretention filters, they should be placed in this category, although none is in the database at this time.
We also find manufactured filters in the media filter category. These systems should have a separate category, as their method of sizing differs substantially from that of the traditional flatbed sand and amended sand filters, including bioretention filters. Most manufactured filters have relatively coarse media with a short residence time of a few minutes, whereas flatbed sand and bioretention filters have fine media and a residence time of several hours.
Grouping of amended filters is complicated by the use of different amendments specific to a targeted pollutant. This complication exists for both public-domain and manufactured filters. Organic amendments, such as compost, are intended for dissolved metals and anthropogenic organics, whereas activated alumina is intended for dissolved phosphorus. It is perhaps unrealistic to place these in separate categories or subcategories at this time. However, it would be beneficial for users of the database if the authors were to note the media type with its intended purpose. Lacking these distinctions, database users may group into an analysis filters whose treatment objectives differ.
The third category to be discussed is detention basins. The category apparently contains two different treatment types: dry extended detention basins and wet extended detention basins. The latter includes a shallow wet pool. Because the wet pool is added with the specific purpose of improving performance over that of the dry basin, these two types should reside in separate categories. We then have four categories of basins: extended detention, wet extended detention, wet ponds, and wetlands. However, there are wetlands with extended detention volumes. As this type is also commonly sized as an extended detention basin (micropool wetland) or as a wet extended detention basin (marsh wetland), it is arguably more appropriate to place these versions in the wet extended category. This gives three categories: dry extended detention, wet extended detention, and wet basins without any flow restriction. Wet ponds and wetlands are in effect merged. This grouping is based on the arguable view that the volume specification is the more dominant indicator of performance (100% live storage, 50% live/50% retained storage, and 100% retained storage) rather than plant density, which varies substantially between the many variants of wet ponds and wetlands. Stormwater engineers want to know the difference in performance between the three types of basins, given that each has other advantages and disadvantages.
2. Facilities of the same type of treatment system designed to substantially different engineering criteria. Modification of the 14 categories as suggested above addresses to some extent the issue of different design criteria in a given category, but the problem remains within individual treatment system types such as swales, strips, and wet ponds. We address this issue here using the category of retention (wet) ponds.
Figure 2 presents data from the database for wet ponds and wetlands, as this engineer questions the distinction. Represented along the x axis is the unit volume ratio of Vb/Vr. This engineering criterion represents the volume of the wet pool divided by the average volume per storm that passed through the particular facility during the monitoring period. The concept of relating the unit volume ratio to performance was developed by the USEPA (1986).
To my knowledge, only one manual in the United States or Canada uses the USEPA methodology. All other manuals use methods that are invalid, using the same volume determined for extended detention basins. This fails to recognize the significant effect of a retained volume on performance.
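The unit volume ratio defined above can be illustrated with a minimal sketch. The function name, the units, and the facility values are hypothetical and are not drawn from the database; the only part taken from the text is the definition itself, wet pool volume divided by the average runoff volume per monitored storm.

```python
def unit_volume_ratio(wet_pool_volume, storm_runoff_volumes):
    """Return Vb/Vr: wet pool volume divided by mean runoff volume per storm.

    Both arguments must use the same volume unit (for example, cubic meters).
    """
    if not storm_runoff_volumes:
        raise ValueError("at least one monitored storm is required")
    mean_runoff = sum(storm_runoff_volumes) / len(storm_runoff_volumes)
    if mean_runoff <= 0:
        raise ValueError("mean runoff volume must be positive")
    return wet_pool_volume / mean_runoff


# Hypothetical facility: 3,000 m^3 wet pool, three monitored storms.
ratio = unit_volume_ratio(3000.0, [1000.0, 1500.0, 2000.0])
print(f"Vb/Vr = {ratio:.2f}")
```

Because the denominator comes from the storms actually monitored, the same basin could show a different ratio in a wetter or drier monitoring period, which is worth keeping in mind when comparing studies.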
Depending on the state BMP manual, the design Vb/Vr ratio ranges from about 1.5 to 6 by happenstance, with about 1.5 to about 2.5 being the most common. Figure 2 shows a substantial range in the unit volume of the studied basins, with a few both below and above the current design range. Interestingly, increasing the unit volume of a wet basin beyond a Vb/Vr ratio of about 1 does not improve performance, likely due to the growth of algae (Minton 2005). Figure 2 also shows that the performance degrades and is inconsistent below a Vb/Vr ratio of about 1. Regardless, it is not appropriate to include in a generalized analysis as in Table 1 facilities whose design criteria fall outside the current design norm.
It is a reasonable expectation by professionals who use summaries like Table 1 that the data reflect facilities whose design falls within the range of commonly used criteria. We are faced with the difficulty that design criteria differ among state manuals. But clearly the first step is to exclude from generalized analyses facilities whose design lies outside the common range. This is one benefit of the database: comparing the performance of facilities that lie within the common design to those that do not.
This is not to say that current design criteria are valid (Minton 2004). Just because facilities within a category perform at a high level does not mean that the design criteria are correct. This is apparent from Figure 2. It is important to obtain data from facilities that are undersized or oversized according to current practice. Such data allow an important question to be addressed: Should design criteria be modified?
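The comparison described above, between facilities inside and outside the common design range, might be sketched as follows. The facility records are invented for illustration, and the bounds of 1.5 and 6 are simply the approximate range of manual values quoted earlier, used here as an assumption rather than a recommended screening rule.

```python
from statistics import median

# Assumed bounds, based on the approximate range of manual values cited in the text.
DESIGN_RANGE = (1.5, 6.0)

# Hypothetical records: (facility, Vb/Vr, mean effluent TSS in mg/L).
facilities = [
    ("A", 0.4, 35.0),
    ("B", 1.8, 12.0),
    ("C", 2.5, 15.0),
    ("D", 8.0, 14.0),
]


def split_by_design_range(records, low, high):
    """Separate facilities whose Vb/Vr lies within [low, high] from the rest."""
    inside = [r for r in records if low <= r[1] <= high]
    outside = [r for r in records if not (low <= r[1] <= high)]
    return inside, outside


inside, outside = split_by_design_range(facilities, *DESIGN_RANGE)
median_inside = median(r[2] for r in inside)
median_outside = median(r[2] for r in outside)
print("within design range:", [r[0] for r in inside], median_inside)
print("outside design range:", [r[0] for r in outside], median_outside)
```

Keeping the excluded group rather than discarding it preserves the second use of the data noted above: asking whether undersized or oversized facilities perform differently.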
In addition, just because one category of treatment types does not perform as well as another category does not mean that the outcome is inherent to the less-effective treatment type. An example is extended detention dry ponds (detention ponds). Table 1 indicates that they generally do not perform as well as wet basins. However, design elements can be added to extended detention basins that can significantly improve performance (Minton 2006).
3. Facilities of the same treatment type but with significant differences in site characteristics that may affect performance. The effect of the differences in site (and watershed) characteristics on performance remains largely unanswered. Site characteristics are requested (Wright Water Engineers and Geosyntec Consultants 2007) but are not readily available from the database at this time. Possible site differences that may affect performance include site activity, soil type, slope, presence of landscaping, use of fertilizers, whether drain inlets with sumps are present and maintained, and the manner of source control BMPs such as sweeping. To some extent, these differences are integrated into observed influent concentrations and particle size distributions, which can be measured and in turn related to performance.
A site difference that users should consider is climate. Simplistically, we can divide the United States and lower Canada into four broad regimes: wet, semi-arid, cold, and semi-tropical (Minton 2005). Figure 3 illustrates what appears to be a significant difference in performance of wet basins in cold versus wet climates. The distinction is that in cold-climate areas, the wet pool freezes each winter, with intermittent winter melts and a significant spring melt volume. Figure 3 presents the data from Figure 2 with some additional facilities not in the database in 2007, prior to the current version. The facilities in Figure 3 noted as cold-climate facilities are located in Minnesota, Wisconsin, and Ontario. Intriguingly, all of the facilities with unit volumes less than 2 with effluent concentrations above 20 mg/L are located in the cold-climate region. Figure 3 suggests data points from the two climatic regimes should not be pooled. Figure 3 also suggests that for wet climates, increasing the Vb/Vr ratio above about 0.5 provides no noticeable improvement in performance. But it is not clear how low the Vb/Vr ratio can be and yet achieve effluent concentrations in the range of 10 to 20 mg/L. Being able to make analyses as in Figures 2 and 3 is an important use of the database.
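As a rough illustration of why pooling across climatic regimes can mislead, the sketch below groups facilities by climate before summarizing. All records are hypothetical and do not reproduce Figure 3; the regime labels follow the broad division suggested in the text.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (climate regime, Vb/Vr, mean effluent TSS in mg/L).
records = [
    ("wet", 0.6, 12.0),
    ("wet", 1.5, 10.0),
    ("wet", 3.0, 14.0),
    ("cold", 0.8, 30.0),
    ("cold", 1.6, 26.0),
    ("cold", 4.0, 16.0),
]


def median_effluent_by_climate(rows):
    """Return the median effluent concentration for each climate regime."""
    grouped = defaultdict(list)
    for climate, _ratio, effluent in rows:
        grouped[climate].append(effluent)
    return {climate: median(values) for climate, values in grouped.items()}


by_climate = median_effluent_by_climate(records)
pooled_median = median(effluent for _climate, _ratio, effluent in records)
print(by_climate)
print("pooled:", pooled_median)
```

With these invented values the pooled median describes neither group well, which is the practical argument for reporting the regimes separately.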
Maintenance is relevant. I recently reviewed a professional paper yet to be published in which it was reported that cleaning the forebay and afterbay of a micropool wetland reduced the mean phosphorus concentration in the effluent by about half. Only two other analytes were measured, TSS and chemical oxygen demand, neither of which changed in the effluent. The authors of the article surmised that a noticeable decrease in phosphorus occurred because the maintenance action removed dead organic matter, which would have otherwise degraded in the facility, releasing the phosphorus. The maintenance history of a facility is requested as part of the entry of data into the database (Wright Water Engineers and Geosyntec Consultants 2007).
4. Inclusion of field studies in which the performance is clearly substandard. The database includes studies in which the performance of a particular facility was clearly outside the norm, yet the particular facility was sized according to current design criteria. Clearly, the observed performance must be due to other site factors. An example is a series of studies of strips by the California Department of Transportation (Caltrans). Four facilities were constructed, each with strips of four different lengths, with the apparent intent to ascertain the effect of length on performance. Lengths at each test facility ranged from 2 to 13 meters. Two of the facilities experienced negative removal regardless of strip length, whereas two found positive removal, although one was modest in its performance. At one test facility, the TSS increased from about 75 mg/L to between about 150 and 700 mg/L, oddly increasing with strip length.
A facility designed according to the norm that experiences negative or even modest performance relative to the norm suggests something about the site that is perhaps unique. When given a study whose performance is outside the norm, the database authors should query the data provider as to the likely causes for the deviation, and should include this information in the database reports.
Defining the norm with respect to performance is difficult and to some extent subjective. Certainly, it seems reasonable to exclude facilities that experience negative removal from analyses such as those shown in Table 1 and Figure 1. A cautionary note is that the concentrations of some pollutants are commonly so low that negative removal is not unexpected in some cases. Examples are some metals, particularly the dissolved fraction, and nitrogen and phosphorus, particularly in planted systems such as wetlands. Judgment is warranted, based on a thorough understanding of the pollutant removal mechanisms present with each type of treatment system (Minton 2005).
5. Use of percent removal as an indicator of performance to compare treatment system types. The database authors have well stated the weaknesses of using percent removal for judging the performance of treatment systems, particularly when comparing different facilities within and between categories (Jones et al. 2008). The database authors have proposed that performance evaluations be based on effluent concentration. However, state BMP manuals with performance goals currently use percent removal. We can therefore expect engineers and planners will continue to use percent removal until such time as the proponents of effluent concentration propose an accepted framework of performance goals. That said, what follows are some cautionary notes regarding the use of percent removal.
I have observed users of the database constructing tables listing several facilities within a category, showing the percent removal of each, then calculating the mean or median removal percentage for the list, and using that figure for some purpose. It is, however, more appropriate to pool the selected data for several facilities as done in Table 1 rather than to produce a mean or median of the performances of the facilities. Prior to pooling the data, users should exclude facilities that lie outside the design and performance norms as discussed previously. Each user will have to decide the norms relevant to the situation. Currently, the database does not lend itself well to this consideration, as the design and/or operative criteria (for example, volume treated) during the test period are frequently not given as requested (Wright Water Engineers and Geosyntec Consultants 2007).
The bias of this analysis is reduced to the extent of excluding facilities whose design criteria fall outside the norm, those that have negative performance, and possibly those that clearly have substandard performance. If a particular facility has what seems to be an unusually high or low efficiency, the influent concentrations should be reviewed. The former may be due to influent concentrations that are unusually high, and the latter to influent concentrations that are low. Regardless, a database user should examine the influent concentrations for each facility to ascertain whether they are similar to what is expected in the user’s region or for the particular land use that the user is considering. For example, if the user is considering the use of facilities in residential developments for metals removal, it is reasonable to exclude facilities in the database treating runoff from a high-volume freeway. I have commonly seen those interested in bioretention apply percent removals for metals found in treatment train parking lots to rain gardens treating roof runoff, where the influent concentrations are substantially lower.
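The difference between pooling event data and averaging facility averages can be seen in a small sketch. The concentrations are hypothetical and were chosen so that one facility was monitored over many more storms than the other; the pooled statistic shown is a simple mean, which is an assumption for illustration rather than the method used to build Table 1.

```python
from statistics import mean

# Hypothetical effluent TSS (mg/L) for each monitored storm at two facilities.
effluent_by_facility = {
    "Facility A": [10.0, 12.0, 14.0, 12.0, 12.0, 12.0, 12.0, 12.0],  # 8 storms
    "Facility B": [40.0, 44.0],  # 2 storms
}

# Average of the facility averages: each facility counts equally,
# no matter how many storms were monitored.
mean_of_means = mean(mean(values) for values in effluent_by_facility.values())

# Pooled: every monitored storm counts equally.
pooled_events = [c for values in effluent_by_facility.values() for c in values]
pooled_mean = mean(pooled_events)

print(f"mean of facility means: {mean_of_means:.1f} mg/L")
print(f"pooled mean:            {pooled_mean:.1f} mg/L")
```

In this example the two-storm facility carries half the weight in the average of averages but only one-fifth of the weight in the pooled result, so the choice of method alone shifts the summary figure.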
6. Use of effluent concentration as an indicator of performance to compare treatment systems. The use of effluent concentrations as presented in Table 1 overcomes the inadequacies of percent removal, but raises inadequacies of its own. It is best to view percent removal and effluent concentration as two sides of the same coin, with inadequacies of each that are mirror images. For example, percent removal may be greater at sites with high influent concentrations, but so may effluent concentrations. Conversely, if a facility achieves a relatively low concentration, in comparison to other facilities, this may simply be due to low influent concentrations. This point has been recognized by the database authors (Geosyntec et al. 2000), but is not clearly apparent in Table 1. In the database are several facilities tested by the Washington State Department of Transportation that produced mean effluent TSS concentrations less than 5 mg/L. However, the influent concentrations were less than 5 mg/L, apparently due to the stormwater having first passed through grassed areas.
This raises the question of whether studies whose influent concentrations are outside the norm should be excluded from the analyses such as that represented in Table 1. Certainly, it seems reasonable to exclude studies in which the observed influent concentrations are below what some call the “irreducible” concentration. Should a study be submitted for inclusion in the database that appears to have concentrations outside the norm, such as those mentioned above, the database authors should query the submitter for possible causes.
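The mirror-image weaknesses of the two indicators, and the screening question raised above, can be sketched together. The study values are invented, and the floor of 10 mg/L is a purely hypothetical stand-in for an "irreducible" concentration; it is not a value taken from the database or from any manual.

```python
# Hypothetical floor standing in for an "irreducible" TSS concentration.
IRREDUCIBLE_TSS_MG_L = 10.0

# Hypothetical studies: (study id, mean influent TSS, mean effluent TSS), mg/L.
studies = [
    ("S1", 200.0, 40.0),
    ("S2", 80.0, 20.0),
    ("S3", 5.0, 4.0),
]


def percent_removal(influent, effluent):
    """Percent reduction in concentration from influent to effluent."""
    if influent <= 0:
        raise ValueError("influent concentration must be positive")
    return 100.0 * (influent - effluent) / influent


def summarize(rows, floor):
    """Report both indicators and flag studies whose influent is below the floor."""
    return [
        {
            "study": study_id,
            "percent_removal": percent_removal(influent, effluent),
            "effluent": effluent,
            "below_floor": influent < floor,
        }
        for study_id, influent, effluent in rows
    ]


summary = summarize(studies, IRREDUCIBLE_TSS_MG_L)
for row in summary:
    print(row)
```

Here the study with the highest percent removal also has the highest effluent concentration, while the study with the lowest effluent concentration has the lowest percent removal and is the one flagged for low influent. Neither indicator can be read without the influent alongside it.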
Selection of a site with high concentrations to test a manufactured product is of concern to some. But if performance is based on effluent concentration, those who raise such concerns should be aware that one need only select a site with relatively low influent concentrations. Generally, effluent concentration increases with increasing influent concentration for coarse media filters, extended detention basins, and grass swales.
The only treatment types whose effluent concentration is likely unaffected by influent concentration are fine media filters and wet basins, as suggested by Figure 3. However, for wet basins, this fact may simply mean that we are grossly oversizing the basins. As for fine media filters, the statement may be correct only for sediment (TSS and suspended sediment concentration [SSC]) and particulate pollutants. This stipulation is not necessarily consistent between regions, given the possible variation in the content of clay-size material, which readily moves through such filters. The higher the concentration of clay-size material in the influent, the greater the likely effluent concentration of TSS/SSC.
A particular advantage of using effluent concentrations for comparative purposes is that it avoids the bias, which some might believe can never be fully addressed, inherent to the sampling of the influent: that is, influent concentrations tend to be understated. More importantly, the degree of bias is likely highly variable between sites due to differing conditions related to sample withdrawal. The use of effluent concentration might also be viewed as a means of negating the possible relevance of site differences previously discussed. The reasoning is that site differences, for the most part, affect influent characteristics: for example, variation in the amount of coarse sediment, which is easily removed. Treatment systems may tend to moderate the effect of differing influent characteristics as affected by site characteristics. As a result, the differences in the characteristics of effluent may vary less from site to site than those of the influent. However, as yet, the characteristics of effluent with respect to relevant parameters such as particle size, specific gravity distributions, and dissolved/particulate ratios have not been well defined.
But, all aspects considered, I believe it is more appropriate to use effluent concentrations in generalized analyses as long as the influent concentrations experienced at each facility are considered and noted.
Summary and Final Suggestions
In summary, I offer the following suggestions for conducting generalized analyses.
- Conduct analyses of facilities that are clearly of the same type: for example, distinguish strips from swales, extended detention basins from wet extended detention basins, and sand filters from amended filters.
- Include in generalized analyses only facilities whose design falls within commonly used engineering criteria.
- Recognize site characteristics, such as climate, to the extent they are currently understood.
- Give careful consideration to facilities whose performance is clearly outside the norm, such as those that experience negative removal.
- Pool the data for a given treatment type for evaluation, rather than average the averages of separate facilities.
- Recognize the relationship between influent concentration and percent removal: for some pollutants, influent concentrations below or above the norm may distort the percent removal observed.
- Recognize the same when using effluent concentration.
Of course, what constitutes the norm is a matter of judgment. As a consequence, which facilities are included in a particular analysis will vary with the user.
Further splitting of categories, particularly with respect to biofilters, detention ponds, and filters, would facilitate analyses as suggested above.
In general, the data are of most value for evaluating these questions:
- Why do we see variability of performance of a particular facility from storm to storm?
- Why do we see variability between facilities of the same treatment system type (BMP)?
- For a particular treatment system type, should we change design criteria, design elements, and/or maintenance procedures, leading to improved performance and/or more cost-effective design?
- How might we change the design criteria, design elements, and/or maintenance of each category such that all of the categories achieve similar performance for a particular pollutant within the constraint of cost effectiveness?
I believe that we need to move from the current paradigm of “How does each of the types of structural BMP perform?” to a new paradigm in which we ask “How do we design a treatment system in a cost-effective manner to give us the performance we desire, need, or can reasonably expect?” At minimum, the design criteria or design elements should be altered such that all categories give similar levels of performance for specifically targeted pollutants or classes of pollutants. The exception is a category of treatment systems that is intended to meet a different objective. For example, it is now common to consider the category called hydrodynamic devices (also known as small vaults) for pretreatment.
The points discussed above are all aspects whose consideration in the database will evolve over time. For now, “warning signs” such as the database authors raised in the quote presented at the beginning of this article are prudent. I look forward to a continuing discussion of the use of the database.