Some geese in some very, very turbid water in the Hudson River.

Measuring clouds... okay, cloudy water (and air)

Measurement Apr 8, 2025

In the tail end of 2024, the water system that serves NYC declared a drought warning because we didn't get enough rain for most of the fall season. Since then, out of idle curiosity, I've been checking the official page for the current reservoir levels. The drought warning was lifted in early January after a lot of big snow events upstate this winter.

Aside: NYC's open data efforts are great, and there's a reservoir data set that's updated quarterly. I'm not 100% sure how the data works, but stacking individual median monthly reservoir volumes seems close enough to correct.

Current Reservoir Levels | NYC Open Data
The daily capacity and percent of capacity filled for each of the City’s reservoirs.

But once you get into looking at water levels, you're a handful of clicks away from seeing water quality parameters, and if you're a measurement nerd it's a bit of a trap because there's so much stuff going on with drinking water.

This week, while scanning various reports, I came across an unfamiliar unit, the "NTU", short for "Nephelometric Turbidity Units". It's the unit commonly used in the US, and a close sibling of the "Formazin Nephelometric Units" (FNU) used in Europe and defined in ISO 7027. Both units measure "turbidity", essentially how cloudy a sample of water is. The main difference between them is that NTU specifies a broadband white light source, while FNU specifies monochromatic near-infrared light. Otherwise the calibration methods are similar.

Using just base intuition, measuring how cloudy water is means figuring out how much light the water stops from passing through. Another way to think of it is how much light the water scatters away from a straight path, since that scattering is ultimately what stops light from getting through. The problem, of course, is figuring out how to measure such properties in a repeatable way.

A simple way is to use something like a Secchi disk, which is just a disk with white and black sections printed on it. Mount it on a long stick or line and lower it into a column of water, like the ocean, until you reach the depth where you can no longer see the disk. That depth, plugged into a formula, gives you an extinction coefficient, which tells you how transparent the water you're testing is. Obviously there's lots of variation in measurements based on local factors like light intensity, glare, the individual user's eyesight, etc.
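For the curious, the classic Poole-Atkins approximation relates Secchi depth to the extinction coefficient as k ≈ 1.7 / depth. A minimal sketch of the arithmetic (the 1.7 is the textbook constant; in practice it varies by water body):

```python
def extinction_coefficient(secchi_depth_m: float, constant: float = 1.7) -> float:
    """Poole-Atkins approximation: k ~= 1.7 / Z_sd, in units of 1/meter.
    The constant is empirical and varies with the water body."""
    if secchi_depth_m <= 0:
        raise ValueError("Secchi depth must be positive")
    return constant / secchi_depth_m

# A disk that vanishes at 2.5 m of depth implies k ~= 0.68 per meter.
print(round(extinction_coefficient(2.5), 2))
```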

For a more repeatable process, scientists rely on nephelometers, from the Greek nephele, "cloud", plus meter. Such a device can measure the amount of particulate suspended within a fluid, whether a liquid or a gas. The basic idea: you have a tube with your sample inside, a light is shone into the sample, and a detector measures the amount of light that comes off the sample at a 90° angle. Since you know how much light is being put in and can compare it against the light being scattered to the side, you can derive the proportion of light that fails to make it through the sample. The same principle works whether you're talking about water or air. And of course, the problem is "the details" of making such a system work.
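To make the idea concrete, here's a hypothetical sketch (not pulled from any actual instrument's firmware) of the core trick: calibrate the 90° detector readings against references of known turbidity, then interpolate for unknown samples. The detector counts below are invented:

```python
def fit_calibration(readings, ntu_values):
    """Least-squares line (slope, intercept) mapping detector counts to NTU."""
    n = len(readings)
    mean_x = sum(readings) / n
    mean_y = sum(ntu_values) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(readings, ntu_values))
             / sum((x - mean_x) ** 2 for x in readings))
    return slope, mean_y - slope * mean_x

def to_ntu(reading, slope, intercept):
    """Convert a raw 90-degree scatter reading to turbidity."""
    return slope * reading + intercept

# Made-up detector counts for 0, 10, 20, and 40 NTU references:
slope, intercept = fit_calibration([120, 980, 1850, 3590], [0, 10, 20, 40])
print(round(to_ntu(2100, slope, intercept), 1))  # an unknown sample's reading
```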

So, here's what scientists have decided, per the EPA's published "Method 180.1: Determination of Turbidity by Nephelometry". The guidelines require nephelometers that measure between 0-40 units of turbidity, using a tungsten lamp at a color temperature of 2200-3000 K shining through a sample tube with a light path no longer than 10 cm, and a detector centered on and oriented 90 degrees from the light path.

The nephelometer is then calibrated against reference solutions, which is the interesting part.

The turbidity reference is a suspension of a chemical called formazin. It's prepared by mixing solutions of 10 g/L hydrazine sulfate and 100 g/L hexamethylenetetramine, each made with deionized water that's been passed through a membrane filter with a 0.45 μm pore size. The mixture is left for 24 hours at 25 ± 3 °C for the suspension to develop. This produces a suspension with a turbidity value of 4000 NTU. The stock suspension can then be diluted down to 40 NTU, or any other value as necessary, for use as a reference. The stock should be replaced every month, while diluted references need to be remade daily.
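The dilution itself is just the usual C₁V₁ = C₂V₂ arithmetic. A minimal sketch, with made-up target volumes:

```python
def stock_volume_needed(stock_ntu, target_ntu, final_volume_ml):
    """C1*V1 = C2*V2: volume of stock to dilute up to final_volume_ml
    so the result lands at target_ntu."""
    return target_ntu * final_volume_ml / stock_ntu

# To make 100 mL of 40 NTU reference from 4000 NTU formazin stock:
print(stock_volume_needed(4000, 40, 100))  # -> 1.0 mL stock, topped up to 100 mL
```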

Why formazin? Apparently because when the two reagents are mixed, they form a colloidal suspension with a relatively uniform particle size distribution. Since the light scattered off any particle depends on how the particle is oriented relative to the detector, it's best to have lots of small particles to average out any effect of particle orientation. The size of the particles formed depends on the temperature of the reaction, which is why the standard requires specific temperatures.

Why must the references be remade so regularly? Because of a process called "Ostwald ripening" that happens in suspensions of solids and liquids: over time, small particles clump together and form larger crystals or bodies, because clumping puts the system into a better (lower energy) thermodynamic state. (You can see a related effect by mixing ouzo with water: the milky "ouzo effect" emulsion it creates coarsens over time.) Since a change in particle size changes light scattering, the recommendation is to keep using fresh reference suspension. I'm not exactly sure why the full-strength stock can be kept around longer. Maybe it's out of practicality, since some of the reagents are known carcinogens, or because less dilution slows the Ostwald ripening process.
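For illustration only: in the classic Lifshitz-Slyozov-Wagner picture of Ostwald ripening, the cube of the mean particle radius grows roughly linearly with time, so even a sealed bottle of reference suspension drifts. The starting radius and rate constant below are invented, just to show the shape of the drift:

```python
# LSW-style coarsening: <r>^3 grows linearly in time.
def mean_radius_nm(r0_nm: float, k_nm3_per_hour: float, hours: float) -> float:
    return (r0_nm ** 3 + k_nm3_per_hour * hours) ** (1 / 3)

r0 = 200.0   # assumed starting mean radius, nm (invented)
K = 5000.0   # assumed coarsening rate, nm^3/hour (invented)
for h in (0, 24, 24 * 30):  # fresh, one day old, one month old
    print(f"{h:>4} h: {mean_radius_nm(r0, K, h):.1f} nm")
```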

Oh, and since we're at the point where slightly clumped particles in suspension measurably affect the standard, devices also have to account for the natural scattering properties of pure water itself.

Anyways, now that we have a reference to calibrate our nephelometers with, we can get readings! But what sort of readings are considered "good"? The EPA has a list of state-specific water quality standards, overall standards, and a database of standards under the Clean Water Act. It's important to know that turbidity alone is not an indication of water quality – there can be all sorts of nasty things in water that don't meaningfully contribute to its turbidity.

For water treatment plants providing water, the EPA wants <0.15 NTU in the output streams, with no exceedance lasting more than 15 minutes. Plants of course aim for much lower levels to give themselves plenty of error margin. The combined effluent stream needs to stay <0.30 NTU, and no sample should ever go over 1 NTU.

For water from surface sources, like NYC's reservoir system, the EPA wants at most 1 NTU for 95% of samples (unless the state allows a higher limit), and a hard maximum of 5 NTU. NYC's water report specifically calls out the small number of instances where a sample crosses the 5 NTU threshold. Surface sources, being subject to natural runoff and pollution, have varying levels of turbidity, and NYC's water quality reports usually say something about the turbidity of each individual reservoir, since they differ from each other and mix on their way down.
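As a sketch, checking a batch of samples against those surface-water limits is straightforward (the sample values here are invented):

```python
def check_surface_water(samples_ntu, pct_limit=1.0, hard_max=5.0):
    """At least 95% of samples at or under pct_limit, none over hard_max."""
    within = sum(1 for s in samples_ntu if s <= pct_limit) / len(samples_ntu)
    return within >= 0.95 and max(samples_ntu) <= hard_max

samples = [0.4, 0.6, 0.9, 1.2, 0.5, 0.7, 0.8, 0.3, 0.6, 0.9,
           0.5, 0.4, 0.8, 0.7, 0.6, 0.5, 0.9, 0.4, 0.6, 0.7]
print(check_surface_water(samples))  # 19/20 = 95% within 1 NTU -> True
```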

Air measurements

Maybe you're asking, "this is the process for water samples, but what about air? Do you just wave a bottle around and shove it in a machine?" Well, the National Park Service has a page explaining the process they use to monitor conditions in the parks! Their nephelometers work by regularly opening a door to the sampling chamber; a fan pulls air in, and a light is shone through the sample to make a measurement. Interestingly, they don't use a version of the NTU to report air turbidity, but inverse megameters (1/Mm) to express total light extinction, or a haze index based on human perception.
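That haze index is the "deciview", defined from the total extinction b_ext in inverse megameters as dv = 10·ln(b_ext/10), so clean Rayleigh-scattering-only air (b_ext ≈ 10/Mm) sits near 0 dv. A quick sketch:

```python
import math

def deciview(b_ext_inverse_mm: float) -> float:
    """Haze index in deciviews from total extinction in inverse megameters."""
    return 10.0 * math.log(b_ext_inverse_mm / 10.0)

for b in (10, 30, 100):  # pristine, hazy, very hazy (1/Mm)
    print(f"{b:>3} /Mm -> {deciview(b):.1f} dv")
```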

Descriptions of atmospheric nephelometers are more detailed, like this manual/diagram from NOAA: it involves a light shining through a sensing chamber, a light trap at the far end, and a reference chopper (a rotating disc with features that break the incoming light into separate dark/calibration/signal sections for the machine to work with). Most interestingly, the device measures three separate wavelengths: red, green, and blue. Light is measured coming in and compared to the amount absorbed or scattered by the sample over the known distance of the testing chamber. The math then lets you calculate how much light extinction is happening.

The procedure for calibrating an air nephelometer seems different, too. I happened to find this "Nephelometer handbook" from the Dept. of Energy, and it describes calibration as based on the scattering coefficient of dry CO2 gas, which has known scattering properties, with corrections for site factors like temperature, humidity, and air pressure. Both calibration procedures seem laborious, but at least the atmospheric one doesn't require mixing potentially carcinogenic fluids.
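A hedged sketch of the temperature/pressure piece of such a correction (the handbook's actual procedure is more involved): a gas's scattering coefficient scales with its number density, which by the ideal gas law goes as P/T, so a reading gets normalized to reference conditions before comparison against the CO2 span point:

```python
def normalize_scattering(b_measured, temp_k, pressure_hpa,
                         ref_temp_k=273.15, ref_pressure_hpa=1013.25):
    """Scale a scattering coefficient to reference T and P, assuming
    ideal-gas density scaling (b proportional to P/T)."""
    return b_measured * (ref_pressure_hpa / pressure_hpa) * (temp_k / ref_temp_k)

# A reading of 25.0 /Mm at 298 K and 840 hPa (a high-altitude site):
print(round(normalize_scattering(25.0, 298.0, 840.0), 1))
```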

Incidentally, air quality measurements like PM2.5 don't use nephelometers! Since we're interested in actual particle sizes instead of overall light extinction, different tools are employed to identify what sizes of particles are floating in an air sample. If you want to learn about those methods, check out the EPA's info on Federal Reference Methods (FRM) and Federal Equivalent Methods (FEM). They include devices that do things like beta radiation attenuation for PM10 and PM2.5.
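The beta attenuation idea is itself pretty elegant: beta particle counts through a filter fall roughly exponentially with the mass deposited on it, so the clean-to-loaded count ratio gives a mass loading, and dividing by the sampled air volume gives a concentration. A rough sketch with invented numbers (real monitors also handle flow control, humidity, and much more):

```python
import math

def pm_concentration_ug_m3(counts_clean, counts_loaded, mass_abs_cm2_mg,
                           spot_area_cm2, air_volume_m3):
    """Beer-Lambert-style beta attenuation: count ratio -> areal mass
    -> total mass -> concentration."""
    areal_mass_mg_cm2 = math.log(counts_clean / counts_loaded) / mass_abs_cm2_mg
    mass_ug = areal_mass_mg_cm2 * spot_area_cm2 * 1000.0  # mg -> ug
    return mass_ug / air_volume_m3

# e.g. about 1 m^3 of air sampled through a 1 cm^2 spot, small count drop:
print(round(pm_concentration_ug_m3(10000, 9990, 0.25, 1.0, 1.0), 1))
```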

Either way, here we are: two devices of the same name, ostensibly measuring one property – the extinction of light through a fluid due to suspended particles. The actual realizations of the measurement only resemble each other on the surface.

All this research because I wanted to see if I was legally allowed to water my lawn.

🙃


Standing offer: If you created something and would like me to review or share it w/ the data community — just email me by replying to the newsletter emails.

Guest posts: If you’re interested in writing something, a data-related post to either show off work, share an experience, or want help coming up with a topic, please contact me. You don’t need any special credentials or credibility to do so.

"Data People Writing Stuff" webring: Welcomes anyone with a personal site/blog/newsletter/book/etc that is relevant to the data community.


About this newsletter

I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With some excursions into other fun topics.

All photos/drawings used are taken/created by Randy unless otherwise credited.

  • randyau.com — Curated archive of evergreen posts. Under re-construction thanks to *waves at everything

Supporting the newsletter

All Tuesday posts to Counting Stuff are always free. The newsletter is self hosted. Support from subscribers is what makes everything possible. If you love the content, consider any of the following ways to support the newsletter:

  • Consider a paid subscription – the self-hosted server/email infra is 100% funded via subscriptions
  • Send a one time tip (feel free to change the amount)
  • Share posts you like with other people!
  • Join the Approaching Significance Discord — where data folk hang out and can talk a bit about data, and a bit about everything else. Randy moderates the discord. We keep a chill vibe.
  • Get merch! If shirts and stickers are more your style — There’s a survivorship bias shirt!
