Cyanobacteria Assessment Network (CyAN)

An EPA, NASA, NOAA, and USGS Project

What is the CyAN Project?

The Cyanobacteria Assessment Network (CyAN) is a multi-agency project among the National Aeronautics and Space Administration (NASA), National Oceanic and Atmospheric Administration (NOAA), U.S. Geological Survey (USGS), and U.S. Environmental Protection Agency (EPA) to develop an early warning indicator system using historical and current satellite data to detect algal blooms in U.S. freshwater systems. This research supports federal, state, and local partners in their monitoring efforts to assess water quality to protect aquatic and human health.

The project will

  • develop a uniform and systematic approach for identifying cyanobacteria blooms using ocean satellites across the contiguous United States;
  • create a strategy for evaluation and refinement of algorithms across satellite platforms;
  • identify landscape linkages to postulated causes of chlorophyll-a and cyanobacteria blooms in freshwater systems;
  • characterize exposure and human health effects using ocean color satellites in drinking water sources and recreational waters;
  • characterize behavioral responses and the economic value of the early warning system using ocean color satellites and a mobile dissemination platform; and
  • disseminate satellite data through an Android mobile application and EnviroAtlas.

Mission Statement and Objectives of the CyAN Project

Support the environmental management and public use of U.S. lakes and estuaries by providing a capability of detecting and quantifying algal blooms and related water quality using satellite data records.

Objectives

  • Create a standard and uniform approach for early identification of algal blooms that is useful and accessible to stakeholders of freshwater systems using the new set of satellites: Ocean and Land Colour Instrument (OLCI) on Sentinel-3, Sentinel-2, Landsat, and future NASA missions.
  • Develop an information dissemination system for expedient public health advisory postings.
  • Better understand how health, economic, and environmental conditions are connected to cyanobacteria and phytoplankton blooms.

Project Timeline

The CyAN project officially started October 1, 2015. In FY16 the study focused on selected states, including Ohio, Florida, California, Vermont, New Hampshire, Massachusetts, Connecticut, and Rhode Island. It has already begun to expand to continental U.S. coverage in FY17, using the MERIS archive from 2002-2012. Weekly composites of the ESA Sentinel-3 OLCI sensor data will be made available to collaborators for initial review and validation once ESA makes the data publicly available.

Project Components

Validation

In situ validation data will primarily come from our federal and state collaborators. Sources of data will include, but are not limited to, federal, state, and local government agencies, universities, private research groups, and published peer-reviewed journals. Minimum data reporting requirements include sample station identification; cyanobacteria counts, abundance, or phycocyanin pigment concentration; latitude; longitude; depth; and date. Additional information that is not required but is considered beneficial includes chlorophyll-a concentration (especially), temperature, Secchi depth, turbidity, and other available water quality measures. Data sets will undergo quality review to confirm that all methods used were documented and widely accepted.
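The minimum reporting requirements above lend themselves to a simple automated check. The sketch below is illustrative only: the field names are hypothetical, and it assumes that any one of the three cyanobacteria measures (counts, abundance, or phycocyanin concentration) satisfies the requirement.

```python
# Hypothetical field names; the project's actual schema is not given here.
REQUIRED_FIELDS = ("station_id", "latitude", "longitude", "depth_m", "date")
CYANO_MEASURES = ("cyanobacteria_count", "abundance", "phycocyanin")


def missing_requirements(record):
    """Return the list of minimum reporting requirements a record fails."""
    missing = [name for name in REQUIRED_FIELDS if record.get(name) is None]
    # Assumption: at least one cyanobacteria measure must be present.
    if all(record.get(name) is None for name in CYANO_MEASURES):
        missing.append("one of: " + ", ".join(CYANO_MEASURES))
    return missing


complete = {
    "station_id": "STN-001", "latitude": 28.6, "longitude": -81.6,
    "depth_m": 0.5, "date": "2016-08-15", "phycocyanin": 12.3,
}
incomplete = {
    "station_id": "STN-002", "latitude": 41.5, "longitude": -82.9,
    "date": "2016-08-16", "chlorophyll_a": 42.0,
}

print(missing_requirements(complete))
print(missing_requirements(incomplete))
```

Optional fields such as chlorophyll-a are simply ignored by the check, which matches their "beneficial but not required" status.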

Satellite Algorithms

A strategy for evaluation of algorithm updates has been established, in large part through the open source availability of the NASA ocean color processing software (l2gen) and the SeaWiFS Data Analysis System (SeaDAS). This project will perform a complementary effort by using existing products for MERIS and OLCI that have shown management value to establish algorithm development and data processing infrastructure. We propose to adopt second derivative spectral shape (SS) algorithms, which have been shown to be robust in the presence of poor atmospheric correction. For MERIS (Medium Resolution Imaging Spectrometer) data, the bands at 620, 665, 681, 709, and 754 nm are used. The cyanobacteria index, CI = -SS(681), has been used to estimate cyanobacteria concentrations, and the algorithm has been successfully transferred to MODIS (Moderate Resolution Imaging Spectroradiometer) using SS(678).
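As a rough illustration of the spectral shape idea, the sketch below computes SS at a center band as its height above the straight line joining the two neighboring bands, then derives CI = -SS(681) from the 665, 681, and 709 nm bands. This three-band form is the one commonly published for spectral shape indices and is an assumption here, since this page does not spell out the formula. The function names and reflectance values are hypothetical.

```python
def spectral_shape(r_minus, r_center, r_plus, wl_minus, wl_center, wl_plus):
    """Height of the center band above the line joining its two neighbors."""
    fraction = (wl_center - wl_minus) / (wl_plus - wl_minus)
    baseline = r_minus + (r_plus - r_minus) * fraction
    return r_center - baseline


def cyano_index(r665, r681, r709):
    """CI = -SS(681), using the MERIS bands at 665, 681, and 709 nm."""
    return -spectral_shape(r665, r681, r709, 665.0, 681.0, 709.0)


# Hypothetical reflectances with a dip at 681 nm relative to the baseline.
ci = cyano_index(r665=0.010, r681=0.008, r709=0.014)
print(f"CI = {ci:.6f}")
```

Because the index depends on the shape of the spectrum rather than its absolute level, adding the same constant offset to all three bands leaves CI unchanged, which is one way to see why such indices tolerate imperfect atmospheric correction.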

Cross Satellite Platforms

The reliable application of any remote-sensing algorithm over a large area requires a strategy for its evaluation, validation, and refinement on multiple spatial and temporal scales using field reference data. Successes and failures need to be understood, as does the ongoing need for refinement of algorithm parameterizations as our understanding of the statistical and analytical relationships within the algorithm improves with time. Assessment and validation will be necessary to identify where or when failure is likely to occur, such that levels of confidence in the remote-sensing product can be provided to water quality managers. Using in situ data as reference and data from multiple ocean color satellite instruments, we will compare: (1) model output from in situ radiometry vs. in situ metrics for cyanobacteria; (2) satellite radiometry vs. in situ radiometry, and model output from satellite radiometry vs. in situ metrics for cyanobacteria; and (3) model outputs from multiple satellite instruments (i.e., MERIS and Landsat).
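A comparison of satellite estimates against in situ reference values usually starts from paired match-ups. The sketch below is a minimal example, not the project's actual validation procedure: it assumes the pairs have already been matched in space and time, and it reports only the mean bias and mean absolute error. All values are hypothetical.

```python
from statistics import mean


def matchup_stats(satellite, in_situ):
    """Bias and mean absolute error for paired satellite/in situ values."""
    if len(satellite) != len(in_situ) or not satellite:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [s - f for s, f in zip(satellite, in_situ)]
    return {
        "n": len(diffs),
        "bias": mean(diffs),                  # positive: satellite reads high
        "mae": mean([abs(d) for d in diffs]),
    }


# Hypothetical match-ups in the same units (e.g. chlorophyll-a in ug/L).
satellite = [12.0, 30.0, 55.0, 80.0]
in_situ = [10.0, 25.0, 50.0, 90.0]
stats = matchup_stats(satellite, in_situ)
print(stats)
```

Reporting bias and absolute error separately matters because a product can track field data well in relative terms while still carrying a consistent offset.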

Environmental Assessment

The Environmental Component of the CyAN project focuses on the evaluation of the existing satellite data to document changes in land-cover composition, land-use activities, chlorophyll-a, and cyanobacteria concentrations for the study period 2001–2018.

Land-cover composition: percentage change of corn

Human Health

Remote sensing of cyanobacteria blooms offers a unique opportunity to estimate human exposure to cyanotoxins over specific geographic areas. The health of those communities with a past history of cyanobacteria blooms detected via satellite may be evaluated retrospectively by the analysis of existing health records.  

Economics

Across the U.S., many states are developing programs to monitor bloom events. However, monitoring costs money and takes time, and results are often not available in a time frame that is relevant to some management decisions. Automated detection of events based on remote sensing has the potential to improve the quality and timing of data on harmful algal blooms (HABs) delivered to resource managers and the public. We will identify the costs associated with monitoring and responding to algal blooms and the economic value of detecting events using remote sensing data. The second thread of economic research is to estimate the economic impact of avoiding toxic and nuisance bloom events in freshwater lakes. Because the benefits of water quality improvements are not valued in markets, measuring the economic value of water quality improvements associated with reductions in toxic and nuisance bloom events requires a nonmarket valuation approach.

Decision Support

Satellite data is accessible to scientists, but is not processed and delivered to public or official users in a manner that demonstrates its practical value to daily life. Satellite data is pushed from NOAA, NASA and USGS to EPA. Two existing decision support platforms will allow access to the data. The CyAN Android mobile application is the first platform for immediate decision support. The second platform is the EnviroAtlas for longer-term trend analysis. Additionally, the scientific community will be able to access derived products directly through NASA, NOAA, and USGS public websites. 

Fiscal Year 2016 Updates

Validation

Database infrastructure has been designed and built in RStudio, which is an open-source platform. Parameter codes have largely been completed and method codes are being completed. HAB field supporting data has been ingested from the Water Quality Portal (WQP) (http://www.waterqualitydata.us/) for OH, FL, CA, and the New England states. The database has a data viewer map showing site locations and data density, which can be filtered by parameter code and provides summary statistics in terms of number of sites and sampling events. Data export functions support open file formats such as *.csv. The 2007 EPA National Lakes Assessment dataset will be ingested by the end of FY16. Metadata for all data sources is being maintained as data is ingested.

Algorithms

The USGS Earth Resources Observation and Science (EROS) Center has completed the initial phase of development for a Landsat surface temperature product. The algorithm development was a collaborative effort between the Rochester Institute of Technology and the NASA Jet Propulsion Laboratory. Provisional surface temperature products were provided for evaluation and validation. The results from these analyses will inform the CyAN project on how surface temperature data may be routinely used for assessing the onset of harmful algal blooms when combined with other land cover and land use information. In addition, Landsat-8 atmospheric correction, chlorophyll-a algorithm calibration, and initial validation were completed using a small subset of lakes.

NOAA has assembled water quality data covering most lakes in central Florida from the St. Johns River Water Management District and the LakeWatch program. Additional data for Lake Okeechobee has been collected in order to evaluate blooms, particularly in the context of the major blooms of 2016. Methods have been configured for processing to final CI products, including flagging of faulty data in a way that allows rapid reprocessing.

Cross Platforms

CI algorithms were incorporated into the standard data processing system, l2gen, packaged as part of SeaDAS. This includes the CI bio-optical algorithms, plus all required supporting flags and masks. Preliminary versions of other cyanobacteria algorithms, including MCI and MPH, were also included in l2gen. The software was further updated to support: (1) a 50-m landmask; (2) binning and mapping at 50, 100, and 250 m; (3) reporting maximum values when binning and mapping (versus reporting the mean); and (4) GeoTIFF output for use in ArcGIS. All of these updates apply to MERIS and OLCI, plus (pending wavelength requirements for the cyanobacteria algorithms) SeaWiFS, MODIS-Aqua and -Terra, VIIRS, and Landsat-8.
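The difference between reporting the maximum and reporting the mean when binning can be seen with a small grid. The sketch below is a generic block-binning routine written for illustration; it is not the l2gen or SeaDAS implementation, and the grid values and function name are hypothetical.

```python
from statistics import mean


def bin_grid(grid, factor, reducer=max):
    """Aggregate a 2-D grid into factor x factor blocks using `reducer`."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    binned = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            out_row.append(reducer(block))
        binned.append(out_row)
    return binned


# Hypothetical CI-like values; one bright pixel sits in the upper-right block.
grid = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 8.0],
    [0.0, 0.0, 5.0, 5.0],
    [0.0, 1.0, 5.0, 5.0],
]
binned_max = bin_grid(grid, 2, reducer=max)
binned_mean = bin_grid(grid, 2, reducer=mean)
print(binned_max)
print(binned_mean)
```

In the upper-right block a single high pixel survives under the maximum but is diluted to a quarter of its value under the mean, which is why a maximum is attractive when the goal is to avoid missing a localized bloom.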

A preliminary MERIS full-resolution time series for the continental U.S. (CONUS) was generated. A test CONUS time series for 2008 was developed in September 2016 and is to be made available to the Science Team in late fall 2016. The recommended primary product for climatological analysis is a multi-day composite of the maximum CI values, such as 10-day or 7-day periods. In addition, the NOAA RSTools uses the ArcGIS Python scripting capability to serve a variety of purposes. The time range for compositing can be specified by the user, and can also be handled by the month (e.g., three "10-day" composites each month). It also allows determination of median, mean, and percentile values (10%, 25%, 75%, 90%). It provides a data extraction option for time series analysis of any polygon or point.
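The monthly "10-day" compositing described above can be sketched for a single pixel as follows. This is a simplified illustration in plain Python, not the NOAA RSTools code: it assumes the three periods are days 1-10, 11-20, and 21 to the end of the month, and that invalid observations (for example, cloud) are stored as None. The dates and CI values are hypothetical.

```python
from datetime import date


def period_key(day):
    """(year, month, period), with periods 1-10, 11-20, and 21-end of month."""
    return (day.year, day.month, min((day.day - 1) // 10, 2) + 1)


def max_composites(daily_ci):
    """Maximum valid CI per 10-day period for one pixel."""
    composites = {}
    for day, value in daily_ci.items():
        if value is None:  # skip cloud-covered or otherwise invalid scenes
            continue
        key = period_key(day)
        composites[key] = max(value, composites.get(key, value))
    return composites


daily_ci = {
    date(2008, 7, 1): 0.001,
    date(2008, 7, 9): 0.004,
    date(2008, 7, 10): None,
    date(2008, 7, 15): 0.002,
    date(2008, 7, 21): 0.003,
    date(2008, 7, 31): 0.006,
}
composites = max_composites(daily_ci)
print(composites)
```

Folding day 31 into the third period keeps exactly three composites per month regardless of month length.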

Preliminary satellite-to-in situ match-ups for MERIS FRS CI products were tested. OLCI data is now being successfully processed when available. Initial comparison of OLCI CI with MODIS CI indicates consistent results.

Landsat-8 and MODIS land bands are being compared for consistency, and also for comparison of MODIS to the proposed Landsat-8 chlorophyll algorithm.  The initial results show a good relative relationship, but a large positive bias. 

Standardized statistical methods are being developed to ensure that robust and sound methods are employed. This involves trend analysis, algorithm validation, and algorithm intercomparison. As most of the data sets being used do not have normal distributions, non-parametric statistics are preferred. Decision support metrics are also being evaluated. A paper on trend analysis is being completed. Algorithm intercomparisons will also involve image-based statistics to establish consistency in the methods.
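As one example of a non-parametric trend analysis, the sketch below computes the Mann-Kendall S statistic and the Theil-Sen slope for a short time series. These are widely used rank-based methods, chosen here only for illustration; this page does not say which tests the team adopted. The data values are hypothetical.

```python
from itertools import combinations
from statistics import median


def mann_kendall_s(values):
    """Mann-Kendall S: later-minus-earlier sign summed over all pairs."""
    s = 0
    for (_, earlier), (_, later) in combinations(enumerate(values), 2):
        s += (later > earlier) - (later < earlier)
    return s


def sen_slope(times, values):
    """Theil-Sen estimator: the median of all pairwise slopes."""
    points = list(zip(times, values))
    slopes = [(v2 - v1) / (t2 - t1)
              for (t1, v1), (t2, v2) in combinations(points, 2)
              if t2 != t1]
    return median(slopes)


# Hypothetical annual bloom extent (arbitrary units).
years = [2008, 2009, 2010, 2011, 2012]
extent = [10.0, 12.0, 11.0, 15.0, 18.0]

s_stat = mann_kendall_s(extent)
slope = sen_slope(years, extent)
print(s_stat, slope)
```

Both quantities depend only on the ordering of the data and on pairwise differences, so a few extreme values have limited influence and no normal distribution is assumed.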

Environment

Assessment methods were developed to determine the utility of satellite technology for detecting cyanobacterial harmful algal bloom (CHAB) occurrence frequency at locations of potential management interest. On the basis of NHD+, results show that 5.6% of US National Lakes Assessment lakes and reservoirs are resolvable by satellites with 300 m single pixel resolution and 0.7% of waterbodies when a 3 x 3 pixel array is applied based on minimum Euclidean distance from shore. Satellite data was also spatially joined to public water system surface intake (PWSI) locations in Florida and Ohio. Recreational and drinking water sources were ranked by CHAB occurrence frequency above the World Health Organization (WHO) high threshold for risk of 100,000 cells/mL. Ranking identified 158 recreational waterbodies (2008-2011) with concentrations above the WHO high threshold, ranging from <0.1% of pixels in Kirwan Lake, OH to 99% of pixels in Lake Apopka, FL on a single pixel basis. The method presented here may serve as an indicator of locations with higher exposure to CHABs and therefore can assist in prioritizing management resources and actions across recreational and drinking water sources.
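The ranking step can be sketched as the fraction of valid pixel observations that exceed the high threshold of 100,000 cells/mL quoted above. This is a minimal illustration rather than the published method: the waterbody names and values are hypothetical, and invalid observations are assumed to be stored as None.

```python
WHO_HIGH_CELLS_PER_ML = 100_000  # high-risk threshold quoted in the text


def exceedance_frequency(pixel_values, threshold=WHO_HIGH_CELLS_PER_ML):
    """Fraction of valid observations above the threshold, or None if no data."""
    valid = [v for v in pixel_values if v is not None]
    if not valid:
        return None
    return sum(v > threshold for v in valid) / len(valid)


def rank_waterbodies(observations):
    """Waterbodies sorted from highest to lowest exceedance frequency."""
    freqs = {name: exceedance_frequency(vals)
             for name, vals in observations.items()}
    ranked = [(name, f) for name, f in freqs.items() if f is not None]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical cell abundance estimates (cells/mL) per waterbody.
observations = {
    "Lake A": [10_000, 120_000, 30_000, 20_000],
    "Lake B": [150_000, 200_000, 50_000, None],
    "Lake C": [None, None],
}
ranking = rank_waterbodies(observations)
print(ranking)
```

Waterbodies with no valid observations are left out of the ranking instead of being scored as zero, so missing data is not mistaken for the absence of blooms.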

An assessment method was developed to quantify the trends in CHAB surface area extent, scalable to different spatial areas, in Florida, Ohio, and California. Time series analysis evaluates the overall trend in satellite-resolvable inland waterbodies for each state of interest. To further assess CHAB risk within each state of interest, we used the World Health Organization's recreational guidance-level thresholds, based only on cell abundance, to categorize the surface area of detectable cyanobacteria blooms into three risk categories: low-risk, moderate-risk, and high-risk bloom area. Results show that in Florida, the area of CHABs is increasing, largely due to observed increases in high-risk bloom area. California exhibits slight decreasing trends in CHAB surface area, primarily attributed to changes in Northern California. In Ohio (excluding Lake Erie), we observe small increases in cyanobacteria blooms in all risk categories of CHABs. This study is the first of its kind to use satellite remote sensing to quantify trends in inland CHAB surface area across multiple water bodies for entire US states.
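The categorization of bloom area into three risk classes can be sketched as below. Only the 100,000 cells/mL high threshold appears elsewhere on this page; the 20,000 cells/mL boundary between low and moderate risk is an assumption based on the commonly cited WHO recreational guidance levels, and the pixel values are hypothetical. The pixel area assumes one 300 m x 300 m pixel.

```python
from collections import Counter

PIXEL_AREA_KM2 = 0.3 * 0.3   # one 300 m x 300 m pixel
MODERATE_CELLS_PER_ML = 20_000   # assumed low/moderate boundary
HIGH_CELLS_PER_ML = 100_000      # high threshold quoted in the text


def risk_category(cells_per_ml):
    """Assign a detected-bloom pixel to a low, moderate, or high risk class."""
    if cells_per_ml > HIGH_CELLS_PER_ML:
        return "high"
    if cells_per_ml > MODERATE_CELLS_PER_ML:
        return "moderate"
    return "low"


def bloom_area_by_risk(detected_pixels):
    """Total bloom area in km^2 per risk class for one scene or composite."""
    counts = Counter(risk_category(v) for v in detected_pixels)
    return {cat: counts[cat] * PIXEL_AREA_KM2
            for cat in ("low", "moderate", "high")}


# Hypothetical cell abundance (cells/mL) for pixels with detectable bloom.
areas = bloom_area_by_risk([5_000, 25_000, 150_000, 300_000])
print(areas)
```

Computing these areas for each time step yields one series per risk class, which can then be examined for trends separately.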

Landscape and land-use variables (67+) were acquired from multiple sources (raster and vector) and aggregated to the 12-digit HUC basin level. Primary inputs included the 2011 National Land Cover Database (NLCD), USGS 30-m digital elevation models (DEM), STATSGO soils data, and other sources. Variables include statistics for land cover percentages per unit HUC, land cover percentages within a 100-m stream buffer, soil type, and Concentrated Animal Feeding Operation (CAFO) density, among others. Lake sheds associated with HAB-perturbed freshwater lakes have been identified and will be modeled via both vector and raster methods (i.e., Factorial Statistical Design). From August 15-26, riparian buffer in situ data was assessed in four lake sheds in Wisconsin (2) and Minnesota (2). This riparian assessment will inform the utility of differing buffer constituencies.

Health

We are restricting study areas to water bodies that are large enough to be evaluated for cyanobacteria blooms via remotely sensed images and that have a spatially proximate population of potentially exposed persons. States vary in how they collect health data and in their rules for accessing these data. We are in the process of creating lists of areas to include in our health data requests.

Economics

The MERIS CI product has been used to assist the EPA National Center for Environmental Economics with a hedonic housing price model for cyanobacteria blooms in a Florida estuary and brackish lagoons. The CI and CI_cyano products have been used in collaboration with the EPA National Risk Management Research Laboratory to inform a hedonic housing price model for cyanobacteria blooms in Ohio. This data will also be used in the Puget Sound housing price models. If the CyAN data can be successfully used in the housing price models across Florida, Ohio, and Washington state, this data could provide EPA with a consistent measure of water quality that is available across the country and has potential to be included in EPA's National Water Quality Benefits framework.

We are in the process of gathering information on the costs of HAB monitoring programs across the country. This data will be used to evaluate how data generated from the CyAN project can help to save money or improve operations for different organizations involved in HAB monitoring.

We are also in the initial stages of developing a survey instrument to collect data for estimating the economic benefits associated with avoiding or mitigating HAB events. The MERIS product will be used to assess different regions across the country for conducting the survey. The satellite data products will be used to generate current baseline descriptive characteristics and future scenarios of HAB activity in the study region. There is considerable interest in the results of this survey across different offices within EPA and outside the agency.

We are developing an approach to use data from the National Survey on Recreation and the Environment (NSRE) and the satellite data to estimate the value of water quality in freshwater lakes across the U.S.

Decision Support

We have developed EnviroAtlas demonstration widgets. An interactive data dashboard prototype was generated to support public access to CyAN data. The “CyAN Historic Data Dashboard” was built using an open source software stack (leaflet.js, d3.js) and will support discovery and evaluation of the CyAN dataset by public citizens and water managers. The online tool will be housed on the EPA EnviroAtlas website. This tool provides historical context to aid in interpreting cyanobacteria bloom events. Collection of user feedback is currently underway to improve the dashboard tool before publication.

An Android app development environment was established, and the app was modified to communicate with the server environment. Notification API requests are functional in external tools (browser, Postman), in Android emulators, and on KitKat-based phones.

California is reviewing the 10-year MERIS time series and providing feedback. They have identified inconsistencies in Lake Tahoe that are being examined. They are also testing the minimum lake size for the climatological analysis approach. We have identified a need for a more refined but flexible land masking procedure, to address both 1-2 pixel navigation issues and varying lake levels. A two-day training session was developed for California. This includes a lecture set and a hands-on GIS method.

The team also developed a public-facing website for general project information, project updates, and publication links.

Lakes with Potential 300m MERIS/Sentinel-3 Data

Lake point locations (shown on map below) were generated using boundaries from National Hydrography Dataset (NHD) and National Lakes Assessment (NLA) water bodies. A "maximum window" of at least 900 meters was used to identify lakes large enough to include at least one 300 x 300 meter MERIS/Sentinel-3 pixel unaffected by land surface reflectance. The 1,862 lakes identified using this methodology were then intersected with state boundaries to calculate how many lakes are located within each state. Lakes that intersect more than one state are counted within each of those states, and total values may therefore exceed 1,862. The NHD database is missing many lake name values. Where available, missing lake name values were identified using the National Inventory of Dams (NID) and other databases.
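Two parts of this procedure can be illustrated in a few lines. The sketch below rests on stated assumptions and is not the GIS workflow actually used: it treats a 900 m window as a 3 x 3 block of 300 m pixels, so a lake qualifies if some pixel and all eight of its neighbors are water, and it counts a lake once in every state it intersects. The masks, lake identifiers, and state lists are hypothetical.

```python
from collections import Counter


def has_clear_pixel(water_mask):
    """True if some pixel and all 8 of its neighbors are water (3 x 3 window)."""
    rows = len(water_mask)
    cols = len(water_mask[0]) if rows else 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if all(water_mask[i][j]
                   for i in (r - 1, r, r + 1)
                   for j in (c - 1, c, c + 1)):
                return True
    return False


def lakes_per_state(lake_states):
    """Count lakes per state; a lake spanning several states counts in each."""
    counts = Counter()
    for states in lake_states.values():
        counts.update(set(states))
    return counts


W, L = True, False  # water, land
wide_lake = [[W, W, W], [W, W, W], [W, W, W]]
narrow_lake = [[W, W, W], [W, W, W], [W, W, L]]

lake_states = {"lake_1": ["OH"], "lake_2": ["OH", "MI"], "lake_3": ["FL"]}
counts = lakes_per_state(lake_states)
print(has_clear_pixel(wide_lake), has_clear_pixel(narrow_lake))
print(dict(counts), sum(counts.values()))
```

In this example three lakes yield a state total of four, which shows how summed state counts can exceed the number of lakes.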

Project Contacts

Contact Us about the CyAN Project
Leads for each Agency:
  • Blake Schaeffer, EPA
  • Jeremy Werdell, NASA
  • Keith Loftin, USGS
  • Richard Stumpf, NOAA