Harmful algal bloom smart device application: using image analysis and machine learning techniques for classification of harmful algal blooms
James Lazorchak & Joel Allen, Environmental Protection Agency
Michael Waters & Miriam Steinitz-Kannan, Northern Kentucky University
Blue-green algae, also known as cyanobacteria, can grow explosively into harmful algal blooms (HABs), which can release toxins hazardous to humans and wildlife and contaminate drinking water sources.
Objective 1: Continue to improve the accuracy of the machine learning algorithm used in the iOS smart device application “HAB app” to automatically classify a potential HAB from an image taken with a smart device.
Objective 2: Continue to improve the accuracy of the machine learning algorithms used in fixed camera monitoring systems to automatically determine the presence and concentration of a HAB. Three fixed camera monitoring systems are currently operational: one at Lake Harsha in Clermont County, OH, and two on the Ohio River.
Results & Discussion
Objective 1 (HAB app): Confusion matrix comparing actual vs. predicted classes for the iOS HAB application.
*Training set divided into 30% test images (n=11)
Confusion Matrix (70/30), n=41. Predicted classes: Green, BlueGreen.
Confusion Matrix (70/30), n=52. Predicted classes: Green, BlueGreen. Cell counts: 6 0 / 0 9.
95% Confidence Interval for Accuracy: (0.78, 1.0)
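The poster does not say how the accuracy intervals were computed. One reading that is consistent with both reported intervals, (0.72, 1.0) and (0.78, 1.0), and with test sets of 11 and 15 images, is the exact (Clopper-Pearson) binomial interval for a test set in which every image is classified correctly. In that special case the lower bound has the closed form (alpha/2)^(1/n) and the upper bound is 1. A minimal Python sketch under that assumption (the function name is illustrative, not from the project):

```python
def exact_ci_all_correct(n_test, alpha=0.05):
    """Two-sided Clopper-Pearson interval for accuracy when all n_test
    test images are classified correctly.

    ASSUMPTION: perfect test accuracy and the exact binomial method are
    inferred from the reported numbers, not stated in the poster.
    """
    if n_test <= 0:
        raise ValueError("n_test must be a positive integer")
    # With zero errors, the lower bound solves p**n_test = alpha / 2.
    lower = (alpha / 2) ** (1.0 / n_test)
    return lower, 1.0


for n in (11, 15):
    lower, upper = exact_ci_all_correct(n)
    print(f"n={n}: ({lower:.2f}, {upper:.1f})")
# n=11: (0.72, 1.0)
# n=15: (0.78, 1.0)
```

The wide intervals reflect the small test sets: even with no misclassified test images, 11 to 15 images cannot rule out a true accuracy in the 70 to 80% range.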
Design & Methods
Objectives 1 and 2 use a support vector machine binary classifier. Image pixel colors are transformed from red-green-blue (RGB) to hue-saturation-value (HSV) color space and classified using an augmented method of color indexing. Test images are verified as correctly classified by agency and other scientists using grab samples, plankton tows, phycocyanin sensors, and hyperspectral satellite imaging.
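The color-space step can be illustrated with a small sketch. It converts RGB pixels to HSV with Python's standard `colorsys` module, builds a normalized hue-saturation histogram, and compares histograms by histogram intersection, the matching score used in classic color indexing. The bin counts, function names, and sample pixel values are assumptions for illustration only; the poster does not describe its "augmented" variant, and the support vector machine that would take such histograms as feature vectors is not shown.

```python
import colorsys


def hs_histogram(pixels, h_bins=8, s_bins=4):
    """Normalized hue-saturation histogram of (r, g, b) pixels in 0..255."""
    hist = [0.0] * (h_bins * s_bins)
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that h == 1.0 or s == 1.0 falls in the last bin.
        h_idx = min(int(h * h_bins), h_bins - 1)
        s_idx = min(int(s * s_bins), s_bins - 1)
        hist[h_idx * s_bins + s_idx] += 1.0
    total = sum(hist)
    return [count / total for count in hist] if total else hist


def intersection(hist_a, hist_b):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


# Hypothetical pixel samples, not taken from the project's images.
green_ref = hs_histogram([(60, 180, 70), (50, 170, 60), (70, 190, 80)])
bluegreen_ref = hs_histogram([(30, 150, 140), (20, 140, 150), (40, 165, 150)])
query = hs_histogram([(35, 155, 145), (25, 145, 150)])

scores = {
    "Green": intersection(query, green_ref),
    "BlueGreen": intersection(query, bluegreen_ref),
}
label = max(scores, key=scores.get)
print(label, scores)
```

Dropping the value channel from the histogram is a common way to reduce sensitivity to overall brightness, which matters for outdoor images taken at different times of day.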
[Confusion matrix cell counts: 0, 0, 8]
Objectives 1 & 2:
a. Collect more correctly classified training images for improved accuracy
b. Perform a principal component analysis to reduce misclassification
c. Calibrate classifier to conform to World Health Organization harmful algal bloom concentration risk levels
d. Add additional fixed-camera monitoring stations on freshwater systems
e. Evaluate camera position, time of day, and season effects on picture quality and potential effects on classification
Classification of algae with and without trichomes. The model correctly classified 4 of 5 images with trichomes and 1 of 3 non-trichome images.
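For reference, the counts above translate into per-class and overall rates as follows. This is a minimal sketch; treating "with trichomes" as the positive class is a labeling choice made here, not one stated in the source.

```python
def rates(tp, fn, tn, fp):
    """Return (recall, specificity, accuracy) from binary confusion counts."""
    recall = tp / (tp + fn)            # fraction of trichome images found
    specificity = tn / (tn + fp)       # fraction of non-trichome images found
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return recall, specificity, accuracy


# 4 of 5 trichome images and 1 of 3 non-trichome images were correct.
recall, specificity, accuracy = rates(tp=4, fn=1, tn=1, fp=2)
print(f"trichome {recall:.1%}, non-trichome {specificity:.1%}, overall {accuracy:.1%}")
# trichome 80.0%, non-trichome 33.3%, overall 62.5%
```

With only eight test images these rates are rough indications, which fits the stated plan to collect more training images.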
Objective 3:
a. Collect more correctly classified training images for improved accuracy
b. Re-design neural network architecture to better detect features
Estimated Cell Counts
Objective 3 makes use of a convolutional neural network architecture trained on correctly classified microscopic images to differentiate among genera of HABs by shape. Test images will be verified by agency and other scientists.
Next Steps
BUOY Site Fixed Camera Station, July 18th at 10:30 am
HAB app beta version
Lake Harsha, Aug 12, 2017, 11:30 a.m. High probability of a HAB (confirmed).
Supervised machine learning using convolutional neural networks has been shown to be surprisingly effective in classification problems involving diatoms, benthic freshwater macroinvertebrates, and microalgae. A CNN approach was used initially to classify algae as being in long strands (trichomes) or not – part of a dichotomous key for classification of HABs developed at Northern Kentucky University.
EFLD/EFLS = 1,023,293 cells/ml
BUOY = 676,083 cells/ml
EMB = 323,594 cells/ml
ENN = 1,258,925 cells/ml
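The Next Steps mention calibrating the classifier to World Health Organization risk levels. As a rough illustration of what such a mapping could look like, the sketch below bins the estimated cell counts listed here. The two thresholds are an assumption based on commonly cited WHO recreational-water guidance values (20,000 and 100,000 cyanobacterial cells/ml) and should be checked against the current WHO text; the function and label names are hypothetical.

```python
# ASSUMPTION: commonly cited WHO recreational-water guidance values,
# not taken from the poster. Verify before relying on them.
LOW_RISK_CELLS_PER_ML = 20_000
MODERATE_RISK_CELLS_PER_ML = 100_000


def who_guidance_level(cells_per_ml):
    """Map a cyanobacterial cell count to an assumed guidance tier."""
    if cells_per_ml < 0:
        raise ValueError("cell count cannot be negative")
    if cells_per_ml >= MODERATE_RISK_CELLS_PER_ML:
        return "moderate"
    if cells_per_ml >= LOW_RISK_CELLS_PER_ML:
        return "low"
    return "below guidance"


# Estimated cell counts from the poster (cells/ml).
site_counts = {
    "EFLD/EFLS": 1_023_293,
    "BUOY": 676_083,
    "EMB": 323_594,
    "ENN": 1_258_925,
}
levels = {site: who_guidance_level(count) for site, count in site_counts.items()}
for site, level in levels.items():
    print(f"{site}: {level}")
```

The highest tier in that guidance is usually tied to visible scum formation, not to a cell count alone, so this sketch does not assign it.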
U.S. Environmental Protection Agency Office of Research and Development
[Confusion matrix fragment: BlueGreen, 3]
95% Confidence Interval for Accuracy: (0.72, 1.0)
Satellite Imaging
Blue-Green Algae
Model Training Architecture
Objective 3: Develop a machine learning algorithm using convolutional neural networks (CNNs) to automatically classify harmful algae microscopically at the genus level.
Green Algae
Objective 2 (Lake Harsha): Confusion matrix comparing actual vs. predicted classes for Lake Harsha.
*Training set divided into 30% test images (n=15)
August 12th at 11:30 am
Introduction & Objectives
www.epa.gov
Fixed Camera Prediction of Cyanobacteria
Camera Prediction:
• 10:30 am: 98.3% Probability of Bluegreens
• 11:30 am: 100% Probability of Bluegreens
Total Target MC Concentration
EFLD/EFLS = 1244.3 ppt (IC 446 ppt)
BUOY = 1179.7 ppt (IC 383 ppt)
EMB = 1725.3 ppt (IC 560 ppt)
ENN = 1843.8 ppt (IC 1032 ppt)
Acknowledgments
• Ecological Stewardship Institute at Northern Kentucky University
• Northern Kentucky University Department of Mathematics and Statistics
• Northern Kentucky University Department of Biological Sciences
• Thomas More College Department of Biological Sciences
• Marshall University Department of Biological Sciences
• Ohio River Valley Sanitation Commission (ORSANCO)
• Foundation for Ohio River Education (FORE)
• Oakland University
• Lake Superior State University
• Wayne State University
• Michigan Department of Environmental Quality
Contacts
Requests for beta-testing the iOS smart device application HAB app or information about fixed-camera monitoring:
[email protected] or
[email protected]
Above: The MicrobeScope™ in use, with designed slide attachment. Right: Some algae photos taken with the MicrobeScope™.
Project website: https://mathstat.nku.edu/hab
The views expressed in this poster are those of the authors and do not necessarily represent the views or policies of the U.S. government.