Focusing on disaster response with Amazon Augmented AI and Mechanical Turk

It’s easy to distinguish a lake from a flood. But when you’re looking at an aerial photograph, factors like angle, altitude, cloud cover, and context can make the task more difficult. And when you need to identify 100,000 aerial images in order to give first responders the information they need to accelerate disaster response efforts? That’s when you need to combine the speed and accuracy of machine learning (ML) with the precision of human judgement.

With a constant supply of low altitude disaster imagery (LADI) and satellite imagery coming online, researchers are looking for faster and more affordable ways to label this content so that it can be used by stakeholders like first responders and state, local, and federal agencies. Because the process of labeling this data is expensive, manual, and time-consuming, developing ML models that can automate image labeling (or annotation) is critical to bringing this data into a more usable state. And to develop an effective ML model, you need a ground truth dataset: a labeled set of data that is used to train your model. The lack of an adequate ground truth dataset for LADI images put model development out of reach until now.

A broad array of organizations and agencies are developing solutions to this problem, and Amazon is there to support them with technology, infrastructure, and expertise. By integrating the full suite of human-in-the-loop services into a single AWS data pipeline, we can improve model performance, reduce the cost of human review, simplify the process of implementing an annotation pipeline, and provide prebuilt templates for the worker user interface, all while supplying access to an elastic, on-demand Amazon Mechanical Turk workforce that can scale to the annotation task volumes that natural disaster events generate.

One of the projects that has made headway in the annotation of disaster imagery was developed by students at Penn State. Working alongside a team of MIT Lincoln Laboratory researchers, students at Penn State College of Information Sciences and Technology (IST) developed a computer model that can improve the classification of disaster scene images and inform disaster response.

Developing solutions

The Penn State project began with an analysis of imagery from the Low Altitude Disaster Imagery (LADI) dataset, a collection of aerial images taken above disaster scenes since 2015. Based on work supported by the United States Air Force, the LADI dataset was developed by the New Jersey Office of Homeland Security and Preparedness and MIT Lincoln Laboratory, with support from the National Institute of Standards and Technology’s Public Safety Innovation Accelerator Program (NIST PSIAP) and AWS.

“We met with the MIT Lincoln Laboratory team in June 2019 and recognized shared goals around improving annotation models for satellite and LADI objects, as we’ve been developing similar computer vision solutions here at AWS,” says Kumar Chellapilla, General Manager of Amazon Mechanical Turk, Amazon SageMaker Ground Truth, and Amazon Augmented AI (Amazon A2I) at AWS. “We connected the team with the AWS Machine Learning Research Awards (now part of the Amazon Research Awards program) and the AWS Open Data Program and funded MTurk credits for the development of MIT Lincoln Laboratory’s ground truth dataset.” Mechanical Turk is a global marketplace for requesters and workers to interact on human intelligence-related work, and is often leveraged by ML and artificial intelligence researchers to label large datasets.

With the annotated dataset hosted as part of the AWS Open Data Program, the Penn State students developed a computer model to create an augmented classification system for the images. This work has led to a trained model with an expected accuracy of 79%. The students’ code and models are now being integrated into the LADI project as an open-source baseline classifier and tutorial.

“They worked on training the model with only a subset of the full dataset, and I anticipate the precision will get even better,” says Dr. Jeff Liu, Technical Staff at MIT Lincoln Laboratory. “So we’ve seen, just over the course of a couple of weeks, very significant improvements in precision. It’s very promising for the future of classifiers built on this dataset.”

“During a disaster, a lot of data can be collected very quickly,” explains Andrew Weinert, Staff Research Associate at MIT Lincoln Laboratory who helped facilitate the project with the College of IST. “But collecting data and actually putting information together for decision-makers is a very different thing.”

Integrating human-in-the-loop services

Amazon also supported the development of an annotation user interface (UI) that aligned with common disaster classification codes, such as those used by urban search and rescue teams, which enabled MIT Lincoln Laboratory to pilot real-time Civil Air Patrol (CAP) image annotation following Hurricane Dorian. The MIT Lincoln Laboratory team is in the process of building a pipeline to bring CAP data through this classifier using Amazon A2I to route low-confidence results to Mechanical Turk for human review. Amazon A2I seamlessly integrates human intelligence with AI to offer human-level accuracy at machine-level scale for AWS AI services and custom models, and enables routing low-confidence ML results for human review.

“Amazon A2I is like ‘phone a friend’ for the model,” Weinert says. “It helps us route the images that can’t confidently be labeled by the classifier to MTurk workers for review. Ultimately, developing the tools that can be used by first responders to get help to those that need it is on top of our mind when working on this type of classifier, so we are now building a service to combine our results with other datasets like GIS (geographic information systems) to make it useful to first responders in the field.”

Weinert says that in a hurricane or other large-scale disaster, there could be up to 100,000 aerial images for emergency officers to analyze. For example, an official may be seeking images of bridges to assess damage or flooding nearby and needs a way to review the images quickly.

“Say you have a picture that at first glance looks like a lake,” says Dr. Marc Rigas, Assistant Teaching Professor, Penn State College of IST. “Then you see trees sticking out of it and realize it’s a flood zone. The computer has to know that and be able to distinguish what is a lake and what isn’t.” If it can’t distinguish between the two with confidence, Amazon A2I can route that image for human review.
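The routing idea described above can be sketched as a simple confidence check. This is a minimal illustration, not the MIT Lincoln Laboratory pipeline: the `Prediction` record, the `route_predictions` helper, the image IDs, and the 0.80 threshold are all hypothetical, since the article does not state what the team actually uses.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical threshold; the article does not state the value the team uses.
CONFIDENCE_THRESHOLD = 0.80


@dataclass
class Prediction:
    image_id: str      # hypothetical identifier for an aerial image
    label: str         # classifier's top label, e.g. "flood" or "lake"
    confidence: float  # classifier's score for that label, in [0, 1]


def route_predictions(
    predictions: List[Prediction],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into (auto_accepted, needs_human_review)."""
    accepted: List[Prediction] = []
    review: List[Prediction] = []
    for p in predictions:
        # Confident predictions are kept; the rest go to human reviewers.
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            review.append(p)
    return accepted, review


# Made-up sample data for illustration only.
sample = [
    Prediction("img-001", "lake", 0.97),
    Prediction("img-002", "flood", 0.55),
    Prediction("img-003", "flood", 0.91),
]
accepted, review = route_predictions(sample)
print("accepted:", [p.image_id for p in accepted])
print("human review:", [p.image_id for p in review])
```

In a real pipeline, the review branch is where a human loop would be started, for example through the `start_human_loop` call of the boto3 `sagemaker-a2i-runtime` client. The flow definition and worker task template that call requires are not shown here.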

There is a critical need to develop new technology to support incident and disaster response following natural disasters, such as computer vision models that detect damaged infrastructure or dangerous conditions. Looking forward, we will combine the power of custom ML models with Amazon A2I to route low-confidence predictions to workers who annotate images to identify categories of natural disaster damage.

During hurricane season, redundant systems that a workforce can access from home make it possible to annotate data in real time as new image sets become available.

Looking forward

Grace Kitzmiller from the Amazon Disaster Response team envisions a future where projects such as these can change how disaster response is handled. “By working with researchers and students, we can partner with the computer vision community to build a set of open-source resources that enable rich collaboration among diverse stakeholders,” Kitzmiller says. “With the idea that open-source development can be driven on the academic side with support from Amazon, we can accelerate the process of bringing some of these microservices into production for first responders.”

Joe Flasher of the AWS Open Data Program discussed the huge strides in predictive accuracy that classifiers have made in the last few years. “Using what we know about a specific image, its GIS coordinates and other metadata can help us improve classifier performance of both LADI and satellite datasets,” Flasher says. “As we begin to combine and layer complementary datasets based on geospatial metadata, we can both improve accuracy and enhance the depth and granularity of results by incorporating attributes from each dataset in the results of the selected set.”

Mechanical Turk and the MIT Lincoln Laboratory are putting together a workshop that enables a broader group of researchers to leverage the LADI ground truth dataset to train classifiers using SageMaker Ground Truth. Low-confidence results are routed through Amazon A2I for human annotation using Mechanical Turk, and the team can rerun models using the enhanced ground truth set to measure improvements in model performance. The workshop results will contribute to the open-source resources shared through the AWS Open Data Program. “We are very excited to support these academic efforts through the Amazon Research Awards,” says An Luo, Senior Technical Program Manager for academic programs, Amazon AI. “We look for opportunities where the work being done by academics advances ML research and is complementary to AWS goals while advancing educational opportunities for students.”
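The review loop in the workshop description (human annotations folded back into the ground truth set, then models rerun and compared) can be sketched roughly as follows. Everything here is an assumption made for illustration: the helper names, the label values, and the data are invented, and the workshop's real tooling is not described in this article.

```python
from typing import Dict


def merge_human_labels(
    ground_truth: Dict[str, str],
    human_labels: Dict[str, str],
) -> Dict[str, str]:
    """Return an enhanced ground truth set; human labels override or extend it."""
    enhanced = dict(ground_truth)
    enhanced.update(human_labels)
    return enhanced


def accuracy(predicted: Dict[str, str], truth: Dict[str, str]) -> float:
    """Fraction of images in `truth` whose predicted label matches."""
    if not truth:
        raise ValueError("truth must not be empty")
    correct = sum(
        1 for image_id, label in truth.items() if predicted.get(image_id) == label
    )
    return correct / len(truth)


# Made-up data for illustration only.
ground_truth = {"a": "flood", "b": "flood", "c": "lake"}
human_labels = {"b": "lake", "d": "bridge"}  # reviewers correct "b" and add "d"
enhanced = merge_human_labels(ground_truth, human_labels)

# Hypothetical model outputs before and after retraining on the enhanced set.
preds_before = {"a": "flood", "b": "flood", "c": "lake", "d": "flood"}
preds_after = {"a": "flood", "b": "lake", "c": "lake", "d": "flood"}

before = accuracy(preds_before, enhanced)
after = accuracy(preds_after, enhanced)
print(f"accuracy before: {before:.2f}, after: {after:.2f}")
```

Scoring both runs against the same enhanced set keeps the comparison consistent, because any change in the score then comes from the model rather than from a shifting reference.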

To start using Amazon Augmented AI, check out the Amazon A2I resources. Resources are also available for use with the Low Altitude Disaster Imagery (LADI) Dataset and for learning more about Mechanical Turk.

About the Author

Morgan Dutton is a Senior Program Manager with the Amazon Augmented AI and Mechanical Turk team. She works with academic and public sector customers to accelerate their use of human-in-the-loop ML services. Morgan is interested in collaborating with academic customers to support adoption of ML technologies by students and educators.