Into the FathomVerse: Community science, data, and the sea

Despite extensive research dives and piles of visual data, ocean scientists estimate that more than 90% of marine life remains unclassified. Beneath the frothy waves, there’s a cosmopolis of undersea denizens, unknown and unaware of the world above. 

However, the ongoing climate crisis highlights just how entangled these two worlds are. As we seek to limit human impact on the planet, a better understanding of our oceans is vital. 

“How do we know if anything we’re doing is sustainable if we don’t even know the animals that we’re impacting?” said Dr. Kakani Katija, a principal engineer at the Monterey Bay Aquarium Research Institute (MBARI). “The faster we can process data and learn, the faster we can inform projects like offshore wind or management of wildlife habitats.” For Dr. Katija and her team at MBARI, the answer lay at the intersection of community science, AI, and mobile gaming.

The data problem 

Since its founding in 1987, MBARI has kept meticulous records of its expeditions and research. Founder David Packard’s vision of a library accessible to all researchers at the institute is now a reality. 

“We’re really fortunate at MBARI to have a huge infrastructure for video annotation,” said Lonny Lundsten, a senior researcher and engineering technician at MBARI. “We have made 11 million observations from the approximately 30,000 hours of footage collected over 37 years.” Lundsten works primarily as a video analyst, cleaning and annotating the video generated during dives and expeditions.

“It is a huge undertaking to generate the data manually, reviewing every frame of video and annotating these observations of species, habitats, and other interesting events.” 

As the use of imaging systems for scientific purposes has grown at MBARI, traditional manual analysis of these ever-increasing data streams has become intractable. So for the last five years, MBARI researchers have spent a lot of time precisely drawing boxes around organisms and other objects seen in video and images to train machine-learning models. Once trained, the models can do the tedious job of combing through the 1,000-plus hours of video collected by MBARI’s diverse tech platforms each year.

“These tools allow us to rapidly process the data. What took us hundreds of hours to do before, manually, can take just days and is quite cost-effective for models to process a high volume of data in the cloud or on local computers,” he said. 

“Humans just can’t match that.” 

It was a natural evolution and incredibly successful, but it wasn’t flawless. Massaging the results of the machine-learning model took time for Lundsten and his colleagues. MBARI is constantly gathering data, too—one automated underwater vehicle Lundsten works with generates 10GB of footage per hour over its 200-hour expedition, racking up 2TB of data during each of its weeklong deployments.  
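A quick back-of-the-envelope check shows how those figures fit together. The sketch below assumes decimal units (1 TB = 1,000 GB), which the article does not specify, and the function and constant names are purely illustrative.

```python
# Figures quoted in the article.
GB_PER_HOUR = 10            # footage generated per hour
HOURS_PER_DEPLOYMENT = 200  # length of one expedition

# Assumption: decimal units, 1 TB = 1,000 GB.
GB_PER_TB = 1000


def deployment_volume_tb(gb_per_hour, hours, gb_per_tb=GB_PER_TB):
    """Return the total data volume of one deployment in terabytes."""
    return gb_per_hour * hours / gb_per_tb


print(deployment_volume_tb(GB_PER_HOUR, HOURS_PER_DEPLOYMENT))  # 2.0
```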

MBARI is just one of many research organizations faced with a deluge of visual data. To scale analysis of this trove of valuable data, researchers are increasingly turning to AI. However, scientists needed a way to train the models. Hiring a flotilla of interns and research assistants was neither practical nor affordable, but human oversight seemed deeply necessary. Happily, one of MBARI’s programs, Ocean Vision AI, which aims to benefit the larger scientific community, had a solution in mind: gamify the discovery process and enlist community scientists to help train the models.

Building a community science tool 

Based on the success of other community science projects, Dr. Katija felt there was an opportunity to leverage the legion of ocean lovers around the world for good. She and her team at MBARI and Ocean Vision AI would build a mobile game to aid the data reviewing and annotating process while engaging the public. FathomVerse was born, but there was one problem—no one at MBARI had ever designed a video game before.  

Knowing that she needed new expertise, Dr. Katija sourced proposals from game studios around the world to find the right partner. &ranj, a Rotterdam-based studio, won the contract thanks to its extensive experience in gamifying education. For game director and co-owner GAF van Baalen, the collaboration was a perfect match. 

“Our motto is ‘Together we build a brighter future,’” van Baalen said. “We want to contribute to a better world, and projects like FathomVerse don’t come along every year.” That motivation informed the studio’s designs, as well. 

“We didn’t just pick the best mechanics in the mobile space and slap fish on top. We built everything from the ground up,” van Baalen said.

Early designs from MBARI grew into a full-fledged game. Players would be shown real pictures and stills from research expeditions and be tasked with identifying the animals in the photos. The idea was that by correctly identifying animals, the players would do the preliminary data cleaning and annotation after AI models took a crack at it, speeding up the process and minimizing errors.  

It was an elegant design but a tricky challenge for &ranj. Designers needed to find a way to provide useful results to the FathomVerse development team while still engaging their players.  

“One of our key challenges was providing meaningful feedback,” van Baalen said. “When you fail in a game, you need feedback to adapt and improve the next time you play. We knew that was vital, but much of the data was unlabeled or had low confidence in the proposed label.” 

To combat this, the team developed a consensus algorithm, powered by the FathomVerse player base. Using metadata such as how many votes an image has received and how successful the players who voted have been, the algorithm tries to reach a consensus on what the animal might be. The system was designed to discourage speed or score chasing, because the accuracy of the data is paramount.
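The article does not describe how the consensus algorithm works internally, so the following is only a minimal sketch of the general idea: a weighted vote in which each player's label counts in proportion to that player's past success rate. The function name, the weighting scheme, and the `min_votes` and `min_share` thresholds are all hypothetical and are not taken from FathomVerse.

```python
from collections import defaultdict


def weighted_consensus(votes, min_votes=3, min_share=0.6):
    """Illustrative accuracy-weighted vote for a single image.

    votes: list of (label, player_accuracy) pairs, where player_accuracy
           is a hypothetical 0..1 track record for the voting player.
    Returns (label, share) if a consensus is reached, else (None, share).
    """
    # Too few votes: do not attempt a consensus yet.
    if len(votes) < min_votes:
        return None, 0.0

    # Sum the accuracy weights behind each proposed label.
    weight_by_label = defaultdict(float)
    for label, accuracy in votes:
        weight_by_label[label] += accuracy

    total = sum(weight_by_label.values())
    if total == 0:
        return None, 0.0

    # The leading label wins only if it holds a large enough share.
    best_label = max(weight_by_label, key=weight_by_label.get)
    share = weight_by_label[best_label] / total
    if share >= min_share:
        return best_label, share
    return None, share


# Two reliable players outweigh one less reliable dissenting vote.
label, share = weighted_consensus(
    [("jellyfish", 0.9), ("jellyfish", 0.8), ("squid", 0.4)]
)
print(label)  # jellyfish
```

Because only a player's accuracy contributes to the weight in this sketch, answering quickly or chasing a high score earns no extra influence, which matches the stated goal of putting data accuracy first.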

“As much as we want to grow the game, the data is the most important part. We want to engage players, but we must keep the science front and center,” van Baalen said. This philosophy has proven successful; over 10,000 players from 100 countries are logging data in the game and have generated more than 4 million labels already. 

Bringing it to the surface 

After the data gets beamed back to MBARI, it goes through more rounds of review before it’s officially sent to databases or the originators of the data. As time goes on, this process will be streamlined and automated, accelerating discovery and empowering researchers. 

“What we’re trying to do is lighten the load for researchers,” Dr. Katija said. “Instead of repetitive tasks, researchers can focus on generating insights from the data. They can think more deeply about the data and how to learn from it and grow our understanding about how these ecosystems function and change in a changing ocean.”  

Lundsten envisions these AI models serving as a research assistant that catches everything.

“At some point, I see us actively using this on the research vessels while we’re at sea,” he said. “While surveying, the model would be running in real time, drawing boxes, identifying, and cataloging everything that is seen in the video stream. For example, we might use the ROV for a geological survey, but someone reviewing the machine-learning proposals in the control room may notice an anomalous detection, alerting the ROV pilots and scientists to go back and get a better look. The anomaly could be a species that is new to science or another important discovery. This might be especially valuable when the scientists are geologists, for example, and not familiar with the biology that is being seen.” 

“Each dive becomes more insightful, more invaluable—it’s almost a little bit like magic.” 

For Dr. Katija, this is the end game of the project, its final form: an automated system that identifies, catalogs, and empowers researchers all over the world, from the Pacific to the Persian Gulf. While discovery drives the game and research, the impact always comes back to the ocean and how we’re affecting it. 

“Climate change is happening now,” she said, “and the public is rightfully concerned and feels paralyzed with very few options to make a difference.  

“Now, just by playing a game on your phone, you are directly taking climate action.” 

Images courtesy of MBARI

Header artwork by Rachel Garcera
