In March of 2020, some positive news was buried in virus panic: using advanced stem cell technology, doctors cured a second man of HIV. In June, researchers released a new combination drug, a “truly life-transforming” treatment for cystic fibrosis patients. Scientists have discovered two pathways for Alzheimer’s they are “cautiously hopeful” might translate to a treatment. While the world’s eyes are locked on the novel coronavirus, researchers are using novel medical technologies to end deadly diseases.
John Shepherd, Ph.D., is a data scientist hoping breast cancer will be next.
Shepherd is a researcher and professor of epidemiology and population sciences at the University of Hawai‘i Cancer Center. His latest research involves using AI to study biomarker patterns in mammograms. He began his career in medical imaging before joining UCSF, where he first got started with mammography. He began breast cancer research around the same time AlexNet, a convolutional neural network, won the ImageNet contest in 2012, proving AI’s incredible power to spot and differentiate images.
“When AlexNet won that contest and I became aware of [AlexNet], it changed everything in terms of how we addressed problems with big data,” says Shepherd. AI can recognize millions of images with immense precision, but only when coupled with a deep training model, which requires a great deal of computational power. AlexNet solved this problem by using GPUs, processors built to run many calculations in parallel for speedy, efficient image processing.
AlexNet’s GPU approach made AI’s brilliant image processing newly accessible, so long as one had enough images on hand to feed the network. This was all well and good for Shepherd, who had access to a database of six million mammograms at UCSF.
This was less well and good when Shepherd moved to the University of Hawai‘i and there was no trove of mammograms, but this was also why his move would matter so dearly.
“There’s a lot of databases available to do general learning on,” Shepherd explains—seven or eight with data robust enough to train AI. “But let me just give you an example of where those databases are curated from and you tell me what the predominant race is: San Francisco, Vermont, Mayo Clinic, Nurse’s Study at Harvard, IBIS in Northern Europe, Karma in Sweden…” In other words, the majority of the patients in these databases—up to 80% of them, according to Shepherd—are white.
Correcting this data bias is critical to determine why different ethnicities of women experience such different breast cancer outcomes, and how medical professionals can help vulnerable groups.
“In Hawaii, we find that native Hawaiian women do much worse than Chinese women in terms of mortality and staging,” Shepherd says. “And we’re asking, ‘Why is that? Is it an ethnic issue? Is it something we can see in the breast that we can predict these poor outcomes? Or is it just something that’s social that has to do with access issues?’ But we can’t answer even those simple questions of access unless we have a registry.”
Shepherd had his work cut out for him.
AUGMENTATION
Breast cancer usually reveals itself in mammography as an asymmetry between the left and right breast. But there’s a limit to how well humans can read these asymmetries.
Shepherd explains that the human eye can distinguish only 256 of a mammogram’s 65,000 shades. Additionally, the human brain processes events serially, making it difficult to investigate more than one variable per study.
“AI is just so different,” Shepherd says. AI can see all 65,000 shades of the mammogram and compare thousands of variables’ relevance to cancer outcomes at the same time. All Shepherd would need to do is feed the model image after image: This woman developed cancer five years later, but not this woman. Eventually, the AI spots the biomarkers indicating these outcomes.
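To make the gap between 256 and 65,000 shades concrete, here is a minimal sketch. It assumes the mammogram stores 16-bit gray levels (2^16 = 65,536, which matches the rounded figure above) and that a human-viewable version keeps only 8 bits. The function name and sample values are hypothetical, chosen only for illustration.

```python
LEVELS_16BIT = 2 ** 16  # 65,536 possible gray levels (assumed 16-bit image)
LEVELS_8BIT = 2 ** 8    # 256 gray levels


def to_8bit(pixel_16bit: int) -> int:
    """Map a 16-bit gray level (0..65535) to an 8-bit one (0..255).

    Dropping the low 8 bits merges every run of 256 neighboring
    16-bit levels into a single 8-bit shade.
    """
    return pixel_16bit >> 8


# Hypothetical pixel values, not taken from any real mammogram.
sample = [0, 255, 256, 30000, 30100, 65535]
reduced = [to_8bit(p) for p in sample]
print(reduced)  # [0, 0, 1, 117, 117, 255]
```

The values 30000 and 30100 differ in the 16-bit data but land on the same 8-bit shade. A model reading the raw values can still tell them apart.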
But to train an AI model to find cancerous differences in mammograms, Shepherd must first train the AI to ignore healthy differences in mammograms. “[Breasts are] different in thickness, in size, density, and texture, and [the AI model] has to learn that those things don’t necessarily tell it what it needs to know.”
Shepherd also has to train the AI to ignore imaging differences by showing it the same mammogram multiple times, with variations in granulation, shading, scale, and perspective, a process called “augmenting.”
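As a rough illustration of augmenting, the sketch below produces several variants of one tiny grayscale image by changing its shading, adding grain, and sometimes doubling its scale. It is not Shepherd’s pipeline: the function names, parameter ranges, and 16-bit pixel range are all assumptions, and perspective changes are left out to keep the example short.

```python
import random

Image = list[list[int]]  # rows of gray levels, assumed 0..65535
MAX_LEVEL = 65535


def clamp(value: int) -> int:
    """Keep a gray level inside the valid range."""
    return max(0, min(MAX_LEVEL, value))


def adjust_shading(img: Image, factor: float) -> Image:
    """Brighten (factor > 1) or darken (factor < 1) every pixel."""
    return [[clamp(round(p * factor)) for p in row] for row in img]


def add_grain(img: Image, rng: random.Random, amplitude: int) -> Image:
    """Add small random noise to each pixel to mimic granulation."""
    return [[clamp(p + rng.randint(-amplitude, amplitude)) for p in row]
            for row in img]


def upscale_2x(img: Image) -> Image:
    """Double the image size by repeating each pixel (nearest neighbor)."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def augment(img: Image, count: int, seed: int = 0) -> list[Image]:
    """Return `count` randomly varied copies of the same image."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    variants = []
    for _ in range(count):
        variant = adjust_shading(img, rng.uniform(0.9, 1.1))
        variant = add_grain(variant, rng, amplitude=50)
        if rng.random() < 0.5:
            variant = upscale_2x(variant)
        variants.append(variant)
    return variants


# Hypothetical 2x2 "mammogram"; every variant keeps the original's label.
original = [[1000, 2000], [3000, 4000]]
variants = augment(original, count=5, seed=1)
```

Each variant would carry the same outcome label as the original, so the model sees that shading, grain, and scale do not change the answer.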
In short, for a mammography database to be of any use to AI, it has to be massive. Even then, if the data includes only local women, the model can only help local women. And the women local to major mammography databases are not the ones most in need of help.
“If you compared the high stage rates [the rate at which women are diagnosed with advanced stage breast cancer] in Hawaii relative to Northern California, we have a 50% higher stage rate than Northern California,” Shepherd says. “Their high stage rate is about 10%, it’s about 15% for us, and that’s across all ethnicities.” In areas like Guam or Micronesia, Shepherd says this number can be as high as 50 or even 80%.
Mammography databases not only have to be massive, they also have to be massively diverse for these results to be useful to everyone. Ideally, scientists might have a worldwide mammography database to train AI, but there are real technical barriers to achieving any semblance of one.
First and foremost, protecting women means protecting their privacy.
THE HOPE IN THE SYSTEM
The Hawai‘i Pacific Island Mammography Registry is the first computerized mammography database to focus on women in Hawaii and the Pacific Islands.
Shepherd explains that Pacific Islanders suffer some of the worst breast cancer outcomes, so it’s important that Hawaii pool its mammography resources to attack this disease with all available force. But a democratically accessible mammogram database poses a threat to its patients’ privacy.
Shepherd solves this problem using powerful technology. Here is how the database balances privacy with accessibility. First, Shepherd uses hashed encryption to conceal HIPAA-protected information, even from himself and the other researchers. Second, while anyone in the world can use the database with prior approval, all mammograms and clinical information are stored behind the registry’s firewall and can only be accessed through the database’s dedicated Nvidia DGX supercomputer. Approved researchers can use the supercomputer for a limited amount of time to run their models on the mammography data. Once their time is up, they get to keep their models, but no mammograms or clinical information. Everything on the supercomputer is then wiped for the next user.
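The article does not say which hashing scheme the registry uses. As one common way to hide identifiers from the people analyzing the data, the sketch below replaces a patient ID with a keyed hash (HMAC-SHA256) from Python’s standard library. The function name, the sample IDs, and the key are hypothetical. In practice the key would be held by someone other than the researchers.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Return a stable, non-reversible stand-in for a patient identifier.

    The same ID and key always give the same token, so records for one
    patient can still be linked, but the token does not reveal the ID.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and identifiers, for illustration only.
key = b"example-key-held-outside-the-research-team"
token_a = pseudonymize("patient-0001", key)
token_b = pseudonymize("patient-0002", key)
```

Because the token is the same every time for a given patient, a researcher could still link a woman’s mammograms across years without ever seeing who she is.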
To maintain its patients’ privacy while offering its fruits to as many researchers as possible, the registry needs a constant, quick flow of data to and from the supercomputer.
Western Digital has been able to provide Shepherd with the necessary tools: first, the disk space required to hold the registry, and more recently, terabytes of extremely fast flash memory.
It’s exciting that the technology exists to discover breast cancer biomarkers imperceptible to the human eye, but there are some things it cannot tell Shepherd.
“There’s something I heard the other day about the difference between animals and humans: humans are the only animal that really asks the question, ‘Why?’” This is also the difference between humans and artificial intelligence, Shepherd explains. “If we don’t ask that question, ‘Why?’ And let AI do its own thing, we don’t really learn anything.”
As more and more people have experienced the senseless violence of disease this year, many have found themselves asking that question, “Why?” Sometimes, the human need for causality can feel like a curse.
But that essentially human question is also the driving force behind medical science forging life in the face of diseases that once devastated communities. When data scientists ask, “Why?” they can parse understanding from data sets too massive for a single human to read in full. The human need for causality might feel like a curse, but it is also a superpower, allowing scientists like Shepherd to leverage technology as incomprehensibly powerful as AI to get actionable insights for Hawaiian women facing breast cancer.
It is his job to breathe meaning into these mammograms, images of real women whose lives are at stake. Shepherd insists there is another future for Pacific women. It’s just on the horizon, a shade the human eye cannot yet see.