As the pandemic hits new heights, with nearly 12 million cases and more than 250,000 deaths in the United States, a glimmer of hope is emerging. Moderna and Pfizer, which are each developing vaccines to fight the virus, have released preliminary data suggesting their vaccines are about 95% effective. Manufacturing and distribution are expected to ramp up once the companies seek and receive approval from the U.S. Food and Drug Administration. Moderna and Pfizer officials say the first doses could be available as early as December.
But even if the majority of Americans agree to be vaccinated, the pandemic won’t end suddenly. Kenneth Frazier, CEO of Merck, and others warn that drugs used to treat or prevent COVID-19, the condition caused by the virus, are not silver bullets. Most likely, we will have to wear masks and practice social distancing well into 2021, not only because vaccines are unlikely to be widely available until mid-2021, but also because after each vaccine is released, studies will need to be conducted to monitor the potential for side effects. Scientists will need even more time to determine the degree of protection the vaccines provide against the coronavirus.
During this time of uncertainty, it is tempting to turn to fortune tellers for comfort. In April, researchers from the Singapore University of Technology and Design released a model they claimed could estimate the life cycle of COVID-19. After entering data – including confirmed infections, tests conducted, and the total number of registered deaths – the model predicted the pandemic would end this December.
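The article doesn’t detail how the SUTD model works internally, but published descriptions characterized it as fitting an epidemic curve to reported case counts and extrapolating forward. As a purely illustrative sketch (this is an assumed, simplified stand-in, not the actual SUTD code), the snippet below fits a logistic curve to synthetic cumulative case counts and reads off a projected “end date,” defined here as the day the curve reaches 99% of its plateau:

```python
import math
import random

def logistic(t, K, r, t0):
    """Cumulative cases at day t for a logistic curve with plateau K."""
    return K / (1.0 + math.exp(-r * (t - t0)))

# Synthetic "observed" data: a true logistic curve plus multiplicative noise,
# standing in for the kind of reported case counts such models are fed.
random.seed(0)
TRUE_K, TRUE_R, TRUE_T0 = 100_000, 0.15, 60
days = list(range(90))
observed = [logistic(t, TRUE_K, TRUE_R, TRUE_T0) * (1 + random.uniform(-0.05, 0.05))
            for t in days]

def fit_logistic(days, cases):
    """Grid-search the plateau K; for each K, linearize the logistic as
    log(K/C - 1) = r*t0 - r*t and fit r, t0 by least squares."""
    best = None
    max_c = max(cases)
    for K in [max_c * f for f in (1.05, 1.2, 1.5, 2.0, 3.0, 5.0)]:
        pts = [(t, math.log(K / c - 1)) for t, c in zip(days, cases) if 0 < c < K]
        n = len(pts)
        sx = sum(t for t, _ in pts)
        sy = sum(y for _, y in pts)
        sxx = sum(t * t for t, _ in pts)
        sxy = sum(t * y for t, y in pts)
        slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
        intercept = (sy - slope * sx) / n
        r = -slope
        t0 = -intercept / slope
        sse = sum((logistic(t, K, r, t0) - c) ** 2 for t, c in zip(days, cases))
        if best is None or sse < best[0]:
            best = (sse, K, r, t0)
    return best[1], best[2], best[3]

K, r, t0 = fit_logistic(days, observed)

def projected_end(K, r, t0, frac=0.99):
    """Day when the fitted curve reaches `frac` of its plateau K."""
    return t0 + math.log(frac / (1 - frac)) / r

print(f"fitted K={K:.0f}, r={r:.3f}, t0={t0:.1f}")
print(f"projected 'end' (99% of plateau): day {projected_end(K, r, t0):.0f}")
```

Even in this toy setting, small perturbations in the input data shift the fitted plateau and growth rate, and with them the projected end date, which is exactly the fragility the following paragraphs describe.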
The reality is far worse. The US has recorded more than 2,000 deaths a day this week, the most in a single day since the devastating first wave in the spring. The country is now averaging over 50% more deaths per day than it did two weeks ago, on top of nearly 70% more cases per day.
It is possible, even likely, that the data the Singapore University of Technology and Design team used to train their model was incomplete, unbalanced, or otherwise seriously flawed. They used a COVID-19 dataset compiled by the research organization Our World in Data, which included confirmed cases and deaths from the European Centre for Disease Prevention and Control, as well as testing statistics published in official reports. Hedging their bets, the model’s developers warned that the accuracy of its predictions depends on the quality of the data, which is often unreliable and reported differently around the world.
While AI can be a useful tool when used sparingly and with reasonable judgment, blind faith in these types of predictions leads to bad decisions. A recent study by researchers at Stanford and Carnegie Mellon found that certain demographic groups in the US, including people of color and older voters, are less likely to be represented in the mobility data used by the US Centers for Disease Control and Prevention, the office of the Governor of California, and numerous cities across the country to analyze the effectiveness of social distancing. This oversight means that policymakers who rely on models trained with the data may fail to set up pop-up test sites or allocate medical equipment where they are most needed.
The fact that AI and the data it is trained on tend to be biased is not a revelation. Studies examining popular computer vision, natural language processing, and election prediction algorithms have come to the same conclusion over and over again. For example, much of the data used to train AI algorithms to diagnose disease reflects existing inequalities, a problem compounded by the reluctance of companies to share code, datasets, and techniques. However, with a disease as widespread as COVID-19, the impact of these models is amplified a thousandfold, as is the impact of the government and organizational decisions informed by them. For this reason, it is important to avoid putting too much faith in AI predictions about the end of the pandemic, especially if they lead to unjustified optimism.
“If these biases are not properly addressed, propagating these biases under the guise of AI can exaggerate the health disparities faced by minorities, who already bear the greatest burden of disease,” wrote the co-authors of a recent article published in the Journal of the American Medical Informatics Association. They argued that biased models could exacerbate the disproportionate impact of the pandemic on people of color. “These tools are built from biased data that reflect biased health systems and are therefore themselves at high risk of bias – even if sensitive attributes such as race or gender are explicitly excluded.”
We would do well to heed their words.
For AI coverage, send news tips to Khari Johnson and Kyle Wiggers – and be sure to subscribe to the AI Weekly newsletter and bookmark The Machine.
Thank you for reading,
AI Staff Writer