Why collecting race-based data is crucial in tackling COVID-19

There is increasing evidence that both who you are and where you live have an effect on your risk of contracting the virus.

Walking around Toronto these days, it’s common to see the message “We are all in this together” hung in windows. It’s an encouraging thought in times like these. As the COVID-19 virus continues to upend our lives, a sense of collective purpose can feel reassuring.

But while it’s true that a virus is indiscriminate in who it infects, it turns out we are not exactly all in this together, or at least not all in the same way. There is increasing evidence that both who you are and where you live have an effect on your risk of actually contracting the virus.

Ontario released information in late May about the geographic distribution of COVID-19. In Toronto, the highest number of cases was in the two northern corners of the city, areas that also have high numbers of racial minorities and comparatively lower incomes. And in the United States, stats show that Black Americans are dying at three times the rate of white Americans.

In the current climate, in which it’s become increasingly clear just how deep and prevalent racial discrimination is in North America, a lack of race-based data seems like it could hamper attempts to tackle the disease. Yet health data collection in much of Canada is often inconsistent, with no clear rules for gathering demographic information. Without that information, the health system is sometimes left flying blind.

Frank Rudzicz is an associate professor at the University of Toronto and a faculty member of the Vector Institute for Artificial Intelligence. He suggests that the lack of a centralized approach to data can slow research and understanding. “Because everything is siloed,” he says, “there’s often layers of extra bureaucracy you have to go through.”

In Ontario, for example, there is no single electronic medical record or EMR, an idea which is becoming increasingly common in the wealthier parts of the world. That’s why residents there have to pay to transfer medical data when they switch doctors.

It’s an issue that Ontario has recognized by appointing Dr. Jane Philpott to lead its pandemic effort. “There are real challenges with a health care system that for as long as I’ve known it has had … silos,” Philpott, the dean of Queen’s University’s faculty of health sciences and former federal health minister, said to the Globe and Mail. “There are huge amounts of information that have to be brought together in order to understand the pandemic better.”

COVID-19 in particular has illustrated the need for robust data collection, and Toronto is doing just that, according to Dr. Vinita Dubey, associate medical officer of health for the city.

“There are over 70 diseases that are reportable to local public health units in Ontario, including COVID-19,” she says, and people who do become infected with those diseases have basic data collected like age, gender, and who they’ve been in contact with.

Socio-demographic data, however, is not usually part of the information collected.

“Questions on Indigenous identity, race, income and household size were recently added to the Coronavirus Rapid Entry System (CORES), our COVID-19 information system,” says Dubey. That means it will still be some time before there is any clear data about how race fits into the COVID-19 puzzle (though the city is trying to use federal census data to get a head start on the issue).

“[Gathering] these types of data will help us to understand if this pandemic is disproportionately affecting certain groups in our community so we can then better inform prevention strategies,” says Dubey. She adds that such information allows Public Health to determine whether there are linguistic or structural barriers to accessing care or to getting information on prevention and on stopping the spread.

The collection of race-based data does point to some sticky problems, however. As Frank Rudzicz suggests, in designing, say, machine-learning health algorithms, the idea is to have a system work just as effectively regardless of a person’s race. But as he points out, you cannot ignore a thing and hope it will go away.

“We’ve done a lot of statistical analysis of these kinds of data and they echo what we see in the literature,” he says, “and there is a lot of inequity between groups, and not only that, the methods we use to do data science can fall into demographic traps.” He points to the problem of a machine-learning algorithm being trained almost exclusively on data sets of white people. In the same way that a Google image algorithm can propagate racism by labelling Black people as gorillas, any health system has to account for the risk of similar bias and be designed with diversity in mind.

There is, too, the difficult issue of privacy.

“Any conversation about statistical bias also needs to include privacy,” says Rudzicz, “and how we store information, keep it safe, and not let it be misused.”

The question of race and health is, however, an issue that seems to finally be gaining traction. As the Globe and Mail’s health writer André Picard suggested in a recent column, racism is itself a public health crisis.

“This is a time for public health to highlight what really matters to health — the socio-economic determinants, of which race is one,” writes Picard.

It’s an uncomfortable topic. But then, these are also uncomfortable times. And for all of us to truly be in this together, we need better and more data. At least that way, we know what we are facing.