Why Texas’ coronavirus data comes with caveats
Accurate and up-to-date coronavirus data is critical — not just for informing the officials making policy, but for parents trying to decide whether to send their kids to day care and business owners wondering whether to reopen their stores.
But Texas officials keep correcting the data — whether it’s because of human error, shifting benchmarks or bureaucratic changes. Last week, Texas increased its coronavirus fatality toll by hundreds of people — then lowered it again, lending fuel to skeptics who question the accuracy of the government data. The state does not typically report complete hospitalization data, and during a recent week was reporting even less than usual. And a myriad of tests with varying accuracy have confused its metrics, with some local health departments including test results the state considers unreliable.
Experts say the state’s coronavirus data is useful as long as users understand its limitations — and can identify misinformation campaigns that attempt to discredit it. Generating precise, comprehensive data about a previously unknown virus that has infected millions is next to impossible. Rapidly changing testing technology, inconsistent data collection processes and an evolving understanding of the virus’ effects on the human body have complicated the state’s herculean task of reporting health data that researchers, policymakers, journalists and the public devour each day.
"The overall data are reliable in showing what is happening with COVID-19 in Texas, particularly when taken together," said Chris Van Deusen, a spokesman for the Texas Department of State Health Services.
"What we collect and publish every day is provisional data that hasn't yet gone completely through those rigorous checks and is subject to change," Van Deusen said. "Because of that, not every piece of data will be precise 100% of the time, and we don't expect it to be, but it is accurate enough to let us know what's happening and guide our response."
The Texas Department of State Health Services coronavirus dashboard updates daily with new numbers of confirmed cases, tests performed, deaths and more, with limited demographic information and some numbers broken out by county. Texas fares well on data transparency in comparison with other states, earning an A for the quality of its data from The Atlantic’s COVID Tracking Project, one of the most robust U.S. resources on the virus.
“From a purely informational point of view, the [state] dashboard is helpful. … If you’re actually working in data with COVID-19, trying to inform policy or disease modeling, those researchers are well aware of the limitations that we have right now,” said Angela Clendenin, an epidemiologist at the Texas A&M University School of Public Health. “There’s utility in it because we recognize that it’s imperfect.”
But over the past four months, data errors have plagued several important metrics that the state reports, mistakes that have the potential to mislead decision-makers about the virus’ true course and to feed dangerous misinformation narratives. Those risks are particularly urgent at a time when epidemiologists are calling for protective measures such as mask-wearing and keeping distance from others.
“It’s only with the data that I can make a prediction” or policy recommendations, said Rajesh Nandy, a professor of biostatistics and epidemiology at the University of North Texas Health Science Center who recently analyzed whether mask orders slowed viral transmission in North Texas.
“Of course we want the data as reliable as possible, but nothing is perfect,” said Nandy, who as a public health researcher is used to adjusting for data discrepancies.
Last week, the state had to twice correct its tally of coronavirus fatalities.
On Monday, the state counted hundreds of previously unreported deaths after switching how it tallied fatalities. Previously, it had relied on local officials’ records; now, it relies on data from death certificates.
By Thursday, state officials issued a mea culpa, removing several hundred deaths that had been erroneously counted because of an “automation error.”
The significant correction set off a scramble for researchers. And the reaction on social media was swift, providing fuel to incorrect claims made by virus skeptics who said that death tallies include fatalities of COVID-positive individuals who died by other means, like car accidents. They do not.
In the end, the state’s reported coronavirus death toll grew by 8% after accounting for the new methodology and the subsequent correction.
"Of course, all data has limitations, and there are times we get updated or better information, and we make changes as quickly and transparently as possible," Van Deusen said. The amount of data officials are collecting during the coronavirus pandemic and the speed at which they're sharing it, he said, "is unlike anything public health has ever done before."
"In normal circumstances, there is a months-long process of reconciling information and doing quality checks before publishing final data on infectious diseases. That's not nearly fast enough for COVID-19, so we and our partners have to move much more quickly," he said.
The state’s new method of counting deaths has created discrepancies with local tallies. The state is reporting hundreds more Harris County deaths than Harris County leaders are. It’s just the opposite in Hidalgo County, part of the hard-hit Rio Grande Valley, where local leaders are reporting hundreds more dead than the state claims.
State health officials say local leaders may be incorrectly including deaths of people who did not live in Hidalgo County. Local leaders say the state’s death certificates have been delayed, leading to incorrect reports — and that they weren’t consulted before the state announced a major change in its reporting methods.
State officials have said they are doing their best under trying and unprecedented circumstances.
Even under the best circumstances, the state’s data paints an incomplete picture of the virus’ spread. Researchers estimate that the true number of coronavirus cases could be more than 10 times the number of positive tests. As many as half of the people who contract the virus may never experience symptoms, making their cases hard to include. Death tolls are also undercounted, researchers agree — the question is by how much.
Distrust of the data
Experts say sudden changes like these can erode public trust at the state and county level.
Brian Southwell, a misinformation expert at RTI International in North Carolina, said while reporting challenges are understandable, “without proper contextualization, that can lead to misinterpretation or misperception.”
Southwell says these perceptions show that government officials have not always “perfectly articulated” what counts as a COVID-19 fatality. Explaining the process of how data is collected and why it is not perfect should be part of the efforts to build confidence, he said.
Those false narratives also circulate in Texas. On July 14, DSHS warned that “an unauthorized and misleading chart using the DSHS logo [was] circling the internet.” The chart presented flawed statistics: For example, the number of flu cases for a full year was compared to only four months of COVID-19 cases. “Misinformation can lead people to make decisions based on false or inaccurate information,” wrote the department on Twitter on July 24.
The incorrect idea that the COVID-19 death toll is artificially inflated has been identified as one of the enduring misleading narratives around the pandemic in preliminary research, said Jessica Collier, who studies misinformation at the Center for Media Engagement at the University of Texas at Austin. A growing body of research shows that human brains are “biased to process information as true the first time that we hear it” — making it difficult to correct false claims later, she said.
Incomplete hospitalization data
In many ways, the corrections to death data were a repeat of earlier issues with the state counts. Texas’ data on how many Texans are hospitalized for COVID-19 has never been entirely complete, state public health officials said publicly for the first time last week.
The revelation came after the state’s reported hospitalization numbers plummeted July 23 by more than 2,000. The drop didn’t represent a sudden mass discharge of Texans from local hospitals, officials said, but rather a complication brought on by incomplete data.
That week, around 15% of Texas’ more than 500 hospitals stopped sharing data with state public health officials, who blamed a new federal reporting requirement that had caught hospital administrators unawares. Texas Department of State Health Services officials did not disclose which hospitals stopped submitting data about coronavirus patient numbers.
The new policy came from the Trump administration, which announced July 15 that hospitals would no longer share data with the U.S. Centers for Disease Control and Prevention and instead send it to a private technology firm contracted by the U.S. Department of Health and Human Services. Hospitals had mere days in some cases to overhaul their data-sharing practices.
The recent shift represented “a lot more reporting,” said Pat Harrison, vice president of system quality and patient safety at Houston Methodist.
Reporting hospitalizations to HHS is tied to resource allocation, such as the distribution of remdesivir, an antiviral drug used to treat COVID-19.
The public soon faced a dearth of information about where the most gravely sickened Texans were being cared for, even though an executive order from Gov. Greg Abbott has required hospitals to report daily information to the state since March.
“It is the hospitalization rate and sadly the death rate that give us the most solid indications of how bad or how threatening the outbreak is in our area, so it’s a pretty profoundly important piece of information for folks to have,” said Anne Dunkelberg, an associate director and health policy expert at Every Texan. The Austin-based think tank joined more than 100 organizations in signing a letter that asked White House officials to return hospital data collection responsibilities to the CDC.
Texas health officials say the state’s data has returned to its usual level of thoroughness, with between 94% and 98% of hospitals reporting daily figures to the Texas Department of State Health Services. But the agency no longer includes a disclaimer on its website noting that the data was highly incomplete from July 23 through July 28. Van Deusen said the state's hospitalization figures were "enough to provide a well-informed picture of hospital demand and capacity across the state and regionally."
Testing questions
The state has also seen issues with its reporting of positive coronavirus tests, many of them tied to the number of different types of tests and their differing reliability.
As the Houston Chronicle reported Sunday, Texas’ count of tests does not include tens of thousands of rapid-result antigen tests, which suggests the state is vastly underreporting the number of Texans who have tested positive for the virus. Antigen tests are taken by nasal or throat swab like other tests, but results are much faster. Since they are more likely to miss an active coronavirus infection, the Centers for Disease Control and Prevention considers these tests to indicate a “probable” case, not a confirmed one. Unlike some other states, Texas does not report probable cases, so it does not include antigen test results on its dashboard.
Last month, the state cut some 5,500 cases from its tally of confirmed coronavirus tests when it discovered Bexar and Nueces counties had been including antigen test positives alongside positive results from other types of tests.
In May, state health officials admitted that in formal statewide tallies, they had been grouping standard viral tests with antibody tests. Experts say it’s important to distinguish between antibody tests — which detect whether someone was previously infected — and viral tests, which show whether someone currently has the virus. Abbott had incorrectly said that state officials were not commingling the numbers before health officials confirmed that they were.
The corrected data led to a small change — less than 1% — in the state’s positivity rate, the share of people who test positive for the virus.
Despite its issues, Texas is providing more and better information than states like Ohio, Missouri and Connecticut, according to some groups that track all 50 states’ data. The COVID Tracking Project, which gives Texas an A, ranks states based on how they report demographics, patient outcomes and testing data, among other factors.
Experts say that as they grapple with a new disease, the guidance changes quickly and often. Data collection will improve with researchers’ understanding of the virus.
People “need to understand some of those challenges that we [face] as data collectors in this unprecedented pandemic dealing with an unprecedented disease,” said Clendenin, the Texas A&M epidemiologist. “We will have a much better idea of how to go back in and look at the data and be able to work with the data when we get on the other side of this.”
Disclosure: Every Texan, Texas A&M University, the University of Texas at Austin and the University of North Texas Health Science Center have been financial supporters of The Texas Tribune, a nonprofit, nonpartisan news organization that is funded in part by donations from members, foundations and corporate sponsors. Financial supporters play no role in the Tribune's journalism.