michael barbaro

From The New York Times, I’m Michael Barbaro. This is “The Daily.”

[music]

Today: For months, the U.S. government has been quietly collecting information on hundreds of thousands of coronavirus cases across the country. My colleague, Robert Gebeloff, on the story of how The Times obtained that data.

It’s Wednesday, July 8.

Robert, you live in a corner of The Times, the data team, that I’m not sure most people understand all that well. So when the pandemic starts, how do you all respond?

robert gebeloff

So, by training, my goal is to find stories that can best be told through data, which is not every story, but there’s a lot of stories out there. So if you go back to early March, the pandemic is starting. And I know that our job as The New York Times is to really get our arms around what’s going on and, by that, to start collecting the data that is starting to come out about cases and deaths around the country. So my colleagues set up a team of people across different departments whose primary job would be to monitor all the states, all the major counties, and gather the information and start to build a database. Start to say, we’re getting information from New York over here and California over here, but let’s put it into one database just for the purpose of tracking where the cases were, where the deaths were.

michael barbaro

You’re saying it’s not coming out on a national level. There’s no big clearinghouse that’s going to hand you data every day about exactly where the virus is all across the country.

robert gebeloff

Correct. And at that point, we assume that some kind of federal system may be in the offing, but we weren’t going to wait for it. And part of our report every day, you’ll see on our website, are maps showing where the cases are, where new cases are, where deaths are, where the new hotspots are. That all emanated from these early days of creating this ground-level system for being able to collect this data.

michael barbaro

And I wonder if you can take me into the process of that a little bit. I mean, what does it look like? Where exactly is the information coming from?

robert gebeloff

Well, it’s really like a hive of activity. I mean, that’s the way I like to think of it. You have, at any given time, a team of clerks, reporters, editors, all assigned to monitor what gets announced in various parts of the country. So at one moment, you could have somebody wrestling with new data that was put out by California and trying to get it into a format that matches our data standards. And you could have somebody in Mississippi confused about whether the new data announced is cumulative, or is it new cases for the day? And often, that involves basic reporting of going back to the state and asking questions. Then, while all this is going on and people are collecting this data, we have other people trying to put the data into context. It’s, you know, truly this whole new full-time operation just devoted to trying to track what is really happening with the pandemic and to do some surveillance on the national picture.
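
The cumulative-versus-daily question described above can be illustrated with a minimal sketch. This is not The Times's actual pipeline; the function names and the sample numbers are hypothetical, and the check is only a heuristic (a running total should never decrease, though real state data sometimes gets revised downward).

```python
from typing import List


def looks_cumulative(counts: List[int]) -> bool:
    """Heuristic: a running total never decreases from one day to the next."""
    return all(later >= earlier for earlier, later in zip(counts, counts[1:]))


def cumulative_to_daily(cumulative: List[int]) -> List[int]:
    """Convert running totals into new cases per day."""
    if not cumulative:
        return []
    # The first day's new cases equal the first running total.
    daily = [cumulative[0]]
    for earlier, later in zip(cumulative, cumulative[1:]):
        daily.append(later - earlier)
    return daily


# Hypothetical figures for one state over four days.
reported = [5, 8, 8, 15]
if looks_cumulative(reported):
    print(cumulative_to_daily(reported))
```

When the heuristic is inconclusive, the transcript's own answer applies: go back to the state and ask.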

michael barbaro

Right. This sounds very tedious, incremental. You know, gathering up tiny bits of data, cleaning it, making sure it all lines up — not sexy.

robert gebeloff

It is not sexy at all. You know, when you’re data journalists, the fun part is doing what we call the queries — asking questions of the data and seeing what it shows. But we all know, like, job one is to make sure your data is good. Otherwise, the questions you ask won’t mean anything.

michael barbaro

Hmm. And what do you begin to learn through this data?

robert gebeloff

Right. Part of what my personal job is to do is to look at this data and try and help understand what it tells us. So, for example, one of the early findings we had when we were looking at the pandemic in March was it seemed to be hitting mostly in big cities — New York, New Orleans, Detroit.

michael barbaro

Seattle.

robert gebeloff

Seattle. It seemed to be in places with a lot of population density. But there was also another class of place that seemed to be popping up. And it was resort counties — places with ski resorts. And so that led us to this insight that it wasn’t just population density, that there are other possible explanations for why places got hit. Then, as the weeks went on, we began to see the fill-in, what I call the fill-in, which is — there were all of these new counties that were starting to get cases. And so by having this record, what we were able to then report is there are now hundreds of rural counties getting their first cases. And, you know, how were they preparing? And how were they talking to people? And then, another thing we’ve been monitoring is there seems to be this ideological difference — or at least there has been — about how serious a problem is it. How soon should government reopen or allow businesses to reopen? And —

michael barbaro

Right. Kind of a red state-blue state divide over shutting down and reopening.

robert gebeloff

Right. But our reporting showed that there was this additional element involved, which was, for the first six to eight weeks of the pandemic, there were hardly any red counties with high infection rates. And most of the hard-hit places were in blue counties. And so we were able to raise the specter of, if you live in a place that doesn’t have first-hand experience with the virus, you don’t have your emergency rooms being overflowed. Maybe that also contributes to your belief that, you know what, we should open the economy. This is not worth shutting down the economy for.

michael barbaro

Right.

robert gebeloff

And all of these types of stories are, again, driven by the idea that in the first place, we had good county-level data that we couldn’t get anywhere else. That allowed us to look at the world through these different prisms and ask different questions about how the pandemic was playing out.

michael barbaro

Mm-hmm. You’re laying out clear examples of why data like this is important and what it lets us understand. But I’m curious what the limitations of this kind of a database are. What does it not tell us?

robert gebeloff

Yeah. So think of it this way. A data set we think of like any other source that we’re going to interview. And we think of what might this source be able to tell us about something. And so we think of questions that we’re going to ask the source. So the problem became — we had this data set, and we knew where the cases were and the deaths were, but we couldn’t ask it any other questions. We couldn’t ask, who were the people actually becoming infected in these counties? Were they old? Were they young? Were they rich? Were they poor? Were they front-line workers? Were they white? Were they Black? Were they Latino? So all these questions we had we couldn’t really ask the data set we had.

michael barbaro

So what did you end up doing?

robert gebeloff

So, along the way, we learned that the C.D.C. actually had some information that would be helpful in this, in that every time a person was confirmed to have a coronavirus infection, the local health agency would fill out a report that would have characteristics of the case — the person, the age, the race. And the form actually asked dozens of questions. You know, was the person at work? Was the person staying home? What were the symptoms? And that these forms ultimately ended up at the C.D.C.

michael barbaro

Hmm.

robert gebeloff

And if we could get our hands on this data, we could ask a lot more questions about how this pandemic is playing out. And so we decided to approach the C.D.C. and request access.

And here’s why we needed that data. So many people in this country are getting sick. So many people are dying. And our job is to try and explain, who is it that is getting sick? Who is dying and why? And if we had any chance of getting answers to those questions, we need the best data. And if the C.D.C. had the data, we wanted to get a copy ourselves.

michael barbaro

And so how do you go about trying to get it?

robert gebeloff

Well, in this case, we ended up suing them.

[music]

michael barbaro

We’ll be right back.

So, Robert, why did The New York Times sue the C.D.C.?

robert gebeloff

So suing the C.D.C. sounds very dramatic. But in fact, many, many times in the course of a year, we go to court to establish our rights to get public information. It’s somewhat more routine than most people would realize. And sometimes it’s because the government out and out refuses to give up the information. But in this case, it was more to do with the timing. Without going to court and putting pressure on the agency, we were looking at the prospect of waiting months to get our hands on this information.

michael barbaro

Right.

robert gebeloff

But by going to court, it sort of put the clock on. And we had the agency’s full attention.

michael barbaro

And so what ends up happening once this clock is ticking and a judge is looking over the shoulders of the C.D.C.?

robert gebeloff

So the C.D.C. tells us that they will comply. They just need to do a little more research as to what they can possibly produce, taking into consideration the privacy of people who are in the database and stripping out personally identifiable information. But ultimately, the day comes where they say, OK, New York Times, here is a database of 1.45 million cases —

michael barbaro

Wow.

robert gebeloff

— that we have collected from state and local authorities. And we were then free to have a new interview subject and be able to ask it a whole lot of more interesting and detailed questions.

michael barbaro

Right. I mean, this quite literally sounds like the motherlode of data on this pandemic in the United States.

robert gebeloff

Well, in many ways it was. What we were able to see from this was detailed information about individuals who had become infected and died. And for each individual, we were able to look at their age, the county they lived in, their race and their ethnicity. And that is far more information than we had before. And in the end, we ended up being able to break down cases for nearly 1,000 counties covering more than half of the U.S. population.

michael barbaro

And this number — 1.5 million Americans — how big a proportion of all cases of the virus is that?

robert gebeloff

So for the time period covered by the data — it was all cases through the end of May — it was about 88 percent of all cases that we had some information about.

michael barbaro

So when you get this massive data dump, what do you do? What do you find?

[music]

robert gebeloff

So when we finally had our hands on this data, we were checking what types of information were included, how complete the information was, and just looking at the data in many different ways to see what it could tell us. And eventually, three main trends emerged.

michael barbaro

And so what were those trends?

robert gebeloff

So the first was just how pervasive the racial disparity was with this pandemic.

michael barbaro

Mm-hmm.

robert gebeloff

Whatever knowledge people had that African-Americans and Latinos were becoming infected at a higher rate, a lot of that was tied to big cities that had released data. But what we found is that this racial disparity pervades everywhere, whether you go from cities to suburbs, even into rural places.

michael barbaro

Huh.

robert gebeloff

In fact, any place we found where there was a significant African-American population, for almost all of them, African-American infection rates were higher than the rate for whites. Same thing with Latinos. Any place we found where there was a significant Latino population, for almost all of them, the infection rate was higher for Latinos.

michael barbaro

Hmm.

robert gebeloff

The second big takeaway is what is driving these racial disparities. So most of the earliest explanations of the racial disparity were focused on death rates. And one of the explanations for the disparities in death rates that is commonly offered is something called comorbidities — the idea that African-Americans might be dying at a higher rate because they were more likely to have preexisting conditions or to be in poorer health to begin with. But in our analysis, we focused mostly on the actual infection rates. And the reason for that is that gets us out of the question of whether comorbidities is driving it and puts us more on the question of who is most at risk to become infected in the first place. And so when we see disparities in the infection rates, we can then raise the question of, why are people in certain groups more likely to become infected?

michael barbaro

Mm-hmm.

robert gebeloff

And that led us to looking at, where do people work? Where do people live? And what is their housing situation? And if you look at where people work and look at what the data shows, it shows that African-Americans and Latinos in the U.S. are far less likely to have the kind of job where you can do it at home. They are more likely, instead, to have a job in the production sector, in a factory or in the service sector. All of that combined would increase your risk of becoming infected. And with housing, what we found is that Latinos in particular are far more likely to live either with more people in the household or with less space in the household, both of which would also increase the odds that a person might become infected.

michael barbaro

So the second discovery very much helps understand the first. There are kind of structural issues around how Black and Latino Americans work and live that contribute to this racial disparity in the pandemic.

robert gebeloff

That’s correct. And the third takeaway from this is what you learn by looking at the pandemic through the prism of age.

michael barbaro

Hmm.

robert gebeloff

Right now, most of what we know about the disparity is all cases of people of all age groups. And that’s how the rates are calculated. But if you realize something about this pandemic, it’s that older people are far more likely to get sick and die.

michael barbaro

Right.

robert gebeloff

And in the U.S. right now, the older population is very disproportionately white, non-Hispanic.

michael barbaro

Huh.

robert gebeloff

So if you don’t account for age, you’re by definition almost understating the disparity. So what we did — what some epidemiologists call “age adjusting” — is looked at infection rates across age groups. And when you look at, say, what the infection rate is for people who are in their 40s or in their 50s, the disparity is much bigger than you’ll ever see in numbers without age adjustment.
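
The effect of age adjusting described above can be shown with a small worked sketch. Every number here is invented for illustration and does not come from the C.D.C. data; the two groups, the two age bands and the reference age mix are all assumptions. The method shown is direct standardization (weighting each group's age-specific rates by one shared age mix), which is one common way to age-adjust and not necessarily the one The Times used.

```python
from typing import Dict

# Hypothetical populations: group_a skews older, group_b skews younger.
POPULATION: Dict[str, Dict[str, int]] = {
    "group_a": {"under_50": 400, "50_plus": 600},
    "group_b": {"under_50": 800, "50_plus": 200},
}
# Hypothetical case counts for the same groups and age bands.
CASES: Dict[str, Dict[str, int]] = {
    "group_a": {"under_50": 4, "50_plus": 30},
    "group_b": {"under_50": 16, "50_plus": 20},
}
# Assumed reference age mix applied to both groups.
STANDARD_SHARE: Dict[str, float] = {"under_50": 0.6, "50_plus": 0.4}


def crude_rate(group: str) -> float:
    """All cases divided by all people, ignoring age."""
    return sum(CASES[group].values()) / sum(POPULATION[group].values())


def age_specific_rates(group: str) -> Dict[str, float]:
    """Cases divided by population within each age band."""
    return {band: CASES[group][band] / POPULATION[group][band]
            for band in POPULATION[group]}


def age_adjusted_rate(group: str) -> float:
    """Weight each age band's rate by the shared reference age mix."""
    rates = age_specific_rates(group)
    return sum(STANDARD_SHARE[band] * rates[band] for band in rates)


crude_ratio = crude_rate("group_b") / crude_rate("group_a")
adjusted_ratio = age_adjusted_rate("group_b") / age_adjusted_rate("group_a")
print(f"crude ratio: {crude_ratio:.2f}, age-adjusted ratio: {adjusted_ratio:.2f}")
```

With these invented numbers, the crude rates differ by only about 6 percent, yet within each age band group_b's rate is twice group_a's, and the age-adjusted ratio reflects that. The younger group's disparity is hidden in the crude figure because the older group has more people in the high-risk band.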

michael barbaro

So when you accounted for the fact that so many older people have died from the coronavirus, and that the older population in this country skews white, you found that the racial disparity actually gets even greater.

robert gebeloff

Correct. In fact, if you look at some of the younger age groups, the death rate for Latinos is about 10 times higher than that for whites.

michael barbaro

Wow.

robert gebeloff

Now, the caveat to that, of course, is you’re much, much less likely to die at those age groups. But it’s still, among the people who do die in those age groups, it’s very heavily Black and Latino.

michael barbaro

Mm-hmm. I mean, these insights, once again, seem to highlight just how important it is to have this kind of information. Because from what you’re saying, we have been, in some sense, misunderstanding the racial disparities of this virus — the causes of the racial disparities — because we haven’t had access to this data.

robert gebeloff

Well, at minimum, you could say we didn’t know the extent to which these problems existed. And getting data like this helps us sort of define what the ground truth is about how this pandemic is playing out. That being said, there’s still a lot more that we would like to know.

michael barbaro

Mm-hmm.

robert gebeloff

The database had 1.45 million records. And it had, for each record, more than 100 columns or 100 pieces of information. Most of those were blank. And that leaves us in the dark about a lot of questions that we’d like answered, like how many people are contracting the virus at work? Or how many are getting it from traveling or being at bars? So still a lot of room for improvement. And hopefully, knowing what can be done, the power of having this data to answer questions will help inspire the C.D.C. to collect the information better.
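
A completeness check of the kind described above, asking how much of each column is actually filled in, might look like the following sketch. The column names and records are hypothetical, and treating only missing values and empty strings as blank is an assumption; a real data set may also use placeholder codes such as "Unknown" or "Missing".

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def is_blank(value: Optional[str]) -> bool:
    """Treat a missing value or an empty/whitespace-only string as blank."""
    return value is None or str(value).strip() == ""


def column_completeness(records: List[Record], columns: List[str]) -> Dict[str, float]:
    """Return the share of records with a non-blank value, per column."""
    total = len(records)
    result: Dict[str, float] = {}
    for column in columns:
        filled = sum(1 for record in records if not is_blank(record.get(column)))
        result[column] = filled / total if total else 0.0
    return result


# Hypothetical case records with invented column names.
sample = [
    {"age_group": "40s", "race": "", "exposure": None},
    {"age_group": "50s", "race": "Latino"},
]
print(column_completeness(sample, ["age_group", "race", "exposure"]))
```

Running such a check before asking any questions of the data shows which columns can support an analysis and which are too sparse to rely on.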

michael barbaro

Mm-hmm. And perhaps release it more quickly. I have to think that suing the C.D.C., getting this data and reporting out these insights on race has increased pressure on the federal government to make this information more available. Is that true?

robert gebeloff

I would like to think so. There is still some mystery as to what will ultimately happen. Our case is still pending. The status is, the C.D.C. at this point believes they satisfied our request.

michael barbaro

Right.

robert gebeloff

Our lawyers are still investigating whether or not there was more information that should have been released — or more types of information. And, you know, once that is resolved, the question will be what does the C.D.C. do going forward. And a lot of people, in reaction to the story that published, were asking me, do you think they’ll just start posting this on their own? And I would think that whether or not the information is complete, it’s still better than anything else out there. And so hopefully we will see more of this type of information made public.

[music]

That would definitely be beneficial to not just us, but to researchers around the nation and the world to have access to more complete and better information. But until that happens, we’re going to keep doing what we’ve been doing.

We’re going to go out every day, go to every state and collect data on coronavirus cases and deaths.

michael barbaro

Rob, thank you very much.

robert gebeloff

Thanks, Michael.

[music]

michael barbaro

On Tuesday, the latest updates to The Times’s database found that the virus has infected more than 3 million Americans and has killed more than 130,000 of them. Globally, it recorded nearly 12 million infections and nearly 542,000 deaths, including 65,000 in Brazil, where the country’s president, Jair Bolsonaro, who has repeatedly downplayed the pandemic and avoided wearing a mask, announced that he had tested positive for the virus.

We’ll be right back.

[music]

mission control

Station, this is Houston. Are you ready for the event?

chris cassidy

Hello, Houston. We’re ready for the event.

michael barbaro

38 days ago, NASA and SpaceX launched two U.S. astronauts into space on a mission to the International Space Station, where they joined a fellow American. It was the first time that a manned spacecraft has left American soil in nearly a decade.

mission control

The New York Times, this is mission control Houston. Please call station for a voice check.

michael barbaro

On Tuesday, I spoke with the three U.S. astronauts now aboard the space station.

chris cassidy

Hello, New York Times. New York Times, this is the International Space Station. How do you hear us?

michael barbaro

Bob Behnken and Doug Hurley, who arrived a few weeks ago, along with Chris Cassidy, who has been there since April.

michael barbaro

We hear you loud and clear. How do you hear us?

chris cassidy

We hear you loud and clear as well. Good afternoon. Welcome aboard, and we’re happy to talk to you.

michael barbaro

Of course, their time in space is precious. And so NASA gave us six minutes on the dot.

michael barbaro

If I might boldly call you by your first names — Doug, Chris and Bob — thank you very much for making time for us. I wonder if you can start by telling us exactly where you are in space, relative to us right now.

chris cassidy

Well, while I kick things off, Bob’s going to pull up our mapping program. Right at the moment, we didn’t have it on the computer. Sorry about that. But we’re orbiting 250 miles above the Earth. And it looks like we are abeam of Baja California, just a little bit out into the Pacific Ocean.

michael barbaro

Mm-hmm. So over America — the U.S.-Mexico border.

chris cassidy

Right. Yeah. We’re just over the Pacific Ocean. We just passed California heading south.

michael barbaro

If you’ll indulge me for a minute, I want to talk a little bit about feelings. Knowing I was going to be talking to you, I have been thinking a lot about this moment back on Earth and wondering, with so much turmoil here, and you looking down on all of it from such a distance, what that feels like to look down on a planet that’s truly in the midst of some really challenging, tumultuous times.

doug hurley

Well, it certainly is challenging to hear, either by secondhand or when we get the opportunity to see some news up here, all the turmoil that’s going on. The challenges with the pandemic and the strife in the cities and all the different challenges that people are going through on a day-to-day basis. It is — you know, emotionally it does take a toll on us, certainly. And I think the other thing that really resonates with me, personally, is just when you look out the window, when you see the planet below, you don’t see borders. You don’t see this strife. You see this beautiful planet that we need to take care of. And hopefully, as technology advances and as this commercial space travel gets going, more people will get that opportunity. Because I think if you get the chance to look out the window from space and look back on our planet, it will change you. It will change you for the better. And you’ll realize that this is one big world, rather than all these different little countries or cities or factions that we have on the planet. And I think it will make it a better place.

michael barbaro

Well, that’s really interesting. And I wonder if you could say a little bit more about that, because in the time since I believe you’ve all last been in space, there actually have been changes on Earth. You know, major ice shelves have broken off in Antarctica. Huge fires have swept across Australia, California. The Great Barrier Reef has essentially died. And when you look down at Earth, can you actually see some of those changes to the Earth, compared with when you last saw it?

bob behnken

Well, I think one of the things that we see from up here is that the Earth is not a stagnant place. It continues to change, whether it’s a fire, whether it’s the seasons, whether it’s different things happening further out. You know, we just saw a comet become visible in the predawn era. So it’s definitely a lot of things happening with the Earth and —

michael barbaro

Wow.

bob behnken

— that continuous change.

michael barbaro

I have to apologize. Now I need for you to tell me what it means for a comet to become visible in the predawn era and what that actually looks like.

bob behnken

The comet that I’m referring to was really close to the sun. And so it needed to get far enough away from the sun that we could actually, you know, look at it and see its dim little light that was visible in darkness, but kind of blinded by the sun, if you will, if you look too closely at it. And so if we got to a situation at dawn, right before the sun came up, that comet became visible during that short period of time when it was still close to the sun, but the sun was still hidden by the Earth. It was just an awesome sight to be able to see and something that we try to capture. In the few moments that we do have to look out the window, we try to capture those changes. Capture the exciting things that we can see to try to share our view with the folks back home, the folks that are still down on Earth, and just try to give them an appreciation for just how beautiful our planet is and how important it is that we do our best to take care of it.

[music]

michael barbaro

But in terms of that turmoil —

mission control

Station, this is Houston ACR. That concludes The New York Times portion of the event. Please stand by for a voice check from Fox News.

michael barbaro

Thank you all. We appreciate it.

bill hemmer

Bill Hemmer with Fox News. How do you hear me? (ECHOING) Bill Hemmer with Fox News. How do you hear me?

chris cassidy

Hi, Bill. Loud and clear. Welcome to the Space Station.

bill hemmer

Excellent. Thank you.

[music]

michael barbaro

That’s it for “The Daily.” I’m Michael Barbaro. See you tomorrow.


