Correlation is not causation, but state data suggest that the number of COVID-19 cases and the amount of mass transit usage are related. The same data explain a big part of the COVID-19 disaster that has unfolded in New York and New Jersey.

All the COVID-19 data presented below were collected on Tuesday at 6 p.m. EST. They clearly show that the epidemic is acute in the states of New York and New Jersey.

Taken together, the states of New York and New Jersey account for just under 9% of the population in the U.S., but for more than 44% of the confirmed U.S. cases, and for nearly 60% of the deaths attributable to COVID-19. This is a remarkably disproportionate share of infections and deaths.
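As a rough check on how lopsided these shares are, one can divide each outcome share by the population share. The sketch below treats the rounded figures quoted above (9%, 44% and 60%) as assumptions rather than exact counts, and the function name `representation_ratio` is purely illustrative.

```python
def representation_ratio(outcome_share: float, population_share: float) -> float:
    """Return how many times larger an outcome share is than the population share."""
    if population_share <= 0:
        raise ValueError("population_share must be positive")
    return outcome_share / population_share


# Rounded shares taken from the text (assumed values, not exact counts).
POPULATION_SHARE = 0.09  # "just under 9%" of the U.S. population
CASE_SHARE = 0.44        # "more than 44%" of confirmed U.S. cases
DEATH_SHARE = 0.60       # "nearly 60%" of U.S. deaths

case_ratio = representation_ratio(CASE_SHARE, POPULATION_SHARE)
death_ratio = representation_ratio(DEATH_SHARE, POPULATION_SHARE)

print(f"Cases:  about {case_ratio:.1f} times the population share")
print(f"Deaths: about {death_ratio:.1f} times the population share")
```

With these rounded inputs, the two states carry roughly five times their population share of cases and nearly seven times their population share of deaths.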

It is true that New York City attracts many business and leisure travelers, but on the other coast, Los Angeles and San Francisco are very cosmopolitan too, with busy international airports and arguably more arrivals from Asia, where the virus originated.

New York and New Jersey are exceptional. For example, on Tuesday they were followed by Massachusetts, with 28,161 COVID-19 cases, and Michigan, with 1,768 deaths. New York’s and New Jersey’s numbers are an order of magnitude above those of the other states with large numbers of COVID-19 cases and deaths.

While international arrivals, tourism, high population density and high concentration of activities play a role in the propagation of viral infections, a major difference is that New York City and the area of New Jersey adjacent to it are connected by a large and heavily used transit network consisting of various types of rail and bus services. These two states account for just under 9% of the population but for over 42% of the mass transit trips in the U.S.!

A New York City subway car. Could the city’s heavy reliance on mass transit be a factor in the large number of COVID-19 cases?

I sent an earlier version of my analysis to the Cato Institute’s Randal O’Toole, who often conducts transit analysis at the county and metropolitan area level. He informed me that “the National Transit Database data show that the New York urban area carried 44.4% of transit riders in 2019.”

It is a coincidence that 44.4% is both the national share of COVID-19 cases in New York and New Jersey and the national share of transit ridership in the New York metropolitan area, but the closeness of the two metrics is suggestive.

I looked into state-by-state data. My analysis suggests a strong correlation between COVID-19 cases and transit ridership. In the graphical summary below, the top table includes raw numbers, the middle table includes shares in percentages, and the bottom table shows correlation coefficients.

The graph with red dots includes all 50 states and the District of Columbia. It shows a correlation of about 86%. One might observe that New York State has an oversized effect.

There are also two more states that appear to be outliers: New Jersey and California. New Jersey as a state has a modest number of annual transit trips but a large number of COVID-19 cases. The opposite is true for California: due to its very large population, it has a large number of annual transit trips but a relatively low number of cases.

(The low number of cases in California is interesting and is currently being investigated.)

The graph with the blue dots shows the data with these three outliers removed. The correlation decreases to 48%, which is both substantial and more reasonable. Mass transit usage was a contributing factor in the spread of COVID-19, but it did not cause nearly 90% of the epidemic; its effect was smaller but substantial.
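To illustrate why removing outliers can change a correlation so much, here is a minimal sketch that computes the Pearson correlation coefficient with and without an extreme point. The numbers and the labels "A" through "F" are made up for illustration and are not the state data behind the graphs; the helper names `pearson` and `drop_points` are likewise hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def drop_points(labels, xs, ys, excluded):
    """Return xs and ys with the points whose label is in `excluded` removed."""
    kept = [(x, y) for label, x, y in zip(labels, xs, ys) if label not in excluded]
    return [x for x, _ in kept], [y for _, y in kept]


# Made-up illustrative values (NOT real state data): "F" is an extreme point.
labels = ["A", "B", "C", "D", "E", "F"]
transit_trips = [1, 2, 3, 4, 5, 100]
cases = [2, 1, 4, 3, 5, 90]

r_all = pearson(transit_trips, cases)
trimmed_trips, trimmed_cases = drop_points(labels, transit_trips, cases, {"F"})
r_trimmed = pearson(trimmed_trips, trimmed_cases)

print(f"All points:      r = {r_all:.3f}")
print(f"Outlier removed: r = {r_trimmed:.3f}")
```

In this toy data a single extreme point pushes the coefficient close to 1, and removing it lowers the coefficient to 0.8. The same mechanism is at work when the three outlier states are removed from the full data set.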

This is corroborated by published medical research; a sample article states that “bus or tram use within five days of symptom onset was associated with an almost six-fold increased risk of consulting for acute respiratory infection (ARI).”

Two related articles in The New York Times lend further support to the correlation between COVID-19 cases and transit ridership.

In the first article, titled “Virus Is Twice as Deadly for Black and Latino People Than Whites in N.Y.C.,” Governor Cuomo was quoted as saying that these people “don’t have a choice, frankly, but to go out there every day and drive the bus and drive the train and show up for work and wind up subjecting themselves to, in this case, the virus.”

Infected passengers infected not only those around them but also the drivers, engineers and conductors in the transit vehicles, who in turn became infectious for several days before the onset of illness and thus passed COVID-19 to others without being aware of it.

As a result — as the second article explains — “at least 41 transit workers have died, and more than 6,000 more have fallen sick or self-quarantined.”

These are workers in New York City’s Metropolitan Transportation Authority, which is by far the largest mass transit agency in the U.S.

In hindsight, in addition to distancing and various forms of lockdowns, city officials and governors should have shut down urban mass transit systems and issued vouchers to essential workers who use transit for their commute to work.

At the lockdown levels of transit volume (i.e., roughly 90% lower than normal), those in need of public transportation could have used vouchers for taxis and similar services for a much safer commute to their essential jobs.

