We have all heard the phrase: “There are three kinds of lies: lies, damned lies, and statistics” (for its origin, see the entry in Wikipedia). This phrase is a damned lie.
You cannot lie with statistics if they are properly done; that is, if you show both the data and the method you used to analyze the data. Only the innumerate can be lied to with statistics. “Innumeracy” is the inability to deal with numbers and quantitative reasoning, just as “illiteracy” is the inability to read. Unfortunately, although only 30 percent of Americans are functionally illiterate, a much, much higher percentage are functionally innumerate. However, for important issues, you will always find numerate people on both sides, and thus when one side attempts to lie with statistics, the other side will point out the problems and perform alternative analyses. The numerate public, by spending a little effort to follow the logic of both sides, will immediately see who is right and who is wrong. And that’s why statistics are very different from “damned lies.”
Here’s a good example of how this works in practice, with me as a numerate bystander in one very important debate: does massive immigration depress the wages of native workers?
The proper beginning of the story is 1962 when the young George Borjas arrived as a Cuban refugee in Miami. Eventually he became an economist and was hired by Harvard. His specialty is labor economics and he is one of the foremost American experts on the consequences of immigration for labor markets.
Then, in 1980 Fidel Castro allowed a mass exodus from Cuba, which became known as the Mariel boatlift. Within months more than 100,000 immigrants arrived in Miami.
The following graph, taken from Borjas’s forthcoming paper, illustrates the magnitude of this labor supply “shock,” to use economics jargon:
What we see here are two initial waves during the 1960s, following the Cuban Revolution of 1959 (including Borjas as a “data point” for 1962). During the 1970s emigration from Cuba was shut down by the Castro regime. The huge spike in 1980 is the Mariel boatlift, after which emigration from Cuba was again shut down as a result of a behind-the-scenes agreement between Cuba and the US. The smaller spike around 1995 is known as “Little Mariel.” The more recent increase in Cuban immigration is due to the wet-feet, dry-feet policy.
What we have here is a perfect natural experiment to find out how massive immigration influxes affect the wages of native workers. The Berkeley economist David Card saw the potential of this labor supply shock and used it in a paper that was published in 1990 in Industrial and Labor Relations Review.
Now fast forward to 2015, when one summer morning George Borjas decided to revisit this analysis in light of what we have learned about immigration effects since 1990 (much of it due to Borjas’s own efforts). You can hear him tell the story in this video.
He did something very simple, and you can actually see what he did if you watch the 6-minute section of the video that starts at c.6:30. He used the same CPS data as David Card. However, he focused only on workers who were (1) non-Hispanic (as the best approximation to the native-born), (2) aged 25-59 (prime working age), (3) male, and (4) high-school dropouts. The last characteristic is key, because 60 percent of Marielitos did not complete high school. And many of the remaining 40 percent, who did complete it, were still looking for unskilled jobs because they lacked linguistic and other skills. So Marielitos competed directly with high-school dropouts, and if there is an effect on wages, this is where we should look.
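The four restrictions amount to a simple row filter. Here is a minimal sketch of that filter in Python, not Borjas’s actual code: the `Worker` record and its field names are hypothetical (real CPS extracts use their own variable names), and the six example rows are made up purely to show which cases are kept and which are dropped.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    # Hypothetical fields for illustration; not the real CPS variable names.
    hispanic: bool
    age: int
    sex: str  # "M" or "F"
    completed_high_school: bool


def in_comparison_sample(w: Worker) -> bool:
    """True if the worker meets all four restrictions listed in the text."""
    return (
        not w.hispanic                   # (1) non-Hispanic
        and 25 <= w.age <= 59            # (2) prime working age, bounds inclusive
        and w.sex == "M"                 # (3) male
        and not w.completed_high_school  # (4) high-school dropout
    )


# Made-up rows, not real survey records.
workers = [
    Worker(False, 34, "M", False),  # kept
    Worker(True, 34, "M", False),   # dropped: Hispanic
    Worker(False, 22, "M", False),  # dropped: younger than 25
    Worker(False, 45, "F", False),  # dropped: not male
    Worker(False, 45, "M", True),   # dropped: completed high school
    Worker(False, 59, "M", False),  # kept: upper age bound is inclusive
]

sample = [w for w in workers if in_comparison_sample(w)]
print(len(sample), "of", len(workers), "workers remain in the sample")
```

The point of writing it out is that every choice is visible: anyone who disagrees with, say, the age bounds can change one line and rerun the comparison.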
Borjas next compared the inflation-adjusted wages of Miami residents who had these characteristics to the wages of the same segment of the population in all other American metropolitan areas. And here’s what the data say:
The vertical line at 1980 indicates the arrival of the Marielitos. The blue curve for Miami begins diverging from the black curve (other metropolitan areas) after 1980, and the difference reaches its maximum around 1985. The reason it takes time for labor oversupply to reach its maximum impact is that wages are, in economics jargon, “sticky”: it takes several years for them to adjust to new labor market conditions. I saw the same effect in my own analysis of the effects of labor oversupply on national wages in the US. Interestingly, I also estimated the lag at 5 years, although at the time I did not know of Borjas’s analysis (well, because I did mine two years earlier, in 2013).
Eventually other forces come into play and the wage gap shrinks. Another divergence occurs following the Little Mariel in 1995, and the gap is not closed by the end of the series, probably due to the constant, if lower-level, immigration into Miami. The blue band and the black dotted lines are the 95% confidence limits. What they tell us is that when there is no overlap, the difference between the two curves is statistically significant, that is, highly unlikely to happen by chance alone. In other words, the Miami wages for native-born men without high school diplomas were indeed much lower than for similar workers in other US metropolitan areas during the 1980s and then again in the late 1990s, following the two spikes of Cubans migrating to Miami. During the 1980s Miami wages were 20 percent lower than elsewhere. A very substantial effect.
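For readers who want to see the logic of “no overlap means significant” spelled out, here is a rough sketch for a single year. It is not the procedure from the paper: the wage numbers are invented, the function names are my own, and the interval uses the simple normal approximation (mean plus or minus 1.96 standard errors), which is crude for a sample this small.

```python
import math
import statistics


def mean_ci95(values):
    """Return (mean, lower, upper) for a normal-approximation 95% interval."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))  # standard error
    half_width = 1.96 * se
    return m, m - half_width, m + half_width


def intervals_overlap(a, b):
    """a and b are (mean, lower, upper) tuples; True if the intervals share any point."""
    return a[1] <= b[2] and b[1] <= a[2]


# Invented inflation-adjusted weekly wages for one year; NOT the CPS data.
miami = [240, 262, 231, 255, 248, 270, 225, 251]
elsewhere = [300, 322, 291, 315, 308, 330, 285, 311]

ci_miami = mean_ci95(miami)
ci_elsewhere = mean_ci95(elsewhere)

# Relative gap: how far the Miami mean sits below the mean elsewhere.
relative_gap = ci_miami[0] / ci_elsewhere[0] - 1
significant = not intervals_overlap(ci_miami, ci_elsewhere)

print(f"Miami:     {ci_miami[0]:.1f}  [{ci_miami[1]:.1f}, {ci_miami[2]:.1f}]")
print(f"Elsewhere: {ci_elsewhere[0]:.1f}  [{ci_elsewhere[1]:.1f}, {ci_elsewhere[2]:.1f}]")
print(f"Relative gap: {relative_gap:.1%}, intervals separated: {significant}")
```

Note that checking for non-overlap is a conservative test: two intervals that do not overlap imply a significant difference, but intervals that overlap slightly do not by themselves prove that the difference is due to chance.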
And here we are. David Card and George Borjas started with the same CPS data. Both then used clearly described procedures to analyze it. We know precisely why their results are different. One of them is clearly wrong, and you can decide whose procedure is better (I know what my answer is). So where are the “damned lies”?