Are Studies in Income Inequality Robust? This One’s Pretty Good
David Autor and David Dorn are known for their research into (among many topics) the polarization of the U.S. labor market, sometimes also referred to as the “hollowing out of the middle class.” Here’s a New York Times piece where they outline a number of their assertions.
I attempted to replicate one of their central papers to get a sense for its robustness. The results are below along with a review of the literature leading up to it.
TL;DR: the paper replicated reasonably well. There are three areas where I believe greater robustness may have been attainable. First, A&D impute missing Census responses using cohort means; conceivably, multiple imputation by chained equations would have taken fuller advantage of their data. Second, A&D fit their LOWESS parameters by hand, where ideally they would have optimized these parameters using cross-validation. Third, the way A&D classify U.S. occupations felt somewhat like a black box to me, asking us to trust a prepared occupation-class mapping file from Dorn’s Ph.D. dissertation. However, as they draw no conclusions about magnitude and merely point out that polarization exists (albeit with statistical significance), I believe by and large we can consider A&D’s results to be well-identified. They also apply a reasonably robust instrumental variable, derived from the generally historically static U.S. geographic industry breakdown, to their key regressions.
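To make the cross-validation point concrete, here is a minimal sketch of how a LOWESS-style span could be chosen by leave-one-out cross-validation rather than by hand. It is not A&D’s procedure: the data are synthetic, the smoother is a simplified local linear fit with tricube weights (no robustness iterations), and every name in it (`local_linear_fit`, `loo_cv_error`, the candidate `fracs` grid) is a hypothetical chosen for illustration.

```python
import math
import random


def tricube(u):
    """Tricube kernel weight for a scaled distance u."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_fit(x0, xs, ys, frac):
    """Predict y at x0 with a weighted line through the nearest `frac` of points."""
    k = min(len(xs), max(3, math.ceil(frac * len(xs))))
    h = sorted(abs(x - x0) for x in xs)[k - 1]  # distance to the k-th nearest point
    if h == 0.0:
        h = 1e-12
    w = [tricube((x - x0) / h) for x in xs]
    sw = sum(w)
    if sw == 0.0:  # degenerate neighborhood: fall back to the plain mean
        return sum(ys) / len(ys)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    if sxx < 1e-12:  # no usable spread in x: use the weighted mean
        return my
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    return my + (sxy / sxx) * (x0 - mx)


def loo_cv_error(xs, ys, frac):
    """Mean squared leave-one-out prediction error for a given span."""
    total = 0.0
    for i in range(len(xs)):
        x_train = xs[:i] + xs[i + 1:]
        y_train = ys[:i] + ys[i + 1:]
        pred = local_linear_fit(xs[i], x_train, y_train, frac)
        total += (ys[i] - pred) ** 2
    return total / len(xs)


# Synthetic stand-in for a curve to be smoothed; not Census data.
rng = random.Random(0)
xs = sorted(rng.uniform(0.0, 6.0) for _ in range(60))
ys = [math.sin(x) + rng.gauss(0.0, 0.2) for x in xs]

fracs = [0.2, 0.4, 0.6, 0.8, 1.0]  # hypothetical candidate spans
scores = {f: loo_cv_error(xs, ys, f) for f in fracs}
best_frac = min(scores, key=scores.get)
print(best_frac, round(scores[best_frac], 4))
```

The span with the lowest out-of-sample error is the one the data themselves favor; a hand-picked span can then be compared against it as a robustness check.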
Thank you to Professors Autor and Dorn for answering numerous questions as I made my way through the replication.
A Replication Exercise
Submitted to Professor Raj Chetty and Teaching Fellow Simon Jaeger
Public Economics and Fiscal Policy I
The Growth of Low-Skill Service Jobs and
the Polarization of the US Labor Market
By David H. Autor and David Dorn
American Economic Review 2013, 103(5): 1553-1597
As Goldin and Katz point out in The Race between Education and Technology (2010), in the first three-quarters of the 20th century, the United States experienced both rapid economic growth and declines in economic inequality. Income per capita not only grew, but grew increasingly broadly distributed across income classes, creating considerable increases in economic well-being for the rich and poor alike.
Yet near the end of the 1970s, these declines in economic inequality began to reverse, and quite dramatically. According to Saez (2013), in 1980, the top one percent of income earners appropriated approximately 10 percent of the total U.S. income. Yet just prior to the financial crisis of 2008, this share had grown to nearly 25 percent. In an angry reaction, a group of protestors in 2011 launched a movement known as Occupy Wall Street. This movement received wide media coverage, and today many citizens continue to question the implications of rises in wage inequality for the middle and lower classes (Berrett 2011).
Autor and Dorn’s (2013) The Growth of Low-Skill Service Jobs and the Polarization of the US Labor Market (A&D, from here) provides a meaningful set of contributions to the literature evaluating such recent U.S. income distribution dynamics. As I discuss below, a particular thread in the literature had observed that U.S. employment and wages, especially in the 1990s, underwent polarization, wherein income and employment rose not only at the upper end of the occupational skill distribution but also at the lower end, despite declines in the middle. A&D evaluate the hypothesis that rises in relative employment and wages in service occupations (food service, janitorial service, and home health assistance, for example, jobs believed to constitute a meaningful share of this low-skill employment) were a response to the erosion of middle-skill jobs that are more routine in nature and hence susceptible to substitution by technology and offshore labor.
The purpose of this paper is to evaluate A&D’s analysis by replicating their results. In Section II, I review the background literature, summarize A&D’s analysis, and discuss the questions I believe remain open. In Section III, I review specifically how each of A&D’s figures and tables contributes to their overall conclusions, presenting my replicated results along the way and discussing potential discrepancies that the replicated results call attention to. In Section IV, I present an idea for extending A&D’s analysis which I believe could help policymakers act upon A&D’s results.
II. Background and Referee Report
A. Technical Change, Education, and The Canonical Model of Inequality
Much of the literature on low-skill employment and wages of the past several decades has given a central position to technical change. As early as 1960, Simon (1960) forecasted that information processing technology could eventually displace jobs that involve repetitive use of the eyes, brain, and hands. More formally, Tinbergen (1974, 1975) observed that, despite several decades of increases in the relative supply of skilled labor—i.e., that of college graduates—relative wages for skilled labor had also been increasing. These wage increases suggested simultaneous increases in the relative demand for skilled labor, which Tinbergen attributed to technical change.
From Tinbergen’s observation emerged what A&D and others refer to as the canonical model of labor and technical change. In the canonical model, labor supply is binned according to whether it requires high or low skill. Technical changes over time are modeled as complementary to labor, and if these changes augment labor differentially across the two skill bins, the canonical model predicts unequal changes in employment and wages across them (Acemoglu and Autor 2011, Autor and Dorn 2013).
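For concreteness, one common textbook formulation of the canonical model, in the spirit of Katz and Murphy (1992) and Acemoglu and Autor (2011), can be written as follows. The notation is an assumption made for illustration rather than anything quoted from A&D: H and L are the high- and low-skill labor supplies, A_H and A_L are the corresponding labor-augmenting technology terms, and sigma is the elasticity of substitution between the two skill bins.

```latex
% CES aggregate production with two skill bins
Y = \left[ (A_L L)^{\frac{\sigma-1}{\sigma}} + (A_H H)^{\frac{\sigma-1}{\sigma}} \right]^{\frac{\sigma}{\sigma-1}}

% With competitive labor markets, wages equal marginal products,
% so the log skill premium is
\ln\frac{w_H}{w_L} = \frac{\sigma-1}{\sigma}\,\ln\frac{A_H}{A_L} - \frac{1}{\sigma}\,\ln\frac{H}{L}
```

Under this formulation, when sigma exceeds one, a rise in A_H relative to A_L raises the skill premium, while a rise in the relative supply H/L lowers it. The premium can therefore rise alongside relative supply, as Tinbergen observed, only if the technology term grows fast enough to outweigh the supply effect.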
Many authors have investigated the extent to which skill/technology complementarity, often described as skill-biased technological change (SBTC), upon which the canonical model is grounded, can in fact describe different features of the evolution of U.S. employment and wages. The most highly cited literature includes the following.
- Katz and Murphy (1992) provided a foundational analysis, using a labor supply and demand model and Current Population Survey (CPS) data to show how increases in the skilled-worker wage premium between 1963 and 1987 likely resulted from growth in demand for skilled workers.
- Krueger (1993) used CPS data to find that, indeed, “workers who use computers on their job earn 10 to 15 percent higher wages” (p. 33).
- Goldin and Katz (1998) examined technical change as early as between 1909 and 1929, and found evidence that it too led to increased demand for skilled labor.
- Berman, Bound, and Machin (1998) used cross-country employment data and found evidence for SBTC in twelve different developed countries in the 1980s.
- Bresnahan, Brynjolfsson, and Hitt (2002) combined firm-level panel data from the Computer Intelligence Infocorp database with a survey of firm organizational practices and found a correlation between information technology adoption and the use of high-skill labor between 1987 and 1994.
- Bartel, Ichniowski, and Shaw (2007) looked specifically at manufacturing plant operations in the late 1990s and early 2000s, and found that increases in IT utilization corresponded with increases in requirements for technical and problem-solving skills among plant operators.
- Carneiro and Lee (2010) used decennial U.S. Census data to show that the SBTC framework provides reasonable estimates of the college wage premium (though the framework performs even better when it accounts for declines in skill quality that come with increases in college attendance).
Several authors have taken the SBTC analysis even further to consider its implications for education, proposing that, with increases in demand for skills and hence increases in the skill premium, workers over time may use education to upgrade their skills. They argue that this upgrading itself has had consequences for the labor market, increasing the relative supply of high-skill workers and decreasing the relative supply of low-skill workers. The literature exploring this hypothesis includes the following.
- Berman, Bound and Griliches (1994) used Annual Survey of Manufactures (ASM) data, the Census of Manufactures, and the National Bureau of Economic Research (NBER) trade data set to show “skill upgrading to be positively correlated with investment in computers and to R&D expenditures” in the 1980s (pp. 368-9).
- Acemoglu (1998) asserted that “…the rapid increase in the proportion of college graduates in the United States labor force in the 1970s may have been a causal factor in both the decline in the college premium during the 1970s and the large increase in inequality during the 1980s” (p. 1055).
- Autor, Katz, and Krueger (1998) used CPS data to demonstrate growth in demand for skilled labor between 1940 and 1996, and moreover that this growth could be largely attributed to skill upgrading, particularly in industries with significant computer utilization.
- Machin and Van Reenen (1998) showed the existence of skill upgrading in six additional OECD nations in the 1970s and 1980s.
- Card and Lemieux (2001) showed this same phenomenon in the U.S., but argued it had been more prevalent among younger versus older workers.
- Gera, Gu, and Lin (2001) found that variation in skill upgrading across industries was in fact associated with technology indicators such as patent stocks and the age of capital stock.
- Goldin and Katz (2008) argued that education can reverse the wage inequality that technical change may otherwise induce. While technical change complementing high-skill labor can result in a wage premium for skills and hence education, “if, in addition to technological progress, the quantity and possibly the quality of education increases, then inequality could decrease” (p. 3).
However, other literature has questioned the extent to which SBTC, and hence the canonical model, can fully explain recent employment and wage dynamics, pointing to additional possible factors. This literature includes the following.
- Spenner (1988), Howell and Wolff (1991), Howell and Wieler (1998), and Handel (2003) argued that any relationship between technology and skill demand is necessarily modified by firm culture and industry norms, levels of bargaining power between management and labor, and other managerial forces.
- Card and DiNardo (2005) argued more generally that SBTC models insufficiently account for supply-side frictions such as discrimination against age, race, or health status, or general job search hassles.
Other literature has questioned the SBTC and canonical model explanations using counter evidence. This literature includes the following.
- Howell (1994, 1999) presented evidence that the skill mix of U.S. employees shifted very little following 1983, precisely when computer adoption began to accelerate.
- Doms, Dunne, and Troske (1997) used an assembly of several data sources, including the 1988 and 1993 Survey of Manufacturing Technology, to show that manufacturing plants implementing factory automation have tended to employ high-skill workers both after and before the automations, suggesting the technology has had little causal skill demand impact.
- Mishel and Bernstein (1998) argued that, because technology adoption seemed to accelerate through the 1980s and 1990s while both education demand and wage inequality grew merely linearly, technology adoption may not have been responsible for the latter.
- Card and DiNardo (2002) observed that, despite a correlation between technology adoption and inequality growth in the 1980s, and despite continued computer technology advancements in the 1990s, the 1990s saw a slowdown in inequality growth.
- Hecker (2005) projected a growth in service occupations through 2014 despite their low-skill tendencies.
- Doms and Lewis (2006) and Beaudry, Doms and Lewis (2010) argued that computer technology adoption may arise endogenously from the presence of high-skill labor. They demonstrated this possibility using computer adoption survey data from between 1980 and 2000.
Still other literature has asserted that while technology may complement certain labor, it may substitute for other labor. Early literature in this space included the following.
- Bresnahan (1999) argued that, while technology may complement high-skill work, it can also substitute for middle- and low-skill work, such as billing, auditing, checking people into hotels, and so on.
- Caselli (1999) argued that technology can also substitute for high-skill work, through a mechanism wherein it complements low-skill work, in turn expanding the impact of this work into the previous high-skill workers’ domain. Caselli cites the assembly line as an example.
- Acemoglu (2002) made a similar high-skill labor substitution argument, citing the 19th century adoption of weaving, spinning, and threshing machines substituting for artisan labor.
A&D’s paper follows from three particular threads in this line of literature arguing that substitution has been at least as salient a force as SBTC in the evolution of U.S. labor dynamics. First, Autor, Levy, and Murnane (2003; ALM from here) investigated the specific workforce tasks computers have performed and how these tasks have either complemented or substituted for labor. They first used the Department of Labor’s Dictionary of Occupational Titles to evaluate the task requirements of the different U.S. occupations, and they then linked these occupations with Census and CPS data to observe that, since the 1970s, U.S. workers have been performing increasingly fewer tasks routine in nature and more tasks non-routine in nature. (To classify a task as routine, they considered the extent to which the task is sufficiently well-understood and structured to be automated. Example routine tasks include “record-keeping,” “calculation,” and “repetitive customer service,” p. 1286.) Furthermore, they noted that this shift has been most pronounced in industries with greater computer adoption, suggesting that computer substitution for routine tasks has been just as important to labor shifts across tasks and hence occupations as skill upgrading, if not more.
Second, Autor, Katz, and Kearney (2006, 2008; AKK from here) observed that, while relative wages and employment have grown steadily at the high end of the occupational skill distribution and declined steadily in the middle since the 1980s, possibly due to SBTC, relative wages and employment at the low end have been increasing since the late 1980s. They demonstrated these effects non-parametrically using CPS along with Merged Outgoing Rotation Group (MORG) data, and labeled them the “polarization” of the labor market. (Goos and Manning, 2003, had previously used this term to describe a similar phenomenon in Great Britain.) They argued that, because the college wage premium has been consistently rising, factors beyond SBTC must be at work in the low end of the distribution. They also noted that the canonical model, with only two skill bins, cannot fully describe these effects.
Third, an entire literature has explored offshore outsourcing and how it too may be substituting for labor, a phenomenon not accounted for in the canonical model.
- Bardhan and Kroll (2003), in an analysis of Indian newspaper reports, first estimated that approximately 11 percent of U.S. jobs were susceptible to offshoring. The McKinsey Global Institute (2005) corroborated this 11 percent figure through a qualitative analysis of the kinds of jobs that can be performed remotely.
- McCarthy (2004) used Bureau of Labor Statistics data to predict that, between 2004 and 2015, 3.4 million U.S. jobs would move offshore.
- In an influential Foreign Affairs essay, Blinder (2006) dubbed offshoring the third Industrial Revolution, noting that advances in information technology and global communication will allow nations to trade more fluidly according to their comparative advantages. He argued that, to adapt, the U.S. must rethink education, both to allow for continued overall skill upgrading, comparable to that in response to technological change, and to become more personal-service-oriented, to match the jobs likely to remain onshore.
- Following up in 2009, Blinder indexed the offshorability of each occupation in the Department of Labor’s O*NET database according to its physical proximity requirements between workers and work locations. Using this index, he estimated that a full 29 million out of 130 million total U.S. jobs (22 percent) were able to be offshored.
Acemoglu and Autor (2011; A&A from here) brought these three threads together and explored how computer and offshore task substitution have resulted in employment and wage polarization. They noted that the polarization at the low end of the occupational skill distribution has corresponded nearly one-for-one with a relative growth in service occupation employment and wages, which has been steady since the late 1980s. They then suggested that because service occupations—janitorial service, home health assistance, and so on—often comprise non-routine tasks, service occupations may be taking on some of the middle-skill-occupation labor previously in occupations comprising the routine tasks for which computers and offshoring have substituted. Using CPS, Census, and American Community Survey (ACS) data, they presented evidence for relative employment and wage increases in service occupations, along with simultaneous relative employment and wage decreases in production, operator, clerical, and sales occupations, occupations whose tasks they asserted have been more routine in nature. And as ALM noted, because workers of a given skill set can often perform many different tasks, shifts across occupations need not always entail skill transformations. Hence, A&A suggested that shifts from routine-task-intensive occupations into service occupations, perhaps even without education or other skill transformations, could explain some of the polarization AKK observed.
Following A&A, it remained an open question whether the simultaneity A&A observed between the growth in relative employment and wages among service occupations and the declines in relative employment and wages among routine-task occupations could be attributed specifically to computer and offshore substitution for these tasks. In other words, it was clear neither to what extent the routine-task intensity of an occupation could explain the employment shifts into service occupations and the resulting polarization, nor to what extent shifts from other occupations, or other mechanisms altogether, had also contributed. This is the question A&D begin to explore.
B. Autor and Dorn’s Findings
A&D present a unified analysis of employment and wage polarization, changes in service occupation employment and wages, occupation routine-task intensity, computer and information technology adoption, and offshoring. A&D frame their analysis around four key hypotheses: that “markets that were historically specialized in routine-task-intensive industries should differentially:
- Adopt computer technology and displace workers from routine-task-intensive occupations
- Undergo employment polarization as low-skill labor reallocates into manual task-intensive in-person services
- Exhibit larger (nominal) wage growth at both ends of the occupational skill distribution (i.e., wage polarization)
- Experience larger net inflows of workers with both very high and very low education, driven by rising demand for both abstract labor in goods production and manual labor in service production” (Autor and Dorn 2013, p. 1560).
As I discuss in more detail below, A&D present several findings related to these hypotheses. First, they reaffirm and add a bit more color to AKK’s polarization observation between the 1980s and 2005, showing that polarization has been more pronounced in local labor markets whose occupations historically have had greater routine task content, in turn helping validate hypotheses (2) and (3).
Second, they reaffirm A&A’s finding that relative rises in employment and wages among service occupations have largely corresponded with the overall relative employment and wage growth at the low-end of the occupational skill distribution, and that relative employment and wages in routine-task-intensive occupations have simultaneously declined. To build on this finding and to show that this simultaneity at the national level has likely not been mere coincidence, A&D exploit geographic variation in a version of ALM’s measure of occupational routine-task intensity, using it as a variable in explaining growth in service occupation employment share among non-college workers. This analysis helps validate hypothesis (4), and constitutes A&D’s most salient, robust, and novel contribution to the questions in the literature noted above. (Note that A&D do not explicitly test the part of hypothesis (4) related to very-highly-educated workers and a rising demand for abstract labor, presumably because the SBTC literature already describes this dynamic quite extensively. A&D contribute mostly to the description of the interaction between polarization and the behaviors of low- and medium-skill workers.)
Third, they find that this same routine-task intensity has been associated with the adoption of computer technology in local labor markets, helping validate hypothesis (1). They also find that the offshorability of jobs in local labor markets has not been as strong a predictor of rises in service occupation employment as the routine-task intensity of the occupations in these markets.
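The logic of the second finding, together with the instrumental-variable approach mentioned in the summary at the top, can be sketched on synthetic data. This is an illustration under stated assumptions, not A&D’s specification: the variable names (`hist_industry_mix`, `routine_share`, `d_service_share`), the data-generating process, and the single-instrument ratio estimator are all hypothetical stand-ins, and A&D’s actual regressions are more elaborate than this.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


# Synthetic local labor markets (all values are made up for illustration).
rng = random.Random(42)
n_markets = 5000
true_effect = 0.5  # assumed effect of routine share on service-share growth

hist_industry_mix = []  # instrument: historical industry structure
routine_share = []      # regressor: routine-task employment share
d_service_share = []    # outcome: change in service employment share
for _ in range(n_markets):
    z = rng.gauss(0.0, 1.0)
    u = rng.gauss(0.0, 1.0)  # unobserved local shock affecting both x and y
    x = 0.8 * z + 0.6 * u + rng.gauss(0.0, 0.3)
    y = true_effect * x + 1.0 * u + rng.gauss(0.0, 0.3)
    hist_industry_mix.append(z)
    routine_share.append(x)
    d_service_share.append(y)

# OLS slope: biased here because the unobserved shock moves x and y together.
ols_slope = cov(routine_share, d_service_share) / cov(routine_share, routine_share)

# IV slope with a single instrument: cov(z, y) / cov(z, x).
iv_slope = cov(hist_industry_mix, d_service_share) / cov(hist_industry_mix, routine_share)

print(f"OLS: {ols_slope:.3f}  IV: {iv_slope:.3f}  true: {true_effect}")
```

In this toy setup the OLS slope overstates the assumed effect because of the shared shock, while the instrumented slope should land near it. The instrument is valid here only by construction, which is exactly what cannot be taken for granted with real data.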
C. Critiques in the Literature
Before I offer my own review of A&D, a few existing critiques of the ALM, AKK, and A&A thread upon which A&D is grounded are worth noting. First, Lefter and Sand (2011) noted that, because the Census Bureau changed its occupation classification scheme with the 2000 census, measures of occupational employment and wage growth between 1990 and 2000, which are central to ALM’s and A&A’s analyses of labor shifts across occupations, may be contaminated. They found that the standard deviation of occupational employment growth between 1990 and 2000, as AKK and A&A measured it, was nearly twice as large as that between 1980 and 1990, hence contesting the purportedly observed polarization beginning in the late 1980s. Conceivably, AKK and A&A’s polarization measure could be an artifact of the occupation reclassification. A&D cite Lefter and Sand (2011) and claim to correct for the reclassification issue, though A&D present no robustness measure for this correction. In Lefter and Sand’s (2011) own correction attempt, they discovered much weaker polarization evidence than did AKK, A&A, or A&D. It is not clear to me which attempt was more thorough, and hence I believe the question of whether polarization is genuine, and whether the associated routine-task substitution explanation is correct, has not been completely resolved.
Second, Mishel, Schmitt, and Shierholz (2013; MSS from here) make a number of arguments against ALM, AKK, and A&A’s routine-task substitution explanation of polarization. First, MSS use yearly CPS data (in places where the others used decennial Census and ACS data) to observe that employment polarization may have begun as early as 1980, well before computerization and offshoring’s dramatic rise in the 1990s, hence implying that the association between the two in the 1990s may have been coincidental. MSS also note a shift in the CPS occupation classification scheme in 1983, which they correct for. They argue that A&A’s results using decennial Census data may have been similarly subject to this reclassification (perhaps because of coordination between Census and CPS evaluators), and hence the corrected CPS results may be more accurate. However, MSS present little more than speculative justification for this need to correct the Census data. Also, A&D’s Figure 3, which I present in Section III.2 below, shows that service occupations were indeed expanding in the 1980s. And in Table 4, A&D present evidence that this expansion was associated with the routine-task intensity of occupations then, too.
MSS also note that service occupations have constituted merely half of the employment in the lower quintile of the occupational skill distribution, and hence shifts into them would not reflect a full set of employment dynamics in this quintile. In my view, A&D successfully address any such concerns when they show in their Figure 3 that low-skill employment in non-service occupations declined continually from as far back as 1970 through 2005. Hence, polarization, which is characterized by a rise in low-skill-occupation employment, indeed seems most likely driven by rises in service occupation employment.
Lastly, MSS question the notion that employment polarization has stemmed from relative demand increases for low-skill-occupation labor which, as a consequence, have also increased low-skill-occupation workers’ relative wages. MSS present the same CPS analysis showing low-skill occupation employment gains in the early 1980s, and note that low-skill occupational wages actually declined over this same period. (As noted above, I find the accuracy of MSS’s CPS-derived results unclear.) They also note that low-skill-occupation employment roughly doubled in the 2000s while wages among these occupations did not increase as dramatically. Finally, they use CPS data to show that, in the 2000s, service occupations witnessed relative wage growth before they witnessed relative employment growth. Using these three observations, MSS question whether employment shifts into low-skill occupations have in fact been responsible for low-skill occupational wage increases. However, in my reading of ALM, AKK, A&A, and A&D, these authors do not expect low-skill-occupation employment and wage growth to operate in perfect synchrony. In A&D’s framework, cheaper computerization and offshore labor substitute for routine-task labor, creating excess low-skill labor supply which eventually shifts into service occupations, creating employment polarization. Separately, the consumption of products whose prices decrease as a result of computerization and offshoring complements the consumption of services, increasing demand for low-skill service labor, which leads to wage polarization. A&D model these two mechanisms as separate: although both are triggered by computerization and offshoring, they need not evolve simultaneously.
D. Ideal Experiment
My own review of A&D begins with thoughts on what ideal experiment would identify the factors responsible for employment and wage polarization.
Of course, as noted above, there is still some disagreement over whether and when polarization occurred. Lefter and Sand (2011) question whether polarization could be an artifact of the Census’s occupational reclassification in 2000, and whether employment and wage inequality may have actually grown monotonically across the occupational skill distribution over the past several decades. MSS argue that polarization didn’t begin in the late 1980s, but actually several decades before that. One could discuss ideal experiments for resolving even these disputes. Yet A&D seek to contribute by taking AKK and A&A’s polarization evidence as sufficiently conclusive (reproducing it themselves, even taking what they assert are reasonable steps to account for at least Lefter and Sand’s concerns; MSS’s weren’t published at the time), setting out to help dissect its mechanisms.
In some fields, a randomized controlled trial (RCT) is the preferred experimental design for isolating the outcome variation for which a particular treatment is responsible. Participants are randomly assigned to either a treatment or a control group, safely allowing the difference in outcomes between the two groups to be attributed to the treatment. However, with employment and wage polarization research, the ultimate objective is not to identify the effect of any one given treatment in isolation. Rather, it is to identify the entire set of non-random factors responsible for polarization, ultimately to guide policymakers as they consider whether or how to attenuate this polarization. (This said, an important sub-objective is to quantify the specific influence of the routine-task intensity of workers’ occupations, which A&D argue has an effect. Yet even for this, an RCT would not be ideal. Even if an implementable and ethical method were discovered to randomly assign workers into labor markets, doing so could distort any behavior-guiding incentives that depend on workers operating in the environments they themselves choose. Moreover, no two labor markets are identical even absent a treatment effect.)
As such, a quasi-experimental method using a large sample of comprehensive panel data would be ideal. Because polarization is a phenomenon at the level of the entire labor market, this ideal data would cover many such markets, each independent and clearly delineated. To ensure that the complete relationships between polarization and its responsible factors are identified, these markets would cover the full range of possible polarization developments, from significant declines in polarization (i.e., the homogenization of employment and wages across the occupational skill distribution, if in fact such an effect can feasibly occur) to significant increases. The full domain of possible values of any explanatory covariates would also be present.
Also in this ideal, every relevant explanatory covariate would be both observable and able to be perfectly instrumented, so as to avert omitted variable bias and inconsistent identification due to any correlation between the covariate and error in the outcome measures. Skill level itself and its various dimensions would be observable, with no need for wage, education, or other proxies except as test covariates in their own right. Firm culture and industry norms, degrees of bargaining power between management and labor, workplace prejudice toward age, race, gender, health status, or other personal characteristics, and so on, would all have robust, accurate proxies.
Also, the data would be administrative, so as to minimize the potential for the measurement error and subsequent attenuation bias common with survey data. The identification algorithm would fit and cross-validate a nonparametric hyper-spline to the data, so as to represent any and all nonlinear interactions both among the explanatory variables and between the explanatory variables and the outcome variable. The experimenter would also bootstrap the data to obtain the standard error for this hyper-spline.
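As a rough illustration of the bootstrap step, the sketch below resamples observations with replacement and reports the standard deviation of a fitted curve’s value at one point. A simple Gaussian kernel smoother stands in for the hyper-spline, and the data, the bandwidth, and the function names (`kernel_smooth`, `bootstrap_se`) are hypothetical choices made only for this example.

```python
import math
import random
import statistics


def kernel_smooth(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel."""
    w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(wi * y for wi, y in zip(w, ys)) / sum(w)


def bootstrap_se(x0, xs, ys, bandwidth, n_boot, rng):
    """Standard error of the smoothed value at x0 from a pairs bootstrap."""
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        estimates.append(kernel_smooth(x0, bx, by, bandwidth))
    return statistics.stdev(estimates)


# Synthetic data; not drawn from any of the sources discussed here.
rng = random.Random(7)
xs = [rng.uniform(0.0, 6.0) for _ in range(200)]
ys = [math.sin(x) + rng.gauss(0.0, 0.3) for x in xs]

point_estimate = kernel_smooth(3.0, xs, ys, bandwidth=0.3)
se = bootstrap_se(3.0, xs, ys, bandwidth=0.3, n_boot=200, rng=rng)
print(f"fit at x=3.0: {point_estimate:.3f} (bootstrap SE {se:.3f})")
```

Resampling whole (x, y) pairs keeps each observation’s covariates and outcome together, which is the natural choice when the rows are independent units such as labor markets.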
E. Autor and Dorn’s Data
A&D’s data source, the Minnesota Population Center (MPC)’s Integrated Public Use Microdata Series (IPUMS; Ruggles et al. 2010), has a number of benefits and weaknesses relative to the ideal. IPUMS is a collection of decennial U.S. Census samples plus annual American Community Survey (ACS) samples. A&D use IPUMS’s records from 1950, 1970, 1980, 1990, 2000, and 2005. Each of these is a U.S. Census sample except for the 2005 records, which are from the ACS.
IPUMS’s principal benefit is the sheer size of the samples it captures. The 1950, 1970, and 2005 samples reflect a full one percent of the nation’s households in these years, and the 1980, 1990, and 2000 samples reflect a full five percent. MPC asserts that the samples, due to their large sizes, are representative both of the nation and of micro-geographic areas of at least approximately 100,000 people.
IPUMS reasonably complies with the panel nature of the ideal data source, in that it allows for the longitudinal tracing of whole U.S. geographical areas. As mentioned above, because polarization is a phenomenon at the whole labor-market level, observing the evolution of whole geographical areas is the most central observation pertinent to characterizing polarization. (Though also, as I discuss below, whether these individually-traced areas are perfectly isolatable and independent is not completely clear with IPUMS data.)
This said, any insight into the worker-level mechanics of polarization would require observing the historical journeys of individual workers. And IPUMS does not support longitudinal analysis at the individual-worker level, as, at that level, IPUMS is merely a sequence of randomly-sampled cross-sections. The specific analytical consequences of this limitation include, first, that where employment has grown among certain occupations, such as among the service occupations that A&A and A&D are specifically interested in, one cannot directly observe to what extent this growth has derived from new individuals entering the workforce versus individuals shifting from other occupations. As such, it is not possible to determine whether polarization has affected new and existing workers alike. Similarly, one cannot use IPUMS to determine precisely how much of the employment decline A&D observe among middle-skill occupations has derived from shifts into high-skill occupations, shifts into low-skill occupations, or individuals departing employment altogether. One can only observe that middle-skill occupations have undergone employment declines in total, and that high- and low-skill occupations have simultaneously undergone increases. (For separate reasons, A&D choose to measure these changes on a relative basis, though this is not dictated by an IPUMS limitation.) Hence, any policy ideas to address polarization derived from IPUMS data must either operate at a macro geographic-area level, or target merely hypothesized mechanics at the individual level.
IPUMS also falls somewhat short on the panel ideal of frequent observations. Because the data is merely decennial, with the exception of the interval between the 2000 and 2005 samples, only long-term trends are observable. Also, should the data have been recorded during any years in which the observed variables undertook incidental temporary shocks, no contrasting data from surrounding years is available to expose the temporary nature of the observations. As discussed above, although their methods have not been scrutinized in the literature and they do not clearly justify an occupation classification scheme correction, MSS assert that annual CPS data reveals a different pattern in the evolution of the occupational skill distribution between 1980 and 1990 than what A&D observe. Knowing for certain that IPUMS captures the correct pattern is difficult given its decennial sampling.
Regarding the ideal way in which the measured labor markets would be independent and clearly delineated, A&D are able to demarcate 722 U.S. labor markets in the IPUMS data using Tolbert and Sizer’s (1996) “commuting zone” (CZ) formulation. A CZ is a U.S. geographical region reflecting the boundaries of a cluster of commuters who tend to commute more to areas within the CZ than to areas outside of it. Because commuters tend not to cross CZ boundaries, A&D take CZs to be reasonably independent. This said, A&D do not present a test for this independence, nor whether it changes over time. A&D also note that they were not always able to perfectly identify a person’s CZ in the IPUMS data. IPUMS reports either the person’s Public Use Micro Area (PUMA, a state subarea of between 100,000 and 200,000 individuals) or county, but some PUMAs and counties straddle more than one CZ. In such cases, A&D assign the person to a CZ probabilistically.
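One common way to implement a probabilistic assignment of this kind is to split each person's sampling weight across the candidate CZs in proportion to the share of the PUMA's (or county's) population living in each. The sketch below illustrates that idea only. The crosswalk values, the function names, and the weight-splitting approach itself are illustrative assumptions, not A&D's actual files or code.

```python
from collections import defaultdict

# Hypothetical crosswalk: for each PUMA, the fraction of its population
# living in each commuting zone (fractions sum to 1 within a PUMA).
PUMA_TO_CZ = {
    "puma_A": {"cz_1": 1.0},
    "puma_B": {"cz_1": 0.3, "cz_2": 0.7},
}


def split_person_weight(puma, person_weight, crosswalk=PUMA_TO_CZ):
    """Return {cz: weight}, splitting one sampling weight across CZs."""
    return {cz: person_weight * frac for cz, frac in crosswalk[puma].items()}


def cz_weighted_hours(people, crosswalk=PUMA_TO_CZ):
    """Sum weighted annual labor hours per CZ.

    `people` is an iterable of (puma, sampling_weight, annual_hours) tuples.
    """
    totals = defaultdict(float)
    for puma, weight, hours in people:
        for cz, w in split_person_weight(puma, weight, crosswalk).items():
            totals[cz] += w * hours
    return dict(totals)


# Two illustrative records: one PUMA sits wholly inside a CZ, one straddles two.
people = [("puma_A", 100.0, 2000.0), ("puma_B", 50.0, 1000.0)]
totals = cz_weighted_hours(people)
```

Under this approach the straddling person contributes to both CZs, and total weighted hours are conserved across the split.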
As mentioned above, the ideal experiment would also have access to the full range of possible evolutions of polarization in a labor market, as well as the entire domains of values of the independent variables, at its disposal. Given A&A’s discovery that polarization has largely coincided with service occupation employment share growth, as discussed above, A&D’s most detailed polarization analysis focuses on changes in the share of labor hours in service occupations. IPUMS seems to cover a wide range of such changes, from declines of approximately five percent within a CZ to increases of approximately twenty percent, as A&D’s Figure 6 depicts in Section III.2 below. The domain of values for the key tested independent variable—the share of labor hours in a CZ in routine-intensive occupations—is also reasonably sizable, from zero to nearly forty percent.
How well IPUMS captures the ideal, complete set of pertinent independent variables is difficult to ascertain, as it is with any retrospective data source. IPUMS does present a rich set of individual economic measures including employment status, labor force status, three-digit occupation and industry worked within, personal and family income, business income, interest income, welfare income, retirement income, weeks worked last year, and usual hours worked per week, among others. It also presents several demographic measures related to age, gender, ethnicity, race, migration status, geographic location, living quarters, and educational attainment, among others. MPC ensures that variables are coded consistently across the different Census and ACS samples so that they can be compared, which is another IPUMS benefit. And as discussed below in the sections for Tables 5 and 6, A&D are able to test their key hypothesis—that local markets constituting highly routine-task-intensive occupations have undertaken shifts toward service occupation employment—in a model including several control variables representing alternate conceivable explanations for the shifts. This said, variables representing firm culture or industry norms, degrees of bargaining power between management and labor, and workplace prejudice toward age, race, gender, health status, or other personal characteristics, for example—variables that Spenner (1988), Howell and Wolff (1991), Howell and Wieler (1998), Handel (2003), and Card and Dinardo (2006) would view as important—IPUMS does not capture and does not allow one to test. Hence, omitted-variable bias cannot be completely ruled out.
IPUMS also does not capture a worker’s intrinsic skill level, a variable of particular significance to polarization, which by definition is the decline in middle-skill-occupation employment and wages relative to that of low- and high-skill occupations. (Of course, no data source I am aware of can directly capture intrinsic skill.) A&D, as did AKK and A&A, infer polarization’s existence using the average hourly wages within occupations as proxies for the workers’ skills in these occupations. As discussed above, this approach has generated questions in the literature as to whether polarization has even occurred. Yet even assuming it has, the lack of a skill variable precludes the complete characterization of polarization’s factors. A&D test the extent to which labor shifting away from routine-task-intensive occupations has contributed to polarization, yet IPUMS does not allow one to directly test the extent to which skill upgrading has or has not occurred in conjunction with these shifts. SBTC, and its associated skill upgrading, is the other major hypothesis in the literature for growth in wage and employment differentiation—that is, at least between high- and medium-skill-occupation workers—and SBTC and computer and offshore labor substitution are not necessarily mutually exclusive. Yet using IPUMS data, the relative strengths of these forces cannot be directly compared.
The fact that the Censuses and ACSs are human surveys also means that measurement errors can be expected somewhat commonly. MPC has the following to say about the survey nature of IPUMS.
The enumeration forms that make up the source data of the IPUMS have a variety of flaws. In some cases, enumerators or respondents neglected to answer a particular inquiry. In others, the item was completed but is illegible. In the older censuses, poor microfilm or deterioration of the paper occasionally make it impossible to read a response. In other cases, a legible response is present but is clearly incorrect, since it contradicts multiple other items in the unit. For example, if a woman is married, her household relationship is “spouse,” and her age is under 12, the age is considered to be inconsistent with the other values.
As I discuss below, although A&D take steps to impute missing values where possible, the inherent randomness of the imputations, not to mention the randomness within even many completed responses, means that attenuation bias likely affects A&D’s results. In Section III.3 below, I test for the potential scale of the imputation-related attenuation bias and find that its impact is not insignificant. This said, A&D do not draw any precise magnitudinal conclusions from their results; they merely assert that the routine-task-intensity of the occupations in a labor market seems to be a meaningful factor in the market’s susceptibility to polarization, a conclusion presumably little-affected by the bias in their quantitative measurements, as discussed more below. Still, it seems the ideal in which the complete set of factors driving polarization are both identified and quantified cannot be fully realized using IPUMS data.
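For intuition about the direction of this bias, the textbook result for classical measurement error in a single regressor is sketched below. It assumes the observed regressor equals the true value plus noise that is uncorrelated with both the true value and the regression error. This is a simplification, since A&D's setting involves several regressors, cohort-mean imputation, and an instrument; the symbols are introduced here for illustration only.

```latex
x_j = x_j^{*} + u_j,
\qquad
\operatorname{plim}\, \hat{\beta}_{OLS}
  = \beta \cdot \frac{\sigma_{x^{*}}^{2}}{\sigma_{x^{*}}^{2} + \sigma_{u}^{2}}
```

Because the ratio on the right lies between zero and one, the estimated coefficient is pulled toward zero, which is why attenuation would imply true coefficients at least as large in magnitude as those estimated.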
IPUMS privacy protections may also somewhat affect A&D’s measurements. To protect uniquely high earners from identification, IPUMS top-codes individual earnings above a certain threshold. Similarly, A&D question the accuracy of the wage reports of very low earners, and bottom-code earnings below a certain threshold. This top- and bottom-coding necessarily homogenizes the observed wage and employment evolutions at the very-high and very-low ends of the wage distribution, though of course, the extent of the original heterogeneity of this evolution is unknown. And because A&D’s focus is at the low end of the distribution, where polarization as opposed to monotonic inequality growth across the occupational skill distribution is revealed, the top-coding is presumably of little consequence to their conclusions. Because the bottom-coding is performed at only the first percentile of the wage distribution, its impact on A&D’s overall conclusions is also likely very small.
F. Autor and Dorn’s Identification Strategy
Despite the limitations of the IPUMS data versus the ideal, A&D are able to formulate a reasonable strategy for testing the influence of the routine-task intensity of the occupations in a labor market on non-college employment shifts into service occupations in this market. (As discussed above, A&D present several findings, though the association between routine-task intensity and non-college service occupation employment share growth is their most salient contribution to the literature, upon which I focus in this section.)
A&D’s Table 5 presents the most central result: a positive, significant, reasonably-instrumented, and reasonably-controlled association between the two. In Section III.2 below, I discuss the detailed steps behind Table 5’s preparation, as well as A&D’s detailed conclusions from it. In outline, A&D’s identification strategy behind these results is as follows. First, A&D construct a variable summarizing the routine work the occupations in each CZ entail each year. Second, they construct an instrumental variable for this variable. Third, they prepare a linear two-stage least-squares (2SLS) model incorporating both of these variables, an outcome variable representing changes in the CZ’s share of non-college employment in service occupations (SNESO), as well as several control variables. The model measures changes in SNESO between 1980 and 2005, a time period over which polarization from computer and offshoring substitution is believed to have occurred, and uses a decade fixed-effect term to limit the identification to within each decade.
To construct the variable representing the routine-task intensity of the occupations in each CZ, A&D take several steps. ALM had used the U.S. Department of Labor’s Dictionary of Occupational Titles to estimate, on a scale between zero and ten, the extent to which each occupation in this dictionary has required each of three types of tasks: routine, abstract, and manual. A&D use ALM’s scores to create an index for each occupation’s routine-task intensity (RTI):
$$RTI_{k} = \ln\left(T^{R}_{k}\right) - \ln\left(T^{M}_{k}\right) - \ln\left(T^{A}_{k}\right)$$
where $T^{R}_{k}$ is ALM’s routine task requirement score for occupation $k$, $T^{M}_{k}$ is their manual task requirement score, and $T^{A}_{k}$ is their abstract score. The idea is that an occupation’s RTI depends not only on the occupation’s requirements for routine tasks, but also on the extent to which workers’ available time must be shared between routine work and any non-routine work. Note that this index is for each occupation independent of the CZs in which it is found.
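As a minimal sketch, the index can be computed as the log of the routine score minus the logs of the manual and abstract scores. The task scores below are made-up values on ALM's zero-to-ten scale, not actual Dictionary of Occupational Titles figures, and the sketch assumes every score is strictly positive so that the logarithm is defined.

```python
import math


def rti(routine, manual, abstract):
    """Routine-task-intensity index: ln(routine) - ln(manual) - ln(abstract)."""
    return math.log(routine) - math.log(manual) - math.log(abstract)


# Hypothetical (routine, manual, abstract) scores, for illustration only.
scores = {
    "clerk": (8.0, 2.0, 2.0),
    "janitor": (2.0, 8.0, 1.0),
    "manager": (2.0, 1.0, 8.0),
}

rti_by_occ = {occ: rti(*s) for occ, s in scores.items()}
```

In this toy example the clerk has the highest index, because its routine requirement is high relative to both of its non-routine requirements.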
Following this step, A&D determine whether each occupation falls into the top 1980-labor-hour-weighted third of all occupations according to its RTI index value. If so, A&D denote the occupation as a routine intensive (RI) occupation.
Finally, A&D calculate the routine-intensive occupation share (RSH) in a CZ in a given year by dividing the labor hours expended in the RI occupations in the CZ by all the labor hours expended in the CZ. RSH constitutes the key independent variable that A&D test.
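The two steps above (flagging the top 1980-labor-hour-weighted third of occupations by RTI, then taking the RI share of a CZ's hours) can be sketched as follows. All inputs are hypothetical, and the treatment of the occupation that straddles the one-third cutoff is an assumption of this sketch rather than A&D's documented rule.

```python
def routine_intensive_set(rti_by_occ, hours_1980):
    """Occupations in the top 1980-hours-weighted third of the RTI ranking.

    Occupations are added in descending RTI order until the occupations
    already added account for at least one third of 1980 labor hours.
    """
    total = sum(hours_1980.values())
    ranked = sorted(rti_by_occ, key=rti_by_occ.get, reverse=True)
    ri, cumulative = set(), 0.0
    for occ in ranked:
        if cumulative / total >= 1.0 / 3.0:
            break
        ri.add(occ)
        cumulative += hours_1980[occ]
    return ri


def rsh(cz_hours, ri_occs):
    """Share of a CZ's labor hours worked in routine-intensive occupations."""
    total = sum(cz_hours.values())
    return sum(h for occ, h in cz_hours.items() if occ in ri_occs) / total


# Hypothetical national inputs (RTI values and 1980 labor hours).
rti_by_occ = {"clerk": 0.69, "assembler": 0.40, "janitor": -1.39, "manager": -1.39}
hours_1980 = {"clerk": 20.0, "assembler": 20.0, "janitor": 30.0, "manager": 30.0}
ri_occs = routine_intensive_set(rti_by_occ, hours_1980)

# Hypothetical labor hours by occupation in one CZ in one year.
cz_hours = {"clerk": 10.0, "assembler": 5.0, "janitor": 15.0, "manager": 20.0}
cz_rsh = rsh(cz_hours, ri_occs)
```

Note that the RI set is fixed using national 1980 hours, while RSH itself is then computed separately for each CZ and year.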
A&D prepare the instrumental variable (IV) to help isolate the influence of the quasi-stationary component of the CZ’s RSH on changes in SNESO. A&D contend that different unobserved short-run characteristics of a CZ could influence both its RSH and its SNESO, biasing RSH’s identified coefficient. For example, workers might shift temporarily between service and routine-task manufacturing occupations in response to seasonal shifts in labor demand. Such shifts could contaminate measures of the influence of the CZ’s long-run, structural RSH on long-run shifts in labor into service occupations. A&D are more interested in this latter effect, presumably because it has more of a causal flavor to it. Were policy to influence it, the policy would be more likely to influence long-run polarization.
A&D assemble the IV using the CZs’ 1950 industry structures:
$$\widetilde{RSH}_{j} = \sum_{i=1}^{I} E_{i,j,1950} \times R_{i,-j,1950}$$
where $R_{i,-j,1950}$ is the 1950 national share of RI occupations in industry $i$ everywhere except in CZ $j$’s state, and $E_{i,j,1950}$ is the labor-hour weight of industry $i$ in CZ $j$. The variable essentially re-weights the industry components of the 1950 national RSH (taken from everywhere but the CZ’s own state) to reflect the industry structure of the CZ. In principle it corresponds with the industry-derived component of the CZ’s RSH, which A&D contend is more structural and long-run consistent than the CZ’s directly-measured RSH. Overall, A&D assert $\widetilde{RSH}_{j}$ to be an ideal IV: because the CZs’ 1950 industry structures are temporally distant from the model’s keystone observations between 1980 and 2005, they should exhibit little correlation with SNESO shifts over this same time period. That is, except by way of the CZs’ directly-measured RSH values in 1980 through 2005, which are the model’s key explanatory variables. I present a concern regarding this IV below.
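A minimal sketch of this construction follows: for each industry, compute the RI share of 1950 hours pooled over every state except the CZ's own, then average those shares using the CZ's own 1950 industry weights. The state codes, industries, and hours below are invented for illustration, and the function names are hypothetical.

```python
def leave_out_ri_share(hours_1950, industry, own_state):
    """RI share of 1950 hours in `industry`, pooling all states but `own_state`.

    `hours_1950` maps (state, industry) -> (ri_hours, all_hours).
    """
    ri = total = 0.0
    for (state, ind), (ri_hours, all_hours) in hours_1950.items():
        if ind == industry and state != own_state:
            ri += ri_hours
            total += all_hours
    return ri / total


def predicted_rsh(cz_industry_weights, hours_1950, own_state):
    """Industry-weighted sum of leave-own-state-out RI shares for one CZ."""
    return sum(
        weight * leave_out_ri_share(hours_1950, industry, own_state)
        for industry, weight in cz_industry_weights.items()
    )


# Hypothetical 1950 hours: (RI hours, all hours) by state and industry.
hours_1950 = {
    ("OH", "manufacturing"): (40.0, 100.0),
    ("TX", "manufacturing"): (30.0, 100.0),
    ("CA", "manufacturing"): (50.0, 100.0),
    ("OH", "retail"): (10.0, 50.0),
    ("TX", "retail"): (10.0, 100.0),
    ("CA", "retail"): (20.0, 100.0),
}

# Hypothetical 1950 labor-hour weights by industry for a CZ located in OH.
cz_weights = {"manufacturing": 0.6, "retail": 0.4}
iv = predicted_rsh(cz_weights, hours_1950, own_state="OH")
```

Excluding the CZ's own state keeps the CZ's own occupational mix from entering its instrument mechanically.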
Regarding control variables, A&D incorporate the following into their Table 5 2SLS model:
- Variables approximating changes in supply and demand for service labor: the ratio of college to non-college workers in the CZ, and the immigrant share of the CZ’s non-college population.
- Variables approximating labor demand: The share of workers in manufacturing industries in the CZ, and the CZ’s unemployment rate.
- Variables approximating demand for services produced by service labor: Employed females as a fraction of the total population, and senior citizens as a fraction of the total population.
- A variable approximating the effect of the minimum wage on the availability of service labor: the fraction of non-college workers earning a wage below whatever minimum wage was eventually legislated in the subsequent decade.
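To illustrate the mechanics of two-stage least squares, the sketch below uses simulated data with one endogenous regressor, one instrument, and no controls, fixed effects, or clustering. It is therefore far simpler than A&D's Table 5 specification. The data-generating process and all names are assumptions chosen so that plain OLS is biased by an unobserved confounder while the instrumented estimate is not.

```python
import random


def ols_slope_intercept(x, y):
    """Slope and intercept from a simple OLS regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_stage_least_squares(y, x, z):
    """2SLS point estimate with a single regressor x and a single instrument z."""
    # First stage: project the endogenous regressor on the instrument.
    b1, a1 = ols_slope_intercept(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Second stage: regress the outcome on the first-stage fitted values.
    b2, _ = ols_slope_intercept(x_hat, y)
    return b2


# Simulated data: c is an unobserved confounder driving both x and y.
rng = random.Random(0)
n = 5000
true_beta = 1.0
z = [rng.gauss(0, 1) for _ in range(n)]
c = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + ci + rng.gauss(0, 1) for zi, ci in zip(z, c)]
y = [true_beta * xi + 2.0 * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]

beta_ols, _ = ols_slope_intercept(x, y)
beta_2sls = two_stage_least_squares(y, x, z)
```

The second-stage regression yields the correct point estimate, but its naive standard errors are not valid for inference; a dedicated IV routine would be needed for those.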
G. Autor and Dorn Critique
For A&D’s strategy to be valid, first, a number of assumptions must hold. Many of these assumptions relate to the considerations the IPUMS data poses discussed above. In summary, these include that:
- The data is representative. As discussed above, MPC asserts that IPUMS accurately represents geographic areas no smaller than 100,000 individuals. In 1950, approximately 150 million individuals lived in the U.S. This implies that, across the 722 CZs, the average 1950 CZ contained approximately 208,000 individuals, a number which approximately doubled by 2005. Hence, assuming representativeness seems reasonable.
- Measurement once every ten years accurately captures the relevant labor market dynamics. As conjecture, I imagine that any of A&D’s numerous dependent, independent, or control variables could have undertaken incidental temporary shocks at the CZ-level during measurement years on the order of, say, 10 percent. Supposing this order of magnitude is correct (A&D do not perform any checks for this—this would require the detailed incorporation of some separate data source, a project in itself), the decennial nature of the IPUMS data is unlikely to have affected A&D’s overall conclusions. This said, as discussed above, MSS use annual CPS data to contend that polarization even began before computer and offshore labor substitution was a salient force. If true, this could challenge the hypothesis that polarization resulted from this substitution. I believe MSS’s methods are questionable; still, there may be value in continued exploration in the literature, triangulating A&D’s results with more-frequently-observed data.
- The CZs are reasonably independent. A&D do not present any tests for this independence (nor do Tolbert and Sizer, 1996, who designed the CZs), yet A&D do cluster their standard errors at the state level. Presumably this clustering accounts for most of any correlation among the CZ outcome variable residuals, as presumably CZs’ behaviors are correlated more-or-less only where CZs are neighbors. However, some CZs cross state boundaries, and so it is possible that state-level clustering does not fully account for these correlations. Also, one could imagine a certain degree of correlation along other dimensions—say, among more-conservative and more-liberal geographic areas, for example. For this reason, it is possible that A&D’s presented standard errors are biased slightly downward.
- Any omitted-variable bias (OVB) is negligible. As discussed above, it is not possible to completely eliminate the prospect of OVB. A&D use an IV and several control variables to reduce this prospect. Yet as I discuss further below, I am not certain that A&D’s IV is perfectly robust. Also, variables representing firm culture, industry norms, degrees of bargaining power between management and labor, and workplace prejudice toward age, race, gender, health status, or other personal characteristics, for example, with any of which CZ RSH could conceivably be correlated, cannot be tested as controls. The consequence is that A&D’s RSH coefficients are possibly biased upward. I believe a future model that somehow tests RSH against these other variables could be of benefit to the literature.
- Attenuation bias from imputation and measurement error is negligible. A&D do not test for attenuation bias, although I present a cursory test for imputation-driven attenuation bias in Section III.3 below. Unfortunately, although the bias is minor, it does not appear negligible; furthermore, attenuation bias from measurement error presumably adds to it. Fortunately, because this implies that RSH coefficients may actually be larger than what A&D report, attenuation bias is unlikely to have affected A&D’s overall conclusions.
- Any artifacts from wage top- and bottom-coding are negligible. However, as discussed above, I believe top- and bottom-coding should be of little concern to A&D. A&D focus on the low-end of the wage distribution, where top-coding is irrelevant. And bottom-coding is only performed among the lowest one percent of wage earners.
A&D’s identification strategy also depends on a few assumptions besides those that using IPUMS data requires. First, it assumes that the relationship between CZ RSH and changes in CZ SNESO is linear. A&D do not test for this linearity quantitatively, yet in their Figure 6 they depict the relationship in a scatterplot. At least visually in this figure, there does not appear to be any egregious nonlinearity important to account for. To be absolutely certain, A&D could have used an F-test to compare the goodness-of-fit of different functional forms. Though, looking at the scatterplot, I do not imagine that an F-test would have identified linearity as a highly suboptimal form.
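As a rough sketch of the kind of F-test mentioned above, the code below compares a linear fit against one that adds a squared term, using the standard nested-model statistic: the drop in the residual sum of squares per added parameter, divided by the unrestricted residual variance. It assumes homoskedastic, independent errors, so it ignores the weighting and state-level clustering A&D use. The data are synthetic and the function names are hypothetical.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(columns, y):
    """Residual sum of squares from OLS of y on the given regressor columns."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * columns[i][t] for i in range(k)) for t in range(len(y))]
    return sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))


def f_stat_quadratic(x, y):
    """F statistic for the null that the coefficient on x squared is zero."""
    n = len(y)
    ones = [1.0] * n
    x_sq = [xi ** 2 for xi in x]
    rss_restricted = rss([ones, x], y)
    rss_unrestricted = rss([ones, x, x_sq], y)
    q, k = 1, 3  # one restriction; three parameters in the unrestricted model
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / (n - k))


# Synthetic data: one truly linear relationship and one with curvature.
x = [float(i) for i in range(20)]
noise = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]
y_linear = [2.0 + 3.0 * xi + e for xi, e in zip(x, noise)]
y_curved = [yl + 0.5 * xi ** 2 for yl, xi in zip(y_linear, x)]

f_linear = f_stat_quadratic(x, y_linear)
f_curved = f_stat_quadratic(x, y_curved)
```

A large statistic, as for the curved series here, would argue for the richer functional form, while a small one would support retaining linearity.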
Of course, as mentioned above, the ideal experiment would fit and cross-validate a nonparametric hyper-spline to identify any and all nonlinearities. Were all assumptions known to be perfectly adhered to and were A&D’s results not subject to other errors, such an approach might have offered up marginal value. Yet, given how linear the relationship already appears, it may not be clear if any newly identified yet subtle nonlinearities reflect true effects or noise. Nor may it be clear if the captured nonlinearities provide meaningful new entry points to policy design or further research.
The functionality of the SNESO variable assumes that workers are consistently classified into occupations across the entire time period over which this variable is used. A&D prepared a special occupation definition set to ensure such, purporting that it operates consistently on data from between 1980 and 2005. (Of course, the prevalence of different occupations likely shifted over this time period, even if the way workers are classified into these occupations did not.) Though, A&D do not present any evidence that these occupations are in fact over-time-consistent. If for some reason they are not, then some of the measured shifts into and out of service and other occupations could be artificial, with potentially dramatic consequences for A&D’s conclusions. RSH and any other coefficients predicting SNESO changes could be biased in either direction, and the presented effects could even be of the wrong sign. Lefter and Sand (2011), analyzing A&A, in fact argued that the entire polarization phenomenon could be merely such an artifact. A&D purport that their occupation set accounts for Lefter and Sand’s (2011) concerns, and it seems A&D’s assertion is in good faith. (Preparing the new occupation set constituted a sizable fraction of Dorn’s (2009) Ph.D. dissertation.) Though, an analysis of the precise degree of success of this correction could be reassuring.
The functionality of the SNESO variable also assumes that the constitution of the non-college population did not meaningfully shift between measurements. A&D mention in their text that, as a simplification, their theoretic model excludes the possibility of skill upgrading. Yet it’s not clear to me that skill upgrading does not nonetheless influence their empirical measurements. In principle, SNESO increases could result not only from non-college workers shifting into service occupations, but also from non-college, non-service occupation workers shifting into college. In their online appendix, A&D even show that 1980 CZ RSH was statistically-significantly correlated with changes in workers’ education levels between 1980 and 2005. This said, in Table 5, A&D present results from a separate model in which they include changes in the ratio of college to non-college labor as a control variable, and this model shows RSH to be just as strong a predictor of SNESO increases as do models without the control variable. Hence, it seems that, even if this assumption does not perfectly hold, the consequences are relatively negligible.
The functionality of A&D’s IV depends on the assumption that it is not correlated with errors in CZ SNESO changes, A&D’s key outcome variable. As discussed above, the IV is intended to isolate the structural component of RSH variation away from any more-momentary, perhaps seasonal shifts into routine task occupations, which wouldn’t explain long-term SNESO changes. The IV is prepared using the historical (1950) national RSH of different industries, which A&D contend correspond with this structural component of present-day CZ RSH. Yet it is not clear to me that historical industrial RSH might not also be correlated with SNESO error, i.e. momentary worker shifts across occupations. Within a hypothetical industry whose historic occupations have been more routine in nature, could it not be easier for labor to seasonally shift into and out of routine-task jobs—perhaps as temporary labor—than in industries with historically fewer routine tasks available? In Section III.3 below, I test A&D’s model using a modified IV in which I remove the contribution of abstract labor to the IV’s composition. Because I imagine that temporary labor shifts occur less frequently across high-skill, abstract-labor occupations, I believe this modification could concentrate the IV’s formulation around industries that differentially support temporary labor shifts. This experiment results in some evidence that this modified IV indeed instruments the RSH coefficients less effectively. I believe this implies that neither the experimental IV, nor the original IV, which is composed of the same labor as the experimental IV plus a little more, completely jettison the ostensible, short-term, less-predictable component of the RSH-SNESO relationship. Hence, I am not positive that A&D’s IV is completely robust. (Also, I believe the fact the IV is prepared using 1950 data—twenty percent of the workers’ labor hours of which need to be imputed—generates even more of a question as to its robustness.) 
As such, it is not clear to me that A&D have captured purely the structural component of RSH’s association with SNESO changes. All this said, as discussed further below, the experimental IV does not necessarily show RSH to be poorly associated with SNESO. If anything, a more-robust IV might show the relationship to be even stronger than A&D currently contend.
My last critique of A&D is not a question whether an assumption holds, but an observation that the RSH variable definition does not seem perfectly straightforward to intuit. The variable presents the share of labor hours in each CZ in a given year in occupations that in 1980 were among the top labor-hour-weighted third of occupations according to their RTI indices. Of course this measure reflects something about the general routine-nature of the task requirements of the occupations in each CZ, regardless of the year for which RSH is calculated. But ranking CZs in a given year according to this measure might give different results than if RSH counted the occupations that, say, in 1990 were among the top labor-hour-weighted third. Or among the top 25 percent. It’s not that this index should change very much for a given occupation from year to year, presuming the occupations are defined consistently over time (which itself is an assumption as discussed above). A&D also test several variable definition choices in their online appendix to show that the statistical significance of the RSH-SNESO-change relationship does not equivocate with small changes to the RSH definition. But defined the way it is, how many occupations count toward the CZ’s RSH depends on the number of hours workers worked in each occupation in 1980. This arbitrariness makes it somewhat difficult to interpret the magnitudes of the presented relationships between RSH and SNESO changes. We can really only conclude that a reasonably-strong relationship exists between CZ RSH and CZ SNESO changes. (This said, as discussed above, because of the various possible biases affecting the coefficients, I believe precise magnitudinal conclusions should not be drawn anyway.)
All this said, I believe A&D make a meaningful contribution to the question about the extent to which shifts away from routine-task-intensive occupations could explain shifts into service occupations and the resulting polarization. Previously, A&A showed that, on a national level, occupations generally entailing routine work have experienced declines in employment and wages while service occupations have experienced increases. This raised the question whether these simultaneous effects have been linked mechanically through routine task substitution. Yes, data limitations including limited observation frequency, uncertainty around CZs’ independence, the potential for OVB, and the need to impute missing data which leads to attenuation bias preclude A&D from drawing precise conclusions about a causal link between RSH and SNESO changes and where RSH may rank among other possible polarization factors. Yes, we must trust A&D that their occupation classification is consistent across years, which hopefully the literature will at some point be able to validate. Yes, RSH coefficients may be difficult to interpret given the arbitrary nature of RSH’s definition. Yes, because IPUMS data does not trace individual workers longitudinally, A&D cannot draw conclusions about the specific worker trajectories characterizing polarization. And yes, presumably because the analysis would be of much greater scope, not to mention that worker intrinsic skill is very difficult to estimate, A&D do not evaluate to what extent computer and offshore labor substitution in high-RSH markets operates simultaneously to SBTC. Hence, I believe the full characterization of the sources of inequality growth, and of polarization more precisely, remains open for the literature to continue to explore.
Still, because of the progress A&D have made in examining polarization at the local-market level using a reasonably-credible instrument and set of controls, as I review further below, I believe A&D have been able to make a strong case that occupational routine-task intensity and polarization have been related non-coincidentally.
III. Replication of Specific Results
In this section I discuss each of A&D’s tables and figures and how they help validate A&D’s hypotheses discussed in Section II above. I attempt to replicate each of these tables and figures, and I make note of any discrepancies, attempting to explain their sources and their possible consequences.
1. Replication Challenges
First, in summary, there were five key challenges which I believe were responsible for differences between A&D’s and my replicated results. I discuss each of these challenges in this section. Later in the paper I discuss a number of sensitivity tests I ran in an effort to confirm that such challenges stemmed from differences in our raw data and not our approaches.
A. College / No College Classification
To determine whether an individual in the data had entered college, A&D used an IPUMS variable known as “EDUC99,” which categorized individuals based on their educational attainment levels at the time of the given Census or ACS. MPC has since removed this variable and replaced it with a different variable known as “EDUC.” EDUC similarly categorizes individuals, although it treats those who have completed “some college, but less than 1 year” slightly differently. EDUC99 classified these individuals as if they had completed “1-3 years of college.” EDUC instead includes them in the same category as those who have completed only 12th grade. Because it is not possible using EDUC to identify which individuals were previously classified as having entered college but are now classified as having completed only 12th grade, I leave them in the 12th grade category.
In some areas, this difference had a rather sizeable impact on the replicated results. One way to see how such a seemingly minor difference in the definition of a study population had a nontrivial effect on the replicated results is by comparing A&D’s Appendix Table 1 Panel A to a replicated version of it. Appendix Table 1 Panel A presents the levels of and changes in employment share by different occupation groups, and it does so among workers without a college education, making its replication subject to the effect of the new definition of having entered college. The “service occupations” category—a category, of course, that is central to the hypotheses A&D test—represents 12.90 percent of the 1980 non-college labor in A&D’s paper and 12.93 percent in the replicated results. In 2005, these fractions equal 19.8 percent and 18.9 percent, respectively. These differences are small; they’re what one might expect from a minor difference in the definition of having attended college. Yet the growth in share of service employment among non-college workers between 1980 and 2005 differs by a full 13.4 percent between A&D’s and the replicated results, as one can see in the table below. The differences of 0.3 percent in 1980 and -4.4 percent in 2005 compounded each other in this growth calculation.
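The compounding can be checked directly from the rounded shares quoted above. The sketch below treats growth as the percentage-point change in share between 1980 and 2005, which is an assumption about how the growth figure was computed. Because its inputs are rounded, it reproduces the quoted differences only approximately (roughly 0.2, -4.5, and -13.5 percent).

```python
def pct_diff(replicated, original):
    """Percent by which the replicated figure differs from the original."""
    return 100.0 * (replicated - original) / original


# Service-occupation shares of non-college employment (percent), as quoted.
ad_1980, ad_2005 = 12.90, 19.8      # A&D's paper
rep_1980, rep_2005 = 12.93, 18.9    # replicated results

level_gap_1980 = pct_diff(rep_1980, ad_1980)
level_gap_2005 = pct_diff(rep_2005, ad_2005)

# Small gaps in the two levels become a large gap in their difference.
growth_gap = pct_diff(rep_2005 - rep_1980, ad_2005 - ad_1980)
```

The two level gaps have opposite signs, so both shrink the replicated growth figure, which is why the growth gap is several times larger than either level gap.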
As discussed in Section II above, one of the weaknesses of the IPUMS database is that it requires imputing values missing from the Census and ACS surveys. Frequently missing values include the worker’s “weeks worked last year,” “usual hours worked per week,” and “hours worked last week,” each of which A&D use to calculate the worker’s annual number of labor hours. For both 1950 and 1970, labor hours cannot be directly calculated for a full twenty percent of the workers. For 1980, 1990, 2000, and 2005, labor hours cannot be directly calculated for less than five percent of the workers. If the annual number of labor hours cannot be directly calculated, A&D impute it as the average number of labor hours among workers with the same occupation and educational attainment level as the person with the missing value. However, as noted above, MPC changed how it reports educational attainment levels between when A&D downloaded the IPUMS data in 2007 and when I downloaded it in 2013, namely by replacing the EDUC99 variable with EDUC. In a few instances, as I discuss in more detail below, the resulting differences in imputed labor hours may be responsible for differences between A&D’s and my final results.
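To make the imputation rule concrete, here is a minimal sketch of cell-mean imputation by occupation and education. It is not A&D's code: the record layout, the field names (`occ`, `educ`, `hours`), and the use of an unweighted mean are all assumptions made for illustration.

```python
from collections import defaultdict


def impute_hours_by_cell(workers):
    """Fill missing annual hours with the mean of observed hours among workers
    in the same (occupation, education) cell. Returns new records."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for w in workers:
        if w["hours"] is not None:
            key = (w["occ"], w["educ"])
            totals[key] += w["hours"]
            counts[key] += 1

    filled = []
    for w in workers:
        key = (w["occ"], w["educ"])
        hours = w["hours"]
        if hours is None and counts[key] > 0:
            hours = totals[key] / counts[key]  # unweighted cell mean (assumption)
        filled.append({**w, "hours": hours})
    return filled


# Hypothetical records: the last worker's hours are missing.
workers = [
    {"occ": "cook", "educ": "12th grade", "hours": 1600.0},
    {"occ": "cook", "educ": "12th grade", "hours": 2000.0},
    {"occ": "cook", "educ": "some college", "hours": 1000.0},
    {"occ": "cook", "educ": "12th grade", "hours": None},
]
filled = impute_hours_by_cell(workers)

# Recode the "some college" worker into the 12th grade cell, mimicking how EDUC
# treats people with less than one year of college, and impute again.
recoded = [{**w, "educ": "12th grade"} for w in workers]
refilled = impute_hours_by_cell(recoded)

print(filled[3]["hours"], refilled[3]["hours"])
```

In this toy data, the same missing value is imputed differently once the education cells are redrawn, which is the channel through which the EDUC99-to-EDUC change could move the replicated labor hours.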
C. Inflation Estimates
I did not have access to A&D’s inflation adjustment figures. A&D state that they adjusted hourly wages for inflation using the Personal Consumption Expenditures deflator. I was able to make similar adjustments using Personal Consumption Expenditures Price Index (PCEPI) data published by the Federal Reserve Bank of St. Louis. However, it seems that my adjustments were not precisely the same as A&D’s. In Table 1 Panel A in Section III.2 below, one can see that the replicated annual average employment share figures appear to match A&D’s closely—to within 2 percent. In Panel B, the replicated growth in annual average wage figures also appear to match A&D’s relatively closely—though up to five percent, a larger difference than with Panel A.
These two panels are constructed using nearly identical calculations, with workers’ hourly wages representing the key additional piece of information in Panel B. (Panel A is constructed by summing labor hours for all individuals in each occupation category and then dividing by the total labor hours across all occupation categories. Panel B is constructed by summing the labor hours multiplied by the wage rates for all individuals in the occupation category and then dividing by the total labor hours in the category.) Since A&D’s raw worker wage data and mine are essentially the same, it seems the larger differences in Panel B must be due to differences in our inflation indices. (Note that these inflation index differences affect many of the replicated tables and figures, namely wherever wages are involved. However, the effect can be seen most clearly in Table 1 Panel B, given that this panel presents a basic summary of wages by year and occupational category.)
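The two panel calculations described above can be sketched as follows. The record layout and the function name `panel_a_b` are hypothetical, and the sketch treats each record's `hours` as already including any Census sampling weight.

```python
from collections import defaultdict


def panel_a_b(workers):
    """Return (employment share, hours-weighted mean wage) by occupation category."""
    hours_by_occ = defaultdict(float)
    wage_bill_by_occ = defaultdict(float)
    for w in workers:
        hours_by_occ[w["occ"]] += w["hours"]
        wage_bill_by_occ[w["occ"]] += w["hours"] * w["wage"]

    total_hours = sum(hours_by_occ.values())
    # Panel A: category hours over total hours across all categories.
    share = {occ: h / total_hours for occ, h in hours_by_occ.items()}
    # Panel B: sum of hours times wage over category hours.
    mean_wage = {occ: wage_bill_by_occ[occ] / hours_by_occ[occ] for occ in hours_by_occ}
    return share, mean_wage


# Hypothetical workers with real hourly wages.
workers = [
    {"occ": "service", "hours": 1000.0, "wage": 10.0},
    {"occ": "service", "hours": 500.0, "wage": 16.0},
    {"occ": "clerical", "hours": 1500.0, "wage": 14.0},
]
share, mean_wage = panel_a_b(workers)

# A 5 percent difference in the deflator rescales every real wage...
rescaled = [{**w, "wage": w["wage"] * 1.05} for w in workers]
share2, mean_wage2 = panel_a_b(rescaled)

# ...which moves Panel B but leaves Panel A untouched.
print(share, share2)
print(mean_wage, mean_wage2)
```

Because wages never enter the Panel A calculation, a deflator mismatch can only show up in Panel B, consistent with the pattern of replication gaps.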
I was able to somewhat calibrate my inflation adjustment figures to mitigate these differences. Before this calibration, the replicated average wages in Table 1 Panel B differed by up to ten percent from A&D’s. However, determining the correct calibration multiplier for the inflation figure for each year was not as simple as calculating the ratio between A&D’s and my final average wage results. Individual hourly wages are top-coded before they are averaged, and hence any wages that are already near to or surpass the top-code threshold are not subject to the calibration multiplier’s full effect. I used a trial-and-error method wherein I tested a hypothetical calibration multiplier and then examined its effect on the figures in Table 1 Panel B. Given that calculating a given instance of Table 1 Panel B required approximately a full hour of my computer’s CPU time, I did not have time to use this method to achieve precision to more than within approximately five percentage points.
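The sketch below illustrates why the ratio of average wages is not the right multiplier once wages are top-coded. It also swaps my manual trial-and-error for a bisection search, which is not what I actually ran. All names, the toy wages, and the top-code value are hypothetical, and the order of operations (rescale, then top-code, then average) is an assumption based on the description above.

```python
def mean_wage_after_topcode(wages, hours, multiplier, top_code):
    """Rescale each wage, cap it at the top-code, then take the hours-weighted mean."""
    total_hours = sum(hours)
    capped = (min(w * multiplier, top_code) * h for w, h in zip(wages, hours))
    return sum(capped) / total_hours


def calibrate_multiplier(wages, hours, target_mean, top_code, lo=0.5, hi=2.0, tol=1e-9):
    """Bisection on the multiplier; valid because the capped mean never
    decreases as the multiplier increases."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if mean_wage_after_topcode(wages, hours, mid, top_code) < target_mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


wages = [10.0, 20.0, 60.0]   # hypothetical hourly wages; the last exceeds the cap
hours = [1.0, 1.0, 1.0]
top_code = 50.0
target = 28.0                # hypothetical published mean to match

baseline = mean_wage_after_topcode(wages, hours, 1.0, top_code)
naive = target / baseline    # simple ratio of means
calibrated = calibrate_multiplier(wages, hours, target, top_code)

print(f"naive multiplier: {naive:.4f}, calibrated multiplier: {calibrated:.4f}")
```

In this toy case the capped wage absorbs none of the adjustment, so the calibrated multiplier is noticeably larger than the simple ratio of means.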
D. Top-Coding Uncertainty
As discussed in Section II above, individuals with sufficiently-high wages may be among few peers in their geographic areas, and hence their wages in principle could identify them. To protect these individuals’ identities, MPC “top-codes” each individual’s annual income to a maximum threshold value.
Some of the results below (notably, Figure 4) could be accurately replicated except at the high end of the wage distribution. A conceivable explanation is that MPC shifted its top income threshold after A&D originally downloaded the IPUMS data in 2007, similar to how MPC shifted the way it codes educational attainment levels. When I raised this with A&D, they concurred that this was a reasonable possibility.
E. Smoothing Parameters
Several of the figures below present smoothed results—that is, a spline representing an inferred probability density distribution for the underlying data. However, A&D do not discuss what specific smoothing algorithm, nor its parameters, they apply to arrive at these density distributions. To replicate such figures, I chose a locally-weighted scatter plot smoother (“LOWESS”) algorithm with a bandwidth parameter that allowed for a reasonable visual fit between my results and A&D’s. As such, my replicated graphs appear similar but not identical to A&D’s.
As an aside, it is worth noting that, ideally, A&D would have numerically optimized their spline parameters, perhaps using a technique such as n-fold cross-validation. In Figure A below, I replicate A&D’s Figure 4, but using three different LOWESS bandwidth parameters.
I constructed the blue distribution using a relatively narrow bandwidth parameter (0.15). Because this distribution seems to fluctuate quite frequently when moving across the occupational skill percentiles, it conceivably over-fits the data, representing some of the data’s random variation as if it were structural. The green distribution uses the bandwidth parameter I chose to best match A&D’s results. The red distribution uses an even wider bandwidth parameter (10).
Note that the density of the green distribution decreases mostly monotonically after reaching approximately the 30th skill percentile, yet it increases again starting at skill percentile 95. A&D refer to this increase at the upper end as a genuinely mechanical (i.e. not random) effect in the data. Yet when I increase the bandwidth parameter beyond that of the green line—i.e., to that of the red line—the effect disappears. Without knowing whether the bandwidth parameter for the green distribution is optimized, it is not clear to me if the effect it displays at the upper end of the occupational skill distribution is in fact nonrandom.
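For illustration, here is one way a bandwidth could be chosen by cross-validation rather than by eye. This is a sketch under stated assumptions, not the procedure used for any figure in this replication or in A&D's paper: it uses a plain tricube-weighted local linear smoother (LOWESS without the robustness iterations), leave-one-out cross-validation (n-fold with n equal to the number of points), and simulated data. All function names are hypothetical.

```python
import math
import random


def local_linear(x0, xs, ys, frac):
    """Tricube-weighted local linear fit at x0 using the nearest `frac` share of points."""
    n = len(xs)
    k = min(n, max(3, math.ceil(frac * n)))
    # Bandwidth: distance to the k-th nearest point, nudged up so it keeps a positive weight.
    h = sorted(abs(x - x0) for x in xs)[k - 1] * (1.0 + 1e-9) + 1e-12
    sw = swx = swy = swxx = swxy = 0.0
    for x, y in zip(xs, ys):
        dx = x - x0
        d = abs(dx) / h
        if d < 1.0:
            w = (1.0 - d ** 3) ** 3
            sw += w
            swx += w * dx
            swy += w * y
            swxx += w * dx * dx
            swxy += w * dx * y
    det = sw * swxx - swx * swx
    if det < 1e-12:                 # degenerate neighborhood: use the weighted mean
        return swy / sw
    # Intercept of the weighted line in (x - x0), i.e. the fitted value at x0.
    return (swy * swxx - swx * swxy) / det


def loo_mse(xs, ys, frac):
    """Leave-one-out mean squared prediction error for a given bandwidth fraction."""
    total = 0.0
    for i in range(len(xs)):
        pred = local_linear(xs[i], xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], frac)
        total += (ys[i] - pred) ** 2
    return total / len(xs)


# Simulated "routine share by skill percentile": a smooth hump plus noise.
random.seed(0)
xs = [float(p) for p in range(1, 100)]
ys = [0.3 + 0.3 * math.sin(math.pi * x / 100.0) + random.gauss(0.0, 0.05) for x in xs]

candidates = [0.05, 0.15, 0.3, 0.5, 0.8, 1.0]
cv = {frac: loo_mse(xs, ys, frac) for frac in candidates}
best = min(cv, key=cv.get)
print({frac: round(err, 5) for frac, err in cv.items()}, "best:", best)
```

A bandwidth chosen this way would give a principled basis for deciding whether a feature such as the upturn near the 95th percentile survives at the error-minimizing level of smoothing.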
F. Uncertainties in Author Approach
In a few instances, A&D’s results did not seem correct to me given what I was observing in the data. In most such cases, I attempted to construct the result in multiple ways to account for possible choices in how A&D could have prepared the result. I note the cases where, despite these attempts, I was still unable to obtain results that matched A&D’s.
2. Replication Results for Each Table and Figure
A. Figure 1
Figure 1 presents graphical evidence for the polarization of U.S. employment and wages between 1980 and 2005. A&A presented a similar figure and hence Figure 1 is not new to the literature. However, Figure 1 reaffirms polarization’s existence (at least to the extent allowed for by the assumptions discussed in Section II above) and shows that A&D’s data in fact capture it.
In both Figure 1 panels, the x-axis represents a proxy for occupational skill level: the rank-order of average 1980 wages within occupations. In panel A, we can see that employment share growth was greatest among occupations with both very low and very high average 1980 wages; the occupations with mid-level average wages experienced less employment share growth, with a minimum at around the 30th percentile. Visually, the replicated Figure 1 seems to match A&D’s relatively closely, with some small differences I believe due to imputation and choice of smoothing algorithm and parameters.
Panel B shows changes in wages over this same time interval. Like employment shares, wages have also exhibited polarization, with the greatest growth at the high and low ends of the occupational skill distribution and weaker growth in the middle.
My replicated Figure 1 Panel B is similar to A&D’s, although it does not match as closely as my replicated Panel A. Replicated Panel B does demonstrate wage polarization, which is the key result A&D seek to demonstrate with it. However, overall wage growth across the occupational skill distribution appears larger than in A&D’s version. I believe this is due to imprecision in both my 1980 and 2005 inflation figures, as discussed above.
Also, the minimum wage growth in the replicated figure occurs at around the 40th occupational skill percentile, while it occurs at around the 50th percentile in A&D’s figure. This may be a result of differences in choice of smoothing algorithm and parameters.
B. Table 1
Table 1 presents another purely descriptive, yet quantified, view of employment and wages over time—this time by occupation category. Of particular interest, we can see that between 1980 and 2005, within the “managers / professionals / technicians / finance / public safety” category, employment share rose by 29 percent and average wages rose by 31 percent. Similarly, within the “service occupations” category, employment share rose by 30 percent and wages rose by 16 percent. Although A&D do not directly show how these occupational categories fit onto the skill distribution in Figure 1, they claim that these categories’ stronger employment share and wage growth coincide with the stronger employment share and wage growth at the high and low ends of the occupational skill distribution. (A more granular presentation of the concurrence of service occupation employment share and wage growth and polarization is presented in Figure 2 below.)
As mentioned in Section II above, the classification originally appeared in Dorn (2009). Dorn’s key challenge was to consistently classify the occupations despite differences in how they were measured across the various Censuses and the ACS. Dorn supplied me with a bridge to match the occupations in each year of the IPUMS data with his classification scheme. Despite the unknowns expressed above regarding the integrity of this scheme, I ran with the assumption that it is of high integrity, and I did not attempt to recreate it myself. However, also as mentioned above, the integrity of this file is a central assumption upon which the validity of A&D’s results depends. Incorrect occupation classifications could mean that the employment share changes within each of the Table 1 categories are artificial, even evolving in the wrong directions. Were A&D to release a robustness analysis of the classification scheme, it would be reassuring.
In Table 1 Panel A, the replicated employment share figures seem to closely match A&D’s. For years 1980 through 2005, the replicated figures typically differ from A&D’s by less than one percent, and I believe the differences are due to small differences in how missing labor hours were imputed, as described above. The 1950 and 1970 replicated figures also match closely, although sometimes not as closely as those for 1980 through 2005. I believe this is because labor hour data from the 1950 and 1970 censuses are missing more frequently (for a full twenty percent of the sample, as mentioned above) and require more-frequent imputation.
In Panel B, the replicated wages typically match reasonably closely, although, as in Figure 1, not as closely as those in Panel A. I believe this is the result of imputation differences, but now compounded by differences between A&D’s and my inflation data. As discussed above, I was somewhat able to calibrate my inflation data by repeatedly adjusting it and then monitoring the effect on the figures in this panel. Before this calibration, replicated Panel B figures typically differed from A&D’s by approximately a full 10 percent.
Also note that the differences in wage growth, both between 1950 and 1980 and between 1980 and 2005, are greater than the differences for the wage figures themselves. As with Appendix Table 1, this is because the growth figures by definition are calculated using two different wage figures (for example, one from 1980 and one from 2005), whose replication differences compound.
A&D Notes: “Sample includes persons who were age 18-64 and working in the prior year. Hourly wages are defined as yearly wage and salary income divided by the product of weeks worked times usual weekly hours. Employment share is defined as share in total work hours. Labor supply is measured as weeks worked times usual weekly hours in prior year. All calculations use labor supply weights.” (p. 1556)
C. Figure 2
Figure 2 adds a bit of color to A&A’s observation that polarization at the low-end of the occupational skill distribution was largely driven by employment shifts into service occupations. Figure 2 re-depicts Figure 1, and then superimposes a counterfactual scenario in which employment and wages in service occupations are held at their 1980 levels.
A&D Notes: “The counterfactual in panel A is constructed by pooling ACS data from 2005 with Census data from 1980 and estimating a weighted logit model for the odds that an observation is drawn from 1980 Census sample (relative to the actual sampling year), using as predictors a service occupation dummy and an intercept. Weights used are the product of Census sampling weights and annual hours of labor supply. Observations in 2005 are reweighted using the estimated odds multiplied by the hours-weighted Census sampling weight, effectively weighting downward the frequency of service occupations in 2005 to their 1980 level. Given the absence of other covariates in the model, the extra probability mass is implicitly allocated uniformly over the remainder of the distribution. We calculate the counterfactual change in service occupation wage in panel B by assigning to each service occupation in 2005 its 1980 real log wage level.” (p. 1557)
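The reweighting in the note above can be sketched without fitting a logit numerically. With only an intercept and a service dummy the model is saturated, so the fitted odds of being drawn from 1980 equal, within each group, the ratio of 1980 weighted mass to 2005 weighted mass. The function name and toy weights below are hypothetical, and each weight stands for the product of the sampling weight and annual hours.

```python
def counterfactual_weights(obs_1980, obs_2005):
    """Reweight 2005 observations by the odds of being drawn from 1980.

    Each observation is (is_service, weight). Because the only predictors are an
    intercept and a service dummy, the fitted odds reduce to group mass ratios.
    """
    def mass(obs):
        totals = {True: 0.0, False: 0.0}
        for is_service, weight in obs:
            totals[is_service] += weight
        return totals

    m80, m05 = mass(obs_1980), mass(obs_2005)
    odds = {group: m80[group] / m05[group] for group in (True, False)}
    return [(is_service, weight * odds[is_service]) for is_service, weight in obs_2005]


def service_share(obs):
    total = sum(weight for _, weight in obs)
    return sum(weight for is_service, weight in obs if is_service) / total


# Hypothetical data: service work is 10 percent of hours in 1980 and 20 percent in 2005.
obs_1980 = [(True, 10.0), (False, 90.0)]
obs_2005 = [(True, 8.0), (True, 12.0), (False, 50.0), (False, 30.0)]

reweighted_2005 = counterfactual_weights(obs_1980, obs_2005)
print(service_share(obs_2005), service_share(reweighted_2005))
```

Every non-service observation is scaled by the same factor, which is the sense in which the extra probability mass is spread uniformly over the rest of the distribution.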
Visually it appears that employment share growth among service occupations indeed accounted for nearly all of the employment polarization at the low-end of the occupational skill distribution between 1980 and 2005. None of the non-service occupations below approximately the 40th skill percentile experienced any employment share growth over this period.
Interestingly, service occupations did not account for the entirety of the wage polarization over this same period, as the wage counterfactual shows how certain non-service occupations at the low-end of the occupational skill distribution experienced more wage growth than certain occupations in the middle. A&D primarily test for the different factors contributing to increases in service occupation employment. Yet it could be an interesting question for future research why some low-skill non-service occupations experienced more wage growth than some medium-skill non-service occupations. As far as I know, A&D newly uncovered this phenomenon; it did not surface in A&A’s analysis.
Figure 2 replicated just about as well as Figure 1 did. The non-counterfactual plots in the figures are identical to those in Figure 1, and the counterfactual plots seem to depart from the non-counterfactual plots in roughly the same places and ways as they do in A&D’s figure. I believe differences in how labor hours are imputed again create small dissimilarities.
D. Figure 3
Figure 3 presents an alternate view of how growth among service occupations contributed to polarization. It depicts the same phenomenon that Figure 2 depicts, yet separated by decade, within only the lowest occupational skill quintile, and with the various occupations clubbed into service and non-service groups.
We can see that service occupations experienced employment share growth beginning in 1980 while non-service occupations did not. In the 1980s, employment in this quintile declined on net, which A&D note is a decline that authors such as AKK previously observed. By 1990, employment share growth among service occupations was sufficiently pronounced to create positive net employment growth in the whole quintile. This is the employment polarization that is also visible in Figures 1 and 2.
Directionally, the replicated Figure 3 more or less matches A&D’s, although magnitudinally there are a few deviations I believe resulting from imputation differences. Of particular note is that the non-service occupations showed slight growth between 1970 and 1980 in the replicated figure, where they showed declines in A&D’s figure. I believe this is because, as noted above, data were more frequently missing from the 1970 Census than from the others, and hence its values must be imputed more frequently. Fortunately, this discrepancy does not impact most of A&D’s more-detailed analyses below—those in Table 5, for example, which constitute A&D’s central results—given that they generally evaluate phenomena beginning in 1980.
Because Figures 1 and 2 offer up evidence that polarization and the employment and wage increases in service occupations have largely been the same phenomenon, A&D choose to focus many of their remaining analyses (particularly in Tables 4 through 6) on identifying factors responsible for labor shifts into service occupations.
E. Table 2
Table 2 presents the first set of results that begin to explore A&D’s central hypothesis, that employment and wage polarization since 1980 has been a function of the routine-task requirements of different occupations. This exploration draws upon the RTI index prepared as discussed in Section II above.
Table 2 presents the same occupation categories as Table 1, and shows whether the labor-hour-weighted RTI index for each category falls above (“+”) or below (“-”) the average across all the categories. The table similarly presents the relative magnitudes of ALM’s routine, manual, and abstract requirement scores.
Although not ascertainable from the table itself, A&D state in their text that the table’s occupation categories are arranged roughly according to skill level, with the “managers / professionals / technicians / finance / public safety” category requiring the highest-level skills on average, and “service occupations” requiring the lowest. We can see that the RTI index is below average (“-”) for both of these book-end categories, hinting that routine-task intensity is in fact related to the wage polarization at the high and low ends of the occupational skill distribution.
I was able to replicate most of this table exactly. One difference is that, in my analysis, the average RTI index within the “production / craft” occupation category is slightly below the average across all of the occupations and hence I report a “-” sign for it instead of a “+.” However, the index was close enough to the cross-category average that, presumably, were it not for differences in how missing labor hour data was imputed, the index could have ended up slightly above average, changing this reported sign to match A&D’s.
Another difference is in the task category—abstract, routine, or manual—that is most prominent (i.e., has the highest reported score) within each occupation category. The gray shading in the table denotes this. I found that routine-task requirements were more prominent among the “transportation / construction / mechanics / mining / farm” and “service occupations” categories than manual task requirements, although A&D’s analysis shows that manual task requirements were more prominent. I attempted to calculate this result in several ways:
- I weighted the task category score by the total number of 1980 labor hours in each occupational category before taking the average within each occupational category
- I took the straight average of the task category score across all the occupations in each occupational category
- I took the natural log of the task category score for each individual in the occupational category before averaging the scores, given the RTI index is also calculated by first taking the natural log of each score
- I removed farm-related occupations from the calculation, given that, for certain other analyses in the paper, A&D remove these occupations due to poorly reported data among them
None of these calculation attempts showed manual tasks as more prominent among the “transportation / construction / mechanics / mining / farm” or “service occupations” categories. It is not clear to me exactly how A&D arrive at this result. I did notice that these two occupation categories showed more manual task prominence than any of the other occupation categories, and perhaps this result was mistaken for what A&D assert in the paper, that manual task requirements were more prominent than other task requirements among these occupation categories. Of course, it is also entirely possible that some other reasonable transformation that did not occur to me to attempt would have generated results that match A&D’s.
Fortunately, I don’t believe these differences impact the viability of A&D’s hypotheses. Most central is that “managers / professionals / technicians / finance / public safety” and “service occupations” exhibit lower-than-average RTI indices, and these results replicated just fine. Also, regardless, A&D’s hypotheses are tested more rigorously with the econometric models discussed in the sections for Tables 3 through 7 below.
A&D Notes: “The table indicates whether the average task value in occupation group is larger (+) or smaller (-) than the task average across all occupations. Shaded fields indicate the largest task value for each occupation group.” (p. 1571)
F. Figure 4
Figure 4 takes the analysis of routine-task intensity a step further and shows how the routine-task requirements of an occupation vary with the occupation’s required skill level. In other words, it presents a version of the first column of Table 2, but showing individual occupations ordered by skill instead of within occupation groups, and showing a quantitative measure of the occupation’s routine-task requirements instead of whether the occupation’s RTI index falls above (“+”) or below (“-”) average.
To prepare the quantitative measure, A&D first determine whether the occupation fell into the top 1980-labor-hour-weighted third of all occupations according to its RTI index value; i.e., whether the occupation is RI, as discussed in Section II above. A&D then infer a probability distribution that each occupation is RI as a function of the occupation’s required skill level. (Skill levels are approximated in the same way as in Figure 1, according to the rank-order of average wages within the occupations.) Figure 4 plots this probability distribution.
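The classification step can be sketched as follows. The RTI formula is written the way A&D define it in their paper (log routine score minus log manual score minus log abstract score). The occupation names, task scores, hours, and the treatment of an occupation that straddles the one-third boundary are hypothetical choices for illustration.

```python
import math


def rti(routine, manual, abstract):
    """Routine task intensity: ln(routine) - ln(manual) - ln(abstract)."""
    return math.log(routine) - math.log(manual) - math.log(abstract)


def routine_intensive(occupations):
    """Return the occupations in the top 1980-labor-hour-weighted third by RTI.

    `occupations` maps name -> (rti_value, hours_1980). An occupation that
    straddles the one-third boundary is counted as RI (an assumption).
    """
    total_hours = sum(hours for _, hours in occupations.values())
    ranked = sorted(occupations.items(), key=lambda kv: kv[1][0], reverse=True)
    ri, cumulative = set(), 0.0
    for name, (_, hours) in ranked:
        if cumulative >= total_hours / 3.0:
            break
        ri.add(name)
        cumulative += hours
    return ri


def rsh(cz_hours, ri):
    """Share of a CZ's labor hours that fall in RI occupations."""
    return sum(h for occ, h in cz_hours.items() if occ in ri) / sum(cz_hours.values())


# Hypothetical task scores (routine, manual, abstract) and 1980 hours.
scores = {
    "clerk": (6.0, 1.0, 2.0),
    "machine operator": (7.0, 2.0, 1.5),
    "janitor": (2.0, 4.0, 1.0),
    "truck driver": (2.0, 6.0, 1.0),
    "manager": (2.0, 1.0, 8.0),
    "teacher": (1.5, 1.5, 6.0),
}
occupations = {name: (rti(*s), 100.0) for name, s in scores.items()}
ri = routine_intensive(occupations)

# Hypothetical labor hours by occupation in one commuting zone.
cz_hours = {"clerk": 30.0, "machine operator": 10.0, "janitor": 40.0, "manager": 20.0}
print(sorted(ri), rsh(cz_hours, ri))
```

The same RI flag feeds both the routine occupation share plotted in Figure 4 and the CZ-level RSH measure used in the later tables.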
We can see that this inferred probability that an occupation is RI, which A&D refer to as “routine occupation share,” is lower at both the low and high ends of the occupational skill distribution than in the middle, showing how RTI is in fact at least associated with the polarization of employment and wages.
I attempted to replicate Figure 4 in two ways. First, I rank-ordered the average wages among A&D’s cross-decade-consistent set of occupations described above. The resulting graph, in the middle of the three below, is of a similar shape to A&D’s, with lower routine occupation share at the skill-level distribution boundaries than in the middle. However, the figure’s routine occupation share peaks at around the 20th skill percentile rather than around the 30th, as A&D’s does. Also, the magnitude of the replicated peak is roughly 60 percent, where A&D’s is roughly 65 percent.
In a second replication attempt, I rank-ordered the average wages among the 1980 (non-cross-decade-consistent) set of occupations from the raw IPUMS data itself. This figure appears visually even more similar to A&D’s, with its routine occupation share peaking around the 30th skill percentile and at a magnitude of roughly 65 percent.
Based on how closely this figure matches A&D’s, I have a hunch that A&D actually prepared Figure 4 by rank-ordering the 1980 occupations rather than their cross-decade-consistent set of occupations. The choice is somewhat arbitrary, as the key takeaway from this figure is that the routine occupation share peaks at neither of the occupational skill distribution boundaries but in the middle of the distribution, and both ways of ranking occupations demonstrate such behavior. However, because other analyses in A&D’s paper make use of the cross-decade-consistent set of occupations, this figure is slightly inconsistent with them.
A difference I wasn’t able to resolve is that A&D’s Figure 4 shows a local minimum in the routine occupation share at around the 80th skill percentile, while mine shows a minimum at around the 95th skill percentile. As discussed above, I believe this is because MPC somewhat altered how they top-code wages after A&D downloaded the IPUMS data in 2007. If the top-code income threshold is in fact lower in my data than in A&D’s, then such an effect is conceivable. Very-high-wage individuals who were isolatable in A&D’s data would now be blended with individuals whose wages are not quite as high. Hence, effects previously isolated to within the 80th to 90th wage percentile domain, such as the local minimum in routine occupation share, would now be spread across the entire 80th to 100th wage percentile domain.
G. Figure 5
Figure 5 ties the analysis of the association between routine-task requirements and polarization back to how Figure 1 depicts polarization. Figure 5 re-depicts Figure 1, but after separating the occupations’ labor hours into those in CZs with either above- or below-average RSH in 1980. (This is A&D’s first figure or table to make use of the CZ formulation.)
In Figure 5 Panel A we can see that employment polarization has in fact been more pronounced among the CZs with above average RSH than in those with below average RSH. In Panel B we see the same for wage polarization. This implies that A&D’s deeper exploration in the subsequent tables of polarization as a function of RSH is warranted. Also, Figure 5 in itself presents evidence helping validate hypotheses (2) and (3) mentioned in Section II above.
Interestingly, though, it also appears that both employment and wage polarization still manifested—not insignificantly—among the CZs with less RSH. A&D do not assert that RSH is the only factor responsible for polarization, and in Tables 5 and 6 as discussed below, A&D present a more rigorous analysis controlling for a number of additional possible factors. The fact that additional factors may have been at work seems visible here in Figure 5.
As with Figure 2, Figure 5 replicated just about as well as Figure 1 did. The overall departures from A&D are similar to those in Figure 1 (most noticeable in Panel B), which is expected, because the high and low RSH plots should average to the overall Figure 1 plot. In Panel B, the high RSH plot separates from the low RSH plot at a lower skill percentile than in A&D’s. However, because the high and low plots follow very similar trajectories between the 50th and 70th percentiles, just a small difference in the trajectory of one of the plots means that the two plots intersect at very different points. The qualitative conclusion from the figures, that above-average-RSH CZs experienced more polarization than below-average-RSH CZs, is discernible in the replicated figures just as well as in A&D’s.
H. Table 3
Table 3 presents evidence for A&D’s hypothesis (1), that “markets that were historically specialized in routine task-intensive industries should differentially adopt computer technology and displace workers from routine-task-intensive occupations.” Panel A uses a set of basic OLS models to predict changes in the number of PCs per employee in local markets as a function of the market’s RSH:
$$\Delta PC_{jst} = \delta_t + \beta_0 \times RSH_{jst_0} + \gamma_s + e_{jst}$$
where $\Delta PC_{jst}$ is the change in PCs per employee in CZ $j$ in state $s$ across time interval $t$, $RSH_{jst_0}$ is the labor-hour-weighted percent of occupations in that CZ that are RI at the start of the time interval, $\delta_t$ is a time interval fixed-effect coefficient, $\gamma_s$ is a state fixed-effect coefficient, and $e_{jst}$ is the prediction error.
Tolbert and Sizer (1996) developed the “commuting zone” (CZ) level of tabulation, as discussed above. To my understanding, Tolbert and Sizer provided Autor and Dorn with a bridge file matching their CZs to the geographic entities in the IPUMS data (State Economic Areas for 1950, counties for 1970 and 1980, and Census Public Use Microdata Areas for 1990, 2000, and 2005). In replicating Autor and Dorn’s paper, I too used this bridge and did not re-create it, assuming Tolbert and Sizer’s analysis to be of high integrity.
***Significant at the 1 percent level. **Significant at the 5 percent level. *Significant at the 10 percent level. A&D Notes: “N = 675, N=660, and N=1,335 in the three columns of panel A, and N=2,166 (3 time periods x 722 commuting zones) in panel B. Adjusted number of PCs per employee is based on firm-level data on PC use which is purged of industry-establishment size fixed effects (Doms and Lewis 2006). The PC variable is unavailable for a small number of computing zones that account for less than 1 percent of total US population. All models include an intercept, state dummies, and in multi-period models, time dummies. Robust standard errors in parentheses are clustered on state. Models are weighted by start of period commuting zone share of national population.” (p. 1576)
Per personal communication with A&D, with both Table 3 and the further tables below, A&D typically analyze the 722 CZs that cover the contiguous 48 United States. A&D exclude Alaska and Hawaii because a bridge between CZs and the 1950 State Economic Areas in Alaska and Hawaii is not available.
Beaudry, Doms, and Lewis provided the PCs-per-employee data, taken from a study (2010) in which they measured the number of PCs per employee in each CZ in each decade by surveying private sector firms. A&D exclude several CZs without PCs per employee data from the Panel A models, though A&D claim that these CZs represent less than one percent of the U.S. population.
Because the models incorporate state fixed-effect variables, only the within-state component of the cross-CZ variation is measured. And because the models incorporate time interval fixed-effect variables, this variation is only measured within each given decade, not more broadly across decades.
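The role of the state fixed effects can be illustrated with a small within-state estimator. This is a sketch, not the estimation code behind Table 3: it covers a single time interval (so there are no time dummies), omits the clustered standard errors, and uses made-up data. It relies on the standard result that weighted least squares with a full set of state dummies gives the same slope as regressing the outcome on the regressor after subtracting weighted state means from both.

```python
from collections import defaultdict


def within_state_slope(rows):
    """Slope on x from weighted least squares of y on x plus state dummies.

    `rows` holds (state, x, y, weight). Both x and y are demeaned by their
    weighted state means, so only within-state variation identifies the slope.
    """
    sw = defaultdict(float)
    swx = defaultdict(float)
    swy = defaultdict(float)
    for state, x, y, w in rows:
        sw[state] += w
        swx[state] += w * x
        swy[state] += w * y

    num = den = 0.0
    for state, x, y, w in rows:
        x_dev = x - swx[state] / sw[state]
        y_dev = y - swy[state] / sw[state]
        num += w * x_dev * y_dev
        den += w * x_dev * x_dev
    return num / den


# Hypothetical CZs: y = state effect + 0.5 * x, with population-share weights.
state_effect = {"A": 0.02, "B": 0.10}
cz = [("A", 0.25, 0.3), ("A", 0.30, 0.1), ("A", 0.35, 0.1),
      ("B", 0.28, 0.2), ("B", 0.36, 0.2), ("B", 0.40, 0.1)]
rows = [(s, x, state_effect[s] + 0.5 * x, w) for s, x, w in cz]

beta = within_state_slope(rows)
print(round(beta, 6))
```

The differing state intercepts drop out entirely, which is why the coefficients in Table 3 reflect only how CZs differ from other CZs in the same state.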
Note that the Table 3 analyses do not attempt to isolate the effect of the possible longer-term, more-structural component of the RSH in a CZ. As discussed in Section II above, A&D believe that a CZ’s RSH may have both a quasi-stationary longer-term, and a more-volatile shorter-term, component to it. In their subsequent analyses below, A&D use an instrumental variable to attempt to isolate the effect of this longer-term component. However, for Table 3, they measure the overall association between total start-of-decade CZ RSH and the outcome variables. It is not clear to me why they chose not to isolate the longer-term impact of RSH as in the below analyses, though granted, there is merit in seeing that these effects were in fact associated with overall RSH at the start of each decade, regardless of how much of this RSH was quasi-stationary and how much was driven by shorter-term fluctuations.
In Panel A, we see that, between 1980 and 2000, with every start-of-decade percentage point of RSH in a CZ, the change in the number of PCs per employee in the CZ increased by a statistically significant amount over the course of the decade.
I was able to replicate Panel A very closely, with the coefficient for each model matching A&D’s to within less than one percent. Again, I believe the differences are due to differences in how missing data was imputed in my dataset compared with A&D’s.
Panel B presents a similar set of models, yet measuring changes in the very RSH percentage across the time interval:
With Panel B we see that, over the same time interval as Panel A, CZs with greater start-of-decade RSH experienced greater declines in this very RSH over the course of the decade. Panel B also shows that this decline occurred both among college workers and non-college workers—but most prominently among non-college workers. Since A&D are particularly interested in how lower-education workers have responded to changes in the labor market, this result shows that there is indeed an effect worth exploring among specifically lower-education, non-college workers.
I was also able to match Panel B relatively closely, although not as closely as Panel A. The replicated RSH coefficient for college workers differed by 7.8 percent from A&D’s, for example. This may also seem small, but on the national level, it is much larger than the less than 1 percent within which Panel A’s coefficients were able to be replicated. (I point this out because this effect grows even larger in the subsequent tables that analyze non-college workers exclusively.) As discussed above, how college and non-college workers are defined differs between my dataset and A&D’s, which could result in different outcomes for any model applied specifically to either of these populations. Despite these differences, the replicated model does lead to the same qualitative conclusions as A&D’s: that, between 1980 and 2000, local markets with greater RSH experienced greater declines in this RSH, especially among non-college workers.
I. Figure 6
The remainder of A&D’s figures and tables present results specifically for the non-college worker population subset, their key subset of interest. For this population, Figure 6 presents initial evidence that CZs’ changes in service occupation employment between 1980 and 2005 were associated with the initial RSH in the CZ. A&D’s hypothesis (4) states that “Markets that were historically specialized in routine task-intensive industries should differentially experience larger net inflows of workers with both very high and very low education, driven by rising demand for both abstract labor in goods production and manual labor in service production.” This figure specifically shows how workers with low educational attainment (i.e., non-college workers) in local markets with varying degrees of RSH have moved into service production.
The evidence is presented as a pair of scatter plots, with a data point for each CZ weighted by the CZ’s total 1980 population. The x-axes represent 1980 RSH, and the y-axes represent SNESO changes between 1980 and 2005. A&D also estimate an OLS regression for each plot, shown in the figure. Panel A is constructed using all CZs; Panel B is constructed using only the CZs with at least 750,000 residents in 1980. The effect appears strongest among these larger CZs, with an RSH coefficient of 0.495 compared with 0.336 overall.
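The population weighting described above can be sketched as a weighted least-squares fit. The snippet below is an illustration only: the function `weighted_ols` and the toy numbers are hypothetical, not A&D’s data, and it omits everything else in their specification.

```python
import numpy as np

def weighted_ols(x, y, w):
    """Return (intercept, slope) minimizing sum_i w_i * (y_i - a - b * x_i)**2."""
    x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into ordinary least squares
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1]

# Hypothetical toy data: three "CZs" whose fitted line depends on the weights.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 0.0])
equal_pop = np.array([1.0, 1.0, 1.0])
skewed_pop = np.array([1e6, 1e6, 1.0])   # first two units dominate

_, slope_equal = weighted_ols(x, y, equal_pop)
_, slope_skewed = weighted_ols(x, y, skewed_pop)
```

With equal weights the fitted slope is flat, while with weights concentrated on the first two points the line runs almost exactly through them. This is why restricting the sample to large CZs, or weighting by population, can change the estimated RSH coefficient.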
Visually, both the x-coordinates and the y-coordinates of the CZs in the replicated panels appear to match A&D’s reasonably closely. The x-coordinates in particular, because they are constructed using the CZs’ entire populations and not merely their non-college subpopulations, are relatively unaffected by the difference in the non-college worker definition. Slight differences remain, as in the other tables and figures, because the definition change affects the labor hour imputation process.
Also note that the data available to me did not contain the names of the largest cities in each CZ, and hence the replicated Panel B does not display these names.
J. Table 4
Table 4 builds on the findings of Figure 6 and presents the change in SNESO in response to the initial RSH in a local labor market, this time disaggregated by time interval. In every time interval except between 1950 and 1970, RSH in a CZ predicts SNESO growth in that CZ. This evidence helps further validate A&D’s hypothesis (4). (A&D believe that, in 1950, workers moving from farm occupations into other routine-task-intensive occupations dominated the non-college labor market dynamics.)
As with Table 3, Table 4’s results are derived from an OLS model:
Directionally, the replicated results mostly match A&D’s, though the coefficient magnitudes sometimes differ rather sizably. Between 1970 and 1980, for example, A&D find a 0.042 RSH coefficient. The replicated coefficient is 0.034, 19 percent smaller. I believe this difference results from the same effect creating differences in Table 3 Panel B. A seemingly minor difference in the definition of the non-college population ends up having a sizable impact at the local-market level, and calculating employment share growth across the decade compounds the effect. And because the impact is not necessarily uniform across the CZs, it affects the magnitude of the measured relationship between RSH and SNESO changes.
Also notable is that while A&D’s RSH coefficient in the 1970-1980 model is not statistically significant, the replicated coefficient is significant at the 10-percent level. A&D conclude that RSH only weakly influenced non-college shifts into service occupations before the 1980s, if at all. The replicated result makes it appear that the effect may have begun as early as 1970, even if with humble strength, at least under the replication’s different definition of a non-college worker. Figure 3 does show that service occupations on net contracted nationally in the 1970s; though perhaps, on a CZ-level, certain CZs had begun their service-occupation expansion this early.
***Significant at the 1 percent level. **Significant at the 5 percent level. *Significant at the 10 percent level. A&D Notes: “Dependent variable: 10 x annual change in share of non-college employment in service occupations. N=722 commuting zones. Robust standard errors in parentheses are clustered on state. All models include state dummies and are weighted by start of period commuting zone share of national population.” (p. 1579)
K. Table 5
Table 5 strengthens the robustness of Table 4’s results in two key ways. First, it includes a number of control variables whose absence in principle could have resulted in OVB in the Table 4 results:
- Variables approximating changes in supply and demand for service labor: the ratio of college to non-college workers in the CZ, and the immigrant share of the CZ’s non-college population.
- Variables approximating labor demand: The share of workers in manufacturing industries in the CZ, and the CZ’s unemployment rate.
- Variables approximating demand for services produced by service labor: Employed females as a fraction of the total population, and senior citizens as a fraction of the total population.
- A variable approximating the effect of the minimum wage on the availability of service labor: the fraction of non-college workers earning a wage below whatever minimum wage was eventually legislated in the subsequent decade.
(I also summarized these control variables in Section II above.)
Each variable’s start-of-period level in Panels A and B, as well as its change over the course of the measured time interval in Panel C, is tested as a control, both alone in models (1) through (6), and with all of the variables together in model (7). Even in these models that include these controls, A&D find that RSH remains a strong predictor of SNESO increases. In particular, model (7), which includes all control variables, continues to show a strong and significant RSH coefficient.
Table 5 Panels B and C use 2SLS models with an IV to help isolate the influence of the long-run, quasi-stationary component of RSH on SNESO increases. A&D assert that unobserved short-run characteristics of the CZ could influence its RSH, as discussed above. For example, workers might shift temporarily into routine-task occupations as a function of seasonal shifts in labor demand. The IV is intended to ensure that only the long-run influence of the CZ’s RSH on SNESO changes is measured. How A&D prepare the IV is discussed above.
Table 5’s 2SLS models using this IV present evidence that the long-run, quasi-stationary component of a CZ’s RSH is an even stronger predictor of increases in SNESO than the CZ’s net RSH. This implies that workers in fact may move temporarily into routine-task occupations with little impact on SNESO, and hence that the non-instrumented measured relationship between RSH and SNESO changes in fact may not reflect the full strength of the relationship between a CZ’s long-term/structural RSH and SNESO changes.
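One stylized way to see why an instrumented coefficient can exceed the non-instrumented one is to simulate a regressor made up of a long-run component plus transitory noise, where only the long-run component affects the outcome. The sketch below uses synthetic data and hypothetical names throughout (`z`, `long_run`, `transitory`, `tsls_slope`); it illustrates the 2SLS mechanics under these assumptions and is not a reconstruction of A&D’s models.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

z = rng.normal(size=n)                                # instrument, e.g. a historical industry mix
long_run = 0.8 * z + rng.normal(scale=0.5, size=n)    # structural component of the regressor
transitory = rng.normal(scale=1.0, size=n)            # short-run fluctuations, unrelated to z
x = long_run + transitory                             # observed regressor (RSH-like)
y = 0.5 * long_run + rng.normal(scale=0.5, size=n)    # outcome responds to long-run part only

def ols_slope(x, y):
    """Slope from a regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

def tsls_slope(x, y, z):
    """Two-stage least squares with one instrument: regress x on z, then y on fitted x."""
    Z = np.column_stack([np.ones_like(z), z])
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]  # first stage
    return ols_slope(x_hat, y)                        # second stage

beta_ols = ols_slope(x, y)
beta_iv = tsls_slope(x, y, z)
```

In this setup the OLS slope is pulled toward zero by the transitory component, while the 2SLS slope lands near the true long-run effect of 0.5. Whether A&D’s instrument satisfies the corresponding exclusion restriction is a separate question, discussed in the text.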
Note, though, that I am not certain the IV is completely robust and free from correlation with error in the outcome SNESO measurements, as discussed above. In Section III.3 below, I test a 2SLS model using a modified version of the IV, and find different RSH coefficients.
Also, as discussed above, despite A&D’s different control variables in these models, it is not possible to know whether other unobserved variables, with which RSH could be correlated, bias RSH’s coefficients. Fortunately A&D are able to control for some of the more intuitive factors that could influence polarization: the area unemployment rate, for example. Yet correlation with other more subtle factors proposed in the literature before A&D—such as firm culture, industry norms, and prejudice against certain workers, for example—cannot be ruled out.
Regarding the replication, the non-college worker definition again seems to affect the results. First, as in Table 4, the SNESO change variable is prepared using only non-college labor, affecting its replicated coefficients accordingly. Also, not surprisingly, the coefficients for the control variables that measure characteristics of the non-college population—the immigrant share of the CZ’s non-college population, in particular—are among the least-well replicated. Third, the RSH coefficients in Panels B and C, for the 2SLS models using the IV, also did not perfectly replicate. This reflects one further challenge with the IV, that, given it is prepared entirely using 1950 Census data, a year in which twenty percent of workers’ labor hours were missing and require imputation, the non-college worker definition affects it too. As discussed above, because the imputation procedure must match individuals with similar educational attainment levels, the labor hours for individuals who entered college but left within a year are imputed differently in my analysis than A&D’s.
All this said, overall, I think A&D have shown RSH to be a believably strong factor in polarization at the low-end of the occupational skill distribution, helping affirm A&D’s hypothesis (4). Even the model using the modified IV below still shows RSH to be a reasonably strong predictor of increases in SNESO. Continued exploration in the literature of possible OVB, of RSH’s relative impact among other possible factors, and of the validity of the additional assumptions discussed in Section II above, may be beneficial. But in every model in Table 5, RSH’s coefficient is positive. And although the replicated models (2) and (3) in Table 5 Panel (A) show RSH as insignificant, these are the two models whose second variables are especially sensitive to the non-college worker definition. In all of the other models, RSH appears significant. Out of A&D’s various contributions in their paper, I believe Table 5’s presentation of a set of reasonably-controlled, reasonably-instrumented, significant RSH coefficients is their most salient.
***Significant at the 1 percent level. **Significant at the 5 percent level. *Significant at the 10 percent level. A&D Notes: “Dependent variable: 10 x annual change in share of non-college employment in service occupations. N= 2,166 (3 time periods x 722 commuting zones). All models include an intercept, time dummies, and state dummies. In panels B and C, share of routine occupations is instrumented by interactions between the 1950 industry mix instrument and time dummies…. Covariates in panels A and B are identical. Covariates in columns 2-5 and 7 of panel C are equal to contemporaneous decadal change in the covariates used in panels A and B. Robust standard errors in parentheses are clustered on state. Models are weighted by start of period commuting zone share of national population.” (p. 1580)
L. Table 6
Table 6 tests the RSH hypothesis against a number of alternative hypotheses for the polarization of employment and wages:
- That the average offshorability of jobs in a CZ predicts a shift toward service occupations. Here, A&D use the US Department of Labor’s Occupational Information Network database to prepare “Face-to-Face Contact” and “On-Site Job” measures, a technique borrowed from Firpo, Fortin, and Lemieux (2011), wherein the measures together serve as a proxy for job offshorability. For the replication, I use A&D’s index values and do not re-create them, assuming them to be of high integrity.
- That income effects predict a shift toward service occupations. The hypothesis is that income increases at the high end of the wage distribution could increase demand for services, resulting in employment and wage increases among service occupations. A&D use changes in the 90th percentile of the wage distribution as a proxy.
- That substitution effects predict a shift toward service occupation employment. The hypothesis is that, with rises in the number of college graduates, fewer individuals may perform their own household services, substituting market services. A&D use changes in the average annual number of labor hours among college graduates as a proxy.
A&D construct several 2SLS models, structured similarly to those in Table 5: they again use the SNESO change in a CZ as the dependent variable, they instrument RSH using the 1950 industry mix instrument, the models cover the same time period, and they incorporate both state and decade fixed effect variables. For each of the variables above, A&D prepare two separate models: one using the variable as the sole independent variable, and one using it in conjunction with RSH. Across all of the models, the proxy variables for the alternative hypotheses do not predict SNESO increases as strongly as RSH, whether the RSH variable is also incorporated or not. (The substitution effect variables actually show unexpected negative correlations with SNESO increases.)
From the replicated results, one can draw more or less the same qualitative conclusions. RSH appears to be a much stronger predictor of SNESO increases than the potential labor market income or substitution effects. Unlike in A&D’s results, the offshorability index appears to be a highly significant predictor of SNESO increases when on its own. Yet when the offshorability index and RSH are included in a model together, RSH still dominates, and the offshorability index becomes insignificant.
Quantitatively, as with Table 5, the replicated RSH coefficients are generally smaller than A&D’s on the order of 10 to 20 percent, for I believe the same reasons. Standing out is the offshorability index, which, for model (1), in which this variable is included on its own, is five times larger in the replicated model than in A&D’s. For model (2) in which this variable is included along with RSH, the coefficient is even of the opposite sign as A&D’s. The large multiple in the former case is presumably related to the very small—and insignificant—magnitude of A&D’s coefficient for this variable. I wonder if A&D’s publicly available offshorability index is the same as what they used for their paper, or if perhaps there was a transformation I needed to make to it but didn’t.
***Significant at the 1 percent level. **Significant at the 5 percent level. *Significant at the 10 percent level. A&D Notes: “Dependent variable: 10 x annual change in share of non-college employment in service occupations. N= 2,166 (3 time periods x 722 commuting zones). The offshorability index is standardized with a mean of zero and a cross commuting zone standard deviation of one in 1980. The share of routine occupations is instrumented by interactions between the 1950 industry mix instrument and time dummies. All models include an intercept, state dummies, and time dummies. Robust standard errors in parentheses are clustered on state. Models are weighted by start of period commuting zone share of national population.” (p. 1585)
M. Table 7
Because A&D and those before them have shown that polarization at the low end of the occupational skill distribution and rises in employment share and wages among service occupations have been more or less the same effect, the bulk of A&D’s analyses prior to Table 7 explore rises in non-college employment share and wages among service occupations. Table 7 extends this analysis and tests the extent to which RSH has been associated with non-college employment share and wage changes in additional occupational categories. For Table 7 Panel A, A&D prepare models similar to those for Table 5 (in fact, the upper-left-most cell in Table 7 presents the same coefficient as Table 5 Panel B model (1)), yet six times over, once for each occupational category. A&D sort these occupational categories according to the extent to which the constituent occupations involve routine tasks. (A&D first presented the approximate routine-task requirements of these categories in Table 2.) Panel B presents similar models, yet using changes in average wages among the occupational categories as outcome variables. A&D also use Table 7 to test whether RSH’s impact differed depending on whether the RSH was among males versus females.
Table 7’s key takeaway is that high RSH has generally been associated with rises in non-college employment share and wages within all occupational categories whose task requirements are less routine in nature, and declines in non-college employment share and wages within categories with more routine content. (A&D assert that some of the coefficients for the “managers, professionals, technicians, finance, and public safety” category, which is less routine in nature, emerge as statistically insignificant given that few of its workers are non-college.) This notion is the backbone of A&D’s overall polarization hypothesis: that computers, when substituting for labor, have done so predominantly within occupations with routine-task requirements. Labor has shifted from these occupations into occupations requiring less routine work. Specifically because of employment share and wage rises among service occupations, which not only involve little routine work but also tend to employ workers at the low end of the occupational skill distribution, the labor market has polarized.
Given that Table 7 Panel A is prepared very similarly to how Tables 5 and 6 are prepared, it replicates just about as well as they do, with discrepancies I believe resulting from similar causes. I was not able to replicate Table 7 Panel B given the computational resources doing so would have required.
***Significant at the 1 percent level. **Significant at the 5 percent level. *Significant at the 10 percent level. A&D Notes: “Dependent variable: 10 x annual change in share of non-college employment in service occupation; log real hourly wage. Panel A: Each coefficient is based on a separate 2SLS regression with N = 2,166 (3 time periods x 722 commuting zones). Models include an intercept, state dummies, and time dummies, and are weighted by start of period commuting zone share of national population. The routine occupation share is instrumented by interactions between the 1950 industry mix measure interacted with time dummies. Robust standard errors in parentheses are clustered on state. Panel B: Each row presents coefficients from one pooled OLS reduced form regression with N = 5,363,963/2,844,441/2,519,522 in rows i/ii/iii. Observations are drawn from the 1980 Census and 2005 ACS, and exclude self-employed and farm workers. The instrument (share of routine occupations predicted by industry structure in 1950) is interacted with a dummy for the observation of year 2005. All models include an intercept, commuting zone-occupation group fixed effects, time trends for occupation groups and states, an interaction between the time dummy and the share of workers in an occupation group whose 1980 wage was below the federal or state minimum wage of 2005, nine dummies for education levels, a quartic in potential experience, dummies for married, nonwhite, and foreign-born, and interactions of all individual level controls with the time dummy. Pooled sex models also include a female dummy and its interaction with the time dummy. Hourly wages are defined as yearly wage and salary income divided by the product of weeks worked times usual weekly hours. Robust standard errors in parentheses are clustered on commuting zones. Observations are weighted by each worker’s share in total labor supply in a given year.” (p. 1587)
3. Robustness Tests
In this section I present the results of a number of robustness tests. The first set concerns the robustness of the replication results in Section III.2 above. Given that the replication results frequently deviated somewhat from A&D’s, I test a few plausible alternative variable definitions to ensure that these definitions were not the culprit. (In a few instances, it was not clear to me precisely how A&D defined a variable.) The second set concerns a few aspects of A&D’s approach to which I believe their results could be particularly sensitive.
In both cases I present the results in the form of re-replicated RSH coefficients for Table 5, given that I believe Table 5’s results are the most salient and central to A&D’s conclusions about polarization.
A. Replication Robustness
Here I present the RSH coefficients for the Table 5 models after testing the following four variable adjustments:
- (I) Removing individuals from the analysis who were not at least age 18. The caption for A&D’s Table 1 mentions that the table reports statistics for individuals ages 18 through 64. This appeared as a non-sequitur to me given that, in A&D’s “Data Sources” Section II.A., they note that their sample contains individuals at least 16 years of age. Most of the replicated results above I derive from a sample of individuals at least age 16, though in this test I reproduce the results for the 18 and above population in case 16 was not correct.
- (II) Including non-reported industries in the IV calculation. As discussed above, the IV is essentially the 1950 national RSH, but with its industry components re-weighted to match the industry structure of each CZ. Some of the 1950 workers did not report their industries to the Census, and hence I removed these workers above when calculating the IV. Here I test the Table 5 models with an IV that includes these individuals, denoting their non-reported industry as if it were an industry in itself, in case A&D had taken this approach.
- (III) Including workers with unreported educational attainment levels in the non-college category. Some of the workers did not report their educational attainment levels to the different Censuses and the ACS. In the analyses above in which A&D specifically evaluate non-college workers, I only include workers in the non-college category whom MPC explicitly report as having at most graduated high school. Here I test the Table 5 models, but after also including the workers who didn’t report their educational attainment levels at all, in case A&D had taken this approach.
- (IV) Defining non-college as having completed no more than 11th grade. As discussed above, MPC altered the “completed one year of college” educational attainment category in between when A&D downloaded the IPUMS data in 2007 and when I downloaded it in 2013. In A&D’s data set, as an approximation, MPC had included individuals who had entered college but left before a year in this “completed one year of college” category. In my data set, MPC included these individuals in the “completed 12th grade” category. Wherever A&D make use of the worker’s educational attainment category field—which they do when imputing missing labor hours, and also when isolating the non-college workers—my replicated results deviate from theirs. Here I test the Table 5 models after redefining non-college as having completed no more than 11th grade. This definition removes the individuals from my non-college category who were originally not present in A&D’s non-college category. Of course, it removes many individuals in addition to them. But the test helps evaluate the sensitivity of the model to the non-college worker definition.
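Adjustment (II) above can be illustrated with a small sketch of an industry-mix instrument as the text describes it: each CZ’s 1950 industry employment shares multiplied by a national routine share per industry. The function `industry_mix_iv`, the toy employment counts, and the routine shares are all hypothetical, and the sketch follows only the description in this write-up, not A&D’s exact construction. It contrasts dropping workers with unreported industries against treating "not reported" as an industry of its own.

```python
import numpy as np

def industry_mix_iv(cz_industry_emp, national_routine_share):
    """Weight national per-industry routine shares by each CZ's industry employment shares."""
    emp = np.asarray(cz_industry_emp, dtype=float)
    shares = emp / emp.sum(axis=1, keepdims=True)   # rows: CZs, columns: industries
    return shares @ np.asarray(national_routine_share, dtype=float)

# Hypothetical 1950 employment for two CZs; the last column is "industry not reported".
emp_1950 = np.array([[60.0, 40.0, 0.0],
                     [20.0, 60.0, 20.0]])
routine_share = np.array([0.5, 0.2, 0.3])  # last entry: routine share among non-reporters

# Original replication approach: drop workers with unreported industries.
iv_drop = industry_mix_iv(emp_1950[:, :2], routine_share[:2])

# Adjustment (II): keep them, treating "not reported" as its own industry.
iv_keep = industry_mix_iv(emp_1950, routine_share)
```

The two versions differ only for CZs that actually contain non-reporting workers, which is consistent with this adjustment having a subtle effect when few workers omit their industry.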
It appears that redefinitions (I), (II), and (III) have only subtle effects on Table 5’s RSH coefficients. Yet redefinition (IV) has a rather sizeable impact. This is reassuring, at least for the replication. As discussed in Section III.2 above, it often seemed that the only deducible culprit of the differences between A&D’s and the replicated results was MPC’s educational attainment category redefinition. Yet it would have been reasonable to believe that such a minor-seeming change should have had negligible consequences, and that something else must be responsible for the differences. After removing all individuals who at least entered the 12th grade from the non-college category definition, many of the RSH coefficients in Table 5 nearly double, and we see that the model is in fact sensitive to this definition.
Interestingly, with this change, these coefficients grow again, just like they grew with the original replication. I expected the coefficients to contract with this test, given that the superfluous workers included in the non-college category in the original replication were once again removed. That is, I imagined that removing these workers would have the opposite effect on the coefficients as including them. Apparently simultaneously removing the additional workers—those who entered the 12th grade and had been included in A&D’s non-college category—has a superseding effect.
How can one interpret the larger Table 5 RSH coefficients when those who entered college but did not complete an entire year of it are grouped into the non-college worker definition, as in the original replication? How can one interpret the even larger coefficients if all those who had graduated 12th grade are removed from the definition? In my interpretation, perhaps those who entered college but left before a year (in the former case), as well as those who dropped out of high school before graduating from it (in the latter case), were among the most susceptible to moving into low-skill service occupations. Those who precisely graduated from high school, did not enter college, and moved on to, say, a vocational career, might have been acting more according to plan, and found themselves less in need of low-skill work. It is this last group of workers, who were included in A&D’s non-college worker category and who may have been responsible for A&D’s relatively smaller Table 5 RSH coefficients than mine, whom one cannot precisely isolate using today’s incarnation of the IPUMS data.
B. Authors’ Results’ Robustness
Here I present the RSH coefficients for each of the Table 5 models after testing the following three variable adjustments:
- (I) Artificially increasing the required amount of imputation. As discussed in Section II above, error intrinsic to the imputation process could lead to attenuation bias. I attempt to ascertain just how concerning attenuation bias could be for A&D by artificially removing an extra swath of data, imputing it too, and then observing the differences in the coefficients versus those in the original replication. Across all of the Census and ACS years, I select a random 50 percent of the workers’ labor hours to remove and then impute.
- (II) Removing abstract occupation labor from the IV. As noted above, the purpose of the IV is to instrument the RSH coefficients against possible short-run shifts into and out of routine-task occupations, due, say, to seasonal shifts in labor demand. The original IV was prepared to reflect the historical industry structure of a CZ, which A&D contend should be correlated with only the long-term component of the CZ’s RSH. However, I am not certain that industries with historically more routine manufacturing labor, for example, might not be structured in a way that more easily allows for these short-term shifts into and out of routine-task occupations. I test a modified version of the IV in which I remove all labor in A&D’s “managers/ professionals/ technicians/ finance/ public safety” and “production/ craft” occupation categories, which A&D assert require more abstract thinking, and perhaps, I propose, labor that is more difficult to perform on a temporary basis. Believing this could concentrate the remainder of the IV’s formulation around occupations that do support temporary labor shifts, I test the new IV to see if it instruments the RSH coefficients any differently.
- (III) Constructing the RSH variable using work from non-college workers only. A&D are interested in the relationship between the RSH in a local market and non-college labor shifts into service occupations. Is it possible that the RSH specifically among these very non-college workers could be even more strongly associated with these shifts? Whether non-college workers shift directly from routine-task occupations into service occupations cannot be observed using IPUMS data. But if this variable adjustment leads to even stronger coefficients, such a dynamic may be conceivable.
In test (I), each of the coefficients appears noticeably weaker than its counterpart in the original replication. In particular, the coefficient for Panel A Model (7)—the coefficient which is non-instrumented, yet fully-controlled—is a full 44 percent smaller than the original coefficient (0.034 versus 0.061). These results make it appear that, indeed, some attenuation bias results from the imputation in this model. Granted, the test here is extreme; a full fifty percent of the workers’ labor hours were artificially removed and then imputed. In the original paper, only 20 percent of the 1950 and 1970 workers’ labor hours must be imputed, and in 1980, 1990, 2000, and 2005, less than five percent. And fortunately, at least for A&D’s overall conclusions, attenuation bias isn’t horrible news. It means that the magnitude of the modeled effect is likely even stronger than what is observed. Still, attenuation bias in this type of model, using IPUMS data, might warrant further attention in the literature.
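For reference, the textbook attenuation result and the shrinkage reported above can be written out as follows. The first expression assumes classical measurement error in a single regressor, which is a simplification: Model (7) includes controls, and imputation error need not be classical. The symbols $x^{*}$, $u$, $\sigma_x^2$, and $\sigma_u^2$ are introduced here for illustration only.

```latex
% Classical errors-in-variables with one regressor (an illustrative assumption):
% the observed regressor is x^* = x + u, with u independent of x and of the error term.
\operatorname{plim}\,\hat{\beta}_{\mathrm{OLS}}
  = \beta \cdot \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2},
\qquad\text{so}\qquad
\bigl|\operatorname{plim}\,\hat{\beta}_{\mathrm{OLS}}\bigr| \le |\beta| .

% Shrinkage of the Panel A Model (7) coefficient in test (I):
1 - \frac{0.034}{0.061} \approx 0.44 .
```

Under this simplified reading, more noise in the regressor (a larger $\sigma_u^2$) pushes the estimate further toward zero, which is the direction of the change observed in test (I).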
In test (II), removing abstract-task occupations from the IV leads to generally smaller RSH coefficients in Table 5 Panels B and C than in the original replication. Hence, it seems that the historical industry structure of a CZ may in fact be somewhat correlated with present-day, short-run shifts into and out of routine-task occupations in the CZ. Panel B Model (7), for example, shows a 19 percent smaller RSH coefficient when using the experimental IV versus the originally replicated IV.
This said, both the originally replicated instrumented and the experimental RSH coefficients are larger than the coefficients in Panel A, which were not instrumented. Hence, although, on one hand, whether with or without the IV, we can’t be certain that measurements of the structural, longer-term component of the RSH-SNESO relationship are contamination-free, it still seems that A&D’s IV successfully jettisons at least some of the short-term component of it. Using the IV does not seem to degrade the results.
In test (III), Table 5 RSH coefficients are generally smaller than those in the original replication. This suggests that overall RSH is indeed a better predictor of SNESO growth than the RSH specifically among non-college workers. This has interesting implications for the possible mechanisms through which RSH affects SNESO growth. When computers or offshore labor substitute for routine-task labor, it seems less pertinent to non-college labor shifts into service occupations whether specifically non-college or college workers originally performed the routine tasks than the extent to which the substitution occurs overall. As a conjecture, perhaps part of the shift is mediated by mere speculation among non-college workers, witnessing the overall decline in routine-task middle-skill occupational opportunities, that service occupations are becoming better choices for them.
Note that there are still a number of assumptions discussed in Section II above that I do not explore here. First, given the relative infrequency of the Census/ACS observations, it is not possible to ascertain whether the results reflect temporary shocks that happened to occur only within a measurement year. Identifying any such shocks would require different data with more-frequent observations. In the future literature, I believe contrasting studies that use different, more-frequent data could help triangulate a fuller characterization of polarization. Second, I do not explore whether the CZs are reasonably independent. As mentioned above, A&D cluster standard errors at the state level and hence presumably account for most correlation among residuals. However, understanding whether any further correlation exists, potentially biasing A&D’s standard errors downward, would also be a worthwhile project for the future literature. Third, I do not explore possible OVB from variables not representable using IPUMS data. Discovering new possible sources of OVB is a diagnostic problem that the literature may need to explore continually over time. Finally, I do not test A&D’s set of occupations for over-time consistency. A&D assert in good faith that these occupations are in fact consistent, and verifying this would be a sizable separate project. Hopefully A&D or a different author will at some point be able to take up the challenge.
IV. Paper Extension: Policy Implications
In Table 3, A&D show that RSH has not only been a factor behind employment and wage polarization at the CZ-level, but also that RSH has exhibited an association with growth in the number of computers per employee the businesses in these CZs own. With this evidence, the hypothesis that computers and offshore labor have substituted directly for U.S. labor more routine in nature, and that this mechanism has governed an appreciable component of the polarization phenomenon of the last few decades, does not seem implausible.
I imagine that this result could prompt two questions for policymakers. The first is whether A&D’s analysis could inform an approach to attenuating polarization and the hollowing-out of middle-skill employment and wages. The second, and perhaps even more important, is whether economic policies to date have somehow inadvertently contributed to polarization. For example, if computer adoption happens to cause polarization, and if relaxing constraints on business computer purchases (through tax incentives, for example) has allowed polarization to be realized, should policymakers be concerned? I consider this latter question in this section.
Much of recent tax legislation has sought to stimulate business investment, with the thought that business investment is a leverage point through which the government can promote broader economic growth. For example, the Economic Stimulus Act of 2008 allowed businesses to immediately depreciate and deduct 50 percent of their qualified purchases in the very year of the purchase rather than permitting merely a gradual depreciation of the equipment over its useful life (Williams 2008). This provision is frequently referred to as bonus depreciation and is intended to stimulate overall business investment. Yet if businesses purchase computers more frequently as a result of this stimulus, could accelerated employment and wage polarization result as a by-product?
To take a cursory look at the possible relationship between stimulated business investment and polarization, I prepared a new variable using Compustat data and incorporated it into versions of two of A&D’s models. This variable reflects the sum of the capital expenditures among all public companies in the Compustat database over each decade between 1980 and 2000, divided by the sum of the employee-years among these companies over this same time period:
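$$ACE_{s,d} = \frac{\sum_{t \in d} \sum_{c \in s} CapEx_{c,s,t}}{\sum_{t \in d} \sum_{c \in s} Emp_{c,s,t}}$$
where $ACE_{s,d}$ is the average capital expenditure per employee-year in state $s$ across decade $d$, $CapEx_{c,s,t}$ is the capital expenditure in year $t$ by company $c$ headquartered in state $s$, $Emp_{c,s,t}$ is the year-end number of employees in year $t$ at company $c$ headquartered in state $s$, and $t$ is a year in decade $d$.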
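The ratio described above can be sketched as a small aggregation over company-year records. This is a minimal illustration rather than the code used for the replication: the record fields (`company`, `state`, `year`, `capex`, `employees`) and the sample values are hypothetical, and treating a decade as the half-open range of years from its start up to but excluding its end is an assumption.

```python
from collections import defaultdict


def average_capex_per_employee_year(records, decade_start, decade_end):
    """Sum capital expenditures and employee-years by state, then divide.

    Assumption: a decade covers years decade_start <= year < decade_end.
    """
    capex_by_state = defaultdict(float)
    employee_years_by_state = defaultdict(float)
    for record in records:
        if decade_start <= record["year"] < decade_end:
            capex_by_state[record["state"]] += record["capex"]
            # One year-end headcount counts as that many employee-years.
            employee_years_by_state[record["state"]] += record["employees"]
    return {
        state: capex_by_state[state] / employee_years_by_state[state]
        for state in capex_by_state
        if employee_years_by_state[state] > 0
    }


# Hypothetical company-year records (capex in arbitrary currency units).
sample_records = [
    {"company": "A", "state": "MA", "year": 1980, "capex": 10.0, "employees": 100},
    {"company": "A", "state": "MA", "year": 1981, "capex": 20.0, "employees": 100},
    {"company": "B", "state": "MA", "year": 1980, "capex": 30.0, "employees": 200},
    {"company": "C", "state": "OH", "year": 1985, "capex": 5.0, "employees": 50},
    {"company": "C", "state": "OH", "year": 1990, "capex": 100.0, "employees": 50},
]

ace_1980s = average_capex_per_employee_year(sample_records, 1980, 1990)
for state, value in sorted(ace_1980s.items()):
    print(state, round(value, 4))
```

Because the numerator and denominator are each summed before dividing, large employers dominate a state's value. Averaging per-company ratios instead would weight every company equally, which is a different variable.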
This variable has a number of limitations. First, it reflects only the behaviors of public companies, yet private company computer purchases are presumably no less relevant to polarization and no less affected by legislation than public company computer purchases. Second, the variable does not reflect bonus-depreciation-stimulated expenditures in particular, but rather all capital expenditures. Third, I sum the variable at the state level; I did not have a bridge to map company headquarter addresses into A&D’s CZs.
Still, I attempted to incorporate the variable into both a version of A&D’s Table 7 Panel A model (i) and a version of A&D’s Table 3 to see what it could tell us.
C. Table 7 with ACE Variable
In this section I test a model of the following form. It is similar to A&D’s Table 7 Panel A Model (i), except that it is evaluated at the state level and incorporates the new ACE variable:
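$$\Delta E_{s,j,d} = \alpha + \beta_1 RSH_{s,d} + \beta_2 ACE_{s,d} + \varepsilon_{s,j,d}$$
where $\Delta E_{s,j,d}$ is the change in non-college employment share in decade $d$ in state $s$ in occupation group $j$, one of A&D’s six occupation groups, $\alpha$ is a constant, $RSH_{s,d}$ is the routine share at the start of decade $d$ in state $s$, $ACE_{s,d}$ is the new variable tabulated over decade $d$ in state $s$, and $\varepsilon_{s,j,d}$ is an error term. I create one model for the decade between 1980 and 1990, and another for the decade between 1990 and 2000.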
Here we can see that, between 1980 and 1990, average capital expenditures per employee-year were statistically significantly associated (that is, had small p-values) with increases in non-college service occupation employment shares (SNESO) and decreases in non-college production/craft employment shares at the state level. (For service occupations, ACE’s p-value was even smaller than RSH’s, though which of the two appears more significant could be arbitrary due to collinearity between them.) Between 1990 and 2000, average capital expenditures per employee-year were statistically significantly associated with increases in non-college transportation, construction, mechanics, mining, and farm employment shares, and with declines in non-college management, professional, technician, finance, and public safety employment shares. Note that the coefficients in the table have been standardized to allow for easier comparison between RSH and ACE.
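The role of standardization can be sketched as follows. The snippet is illustrative only and uses made-up numbers, not the replication's data: it computes standardized coefficients for a regression with two regressors from pairwise correlations, then shows that changing the units of one regressor leaves those coefficients unchanged. The names `rsh`, `ace`, and `delta_share` are hypothetical stand-ins for the variables in the model.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def standardized_betas(y, x1, x2):
    """Standardized OLS coefficients for a regression of y on x1 and x2."""
    r1y = correlation(x1, y)
    r2y = correlation(x2, y)
    r12 = correlation(x1, x2)
    denom = 1 - r12 ** 2  # approaches zero as the regressors become collinear
    return (r1y - r12 * r2y) / denom, (r2y - r12 * r1y) / denom


# Made-up state-level values: a routine share and capex in thousands.
rsh = [0.28, 0.30, 0.32, 0.34]
ace = [12.0, 8.0, 8.0, 12.0]
delta_share = [0.5 * r + 0.001 * a for r, a in zip(rsh, ace)]

beta_rsh, beta_ace = standardized_betas(delta_share, rsh, ace)

# Re-express capex in single units instead of thousands.
ace_rescaled = [a * 1000 for a in ace]
beta_rsh_2, beta_ace_2 = standardized_betas(delta_share, rsh, ace_rescaled)

print(f"standardized RSH coefficient: {beta_rsh:.4f}")
print(f"standardized ACE coefficient: {beta_ace:.4f}")
print(f"unchanged after rescaling ACE: {abs(beta_ace - beta_ace_2) < 1e-9}")
```

The denominator in `standardized_betas` also shows why collinearity between the two regressors makes their relative significance unstable: as their correlation approaches one, small changes in the data move both coefficients substantially.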
Of course, of the greatest pertinence to polarization is SNESO. In the 1980s, it seems that capital expenditures among public companies overall—not just computer purchases—were associated with SNESO increases and hence polarization. If the 1980s was a decade of intensive computer adoption, it would be reasonable to imagine that computer adoption constituted a sizable share of overall capital expenditures, in turn generating this result. In the 1990s, if computer adoption was diffused among a greater variety of capital expenditures, then the apparent lack of a significant association between overall capital expenditures and SNESO may also be intuitive.
D. Table 3 with ACE Variable
In this section I similarly recreate A&D’s Table 3, tabulated at the state level and with the new ACE variable incorporated. As discussed in Section III.2 above, A&D’s Table 3 presents two outcome variables: growth in adjusted PCs per employee, and change in RSH. The analysis here explores the explanatory power of the new ACE variable for both of these outcomes in comparison to the RSH variable A&D already use.
Here we can see that overall capital expenditures per employee-year were in fact a very good predictor of growth in adjusted PCs per employee between 1980 and 1990, as hypothesized above. Perhaps an appreciable component of the capital expenditures in the 1980s in fact derived from PC adoption, with polarization subsequently ensuing.
Of greatest interest to me here is that capital expenditures were also a reasonable predictor, at approximately the 10 percent significance level, of declines in RSH among non-college workers across the entire 1980 to 2005 window. This makes it appear that the hollowing out of non-college, routine-task, middle-skill-occupation labor between 1980 and 2005 was in fact reasonably associated with overall business investment, at least among public firms and as analyzed at the state level.
E. Bonus Depreciation and Policy Implications: Discussion
What is not clear from this analysis is whether business investment made specifically in response to tax stimulus has tended to be more or less polarizing than business investment overall. In response to the Economic Stimulus Act of 2008, for example, did businesses invest in ways that resembled how they invested throughout the 2000s? Did they purchase more or fewer computers than on average, or about the same? How about in response to the Tax Reform Act of 1986, in which the top marginal corporate tax rate was reduced from 46 to 34 percent (Williams 2008), in a decade in which overall business investment was already generally associated with polarization? Were these top marginal profiteers more or less prone to purchasing routine-labor-substituting equipment than businesses on average, or about the same?
To explore these questions, I believe one possibly tractable strategy could be to exploit potential geographic variation in the tax-stimulated fraction of overall business investment. IRS Form 4562 requires businesses to report expenditures deducted under the bonus depreciation provision, and hence it seems that IRS tax data could support such an analysis. One would first need to explore what kind of variation in this fraction existed, perhaps at the state level, or at the CZ level if business headquarter addresses can be mapped to CZs. If sizable variation existed, one could then observe to what extent this variation was related to SNESO changes and polarization overall.
A&D demonstrated that U.S. employment and wage polarization and the relative rises in service occupation employment and wages beginning in the 1980s have been largely the same phenomenon. They also demonstrated that these rises have occurred simultaneously with relative declines in employment and wages in occupations comprising routine tasks. This raised the question of whether the two phenomena may be linked and, more broadly, whether computer and offshore substitution for routine tasks could partly explain polarization.
To fully characterize polarization, researchers ideally would have comprehensive panel data at their disposal. IPUMS data is helpful in that it comprises a very large and representative sample—between one and five percent of U.S. households depending on the measurement year—and captures a number of useful variables related to employment and wages. Its drawbacks include that it only allows whole geographic areas to be traced longitudinally, it generally presents merely decennial observations, it does not present variables such as firm culture or worker intrinsic skill pertinent to labor market dynamics (not that any data source could directly capture intrinsic skill), it top-codes the wages of very-high earners, and its survey format means that many of its observations contain missing or inconsistent values which must be imputed.
Still, A&D are able to use IPUMS data to assemble reasonable evidence that polarization in fact has been mechanically linked to the ex-ante shares of labor in routine-task-intensive occupations in local labor markets. A&D present evidence that local markets with differential such shares have differentially embraced labor-substituting computer technology and have seen low-skill, low-education workers shift out of routine-task-intensive occupations and into service occupations.
A&D’s analysis does rely on a number of assumptions: that the IPUMS data is as representative as MPC asserts, that any artifacts or biases resulting from the above IPUMS drawbacks are negligible, that CZs (A&D’s labor market demarcations) are reasonably independent, that modeled relationships take on linear forms, that A&D’s occupation classification scheme is of high integrity, that originally non-college labor did not move into college in numbers sufficient to change the composition of the non-college population, and that their instrumental variable is robust.
Though I believe some of these assumptions are reasonable, I believe some could benefit from further exploration. These include that IPUMS’s decennial measurements accurately capture relevant dynamics, that CZs are reasonably independent, that OVB is negligible, that A&D’s occupation classification scheme is of high integrity, that attenuation bias from imputation and other sources is negligible, and that the IV is robust. I test for and do find some evidence for both attenuation bias and limits to the IV’s robustness. Fortunately, because A&D draw only directional conclusions from their results, I do not believe these biases should be highly concerning. If anything, in both cases, as discussed above, A&D’s presented coefficients are understated as a consequence. Of course, these biases do preclude a broader, quantified characterization of polarization. And the first four assumptions in this list, which I do not test, could I believe benefit from greater testing and validation in the literature.
This said, by exploiting variation in both occupational routine-task intensity and changes in service occupation employment share and wages at the local-market level, A&D are able to show that the two are very likely non-coincidentally related. A&D use a reasonable set of control variables, as well as an innovative even if not perfectly-robust IV, to buttress their evidence.
Now that A&D have made this case, I believe a few areas for exploration remain in the literature. One is the robustness of the assumptions mentioned above. In addition, one could imagine utility from the following.
- Models that test both the routine-task substitution hypothesis and the SBTC hypothesis together, exploring which is more pronounced, under what circumstances, across what different segments of the labor market, and whether the two ever interact, whether mutually-compoundingly, mutually-attenuatingly, or in other forms.
- Models that test routine-task substitution and/or SBTC interacting with other variables such as those proposed by Spenner (1988), Howell and Wolff (1991), Howell and Wieler (1998), Handel (2003), or Card and DiNardo (2006), including culture or industry norms, levels of bargaining power between management and labor, supply-side frictions such as discrimination against age, race, or health status or general job search hassles, or any other pertinent variables.
- Tests for factors driving the component of wage polarization not attributable to service occupations as seen in Figure 2 Panel B.
- Models that explore worker-level longitudinal dynamics when shifting in and out of routine-task-intensive and/or service occupations.
- Robust proxies for intrinsic worker skill.
- Models that explore the potential inadvertent influence of public policy on polarization.
Of particular interest to me is this last point, especially in the realm of tax policy. Recent tax legislation has sought to stimulate economic growth by way of stimulating business investment. Given the association A&D demonstrate between computer technology investment and polarization, I am interested in whether these tax policies themselves have indirectly amplified polarization’s evolution. A cursory analysis of Compustat data at the state level shows that public corporation capital expenditures were statistically significantly associated with growth in service occupation employment shares in the 1980s. In a future project, I hope to evaluate whether this association has varied with whether the investment was expensed under a bonus depreciation provision versus under ordinary investment expense provisions.
VI. End Notes
 A&D also choose to remove individuals with missing wages or labor hours from any calculations involving wages.
 A&D prepared Table 7 Panel B using a regression model at the individual-worker level, whereas each of their prior regressions was run at the CZ level. The 1980, 1990, 2000, and 2005 samples together comprise 40 million observations. After allowing Stata’s single “regress” command to operate on these observations for a full 24 hours, I opted to terminate the program.
Acemoglu, Daron (1998). Why Do New Technologies Complement Skills? Directed Technical Change and Wage Inequality. Quarterly Journal of Economics, November 1998, 1055-1089.
Acemoglu, Daron (2002). Technical Change, Inequality, and the Labor Market. Journal of Economic Literature, XL, 7-72.
Acemoglu, Daron, & Autor, David (2011). Skills, Tasks, and Technologies: Implications for Employment and Earnings. In Orley Ashenfelter and David Card (Eds.), Handbook of Labor Economics, 4(B), 1043-1171. Amsterdam: Elsevier.
Autor, David H., & Dorn, David (2013). The Growth of Low-Skill Service Jobs and the Polarization of the US Labor Market. American Economic Review, 103(5), 1553-1597.
Autor, David H., Katz, Lawrence F., & Kearney, Melissa S. (2006). The Polarization of the U.S. Labor Market. National Bureau of Economic Research, Working Paper 11986.
Autor, David H., Katz, Lawrence F., & Kearney, Melissa S. (2008). Trends in U.S. Wage Inequality: Revising the Revisionists. The Review of Economics and Statistics, 90(2), 300-323.
Autor, David H., Katz, Lawrence F., & Krueger, Alan B. (1998). Computing Inequality: Have Computers Changed the Labor Market? Quarterly Journal of Economics, November 1998, 1169-1213.
Autor, David H., Levy, Frank, & Murnane, Richard J. (2003). The Skill Content of Recent Technological Change: An Empirical Exploration. Quarterly Journal of Economics, November 2003, 1279-1333.
Bardhan, Ashok Deo, & Kroll, Cynthia (2003). The New Wave of Outsourcing. Fisher Center Research Reports, Fisher Center for Real Estate and Urban Economics, UC Berkeley.
Bartel, Ann, Ichniowski, Casey, & Shaw, Kathryn (2007). How Does Information Technology Affect Productivity? Plant-Level Comparisons of Product Innovation, Process Improvement, and Worker Skills. Quarterly Journal of Economics, November 2007, 1721-1758.
Beaudry, Paul, Doms, Mark, & Lewis, Ethan (2010). Should the Personal Computer Be Considered a Technological Revolution? Evidence from U.S. Metropolitan Areas. Journal of Political Economy, 118(5), 988-1036.
Berman, Eli, Bound, John, & Griliches, Zvi (1994). Changes in the Demand for Skilled Labor within U.S. Manufacturing: Evidence from the Annual Survey of Manufactures. Quarterly Journal of Economics, May 1994, 367-397.
Berman, Eli, Bound, John, & Machin, Stephen (1998). Implications of Skill-Biased Technological Change: International Evidence. Quarterly Journal of Economics, November 1998, 1245-1279.
Berrett, Dan (2011). Intellectual Roots of Wall St. Protest Lie in Academe: Movement’s principles arise from scholarship on anarchy. The Chronicle of Higher Education, October 16.
Blinder, Alan S. (2006). Offshoring: The Next Industrial Revolution? Foreign Affairs, 85(2), 113-128.
Blinder, Alan S. (2009). How Many US Jobs Might be Offshorable? World Economics, 10(2), 41-78.
Bresnahan, Timothy F. (1999). Computerisation and Wage Dispersion: An Analytical Reinterpretation. The Economic Journal, 109(456), F390-F415.
Bresnahan, Timothy F., Brynjolfsson, Erik, & Hitt, Lorin M. (2002). Information Technology, Workplace Organization, and the Demand for Skilled Labor: Firm-Level Evidence. Quarterly Journal of Economics, February 2002, 339-376.
Card, David, & DiNardo, John (2002). Skill Biased Technological Change and Rising Wage Inequality: Some Problems and Puzzles. National Bureau of Economic Research, Working Paper 8769.
Card, David, & DiNardo, John (2005). The Impact of Technological Change on Low Wage Workers: A Review. National Poverty Center Working Paper Series, 05-28.
Card, David, & Lemieux, Thomas (2001). Can Falling Supply Explain the Rising Return to College for Younger Men? A Cohort-Based Analysis. Quarterly Journal of Economics, May 2001, 705-746.
Carneiro, Pedro, & Lee, Sokbae (2010). Trends in quality-adjusted skill premia in the United States, 1960-2000, Discussion paper series, Forschungsinstitut zur Zukunft der Arbeit, 5295.
Caselli, Francesco (1999). Technological Revolutions. The American Economic Review, 89(1), 78-102.
Doms, Mark, Dunne, Timothy, & Troske, Kenneth R. (1997). Workers, Wages, and Technology. Quarterly Journal of Economics, February 1997, 253-290.
Doms, Mark, & Lewis, Ethan (2006). Labor Supply and Personal Computer Adoption. Federal Reserve Bank of San Francisco, Working Paper 2006-18.
Dorn, David (2009). Essays on Inequality, Spatial Interaction, and the Demand for Skills. University of St. Gallen, Graduate School of Business Administration, Economics, Law and Social Sciences, Dissertation no. 3613.
Firpo, Sergio, Fortin, Nicole, & Lemieux, Thomas (2011). Occupational Tasks and Changes in the Wage Structure. Unpublished.
Gera, Surendra, Gu, Wulong, & Lin, Zhengxi (2001). Technology and the Demand for Skills in Canada: An Industry-Level Analysis. The Canadian Journal of Economics, 34(1), 132-148.
Goldin, Claudia, & Katz, Lawrence F. (1998). The Origins of Technology-Skill Complementarity. Quarterly Journal of Economics, August 1998, 693-732.
Goldin, Claudia, & Katz, Lawrence F. (2008). The Race Between Education and Technology. Cambridge, MA: Belknap Press of Harvard University Press.
Goos, Maarten, & Manning, Alan (2003). Lousy and Lovely Jobs: The Rising Polarization of Work in Britain. Mimeo, London School of Economics, September 2003.
Handel, Michael J. (2003). Skills Mismatch in the Labor Market. Annual Review of Sociology, 29, 135-165.
Hecker, Daniel E. (2005). Occupational employment projections to 2014. Monthly Labor Review, November 2005, 70-101.
Howell, David (1994). The Collapse of Low-skill Male Earnings in the 1980s: Skill Mismatch or Shifting Wage Norms? Levy Institute, Working Paper 105.
Howell, David (1999). Theory Driven Facts and the Growth of Earnings Inequality. Review of Radical Political Economics, 31(1), 54–86.
Howell, David R., & Wieler, Susan S. (1998). Skill-biased Demand Shifts and Wage Collapse in the United States, A Critical Perspective. Eastern Economic Journal, 24(3), 343–366.
Howell, David R., & Wolff, Edward N. (1991). Skills Bargaining Power and Rising Interindustry Wage Inequality Since 1970. Review of Radical Political Economics, 23(1 & 2), 30–37.
Katz, Lawrence, & Murphy, Kevin (1992). Changes in relative wages: supply and demand factors. Quarterly Journal of Economics, CVII, 35-78.
Krueger, Alan B. (1993). How Computers Have Changed the Wage Structure: Evidence from Microdata, 1984-1989. Quarterly Journal of Economics, February 1993, 33-60.
Lefter, Alexandru M., & Sand, Benjamin M. (2011). Job Polarization in the U.S.: A Reassessment of the Evidence from the 1980s and 1990s. University of St. Gallen, School of Economics Working Paper 1103.
Machin, Stephen, & Van Reenen, John (1998). Technology and Changes in Skill Structure: Evidence from Seven OECD Countries. Quarterly Journal of Economics, November 1998, 1215-1244.
McCarthy, J. (2004). Near-Term Growth of Offshoring Accelerating. Cambridge, MA: Forrester Research.
McKinsey Global Institute (2005). The Emerging Global Labor Market, June 2005. New York: McKinsey & Company.
Mishel, Lawrence, & Bernstein, Jared (1998). Technology and the Wage Structure: Has Technology’s Impact Accelerated Since the 1970s? Research in Labor Economics, 17.
Mishel, Lawrence, Shierholz, Heidi, & Schmitt, John (2013). Don’t Blame the Robots: Assessing the Job Polarization Explanation of Growing Wage Inequality. Economic Policy Institute, Center for Economic and Policy Research, Working Paper.
Ruggles, Steven, Alexander, J. Trent, Genadek, Katie, Goeken, Ronald, Schroeder, Matthew B., & Sobek, Matthew (2010). Integrated Public Use Microdata Series: Version 5.0 [Machine-readable database]. Minneapolis: University of Minnesota.
Saez, Emmanuel (2013). Striking it Richer: The Evolution of Top Incomes in the United States (Updated with 2012 preliminary estimates). UC Berkeley.
Simon, Herbert A. (1960). The Corporation: Will it Be Managed by Machines? In M.L. Anshen & G.L. Bach (Eds.), Management and Corporations, 1985. New York, NY: McGraw-Hill.
Spenner, Kenneth I. (1988). Technological Change, Skill Requirements, and Education: The Case for Uncertainty. In R.M. Cyert and D.C. Mowery (Eds.), The Impact of Technological Change on Employment and Economic Growth. Cambridge, MA: Ballinger Books.
Tinbergen, Jan (1974). Substitution of graduate by other labor. Kyklos, 27, 217–226.
Tinbergen, Jan (1975). Income Difference: Recent Research. Amsterdam: North-Holland Publishing Company.
Tolbert, Charles M., & Sizer, Molly (1996). U.S. Commuting Zones and Labor Market Areas: A 1990 Update. Economic Research Service, Staff Paper 9614.
Williams, Roberton (2008). Economic Stimulus: What is the Economic Stimulus Act of 2008? The Tax Policy Briefing Book, Tax Policy Center, Urban Institute and Brookings Institution.