The Coherence of US Cities

Diversified economies are critical for cities to sustain their growth and development, but they are also costly because diversification often requires expanding a city’s capability base. We analyze how cities manage this trade-off by measuring the coherence of the economic activities they support, defined as the technological distance between randomly sampled productive units in a city. We use this framework to study how the US urban system developed over almost two centuries, from 1850 to today. To do so, we rely on historical census data, covering over 600 million individual records to describe the economic activities of cities between 1850 and 1940, as well as 8 million patent records and detailed occupational and industrial profiles of cities for more recent decades. Despite massive shifts in the economic geography of the United States over this 170-year period, average coherence in its urban system remains unchanged. Moreover, across different time periods, datasets, and relatedness measures, coherence falls with city size at the exact same rate, pointing to constraints on diversification that are governed by a city’s size in universal ways.
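
The definition above (the distance between two randomly sampled productive units) can be illustrated with a minimal sketch. It assumes units are sampled independently in proportion to each activity's share of the city, and that a matrix of technological distances between activities is already available; the function name, the shares and the distance values are hypothetical and are not taken from the paper.

```python
import numpy as np

def expected_pairwise_distance(shares, distance):
    """Expected distance between two units drawn independently,
    each in proportion to its activity's share of the city."""
    s = np.asarray(shares, dtype=float)
    s = s / s.sum()                      # normalize shares to probabilities
    d = np.asarray(distance, dtype=float)
    return float(s @ d @ s)              # sum_ij s_i * s_j * d_ij

# Hypothetical city with three activities and made-up distances.
shares = [0.5, 0.3, 0.2]
distance = [[0, 1, 2],
            [1, 0, 1],
            [2, 1, 0]]

diversified = expected_pairwise_distance(shares, distance)
specialized = expected_pairwise_distance([1, 0, 0], distance)
print(diversified, specialized)
```

Under this reading, a fully specialized city has an expected distance of zero, and a smaller expected distance corresponds to a more coherent city.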

From Products to Capabilities: Constructing a Genotypic Product Space

Economic development is a path-dependent process in which countries accumulate capabilities that allow them to move into more complex products and industries. Inspired by a theory of capabilities that explains which countries produce which products, these diversification dynamics have been studied in great detail in the literature on economic complexity analysis. However, so far, these capabilities have remained latent and inference is drawn from product spaces that reflect economic outcomes: which products are often exported in tandem. Borrowing a metaphor from biology, such analysis remains phenotypic in nature. In this paper we develop a methodology that allows economic complexity analysis to use capabilities directly. To do so, we interpret the capability requirements of industries as a genetic code that shows how capabilities map onto products. We apply this framework to construct a genotypic product space and to infer countries’ capability bases. These constructs can be used to determine which capabilities a country would still need to acquire if it were to diversify into a given industry. We show that this information is valuable not only for predicting future diversification paths and advancing our understanding of economic development, but also for designing more concrete policy interventions that go beyond targeting products by identifying the underlying capability requirements.
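
The idea of reading off which capabilities a country still lacks for a given industry can be sketched as a simple comparison of two binary vectors. This is only an illustration: the capability names, the industry requirements and the country's capability base below are all hypothetical, and the paper's actual inference procedure is not reproduced here.

```python
import numpy as np

# Hypothetical capability list and industry requirement "codes".
capabilities = ["logistics", "precision_machining", "chemistry", "software"]
requirements = {
    "garments":        np.array([1, 0, 0, 0], dtype=bool),
    "pharmaceuticals": np.array([1, 0, 1, 0], dtype=bool),
    "robotics":        np.array([0, 1, 0, 1], dtype=bool),
}

def missing_capabilities(industry, capability_base):
    """Capabilities the industry requires that the country does not yet have."""
    needed = requirements[industry] & ~capability_base
    return [name for name, flag in zip(capabilities, needed) if flag]

# Hypothetical country that has logistics and software only.
base = np.array([1, 0, 0, 1], dtype=bool)
print(missing_capabilities("pharmaceuticals", base))
print(missing_capabilities("robotics", base))
```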

Eight Decades of Changes in Occupational Tasks, Computerization and the Gender Pay Gap

We build a new longitudinal dataset of job tasks and technologies by transforming the U.S. Dictionary of Occupational Titles (DOT, 1939-1991) and four books documenting occupational use of tools and technologies in the 1940s, into a database akin to, and comparable with its digital successor, the O*NET (1998-today). After creating a single occupational classification stretching between 1939 and 2019, we connect all DOT waves and the decennial O*NET databases into a single dataset, and we connect these with the U.S. Decennial Census data at the level of 585 occupational groups. We use the new dataset to study how technology changed the gender pay gap in the United States since the 1940s. We find that computerization had two counteracting effects on the pay gap: it simultaneously reduced it by attracting more women into better-paying occupations, and increased it through higher returns to computer use among men. The first effect closed the pay gap by 3.3 pp, but the second increased it by 5.8 pp, leading to a net widening of the pay gap.

The impact of return migration on employment and wages in Mexican cities

How does return migration from the US to Mexico affect local workers? Return migrants increase the local labor supply, potentially hurting local workers. However, having been exposed to a more advanced U.S. economy, they may also carry human capital that benefits non-migrants. Using an instrument based on involuntary return migration, we find that, whereas workers who share returnees’ occupations experience a fall in wages, workers in other occupations see their wages rise. These effects are, however, transitory and restricted to the city-industry receiving the returnees. In contrast, returnees permanently alter a city’s long-run industrial composition, by raising employment levels in the local industries that hire them.

Evaluating the Principle of Relatedness: Estimation, Drivers and Implications for Policy

A growing body of research documents that the size and growth of an industry in a place depends on how much related activity is found there. This fact is commonly referred to as the “principle of relatedness.” However, there is no consensus on why we observe the principle of relatedness, how best to determine which industries are related or how this empirical regularity can help inform local industrial policy. We perform a structured search over tens of thousands of specifications to identify robust – in terms of out-of-sample predictions – ways to determine how well industries fit the local economies of US cities. To do so, we use data that allow us to derive relatedness from observing which industries co-occur in the portfolios of establishments, firms, cities and countries. Different portfolios yield different relatedness matrices, each of which helps predict the size and growth of local industries. However, our specification search not only identifies ways to improve the performance of such predictions, but also reveals new facts about the principle of relatedness and important trade-offs between predictive performance and interpretability of relatedness patterns. We use these insights to deepen our theoretical understanding of what underlies path-dependent development in cities and expand existing policy frameworks that rely on inter-industry relatedness analysis.
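
Deriving relatedness from co-occurrence in portfolios can be sketched as follows. The normalization used here (co-occurrence count divided by the larger of the two industries' portfolio counts) is one common choice and is an assumption, not necessarily any of the specifications searched over in the paper; the presence matrix is made up.

```python
import numpy as np

# Hypothetical presence matrix: rows are portfolios (e.g. cities),
# columns are industries; 1 means the industry is present.
presence = np.array([[1, 1, 0],
                     [1, 1, 0],
                     [0, 1, 1],
                     [1, 0, 0]], dtype=float)

def relatedness_matrix(m):
    """Co-occurrence counts scaled by the larger industry count."""
    cooc = m.T @ m                          # portfolios containing both i and j
    counts = np.diag(cooc).copy()           # portfolios containing each industry
    larger = np.maximum.outer(counts, counts)
    phi = np.divide(cooc, larger, out=np.zeros_like(cooc), where=larger > 0)
    np.fill_diagonal(phi, 0.0)              # ignore self-relatedness
    return phi

def relatedness_density(phi, local_presence, industry):
    """Share of an industry's total relatedness that is present locally."""
    weights = phi[industry]
    total = weights.sum()
    return float(weights @ local_presence / total) if total > 0 else 0.0

phi = relatedness_matrix(presence)
# How well does industry 1 fit a city that hosts only industry 0?
density = relatedness_density(phi, np.array([1.0, 0.0, 0.0]), industry=1)
print(phi)
print(density)
```

A density near one means that most of the activity related to the industry is already present locally, which is the sense in which an industry "fits" a local economy in this sketch.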

What Can the Millions of Random Treatments in Nonexperimental Data Reveal About Causes?

We propose a new method to estimate causal effects from nonexperimental data. Each pair of sample units is first associated with a stochastic ‘treatment’ (differences in factors between units) and an effect (a resultant outcome difference). It is then proposed that all pairs can be combined to provide more accurate estimates of causal effects in nonexperimental data, given a statistical model relating combinatorial properties of treatments to the accuracy and unbiasedness of their effects. The article introduces one such model and a Bayesian approach to combine the O(n^2) pairwise observations typically available in nonexperimental data. This also leads to an interpretation of nonexperimental datasets as incomplete, or noisy, versions of ideal factorial experimental designs. This approach to causal effect estimation has several advantages: (1) it expands the number of observations, converting thousands of individuals into millions of observational treatments; (2) starting with treatments closest to the experimental ideal, it identifies noncausal variables that can be ignored in the future, making estimation easier in each subsequent iteration while departing minimally from experiment-like conditions; (3) it recovers individual causal effects in heterogeneous populations. We evaluate the method in simulations and the National Supported Work (NSW) program, an intensively studied program whose effects are known from randomized field experiments. We demonstrate that the proposed approach recovers causal effects in common NSW samples, as well as in arbitrary subpopulations and an order-of-magnitude larger supersample with the entire national program data, outperforming statistical, econometric and machine-learning estimators in all cases. As a tool, the approach also allows researchers to represent and visualize possible causes, and heterogeneous subpopulations, in their samples.
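
The pairing step described above can be sketched in a few lines: n units yield n(n-1)/2 unordered pairs, each carrying a factor difference and an outcome difference. The data below are made up, and treating pairs that differ in exactly one factor as "closest to the experimental ideal" is an assumption made for illustration; the statistical model and the Bayesian combination step from the article are not reproduced.

```python
import numpy as np

# Hypothetical sample: 4 units, 2 binary factors, one outcome.
factors = np.array([[0, 0],
                    [1, 0],
                    [0, 1],
                    [1, 1]])
outcome = np.array([1.0, 3.0, 1.5, 3.5])

# All unordered pairs (i, j) with i < j: n * (n - 1) / 2 of them.
i, j = np.triu_indices(len(outcome), k=1)

treatments = factors[j] - factors[i]   # factor differences within each pair
effects = outcome[j] - outcome[i]      # outcome differences within each pair

# Assumption: a pair resembles a controlled comparison when the two
# units differ in exactly one factor.
single_factor = np.count_nonzero(treatments, axis=1) == 1

print(len(effects), "pairs,", int(single_factor.sum()), "differ in one factor")
```

Because the number of pairs grows quadratically, a sample of a few thousand units already yields millions of such pairwise observations.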

The Economic Geography of the War in Ukraine

The war in Ukraine has been raging for a month now, not only causing human suffering on a massive scale, but also sending economic tremors that are felt far beyond the country’s borders. Since the collapse of the Soviet Union, Ukraine’s economy has been pulled between its strong historical ties with the Russian economy and the opportunities in forging new ties with the European Union (EU). With the help of Metroverse, an online tool for analyzing the local economies of over a thousand cities worldwide, and of the data that power this tool, we analyze the evolving economic relations between Ukraine, Russia and the West and weigh the consequences of their disruption.

Explore: The Economic Geography of the War in Ukraine 

Related reading:
Media Release
Bloomberg Opinion: Markets Need to Lose the ‘Peace in Our Time’ Reflex

The Node Vector Distance Problem in Complex Networks

We describe a problem in complex networks we call the Node Vector Distance (NVD) problem, and we survey algorithms currently able to address it. Complex networks are a useful tool to map a non-trivial set of relationships among connected entities, or nodes. An agent (e.g., a disease) can occupy multiple nodes at the same time and can spread through the edges. The node vector distance problem is to estimate the distance traveled by the agent between two moments in time. This is closely related to the Optimal Transportation Problem (OTP), which has received attention in fields such as computer vision. OTP solutions can be used to solve the node vector distance problem, but they are not the only valid approaches. Here, we examine four classes of solutions, showing their differences and similarities both on synthetic networks and real-world network data. The NVD problem has a much wider applicability than computer vision, being related to problems in economics, epidemiology, viral marketing, and sociology, to cite a few. We show how solutions to the NVD problem have a wide range of applications, and we provide a roadmap to general and computationally tractable solutions. We have implemented all methods presented in this article in a publicly available open source library, which can be used for result replication.
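
To make the problem concrete, the sketch below computes one possible distance between two node vectors on a small network, using the pseudoinverse of the graph Laplacian. This particular measure is chosen here only for illustration and is an assumption: the abstract does not say which four classes of solutions are examined, and this is not the article's library. The function name and the toy graph are hypothetical.

```python
import numpy as np

def laplacian_node_vector_distance(adjacency, p, q):
    """sqrt((p - q)^T L^+ (p - q)), where L^+ is the pseudoinverse
    of the graph Laplacian of an undirected, connected network."""
    a = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a
    l_pinv = np.linalg.pinv(laplacian)
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    diff = p / p.sum() - q / q.sum()      # compare normalized occupancy vectors
    return float(np.sqrt(max(diff @ l_pinv @ diff, 0.0)))

# Toy path network 0 - 1 - 2.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]

one_step = laplacian_node_vector_distance(adjacency, [1, 0, 0], [0, 1, 0])
two_steps = laplacian_node_vector_distance(adjacency, [1, 0, 0], [0, 0, 1])
print(one_step, two_steps)
```

On this path network, an agent moving from one end to the other is farther from its starting state than one that moved to the adjacent node, which is the kind of ordering a node vector distance is expected to capture.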

Knowledge Diffusion in the Network of International Business Travel

We use aggregated and anonymized information based on international expenditures through corporate payment cards to map the network of global business travel. We combine this network with information on the industrial composition and export baskets of national economies. The business travel network helps to predict which economic activities will grow in a country, which new activities will develop and which old activities will be abandoned. In statistical terms, business travel has the most substantial impact among a range of bilateral relationships between countries, such as trade, foreign direct investments and migration. Moreover, our analysis suggests that this impact is causal: business travel from countries specializing in a specific industry causes growth in that economic activity in the destination country. Our interpretation of this is that business travel helps to diffuse knowledge, and we use our estimates to assess which countries contribute or benefit the most from the diffusion of knowledge through global business travel.

The Value of Complementary Coworkers

As individuals specialize in specific knowledge areas, a society’s know-how becomes distributed across different workers. To use this distributed know-how, workers must be coordinated into teams that, collectively, can cover a wide range of expertise. This paper studies the interdependencies among co-workers that result from this process in a population-wide dataset covering educational specializations of millions of workers and their co-workers in Sweden over a 10-year period. The analysis shows that the value of what a person knows depends on whom that person works with. Whereas having co-workers with qualifications similar to one’s own is costly, having co-workers with complementary qualifications is beneficial. This co-worker complementarity increases over a worker’s career and offers a unifying framework to explain seemingly disparate observations, answering questions such as “Why do returns to education differ so widely?” “Why do workers earn higher wages in large establishments?” “Why are wages so high in large cities?”

Additional resources: Website | Podcast | Video | Media Release