The canonical correlation complexity method
Önder Nomaler & Bart Verspagen
#2022-015
A relatively recent, yet rapidly proliferating strand of literature in
the so-called econophysics domain, known as 'economic complexity',
introduces a toolkit to analyse the relationship between specialization,
diversification, and economic development. Several methods have been
proposed to reduce the high dimensionality of data on the empirical
patterns of co-location of specializations (whether across nations or
regions). In machine-learning terms, the existing algorithms follow the
framework of 'unsupervised learning'. The
competing alternatives (e.g., Hidalgo and Hausmann, 2009 vs. Tacchella
et al., 2012) rest on very different assessments of which
products depend on more complex capabilities, and accordingly yield
widely diverging estimates of complexity at the product level. The
approach that we developed avoids this algorithmic 'confusion' by
drawing on a toolkit of more transparent and long-established methods
that follow the 'supervised learning' principle, in which the data on
trade/specialization and development are processed together from the
very beginning in order to identify the patterns of mutual association.
The first pillar of the toolkit, Principal Component Analysis (PCA),
reduces the dimensionality of the co-location information. The second
pillar, Canonical Correlation Analysis (CCA), identifies the
mutual association between the various patterns of (co-)specialization
and more than one dimension of economic development. This way, we are
able to identify the products or technologies that can be associated
with the level or the growth rate of per capita GDP and CO2 emissions.
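The two-pillar idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a synthetic country-by-product specialization matrix S and a two-column development matrix D (standing in for, say, per capita GDP and CO2 emissions). All names, sizes, and the number of retained components are hypothetical, and CCA is computed here with a standard QR-plus-SVD route in NumPy.

```python
import numpy as np

def pca_scores(X, k):
    """Return scores on the first k principal components and their loadings."""
    Xc = X - X.mean(axis=0)                      # centre each column
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, Vt[:k]

def canonical_correlations(A, B):
    """Canonical correlations between two data blocks (QR + SVD route).

    Assumes both centred blocks have full column rank.
    """
    Qa, _ = np.linalg.qr(A - A.mean(axis=0))     # orthonormal basis of block A
    Qb, _ = np.linalg.qr(B - B.mean(axis=0))     # orthonormal basis of block B
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)

# Synthetic data (assumption): two latent factors drive both blocks.
rng = np.random.default_rng(0)
n_countries, n_products, k = 60, 40, 5
latent = rng.normal(size=(n_countries, 2))
S = latent @ rng.normal(size=(2, n_products)) \
    + 0.5 * rng.normal(size=(n_countries, n_products))
D = latent @ rng.normal(size=(2, 2)) \
    + 0.1 * rng.normal(size=(n_countries, 2))

# Pillar 1: PCA compresses the specialization patterns.
scores, loadings = pca_scores(S, k)

# Pillar 2: CCA relates the compressed patterns to the development block.
rho = canonical_correlations(scores, D)
print("canonical correlations:", rho)
```

In this sketch the number of canonical correlations equals the smaller of the two block widths, here the two development dimensions. Keeping k small relative to the number of countries is a design choice that avoids trivially perfect correlations.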
Keywords: Economic complexity, economic development, supervised
learning, canonical correlation analysis, principal component analysis
JEL Classification: F14, F63, O11