
You may download the complete list of countries from the link below.
When visualizing or analyzing country panel data, the first step is usually to merge data from several sources. These sources often use different names for the same country. To address this, this post builds a comprehensive list of countries with their ISO2 and ISO3 codes and their continents. ISO2 and ISO3 are the two-letter and three-letter country codes standardized by the ISO (International Organization for Standardization). The list is compiled from World Bank (WB), International Monetary Fund (IMF), and United Nations (UN) data.
World Bank Country List
The following code comes from a previous post on the World Bank API. Here I retrieve the population data set and keep only the ISO2 code, the ISO3 code, the country name, and a flag variable marking that the observation comes from the World Bank. The list contains 217 countries.
import pandas as pd
import requests

def wbcall(ind):
    # Download every country/year observation for one World Bank indicator
    url = f'http://api.worldbank.org/v2/country/all/indicator/{ind}?per_page=30000&format=json'
    response = requests.get(url)
    # Parse the body only after a successful request; a valid payload is [metadata, observations]
    test = response.json() if response.status_code == 200 else []
    if len(test) < 2:
        print(f"Warning: No data for indicator {ind}")
        return pd.DataFrame()
    a = pd.DataFrame(test[1])
    # The 'country' column holds dicts such as {'id': ISO2 code, 'value': country name}
    a['iso2'] = a['country'].apply(lambda x: x.get('id') if isinstance(x, dict) else None)
    a['country_name'] = a['country'].apply(lambda x: x.get('value') if isinstance(x, dict) else None)
    a = a.drop(['indicator', 'country', 'unit', 'obs_status', 'decimal'], axis=1)
    a = a.rename(columns={'countryiso3code': 'iso3', 'value': ind})
    return a

wb_list = wbcall('SP.POP.TOTL')
wb_list = wb_list[['iso2','iso3', 'country_name']].drop_duplicates().reset_index(drop=True)
wb_list['WB']=1
wb_list = wb_list[49:].reset_index(drop=True) # drop the first 49 rows, which hold continents and other country groups rather than countries
wb_list.rename(columns={'iso2':'ISO2', 'iso3':'ISO3', 'country_name':'Country_WB'}, inplace=True)
print(wb_list.shape)
wb_list.head()
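
The positional slice wb_list[49:] works only as long as the API keeps returning the country groups first. A more defensive variant is to filter on metadata instead. The sketch below assumes that the World Bank country endpoint (/v2/country?format=json) tags groups with a region value of "Aggregates". The helper name drop_aggregates and the hand-written sample records are hypothetical and only mimic the shape of that response, so check the field names against a live call before relying on it.

```python
def drop_aggregates(countries):
    """Keep only records whose region is not 'Aggregates'.

    `countries` is assumed to be the second element of the JSON returned by
    the World Bank country endpoint: a list of dicts with 'id' (ISO3),
    'iso2Code', 'name', and a nested 'region' dict.
    """
    return [
        {"ISO2": c["iso2Code"], "ISO3": c["id"], "Country_WB": c["name"]}
        for c in countries
        if c.get("region", {}).get("value") != "Aggregates"
    ]


# Hand-written sample that imitates the assumed response shape (not live data)
sample = [
    {"id": "ARB", "iso2Code": "1A", "name": "Arab World",
     "region": {"id": "NA", "value": "Aggregates"}},
    {"id": "ABW", "iso2Code": "AW", "name": "Aruba",
     "region": {"id": "LCN", "value": "Latin America & Caribbean"}},
    {"id": "WLD", "iso2Code": "1W", "name": "World",
     "region": {"id": "NA", "value": "Aggregates"}},
]

countries_only = drop_aggregates(sample)
print(countries_only)
```

Filtering on a label rather than a row position means the result does not change if the API reorders its output or adds a new group.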
IMF Country List
The IMF country list comes from the World Economic Outlook (WEO), available at https://data.imf.org/en/datasets/IMF.RES:WEO. From this dataset I keep the ISO3 code, the country name, and a flag variable marking that the observation comes from the IMF. In my experience, many sources use XKX as the ISO3 code for Kosovo and PSE for the West Bank and Gaza, so I recode these two entries to make the later merge line up. The list contains 196 countries.
# parent_dir is assumed to be a pathlib.Path to the local data folder, defined earlier
imf_geo = pd.read_excel(parent_dir / 'IMF/IMF_WEO/(2025.10.05)weo_cleaned.xlsx')
imf_geo=imf_geo[['ISO3','Country']].drop_duplicates().reset_index(drop=True)
imf_geo.rename(columns={ 'Country':'Country_IMF'}, inplace=True)
imf_geo['IMF']=1
imf_geo.loc[imf_geo['Country_IMF'] == 'West Bank and Gaza', 'ISO3'] = 'PSE'
imf_geo.loc[imf_geo['Country_IMF'] == 'Kosovo', 'ISO3'] = 'XKX'
imf_geo.head()
UN Country List
The UN country list is available at https://geoportal.un.org/arcgis/apps/sites/#/geohub/datasets/21ba52fde4bf4a6989050d55c2fe967d/about. It provides a comprehensive list of countries along with their ISO3 codes and subregions. The list contains 238 entries. Note that the UN list covers not only sovereign states but also various territories of countries.
# file_path is assumed to be a pathlib.Path to the folder holding the downloaded UN CSV
un_geo = pd.read_csv(file_path / 'UN_list.csv')
un_geo = un_geo[['iso3cd', 'nam_en', 'subreg', 'intreg']].drop_duplicates().reset_index(drop=True)
un_geo.rename(columns={'iso3cd':'ISO3', 'nam_en':'Country_UN', 'subreg':'Subregion', 'intreg':'Region'}, inplace=True)
un_geo['UN'] = 1
# Keep only rows that have both a name and a subregion
un_geo = un_geo[un_geo['Country_UN'].notnull()]
un_geo = un_geo[un_geo['Subregion'].notnull()]
# Drop selected areas and islands that the UN data lists as separate entries
un_geo = un_geo[~un_geo['Country_UN'].isin(['Jammu and Kashmir', 'Gaza','Sint Eustatius', 'Saba', 'Galápagos Islands', 'Canary Islands','Chagos Archipelago', 'Madeira Island', 'Azores Islands', 'Saint Helena'])].reset_index(drop=True)
un_geo.head()
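
Before an outer merge on ISO3, it can help to confirm that the key is unique within each source, because a repeated key multiplies rows in the merged table. The following is a minimal sketch of such a check. The helper duplicated_keys and the demo frame are hypothetical and not part of the original workflow.

```python
import pandas as pd


def duplicated_keys(df, key="ISO3"):
    """Return the rows whose key value occurs more than once (missing keys ignored)."""
    keyed = df[df[key].notna()]
    return keyed[keyed[key].duplicated(keep=False)]


# Placeholder rows for illustration only, not taken from any of the three sources
demo = pd.DataFrame({
    "ISO3": ["FRA", "FRA", "DEU", None],
    "Country_UN": ["France", "France (copy)", "Germany", "No code"],
})

dupes = duplicated_keys(demo)
print(dupes)
```

An empty result for wb_list, imf_geo, and un_geo would indicate that each ISO3 code identifies at most one row in its source.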
Merge all three sources
The code below uses the pycountry_convert package to assign each country to a continent. The function converts the ISO3 code to ISO2 and then maps the ISO2 code to a continent.
## Assign continents
import pycountry_convert as pc
def get_continent_from_iso3(iso3):
    try:
        # pycountry_convert looks up continents by ISO2, so convert ISO3 first
        iso2 = pc.country_alpha3_to_country_alpha2(iso3)
        continent_code = pc.country_alpha2_to_continent_code(iso2)
        continent_map = {
            "AF": "Africa",
            "NA": "North America",
            "OC": "Oceania",
            "AN": "Antarctica",
            "AS": "Asia",
            "EU": "Europe",
            "SA": "South America"
        }
        return continent_map[continent_code]
    except Exception:
        # Codes the package cannot resolve (or missing codes) get no continent
        return None

Using this function, the following code merges all three sources and fills in certain missing values, such as those for Taiwan and Kosovo. I have also excluded several territories, including American Samoa and Bermuda, because they are not sovereign states.
## Merge all sources
geo = pd.merge(wb_list, imf_geo, on='ISO3', how='outer')
geo = pd.merge(geo, un_geo, on='ISO3', how='outer')
geo = geo[geo['Country_WB'].notna() | geo['Country_IMF'].notna()].reset_index(drop=True)
geo.loc[geo['Country_WB'] == 'Kosovo', 'Subregion'] = 'Eastern Europe'
geo.loc[geo['ISO3'] == 'TWN', 'ISO2'] = 'TW'
geo['country_name'] = geo['Country_WB']
geo.loc[geo['ISO3'] == 'TWN', 'country_name'] = 'Taiwan'
geo["Continent"] = geo["ISO3"].apply(get_continent_from_iso3)
geo.loc[geo['Country_WB'] == 'Kosovo', 'Continent'] = 'Europe'
geo.loc[geo['Country_WB'] == 'Timor-Leste', 'Continent'] = 'Asia'
geo = geo[geo['Continent'].notnull()].reset_index(drop=True)
# drop territories that are not sovereign states
geo = geo[~geo['Country_WB'].isin(["American Samoa", "Bermuda", "Curaçao", "Cayman Islands", "Faroe Islands", "Gibraltar", "Greenland", "Guam", "Isle of Man", "Saint Martin (French part)", "Northern Mariana Islands", "New Caledonia", "French Polynesia", "Turks and Caicos Islands", "British Virgin Islands", "United States Virgin Islands"])].reset_index(drop=True)
geo=geo[['ISO2', 'ISO3', 'country_name','Country_WB', 'Country_IMF', 'Country_UN', 'Continent', 'Subregion', 'Region', 'WB', 'IMF', 'UN']]
geo[35:45]
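
The manual continent fixes for Kosovo and Timor-Leste in the merge code could also be collected in one place and applied before the package lookup. The sketch below shows one possible arrangement. The names MANUAL_CONTINENTS, continent_with_overrides, and demo_resolver are hypothetical, and demo_resolver is only a stand-in so that the example runs without pycountry_convert installed.

```python
# Manual continent assignments, checked before the regular lookup.
# The two entries mirror the manual fixes in the merge code of this post.
MANUAL_CONTINENTS = {"XKX": "Europe", "TLS": "Asia"}


def continent_with_overrides(iso3, resolver, overrides=MANUAL_CONTINENTS):
    """Return the override for iso3 if one exists, else whatever resolver returns."""
    if iso3 in overrides:
        return overrides[iso3]
    try:
        return resolver(iso3)
    except Exception:
        # Unknown codes become None, matching get_continent_from_iso3
        return None


def demo_resolver(iso3):
    """Hypothetical stand-in for a package-backed lookup."""
    return {"FRA": "Europe", "KEN": "Africa"}[iso3]  # KeyError for anything else


for code in ["FRA", "XKX", "TLS", "ZZZ"]:
    print(code, continent_with_overrides(code, demo_resolver))
```

In the workflow above, the call would presumably look like geo['ISO3'].apply(lambda c: continent_with_overrides(c, get_continent_from_iso3)), which keeps every exception visible in a single dictionary.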
The final list contains 203 countries. When merging datasets against this list, use either ISO2 or ISO3 as the key column so that differences in country naming between sources no longer matter. The source flags (WB, IMF, UN) and the continent and subregion columns then make it easy to identify and keep only the countries you are interested in.
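
As a rough illustration of using the list as a merge key, the sketch below joins a small external table to a three-row stand-in for the final list. Both frames, the column named value, and the code ZZZ are made-up placeholders. The validate and indicator arguments of the pandas merge are used to guard against duplicate keys and to report codes that found no match.

```python
import pandas as pd

# Hypothetical three-row stand-in for the final country list
geo_demo = pd.DataFrame({
    "ISO3": ["KOR", "BRA", "KEN"],
    "country_name": ["Korea, Rep.", "Brazil", "Kenya"],
    "Continent": ["Asia", "South America", "Africa"],
})

# Hypothetical external data that spells country names differently
other = pd.DataFrame({
    "ISO3": ["KOR", "BRA", "ZZZ"],
    "name": ["South Korea", "Brasil", "Unknown"],
    "value": [1.0, 2.0, 3.0],  # placeholder numbers, not real data
})

merged = other.merge(
    geo_demo,
    on="ISO3",
    how="left",
    validate="many_to_one",  # raises if ISO3 is duplicated in geo_demo
    indicator=True,          # adds a _merge column: both / left_only / right_only
)

# Codes in the external data that are absent from the country list
unmatched = merged.loc[merged["_merge"] == "left_only", "ISO3"].tolist()
print(unmatched)
```

Matching on the code rather than the name means that "South Korea" and "Korea, Rep." line up without any manual renaming.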