Share structured common data in a pythonic way.
The source code of the modules in this package is generated by the make_code.py script, which queries miscellaneous sources.
The library provides pure data only; it does not feature any querying or rendering functionality. The data is meant to be imported into existing systems that have their own ways of rendering and querying data. This is a design choice.
An online version of this document is available at https://github.com/lsaffre/commondata
>>> from commondata.countries import COUNTRIES, FIELDS
>>> len(COUNTRIES)
195
These are the countries of the world:
>>> lst = ["{} ({})".format(c.name['en'], c.isoCode2) for c in COUNTRIES]
>>> txt = ", ".join(lst)
>>> from textwrap import fill
>>> print(fill(txt, width=78)) #doctest: +REPORT_UDIFF +NORMALIZE_WHITESPACE
Andorra (AD), United Arab Emirates (AE), Afghanistan (AF), Antigua and Barbuda
(AG), Albania (AL), Armenia (AM), Angola (AO), Argentina (AR), Austria (AT),
Australia (AU), Azerbaijan (AZ), Bosnia and Herzegovina (BA), Barbados (BB),
Bangladesh (BD), Belgium (BE), Burkina Faso (BF), Bulgaria (BG), Bahrain (BH),
Burundi (BI), Benin (BJ), Brunei (BN), Bolivia (BO), Brazil (BR), The Bahamas
(BS), Bhutan (BT), Botswana (BW), Belarus (BY), Belize (BZ), Canada (CA),
Democratic Republic of the Congo (CD), Central African Republic (CF), Republic
of the Congo (CG), Switzerland (CH), Ivory Coast (CI), Chile (CL), Cameroon
(CM), People's Republic of China (CN), Colombia (CO), Costa Rica (CR), Cuba
(CU), Cape Verde (CV), Cyprus (CY), Czech Republic (CZ), Germany (DE),
Djibouti (DJ), Dominica (DM), Dominican Republic (DO), Algeria (DZ), Ecuador
(EC), Estonia (EE), Egypt (EG), Eritrea (ER), Spain (ES), Ethiopia (ET),
Finland (FI), Fiji (FJ), Federated States of Micronesia (FM), France (FR),
Gabon (GA), United Kingdom (GB), Grenada (GD), Georgia (GE), Ghana (GH), The
Gambia (GM), Guinea (GN), Equatorial Guinea (GQ), Greece (GR), Guatemala (GT),
Guinea-Bissau (GW), Guyana (GY), Honduras (HN), Croatia (HR), Haiti (HT),
Hungary (HU), Indonesia (ID), Ireland (IE), Israel (IL), India (IN), Iraq
(IQ), Iran (IR), Iceland (IS), Italy (IT), Jamaica (JM), Jordan (JO), Japan
(JP), Kenya (KE), Kyrgyzstan (KG), Cambodia (KH), Kiribati (KI), Comoros (KM),
Saint Kitts and Nevis (KN), North Korea (KP), South Korea (KR), Kuwait (KW),
Kazakhstan (KZ), Laos (LA), Lebanon (LB), Saint Lucia (LC), Liechtenstein
(LI), Sri Lanka (LK), Liberia (LR), Lesotho (LS), Lithuania (LT), Luxembourg
(LU), Latvia (LV), Libya (LY), Morocco (MA), Monaco (MC), Moldova (MD),
Montenegro (ME), Madagascar (MG), Marshall Islands (MH), North Macedonia (MK),
Mali (ML), Myanmar (MM), Mongolia (MN), Mauritania (MR), Malta (MT), Mauritius
(MU), Maldives (MV), Malawi (MW), Mexico (MX), Malaysia (MY), Mozambique (MZ),
Namibia (NA), Niger (NE), Nigeria (NG), Nicaragua (NI), Kingdom of the
Netherlands (NL), Norway (NO), Nepal (NP), Nauru (NR), New Zealand (NZ), Oman
(OM), Panama (PA), Peru (PE), Papua New Guinea (PG), Philippines (PH),
Pakistan (PK), Poland (PL), Palestine (PS), Portugal (PT), Palau (PW),
Paraguay (PY), Qatar (QA), Romania (RO), Serbia (RS), Russia (RU), Rwanda
(RW), Saudi Arabia (SA), Solomon Islands (SB), Seychelles (SC), Sudan (SD),
Sweden (SE), Singapore (SG), Slovenia (SI), Slovakia (SK), Sierra Leone (SL),
San Marino (SM), Senegal (SN), Somalia (SO), Suriname (SR), South Sudan (SS),
São Tomé and Príncipe (ST), El Salvador (SV), Syria (SY), Eswatini (SZ), Chad
(TD), Togo (TG), Thailand (TH), Tajikistan (TJ), Timor-Leste (TL),
Turkmenistan (TM), Tunisia (TN), Tonga (TO), Turkey (TR), Trinidad and Tobago
(TT), Tuvalu (TV), Taiwan (TW), Tanzania (TZ), Ukraine (UA), Uganda (UG),
United States (US), Uruguay (UY), Uzbekistan (UZ), Vatican City (VA), Saint
Vincent and the Grenadines (VC), Venezuela (VE), Vietnam (VN), Vanuatu (VU),
Samoa (WS), Yemen (YE), South Africa (ZA), Zambia (ZM), Zimbabwe (ZW)
This is what we know about each country:
>>> FIELDS
('entity', 'name', 'isoCode2', 'isoCode3', 'zipCode', 'population')
Example:
>>> COUNTRIES[0]
Country(entity='Q228', name={'en': 'Andorra', 'de': 'Andorra', 'fr': 'Andorre', 'nl': 'Andorra', 'et': 'Andorra', 'bn': 'অ্যান্ডোরা', 'es': 'Andorra'}, isoCode2='AD', isoCode3='AND', zipCode=None, population='87097')
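Because the library ships no query helpers, a consuming application typically builds its own index. The following is a minimal sketch, assuming records shaped like the Country tuple shown above. The `Country` namedtuple and the `SAMPLE` list are stand-ins so that the snippet runs without commondata installed, and `index_by_iso2` and `population_of` are hypothetical helper names, not part of the library:

```python
from collections import namedtuple

# Stand-in for the Country records, using the field names listed in FIELDS.
Country = namedtuple(
    "Country", "entity name isoCode2 isoCode3 zipCode population")

# Illustrative sample; in real use, iterate over commondata.countries.COUNTRIES.
SAMPLE = [
    Country("Q228", {"en": "Andorra", "fr": "Andorre"},
            "AD", "AND", None, "87097"),
]


def index_by_iso2(countries):
    """Map two-letter ISO codes to country records."""
    return {c.isoCode2: c for c in countries}


def population_of(country):
    """Return the population as an int, or None when it is missing."""
    return int(country.population) if country.population else None


by_iso2 = index_by_iso2(SAMPLE)
andorra = by_iso2["AD"]
print(andorra.name["fr"], population_of(andorra))  # prints: Andorre 87097
```

Note that the population is stored as a string in the example record, so the sketch converts it before doing arithmetic.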
The COUNTRY2SCHEME dict in the commondata.peppolcodes module maps country codes to the Participant Identifier Scheme of their respective VAT office.
The make_code.py script gets this data from https://docs.peppol.eu/edelivery/codelists
>>> from commondata.peppolcodes import COUNTRY2SCHEME
>>> COUNTRY2SCHEME['BE']
'9925'
>>> COUNTRY2SCHEME['EE']
'9931'
Not every country has an Electronic Address Scheme:
>>> COUNTRY2SCHEME['US']
Traceback (most recent call last):
...
KeyError: 'US'
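Since COUNTRY2SCHEME is a plain dict, callers that must handle countries without a scheme can use `dict.get` instead of catching the KeyError. A small sketch, using a stand-in dict with two entries copied from the examples above; `scheme_for` is a hypothetical helper name:

```python
# Stand-in for commondata.peppolcodes.COUNTRY2SCHEME (two entries only).
COUNTRY2SCHEME = {"BE": "9925", "EE": "9931"}


def scheme_for(country_code, default=None):
    """Return the scheme code for a country, or `default` if it has none."""
    return COUNTRY2SCHEME.get(country_code, default)


print(scheme_for("BE"))  # prints: 9925
print(scheme_for("US"))  # prints: None
```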
Here is a list of the Peppol countries:
>>> " ".join(sorted(COUNTRY2SCHEME.keys()))
'AD AL AT BA BE BG CH CY CZ DE EE ES FI FR GB GR HR HU IE IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR VA international'
This is used by Lino, see https://dev.lino-framework.org/topics/peppol.html#electronic-address-scheme
The following snippet was used to generate the NAT2EAS.DBC file used by TIM:
>>> for k in sorted(COUNTRY2SCHEME.keys()):
...     print(f"{COUNTRY2SCHEME[k]}|{k}")
9922|AD
9923|AL
9914|AT
9924|BA
9925|BE
9926|BG
9927|CH
9928|CY
9929|CZ
9930|DE
9931|EE
9920|ES
0213|FI
9957|FR
9932|GB
9933|GR
9934|HR
9910|HU
9935|IE
9906|IT
9936|LI
9937|LT
9938|LU
9939|LV
9940|MC
9941|ME
9942|MK
9943|MT
9944|NL
9909|NO
9945|PL
9946|PT
9947|RO
9948|RS
9955|SE
9949|SI
9950|SK
9951|SM
9952|TR
9953|VA
9912|international
The DELIVERY_UNITS dict in the commondata.peppolcodes module contains the codes that are allowed in the unitCode attribute of an InvoicedQuantity element. These codes are specified by UNECERec20.
The make_code.py script gets this data from the OpenPEPPOL repository.
>>> from commondata.peppolcodes import DELIVERY_UNITS
The DELIVERY_UNITS dict contains many codes:
>>> len(DELIVERY_UNITS)
2162
And some of them are funny:
>>> DELIVERY_UNITS['14']
('shot', 'A unit of liquid measure, especially related to spirits.')
I wondered what the code for "hour" is:
>>> for k, v in DELIVERY_UNITS.items():
...     if v[0].lower() == "hour":
...         print(k)
HUR
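The same search can be wrapped in a reusable helper. This is only a sketch: the dict below is a stand-in holding a few codes that appear in this document, the empty descriptions are placeholders (they are not shown here), and `codes_for` is a hypothetical name:

```python
# Stand-in for commondata.peppolcodes.DELIVERY_UNITS: code -> (name, description).
DELIVERY_UNITS = {
    "14": ("shot", "A unit of liquid measure, especially related to spirits."),
    "HUR": ("hour", ""),   # description omitted in this stand-in
    "MON": ("month", ""),  # description omitted in this stand-in
}


def codes_for(name, units=DELIVERY_UNITS):
    """Return all codes whose unit name equals `name`, ignoring case."""
    wanted = name.lower()
    return [code for code, (unit_name, _desc) in units.items()
            if unit_name.lower() == wanted]


print(codes_for("Hour"))  # prints: ['HUR']
```

Returning a list rather than a single code keeps the helper usable if several codes happen to share a name.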
Here are some of the more commonly used units:
>>> for i in "HUR MIN MON LTR CLT DLT KGM XPP XPK XBX MTR MTK MTQ B68".split():
...     print(i, DELIVERY_UNITS[i][0])
HUR hour
MIN minute [unit of time]
MON month
LTR litre
CLT centilitre
DLT decilitre
KGM kilogram
XPP Piece
XPK Package
XBX Box
MTR metre
MTK square metre
MTQ cubic metre
B68 gigabit
>>> from commondata.places.estonia import PLACES, COUNTIES
>>> len(PLACES)
4564
>>> len(COUNTIES)
15
>>> for county in COUNTIES:
...     print(county.name, ":", ", ".join([p.name for p in county.children]))
Harju : Tallinn, Ääsmäe, Loksa, Vasalemma, Nissi, Saku, Saue, Viimsi, Raasiku, Jõelähtme, Maardu, Rae, Harku, Keila, Anija, Kehra, Kiili, Paldiski, Kose, Padise, Kõue, Kuusalu, Kernu, Aegviidu, Kaasiku, Kibuna, Vahastu, Vansi, Vikipalu, Jägala-Joa, Kersalu, Haapse, Jõesuu, Pohla, Andineeme
Pärnu : Pärnu, Halinga, Tootsi, Vändra, Tori, Tõstamaa, Tahkuranna, Sauga, Paikuse, Sindi, Audru, Häädemeeste, Kilingi-Nõmme, Are, Lavassaare, Varbla, Saarde, Surju, Kihnu, Koonga, Metsaääre, Aruvälja
Rapla : Vigala, Rapla, Kehtna, Märjamaa, Järvakandi, Juuru, Kaiu, Käru, Kohila, Raikküla
Hiiu : Kärdla, Käina, Kõrgessaare, Pühalepa, Emmaste
Ida-Viru : Lohusuu, Sonda, Toila, Tudulinna, Sillamäe, Püssi, Lüganuse, Vaivara, Narva, Avinurme, Narva-Jõesuu, Kohtla-Järve, Aseri, Jõhvi, Iisaku, Kiviõli, Alajõe, Kohtla-Nõmme, Maidla, Mäetaguse, Kohtla, Illuka
Jõgeva : Torma, Põltsamaa, Tabivere, Mustvee, Jõgeva, Palamuse, Puurmani, Saare, Kasepää, Pajusi, Pala, Vägeva
Järva : Türi, Roosna-Alliku, Paide, Väätsa, Ambla, Järva-Jaani, Koeru, Kareda, Albu, Imavere, Koigi, Kolu
Lääne : Lihula, Risti, Ridala, Haapsalu, Hanila, Taebla, Oru, Vormsi, Martna, Noarootsi, Nõva, Kullamaa
Lääne-Viru : Tapa, Rakvere, Vinni, Tamsalu, Rakke, Väike-Maarja, Sõmeru, Vihula, Haljala, Kunda, Kadrina, Laekvere, Viru-Nigula, Eisma
Põlva : Räpina, Põlva, Veriora, Kanepi, Ahja, Kõlleste, Vastse-Kuuste, Värska, Mikitamäe, Mooste, Orava, Valgjärve, Laheda
Saare : Leisi, Salme, Kaarma, Orissaare, Kärla, Kihelkonna, Kuressaare, Valjala, Lümanda, Pöide, Pihtla, Torgu, Mustjala, Laimjala, Muhu, Ruhnu
Tartu : Tartu, Luunja, Ülenurme, Haaslava, Rõngu, Kambja, Elva, Nõo, Kallaste, Puhja, Alatskivi, Mäksa, Tähtvere, Konguta, Rannu, Laeva, Võnnu, Peipsiääre, Meeksi, Vara, Piirissaare, Vehendi, Kriimani, Illi, Neemisküla
Valga : Valga, Tõrva, Otepää, Puka, Õru, Tõlliste, Sangaste, Karula, Helme, Taheva, Põdrala, Palupera, Hummuli
Viljandi : Suure-Jaani, Abja, Abja-Paluoja, Viljandi, Võhma, Mõisaküla, Viiratsi, Halliste, Karksi, Karksi-Nuia, Kolga-Jaani, Pärsti, Tarvastu, Saarepeedi, Paistu, Kõpu, Kõo, Soe
Võru : Vastseliina, Võru, Antsla, Varstu, Sõmerpalu, Rõuge, Mõniste, Haanja, Urvaste, Lasva, Misso, Meremäe, Kirumpää, Navi, Meegomäe
Note: The data about Estonian places is currently out of date by several years. We plan to maintain it in collaboration with https://maaamet.ee/ruumiandmed-ja-kaardid/aadressid-ja-kohanimed/kohanimeregister
Until March 2024 this was a namespace package, and country-specific data was contained in individual subpackages. The following packages are now obsolete:
- commondata.be: Common data about Belgium
- commondata.ee: Common data about Estonia
- commondata.eg: Common data about Egypt
How to uninstall the old commondata packages: find your site-packages directory (e.g. ~/env/lib/python3.10/site-packages) and manually remove all files matching commondata*-nspkg.pth
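The manual search can also be scripted. The sketch below only lists the matching files and deletes nothing; it assumes that the running interpreter is the one whose site-packages directory holds the old packages, and `find_old_nspkg_files` is a hypothetical helper name:

```python
import sysconfig
from pathlib import Path


def find_old_nspkg_files(site_packages=None):
    """List leftover commondata*-nspkg.pth files without deleting anything."""
    if site_packages is None:
        # "purelib" is the site-packages directory of the running interpreter.
        site_packages = sysconfig.get_paths()["purelib"]
    return sorted(Path(site_packages).glob("commondata*-nspkg.pth"))


for path in find_old_nspkg_files():
    # Review the list first; each file can then be removed with path.unlink().
    print(path)
```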
The remaining part of this document is obsolete but still valid.
How to use the Place and PlaceGenerator classes.
You define a subclass of Place for each "type" of place:
>>> from commondata.utils import Place, PlaceGenerator
>>> class PlaceInFoo(Place):
...     def __str__(self):
...         return self.name
>>> class Kingdom(PlaceInFoo):
...     value = 1
>>> class County(PlaceInFoo):
...     value = 2
>>> class Borough(PlaceInFoo):
...     value = 3
>>> class Village(PlaceInFoo):
...     value = 3
The PlaceGenerator is used to instantiate places and populate the hierarchy.
Part 1: configuration
>>> pg = PlaceGenerator()
>>> pg.install(Kingdom, County, Borough, Village)
>>> pg.set_args('name')
Part 2: filling data
>>> root = pg.kingdom("Kwargia")
>>> def fill(pg):
...     pg.county("Kwargia")
...     pg.borough("Kwargia")
...     pg.village("Virts")
...     pg.village("Vinks")
...     pg.county("Gorgia")
...     pg.village("Girts")
...     pg.village("Ginks")
>>> fill(pg)
Part 3: using the data
>>> [str(x) for x in root.children]
['Kwargia', 'Gorgia']
>>> kwargia = root.children[0]
>>> [str(x) for x in kwargia.children]
['Kwargia', 'Virts', 'Vinks']
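The examples above rely only on each place having a name and a list of children, so the whole tree can be walked recursively. Here is a sketch using a stand-in `Node` class instead of the real Place class (so that it runs without commondata); `walk` is a hypothetical helper, and the tree mirrors the one built by the fill() function above, assuming the villages under Gorgia are attached the same way as those under Kwargia:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a place: just a name and a list of child nodes."""
    name: str
    children: list = field(default_factory=list)


def walk(node, depth=0):
    """Yield (depth, name) pairs for a node and all of its descendants."""
    yield depth, node.name
    for child in node.children:
        yield from walk(child, depth + 1)


root = Node("Kwargia", [
    Node("Kwargia", [Node("Kwargia"), Node("Virts"), Node("Vinks")]),
    Node("Gorgia", [Node("Girts"), Node("Ginks")]),
])

# Print the hierarchy, indenting each level by two spaces.
for depth, name in walk(root):
    print("  " * depth + name)
```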
You use the commondata.utils.PlaceGenerator.set_args() method to specify the names of the fields of subsequent places.
>>> pg = PlaceGenerator()
>>> pg.install(Kingdom, County, Borough, Village)
>>> pg.set_args('name name_ar')
>>> root = pg.kingdom("Egypt", u'مصر')
>>> print(root.name_ar)
مصر
2025-06-13 I wondered why Kosovo (XK) is not in our list. It seems that it is not marked as a sovereign_state in Wikidata. But after running make_docs.py I noticed that Bangladesh (BD) has vanished from the list. I don't know why. I don't plan to dig deeper into this because I believe we should rather deprecate this project and start using pycountries.
In passing, I fixed a broken link for Peppol in make_docs.py.