WHO mortality data

From Testiwiki

WHO mortality data is a study by WHO to collect information about numbers of deaths in different countries. See the model file.


What are the numbers of deaths per year, country, sex, age group, and diagnosis?



Conditions on the use of the WHO Mortality database

The data available on this web site comprise deaths registered in national vital registration systems, with underlying cause of death as coded by the relevant national authority. These data are official national statistics in the sense that they have been transmitted to the World Health Organization by the competent authorities of the countries concerned. Each Member State reports population data along with their mortality data, for the population covered by the death registration system. Where this is a subset of the national population, the data is labelled accordingly in the WHO Mortality Database, e.g. Brazil (North and North-east) or Paraguay (reporting areas). However, the completeness of death registration may also be less than 100% for the specified registration population.

Note that vital registration data may be 100% complete for the population covered, but not include full coverage of deaths in the country. Caution should therefore be taken when doing inter-country comparisons.

Death registration coverage and cross-national differences in coding practices, particularly in the use of codes for ill-defined and unknown causes, must also be taken into account to validly compare mortality rates for specific causes across countries.

The designations employed and the presentation of material in the MDB do not imply the expression of any opinion whatsoever on the part of the World Health Organization or other parties involved in the MDB concerning the legal status of any country, territory or area, its authorities, its current or former official name, or the delimitation of its frontiers or boundaries. Accordingly: a) strictly for purposes of statistical use, denominations are used which, although applicable at one particular time, may not reflect correct terminology at some other point in the historical context in which they are so used; b) references to "former" entities refer to countries that formerly existed under those names, or abbreviations; and c) in some cases, denominations are used to refer to countries as they currently exist and, when used with respect to data relating to before the existence of these countries as independent states, to sub-national entities of formerly existing larger countries.

WHO asks users to cooperate in the provision of electronically transmitted data by adhering to the following guidelines:

(a) Material drawn from the MDB for publication must be accompanied by an acknowledgment of WHO as the source and a disclaimer crediting analyses, interpretations or conclusions to the author of the published data and not to WHO, which is responsible only for the provision of the original information.

(b) Users wishing to publish a technical description or qualification of the data will make a reasonable effort to ensure that it is not inconsistent with any published by WHO.

(c) Recipients of electronically transmitted data wishing to, or asked to make these, or copies thereof available to a third party are asked to refer such party to WHO, who will transmit the data directly accompanied with the necessary documentation. This will prevent circulation of out-of-date data, as the MDB is updated regularly.

It should be noted that these data are transmitted on the understanding that no use will be made of them for commercial purposes and that no such permission or right to use may be implied thereby.

Responsible person:

Dr Kenji Shibuya
Department of Measurement and Health Information Systems
Health Statistics and Evidence
World Health Organization
CH-1211 Geneva 27
e-mail shibuyak[at]who.int


  • No upstream variables have been defined.
  • If mortality rates (per person-year) are needed, this data should be merged with the population data.
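A minimal sketch of such a merge, assuming both tables can be keyed by (country, year, sex, age group) and that the reported population is an acceptable stand-in for person-years. The counts, the age group label, and the function name mortality_rates are made up for illustration; they are not taken from the WHO files.

```python
# Hypothetical records keyed by (country code, year, sex, age group).
# All counts below are invented for illustration only.
deaths = {
    ("4010", 2000, 1, "65-69"): 120,
    ("4010", 2000, 2, "65-69"): 80,
}
population = {
    ("4010", 2000, 1, "65-69"): 40000,
    ("4010", 2000, 2, "65-69"): 50000,
}


def mortality_rates(deaths, population, per=100_000):
    """Return deaths per `per` person-years for keys found in both tables.

    Assumption: the population figure approximates person-years at risk.
    """
    rates = {}
    for key, n_deaths in deaths.items():
        pop = population.get(key)
        if not pop:  # missing or zero population: rate is undefined, skip
            continue
        rates[key] = n_deaths * per / pop
    return rates


rates = mortality_rates(deaths, population)
for key, rate in sorted(rates.items()):
    print(key, rate)
```

Cells without a matching population row are skipped rather than guessed, which keeps incomplete registration visible instead of hiding it.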


Numbers of deaths.


The mortality data is actually quite complex. One could assume that it is a simple country*ICD code*age group*sex*year array, but

  • different ICD code groupings have been used in different countries
  • different age group categories have been used in different countries
  • different observation years are covered in different countries.

Therefore, this is not a nice 5D cube. Instead, there are lots of merged and empty cells in the cube. There should be a plan for how this is organised. The current idea:

  • Analyse the data for each country to identify the age, ICD, and year locations used.
  • Create indices for all the different variations.
  • On the database level,
    • describe which locations in which indices are equal.
    • describe which locations are mutually exclusive and jointly exhaustive subsets of another location.
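The two database-level descriptions could be derived automatically for age groups if each location is stored as an age range. The sketch below is only one possible approach: the two indices, their labels, and the functions tiles and relate are hypothetical, and the ranges are half-open intervals [start, end) in years.

```python
# Hypothetical age-group locations in two indices (labels are illustrative).
# Each location covers the half-open age range [start, end) in years.
index_a = {"0-4": (0, 5), "5-14": (5, 15), "15-24": (15, 25)}
index_b = {"0": (0, 1), "1-4": (1, 5), "5-14": (5, 15)}


def tiles(parts, lo, hi):
    """True if the sorted ranges in `parts` cover [lo, hi) without gaps or overlaps."""
    pos = lo
    for start, end in parts:
        if start != pos:
            return False
        pos = end
    return bool(parts) and pos == hi


def relate(fine, coarse):
    """Return (equal locations, coarse locations fully split into fine ones)."""
    equal = [(f, c) for f, fr in fine.items()
             for c, cr in coarse.items() if fr == cr]
    partitions = {}
    for c, (lo, hi) in coarse.items():
        # Fine locations that lie strictly inside this coarse location.
        inside = sorted(
            (r, f) for f, r in fine.items()
            if lo <= r[0] and r[1] <= hi and r != (lo, hi)
        )
        if tiles([r for r, _ in inside], lo, hi):
            partitions[c] = [f for _, f in inside]
    return equal, partitions


equal, partitions = relate(index_b, index_a)
print(equal)
print(partitions)
```

ICD code groupings would need an explicit mapping table instead of numeric ranges, but the same two outputs (equal locations and exhaustive partitions) would still be what the database has to store.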

The data contains the following locations:

Country (104 countries available): 1125 1300 1360 1365 1400 1430 2005 2010 2020 2025 2030 2040 2045 2050 2070 2085 2090 2110 2120 2130 2140 2150 2160 2170 2180 2190 2210 2230 2240 2260 2270 2300 2310 2320 2340 2350 2360 2370 2380 2385 2400 2410 2420 2430 2440 2445 2450 2455 2460 2470 3020 3030 3080 3090 3150 3160 3190 3255 3320 3325 3380 4010 4012 4018 4038 4045 4050 4055 4070 4080 4084 4085 4150 4160 4180 4182 4184 4186 4188 4190 4200 4210 4220 4230 4240 4260 4270 4272 4273 4274 4276 4280 4290 4300 4303 4308 4310 4320 4330 4335 4350 5020 5105 5150

Admin1, Subdiv: None

Year: 1996..2007

List: 101 103 104 10M

ICD-10 codes: 10584 different codes. Some of these are combinations of several ICD codes, listed as numbers.

Sex: 1,2,9 (Male, Female, Unknown)

Format (for age groupings): 0,1,2,4,7

IM format (for infant mortality): 1,2,8

Structuring the data in the database

For a single country, it is straightforward to use indices that contain only the rows for which that country has data. This leads to 1 sex index, several year indices, 3 infant mortality indices, 5 non-infant mortality indices, and 4 different ICD indices. However, the whole study then becomes extremely complex, with at least 14 dimensions and a huge number of empty cells.

What we need is a system that can aggregate and disaggregate data from one index to another. Aggregation is straightforward, but disaggregation requires data from other countries; this is used if no disaggregation data is available for the particular country. Should we use a Dirichlet distribution for disaggregation?
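As a rough illustration of the idea, the sketch below aggregates by summing, and disaggregates by splitting a coarse count according to the posterior mean of a Dirichlet distribution. This is one possible reading of the Dirichlet question, not a decided method: the prior comes from pooled counts of other countries scaled to prior_strength pseudo-counts, and the country's own counts (if any) update it. All function names, labels, and numbers are hypothetical.

```python
def aggregate(fine_counts, partition):
    """Sum fine counts into coarse locations.

    `partition` maps each coarse location to the fine locations it contains.
    """
    return {c: sum(fine_counts.get(f, 0) for f in parts)
            for c, parts in partition.items()}


def dirichlet_mean_shares(own_counts, reference_counts, prior_strength=10.0):
    """Posterior mean shares of a Dirichlet over the fine locations.

    Prior: shares in the reference (other countries) data, scaled so that
    they sum to `prior_strength` pseudo-counts (an assumed tuning value).
    Data: the country's own fine counts, which may be empty.
    """
    ref_total = sum(reference_counts.values())
    if ref_total <= 0:
        raise ValueError("reference counts must have a positive total")
    posterior = {
        k: prior_strength * v / ref_total + own_counts.get(k, 0)
        for k, v in reference_counts.items()
    }
    total = sum(posterior.values())
    return {k: v / total for k, v in posterior.items()}


def disaggregate(coarse_count, shares):
    """Split one coarse count into expected fine counts."""
    return {k: coarse_count * s for k, s in shares.items()}


# Invented example: deaths in a coarse age group 0-4 split into 0 and 1-4.
combined = aggregate({"0": 3, "1-4": 2, "5-14": 1}, {"0-4": ["0", "1-4"]})
shares = dirichlet_mean_shares({}, {"0": 30, "1-4": 10})
split = disaggregate(200, shares)
print(combined, shares, split)
```

With no own data the split simply follows the reference shares; as own counts accumulate they outweigh the prior. Sampling from the Dirichlet instead of using its mean would carry the uncertainty of the split forward, at the cost of non-deterministic results.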

Can the aggregation and disaggregation be done at the Base level? Or maybe we need an Analytica model that is uploaded to AWP, and that takes care of the (dis)aggregation. This sounds better.


So far, only ICD-10 compliant data has been uploaded. The years present in the data depend on when a given country updated its reporting standards.


See also