Positive matrix factorisation

The text on this page is taken from an equivalent page of the IEHIAS-project.

Positive Matrix Factorisation (PMF) is a statistical factor analysis method, based on the law of mass conservation. By analysing measured concentrations at a series of measurement locations, the method first identifies a set of factors which can be taken to represent major emission sources. Scores on these factors are then regressed against the concentrations to estimate the contributions from each source.



PMF is used to analyse the contributions of different sources to measured concentrations or loads of pollutants in the environment at receptor locations. It is useful, especially, where detailed data do not exist on the composition of the main emission sources, but where large numbers of sampled data are available on ambient concentrations.


A major advantage of PMF is that the methodology can be applied without the need for data on source emission compositions. The methodology can also help to identify missing sources, and can handle missing data or measurements below the detection limit, but it requires information on uncertainties in the measurements of pollutant loads at the sampled receptors. In addition, the models are constrained to non-negative species concentrations and source contributions.
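How missing values and measurements below the detection limit are turned into concentration and uncertainty inputs is not specified on this page. The sketch below illustrates one convention found in PMF practice: below-limit values are replaced by half the detection limit with an uncertainty of 5/6 of the limit, and missing values by the species median with an uncertainty of four times the median. The function name `prepare_pmf_inputs`, the `rel_error` parameter and all numerical factors are assumptions made for illustration and should be checked against the guidance for the PMF tool actually used.

```python
import numpy as np


def prepare_pmf_inputs(conc, detection_limit, rel_error=0.1):
    """Build concentration (C) and uncertainty (S) arrays for a PMF run.

    conc            : samples x species array, np.nan marks missing values
    detection_limit : per-species detection limits (length = number of species)
    rel_error       : assumed relative measurement error (hypothetical default)
    """
    conc = np.asarray(conc, dtype=float)
    dl = np.broadcast_to(np.asarray(detection_limit, dtype=float), conc.shape)

    missing = np.isnan(conc)
    # Treat missing cells as +inf so they are never flagged as below the limit.
    below = np.where(missing, np.inf, conc) <= dl

    C = conc.copy()
    # Assumed default uncertainty: relative error combined with half the limit.
    S = np.sqrt((rel_error * conc) ** 2 + (0.5 * dl) ** 2)

    # Assumed convention for values at or below the detection limit.
    C[below] = 0.5 * dl[below]
    S[below] = (5.0 / 6.0) * dl[below]

    # Assumed convention for missing values: species median, large uncertainty.
    median = np.broadcast_to(np.nanmedian(conc, axis=0), conc.shape)
    C[missing] = median[missing]
    S[missing] = 4.0 * median[missing]
    return C, S
```

The large uncertainties assigned to replaced values mean that those cells carry little weight in the fit, which is how a weighted method can tolerate incomplete data without discarding whole samples.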


The method makes the following assumptions:

  • Composition of the emission sources is constant over the period of sampling at the receptors;
  • Chemical species used in PMF do not interact with each other and their concentrations are linearly additive;
  • Source profiles (f_pj) are linearly independent of each other;
  • The number of species (j) is greater than or equal to the number of sources (p);
  • Marker elements (tracers) for each source should be included;
  • There are many more samples than source types for statistically meaningful calculations.

Weaknesses and limitations

  • PMF models require large datasets on measured concentrations (preferably >100 samples);
  • Analysis is limited by the accuracy, precision, and range of species measured at the receptor (e.g. ambient monitoring) sites;
  • A determination must be made of how many 'factors' to retain;
  • Emission sources have to be deduced by interpreting these factors;
  • Information on actual source profiles (measured or previously published) is needed to verify the representativeness of the calculated source profiles and the uncertainties in the estimated source contributions;
  • The method depends on many pre-set parameters, initial conditions and model inputs, and the results are sensitive to these choices.


For exposure assessment, the samples analysed must be representative in both time and space.

Method explanation


PMF models require data on measured concentrations (of species/elements) for a number of samples, together with information on the associated uncertainties. Where appropriate (e.g. when analysing ambient PM samples), information on meteorological parameters and concentrations of associated gaseous species may also be used.


Output from a PMF model comprises:

  • a set of factors representing the source profiles of major groups of emission sources;
  • estimates of the contribution from each of these sources (and their associated uncertainties).


PMF, like other multivariate receptor models, is based on the analysis of the correlation between measured concentrations of chemical species, assuming that highly correlated compounds come from the same source. The PMF approach has been developed to resolve problems occurring in standard Principal Components Analysis (e.g. negative solutions, and the inability to include uncertainty estimates or deal with missing data), and to enable source contributions to be assessed when detailed information on source profiles is lacking.

Output from the PMF model is a set of factors representing source profiles and estimates of their associated contributions to measured concentrations at the sampled receptor sites. Interpretation of the factors (i.e. allocation to named source types) has to be done by reference to information on source emissions, derived from literature and/or available measured data.


The PMF model assumes that measured concentrations at one or more receptor sites can be explained as the product of a source profile matrix and a source contribution matrix. The two matrices are obtained by an iterative minimisation algorithm: PMF involves constrained minimisation of a weighted object function.
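The factorisation step can be illustrated with a small NumPy sketch. It is not the algorithm used by any particular PMF program: it swaps in weighted multiplicative updates, a simple technique from non-negative matrix factorisation, which keep both matrices non-negative while reducing an uncertainty-weighted sum of squared residuals. The function name `weighted_nmf`, its parameters and the uncertainty array `S` are assumptions made for illustration.

```python
import numpy as np


def weighted_nmf(C, S, n_factors, n_iter=500, seed=0, eps=1e-12):
    """Factorise C (samples x species) as G @ F with G, F >= 0.

    G (samples x factors) holds source contributions and F (factors x species)
    holds source profiles. The fit reduces
        Q = sum(((C - G @ F) / S) ** 2)
    where S holds the measurement uncertainties. C must be non-negative.
    """
    C = np.asarray(C, dtype=float)
    W = 1.0 / np.asarray(S, dtype=float) ** 2  # weights from uncertainties
    rng = np.random.default_rng(seed)
    n_samples, n_species = C.shape

    # Random positive starting point; results can depend on this choice.
    G = rng.random((n_samples, n_factors)) + eps
    F = rng.random((n_factors, n_species)) + eps

    for _ in range(n_iter):
        # Multiplicative updates: each factor is rescaled by a non-negative
        # ratio, so G and F can never become negative.
        G *= ((W * C) @ F.T) / ((W * (G @ F)) @ F.T + eps)
        F *= (G.T @ (W * C)) / (G.T @ (W * (G @ F)) + eps)

    Q = float(np.sum(W * (C - G @ F) ** 2))
    return G, F, Q
```

Running the function with several values of `seed` and `n_factors` and comparing the resulting `Q` values gives a feel for how sensitive the solution is to starting conditions and to the number of factors retained.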

The primary object function is a measure of the goodness-of-fit of the predicted mass contributions for each species. Typically each species is weighted by a measure of trust in the individual measurements. The measure of trust can be adjusted for closeness to the minimum detection level, data completeness, sampling error or other user-defined attributes of the data. The results are constrained to be non-negative (although small negative values can occur) by adding penalty functions to the object function.
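The page does not give the object function explicitly. A form commonly quoted in the PMF literature is the sum of squared residuals, each scaled by the uncertainty of the corresponding measurement. The symbol s_ij for that uncertainty is introduced here for illustration and does not appear elsewhere on this page; the other symbols follow the model definitions used on this page.

$$Q = \sum_{i} \sum_{j} \left( \frac{e_{ij}}{s_{ij}} \right)^{2}, \qquad e_{ij} = C_{ij} - \sum_{p} g_{ip} f_{pj}, \qquad g_{ip} \ge 0, \; f_{pj} \ge 0$$

A measurement with a large s_ij, for example one close to the detection limit, contributes little to Q, so the fit is driven mainly by the measurements that are trusted most.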

As with other forms of factor analysis, numerous procedural decisions have to be made and parameter values set when running PMF. These include the specification of data uncertainties, selection of the best number of factors, and choice of how to identify and deal with outliers. Results may be sensitive to these decisions, so the procedures used and assumptions made should always be fully documented in order to ensure that analysis is transparent.

PMF models are expressed as follows:

$$C_{ij} = \sum_{p} g_{ip} f_{pj} + e_{ij}$$

where the sum runs over all sources, and:

p is the number of sources;

j is the number of species, with j ≥ p;

C_ij is the measured ambient concentration of species j in sample i;

f_pj (source profiles) is the fractional concentration of species j in the emissions from source p;

g_ip is the concentration contribution of source p to sample i; and

e_ij is the portion of the measured concentration that cannot be explained by the model.
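A small numerical illustration of these definitions, using made-up values for two sources and three species: the modelled concentration of each species in a sample is the sum over sources of the contribution g_ip multiplied by the profile fraction f_pj, and the residual e_ij is whatever remains of the measurement. All numbers and variable names below are hypothetical.

```python
import numpy as np

# Hypothetical source profiles f_pj: 2 sources (rows) x 3 species (columns).
# Each row sums to 1 because profiles are fractional compositions.
F = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])

# Hypothetical contributions g_ip for a single sample: 10 units from the
# first source and 5 units from the second.
G = np.array([[10.0, 5.0]])

# Modelled concentrations: sum over sources of g_ip * f_pj.
C_model = G @ F

# Hypothetical measured concentrations for the same sample.
C_measured = np.array([[7.8, 3.4, 4.1]])

# Residual e_ij: the part of the measurement the model does not explain.
E = C_measured - C_model

print(C_model)
print(E)
```

With these values the modelled concentrations are approximately 7.5, 3.5 and 4.0, which add up to the total contribution of 15 units, in line with the mass conservation on which the method is based.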

