API

pyPMF.PMF module

class pyPMF.PMF.PMF(site, reader=None, savedir='./', BDIR=None, SQL_connection=None, SQL_table_names=None, SQL_program=None)[source]

Bases: object

Holds the output of the US EPA PMF 5.0 software in a convenient format (pandas DataFrames). Several utilities and plots are also available.

get_seasonal_contribution(specie=None, annual=True, normalize=True, constrained=True)[source]

Get a DataFrame of the seasonal contributions

Parameters

specie (str, defaults to the total variable) – Specie to use
annual (boolean, default True) – Add the annual contribution
normalize (boolean, default True) – Normalize to 100%
constrained (boolean, default True) – Use the constrained run or not

Returns

df – seasonal contribution

Return type

pd.DataFrame
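
As a rough illustration of what the annual and normalize options mean, the sketch below averages per-sample contributions by season, optionally adds an "Annual" row, and rescales each row to 100%. It uses plain dictionaries in place of DataFrames to stay dependency-free. The helper name seasonal_contribution, the sample data and the use of the mean are assumptions made for illustration, not pyPMF's actual implementation.

```python
from collections import defaultdict

# Hypothetical per-sample contributions (µg/m³), already labelled by season.
samples = [
    ("Winter", {"Traffic": 4.0, "Sea salt": 1.0}),
    ("Winter", {"Traffic": 2.0, "Sea salt": 3.0}),
    ("Summer", {"Traffic": 1.0, "Sea salt": 3.0}),
]


def seasonal_contribution(samples, annual=True, normalize=True):
    """Mean contribution of each factor per season (illustrative only)."""
    groups = defaultdict(list)
    for season, contrib in samples:
        groups[season].append(contrib)
        if annual:
            # Every sample also counts towards the annual row.
            groups["Annual"].append(contrib)

    result = {}
    for season, rows in groups.items():
        means = {f: sum(r[f] for r in rows) / len(rows) for f in rows[0]}
        if normalize:
            # Rescale so that the factors of one season sum to 100%.
            total = sum(means.values())
            means = {f: 100 * v / total for f, v in means.items()}
        result[season] = means
    return result


seasonal = seasonal_contribution(samples)
```

With normalize=False the same function would return mean contributions in the input unit instead of percentages.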

get_total_specie_sum(constrained=True)[source]

Return the total specie sum profiles in %

Parameters: constrained (boolean, default True) – use the constrained run or not
Returns: df – The normalized species sum per profile
Return type: pd.DataFrame
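
One plausible reading of "total specie sum in %" is that, for each species, the mass found in every profile is divided by the mass summed over all profiles. The sketch below shows that calculation on plain dictionaries. The function name total_specie_sum and the data are hypothetical, and the sketch assumes every species has a non-zero total.

```python
# Hypothetical factor profiles: mass of each species (µg/m³) per profile.
profiles = {
    "Traffic": {"OC": 3.0, "EC": 1.5, "Na+": 0.0},
    "Sea salt": {"OC": 1.0, "EC": 0.5, "Na+": 2.0},
}


def total_specie_sum(profiles):
    """Share (%) of each species carried by each profile (illustrative)."""
    species = list(next(iter(profiles.values())))
    # Total mass of each species over all profiles (assumed non-zero).
    totals = {sp: sum(p[sp] for p in profiles.values()) for sp in species}
    return {
        name: {sp: 100 * p[sp] / totals[sp] for sp in species}
        for name, p in profiles.items()
    }


normalized = total_specie_sum(profiles)
```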

print_uncertainties_summary(constrained=True, profiles=None, species=None)[source]

Get the uncertainties given by BS, BS-DISP and DISP for the given profiles and species

Parameters

constrained (boolean, default True) – Use the constrained run (False for the base run)
profiles (list of str) – list of profiles, default all profiles
species (list of str) – list of species, default all species

Returns

df – BS, DISP and BS-DISP ranges

Return type

pd.DataFrame

recompute_new_species(specie)[source]

Recompute a specie given the other species. For instance, recompute OC from OC* and a list of organic species.

It modifies both dfprofile_b and dfprofile_c in place, and updates self.species.

Parameters: specie (str in ["OC",]) –

rename_factors(mapper)[source]

Rename the factors in all DataFrames

Parameters: mapper (dict) – Keys of the dictionary are the old names, and values are the desired names
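
A minimal sketch of the mapper convention, using a plain list of factor names. The factor names are hypothetical, and leaving unmapped names unchanged is an assumption here (it mirrors how a dictionary mapper behaves in pandas' rename).

```python
# Old name -> desired name, as described for the `mapper` parameter.
mapper = {"Factor 1": "Traffic", "Factor 2": "Sea salt"}

# Hypothetical factor names as they might come out of a PMF run.
profiles = ["Factor 1", "Factor 2", "Factor 3"]

# Names missing from the mapper are kept as-is (assumption).
renamed = [mapper.get(name, name) for name in profiles]
```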

rename_factors_to_factors_category()[source]

Rename the factor profile name to match the category

See pyPMF.utils.get_sourcesCategories

replace_totalVar(newTotalVar)[source]

Replace the total variable in all DataFrames

Parameters: newTotalVar (str) –

to_cubic_meter(specie=None, constrained=True, profiles=None)[source]

Convert the contributions to mass per cubic meter for the given specie

Parameters

constrained (boolean, default True) –
specie (str, default totalVar) – The specie
profiles (list of str, default all profiles) –

Returns

df

Return type

pd.DataFrame
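
The conversion can be pictured as follows, assuming that contributions are stored as dimensionless scaling factors and that profiles hold the mass of each species in µg/m³. Under that assumption, the contribution of a factor to a species on a given day is the product of the two. This is a sketch of the idea on plain dictionaries, not the library's code, and all names and numbers are hypothetical.

```python
# Hypothetical dimensionless contributions per sample and per factor.
contributions = {
    "2020-01-15": {"Traffic": 1.5, "Sea salt": 0.5},
    "2020-07-20": {"Traffic": 0.5, "Sea salt": 1.5},
}

# Hypothetical factor profiles: mass of each species (µg/m³) per factor.
profiles = {
    "Traffic": {"PM10": 8.0, "OC": 2.0},
    "Sea salt": {"PM10": 4.0, "OC": 0.2},
}


def to_cubic_meter(contributions, profiles, specie="PM10"):
    """Scale each contribution by the mass of `specie` in the factor profile."""
    return {
        day: {name: g * profiles[name][specie] for name, g in row.items()}
        for day, row in contributions.items()
    }


pm10 = to_cubic_meter(contributions, profiles)
oc = to_cubic_meter(contributions, profiles, specie="OC")
```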

to_relative_mass(constrained=True, species=None, profiles=None)[source]

Compute the factor profile relative mass (i.e. each species divided by the totalVar mass)

Parameters

constrained (Boolean, default True) –
species (list of str, default all species) –
profiles (list of str, default all profiles) –

Returns

df

Return type

pd.DataFrame
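
The relative mass described above is a simple ratio, sketched here for a single hypothetical profile stored as a dictionary. The helper name relative_mass and the values are illustrative only.

```python
# Hypothetical factor profile (µg/m³); "PM10" plays the role of totalVar.
profile = {"PM10": 8.0, "OC": 2.0, "EC": 0.4}


def relative_mass(profile, total_var="PM10"):
    """Divide every species by the mass of the total variable."""
    total = profile[total_var]
    return {specie: mass / total for specie, mass in profile.items()}


relative = relative_mass(profile)
```

By construction the total variable itself maps to 1, and every other species is expressed as a fraction of it.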

pyPMF.readers module

class pyPMF.readers.BaseReader(site, pmf)[source]

Bases: abc.ABC

read_all()[source]

Read all possible data output by the PMF

Returns: TODO
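
The reader classes follow the usual abstract-base-class pattern: the base class declares the read_* methods and a concrete reader fills them in. The sketch below reproduces that pattern in miniature. SketchBaseReader and DictReader are hypothetical names, only two of the abstract methods are shown, a dictionary stands in for the PMF object, and the order in which read_all calls the readers is an assumption.

```python
from abc import ABC, abstractmethod


class SketchBaseReader(ABC):
    """Reduced stand-in for a reader base class (illustrative only)."""

    def __init__(self, site, pmf):
        self.site = site
        self.pmf = pmf

    def read_all(self):
        # Call every reader in turn; the real call order is not documented here.
        for reader in (self.read_base_profiles, self.read_base_contributions):
            reader()

    @abstractmethod
    def read_base_profiles(self):
        """Store the base factor profiles on the PMF object."""

    @abstractmethod
    def read_base_contributions(self):
        """Store the base factor contributions on the PMF object."""


class DictReader(SketchBaseReader):
    """Hypothetical reader that takes its data from an in-memory dictionary."""

    def __init__(self, site, pmf, data):
        super().__init__(site, pmf)
        self.data = data

    def read_base_profiles(self):
        self.pmf["dfprofiles_b"] = self.data["profiles"]

    def read_base_contributions(self):
        self.pmf["dfcontrib_b"] = self.data["contributions"]


pmf = {}  # stand-in for a PMF instance
data = {"profiles": {"Traffic": {"OC": 2.0}}, "contributions": {"Traffic": [1.0]}}
DictReader("SiteA", pmf, data).read_all()
```

Because the base class keeps its abstract methods, instantiating it directly raises a TypeError, which is what forces each concrete reader to implement the full set.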

abstract read_base_bootstrap()[source]

Read the “base” bootstrap result from the file: ‘_boot.xlsx’ and add :

self.dfBS_profile_b: all mapped profile
self.dfbootstrap_mapping_b: table of mapped profiles

abstract read_base_contributions()[source]

Read the “base” contributions result from the file: ‘_base.xlsx’, sheet “Contributions”, and add :

self.dfcontrib_b: base factors contribution

abstract read_base_profiles()[source]

Read the “base” profiles result from the file: ‘_base.xlsx’, sheet “Profiles”, and add :

self.dfprofiles_b: base factors profile

abstract read_base_uncertainties_summary()[source]

Read the _BaseErrorEstimationSummary.xlsx file and add:

self.df_uncertainties_summary_b : uncertainties from BS, DISP and BS-DISP

abstract read_constrained_bootstrap()[source]

Read the “constrained” bootstrap result from the file: ‘_Gcon_profile_boot.xlsx’ and add :

self.dfBS_profile_c: all mapped profile
self.dfbootstrap_mapping_c: table of mapped profiles

abstract read_constrained_contributions()[source]

Read the “constrained” contributions result from the file: ‘_Constrained.xlsx’, sheet “Contributions”, and add :

self.dfcontrib_c: constrained factors contribution

abstract read_constrained_profiles()[source]

Read the “constrained” profiles result from the file: ‘_Constrained.xlsx’, sheet “Profiles”, and add :

self.dfprofiles_c: constrained factors profile

abstract read_constrained_uncertainties_summary()[source]

Read the _ConstrainedErrorEstimationSummary.xlsx file and add :

self.df_uncertainties_summary_c : uncertainties from BS, DISP and BS-DISP

read_metadata()[source]

Get the profiles, species and related metadata.

It adds a totalVariable (by default one of “PM10”, “PM2.5”, “PMrecons”, “PM10recons” or “PM10rec”). Otherwise, it tries to guess one (a variable with “PM” in its name).
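
The lookup described above can be sketched as a two-step search: first the known default names, then any species whose name contains “PM”. The function name guess_total_var, the priority order among the defaults, and returning None when nothing matches are assumptions made for this illustration.

```python
# Default names listed in the description above; their priority is assumed.
DEFAULT_TOTAL_VARS = ["PM10", "PM2.5", "PMrecons", "PM10recons", "PM10rec"]


def guess_total_var(species):
    """Return the name of the total variable, or None if none is found."""
    for name in DEFAULT_TOTAL_VARS:
        if name in species:
            return name
    # Fallback: first species with "PM" in its name.
    candidates = [sp for sp in species if "PM" in sp]
    return candidates[0] if candidates else None


total_var = guess_total_var(["OC", "EC", "PM2.5", "Na+"])
```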

class pyPMF.readers.SqlReader(site, pmf, SQL_connection, SQL_table_names=None, SQL_program=None)[source]

Bases: pyPMF.readers.BaseReader

Accessor class for the PMF class with all reader methods.

read_base_bootstrap()[source]

Read the “base” bootstrap result from the file: ‘_boot.xlsx’ and add :

self.dfBS_profile_b: all mapped profile
self.dfbootstrap_mapping_b: table of mapped profiles

read_base_contributions()[source]

Read the “base” contributions result from the file: ‘_base.xlsx’, sheet “Contributions”, and add :

self.dfcontrib_b: base factors contribution

read_base_profiles()[source]

Read the “base” profiles result from database and add :

self.dfprofiles_b: base factors profile

read_base_uncertainties_summary()[source]

Read the base error uncertainties and add:

self.df_disp_swap_b : number of swap
self.df_uncertainties_summary_b : uncertainties from BS, DISP and BS-DISP

read_constrained_bootstrap()[source]

Read the “constrained” bootstrap result from the file: ‘_Gcon_profile_boot.xlsx’ and add :

self.dfBS_profile_c: all mapped profile
self.dfbootstrap_mapping_c: table of mapped profiles

read_constrained_contributions()[source]

Read the “constrained” contributions result from the file: ‘_Constrained.xlsx’, sheet “Contributions”, and add :

self.dfcontrib_c: constrained factors contribution

read_constrained_profiles()[source]

Read the “constrained” profiles result from the database and add :

self.dfprofiles_c: constrained factors profile

read_constrained_uncertainties_summary()[source]

Read the constrained error uncertainties and add :

self.df_disp_swap_c : number of swap
self.df_uncertainties_summary_c : uncertainties from BS, DISP and BS-DISP

class pyPMF.readers.XlsxReader(BDIR, site, pmf)[source]

Bases: pyPMF.readers.BaseReader

Accessor class for the PMF class with all reader methods.

read_base_bootstrap()[source]

Read the “base” bootstrap result from the file: ‘_boot.xlsx’ and add :

self.dfBS_profile_b: all mapped profile
self.dfbootstrap_mapping_b: table of mapped profiles

read_base_contributions()[source]

Read the “base” contributions result from the file: ‘_base.xlsx’, sheet “Contributions”, and add :

self.dfcontrib_b: base factors contribution

read_base_profiles()[source]

Read the “base” profiles result from the file: ‘_base.xlsx’, sheet “Profiles”, and add :

self.dfprofiles_b: base factors profile

read_base_uncertainties_summary()[source]

Read the _BaseErrorEstimationSummary.xlsx file and add:

self.df_uncertainties_summary_b : uncertainties from BS, DISP and BS-DISP

read_constrained_bootstrap()[source]

Read the “constrained” bootstrap result from the file: ‘_Gcon_profile_boot.xlsx’ and add :

self.dfBS_profile_c: all mapped profile
self.dfbootstrap_mapping_c: table of mapped profiles

read_constrained_contributions()[source]

Read the “constrained” contributions result from the file: ‘_Constrained.xlsx’, sheet “Contributions”, and add :

self.dfcontrib_c: constrained factors contribution

read_constrained_profiles()[source]

Read the “constrained” profiles result from the file: ‘_Constrained.xlsx’, sheet “Profiles”, and add :

self.dfprofiles_c: constrained factors profile

read_constrained_uncertainties_summary()[source]

Read the _ConstrainedErrorEstimationSummary.xlsx file and add :

self.df_uncertainties_summary_c : uncertainties from BS, DISP and BS-DISP

pyPMF.plotter module

class pyPMF.plotter.Plotter(pmf, savedir)[source]

Bases: object

Lots of plots in this class for a PMF object!

plot_all_profiles(constrained=True, profiles=None, specie=None, BS=True, DISP=True, BSDISP=False, plot_save=False, savedir=None)[source]

TODO: Docstring for plot_all_profiles.

Parameters

constrained (boolean, default True) – Whether to use the constrained run or the base one
profiles (list of string) – Profiles to plot
specie –
BS, DISP, BSDISP (boolean, default True, True, False) – Use them as error estimation
plot_save (boolean, default False) – Whether to save the plot
savedir (str) – Path to save the plot

plot_contrib(dfBS=None, dfDISP=None, dfcontrib=None, profiles=None, specie=None, constrained=True, plot_save=False, savedir=None, BS=True, DISP=True, BSDISP=False, new_figure=True, **kwargs)[source]

Plot temporal contribution in µg/m3.

Parameters

df (pd.DataFrame, default self.dfBS_profile_c) – DataFrame with multiindex [species, profile] and an arbitrary number of columns.
dfcontrib (pd.DataFrame, default self.dfcontrib_c) – Profile as column and specie as index.
profiles (list of string, default self.profiles) – profile to plot (one figure per profile)
specie (string, default totalVar.) – specie to plot (y-axis)
plot_save (boolean, default False) – Save the graph in savedir.
savedir (string) – directory to save the plot

plot_per_microgramm(df=None, constrained=True, profiles=None, species=None, plot_save=False, savedir=None)[source]

Plot profiles in concentration units (µg/m³).

Parameters

df (DataFrame) – DataFrame with multiindex [species, profile] and an arbitrary number of columns. Defaults to dfBS_profile_c.
constrained (boolean) – Whether to use the constrained run or the base run
profiles (list of str) – Profiles to plot (one figure per profile)
species (list of str) – Species to plot (x-axis)
plot_save (boolean, default False) – Save the graph in savedir.
savedir (string) – Directory to save the plot.

plot_polluted_contribution(constrained=True, threshold=None, specie=None, normalize=True, plot_save=False, savedir=None)[source]

Plot a bar plot split into polluted/non-polluted days, as defined by the given threshold.

Parameters

constrained (boolean, default True) – use the constrained run or not
threshold (int, default 50) – Threshold in µg/m³ to define a polluted day
specie (str, default to total variable) – Specie to use
normalize (boolean, default True) – Normalize the graph
plot_save (boolean, default False) – Save the graph in savedir.
savedir (string) – Directory to save the plot.
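
The data behind such a plot can be sketched as follows: samples are split on the threshold, the factor contributions are averaged within each group, and each group is optionally normalized to 100%. The function name polluted_contribution, the use of a strict comparison, taking the sum of the factors as the total concentration, and the data are all assumptions for this illustration.

```python
# Hypothetical per-sample factor contributions (µg/m³).
samples = [
    {"Traffic": 40.0, "Dust": 20.0},  # total 60 -> polluted
    {"Traffic": 10.0, "Dust": 10.0},  # total 20 -> non-polluted
    {"Traffic": 30.0, "Dust": 30.0},  # total 60 -> polluted
]


def polluted_contribution(samples, threshold=50, normalize=True):
    """Mean factor contribution on polluted and non-polluted days."""
    groups = {"polluted": [], "non-polluted": []}
    for row in samples:
        # A day counts as polluted when its total exceeds the threshold.
        key = "polluted" if sum(row.values()) > threshold else "non-polluted"
        groups[key].append(row)

    result = {}
    for key, rows in groups.items():
        if not rows:
            continue  # no sample in this group
        means = {f: sum(r[f] for r in rows) / len(rows) for f in rows[0]}
        if normalize:
            total = sum(means.values())
            means = {f: 100 * v / total for f, v in means.items()}
        result[key] = means
    return result


split = polluted_contribution(samples)
```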

plot_samples_sources_contribution(constrained=True, specie=None, savedir=None, plot_save=False)[source]

Plot a bar plot of the contribution per sample (time series)

Parameters

constrained (boolean, default True) – Use the constrained run or the base one
specie (str, default to totalVar) – Specie to use
savedir (str, default to self.savedir) – Path where to save the figure
plot_save (boolean, default False) – Save the plot as png

plot_seasonal_contribution(*args, **kwargs)[source]

plot_seasonal_contributions(constrained=True, dfcontrib=None, dfprofiles=None, profiles=None, specie=None, plot_save=False, savedir=None, annual=True, normalize=True, ax=None, barplot_kwarg={})[source]

Plot the relative contribution of the profiles.

Parameters

dfcontrib (DataFrame) – Contribution as column and date as index.
dfprofiles (DataFrame) – Profile as column and specie as index.
profiles (list) – Profiles to plot (one figure per profile)
specie (string, default totalVar) – Specie to plot
plot_save (boolean, default False) – Save the graph in savedir.
savedir (string) – Directory to save the plot.
annual (boolean, default True) – Plot the annual contribution
normalize (boolean, default True) – Plot the relative contribution or the absolute contribution.

Returns

df

Return type

DataFrame

plot_stacked_contributions(constrained=True, order=None, plot_kwargs=None, savedir=None, plot_save=False)[source]

Plot a stacked plot for the contributions

Parameters

constrained (TODO) –
order (TODO) –
plot_kwargs (TODO) –
plot_save (boolean, default False) – Whether to save the plot
savedir (str) – Path to save the plot

plot_stacked_profiles(constrained=True, savedir=None, plot_save=False)[source]

Plot the distribution of the species among the profiles, normalized to 100%

Parameters

constrained (boolean, default True) – use the constrained run or not
plot_save (boolean, default False) – Save the graph in savedir.
savedir (string) – Directory to save the plot.

Returns

ax

Return type

the axes

plot_totalspeciesum(df=None, profiles=None, species=None, constrained=True, plot_save=False, savedir=None, **kwargs)[source]

Plot profiles in percentage of total specie sum (%).

Parameters

df (DataFrame) – DataFrame with multiindex [species, profile] and an arbitrary number of columns. Defaults to dfBS_profile_c.
profiles (list) – Profiles to plot (one figure per profile)
species (list) – Species to plot (x-axis)
plot_save (boolean, default False) – Save the graph in savedir.
savedir (string) – Directory to save the plot.

pyPMF.plotter.pretty_specie(text)[source]

pyPMF.utils module

pyPMF.utils.add_season(df, month=True, month_to_season=None)[source]

Add a season column to the DataFrame df.

df must have either a Date column or an index named Date

Parameters

df (pd.DataFrame) – The DataFrame to work with.
month (boolean, default True) – Add the month number
month_to_season (dict, optional, default None) –
Dictionary mapping month number to season name. Defaults to

month_to_season = {
    1: "Winter", 2: "Winter", 3: "Spring", 4: "Spring", 5: "Spring", 6: "Summer",
    7: "Summer", 8: "Summer", 9: "Fall", 10: "Fall", 11: "Fall", 12: "Winter",
}

Returns

dfnew – Copy of the input DataFrame with a ‘season’ (and ‘month’) column.

Return type

pd.DataFrame
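
The behaviour described above can be mimicked without pandas on a list of row dictionaries, which may help when reasoning about a custom month_to_season mapping. The function name add_season_rows and the sample rows are hypothetical; the default mapping is the one documented above.

```python
from datetime import date

DEFAULT_MONTH_TO_SEASON = {
    1: "Winter", 2: "Winter", 3: "Spring", 4: "Spring", 5: "Spring", 6: "Summer",
    7: "Summer", 8: "Summer", 9: "Fall", 10: "Fall", 11: "Fall", 12: "Winter",
}


def add_season_rows(rows, month=True, month_to_season=None):
    """Return copies of `rows` with a 'season' (and 'month') entry added."""
    mapping = DEFAULT_MONTH_TO_SEASON if month_to_season is None else month_to_season
    new_rows = []
    for row in rows:
        new_row = dict(row)  # copy, so the input is left untouched
        if month:
            new_row["month"] = row["Date"].month
        new_row["season"] = mapping[row["Date"].month]
        new_rows.append(new_row)
    return new_rows


rows = [{"Date": date(2020, 12, 24), "PM10": 31.0}, {"Date": date(2021, 4, 2), "PM10": 18.0}]
with_season = add_season_rows(rows)

# Custom mapping, e.g. a two-season split (hypothetical).
cold_warm = {m: ("Cold" if m in (10, 11, 12, 1, 2, 3) else "Warm") for m in range(1, 13)}
with_cold_warm = add_season_rows(rows, month=False, month_to_season=cold_warm)
```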

pyPMF.utils.format_xaxis_timeseries(ax)[source]

Format the x-axis of a time series with minor ticks = month and major ticks = year

Parameters: ax (mpl.axes) – The axes to format

pyPMF.utils.get_sourceColor(source=None)[source]

Return the hexadecimal color of the source(s)

If no source is given, return the whole dictionary

Parameters: source (str) – The name of the source
Returns: color – color in hexadecimal
Return type: str or pd.DataFrame

pyPMF.utils.get_sourcesCategories(profiles)[source]

Get the source categories according to the source names.

Ex. Aged sea salt → Aged_sea_salt

Parameters: profiles (list of str) –
Returns: profiles_renamed
Return type: list of str