pyphi_batch package

Submodules

pyphi_batch.pyphi_batch module

Created on Mon Apr 11 14:58:35 2022

Batch data is assumed to come in an Excel file, with the first column being the batch identifier and the following columns being process variables. Optionally, the second column, labeled ‘PHASE’, ‘Phase’ or ‘phase’, indicates the phase of execution.
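A minimal sketch of the layout described above, using plain Python lists in place of the Excel sheet. The column names BatchID, Temp and Pressure and both helper functions are hypothetical; only the three accepted phase labels come from the description.

```python
# Hypothetical header and rows: batch identifier first, optional phase
# column second, process variables after that.
header = ["BatchID", "Phase", "Temp", "Pressure"]
rows = [
    ["B1", "Heating", 30.0, 1.0],
    ["B1", "Reaction", 50.0, 1.2],
    ["B2", "Heating", 31.0, 1.1],
]

# The three labels accepted for the optional phase column.
PHASE_LABELS = {"PHASE", "Phase", "phase"}


def has_phase_column(header):
    """Return True when the second column carries one of the phase labels."""
    return len(header) > 1 and header[1] in PHASE_LABELS


def variable_columns(header):
    """Return the process-variable column names, skipping ID and phase."""
    start = 2 if has_phase_column(header) else 1
    return header[start:]


print(has_phase_column(header))
print(variable_columns(header))
```

A file without a phase column is handled the same way; the variables then start at the second column.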

Change log:

  • added Dec 28 2023 Titles can be sent to contribution plots via plot_title flag

    Monitoring diagnostics are also plotted against sample starting with 1

  • added Dec 27 2023 Corrected plots to use correct x-axis starting with sample = 1

    Amended indicator variable alignment not to replace the IV with a linear sequence but to keep original data

  • added Dec 4 2023 Added a BatchVIP calculation

  • added Apr 23 2023 Corrected a very dumb mistake I made coding when tired

  • added Apr 18 2023 Added descriptors routine to obtain landmarks of the batch

    such as min, max, ave of a variable [during a phase if indicated so]. Modified plot_var_all_batches to plot against the values in a Time column and also add the legend for the BatchID

  • added Apr 10 2023 Added batch contribution plots

    Added build_rel_time to create a tag of relative run time from a timestamp

  • added Apr 7 2023 Added alignment using indicator variable per phase

  • added Apr 5 2023 Added the capability to monitor a variable in “Soft Sensor” mode

    which implies there are no measurements for it (pure prediction), as opposed to a forecast, where new measurements keep coming in over time.

  • added Jul 20 2022 Distribution of number of samples per phase plot

  • added Aug 10 2022 refold_horizontal | clean_empty_rows | predict

  • added Aug 12 2022 replicate_batch

@author: S. Garcia-Munoz sgarciam@ic.ac.uk salg@andrew.cmu.edu

pyphi_batch.pyphi_batch.batch_vip(mmvm_obj, *, addtitle=False)[source]

Plot the summation across components of the integral of the absolute value of loadings for a batch multiplied by the R2 [which roughly mimics the VIP]

batch_vip(mmvm_obj,*,addtitle=False)

Parameters:

mmvm_obj – A multiway PCA or PLS model

by Salvador Garcia Munoz sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com
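A rough sketch of the quantity described above for a single variable, not the routine's actual code. It assumes one R2 value per component, unit spacing between samples and a trapezoidal rule for the integral; the names abs_integral and vip_like are hypothetical.

```python
def abs_integral(loading):
    """Trapezoidal integral of |loading| over unit-spaced samples (assumed rule)."""
    a = [abs(p) for p in loading]
    return sum((a[i] + a[i + 1]) / 2 for i in range(len(a) - 1))


def vip_like(loadings, r2):
    """Sum over components of R2 times the integral of |loading|.

    loadings: one loading trajectory per component for a single variable.
    r2: one R2 value per component (an assumption about how R2 is indexed).
    """
    return sum(r * abs_integral(p) for p, r in zip(loadings, r2))


# Made-up loadings for two components over three samples.
loadings = [[1.0, -1.0, 1.0], [0.0, 2.0, 0.0]]
r2 = [0.5, 0.25]
print(vip_like(loadings, r2))
```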

pyphi_batch.pyphi_batch.build_rel_time(bdata, *, time_unit='min')[source]

Converts the column ‘Timestamp’ into ‘Time’, expressed in time_unit relative to the start of each batch

bdata_new = build_rel_time(bdata,*,time_unit='min')

by Salvador Garcia Munoz

sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com
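The conversion can be sketched with the standard library as follows. This is an illustration on (batch, timestamp) tuples rather than on a DataFrame. The function name relative_time and the unit labels other than 'min' are assumptions, and the first row of each batch is taken as its start.

```python
from datetime import datetime

# Seconds per unit; only 'min' appears in the documented signature,
# the other labels are assumptions made for this sketch.
SECONDS_PER_UNIT = {"sec": 1.0, "min": 60.0, "hr": 3600.0}


def relative_time(rows, time_unit="min"):
    """rows: list of (batch_id, timestamp), batches stacked vertically.

    Returns a list of (batch_id, elapsed), with elapsed measured from the
    first timestamp seen for that batch.
    """
    start = {}
    out = []
    for batch_id, stamp in rows:
        start.setdefault(batch_id, stamp)  # first row of a batch is its start
        elapsed = (stamp - start[batch_id]).total_seconds()
        out.append((batch_id, elapsed / SECONDS_PER_UNIT[time_unit]))
    return out


rows = [
    ("B1", datetime(2023, 4, 10, 8, 0)),
    ("B1", datetime(2023, 4, 10, 8, 30)),
    ("B2", datetime(2023, 4, 11, 9, 15)),
    ("B2", datetime(2023, 4, 11, 9, 20)),
]
print(relative_time(rows))
```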

pyphi_batch.pyphi_batch.clean_empty_rows(X, *, shush=False)[source]

Cleans empty rows in batch data

Input:

X: Batch data (DataFrame) to be cleaned of empty rows (rows that are all np.nan)

Output:

X: Batch data with the empty observations removed
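A minimal sketch of the idea on plain lists, assuming a row counts as empty when every variable value is NaN and that identifier columns are ignored by the check. The name drop_empty_rows and the n_id_cols argument are hypothetical.

```python
import math


def drop_empty_rows(rows, n_id_cols=1):
    """Keep rows where at least one variable value is not NaN.

    rows: list of lists; the first n_id_cols entries are identifiers
    (batch ID, and phase if present) and are skipped by the check.
    """
    kept = []
    for row in rows:
        values = row[n_id_cols:]
        is_empty = all(isinstance(v, float) and math.isnan(v) for v in values)
        if not is_empty:
            kept.append(row)
    return kept


nan = float("nan")
rows = [["B1", 1.0, 2.0], ["B1", nan, nan], ["B2", nan, 3.0]]
cleaned = drop_empty_rows(rows)
print(len(cleaned))
```

A row with only some missing values, like the last one here, is kept.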

pyphi_batch.pyphi_batch.contributions(mmvmobj, X, cont_type, *, to_obs=False, from_obs=False, lv_space=False, phase_samples=False, dyn_conts=False, which_var=False, plot_title='')[source]

Plot batch contribution plots to Scores, HT2 or SPE

contributions(mmvmobj,X,cont_type,*,to_obs=False,from_obs=False,lv_space=False,phase_samples=False,dyn_conts=False,which_var=False,plot_title='')

Parameters:
  • mmvmobj – Multiway model

  • X – Batch data

  • cont_type – 'scores' | 'ht2' | 'spe'

  • to_obs – Observation to calculate contributions to

  • from_obs – Relative basis to calculate contributions to [for 'scores' and 'ht2' only]; if not sent, the origin of the model is used as the base.

Returns:

contribution_vector

by Salvador Garcia Munoz

sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com

pyphi_batch.pyphi_batch.descriptors(bdata, which_var, desc, *, phase=False)[source]

Get descriptor values for a batch trajectory

descriptors_df = descriptors(bdata,which_var,desc,*,phase=False)

Parameters:
  • bdata – Dataframe of batch data, first column is batch ID, second column can be phase id

  • which_var – List of variables to get descriptors for

  • desc – List of descriptors to calculate, options are: ‘min’ ‘max’ ‘mean’ ‘median’ ‘std’ ‘var’ ‘range’ ‘ave_slope’

  • phase – to specify what phases to do this for

Returns:

A dataframe with the descriptors per batch

Return type:

descriptors

by Salvador Garcia Munoz

sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com
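The per-batch landmarks can be sketched as below for one variable, using (batch_id, value) tuples instead of a DataFrame. The function name batch_descriptors is hypothetical, and only descriptors whose definition is unambiguous are included; ‘std’, ‘var’ and ‘ave_slope’ are left out because their exact definitions are not stated here.

```python
import statistics


def batch_descriptors(rows, desc):
    """rows: list of (batch_id, value); returns {batch_id: {name: number}}."""
    funcs = {
        "min": min,
        "max": max,
        "mean": statistics.mean,
        "median": statistics.median,
        "range": lambda v: max(v) - min(v),
    }
    # Group the values of the variable by batch, keeping row order.
    by_batch = {}
    for batch_id, value in rows:
        by_batch.setdefault(batch_id, []).append(value)
    return {
        batch_id: {name: funcs[name](values) for name in desc}
        for batch_id, values in by_batch.items()
    }


rows = [("B1", 30.0), ("B1", 40.0), ("B1", 50.0), ("B2", 20.0), ("B2", 60.0)]
result = batch_descriptors(rows, ["min", "max", "mean", "range"])
print(result)
```

Restricting the calculation to a phase would amount to filtering the rows by phase label before grouping.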

pyphi_batch.pyphi_batch.find(a, func)[source]

pyphi_batch.pyphi_batch.loadings(mmvm_obj, dim, *, r2_weighted=False, which_var=False)[source]

Plot batch loadings for variables as a function of time/sample

loadings(mmvm_obj,dim,*,r2_weighted=False,which_var=False)

Parameters:
  • mmvm_obj – Multiway PCA or PLS object

  • dim – What component or latent variable to plot

  • r2_weighted – If True => weight the loading by the R2pv

  • which_var – Variable for which the plot is done, if not sent all are plotted

by Salvador Garcia Munoz sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com

pyphi_batch.pyphi_batch.loadings_abs_integral(mmvm_obj, *, r2_weighted=False, addtitle=False)[source]

Plot the integral of the absolute value of loadings for a batch

loadings_abs_integral(mmvm_obj,*,r2_weighted=False,addtitle=False)

Parameters:
  • mmvm_obj – A multiway PCA or PLS model

  • r2_weighted – Boolean flag; if True, weights the loadings by the R2pv

  • addtitle – Text to place in the title of the figure

by Salvador Garcia Munoz sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com

pyphi_batch.pyphi_batch.mean(X, axis)[source]

pyphi_batch.pyphi_batch.monitor(mmvm_obj, bdata, *, which_batch=False, zinit=False, build_ci=True, shush=False, soft_sensor=False)[source]

Routine to mimic the real-time monitoring of a batch given a model

monitor(mmvm_obj,bdata,*,which_batch=False,zinit=False,build_ci=True,shush=False,soft_sensor=False)

Usage: first you need to run:

monitor(mmvm_obj,bdata)

to mimic monitoring for all batches in bdata and build the CI; these new parameters are written back to mmvm_obj.

Then you can run:

diagnostics = monitor(mmvm_obj,bdata,which_batch=your_batchid)

to mimic monitoring for your_batchid; this produces all dynamic metrics and forecasts.

diagnostics = monitor(mmvm_obj,bdata,which_batch=your_batchid,zinit=your_z_data)

to mimic monitoring for your_batchid using initial conditions; this produces all dynamic metrics and forecasts.

diagnostics = monitor(mmvm_obj,bdata,which_batch=your_batchid,soft_sensor=your_variable)

to mimic monitoring for your_batchid; this produces all dynamic metrics and forecasts, plus soft-sensor predictions for your_variable.

Returns:

diagnostics: A dictionary with all the monitoring diagnostics and contributions

by Salvador Garcia Munoz

sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com

pyphi_batch.pyphi_batch.mpca(xbatch, a, *, unfolding='batch wise', phase_samples=False, cross_val=0)[source]

Multi-way PCA for batch analysis

mpca_obj = mpca(xbatch,a,*,unfolding='batch wise',phase_samples=False,cross_val=0)

Parameters:
  • xbatch – Pandas dataframe with aligned batch data; it is assumed that all batches have the same number of samples

  • a – Number of PC’s to fit

  • unfolding – ‘batch wise’ or ‘variable wise’

  • phase_samples – information about samples per phase [optional]

  • cross_val – percent of elements for cross validation (default is 0 = no cross val)

Returns:

A Dictionary with all the parameters for the MPCA model

Return type:

mpca_obj

by Salvador Garcia Munoz sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com
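To illustrate what ‘batch wise’ unfolding means in general, the sketch below turns aligned trajectories into one row per batch. It is not the package's own code. The function name unfold_batch_wise, the nested-dictionary input and the variable-by-variable column ordering are all assumptions.

```python
def unfold_batch_wise(batches):
    """batches: {batch_id: {var_name: [values over aligned samples]}}.

    Returns (column_labels, rows) with one row per batch. Columns are
    grouped variable by variable; that ordering is an assumption, and every
    batch is expected to list its variables in the same order.
    """
    first = next(iter(batches.values()))
    labels = [
        f"{var}_{k + 1}"
        for var, vals in first.items()
        for k in range(len(vals))
    ]
    rows = {
        batch_id: [v for vals in data.values() for v in vals]
        for batch_id, data in batches.items()
    }
    return labels, rows


batches = {
    "B1": {"Temp": [1, 2, 3], "Pres": [10, 20, 30]},
    "B2": {"Temp": [4, 5, 6], "Pres": [40, 50, 60]},
}
labels, rows = unfold_batch_wise(batches)
print(labels)
print(rows["B1"])
```

The requirement that all batches have the same number of samples is what makes every unfolded row the same length.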

pyphi_batch.pyphi_batch.mpls(xbatch, y, a, *, zinit=False, phase_samples=False, mb_each_var=False, cross_val=0, cross_val_X=False)[source]

Multi-way PLS for batch analysis

mpls_obj = mpls(xbatch,y,a,*,zinit=False,phase_samples=False,mb_each_var=False,cross_val=0,cross_val_X=False)

Parameters:
  • xbatch – Pandas dataframe with aligned batch data; it is assumed that all batches have the same number of samples

  • y – Response to predict, one row per batch

  • a – Number of PC’s to fit

  • zinit – Initial conditions <optional>

  • phase_samples – alignment information

  • mb_each_var – if True, each measured variable becomes its own block; otherwise zinit is one block and xbatch another

Returns:

A dictionary with all the parameters of the MPLS model

Return type:

mpls_obj

by Salvador Garcia Munoz

sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com

pyphi_batch.pyphi_batch.phase_iv_align(bdata, nsamples)[source]

Batch alignment using an indicator variable

batch_aligned_data = phase_iv_align(bdata,nsamples)

Parameters:
  • bdata – A Pandas DataFrame where the 1st column is the Batch Identifier, the second column is a phase indicator and the following columns are variables; each row is a new time sample. Batches are concatenated vertically.

  • nsamples

    if nsamples is a dictionary: samples to generate per phase e.g.

    nsamples = {'Heating':100,'Reaction':200,'Cooling':10}

    If an indicator variable is used, with known start and end values, indicate it with a list like this:

    [IVarID,num_samples,start_value,end_value]

    example:

    nsamples = {'Heating':['TIC101',100,30,50],'Reaction':200,'Cooling':10}

    During the ‘Heating’ phase, use TIC101 as an indicator variable: take 100 samples equidistant from TIC101=30 to TIC101=50 and align against that variable as a measure of batch evolution (instead of time).

    If an indicator variable is used, with unknown start but known end values, indicate it with a list like this:

    [IVarID,num_samples,end_value]

    example:

    nsamples = {'Heating':['TIC101',100,50],'Reaction':200,'Cooling':10}

    During the ‘Heating’ phase, use TIC101 as an indicator variable: take 100 samples equidistant from the value of TIC101 at the start of the phase to the point when TIC101=50 and align against that variable as a measure of batch evolution (instead of time).

    If no IV is sent, the resampling is linear with respect to row number per phase.

Returns:

A pandas dataframe with batch data resampled (aligned)

by Salvador Garcia Munoz sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com
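The indicator-variable idea can be sketched for one phase of one batch as follows. This illustrates the general technique, not the routine's implementation. It assumes the indicator variable increases strictly during the phase, that num_samples is at least 2, and that targets outside the measured range are clamped to the end values; interp and align_on_indicator are hypothetical names.

```python
def interp(x, xp, fp):
    """Linear interpolation of fp(xp) at x; xp must be strictly increasing."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    for i in range(1, len(xp)):
        if x <= xp[i]:
            w = (x - xp[i - 1]) / (xp[i] - xp[i - 1])
            return fp[i - 1] + w * (fp[i] - fp[i - 1])


def align_on_indicator(iv, var, num_samples, end_value, start_value=None):
    """Resample var on a grid that is equidistant in the indicator variable."""
    if start_value is None:
        start_value = iv[0]  # unknown start: use the IV value at phase start
    step = (end_value - start_value) / (num_samples - 1)
    grid = [start_value + k * step for k in range(num_samples)]
    return grid, [interp(g, iv, var) for g in grid]


# One phase of one batch: indicator variable readings and another variable.
tic101 = [30.0, 35.0, 45.0, 50.0]
flow = [1.0, 2.0, 4.0, 5.0]
grid, aligned = align_on_indicator(tic101, flow, 5, 50.0, start_value=30.0)
print(grid)
print(aligned)
```

Omitting start_value corresponds to the three-element list form, where the grid starts from the first indicator value of the phase.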

pyphi_batch.pyphi_batch.phase_sampling_dist(bdata, time_column=False, addtitle=False, use_phases=False)[source]

Count and plot a histogram of the distribution of samples (or time if time_column is indicated) consumed per phase on a batch dataset

phase_sampling_dist(bdata,time_column=False,addtitle=False,use_phases=False)

Parameters:
  • bdata

    Batch data organized as:

    column[0] = Batch Identifier; the column name is unrestricted

    column[1] = Phase information per sample; must be called ‘Phase’, ‘phase’ or ‘PHASE’ (this information is optional)

    column[2:] = Variables measured throughout the batch

  • time_column – Indicates the name of the column with time; if not sent, counting is done in terms of samples

  • addtitle – Optional text to be placed as the figure title

  • use_phases – In case the user wants to only do counting for a subset of phases

by Salvador Garcia Munoz

sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com
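The counting step behind such a histogram can be sketched without any plotting. The function name samples_per_phase is hypothetical and the input is a plain list of (batch_id, phase) pairs, one per sample.

```python
from collections import Counter


def samples_per_phase(rows):
    """rows: list of (batch_id, phase); returns {phase: [count per batch]}."""
    counts = Counter(rows)  # (batch_id, phase) -> number of samples
    dist = {}
    for (batch_id, phase), n in counts.items():
        dist.setdefault(phase, []).append(n)
    return dist


rows = (
    [("B1", "Heating")] * 3 + [("B1", "Reaction")] * 5
    + [("B2", "Heating")] * 4 + [("B2", "Reaction")] * 5
)
dist = samples_per_phase(rows)
print(dist)
```

Each list in the result is what a per-phase histogram would be drawn from.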

pyphi_batch.pyphi_batch.phase_simple_align(bdata, nsamples)[source]

Simple batch alignment (0 to 100%) per phase

bdata_aligned = phase_simple_align(bdata,nsamples)

Parameters:
  • bdata – A Pandas DataFrame where the 1st column is the Batch Identifier, the second column is a phase indicator and the following columns are variables; each row is a new time sample. Batches are concatenated vertically.

  • nsamples

    if integer: Number of samples to collect per phase

    if dictionary: samples to generate per phase e.g. nsamples = {'Heating':100,'Reaction':200,'Cooling':10}

    Resampling is linear with respect to row number.

Returns:

a pandas dataframe with batch data resampled (aligned)

by Salvador Garcia Munoz (sgarciam@imperial.ac.uk , salvadorgarciamunoz@gmail.com)
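A minimal sketch of per-phase resampling for one variable of one batch, assuming linear interpolation on the row index within each phase. The names resample_by_row and align_per_phase are hypothetical, and the real routine works on a DataFrame with all batches and variables.

```python
def resample_by_row(values, n):
    """Linearly interpolate values onto n points evenly spaced in row number."""
    if n == 1 or len(values) == 1:
        return [values[0]] * n
    last = len(values) - 1
    out = []
    for k in range(n):
        pos = k * last / (n - 1)       # fractional row index, 0 .. last
        lo = min(int(pos), last - 1)   # left neighbour, kept inside the data
        w = pos - lo
        out.append(values[lo] + w * (values[lo + 1] - values[lo]))
    return out


def align_per_phase(phases, values, nsamples):
    """phases/values: parallel lists for ONE batch; nsamples: int or dict."""
    order = list(dict.fromkeys(phases))  # phases in order of appearance
    aligned = []
    for ph in order:
        seg = [v for p, v in zip(phases, values) if p == ph]
        n = nsamples[ph] if isinstance(nsamples, dict) else nsamples
        aligned.extend((ph, v) for v in resample_by_row(seg, n))
    return aligned


phases = ["Heating"] * 3 + ["Cooling"] * 2
values = [0.0, 1.0, 2.0, 10.0, 8.0]
aligned = align_per_phase(phases, values, {"Heating": 5, "Cooling": 3})
print(aligned)
```

Passing an integer instead of a dictionary gives every phase the same number of samples.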

pyphi_batch.pyphi_batch.plot_batch(bdata, which_batch, which_var, *, include_mean_exc=False, include_set=False, phase_samples=False, single_plot=False, plot_title='')[source]

Plotting routine for batch data

plot_batch(bdata,which_batch,which_var,*,include_mean_exc=False,include_set=False,phase_samples=False,single_plot=False,plot_title='')

Parameters:
  • bdata

    Batch data organized as:

    column[0] = Batch Identifier; the column name is unrestricted

    column[1] = Phase information per sample; must be called ‘Phase’, ‘phase’ or ‘PHASE’ (this information is optional)

    column[2:] = Variables measured throughout the batch

    The data for each batch is one on top of the other in a vertical matrix

  • which_batch – Which batches to plot

  • which_var – Which variables are to be plotted, if not sent, all are.

  • include_mean_exc – Include the mean trajectory of the set EXCLUDING the one batch being plotted

  • include_set – Include all other trajectories (will be colored in light gray)

  • phase_samples – Information used to align the batch, so that phases are marked in the plot

  • single_plot – If True => Plot everything in a single axis

  • plot_title – Optional text to be added to the title of all figures

by Salvador Garcia Munoz sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com
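What include_mean_exc refers to can be sketched as a leave-one-out mean trajectory. The sketch assumes aligned batches (equal number of samples) and a single variable; the function name mean_trajectory_excluding is hypothetical.

```python
def mean_trajectory_excluding(trajectories, which_batch):
    """trajectories: {batch_id: [values]} with equal lengths (aligned data).

    Returns the sample-by-sample mean over every batch except which_batch.
    """
    others = [t for b, t in trajectories.items() if b != which_batch]
    return [sum(col) / len(col) for col in zip(*others)]


trajectories = {
    "B1": [1.0, 2.0, 3.0],
    "B2": [3.0, 4.0, 5.0],
    "B3": [5.0, 6.0, 7.0],
}
print(mean_trajectory_excluding(trajectories, "B1"))
```

Excluding the plotted batch keeps an unusual batch from pulling the reference trajectory toward itself.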

pyphi_batch.pyphi_batch.plot_var_all_batches(bdata, *, which_var=False, plot_title='', mkr_style='.-', phase_samples=False, alpha_=0.2, timecolumn=False, lot_legend=False)[source]

Plotting routine for batch data: plots data for all batches in a dataset

plot_var_all_batches(bdata,*,which_var=False,plot_title='',mkr_style='.-',phase_samples=False,alpha_=0.2,timecolumn=False,lot_legend=False)

Parameters:
  • bdata

    Batch data organized as:

    column[0] = Batch Identifier; the column name is unrestricted

    column[1] = Phase information per sample; must be called ‘Phase’, ‘phase’ or ‘PHASE’ (this information is optional)

    column[2:] = Variables measured throughout the batch

    The data for each batch is one on top of the other in a vertical matrix

  • which_var – Which variables are to be plotted, if not sent, all are.

  • plot_title – Optional text to be used as the title of all figures

  • phase_samples – information used to align the batch, so that phases are marked in the plot

  • alpha_ – Transparency for the phase dividing line

  • timecolumn – Name of the column that indicates time, if given all data is plotted against time

  • lot_legend – Flag to add a legend for the batch identifiers

by Salvador Garcia Munoz sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com

pyphi_batch.pyphi_batch.predict(xbatch, mmvm_obj, *, zinit=False)[source]

Generate predictions for a Multi-way PCA/PLS model

predictions = predict(xbatch,mmvm_obj,*,zinit=False)

Parameters:
  • xbatch – Batch data with the same variables and alignment as the model; predictions are generated for all batches

  • mmvm_obj – Multi-way PLS or PCA

  • zinit – Initial conditions [if any]

Returns:

A dictionary with keys ['Yhat', 'Xhat', 'Tnew', 'speX', 'T2']

Return type:

preds

by Salvador Garcia Munoz

sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com

pyphi_batch.pyphi_batch.r2pv(mmvm_obj, *, which_var=False)[source]

Plot batch r2 for variables as a function of time/sample

r2pv(mmvm_obj,*,which_var=False)

Parameters:
  • mmvm_obj – Multiway PCA or PLS object

  • which_var – Variable for which the plot is done, if not sent all are plotted

by Salvador Garcia Munoz sgarciam@imperial.ac.uk salvadorgarciamunoz@gmail.com

pyphi_batch.pyphi_batch.refold_horizontal(xuf, nvars, nsamples)[source]

pyphi_batch.pyphi_batch.simple_align(bdata, nsamples)[source]

Simple alignment for batch data using row number to linearly interpolate to the same number of samples

bdata_aligned = simple_align(bdata,nsamples)

Parameters:
  • bdata – A Pandas DataFrame where the 1st column is the Batch Identifier and the following columns are variables; each row is a new time sample. Batches are concatenated vertically.

  • nsamples – The new number of samples to generate per batch, irrespective of phase

Returns:

A pandas dataframe with batch data resampled to nsamples for all batches

by Salvador Garcia Munoz (sgarciam@imperial.ac.uk, salvadorgarciamunoz@gmail.com)

pyphi_batch.pyphi_batch.unfold_horizontal(bdata)[source]

pyphi_batch.pyphi_batch.unique(df, colid)[source]

Replacement of the np.unique routine, specifically for dataframes

unique(df,colid)

Parameters:
  • df – A pandas dataframe

  • colid – Column identifier

Returns:

A list with unique values in the order found in the dataframe

by Salvador Garcia (sgarciam@ic.ac.uk)
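The difference from np.unique, which returns its result sorted, is the preserved order of first appearance. A minimal sketch on a plain list; unique_in_order is a hypothetical name and this is not necessarily how the routine is written.

```python
def unique_in_order(values):
    """Unique values in order of first appearance (dicts keep insertion order)."""
    return list(dict.fromkeys(values))


batch_ids = ["B2", "B1", "B2", "B3", "B1"]
print(unique_in_order(batch_ids))
```

Keeping the original order matters for batch data because batches are stacked vertically in the order they were recorded.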

Module contents