pyphi_batch package
pyphi_batch.pyphi_batch module
- pyphi_batch.pyphi_batch.batch_vip(mmvm_obj, *, addtitle=False)[source]
Plot the sum across components of the integral of the absolute value of the loadings for a batch, multiplied by the R2 (this approximately mimics the VIP)
- Parameters:
mmvm_obj – A multiway PCA or PLS model
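The quantity described above can be sketched for a single variable as follows. This is only an illustration under stated assumptions, not the pyphi_batch implementation: the function names are hypothetical, the integral is approximated with the trapezoidal rule at unit sample spacing, and the loadings and R2 values are made up.

```python
import numpy as np

def abs_loading_integral(loading_traj):
    """Trapezoidal integral of |loading| over samples (unit spacing assumed)."""
    a = np.abs(np.asarray(loading_traj, dtype=float))
    return float(0.5 * (a[1:] + a[:-1]).sum())

def vip_like(loadings, r2):
    """Sum over components of (integral of |loading|) times that component's R2.

    loadings: array of shape (n_samples, n_components) for ONE variable.
    r2:       one R2 value per component.
    """
    loadings = np.asarray(loadings, dtype=float)
    return sum(
        abs_loading_integral(loadings[:, k]) * r2[k]
        for k in range(loadings.shape[1])
    )

# Made-up loading trajectories for one variable, two components
demo_loadings = np.array([[1.0, 0.0], [-1.0, 2.0], [1.0, 0.0]])
demo_r2 = [0.5, 0.25]
score = vip_like(demo_loadings, demo_r2)
```

Taking the absolute value before integrating keeps positive and negative loadings from cancelling, so a variable that matters at different times with opposite signs still scores high.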
- pyphi_batch.pyphi_batch.build_rel_time(bdata, *, time_unit='min')[source]
Converts the 'Timestamp' column into a 'Time' column, expressed in time_unit units relative to the start of each batch
bdata_new = build_rel_time(bdata, time_unit='min')
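The conversion can be pictured with plain pandas. The sketch below is not the library code: the helper name, the set of accepted unit strings, and the choice to drop 'Timestamp' from the result are assumptions made for illustration.

```python
import pandas as pd

# Assumed unit strings; only 'min' appears in the documented signature
SECONDS_PER_UNIT = {"sec": 1.0, "min": 60.0, "hr": 3600.0}

def rel_time_sketch(bdata, time_unit="min"):
    """Replace 'Timestamp' with 'Time' elapsed since each batch started."""
    batch_col = bdata.columns[0]          # first column holds the batch ID
    stamps = pd.to_datetime(bdata["Timestamp"])
    # Earliest timestamp of each batch, broadcast back to every row
    start = stamps.groupby(bdata[batch_col]).transform("min")
    elapsed_s = (stamps - start).dt.total_seconds()
    out = bdata.drop(columns="Timestamp").copy()
    out["Time"] = elapsed_s / SECONDS_PER_UNIT[time_unit]
    return out

demo = pd.DataFrame({
    "BatchID": ["B1", "B1", "B2", "B2"],
    "Timestamp": ["2024-01-01 00:00:00", "2024-01-01 00:30:00",
                  "2024-01-02 10:00:00", "2024-01-02 10:15:00"],
    "Temp": [20.0, 25.0, 21.0, 24.0],
})
rel = rel_time_sketch(demo)
```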
- pyphi_batch.pyphi_batch.clean_empty_rows(X, *, shush=False)[source]
Removes empty rows from batch data
- Parameters:
X – Batch data to be cleaned of empty rows (all np.nan), as a DataFrame
- Returns:
X – Batch data with the empty observations removed
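A rough pandas equivalent is shown below. It is a sketch, not the library routine: the helper name, the n_id_cols argument, and resetting the index are assumptions.

```python
import numpy as np
import pandas as pd

def drop_empty_rows(X, n_id_cols=1):
    """Drop rows whose measurement columns are all NaN.

    n_id_cols is a hypothetical argument: how many leading columns
    (batch ID, optionally phase) are identifiers rather than data.
    """
    measurements = X.iloc[:, n_id_cols:]
    is_empty = measurements.isna().all(axis=1)
    return X.loc[~is_empty].reset_index(drop=True)

demo = pd.DataFrame({
    "BatchID": ["B1", "B1", "B1"],
    "Temp": [20.0, np.nan, 22.0],
    "Pressure": [1.0, np.nan, np.nan],
})
cleaned = drop_empty_rows(demo)
```

Only the middle row is dropped; the last row is kept because it still has one valid measurement.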
- pyphi_batch.pyphi_batch.contributions(mmvmobj, X, cont_type, *, to_obs=False, from_obs=False, lv_space=False, phase_samples=False, dyn_conts=False, which_var=False, plot_title='')[source]
Plot batch contributions to scores, HT2 or SPE
contributions(mmvmobj, X, cont_type, to_obs=False, from_obs=False, lv_space=False, phase_samples=False, dyn_conts=False, which_var=False, plot_title='')
- Parameters:
mmvmobj – Multiway model
X – Batch data
cont_type – 'scores' | 'ht2' | 'spe'
to_obs – Observation to calculate contributions to
from_obs – Relative basis to calculate contributions to [for 'scores' and 'ht2' only]; if not sent, the origin of the model is used as the base.
- pyphi_batch.pyphi_batch.descriptors(bdata, which_var, desc, *, phase=False)[source]
Get descriptor values for a batch trajectory
descriptors_df = descriptors(bdata, which_var, desc, phase=False)
- Parameters:
bdata – Dataframe of batch data, first column is batch ID, second column can be phase id
which_var – List of variables to get descriptors for
desc – List of descriptors to calculate; options are: 'min', 'max', 'mean', 'median', 'std', 'var', 'range', 'ave_slope'
phase – Phases to calculate the descriptors for [optional]
- Returns:
A dataframe with the descriptors per batch
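As an illustration of what a per-batch descriptor table looks like, here is a minimal pandas sketch. It is not the pyphi_batch implementation: the helper name and the "<variable>_<descriptor>" column naming are assumptions, and only a few of the listed descriptors are covered.

```python
import pandas as pd

# A subset of the documented descriptors, written as plain functions
DESCRIPTORS = {
    "min": lambda s: s.min(),
    "max": lambda s: s.max(),
    "mean": lambda s: s.mean(),
    "range": lambda s: s.max() - s.min(),
}

def descriptors_sketch(bdata, which_var, desc):
    """One row per batch, one column per (variable, descriptor) pair."""
    batch_col = bdata.columns[0]
    rows = {}
    for batch_id, grp in bdata.groupby(batch_col, sort=False):
        rows[batch_id] = {
            f"{var}_{d}": DESCRIPTORS[d](grp[var])
            for var in which_var
            for d in desc
        }
    return pd.DataFrame.from_dict(rows, orient="index")

demo = pd.DataFrame({
    "BatchID": ["B1", "B1", "B1", "B2", "B2", "B2"],
    "Temp": [20.0, 30.0, 40.0, 25.0, 25.0, 31.0],
})
summary = descriptors_sketch(demo, ["Temp"], ["min", "max", "mean", "range"])
```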
- pyphi_batch.pyphi_batch.loadings(mmvm_obj, dim, *, r2_weighted=False, which_var=False)[source]
Plot batch loadings for variables as a function of time/sample
- Parameters:
mmvm_obj – Multiway PCA or PLS object
dim – What component or latent variable to plot
r2_weighted – If True => weight the loading by the R2pv
which_var – Variable for which the plot is done, if not sent all are plotted
- pyphi_batch.pyphi_batch.loadings_abs_integral(mmvm_obj, *, r2_weighted=False, addtitle=False)[source]
Plot the integral of the absolute value of loadings for a batch
- Parameters:
mmvm_obj – A multiway PCA or PLS model
r2_weighted – Boolean flag; if True, the loadings are weighted by the R2pv
addtitle – Text to place in the title of the figure
- pyphi_batch.pyphi_batch.monitor(mmvm_obj, bdata, *, which_batch=False, zinit=False, build_ci=True, shush=False, soft_sensor=False)[source]
Routine to mimic the real-time monitoring of a batch given a model
- Usage: first run monitor(mmvm_obj, bdata)
to mimic monitoring for all batches in bdata and build the CI; these new parameters are written back to mmvm_obj.
Then you can run:
- diagnostics = monitor(mmvm_obj, bdata, which_batch=your_batchid)
to mimic monitoring for your_batchid; this produces all dynamic metrics and forecasts
- diagnostics = monitor(mmvm_obj, bdata, which_batch=your_batchid, zinit=your_z_data)
to mimic monitoring for your_batchid using initial conditions; this produces all dynamic metrics and forecasts
- diagnostics = monitor(mmvm_obj, bdata, which_batch=your_batchid, soft_sensor=your_variable)
to mimic monitoring for your_batchid; this produces all dynamic metrics and forecasts, plus soft-sensor predictions for your_variable
- Returns:
diagnostics – A dictionary with all the monitoring diagnostics and contributions
- pyphi_batch.pyphi_batch.mpca(xbatch, a, *, unfolding='batch wise', phase_samples=False, cross_val=0)[source]
Multi-way PCA for batch analysis
mpca_obj = mpca(xbatch, a, unfolding='batch wise', phase_samples=False, cross_val=0)
- Parameters:
xbatch – Pandas dataframe with aligned batch data; it is assumed that all batches have the same number of samples
a – Number of PCs to fit
unfolding – 'batch wise' or 'variable wise'
phase_samples – Information about samples per phase [optional]
cross_val – Percent of elements for cross validation (default is 0 = no cross validation)
- Returns:
A Dictionary with all the parameters for the MPCA model
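To make the 'batch wise' option concrete, the sketch below unfolds aligned batch data so that each batch becomes a single row. It is an illustration only: the helper name is hypothetical, the data is assumed to have no phase column, and the variable-major column order (all samples of the first variable, then all samples of the second) is an assumption rather than a statement about how pyphi_batch orders the unfolded matrix.

```python
import numpy as np
import pandas as pd

def unfold_batchwise(xbatch):
    """Return (batch_ids, matrix) with one row per batch.

    Assumes column 0 is the batch ID, all other columns are variables,
    and every batch has the same number of samples (already aligned).
    """
    batch_col = xbatch.columns[0]
    var_cols = list(xbatch.columns[1:])
    ids, rows = [], []
    for batch_id, grp in xbatch.groupby(batch_col, sort=False):
        ids.append(batch_id)
        # order="F" lays out each variable's full trajectory contiguously
        rows.append(grp[var_cols].to_numpy(dtype=float).ravel(order="F"))
    return ids, np.vstack(rows)

demo = pd.DataFrame({
    "BatchID": ["B1"] * 3 + ["B2"] * 3,
    "Temp": [20.0, 30.0, 40.0, 21.0, 31.0, 41.0],
    "Pressure": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5],
})
ids, unfolded = unfold_batchwise(demo)
```

With 2 batches, 3 samples and 2 variables the unfolded matrix has shape (2, 6), which is why all batches must share the same number of samples.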
- pyphi_batch.pyphi_batch.mpls(xbatch, y, a, *, zinit=False, phase_samples=False, mb_each_var=False, cross_val=0, cross_val_X=False)[source]
Multi-way PLS for batch analysis
mpls_obj = mpls(xbatch, y, a, zinit=False, phase_samples=False, mb_each_var=False, cross_val=0, cross_val_X=False)
- Parameters:
xbatch – Pandas dataframe with aligned batch data; it is assumed that all batches have the same number of samples
y – Response to predict, one row per batch
a – Number of PCs to fit
zinit – Initial conditions [optional]
phase_samples – Alignment information
mb_each_var – If True, each measured variable is made a block; otherwise zinit is one block and xbatch another
- Returns:
A dictionary with all the parameters of the MPLS model
- pyphi_batch.pyphi_batch.phase_iv_align(bdata, nsamples)[source]
Batch alignment using an indicator variable
batch_aligned_data = phase_iv_align(bdata,nsamples)
- Parameters:
bdata – A Pandas DataFrame where the 1st column is the batch identifier, the second column is a phase indicator, and the following columns are variables; each row is a new time sample. Batches are concatenated vertically.
nsamples – A dictionary with the number of samples to generate per phase, e.g.
nsamples = {'Heating':100,'Reaction':200,'Cooling':10}
If an indicator variable is used, with known start and end values, indicate it with a list like this:
nsamples = {'Heating':['TIC101',100,30,50],'Reaction':200,'Cooling':10}
During the 'Heating' phase, TIC101 is used as the indicator variable: 100 samples are taken equidistant from TIC101=30 to TIC101=50, and the data is aligned against that variable as a measure of batch evolution (instead of time).
If an indicator variable is used, with unknown start but known end values, indicate it with a list like this:
nsamples = {'Heating':['TIC101',100,50],'Reaction':200,'Cooling':10}
During the 'Heating' phase, TIC101 is used as the indicator variable: 100 samples are taken equidistant from the value of TIC101 at the start of the phase to the point where TIC101=50, and the data is aligned against that variable as a measure of batch evolution (instead of time).
If no indicator variable (IV) is sent, the resampling is linear with respect to row number per phase.
- Returns:
A pandas dataframe with batch data resampled (aligned)
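The idea of aligning against an indicator variable instead of time can be sketched with numpy.interp for a single phase of a single batch. This is a simplified illustration, not the library routine: the helper name is hypothetical, and the indicator variable is assumed to increase monotonically within the phase.

```python
import numpy as np
import pandas as pd

def iv_align_phase(phase_df, iv_name, n, iv_start, iv_end):
    """Resample one phase onto n equidistant values of the indicator variable.

    Assumes phase_df[iv_name] is monotonically increasing, which
    np.interp requires of its x-coordinates.
    """
    grid = np.linspace(iv_start, iv_end, n)
    iv_values = phase_df[iv_name].to_numpy(dtype=float)
    aligned = {iv_name: grid}
    for col in phase_df.columns:
        if col != iv_name:
            # Interpolate every other variable as a function of the IV
            aligned[col] = np.interp(
                grid, iv_values, phase_df[col].to_numpy(dtype=float)
            )
    return pd.DataFrame(aligned)

# Made-up 'Heating' phase: TIC101 rises from 30 to 50
heating = pd.DataFrame({
    "TIC101": [30.0, 40.0, 50.0],
    "Pressure": [1.0, 2.0, 3.0],
})
aligned = iv_align_phase(heating, "TIC101", 5, 30, 50)
```

Here the five aligned samples sit at TIC101 = 30, 35, 40, 45 and 50, regardless of how long the batch took to reach each value.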
- pyphi_batch.pyphi_batch.phase_sampling_dist(bdata, time_column=False, addtitle=False, use_phases=False)[source]
Count and plot a histogram of the distribution of samples (or time if time_column is indicated) consumed per phase on a batch dataset
- Parameters:
bdata –
Batch data organized as:
column[0] = Batch identifier; the column name is unrestricted
column[1] = Phase information per sample; must be called 'Phase', 'phase', or 'PHASE' (this information is optional)
column[2:] = Variables measured throughout the batch
time_column – Name of the column with time; if not sent, counting is done in terms of samples
addtitle – Optional text to be placed as the figure title
use_phases – In case the user wants to only do counting for a subset of phases
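The counting step behind such a histogram can be approximated with a pandas groupby. The sketch below is illustrative only: the helper name is hypothetical, it counts samples (not time), and it leaves the plotting out.

```python
import pandas as pd

def samples_per_phase(bdata):
    """Table of sample counts: one row per batch, one column per phase."""
    batch_col, phase_col = bdata.columns[0], bdata.columns[1]
    return (
        bdata.groupby([batch_col, phase_col], sort=False)
        .size()
        .unstack(fill_value=0)
    )

demo = pd.DataFrame({
    "BatchID": ["B1", "B1", "B1", "B2", "B2", "B2", "B2"],
    "Phase": ["Heating", "Heating", "Cooling",
              "Heating", "Cooling", "Cooling", "Cooling"],
    "Temp": [20.0, 30.0, 25.0, 21.0, 28.0, 24.0, 22.0],
})
counts = samples_per_phase(demo)
```

Each column of counts holds the values that would feed the histogram for that phase.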
- pyphi_batch.pyphi_batch.phase_simple_align(bdata, nsamples)[source]
Simple batch alignment (0 to 100%) per phase
bdata_aligned = phase_simple_align(bdata,nsamples)
- Parameters:
bdata – A Pandas DataFrame where the 1st column is the batch identifier, the second column is a phase indicator, and the following columns are variables; each row is a new time sample. Batches are concatenated vertically.
nsamples –
if integer: Number of samples to collect per phase
if dictionary: samples to generate per phase, e.g.
nsamples = {'Heating':100,'Reaction':200,'Cooling':10}
Resampling is linear with respect to row number
- Returns:
a pandas dataframe with batch data resampled (aligned)
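Resampling that is linear with respect to row number can be sketched for one batch as follows. This is a minimal illustration under assumptions, not the pyphi_batch code: the helper name is hypothetical and the input is a single batch with numeric variables.

```python
import numpy as np
import pandas as pd

def phase_resample_sketch(batch_df, nsamples):
    """Resample each phase of ONE batch to a fixed number of rows.

    nsamples is either an int (same count for every phase) or a dict
    mapping phase name to the number of samples for that phase.
    """
    batch_col, phase_col = batch_df.columns[0], batch_df.columns[1]
    pieces = []
    for phase, grp in batch_df.groupby(phase_col, sort=False):
        n = nsamples[phase] if isinstance(nsamples, dict) else nsamples
        old_rows = np.arange(len(grp))                 # 0, 1, ..., len-1
        new_rows = np.linspace(0, len(grp) - 1, n)     # n fractional row positions
        piece = pd.DataFrame({
            col: np.interp(new_rows, old_rows, grp[col].to_numpy(dtype=float))
            for col in grp.columns[2:]
        })
        piece.insert(0, phase_col, phase)
        piece.insert(0, batch_col, grp.iloc[0, 0])
        pieces.append(piece)
    return pd.concat(pieces, ignore_index=True)

demo = pd.DataFrame({
    "BatchID": ["B1"] * 5,
    "Phase": ["Heating", "Heating", "Heating", "Cooling", "Cooling"],
    "Temp": [20.0, 30.0, 40.0, 40.0, 20.0],
})
resampled = phase_resample_sketch(demo, {"Heating": 5, "Cooling": 3})
```

Because every batch ends up with the same number of rows per phase, phase boundaries land on the same row index in all batches.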
- pyphi_batch.pyphi_batch.plot_batch(bdata, which_batch, which_var, *, include_mean_exc=False, include_set=False, phase_samples=False, single_plot=False, plot_title='')[source]
Plotting routine for batch data
plot_batch(bdata, which_batch, which_var, include_mean_exc=False, include_set=False, phase_samples=False, single_plot=False, plot_title='')
- Parameters:
bdata –
Batch data organized as:
column[0] = Batch identifier; the column name is unrestricted
column[1] = Phase information per sample; must be called 'Phase', 'phase', or 'PHASE' (this information is optional)
column[2:] = Variables measured throughout the batch
The data for each batch is stacked one on top of the other in a vertical matrix
which_batch – Which batches to plot
which_var – Which variables are to be plotted, if not sent, all are.
include_mean_exc – Include the mean trajectory of the set EXCLUDING the one batch being plotted
include_set – Include all other trajectories (will be colored in light gray)
phase_samples – Information used to align the batch, so that phases are marked in the plot
single_plot – If True => Plot everything in a single axis
plot_title – Optional text to be added to the title of all figures
- pyphi_batch.pyphi_batch.plot_var_all_batches(bdata, *, which_var=False, plot_title='', mkr_style='.-', phase_samples=False, alpha_=0.2, timecolumn=False, lot_legend=False)[source]
Plotting routine for batch data plot data for all batches in a dataset
plot_var_all_batches(bdata, which_var=False, plot_title='', mkr_style='.-', phase_samples=False, alpha_=0.2, timecolumn=False, lot_legend=False)
- Parameters:
bdata –
Batch data organized as:
column[0] = Batch identifier; the column name is unrestricted
column[1] = Phase information per sample; must be called 'Phase', 'phase', or 'PHASE' (this information is optional)
column[2:] = Variables measured throughout the batch
The data for each batch is stacked one on top of the other in a vertical matrix
which_var – Which variables are to be plotted, if not sent, all are.
plot_title – Optional text to be used as the title of all figures
phase_samples – information used to align the batch, so that phases are marked in the plot
alpha_ – Transparency for the phase dividing line
timecolumn – Name of the column that indicates time, if given all data is plotted against time
lot_legend – Flag to add a legend for the batch identifiers
- pyphi_batch.pyphi_batch.predict(xbatch, mmvm_obj, *, zinit=False)[source]
Generate predictions for a Multi-way PCA/PLS model
predictions = predict(xbatch, mmvm_obj, zinit=False)
- Parameters:
xbatch – Batch data with the same variables and alignment as the model; predictions are generated for all batches
mmvm_obj – Multi-way PLS or PCA
zinit – Initial conditions [if any]
- Returns:
A dictionary with keys ['Yhat', 'Xhat', 'Tnew', 'speX', 'T2']
- pyphi_batch.pyphi_batch.r2pv(mmvm_obj, *, which_var=False)[source]
Plot batch r2 for variables as a function of time/sample
- Parameters:
mmvm_obj – Multiway PCA or PLS object
which_var – Variable for which the plot is done, if not sent all are plotted
- pyphi_batch.pyphi_batch.simple_align(bdata, nsamples)[source]
Simple alignment for batch data using row number to linearly interpolate to the same number of samples
bdata_aligned = simple_align(bdata,nsamples)
- Parameters:
bdata – A Pandas DataFrame where the 1st column is the batch identifier and the following columns are variables; each row is a new time sample. Batches are concatenated vertically.
nsamples – The new number of samples to generate per batch, irrespective of phase
- Returns:
A pandas dataframe with batch data resampled to nsamples for all batches
- pyphi_batch.pyphi_batch.unique(df, colid)[source]
Replacement of the np.unique routine, specifically for dataframes
- Parameters:
df – A pandas dataframe
colid – Column identifier
- Returns:
A list with unique values in the order found in the dataframe
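The difference from np.unique is the ordering: np.unique returns sorted values, while this routine keeps the order of first appearance. A minimal sketch of that behaviour follows; the helper name is hypothetical and this is not the library's own code.

```python
import numpy as np
import pandas as pd

def unique_in_order(df, colid):
    """Unique values of a column, in order of first appearance."""
    # dict keys are unique and preserve insertion order
    return list(dict.fromkeys(df[colid].tolist()))

demo = pd.DataFrame({"BatchID": ["B2", "B2", "B1", "B3", "B1"]})
ordered = unique_in_order(demo, "BatchID")
sorted_by_numpy = np.unique(demo["BatchID"].tolist()).tolist()
```

Preserving the order matters for batch data because batches are stacked vertically and the order of the identifiers usually follows the order of the rows.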
by Salvador Garcia