pyphi.batch module
Created on Mon Apr 11 14:58:35 2022
Batch data is assumed to come in an Excel file whose first column is the batch identifier and whose following columns are process variables. Optionally, the second column, labeled 'PHASE', 'Phase' or 'phase', indicates the phase of execution.
Change log:
- added Dec 28 2023: Titles can be sent to contribution plots via the plot_title flag. Monitoring diagnostics are also plotted against sample number, starting at 1.
- added Dec 27 2023: Corrected plots to use the correct x-axis, starting at sample = 1. Amended indicator variable alignment to keep the original data instead of replacing the IV with a linear sequence.
- added Dec 4 2023: Added a BatchVIP calculation.
- added Apr 23 2023: Corrected a very dumb mistake I made coding when tired.
- added Apr 18 2023: Added the descriptors routine to obtain landmarks of the batch, such as min, max, ave of a variable [during a phase if indicated so]. Modified plot_var_all_batches to plot against the values in a Time column and to add the legend for the BatchID.
- added Apr 10 2023: Added batch contribution plots. Added build_rel_time to create a tag of relative run time from a timestamp.
- added Apr 7 2023: Added alignment using an indicator variable per phase.
- added Apr 5 2023: Added the capability to monitor a variable in "Soft Sensor" mode, which implies there are no measurements for it (pure prediction), as opposed to a forecast, where new measurements arrive over time.
- added Jul 20 2022: Distribution of number of samples per phase plot.
- added Aug 10 2022: refold_horizontal | clean_empty_rows | predict.
- added Aug 12 2022: replicate_batch.
@author: S. Garcia-Munoz sgarciam@ic.ak.uk salg@andrew.cmu.edu
- pyphi.batch.unique(df, colid)
Return unique values from a DataFrame column, preserving order of first occurrence.
A replacement for np.unique that does not sort the result, returning values in the order they first appear in the DataFrame.
- Parameters:
df (pd.DataFrame) – Input DataFrame.
colid (str) – Name of the column to extract unique values from.
- Returns:
Unique values in the order they first appear in df[colid].
- Return type:
list
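The order-preserving behaviour described above can be illustrated without pyphi. This is a minimal sketch of the idea, not the library's implementation; the helper name `unique_in_order` and the sample batch IDs are hypothetical.

```python
def unique_in_order(values):
    """Return the distinct items of `values` in order of first appearance."""
    # dict keys keep insertion order (Python 3.7+), so duplicates are
    # dropped without sorting the result.
    return list(dict.fromkeys(values))


# Hypothetical batch-ID column in which IDs repeat once per sample.
batch_ids = ["B3", "B3", "B1", "B1", "B2", "B3"]
ordered = unique_in_order(batch_ids)
print(ordered)
```

A sorting approach would return the IDs alphabetically instead, which would scramble the batch order of the stacked data.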
- pyphi.batch.mean(X, axis)
Compute the mean of a 2-D array along an axis, ignoring NaN values.
- Parameters:
X (np.ndarray) – 2-D input array, may contain np.nan.
axis (int) – Axis along which to compute the mean. 0 = column-wise (mean of each column across rows). 1 = row-wise (mean of each row across columns).
- Returns:
1-D array of mean values, with NaN entries excluded from the denominator so results remain unbiased in the presence of missing data.
- Return type:
np.ndarray
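To make the axis convention and the NaN handling concrete, the sketch below uses NumPy's own `np.nanmean`, which follows the same rule of excluding NaN entries from the denominator. Whether pyphi.batch.mean calls it internally is not stated here, so treat this as an equivalent illustration only; the array values are made up.

```python
import numpy as np

# Three samples (rows) of two variables (columns); one value is missing.
X = np.array([[1.0, np.nan],
              [3.0, 4.0],
              [5.0, 8.0]])

# axis=0: one mean per column, computed across rows.
col_means = np.nanmean(X, axis=0)
# axis=1: one mean per row, computed across columns.
row_means = np.nanmean(X, axis=1)

print(col_means)
print(row_means)
```

The second column mean is divided by 2, not 3, because only two of its entries are present.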
- pyphi.batch.simple_align(bdata, nsamples)
Align batch data to a common length by linear interpolation on row index.
Resamples every batch to exactly nsamples rows by linearly interpolating each variable against the original row sequence. No phase information is used; all samples are treated as a single continuous trajectory.
- Parameters:
bdata (pd.DataFrame) – Batch data. First column is batch ID; second column may optionally be a phase label ('Phase', 'phase', or 'PHASE'); remaining columns are process variables. Batches are stacked vertically.
nsamples (int) – Target number of samples per batch after alignment.
- Returns:
Aligned batch data with all batches resampled to nsamples rows. Phase labels (if present) are mapped to the nearest original sample using rounded interpolation indices.
- Return type:
pd.DataFrame
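The resampling idea behind this alignment can be sketched for a single variable of a single batch. This is an illustration with NumPy's `np.interp`, not pyphi's code; the helper name `resample_on_row_index` and the trajectories are hypothetical.

```python
import numpy as np


def resample_on_row_index(values, nsamples):
    """Linearly resample a 1-D trajectory to `nsamples` points."""
    values = np.asarray(values, dtype=float)
    old_index = np.arange(len(values))
    # New sample positions are spread evenly over the original row range.
    new_index = np.linspace(0, len(values) - 1, nsamples)
    return np.interp(new_index, old_index, values)


# Two hypothetical batches that follow the same ramp at different sampling rates.
short_batch = [0.0, 10.0, 20.0]             # 3 samples
long_batch = [0.0, 5.0, 10.0, 15.0, 20.0]   # 5 samples

aligned_short = resample_on_row_index(short_batch, 5)
aligned_long = resample_on_row_index(long_batch, 5)
print(aligned_short)
print(aligned_long)
```

After resampling, both batches have five rows and can be compared sample by sample.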
- pyphi.batch.phase_simple_align(bdata, nsamples)
Align batch data to a common length per phase by linear interpolation.
Resamples each phase of each batch independently to the specified number of samples, then concatenates phases back in order. Requires phase information in the second column.
- Parameters:
bdata (pd.DataFrame) – Batch data. First column is batch ID; second column must be a phase label ('Phase', 'phase', or 'PHASE'); remaining columns are process variables. Batches are stacked vertically.
nsamples (dict) – Number of samples to generate per phase. Keys must match the phase labels in the data. Resampling is linear with respect to row number within each phase. Example: {'Heating': 100, 'Reaction': 200, 'Cooling': 10}.
- Returns:
Aligned batch data with each phase resampled to the specified number of samples, phases concatenated in key order of nsamples.
- Return type:
pd.DataFrame
- pyphi.batch.phase_iv_align(bdata, nsamples)
Align batch data using an indicator variable (IV) or row index, per phase.
Provides the most flexible alignment: each phase can be aligned either by linear resampling (default) or by using a monotonically changing process variable (indicator variable) as the alignment axis.
- Parameters:
bdata (pd.DataFrame) – Batch data. First column is batch ID; second column must be a phase label ('Phase', 'phase', or 'PHASE'); remaining columns are process variables. Batches are stacked vertically.
nsamples (dict) – Alignment specification per phase. Each value can be:
an integer for linear alignment, e.g. {'Heating': 100, 'Reaction': 200};
a 4-element list for IV alignment with known start and end, e.g. ['TIC101', 100, 30, 50] (IVarID, num_samples, start_value, end_value); or
a 3-element list for IV alignment with known end only, e.g. ['TIC101', 100, 50], where start_value is taken from the first row of that phase.
The indicator variable must be monotonically increasing or decreasing within the phase; non-monotonic samples are removed with a warning before interpolation.
- Returns:
Aligned batch data with each phase resampled as specified, phases concatenated in key order of nsamples.
- Return type:
pd.DataFrame
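Indicator-variable alignment replaces the row index with a process variable as the interpolation axis. The sketch below shows that idea for one variable in one phase, assuming the IV is already strictly monotonic; it does not reproduce pyphi's removal of non-monotonic samples. The helper `align_on_indicator` and the data are hypothetical; only the tag name 'TIC101' is borrowed from the example above.

```python
import numpy as np


def align_on_indicator(iv, var, num_samples, start_value, end_value):
    """Resample `var` on an evenly spaced grid of the indicator variable."""
    iv = np.asarray(iv, dtype=float)
    var = np.asarray(var, dtype=float)
    grid = np.linspace(start_value, end_value, num_samples)
    if iv[0] > iv[-1]:
        # np.interp needs increasing x values, so flip a decreasing IV.
        iv, var = iv[::-1], var[::-1]
    return grid, np.interp(grid, iv, var)


# Hypothetical heating phase: the temperature rises unevenly in time.
tic101 = [30.0, 32.0, 40.0, 50.0]
pressure = [1.0, 1.2, 2.0, 3.0]

grid, aligned = align_on_indicator(tic101, pressure, 5, 30.0, 50.0)
print(grid)
print(aligned)
```

Each aligned row now corresponds to the same temperature in every batch, regardless of how long the batch took to reach it.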
- pyphi.batch.plot_var_all_batches(bdata, *, which_var=False, plot_title='', mkr_style='.-', phase_samples=False, alpha_=0.2, timecolumn=False, lot_legend=False)
Plot trajectories of one or more variables for all batches.
Produces one Matplotlib figure per variable, with each batch overlaid as a separate line. Optionally adds phase boundary annotations and a batch legend.
- Parameters:
bdata (pd.DataFrame) – Batch data. First column is batch ID; second column may optionally be a phase label; remaining columns are process variables. Batches are stacked vertically.
which_var (list[str], str, or bool) – Variables to plot. If False (default), all process variables are plotted.
plot_title (str) – Title applied to all figures. Default ''.
mkr_style (str) – Matplotlib line/marker style string, e.g. '.-' (default), 'o', '-'.
phase_samples (dict or bool) – Phase structure used to annotate phase boundaries as vertical magenta lines. Pass the same nsamples dict used for alignment. Default False (no annotations).
alpha_ (float) – Transparency of phase boundary lines (0–1). Default 0.2.
timecolumn (str or bool) – If a column name is given, the x-axis uses the values in that column instead of the sample sequence index. Default False.
lot_legend (bool) – If True, adds a legend showing each batch ID. Default False.
- Returns:
Displays one Matplotlib figure per variable.
- Return type:
None
- pyphi.batch.plot_batch(bdata, which_batch, which_var, *, include_mean_exc=False, include_set=False, phase_samples=False, single_plot=False, plot_title='')
Plot the trajectory of one or more batches for selected variables.
Highlights the specified batch(es) in black against an optional backdrop of all other batch trajectories and/or their mean.
- Parameters:
bdata (pd.DataFrame) – Batch data. First column is batch ID; second column may optionally be a phase label; remaining columns are process variables. Batches are stacked vertically.
which_batch (str or list[str]) – Batch ID(s) to highlight.
which_var (str or list[str]) – Variable name(s) to plot.
include_mean_exc (bool) – If True, overlays the mean trajectory of all other batches (excluding the highlighted batch) in red. Default False.
include_set (bool) – If True, overlays all other batch trajectories in light magenta for context. Default False.
phase_samples (dict or bool) – Phase structure for annotating phase boundaries. Pass the same nsamples dict used for alignment. Default False (no annotations).
single_plot (bool) – If True, plots all selected variables on a single axis. If False (default), each variable gets its own figure.
plot_title (str) – Text appended to each figure title after the batch ID. Default ''.
- Returns:
Displays one or more Matplotlib figures.
- Return type:
None
- pyphi.batch.unfold_horizontal(bdata)
Unfold batch data horizontally (batch-wise unfolding).
Reshapes aligned batch data from the vertical stacked format (samples × variables) into a 2-D matrix where each row is one batch and columns represent all variables at all time points:
[Var1_t1, Var1_t2, ..., Var1_tN, Var2_t1, ...].
This is the standard preprocessing step before fitting mpca() or mpls() with batch-wise unfolding.
- Parameters:
bdata (pd.DataFrame) – Aligned batch data. First column is batch ID; second column may optionally be a phase label; remaining columns are process variables. All batches must have the same number of rows.
- Returns:
bdata_hor (pd.DataFrame): Unfolded matrix, one row per batch. First column is batch ID; remaining columns are variable-time combinations.
colnames (list[str]): Column names of the unfolded matrix (e.g. ['Var1_1', 'Var1_2', ..., 'VarN_T']).
bid (list[str]): Variable-block identifiers: for each column in bdata_hor, the name of the original process variable it belongs to. Used internally to reconstruct block structure.
- Return type:
tuple
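The column ordering described above (all time points of Var1, then all time points of Var2, and so on) can be reproduced on a plain NumPy array. This is a sketch of the reshaping logic only, assuming a purely numeric block with no ID or phase columns; it is not the function's implementation, and the data are made up.

```python
import numpy as np

n_batches, nsamples, nvars = 2, 3, 2

# Vertically stacked data: one row per (batch, sample), one column per variable.
stacked = np.array([
    [1, 10], [2, 20], [3, 30],   # batch A: Var1 = 1, 2, 3; Var2 = 10, 20, 30
    [4, 40], [5, 50], [6, 60],   # batch B
])

# Split rows into batches, then move the variable axis ahead of the time axis
# so that each variable's full trajectory is contiguous in the unfolded row.
cube = stacked.reshape(n_batches, nsamples, nvars)
unfolded = cube.transpose(0, 2, 1).reshape(n_batches, nvars * nsamples)

colnames = [f"Var{v + 1}_{t + 1}" for v in range(nvars) for t in range(nsamples)]
print(colnames)
print(unfolded)
```

Each row of `unfolded` is now one batch, which is the observation unit a batch-wise model works with.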
- pyphi.batch.refold_horizontal(xuf, nvars, nsamples)
Refold a horizontally unfolded batch matrix back to vertically stacked form.
Inverts the operation of unfold_horizontal(), converting a 2-D unfolded matrix (one row per batch) back to a stacked arrangement (total_samples × nvars), suitable for conversion back to a DataFrame.
- Parameters:
xuf (np.ndarray) – Horizontally unfolded batch data, shape (n_batches × (nvars × nsamples)). Strictly numeric — no ID column.
nvars (int) – Number of process variables per sample.
nsamples (int) – Number of time samples per batch.
- Returns:
Refolded array of shape (n_batches × nsamples, nvars), where the rows for each batch are stacked vertically.
- Return type:
np.ndarray
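A refolding step of this kind can be sketched as a pair of reshapes. The sketch assumes the variable-major column order documented for unfold_horizontal() ([Var1_t1, ..., Var1_tN, Var2_t1, ...]); the helper name `refold` and the numbers are hypothetical, and pyphi's own code may differ.

```python
import numpy as np


def refold(xuf, nvars, nsamples):
    """Turn (n_batches, nvars * nsamples) into (n_batches * nsamples, nvars)."""
    xuf = np.asarray(xuf)
    n_batches = xuf.shape[0]
    # Columns are grouped by variable, so the middle axis is the variable.
    cube = xuf.reshape(n_batches, nvars, nsamples)
    # Put time before variable, then stack the batches vertically.
    return cube.transpose(0, 2, 1).reshape(n_batches * nsamples, nvars)


xuf = np.array([[1, 2, 3, 10, 20, 30],
                [4, 5, 6, 40, 50, 60]])
refolded = refold(xuf, nvars=2, nsamples=3)
print(refolded)
```

The first three rows of `refolded` belong to the first batch and the next three to the second, matching the stacked layout used elsewhere in this module.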
- pyphi.batch.loadings(mmvm_obj, dim, *, r2_weighted=False, which_var=False)
Plot batch model loadings as a function of sample number.
For each process variable, produces a filled-area plot of the loading (W* for PLS, P for PCA) vs. sample index, so the temporal pattern of each variable’s influence on the model can be inspected visually.
For PLS models with initial conditions (ninit > 0), an additional bar chart is produced for the initial-condition variables.
- Parameters:
mmvm_obj (dict) – Multi-way PCA model from mpca() or multi-way PLS model from mpls().
dim (int) – Component index to plot (1-indexed).
r2_weighted (bool) – If True, multiplies each loading by its corresponding per-variable R² value before plotting, so variables that explain more variance appear larger. Default False.
which_var (str, list[str], or bool) – Process variable name(s) to plot. If False (default), all variables are plotted.
- Returns:
Displays one Matplotlib figure per variable (plus one bar chart for initial conditions if applicable).
- Return type:
None
- pyphi.batch.loadings_abs_integral(mmvm_obj, *, r2_weighted=False, addtitle=False)
Plot the integral of absolute loadings per variable across all LVs/PCs.
For each latent variable / principal component, produces a bar chart where each bar represents the sum of absolute loading values over all time samples for that variable. This gives a scalar importance measure for each process variable per component.
- Parameters:
mmvm_obj (dict) – Multi-way PCA model from mpca() or multi-way PLS model from mpls().
r2_weighted (bool) – If True, weights each loading by its per-variable R² before summing, emphasising variables that also explain more variance. Default False.
addtitle (str or bool) – Optional string to use as the figure title. If False (default), no title is added.
- Returns:
Displays one Matplotlib figure per latent variable / PC.
- Return type:
None
- pyphi.batch.batch_vip(mmvm_obj, *, addtitle=False)
Plot a batch-level VIP score summarising variable importance across all LVs.
Computes a scalar VIP-like score for each process variable by summing the absolute loadings weighted by R²Y across all latent variables, then summing over all time samples. The result is shown as a bar chart, sorted by variable (not by VIP magnitude).
This is conceptually analogous to the standard VIP but adapted for the temporal, unfolded batch model structure.
- Parameters:
mmvm_obj (dict) – Multi-way PLS model from mpls(). For PCA models the plot uses R²X weighting instead of R²Y.
addtitle (str or bool) – Optional string to use as the figure title. If False (default), no title is added.
- Returns:
Displays a single Matplotlib bar chart.
- Return type:
None
- pyphi.batch.r2pv(mmvm_obj, *, which_var=False)
Plot cumulative R² per variable as a function of sample number.
For each process variable, produces a stacked filled-area plot where each band represents the cumulative R² contribution of one LV/PC at each time sample. For PLS models, a separate stacked bar chart is also produced for the Y-space R²pvY.
For models with initial conditions (ninit > 0), an additional bar chart is produced for the initial-condition variable R² values.
- Parameters:
- Returns:
Displays one Matplotlib figure per variable, plus additional figures for Y-space and initial conditions where applicable.
- Return type:
None
- pyphi.batch.mpca(xbatch, a, *, unfolding='batch wise', phase_samples=False, cross_val=0)
Fit a Multi-way PCA (MPCA) model to aligned batch data.
Unfolds the batch data into a 2-D matrix (batch-wise or variable-wise) and fits a PCA model. Low-variance columns are removed automatically and their positions are restored in the model loadings for consistent interpretation.
- Parameters:
xbatch (pd.DataFrame) – Aligned batch data, all batches having the same number of samples. First column is batch ID; second column may optionally be a phase label; remaining columns are process variables. Batches are stacked vertically.
a (int) – Number of principal components to fit.
unfolding (str) – Unfolding strategy. 'batch wise' (default) unfolds to one row per batch; 'variable wise' keeps the observation-per-sample structure.
phase_samples (dict or bool) – Phase structure stored in the model for use in plotting functions. Pass the same nsamples dict used for alignment. Default False.
cross_val (int) – Cross-validation percentage of elements to remove per round. 0 (default) = no CV; 100 = leave-one-out.
- Returns:
Multi-way PCA model object extending the standard PCA dict from pyphi.calc.pca() with additional batch-specific keys:
'varidX' (list[str]): Variable column names in unfolded order.
'bid' (list[str]): Block ID for each column (original variable name).
'uf' (str): Unfolding strategy used ('batch wise').
'phase_samples': Phase structure passed in (for plotting).
'nvars' (int): Number of process variables per sample.
'nbatches' (int): Number of batches in the training set.
'nsamples' (int): Number of samples per batch.
'ninit' (int): Number of initial-condition variables (always 0 for MPCA).
'A' (int): Number of principal components fitted.
- Return type:
dict
- pyphi.batch.monitor(mmvm_obj, bdata, *, which_batch=False, zinit=False, build_ci=True, shush=False, soft_sensor=False)
Mimic real-time batch monitoring and produce dynamic diagnostic plots.
Two-stage workflow:
Stage 1: Build confidence intervals (call once after fitting):
monitor(mmvm_obj, training_data)
Simulates monitoring for every batch in bdata, computes per-sample confidence intervals for scores, HT², global SPE, and instantaneous SPE, and writes them back into mmvm_obj in place.
Stage 2: Monitor a new batch (call after Stage 1):
diags = monitor(mmvm_obj, bdata, which_batch='Batch01')
Simulates real-time monitoring for the specified batch, plots dynamic score, HT², SPE, and (for PLS models) Y-forecast trajectories with the Stage 1 confidence interval overlays.
- Parameters:
mmvm_obj (dict) – Multi-way PCA or PLS model from mpca() or mpls(). Confidence interval keys are added in Stage 1.
bdata (pd.DataFrame) – Batch data (aligned, same structure as training data). Used to look up batch trajectories.
which_batch (str, list[str], or bool) – Batch ID(s) to monitor. If False (default), Stage 1 is performed (CI building).
zinit (pd.DataFrame or bool) – Initial-condition data for the batch(es) being monitored. Required if the model was fitted with zinit. Default False.
build_ci (bool) – If True (default) and which_batch=False, builds and stores confidence intervals in mmvm_obj.
shush (bool) – If True, suppresses progress messages. Default False.
soft_sensor (str, list[str], or bool) – Variable name(s) to treat as soft-sensor targets; their measurements are set to NaN before prediction so that only model-based estimates are produced. Default False.
- Returns:
In Stage 2, returns a diags dictionary (or a list of dicts if multiple batches are requested) with keys:
'Batch' (str): Batch ID.
't_mon' (ndarray): Score trajectories (nsamples × A).
'HT2_mon' (ndarray): Hotelling's T² trajectory.
'spe_mon' (ndarray): Global SPE trajectory.
'spei_mon' (ndarray): Instantaneous SPE trajectory.
'cont_spe' (list[pd.DataFrame]): SPE contributions per sample.
'cont_spei' (pd.DataFrame): Instantaneous SPE contributions.
'cont_ht2' (list[pd.DataFrame]): HT² contributions per sample.
'forecast' (list[pd.DataFrame]): X-space forecast per sample.
'forecast y' (pd.DataFrame): Y forecast trajectory (PLS models only).
'spe z', 'cont_spe_z', 'cont_ht2_z', 'reconstructed z': Initial-condition diagnostics (if zinit was provided).
Returns 'error batch not found' if the requested batch is not in bdata. In Stage 1, returns None (results written to mmvm_obj).
- Return type:
dict or str
- pyphi.batch.mpls(xbatch, y, a, *, zinit=False, phase_samples=False, mb_each_var=False, cross_val=0, cross_val_X=False)
Fit a Multi-way PLS (MPLS) model to aligned batch data.
Unfolds the batch data batch-wise, optionally prepends initial-condition variables, and fits a PLS (or Multi-Block PLS) model to predict y. Low-variance columns are removed and their positions are restored for consistent interpretation.
- Parameters:
xbatch (pd.DataFrame) – Aligned batch data, all batches having the same number of samples. First column is batch ID; second column may optionally be a phase label; remaining columns are process variables. Batches are stacked vertically.
y (pd.DataFrame or np.ndarray) – Response matrix, one row per batch. If a DataFrame, the first column is the batch ID.
a (int) – Number of latent variables.
zinit (pd.DataFrame or bool) – Initial-condition variables, one row per batch. First column must be batch ID. If False (default), no initial conditions are used.
phase_samples (dict or bool) – Phase structure stored in the model for use in plotting functions. Default False.
mb_each_var (bool) – If True, treats each process variable as a separate block in a Multi-Block PLS model. If False (default), trajectories form a single block (plus an initial-conditions block if zinit is provided).
cross_val (int) – Cross-validation level (0 = none, 100 = LOO). Default 0.
cross_val_X (bool) – If True, also cross-validates the X-space. Default False.
- Returns:
Multi-way PLS model object extending the standard PLS dict from pyphi.calc.pls() (or pyphi.calc.mbpls()) with additional batch-specific keys:
'Yhat' (np.ndarray): In-sample Y predictions.
'varidX' (list[str]): Variable column names in unfolded order.
'bid' (list[str]): Block ID for each column.
'uf' (str): 'batch wise'.
'nvars' (int): Number of process variables per sample.
'nbatches' (int): Number of batches in the training set.
'nsamples' (int): Number of samples per batch.
'A' (int): Number of latent variables fitted.
'phase_samples': Phase structure passed in.
'mb_each_var' (bool): Whether MB-PLS was used.
'ninit' (int): Number of initial-condition variables (0 if zinit was not provided).
- Return type:
dict
- pyphi.batch.find(a, func)
Return indices of elements in a list that satisfy a predicate function.
- Parameters:
a (list) – Input list to search.
func (callable) – A function that takes a single element and returns True if the element should be included. Example: lambda x: x == 0 finds all zero-valued elements.
- Returns:
Indices of elements in a for which func returns True.
- Return type:
list[int]
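The documented behaviour amounts to an index-returning filter. The sketch below is an equivalent pure-Python version written for illustration; the name `find_indices` and the sample list are hypothetical, and the library's own implementation may differ.

```python
def find_indices(a, func):
    """Return the positions in `a` whose elements satisfy `func`."""
    return [i for i, item in enumerate(a) if func(item)]


counts = [3, 0, 7, 0, 1]
# The predicate from the example above: locate all zero-valued elements.
zero_positions = find_indices(counts, lambda x: x == 0)
print(zero_positions)
```

Indices are zero-based, as is usual for Python lists.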
- pyphi.batch.clean_empty_rows(X, *, shush=False)
Remove rows that are entirely NaN from a batch DataFrame.
- Parameters:
X (pd.DataFrame) – Batch data. First column is batch ID; second column may optionally be a phase label ('Phase', 'phase', or 'PHASE'); remaining columns are process variables.
shush (bool) – If True, suppresses printed output listing removed rows. Default False.
- Returns:
Batch data with fully empty rows removed. Returns the original DataFrame unchanged if no empty rows are found.
- Return type:
pd.DataFrame
- pyphi.batch.phase_sampling_dist(bdata, time_column=False, addtitle=False, use_phases=False)
Plot and return the distribution of samples (or time) consumed per phase.
Produces a histogram panel — one subplot per phase plus one for the total — showing how many samples (or how much time) each batch spends in each phase. Useful for diagnosing alignment issues and batch variability before fitting a model.
- Parameters:
bdata (pd.DataFrame) – Batch data. First column is batch ID; second column must be a phase label ('Phase', 'phase', or 'PHASE'); remaining columns are process variables. Batches are stacked vertically.
time_column (str or bool) – If a column name is given, the x-axis represents elapsed time in that column rather than sample count. Default False.
addtitle (str or bool) – Optional string used as the overall figure super-title. Default False.
use_phases (list[str] or bool) – Subset of phases to include. If False (default), all phases present in the data are used.
- Returns:
Nested dictionary {phase: {batch_id: value}} where value is the sample count or elapsed time for that batch in that phase.
- Return type:
dict
- pyphi.batch.predict(xbatch, mmvm_obj, *, zinit=False)
Generate predictions for all batches in a dataset using a fitted MPCA or MPLS model.
Unfolds xbatch batch-wise, projects through the model, and returns reconstructed X (and predicted Y for PLS models) refolded back to the original batch DataFrame structure.
- Parameters:
xbatch (pd.DataFrame) – Aligned batch data, same variable structure and number of samples per batch as the training data. First column is batch ID; second column may optionally be a phase label.
mmvm_obj (dict) – Multi-way PCA or PLS model from mpca() or mpls().
zinit (pd.DataFrame or bool) – Initial-condition data, one row per batch. First column must be batch ID. Required if the model was fitted with initial conditions. Default False.
- Returns:
Prediction results with keys:
'Tnew' (ndarray): Batch scores (n_batches × A).
'Xhat' (pd.DataFrame): Reconstructed X in original batch-stacked format (same structure as xbatch).
'speX' (ndarray): X-space SPE per batch.
'T2' (ndarray): Hotelling's T² per batch.
'Yhat' (pd.DataFrame): Predicted Y (PLS models only), one row per batch with batch IDs as the first column.
'Zhat' (pd.DataFrame): Reconstructed initial conditions (PLS models only, when zinit is provided).
- Return type:
dict
- pyphi.batch.contributions(mmvmobj, X, cont_type, *, to_obs=False, from_obs=False, lv_space=False, phase_samples=False, dyn_conts=False, which_var=False, plot_title='')
Plot variable contributions to scores, HT², or SPE for a batch model.
Computes and visualises how much each process variable at each time point contributes to the specified monitoring statistic for the given batch(es). Both a summary bar chart (absolute contributions summed over time) and an optional dynamic time-series plot are produced.
- Parameters:
mmvmobj (dict) – Multi-way PCA or PLS model from mpca() or mpls().
X (pd.DataFrame) – Batch data, same structure as the training data (aligned, batch-wise stacked).
cont_type (str) – Type of contribution to compute. Options: 'scores', 'ht2', 'spe'.
to_obs (list[str] or bool) – Batch ID(s) to diagnose. This argument is required. Default False.
from_obs (list[str] or bool) – Reference batch ID(s) for difference-based contributions ('scores' and 'ht2' only). If False (default), the model origin is used as the reference. Ignored for 'spe'.
lv_space (int, list[int], or bool) – Component index/indices to compute contributions for ('scores' only). If False (default), contributions are summed across all components.
phase_samples (dict or bool) – Phase structure for annotating phase boundaries in dynamic contribution plots. Default False.
dyn_conts (bool) – If True, also produces dynamic time-series contribution plots (one per variable) in addition to the summary bar chart. Default False.
which_var (str, list[str], or bool) – Variables to include in the dynamic contribution plots (only used when dyn_conts=True). If False (default), all variables are shown.
plot_title (str) – Title applied to all figures. Default ''.
- Returns:
Displays Matplotlib figures (bar chart always; time-series plots if dyn_conts=True).
- Return type:
None
- pyphi.batch.build_rel_time(bdata, *, time_unit='min')
Convert a 'Timestamp' column to relative elapsed time from batch start.
For each batch, computes elapsed time since the first timestamp and adds it as a new 'Time (<unit>)' column, replacing the original 'Timestamp' column.
- Parameters:
bdata (pd.DataFrame) – Batch data containing a 'Timestamp' column with datetime-compatible values. First column is batch ID.
time_unit (str) – Unit for the output time column. 'min' (default) produces minutes; 'hr' produces hours; 's' keeps seconds.
- Returns:
Batch data with 'Timestamp' replaced by 'Time (<time_unit>)', where values are elapsed time from the start of each batch.
- Return type:
pd.DataFrame
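The per-batch elapsed-time calculation can be sketched with a pandas group-wise transform. This is an illustration under stated assumptions, not pyphi's code: the column name 'BatchID', the sample data, and the position of the new column are hypothetical, and rows are assumed to be in chronological order within each batch so that the earliest timestamp is also the first one.

```python
import pandas as pd

bdata = pd.DataFrame({
    "BatchID": ["B1", "B1", "B1", "B2", "B2"],
    "Timestamp": pd.to_datetime([
        "2023-04-10 08:00:00", "2023-04-10 08:05:00", "2023-04-10 08:15:00",
        "2023-04-11 14:00:00", "2023-04-11 14:30:00",
    ]),
    "Temp": [20.0, 25.0, 31.0, 19.0, 28.0],
})

# Start time of each batch, broadcast back onto every row of that batch.
start = bdata.groupby("BatchID")["Timestamp"].transform("min")
# Elapsed time in minutes, matching the default time_unit='min'.
elapsed_min = (bdata["Timestamp"] - start).dt.total_seconds() / 60.0

# Swap the absolute timestamp for the relative-time column.
rel = bdata.drop(columns="Timestamp")
rel.insert(1, "Time (min)", elapsed_min)
print(rel)
```

A relative-time column of this kind is what the timecolumn argument of plot_var_all_batches() expects when batches should be compared on a common time axis.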
- pyphi.batch.descriptors(bdata, which_var, desc, *, phase=False)
Compute summary descriptors for batch trajectories, optionally per phase.
Calculates one or more statistical descriptors for each variable in each batch, returning a single row per batch suitable for use as input to a PLS or PCA model.
- Parameters:
bdata (pd.DataFrame) – Batch data. First column is batch ID; second column may optionally be a phase label ('Phase', 'phase', or 'PHASE'); remaining columns are process variables.
which_var (list[str]) – Variable names to compute descriptors for.
desc (list[str]) – Descriptor types to calculate. Supported values:
'min': Minimum value.
'max': Maximum value.
'mean': Arithmetic mean.
'median': Median value.
'std': Standard deviation (ddof=1).
'var': Variance (ddof=1).
'range': Max minus min.
'ave_slope': Average linear slope (estimated via least squares).
phase (list[str] or bool) – If a list of phase names is provided, descriptors are computed separately within each phase, and column names are suffixed with '_<phase>_<descriptor>'. If False (default), descriptors are computed over the full batch trajectory.
- Returns:
One row per batch, first column is batch ID, remaining columns are descriptor values named '<variable>_<phase>_<descriptor>' (with phase) or '<variable>_<descriptor>' (without phase).
- Return type:
pd.DataFrame
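The one-row-per-batch summary described above resembles a pandas group-by aggregation. The sketch below computes a few of the listed descriptors for one variable over the full trajectory (no phases), using the '<variable>_<descriptor>' naming pattern. It is an illustration, not the library's implementation; the column name 'BatchID' and the data are hypothetical.

```python
import pandas as pd

bdata = pd.DataFrame({
    "BatchID": ["B1", "B1", "B1", "B2", "B2", "B2"],
    "Temp": [20.0, 30.0, 40.0, 10.0, 10.0, 40.0],
})

# sort=False keeps batches in their order of first appearance.
grouped = bdata.groupby("BatchID", sort=False)["Temp"]

desc = pd.DataFrame({
    "Temp_min": grouped.min(),
    "Temp_max": grouped.max(),
    "Temp_mean": grouped.mean(),
    "Temp_range": grouped.max() - grouped.min(),
}).reset_index()  # move the batch ID from the index to the first column
print(desc)
```

A table shaped like this has one observation per batch, so it can be passed to an ordinary PCA or PLS model without unfolding.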