Statistics

Functions for statistical analysis of biomechanical time series data.

biomechzoo.statistics.eventval.eventval(fld, dim1=None, dim2=None, ch=None, localevts=None, globalevts=None, anthroevts=None)[source]

Extract event values from .zoo files and compile into a pandas DataFrame.

Parameters:
  • fld (str) – Path to the root data folder containing .zoo files.

  • dim1 (list of str, optional) – List of conditions (subfolder names under fld).

  • dim2 (list of str, optional) – List of participant identifiers.

  • ch (list of str) – List of channels to extract events from.

  • localevts (list of str, optional) – List of local events.

  • globalevts (list of str, optional) – List of global events.

  • anthroevts (list of str, optional) – List of events stored in the metadata, usually anthropometric data.

Returns:

pd.DataFrame – Columns: ['condition', 'subject', 'file', 'event_name', 'event_value']
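The long-format table described above is usually reshaped before statistical testing. The following is a minimal sketch that does not call eventval itself: it builds a small stand-in DataFrame with the documented columns (the conditions, file names, event names, and values are all invented for illustration) and then pivots it so that each event becomes its own column.

```python
import pandas as pd

# Stand-in for the DataFrame that eventval is documented to return.
# Every value below is made up for illustration.
events = pd.DataFrame(
    {
        "condition": ["fast", "fast", "slow", "slow"],
        "subject": ["s01", "s01", "s01", "s01"],
        "file": ["t1.zoo", "t1.zoo", "t2.zoo", "t2.zoo"],
        "event_name": ["max_knee", "min_knee", "max_knee", "min_knee"],
        "event_value": [62.0, 3.5, 55.0, 4.0],
    }
)

# One row per file, one column per event.
wide_events = events.pivot_table(
    index=["condition", "subject", "file"],
    columns="event_name",
    values="event_value",
).reset_index()

# Mean event value per condition and event.
condition_means = events.groupby(["condition", "event_name"])["event_value"].mean()
```

Keeping the table in long format until this point makes it straightforward to group by any combination of condition, subject, and event.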

biomechzoo.statistics.lineval.lineval(root_folder, channel_name, output_format='array', subject_level=0, condition_level=1)[source]

Extract time-normalized line arrays from .zoo files.

This function recursively searches root_folder for .zoo files and extracts the line field from the specified channel. Folder levels are used to assign subject and condition labels.

Data must already be time-normalized. The function raises a ValueError if signal lengths differ between files.

Parameters:
  • root_folder (str) – Root directory containing data.

  • channel_name (str) – Name of the channel to extract.

  • output_format (Literal['array', 'wide']) – Output format.

      - 'array': one column containing the full array (default)

      - 'wide': one column per timepoint (p0, p1, …)

  • subject_level (int) – Folder index used to define subject label (0 = first folder below root).

  • condition_level (int) – Folder index used to define condition label (0 = first folder below root).

Raises:
  • KeyError – If the specified channel or line field is missing.

  • ValueError – If signals are not equal length (not normalized).

  • ValueError – If output_format is not 'array' or 'wide'.

  • IndexError – If folder depth is insufficient for specified levels.

Returns:

DataFrame containing extracted line data with subject, condition, and trial references.

Return type:

pandas.DataFrame
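The folder-level labelling can be illustrated with a short sketch. This is not the library's implementation; labels_from_path is a hypothetical helper, and the data/s01/fast/trial01.zoo layout is an assumed example. It shows how subject_level and condition_level index into the folders between the root and the file, and why an IndexError arises when the folder depth is too shallow.

```python
from pathlib import Path


def labels_from_path(root_folder, file_path, subject_level=0, condition_level=1):
    """Hypothetical helper: derive (subject, condition) labels from folder levels."""
    # Folders between the root and the file, e.g. ('s01', 'fast').
    folders = Path(file_path).relative_to(root_folder).parent.parts
    deepest = max(subject_level, condition_level)
    if deepest >= len(folders):
        raise IndexError(
            f"Need at least {deepest + 1} folder level(s) below the root, "
            f"found {len(folders)}."
        )
    return folders[subject_level], folders[condition_level]


# Assumed layout: <root>/<subject>/<condition>/<trial>.zoo
subject, condition = labels_from_path("data", "data/s01/fast/trial01.zoo")
```

With a layout of root/condition/subject instead, the same labels would be obtained by passing subject_level=1 and condition_level=0.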

biomechzoo.statistics.lineval_wide2arrays.lineval_wide2arrays(df, condition_col='condition', conditions=None)[source]

Convert a wide-format DataFrame into arrays grouped by condition.

Parameters:
  • df (pd.DataFrame) – DataFrame returned by lineval(output_format='wide'). Must include columns for condition and timepoints (p0…pN).

  • condition_col (str) – Column name containing condition labels. Default 'condition'.

  • conditions (list of str, optional) – List of condition labels in desired order. If None, will use all unique conditions sorted alphabetically.

Returns:

Dict[str, np.ndarray] – Keys are condition labels; values are 2D NumPy arrays (n_trials x n_timepoints).

Return type:

Dict[str, ndarray]
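The conversion described above can be sketched as follows. This is a hypothetical re-implementation for illustration, not the library's code: wide_to_arrays and the sample data are invented, and timepoint columns are assumed to be named p0, p1, …, pN as in the wide output format.

```python
import pandas as pd


def wide_to_arrays(df, condition_col="condition", conditions=None):
    """Hypothetical sketch: group wide-format rows into per-condition 2D arrays."""
    # Select columns named p<integer> and order them numerically (p2 before p10).
    point_cols = sorted(
        (c for c in df.columns if isinstance(c, str) and c[:1] == "p" and c[1:].isdigit()),
        key=lambda c: int(c[1:]),
    )
    if conditions is None:
        conditions = sorted(df[condition_col].unique())
    # Each value has shape (n_trials, n_timepoints) for that condition.
    return {
        cond: df.loc[df[condition_col] == cond, point_cols].to_numpy(dtype=float)
        for cond in conditions
    }


# Invented wide-format data with three timepoints per trial.
wide = pd.DataFrame(
    {
        "subject": ["s01", "s01", "s02"],
        "condition": ["slow", "fast", "slow"],
        "p0": [0.0, 1.0, 2.0],
        "p1": [0.5, 1.5, 2.5],
        "p2": [1.0, 2.0, 3.0],
    }
)
arrays = wide_to_arrays(wide)
```

Per-condition arrays of this shape are a common input for pointwise comparisons across the time-normalized cycle.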

biomechzoo.statistics.rmse.rmse(a, b)[source]