3.4. Experiment
Experiment.py contains the pipeline’s implementation of the Experiment object which in turn inherits from the Experiment object in the metdatamodel.
The Experiment is largely a computational aid and orchestrates an analysis. This includes keeping track of intermediate results and acquisitions.
- class pcpfm.Experiment.Experiment(experiment_name, experiment_directory, acquisitions=None, qcqa_results=None, feature_tables=None, empCpds=None, log_transformed_feature_tables=None, ionization_mode=None, cosmetics=None, used_cosmetics=None, MS2_methods=None, MS1_only_methods=None, command_history=None, study=None, sequence=None, final_empCpds=None, species=None, tissue=None, provenance=None, converted_subdirectory=None, raw_subdirectory=None, acquisition_data=None, annotation_subdirectory=None, filtered_feature_tables_subdirectory=None, ms2_directory=None, qaqc_figs=None, asari_subdirectory=None, output_subdirectory=None)[source]
Bases: Experiment
The experiment object represents a set of acquisitions.
This deliberately generic constructor was useful during testing; it now explicitly defines all of the fields.
- add_acquisition(acquisition, mode='link')[source]
This method adds an acquisition to the experiment’s list of acquisitions, ensures there are no duplicates, and then links or copies the acquisition (currently only as a .raw file) into the experiment directory.
Args:
acquisition (object): an Acquisition object
mode (str): how to move acquisitions into the experiment; default “link”, can be “copy”
method_field (str): the field to check for the method name, used to short-circuit MS2 determination
- asari(asari_cmd, force=False)[source]
This command will run asari on the mzML acquisitions in an experiment. The details of the command to be run are defined by asari_cmd.
Args:
asari_cmd (str or list): can be a space-delimited string or a list; must contain the fields $IONIZATION_MODE, $CONVERTED_SUBDIR, and $ASARI_SUBDIR, which will be populated by this function.
$CONVERTED_SUBDIR is where the input mzML is located
$ASARI_SUBDIR is where the results will be output
$IONIZATION_MODE is the ionization mode of the experiment
force (bool): if true, rerun asari even if it was previously run
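The placeholder substitution described for asari_cmd can be sketched as follows. This is a minimal illustration, not the pipeline's own code: `fill_asari_command` is a hypothetical helper, and the asari flags in the example template are an assumption used only to make the template concrete.

```python
import shlex
from string import Template


def fill_asari_command(asari_cmd, ionization_mode, converted_subdir, asari_subdir):
    """Hypothetical helper: fill the three documented placeholders."""
    # Accept either a space-delimited string or a list of tokens.
    tokens = shlex.split(asari_cmd) if isinstance(asari_cmd, str) else list(asari_cmd)
    values = {
        "IONIZATION_MODE": ionization_mode,
        "CONVERTED_SUBDIR": converted_subdir,
        "ASARI_SUBDIR": asari_subdir,
    }
    # safe_substitute leaves any unknown $NAME untouched instead of raising.
    return [Template(token).safe_substitute(values) for token in tokens]


# Illustrative template only; the flags are assumed, not taken from this page.
template = "asari process -m $IONIZATION_MODE -i $CONVERTED_SUBDIR -o $ASARI_SUBDIR"
command = fill_asari_command(template, "pos", "converted_acquisitions/", "asari/")
print(command)
```

Working on a token list rather than a single string keeps paths with unusual characters intact when the command is later handed to a process runner.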
- batches(batch_field)[source]
This will group samples into ‘batches’, based on the user provided ‘batch_field’.
- Parameters:
batch_field – field by which to batch samples
- Returns:
dictionary of batches to lists of samples
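As a rough sketch of the grouping that batches describes, the snippet below builds a dictionary from batch value to sample names. It assumes each sample is a plain dict with a "name" key, which is a simplification for illustration; in the pipeline, samples are acquisition objects, and `group_into_batches` is a hypothetical name.

```python
from collections import defaultdict


def group_into_batches(samples, batch_field):
    """Hypothetical stand-in: map each value of batch_field to its samples."""
    batches = defaultdict(list)
    for sample in samples:
        # Samples sharing the same value in batch_field land in one batch.
        batches[sample[batch_field]].append(sample["name"])
    return dict(batches)


# Made-up sample metadata, used only to exercise the function.
samples = [
    {"name": "s1", "batch": "A"},
    {"name": "s2", "batch": "B"},
    {"name": "s3", "batch": "A"},
]
batches = group_into_batches(samples, "batch")
print(batches)
```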
- static construct_experiment_from_CSV(experiment_directory, csv_filepath, sample_filter=None, name_field='File Name', path_field='Filepath', sample_skip_list_fp=None)[source]
For a given sequence file, create the experiment object, and add all acquisitions
- Parameters:
experiment_directory (str) – path to store experiment and intermediates
csv_filepath (str) – filepath to sequence CSV
ionization (str, optional) – default None, can be ‘pos’ or ‘neg’. The ionization mode of the experiment. If None, it will be determined automatically
sample_filter (dict, optional) – a filter dictionary, only matching entries are included
name_field (str, optional) – the column from which to extract the acquisition name
path_field (str, optional) – the column from which to extract the acquisition filepath
sample_skip_list_fp (str, optional) – path to a txt file with sample names to exclude
- Returns:
experiment object
- convert_raw_to_mzML(conversion_command, num_cores=4)[source]
Convert all raw files to mzML
Args:
conversion_command (str or list): the command to call to perform the conversion. Can be a list or a space-delimited string. Must contain $RAW_PATH and $OUT_PATH, where the input and output file names will go.
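To illustrate how $RAW_PATH and $OUT_PATH might be filled in per file, here is a small sketch that only builds the command lists and does not run them. The helper name, the `my_converter` executable, and the .mzML output naming are all assumptions for the example, not the pipeline's behavior.

```python
import os


def build_conversion_commands(conversion_command, raw_paths, out_dir):
    """Hypothetical helper: one filled-in command per raw file."""
    # Accept either a space-delimited string or a list of tokens.
    if isinstance(conversion_command, str):
        tokens = conversion_command.split()
    else:
        tokens = list(conversion_command)
    commands = []
    for raw_path in raw_paths:
        # Assumed naming: same file stem, .mzML extension, placed in out_dir.
        stem = os.path.splitext(os.path.basename(raw_path))[0]
        out_path = os.path.join(out_dir, stem + ".mzML")
        mapping = {"$RAW_PATH": raw_path, "$OUT_PATH": out_path}
        commands.append([mapping.get(token, token) for token in tokens])
    return commands


# "my_converter" is a placeholder executable name, not a real tool.
commands = build_conversion_commands(
    "my_converter -i $RAW_PATH -o $OUT_PATH",
    ["raw_acquisitions/a.raw", "raw_acquisitions/b.raw"],
    "converted_acquisitions",
)
for cmd in commands:
    print(cmd)
```

Because each file yields an independent command, a list like this could be distributed over several worker processes, which is presumably what num_cores controls.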
- static create_experiment(experiment_name, experiment_directory, sequence=None)[source]
This is the main constructor for an experiment object.
This requires a name for the experiment, the directory to which to write intermediates and optionally the ionization mode.
- Parameters:
experiment_name (str) – a moniker for the experiment
experiment_directory (str) – the directory to which to write the experiment and its intermediates
ionization_mode (str) – the ionization mode of the acquisitions can be ‘pos’, ‘neg’, or None for auto-detection.
- Returns:
experiment object
- create_sample_annotation_table()[source]
Create the sample annotation table which maps samples to their metadata.
- Returns:
the table as a dataframe
- Return type:
dataframe
- delete_empCpds(moniker)[source]
This method will safely delete an empcpd and unregister it with the experiment.
- Parameters:
moniker (str) – the empcpd moniker to delete
- delete_feature_table(moniker)[source]
This method will safely delete a feature table and unregister it with the experiment.
- Parameters:
moniker (str) – the table moniker to delete
- filter_samples(sample_filter, return_field=None)[source]
Find the set of acquisitions that pass the provided filter and return either the acquisition objects or the specified field of each passing sample.
Args:
sample_filter (dict): a filter dictionary
return_field (str, optional): if provided and valid, the specified field is returned for each matching acquisition.
- Returns:
list of matching acquisitions or the return_field value of the acquisitions
- generate_cosmetic_map(field=None, provided_cos_type='color', seed=None)[source]
This generates the mapping of acquisition properties to colors, markers, text, etc. used in figure generation. This allows for consistency across runs as the mapping is stored in the object.
Args:
field (str, optional): field for which to generate the mapping
provided_cos_type (str, optional): ‘color’, ‘marker’, or ‘text’
seed (int, optional): used for shuffling; by setting this value, the exact same mapping can be generated each time.
- Returns:
the mapping of fields to cosmetic types.
- Return type:
dict
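The role of seed can be shown with a short sketch of a seeded shuffle. This is an assumption about the general technique, not the method's actual implementation; `make_cosmetic_map` and the palette values are hypothetical.

```python
import random


def make_cosmetic_map(values, palette, seed=None):
    """Hypothetical sketch: assign each distinct value a cosmetic."""
    rng = random.Random(seed)  # private generator, global state untouched
    shuffled = list(palette)
    rng.shuffle(shuffled)
    # Sorting the distinct values makes the assignment order deterministic.
    unique_values = sorted(set(values))
    # Cycle through the palette if there are more values than cosmetics.
    return {
        value: shuffled[i % len(shuffled)]
        for i, value in enumerate(unique_values)
    }


# Made-up sample types and colors, used only for illustration.
sample_types = ["QC", "blank", "study", "QC", "study"]
colors = ["red", "green", "blue", "orange"]
first = make_cosmetic_map(sample_types, colors, seed=7)
second = make_cosmetic_map(sample_types, colors, seed=7)
print(first == second)
```

With a fixed seed, repeated calls give the same mapping, so figures generated in separate runs use the same colors for the same groups.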
- generate_output(empCpd_moniker, table_moniker)[source]
This generates and stores the feature table, sample annotation table, and feature annotation table in the output directory. It also copies the JSON for the desired empcpd and experiment to the directory.
- Parameters:
empCpd_moniker (str) – moniker of empcpd to use
table_moniker (str) – moniker of table to use
- property ionization_mode
This returns the user-specified or determined ionization mode of the experiment’s acquisitions. Lazily evaluated.
- Returns:
the ionization mode ‘pos’ or ‘neg’
- static load(experiment_json_filepath)[source]
Reconstitute the experiment object from a saved JSON file representing the object
- Parameters:
experiment_json_filepath (str) – path to the JSON file
- Returns:
an experiment object
- property ms2_acquisitions
This returns all acquisitions in the experiment that have MS2. Lazily evaluated.
- Returns:
list of acquisitions with MS2
- order_samples()[source]
This updates the ordered_samples param, part of metdatamodel implementation.
- retrieve_empCpds(moniker, as_object=False)[source]
For a given moniker return either the empcpd object or its path.
- Parameters:
moniker (str) – the empcpd to retrieve
as_object (bool, optional) – if true, return the object else its path. Defaults to False.
- Returns:
the empcpd object or its path
- Return type:
str or object
- retrieve_feature_table(moniker, as_object=False)[source]
For a given moniker return either the feature table object or its path.
- Parameters:
moniker (str) – the table to retrieve
as_object (bool, optional) – if true, return the object else its path. Defaults to False.
- Returns:
the feature table or its path
- Return type:
str or object
- property sample_names
This returns the names of all acquisitions in the experiment. Lazily evaluated.
- Returns:
names of all samples in the experiment
- subdirectories = {'acquisition_data': 'acquisitions/', 'annotation_subdirectory': 'annotations/', 'asari_subdirectory': 'asari/', 'converted_subdirectory': 'converted_acquisitions/', 'filtered_feature_tables_subdirectory': 'filtered_feature_tables/', 'ms2_directory': 'ms2_acquisitions/', 'output_subdirectory': 'output/', 'qaqc_figs': 'QAQC_figs/', 'raw_subdirectory': 'raw_acquisitions/'}
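The subdirectories mapping gives relative paths; one plausible way to resolve them against an experiment directory is sketched below. The dictionary is copied from the attribute above, while `resolve_subdirectories` and the example directory are hypothetical, and the assumption is that every entry sits directly under the experiment directory.

```python
import os

# Copied from the class attribute documented above.
subdirectories = {
    "acquisition_data": "acquisitions/",
    "annotation_subdirectory": "annotations/",
    "asari_subdirectory": "asari/",
    "converted_subdirectory": "converted_acquisitions/",
    "filtered_feature_tables_subdirectory": "filtered_feature_tables/",
    "ms2_directory": "ms2_acquisitions/",
    "output_subdirectory": "output/",
    "qaqc_figs": "QAQC_figs/",
    "raw_subdirectory": "raw_acquisitions/",
}


def resolve_subdirectories(experiment_directory):
    """Hypothetical helper: join each relative entry onto the experiment directory."""
    return {
        key: os.path.join(experiment_directory, relative)
        for key, relative in subdirectories.items()
    }


# The experiment directory here is a made-up example path.
resolved = resolve_subdirectories("my_experiment")
print(resolved["asari_subdirectory"])
```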