API

Import DrEvalPy using

import drevalpy as dep

Subpackages

DrEvalPy consists of three major subpackages:

Datasets
Models
Visualization

Other functions

Major functions for running the experiment

Main module for running the drug response prediction experiment.

drevalpy.experiment.consolidate_single_drug_model_predictions(models, n_cv_splits, results_path, cross_study_datasets, randomization_mode=None, n_trials_robustness=0, out_path='')

Consolidate single drug model predictions into a single file.

Parameters:

models (list[type[DRPModel]]) – list of model classes to compare, e.g., [SimpleNeuralNetwork, RandomForest]
n_cv_splits (int) – number of cross-validation splits, e.g., 5
results_path (str) – path to the results directory, e.g., results/
cross_study_datasets (list[str]) – list of cross-study datasets, e.g., [CCLE, GDSC1]
randomization_mode (list[str] | None) – list of randomization modes, e.g., [“SVCC”, “SVRC”]
n_trials_robustness (int) – number of robustness trials, e.g., 10
out_path (str) – for the package, this is the same as results_path. For the pipeline, this is empty because it will be stored in the work directory.

Return type:

None

drevalpy.experiment.cross_study_prediction(dataset, model, test_mode, train_dataset, path_data, early_stopping_dataset, response_transformation, path_out, split_index, single_drug_id=None)

Run the drug response prediction experiment on a cross-study dataset to assess the generalizability of the model.

Parameters:

dataset (DrugResponseDataset) – cross-study dataset, e.g., GDSC1 if trained on GDSC2
model (DRPModel) – model to use, e.g., SimpleNeuralNetwork
test_mode (str) – test mode, one of “LPO”, “LCO”, “LDO” (leave-pair-out, leave-cell-line-out, leave-drug-out)
train_dataset (DrugResponseDataset) – training dataset, e.g., GDSC2
path_data (str) – path to the data directory, e.g., data/
early_stopping_dataset (DrugResponseDataset | None) – early stopping dataset
response_transformation (TransformerMixin | None) – normalizer to use for the response data, e.g., StandardScaler
path_out (str) – path to the output directory, e.g., results/
split_index (int) – index of the split
single_drug_id (str | None) – drug id to use for single drug models; None for global models

Raises:

ValueError – if feature loading fails, if the test mode is invalid, or if LTO and no tissues are supplied.

Return type:

None

drevalpy.experiment.drug_response_experiment(models, response_data, baselines=None, response_transformation=None, run_id='', test_mode='LPO', hpam_optimization_metric='RMSE', n_cv_splits=5, multiprocessing=False, randomization_mode=None, randomization_type='permutation', cross_study_datasets=None, n_trials_robustness=0, path_out='results/', overwrite=False, path_data='data', model_checkpoint_dir='TEMPORARY', hyperparameter_tuning=True, final_model_on_full_data=False, wandb_project=None, custom_splitter=None, custom_split_name=None)

Run the drug response prediction experiment and save the results to disk.

Parameters:

models (list[type[DRPModel]]) – list of model classes to compare
baselines (list[type[DRPModel]] | None) – list of baseline models. No randomization or robustness tests are run for the baseline models.
response_data (DrugResponseDataset) – drug response dataset
response_transformation (TransformerMixin | None) – normalizer to use for the response data
hpam_optimization_metric (str) – metric to use for hyperparameter optimization (i.e., for selecting the best model on the validation set)
n_cv_splits (int) – number of cross-validation splits
multiprocessing (bool) – whether to use multiprocessing. This requires Ray to be installed.
randomization_mode (list[str] | None) –
list of randomization modes to run. Modes: SVCC, SVRC, SVCD, SVRD. Can be a list of several randomization tests, e.g., [“SVCC”, “SVCD”]. Default is None, which means no randomization tests are run.
- SVCC: Single View Constant for Cell Lines: in this mode, one experiment is done for every cell line view
  the model uses (e.g. gene expression, mutation, …). For each experiment one cell line view is held constant while the others are randomized.
- SVRC: Single View Random for Cell Lines: in this mode, one experiment is done for every cell line view the
  model uses (e.g. gene expression, mutation, …). For each experiment one cell line view is randomized while the others are held constant.
- SVCD: Single View Constant for Drugs: in this mode, one experiment is done for every drug view the model
  uses (e.g. fingerprints, target_information, …). For each experiment one drug view is held constant while the others are randomized.
- SVRD: Single View Random for Drugs: in this mode, one experiment is done for every drug view the model uses
  (e.g. fingerprints, target_information, …). For each experiment one drug view is randomized while the others are held constant.
randomization_type (str) –
type of randomization to use. Choose from “permutation” and “invariant”. Default is “permutation”.
- ”permutation”: permute the features over the instances, keeping the distribution of the features the same
  but dissolving the relationship to the target
- ”invariant”: the features are permuted in a way that a key characteristic of the feature is kept. In case of
  matrices, this is the mean and standard deviation of the feature view for this instance, for networks it is the degree distribution.
cross_study_datasets (list[DrugResponseDataset] | None) – list of datasets for the cross-study prediction. The trained model is assessed for its generalization to these datasets. Default is None, which means no cross-study prediction is run.
n_trials_robustness (int) – number of trials to run for the robustness test. The robustness test is a test where models are retrained multiple times with varying seeds. Default is 0, which means no robustness test is run.
path_out (str) – path to the output directory
run_id (str) – identifier to save the results
test_mode (str) – test mode, one of “LPO”, “LCO”, “LTO”, “LDO” (leave-pair-out, leave-cell-line-out, leave-tissue-out, leave-drug-out)
overwrite (bool) – whether to overwrite existing results
path_data (str) – path to the data directory, usually data/
model_checkpoint_dir (str) – directory to save model checkpoints. If “TEMPORARY”, a temporary directory is created.
hyperparameter_tuning (bool) – whether to run hyperparameter tuning. If False (e.g., for debugging), only the first hyperparameter set is used.
final_model_on_full_data (bool) – if True, a final/production model is saved in the results directory. If hyperparameter_tuning is True, the final model is produced according to the hyperparameter tuning procedure that was evaluated in the nested cross-validation.
wandb_project (str | None) – if provided, enables wandb logging for all DRPModel instances throughout training. All hyperparameters and metrics will be logged to the specified wandb project.
custom_splitter (Callable[[DrugResponseDataset, SplitParams], list[dict[str, Any]]] | str | Path | None) – optional path to a Python script or callable implementing create_splits. When provided, built-in split_dataset is skipped and test_mode selects validation checks.
custom_split_name (str | None) – optional result-directory label when using a custom splitter. Defaults to test_mode when omitted.

Raises:

ValueError – if no cv splits are found

Return type:

None

drevalpy.experiment.generate_data_saving_path(model_name, drug_id, result_path, suffix)

Generate a path to save data to.

For single drug models, the path is result_path/model_name/drugs/drug_id/suffix. For all others, it is result_path/model_name/suffix.

Parameters:

model_name – model name
drug_id – drug id
result_path – path to the results directory
suffix – suffix to add to the path, e.g., “predictions”, “best_hpams”, “randomization”, “robustness”

Return type:

str

Returns:

path to save data to
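
The path rule described above can be restated as a small standalone sketch. It does not call DrEvalPy: sketch_saving_path is a hypothetical name, and the assumption that drug_id is None for non-single-drug models is an inference from the text. Whether the real function also creates the directory is not stated here.

```python
import os


def sketch_saving_path(model_name, drug_id, result_path, suffix):
    """Hypothetical restatement of the documented path rule (not the library code)."""
    if drug_id is not None:
        # Single drug model: result_path/model_name/drugs/drug_id/suffix
        return os.path.join(result_path, model_name, "drugs", drug_id, suffix)
    # All other models: result_path/model_name/suffix
    return os.path.join(result_path, model_name, suffix)


single = sketch_saving_path("MOLIR", "Afatinib", "results", "predictions")
multi = sketch_saving_path("SimpleNeuralNetwork", None, "results", "predictions")
```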

drevalpy.experiment.get_datasets_from_cv_split(split, model_class, model_name, drug_id=None)

Get train, validation, (early stopping), and test datasets from the CV split.

Returns copies of the datasets to prevent in-place modifications (e.g., add_rows, reduce_to) from affecting the original split data used by subsequent models.

Parameters:

split (dict[str, DrugResponseDataset]) – dictionary of the CV split
model_class (type[DRPModel]) – model class
model_name (str) – model name
drug_id (str | None) – drug id for single drug models

Return type:

tuple[DrugResponseDataset, DrugResponseDataset, DrugResponseDataset | None, DrugResponseDataset]

Returns:

tuple of train, validation, (early stopping), and test datasets (as copies)

drevalpy.experiment.get_model_name_and_drug_id(model_name)

Get the model name and drug id from the model name.

Parameters:

model_name (str) – model name, e.g., SimpleNeuralNetwork or MOLIR.Afatinib

Return type:

tuple[str, str | None]

Returns:

tuple of model name and, potentially, drug id if it is a single drug model

Raises:

AssertionError – if the model name is not found in the model factory
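
As an illustration of the naming scheme, the sketch below splits a name of the form Model.Drug at the first dot. This is an assumption inferred from the MOLIR.Afatinib example; sketch_split_model_name is a hypothetical helper, and the check against the model factory that the real function performs is left out.

```python
def sketch_split_model_name(model_name):
    """Hypothetical helper: split 'Model.Drug' into (model, drug); drug is None if absent."""
    # partition() splits at the first "." only, so a drug id may itself contain dots.
    base, sep, drug_id = model_name.partition(".")
    return (base, drug_id) if sep else (base, None)


pairs = [sketch_split_model_name(n) for n in ("SimpleNeuralNetwork", "MOLIR.Afatinib")]
```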

drevalpy.experiment.get_randomization_test_views(model, randomization_mode)

Get the views to use for the randomization tests.

For SVCC, a single cell line view (e.g., gene expression) is held constant while the others are randomized.
For SVCD, a single drug view (e.g., fingerprints) is held constant while the others are randomized.
For SVRC, a single cell line view is randomized while the others are held constant.
For SVRD, a single drug view is randomized while the others are held constant.

Parameters:

model (DRPModel) – model to use, e.g., SimpleNeuralNetwork
randomization_mode (list[str]) – list of randomization modes to do, e.g., [“SVCC”, “SVRC”]

Return type:

dict[str, list[str]]

Returns:

dictionary of randomization test views
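
The four modes can be sketched as a mapping from test name to the views that get randomized. This is a hypothetical illustration, not the library code: the view lists are passed in directly instead of being read from a model, the "<MODE>_<view>" key format is an assumption based on the SVRC_gene_expression example used elsewhere in this reference, and "the others" is assumed to mean the other views of the same family (cell line or drug).

```python
def sketch_randomization_views(cell_line_views, drug_views, randomization_mode):
    """Map each test name to the list of views that would be randomized."""
    tests = {}
    for mode in randomization_mode:
        if mode not in ("SVCC", "SVRC", "SVCD", "SVRD"):
            raise ValueError(f"unknown randomization mode: {mode}")
        # The last letter selects the view family: C = cell line, D = drug.
        views = cell_line_views if mode[-1] == "C" else drug_views
        for view in views:
            if mode[2] == "C":
                # Single View Constant: keep this view, randomize the others.
                tests[f"{mode}_{view}"] = [v for v in views if v != view]
            else:
                # Single View Random: randomize only this view.
                tests[f"{mode}_{view}"] = [view]
    return tests


views = sketch_randomization_views(
    ["gene_expression", "mutation"], ["fingerprints"], ["SVCC", "SVRD"]
)
```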

drevalpy.experiment.hpam_tune(model, train_dataset, validation_dataset, hpam_set, early_stopping_dataset=None, response_transformation=None, metric='RMSE', path_data='data', model_checkpoint_dir='TEMPORARY', *, split_index=None, wandb_project=None, wandb_base_config=None)

Tune the hyperparameters for the given model in an iterative manner.

Parameters:

model (DRPModel) – model to use
train_dataset (DrugResponseDataset) – training dataset
validation_dataset (DrugResponseDataset) – validation dataset
hpam_set (list[dict]) – hyperparameters to tune
early_stopping_dataset (DrugResponseDataset | None) – early stopping dataset
response_transformation (TransformerMixin | None) – normalizer to use for the response data
metric (str) – metric to evaluate which model is the best
path_data (str) – path to the data directory, e.g., data/
model_checkpoint_dir (str) – directory to save model checkpoints
split_index (int | None) – optional CV split index, used for naming wandb runs
wandb_project (str | None) – optional wandb project name; if provided, enables per-trial wandb runs
wandb_base_config (dict[str, Any] | None) – optional base config dict to include in each wandb run

Return type:

dict

Returns:

best hyperparameters

Raises:

AssertionError – if hpam_set is empty
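
The iterative tuning idea can be sketched as a plain loop over the candidate hyperparameter sets. The code below is a minimal, self-contained illustration and not the DrEvalPy implementation: score_fn is a hypothetical callable standing in for "train on the training set, evaluate on the validation set", and the minimize flag stands in for the metric direction.

```python
def sketch_hpam_tune(hpam_set, score_fn, minimize=True):
    """Try each hyperparameter dict in turn and keep the best-scoring one."""
    assert len(hpam_set) > 0, "hpam_set must not be empty"
    best_hpams, best_score = None, None
    for hpams in hpam_set:
        score = score_fn(hpams)
        if best_score is None:
            improved = True
        elif minimize:
            improved = score < best_score
        else:
            improved = score > best_score
        if improved:
            best_hpams, best_score = hpams, score
    return best_hpams


def toy_rmse(hpams):
    """Toy stand-in for a validation RMSE that is lowest at lr = 0.01."""
    return abs(hpams["lr"] - 0.01)


grid = [{"lr": 0.1}, {"lr": 0.01}, {"lr": 0.001}]
best = sketch_hpam_tune(grid, toy_rmse)
```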

drevalpy.experiment.hpam_tune_raytune(model, train_dataset, validation_dataset, early_stopping_dataset, hpam_set, response_transformation=None, metric='RMSE', ray_path='raytune', path_data='data', model_checkpoint_dir='TEMPORARY')

Tune the hyperparameters for the given model using Ray Tune. Ray[tune] must be installed.

Parameters:

model (DRPModel) – model to use
train_dataset (DrugResponseDataset) – training dataset
validation_dataset (DrugResponseDataset) – validation dataset
early_stopping_dataset (DrugResponseDataset | None) – early stopping dataset
hpam_set (list[dict]) – hyperparameters to tune
response_transformation (TransformerMixin | None) – normalizer for response data
metric (str) – evaluation metric
ray_path (str) – path to the raytune directory
path_data (str) – path to data directory, e.g., data/
model_checkpoint_dir (str) – directory for model checkpoints

Return type:

dict

Returns:

best hyperparameters

Raises:

ValueError – if best_result is None

drevalpy.experiment.load_features(model, path_data, dataset)

Load and reduce cell line and drug features for a given dataset.

Parameters:

model (DRPModel) – model to use, e.g., SimpleNeuralNetwork
path_data (str) – path to the data directory, e.g., data/
dataset (DrugResponseDataset) – dataset to load features for, e.g., GDSC2

Return type:

tuple[FeatureDataset, FeatureDataset | None]

Returns:

tuple of cell line and, potentially, drug features

drevalpy.experiment.make_model_list(models, response_data)

Make a list of models to evaluate: if it is a single drug model, add the drug id to the model name.

Parameters:

models (list[type[DRPModel]]) – list of models to evaluate
response_data (DrugResponseDataset) – response data, needed to get the unique drugs for single drug models

Return type:

dict[str, str]

Returns:

dictionary of model names: model class, e.g., {“SimpleNeuralNetwork”: “SimpleNeuralNetwork”, “MOLIR.Afatinib”: “MOLIR”}
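
The expansion of single drug models into one entry per drug can be sketched as follows. This is a hypothetical illustration: models are represented as (name, is_single_drug) pairs instead of model classes, and the drug names are made up for the example.

```python
def sketch_model_list(models, unique_drugs):
    """Build {model name: model class name}; single drug models get one entry per drug."""
    model_list = {}
    for name, is_single_drug in models:
        if is_single_drug:
            # One entry per drug, named "<model>.<drug>".
            for drug in unique_drugs:
                model_list[f"{name}.{drug}"] = name
        else:
            model_list[name] = name
    return model_list


listing = sketch_model_list(
    [("SimpleNeuralNetwork", False), ("MOLIR", True)],
    ["Afatinib", "Erlotinib"],
)
```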

drevalpy.experiment.make_train_val_split(dataset, test_mode, val_ratio=0.1, random_state=42)

Split a dataset into train and validation sets according to the test mode and desired ratio.

Parameters:

dataset (DrugResponseDataset) – full dataset to split
test_mode (str) – one of “LPO”, “LCO”, “LDO”, “LTO”
val_ratio (float) – approximate fraction of data to use for validation
random_state (int) – random seed

Return type:

tuple[DrugResponseDataset, DrugResponseDataset]

Returns:

(train_dataset, validation_dataset)

Raises:

ValueError – if no tissue information is provided for the DrugResponseDataset
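
What "splitting according to the test mode" could look like is sketched below under stated assumptions: for LCO, LDO, and LTO whole groups (cell lines, drugs, tissues) are moved to the validation set so that train and validation share no group, while LPO splits individual pairs. The row layout (dicts with cell_line, drug, and tissue keys) and the function name are hypothetical; the real function operates on DrugResponseDataset objects and may differ in detail.

```python
import random


def sketch_train_val_split(rows, test_mode, val_ratio=0.1, random_state=42):
    """Split rows into (train, validation) so that validation groups are unseen in train."""
    rng = random.Random(random_state)
    if test_mode == "LPO":
        # Pair-level split: shuffle individual rows.
        shuffled = rows[:]
        rng.shuffle(shuffled)
        n_val = max(1, round(len(shuffled) * val_ratio))
        return shuffled[n_val:], shuffled[:n_val]
    group_key = {"LCO": "cell_line", "LDO": "drug", "LTO": "tissue"}.get(test_mode)
    if group_key is None:
        raise ValueError(f"unknown test mode: {test_mode}")
    # Group-level split: whole cell lines, drugs, or tissues go to validation.
    groups = sorted({row[group_key] for row in rows})
    rng.shuffle(groups)
    n_val = max(1, round(len(groups) * val_ratio))
    val_groups = set(groups[:n_val])
    train = [r for r in rows if r[group_key] not in val_groups]
    val = [r for r in rows if r[group_key] in val_groups]
    return train, val


rows = [
    {"cell_line": f"CL{i}", "drug": f"D{j}", "tissue": f"T{i % 3}"}
    for i in range(10)
    for j in range(3)
]
train, val = sketch_train_val_split(rows, "LCO", val_ratio=0.2)
```

Because groups are moved as a whole, the realized validation fraction only approximates val_ratio, which matches the "approximate fraction" wording of the parameter description.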

drevalpy.experiment.prepare_response_splits(response_data, *, split_path, result_path, split_label, test_mode, n_cv_splits, overwrite, result_folder_exists, custom_splitter=None, validation_ratio=0.1, random_state=42, split_early_stopping=True)

Create, load, or reuse CV splits for an experiment run.

Parameters:

response_data (DrugResponseDataset) – dataset whose splits are created or loaded
split_path (str) – directory for persisted split CSV files
result_path (str) – experiment result directory
split_label (str) – directory label under the dataset results folder
test_mode (str) – built-in split mode or validation mode for custom splits
n_cv_splits (int) – number of CV splits for built-in splitting
overwrite (bool) – whether to replace existing results and splits
result_folder_exists (bool) – whether result_path already exists
custom_splitter (Callable[[DrugResponseDataset, SplitParams], list[dict[str, Any]]] | str | Path | None) – optional script path or callable for custom splits
validation_ratio (float) – validation fraction for built-in splitting
random_state (int) – random seed for built-in splitting
split_early_stopping (bool) – whether to derive early-stopping roles

Return type:

int

Returns:

number of splits available after preparation

drevalpy.experiment.randomization_test(randomization_test_views, model, hpam_set, path_data, train_dataset, test_dataset, early_stopping_dataset, path_out, split_index, randomization_type='permutation', response_transformation=None, model_checkpoint_dir='TEMPORARY')

Run randomization tests for the given model and dataset.

Parameters:

randomization_test_views (dict[str, list[str]]) – views to use for the randomization tests. The key is the name of the randomization test and the value is a list of views to randomize, e.g., {“randomize_genomics”: [“copy_number_var”, “mutation”], “methylation_only”: [“gene_expression”, “copy_number_var”, “mutation”]}
model (DRPModel) – model to evaluate
hpam_set (dict) – hyperparameters to use
path_data (str) – path to the data directory
train_dataset (DrugResponseDataset) – training dataset
test_dataset (DrugResponseDataset) – test dataset
early_stopping_dataset (DrugResponseDataset | None) – early stopping dataset
path_out (str) – path to the output directory
split_index (int) – index of the split
randomization_type (str) – type of randomization to use. Choose from “permutation” and “invariant”. Default is “permutation”, which permutes the features over the instances, keeping the distribution of the features the same but dissolving the relationship to the target. “invariant” randomization preserves a key characteristic of the feature: for matrices, this is the mean and standard deviation of the feature view for this instance; for networks, it is the degree distribution.
response_transformation (TransformerMixin | None) – sklearn.preprocessing scaler like StandardScaler or MinMaxScaler to use to scale the target
model_checkpoint_dir (str) – directory to save model checkpoints

Return type:

None
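
The "permutation" randomization type can be pictured with the following standalone sketch: the feature vectors of one view are shuffled across instances, so every vector still occurs exactly once but is no longer attached to its original instance. The dictionary layout and the function name are hypothetical; DrEvalPy itself works on FeatureDataset objects.

```python
import random


def sketch_permute_view(features, seed=0):
    """features: dict mapping instance id -> feature vector for a single view."""
    rng = random.Random(seed)
    ids = list(features)
    vectors = [features[i] for i in ids]
    # Shuffle the vectors while keeping the ids in place.
    rng.shuffle(vectors)
    return dict(zip(ids, vectors))


gene_expression = {"CL1": [0.1, 0.2], "CL2": [1.1, 1.2], "CL3": [2.1, 2.2]}
permuted = sketch_permute_view(gene_expression)
```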

drevalpy.experiment.randomize_train_predict(view, test_name, randomization_type, randomization_test_file, model, hpam_set, path_data, train_dataset, test_dataset, early_stopping_dataset, model_checkpoint_dir='TEMPORARY', response_transformation=None)

Randomize the features for a given view and run the model.

Parameters:

view (str) – view to randomize, e.g., gene_expression
test_name (str) – name of the randomization test, e.g., SVRC_gene_expression
randomization_type (str) – type of randomization to use, e.g., permutation
randomization_test_file (str) – file to save the results to
model (DRPModel) – model to evaluate
hpam_set (dict) – hyperparameters to use
path_data (str) – path to the data directory
train_dataset (DrugResponseDataset) – training dataset
test_dataset (DrugResponseDataset) – test dataset
early_stopping_dataset (DrugResponseDataset | None) – early stopping dataset
model_checkpoint_dir (str) – directory to save model checkpoints
response_transformation (TransformerMixin | None) – sklearn.preprocessing scaler like StandardScaler or MinMaxScaler to use to scale the target

Return type:

None

drevalpy.experiment.robustness_test(n_trials, model, hpam_set, path_data, train_dataset, test_dataset, early_stopping_dataset, path_out, split_index, response_transformation=None, model_checkpoint_dir='TEMPORARY')

Run robustness tests for the given model and dataset.

This will run the model n times with different random seeds to get a distribution of the results.

Parameters:

n_trials (int) – number of trials to run
model (DRPModel) – model to evaluate
hpam_set (dict) – hyperparameters to use
path_data (str) – path to the data directory
train_dataset (DrugResponseDataset) – training dataset
test_dataset (DrugResponseDataset) – test dataset
early_stopping_dataset (DrugResponseDataset | None) – early stopping dataset
path_out (str) – path to the output directory
split_index (int) – index of the split
response_transformation (TransformerMixin | None) – sklearn.preprocessing scaler like StandardScaler or MinMaxScaler to use to scale the target
model_checkpoint_dir (str) – directory to save model checkpoints. If “TEMPORARY”, a temporary directory is used.

drevalpy.experiment.robustness_train_predict(trial, trial_file, train_dataset, test_dataset, early_stopping_dataset, model, hpam_set, path_data, response_transformation=None, model_checkpoint_dir='TEMPORARY')

Train and predict for the robustness test.

Parameters:

trial (int) – trial number
trial_file (str) – file to save the results to
train_dataset (DrugResponseDataset) – training dataset
test_dataset (DrugResponseDataset) – test dataset
early_stopping_dataset (DrugResponseDataset | None) – early stopping dataset
model (DRPModel) – model to evaluate
hpam_set (dict) – hyperparameters to use
path_data (str) – path to the data directory, e.g., data/
response_transformation (TransformerMixin | None) – sklearn.preprocessing scaler like StandardScaler or MinMaxScaler to use to scale the target
model_checkpoint_dir (str) – directory to save model checkpoints. If “TEMPORARY”, a temporary directory is created.

Return type:

None

drevalpy.experiment.seed_everything(seed=42)

Seed python random, numpy, torch (CPU + CUDA), and PYTHONHASHSEED.

Call once at the top of a run. The dataset/model code uses local np.random.default_rng instances for its own randomness, so this exists to lock down everything else (torch op order, sklearn fallbacks, library-internal RNG, hash randomization).

Parameters:

seed (int) – base seed value

Return type:

None
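
A global seeding helper of this kind might look like the sketch below. It is an assumption-laden illustration, not the DrEvalPy source: the name sketch_seed_everything is hypothetical, and numpy and torch are seeded only if they can be imported.

```python
import os
import random


def sketch_seed_everything(seed=42):
    """Seed Python's random module, numpy, and torch (when installed), and set PYTHONHASHSEED."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch

        torch.manual_seed(seed)
        # Safe to call without CUDA; it is silently ignored in that case.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


sketch_seed_everything(42)
first = random.random()
sketch_seed_everything(42)
second = random.random()  # same value as first, because the seed was reset
```

Note that setting PYTHONHASHSEED from inside a running interpreter only affects child processes started afterwards; hash randomization of the current process is fixed at startup.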

drevalpy.experiment.split_early_stopping(validation_dataset, test_mode)

Split the validation dataset into a validation and early stopping dataset.

Parameters:

validation_dataset (DrugResponseDataset) – validation dataset
test_mode (str) – test mode, one of “LPO”, “LCO”, “LDO” (leave-pair-out, leave-cell-line-out, leave-drug-out)

Return type:

tuple[DrugResponseDataset, DrugResponseDataset]

Returns:

tuple of validation and early stopping datasets

drevalpy.experiment.train_and_evaluate(model, hpams, path_data, train_dataset, validation_dataset, early_stopping_dataset=None, response_transformation=None, metric='RMSE', model_checkpoint_dir='TEMPORARY')

Train and evaluate the model, i.e., call train_and_predict() and then evaluate().

Parameters:

model (DRPModel) – model to use
hpams (dict[str, Any]) – hyperparameters to use
path_data (str) – path to the data directory
train_dataset (DrugResponseDataset) – training dataset
validation_dataset (DrugResponseDataset) – validation dataset
early_stopping_dataset (DrugResponseDataset | None) – early stopping dataset
response_transformation (TransformerMixin | None) – normalizer to use for the response data
metric (str) – metric to evaluate the model on
model_checkpoint_dir (str) – directory to save model checkpoints

Return type:

dict[str, float]

Returns:

dictionary of the evaluation results, e.g., {“RMSE”: 0.1}

drevalpy.experiment.train_and_predict(model, hpams, path_data, train_dataset, prediction_dataset, early_stopping_dataset=None, response_transformation=None, cl_features=None, drug_features=None, model_checkpoint_dir='TEMPORARY')

Train the model and predict the response for the prediction dataset.

Parameters:

model (DRPModel) – model to use, e.g., SimpleNeuralNetwork
hpams (dict) – hyperparameters to use
path_data (str) – path to the data directory, e.g., data/
train_dataset (DrugResponseDataset) – training dataset
prediction_dataset (DrugResponseDataset) – prediction dataset
early_stopping_dataset (DrugResponseDataset | None) – early stopping dataset, optional
response_transformation (TransformerMixin | None) – normalizer to use for the response data, e.g., StandardScaler
cl_features (FeatureDataset | None) – cell line features
drug_features (FeatureDataset | None) – drug features
model_checkpoint_dir (str) – directory for model checkpoints, if “TEMPORARY”, checkpoints are not saved. Default is “TEMPORARY”

Return type:

DrugResponseDataset

Returns:

prediction dataset with predictions

Raises:

ValueError – if train_dataset does not have a dataset_name

drevalpy.experiment.train_final_model(model_class, full_dataset, response_transformation, path_data, model_checkpoint_dir, metric, final_model_path, test_mode='LCO', val_ratio=0.1, hyperparameter_tuning=True)

Final Production Model Training.

Tune a final model on the full data set using a validation split that reflects intended generalization. No test set is used here. The performance during the nested CV is a pessimistic estimate of the final model performance. The validation split strategy is determined by test_mode:
- LCO: generalization to unseen cell lines (e.g., personalized medicine)
- LDO: generalization to new drugs (e.g., drug repurposing)
- LTO: generalization to new tissues
- LPO: general (pair-level) prediction

Parameters:

model_class (type[DRPModel]) – model to use
full_dataset (DrugResponseDataset) – full training dataset (union of outer folds)
response_transformation (TransformerMixin) – sklearn scaler used for response normalization
path_data (str) – path to data directory
model_checkpoint_dir (str) – checkpoint dir for intermediate tuning models
metric (str) – metric for tuning, e.g., “RMSE”
final_model_path (str) – path to final_model save directory
test_mode (str) – split logic for validation (LCO, LDO, LTO, LPO)
val_ratio (float) – validation size ratio
hyperparameter_tuning (bool) – whether to perform hyperparameter tuning

Return type:

None

Evaluation functions

Functions for evaluating model performance.

drevalpy.evaluation.evaluate(dataset, metric)

Evaluates the model on the given dataset.

Parameters:

dataset (DrugResponseDataset) – dataset to evaluate on
metric (list[str] | str) – evaluation metric(s) (one or a list of “MSE”, “RMSE”, “MAE”, “R^2”, “Pearson”, “spearman”, “kendall”)

Returns:

evaluation metric

Raises:

AssertionError – if metric is not in AVAILABLE

drevalpy.evaluation.get_mode(metric)

Get whether the optimum value of the metric is the minimum or maximum.

Parameters:

metric (str) – metric, e.g., RMSE

Returns:

whether the optimum value of the metric is the minimum or maximum

Raises:

ValueError – if the metric is not in MINIMIZATION_METRICS or MAXIMIZATION_METRICS
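
The direction lookup can be sketched as a membership test on two sets. The set contents below are an assumption based on the metric names listed for evaluate (error metrics are minimized, R^2 and correlations are maximized), and the "min"/"max" return strings are likewise hypothetical; the actual constants and return values in drevalpy.evaluation may differ.

```python
# Assumed grouping: error metrics are minimized, agreement metrics are maximized.
SKETCH_MINIMIZATION_METRICS = {"MSE", "RMSE", "MAE"}
SKETCH_MAXIMIZATION_METRICS = {"R^2", "Pearson", "spearman", "kendall"}


def sketch_get_mode(metric):
    """Return "min" or "max" depending on which direction is better for the metric."""
    if metric in SKETCH_MINIMIZATION_METRICS:
        return "min"
    if metric in SKETCH_MAXIMIZATION_METRICS:
        return "max"
    raise ValueError(f"unknown metric: {metric}")


rmse_mode = sketch_get_mode("RMSE")
pearson_mode = sketch_get_mode("Pearson")
```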

drevalpy.evaluation.kendall(y_pred, y_true)

Computes the Kendall tau correlation between predictions and response.

Parameters:

y_pred (ndarray) – predictions
y_true (ndarray) – response

Return type:

float

Returns:

Kendall tau correlation as a float

Raises:

AssertionError – if predictions and response do not have the same length

drevalpy.evaluation.pearson(y_pred, y_true)

Computes the Pearson correlation between predictions and response.

Parameters:

y_pred (ndarray) – predictions
y_true (ndarray) – response

Return type:

float

Returns:

Pearson correlation as a float

Raises:

AssertionError – if predictions and response do not have the same length
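
For reference, the Pearson correlation itself can be written in a few lines of plain Python. This is a textbook sketch with the same length check as described above, not the library implementation; in particular, how DrEvalPy treats constant inputs is not stated here, whereas this sketch raises ZeroDivisionError when either input has zero variance.

```python
import math


def sketch_pearson(y_pred, y_true):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    assert len(y_pred) == len(y_true), "predictions and response must have the same length"
    n = len(y_pred)
    mean_p = sum(y_pred) / n
    mean_t = sum(y_true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(y_pred, y_true))
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    return cov / math.sqrt(var_p * var_t)


# Perfectly linear relationship, so the correlation is 1.
r = sketch_pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```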

drevalpy.evaluation.spearman(y_pred, y_true)

Computes the Spearman correlation between predictions and response.

Parameters:

y_pred (ndarray) – predictions
y_true (ndarray) – response

Return type:

float

Returns:

Spearman correlation as a float

Raises:

AssertionError – if predictions and response do not have the same length

Utility functions

Utility functions for the evaluation pipeline.

drevalpy.utils.check_arguments(args)

Check the validity of the arguments for the evaluation pipeline.

Parameters:

args – arguments passed from the command line

Raises:

AssertionError – if any of the arguments is invalid
ValueError – if the number of cross-validation splits or curve_curator_cores is less than 1
FileNotFoundError – if a custom dataset name was specified and the input file could not be found.

Return type:

None

drevalpy.utils.get_datasets(dataset_name, cross_study_datasets, path_data='data', measure='response', curve_curator=False, cores=1, normalize=False)

Load the response data and cross-study datasets.

Parameters:

dataset_name (str) – The name of the dataset to load. Can be one of (‘GDSC1’, ‘GDSC2’, ‘CCLE’, ‘CTRPv1’, ‘CTRPv2’, ‘TOYv1’, ‘TOYv2’) to download provided datasets, or any other name to use a custom dataset.
cross_study_datasets (list) – list of cross-study datasets. CurveCurator is not applicable to these. If you wish to provide custom cross_study_datasets, you have to invoke curve fitting manually using drevalpy.datasets.curvecurator.fit_curves
path_data (str) – The parent path in which custom or downloaded datasets should be located, or in which raw viability data is to be found for fitting with CurveCurator (see param curve_curator for details). The location of a dataset is resolved as <path_data>/<dataset_name>/<dataset_name>.csv.
measure (str) – The name of the column containing the measure to predict, default = “response”. If curve_curator is True, this measure is appended with “_curvecurator”, e.g. “response_curvecurator” to distinguish between measures provided by the original source of a dataset, or the measures fit by CurveCurator.
curve_curator (bool) – If True, the measure is appended with “_curvecurator”. If a custom dataset_name was provided, this will invoke the fitting procedure of raw viability data, which is expected to exist at <path_data>/<dataset_name>/<dataset_name>_raw.csv. The fitted dataset will be stored in the same folder, in a file called <dataset_name>.csv
cores (int) – Number of cores to use for CurveCurator fitting. Only used when curve_curator is True, default = 1
normalize (bool) – Whether to normalize the response values to [0, 1] for curvecurator. Default = False. Only used for custom datasets when curve_curator is True.

Return type:

tuple[DrugResponseDataset, list[DrugResponseDataset] | None]

Returns:

response data and, potentially, cross-study datasets
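
The file locations and the measure naming described in the parameters above can be summarized in a short sketch. It only builds paths and names and does not touch the file system or call DrEvalPy; sketch_dataset_paths and the dataset name MyScreen are hypothetical.

```python
from pathlib import Path


def sketch_dataset_paths(dataset_name, path_data="data", measure="response", curve_curator=False):
    """Resolve the documented file locations and the effective measure name."""
    base = Path(path_data) / dataset_name
    return {
        # <path_data>/<dataset_name>/<dataset_name>.csv
        "fitted": base / f"{dataset_name}.csv",
        # <path_data>/<dataset_name>/<dataset_name>_raw.csv (input for CurveCurator fitting)
        "raw": base / f"{dataset_name}_raw.csv",
        # The measure gets the "_curvecurator" suffix when curve_curator is True.
        "measure": f"{measure}_curvecurator" if curve_curator else measure,
    }


info = sketch_dataset_paths("MyScreen", curve_curator=True)
```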

drevalpy.utils.get_response_transformation(response_transformation)

Get the sklearn response transformation object of choice.

Users can choose from “None”, “standard”, “minmax”, “robust”.

Parameters:

response_transformation (str | None) – response transformation to apply

Return type:

TransformerMixin | None

Returns:

response transformation object

Raises:

ValueError – if the response transformation is not recognized

drevalpy.utils.main(args)

Main function to run the drug response evaluation pipeline.

Parameters:

args – passed from command line

Return type:

None

Pipeline function decorator

Decorator to mark a function as a pipeline function.

drevalpy.pipeline_function.pipeline_function(func)

Decorator to mark a function as a pipeline function.

Parameters:

func – function to decorate

Returns:

function with custom attribute
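
A marker decorator of this kind typically just sets an attribute on the function and returns it unchanged. The sketch below shows the general pattern; the attribute name _is_pipeline_function is hypothetical, since the name DrEvalPy uses is not given here.

```python
def sketch_pipeline_function(func):
    """Mark func with a custom attribute and return it unchanged."""
    func._is_pipeline_function = True  # hypothetical attribute name
    return func


@sketch_pipeline_function
def load_step():
    return "loaded"


# Tools can later discover marked functions by checking for the attribute.
is_marked = getattr(load_step, "_is_pipeline_function", False)
```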