Visualization
Outplot
Abstract wrapper class for all visualizations.
- class drevalpy.visualization.outplot.OutPlot
Bases: ABC
Abstract wrapper class for all visualizations.
- abstractmethod draw_and_save(out_prefix, out_suffix)
Draw and save the plot.
- abstractmethod static write_to_html(test_mode, f, *args, **kwargs)
Write the plot to the final report file.
- Parameters:
test_mode (str) – LPO, LCO, LDO
f (TextIOWrapper) – the file to write to
args – additional arguments
kwargs – additional keyword arguments
- Return type:
- Returns:
the file to write to
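The interface above can be illustrated with a minimal, self-contained sketch. The `OutPlot` stand-in below only mirrors the two documented abstract methods so that the example runs without drevalpy installed. `TextPlot`, its file naming scheme, and the iframe markup are hypothetical and not part of the library.

```python
import io
import os
import tempfile
from abc import ABC, abstractmethod


class OutPlot(ABC):
    """Stand-in mirroring the documented interface (not the drevalpy class)."""

    @abstractmethod
    def draw_and_save(self, out_prefix: str, out_suffix: str) -> None:
        """Draw and save the plot."""

    @staticmethod
    @abstractmethod
    def write_to_html(test_mode: str, f, *args, **kwargs):
        """Write the plot to the report file and return the file."""


class TextPlot(OutPlot):
    """Hypothetical subclass that 'draws' a plain-text placeholder."""

    def __init__(self, text: str) -> None:
        self.text = text

    def draw_and_save(self, out_prefix: str, out_suffix: str) -> None:
        # The file naming scheme is an assumption made for this example.
        path = os.path.join(out_prefix, f"textplot_{out_suffix}.html")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(f"<p>{self.text}</p>")

    @staticmethod
    def write_to_html(test_mode: str, f, *args, **kwargs):
        # Append a reference to the saved file and hand the handle back.
        f.write(f'<iframe src="textplot_{test_mode}.html"></iframe>\n')
        return f


out_dir = tempfile.mkdtemp()
TextPlot("hello").draw_and_save(out_dir, "LPO")
report = TextPlot.write_to_html("LPO", io.StringIO())
report_text = report.getvalue()
print(report_text)
```

Because both methods are abstract, instantiating the stand-in `OutPlot` directly raises a `TypeError`; only subclasses that implement both methods can be created.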
Comparison scatter plot
Contains the code needed to draw the correlation comparison scatter plot.
- class drevalpy.visualization.comp_scatter.ComparisonScatter(df, color_by, test_mode, metric='R^2', algorithm='all')
Bases: OutPlot
Class to draw scatter plots for comparison of correlation metrics between models.
Produces two types of plots: an overall comparison plot and a dropdown plot for comparison between all models. If one model is consistently better than the other, the points deviate from the identity line (higher if the model is on the y-axis, lower if it is on the x-axis). The dropdown plot lets the user select two models and compare their per-drug or per-cell-line Pearson correlation. The overall plot facets all models and visualizes the density of the points.
- draw_and_save(out_prefix, out_suffix)
Draws and saves the scatter plots.
- Parameters:
- Raises:
AssertionError – if out_suffix does not match self.name
- Return type:
- static write_to_html(test_mode, f, *args, **kwargs)
Inserts the generated files into the result HTML file.
- Parameters:
test_mode (str) – test_mode, e.g., LCO
f (TextIOWrapper) – file to write to
args – unused
kwargs – used to get all files generated by create_report / the pipeline
- Return type:
- Returns:
the file f
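To make the identity-line reading of the scatter plot concrete, here is a small sketch with made-up per-drug correlations for two hypothetical models. It counts how many points fall above the identity line, i.e., where the model on the y-axis scores higher. The numbers and names are illustrative only and are not produced by drevalpy.

```python
# Hypothetical per-drug Pearson correlations for two models (made-up numbers).
model_x = {"drug_a": 0.42, "drug_b": 0.55, "drug_c": 0.31, "drug_d": 0.60}
model_y = {"drug_a": 0.50, "drug_b": 0.58, "drug_c": 0.29, "drug_d": 0.71}

# Only drugs evaluated by both models can be plotted against each other.
shared = sorted(model_x.keys() & model_y.keys())

# A point lies above the identity line when the y-axis model scores higher.
above = [d for d in shared if model_y[d] > model_x[d]]
below = [d for d in shared if model_y[d] < model_x[d]]
frac_above = len(above) / len(shared)

print(f"above identity line: {above}")
print(f"below identity line: {below}")
print(f"fraction above: {frac_above:.2f}")
```

A fraction well above 0.5 suggests that the model on the y-axis tends to achieve higher correlations for most drugs.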
Critical difference plot
Draws the critical difference plot.
This method performs the following steps:
Friedman Test: First, it performs the Friedman test, which is a non-parametric statistical test used to detect differences in treatments across multiple test attempts. It compares the ranks of multiple groups and is suitable when there are repeated measurements for each group (as is the case here with cross-validation splits). The p-value of this test is used to assess whether there are any significant differences in the performance of the models. We use Benjamini/Hochberg correction for multiple testing.
Post-hoc Conover Test: If the Friedman test returns a significant result (p-value < 0.05), the post-hoc Conover test can be used to identify pairs of algorithms that perform significantly differently. This test is necessary because the Friedman test only tells if there is a difference somewhere among the models, but not which ones are different. The scikit_posthocs library is used for this step.
Rank Calculation: Next, the average ranks of each classifier across all cross-validation splits are computed. The models are ranked based on their performance (lower ranks indicate better performance) and the average rank across all splits is calculated for each model.
Critical Difference Diagram: Finally, the method draws the critical difference diagram. This diagram visually displays the significant differences between the algorithms. A horizontal line groups a set of models that are not significantly different. The critical difference is determined based on the post-hoc test results.
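The rank calculation and the Friedman statistic from the steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the scores are made-up MSE values, the function names are hypothetical, and the statistic is computed without a tie correction and without a p-value.

```python
def average_ranks(scores_per_split):
    """Rank models within each split (1 = lowest score) and average the ranks.

    scores_per_split: list of dicts mapping model name -> error (e.g. MSE).
    Tied scores share the mean of the ranks they span.
    """
    models = sorted(scores_per_split[0])
    totals = {m: 0.0 for m in models}
    for split in scores_per_split:
        ordered = sorted(models, key=lambda m: split[m])
        i = 0
        while i < len(ordered):
            j = i
            # Extend j over the run of models tied with ordered[i].
            while j + 1 < len(ordered) and split[ordered[j + 1]] == split[ordered[i]]:
                j += 1
            shared_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
            for m in ordered[i : j + 1]:
                totals[m] += shared_rank
            i = j + 1
    n = len(scores_per_split)
    return {m: totals[m] / n for m in models}


def friedman_statistic(avg_ranks, n_splits):
    """Friedman chi-square statistic from average ranks (no tie correction)."""
    k = len(avg_ranks)
    centre = (k + 1) / 2  # expected average rank if all models were equal
    spread = sum((r - centre) ** 2 for r in avg_ranks.values())
    return 12 * n_splits / (k * (k + 1)) * spread


# Made-up MSE values for three hypothetical models over four CV splits.
splits = [
    {"A": 1.0, "B": 2.0, "C": 3.0},
    {"A": 1.1, "B": 2.5, "C": 2.0},
    {"A": 0.9, "B": 1.8, "C": 2.9},
    {"A": 1.2, "B": 2.2, "C": 3.1},
]
ranks = average_ranks(splits)
statistic = friedman_statistic(ranks, len(splits))
print(ranks, statistic)
```

Here model A has the lowest MSE in every split and therefore an average rank of 1.0. Turning the statistic into a p-value requires the chi-square distribution with k − 1 degrees of freedom, which is left out to keep the sketch dependency-free.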
- class drevalpy.visualization.critical_difference_plot.CriticalDifferencePlot(eval_results_preds, metric='MSE')
Bases: OutPlot
Draws the critical difference diagram.
The critical difference diagram is used to compare the performance of multiple classifiers and to show whether one model is significantly better than another. The comparison is based on the average ranks of the classifiers, which is why at least 3 classifiers are needed to draw the diagram. Because the ranks are calculated over the cross-validation splits and the significance threshold is set to 0.05, a sufficient number of CV folds (e.g., 10) is advisable.
- Parameters:
eval_results_preds (DataFrame)
- draw_and_save(out_prefix, out_suffix)
Draws the critical difference plot and saves it to a file.
- Parameters:
- Raises:
ValueError – if the figure is None or the test results are None
- Return type:
- static write_to_html(test_mode, f, *args, **kwargs)
Inserts the critical difference plot into the HTML report file.
- Parameters:
test_mode (str) – test_mode, e.g., LPO
f (TextIOWrapper) – HTML report file
args – not needed
kwargs – not needed
- Return type:
- Returns:
HTML report file
Cross study tables
Module for generating evaluation tables for cross-study drug response prediction.
- class drevalpy.visualization.cross_study_tables.CrossStudyTables(evaluation_metrics, path_data)
Bases: object
Generate evaluation tables for cross-study drug response prediction.
- Parameters:
evaluation_metrics (DataFrame)
path_data (Path)
- draw()
Create and store Plotly table figures sorted by MSE.
- draw_and_save(out_prefix, out_suffix)
Generate and save HTML tables for each cross-study dataset.
- static write_to_html(test_mode, f, files, prefix)
Embed HTML table files into an open HTML file handle.
- Parameters:
test_mode (str) – Substring to match filenames (e.g., ‘lpo’, ‘lco’).
f (TextIOWrapper) – Open writable file handle to insert HTML blocks.
files (list[str]) – List of filenames in the target directory.
prefix (str) – Path prefix to locate HTML table files.
- Return type:
- Returns:
Updated file handle with HTML blocks written in.
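The filename matching described for write_to_html can be sketched as follows. This is a hedged illustration, not the library's code: the function name `embed_tables`, the iframe markup, and the example filenames are assumptions, and the actual method may embed the tables differently.

```python
import io


def embed_tables(test_mode, f, files, prefix):
    """Write one HTML block per table file whose name contains test_mode."""
    for name in sorted(files):
        # Substring match on the filename, restricted to HTML files.
        if test_mode in name and name.endswith(".html"):
            # The iframe markup is an assumption made for this example.
            f.write(f'<iframe src="{prefix}{name}" width="100%"></iframe>\n')
    return f


# Hypothetical directory listing.
files = ["table_lpo_study1.html", "table_lco_study1.html", "notes.txt"]
report = embed_tables("lpo", io.StringIO(), files, "cross_study/")
html = report.getvalue()
print(html)
```

Returning the handle, as the documented method does, lets the caller chain several writers on the same report file.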
Regression slider plot
Module for generating regression plots with a slider for Pearson correlation coefficient.
- class drevalpy.visualization.regression_slider_plot.RegressionSliderPlot(df, test_mode, model, group_by='drug_name', normalize=False)
Bases: OutPlot
Generates regression plots with a slider for the Pearson correlation coefficient.
- draw_and_save(out_prefix, out_suffix)
Draw the regression plot and save it to a file.
- static write_to_html(test_mode, f, *args, **kwargs)
Write the plot to the final report file.
- Parameters:
test_mode (str) – test_mode, e.g., LPO
f (TextIOWrapper) – final report file
args – additional arguments
kwargs – additional keyword arguments, in this case all files
- Return type:
- Returns:
the final report file
Violin and heatmap parent class
Parent class for Violin and Heatmap plots of performance measures over CV runs.
- class drevalpy.visualization.vioheat.VioHeat(df, normalized_metrics=False, whole_name=False)
Bases: OutPlot
Parent class for Violin and Heatmap plots of performance measures over CV runs.
- Parameters:
df (DataFrame)
- draw_and_save(out_prefix, out_suffix)
Draw and save the plot.
- static write_to_html(test_mode, f, *args, **kwargs)
Write the Violin and Heatmap plots into the result HTML file.
- Parameters:
test_mode (str) – test_mode, e.g., LPO
f (TextIOWrapper) – result HTML file
args – additional arguments
kwargs – additional keyword arguments, in this case, the plot type and the files
- Return type:
- Returns:
the result HTML file
Heatmap
Plots a heatmap of the evaluation metrics.
Violin plot
Plots a violin plot of the evaluation metrics.
Utility functions
Utility functions for the visualization part of the package.
- drevalpy.visualization.utils.compute_evaluation(df, return_df, group_by, model)
Compute the evaluation metrics per group.
- Parameters:
- Return type:
DataFrame
- Returns:
dataframe with the evaluation results per group
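As a rough illustration of per-group evaluation, the sketch below groups true and predicted responses by a key and computes MSE and the Pearson correlation for each group. It does not reproduce the signature or the metric set of compute_evaluation: the row layout, the column names `y_true` and `y_pred`, and the helper names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def evaluate_per_group(rows, group_by):
    """Collect true/predicted values per group and compute MSE and Pearson."""
    groups = defaultdict(lambda: ([], []))
    for row in rows:
        y_true, y_pred = groups[row[group_by]]
        y_true.append(row["y_true"])
        y_pred.append(row["y_pred"])
    results = {}
    for key, (y_true, y_pred) in groups.items():
        mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
        results[key] = {"MSE": mse, "Pearson": pearson(y_true, y_pred)}
    return results


# Made-up predictions for two drugs.
rows = [
    {"drug_name": "drug_a", "y_true": 1.0, "y_pred": 1.5},
    {"drug_name": "drug_a", "y_true": 2.0, "y_pred": 2.5},
    {"drug_name": "drug_a", "y_true": 3.0, "y_pred": 3.5},
    {"drug_name": "drug_b", "y_true": 1.0, "y_pred": 3.0},
    {"drug_name": "drug_b", "y_true": 2.0, "y_pred": 2.0},
    {"drug_name": "drug_b", "y_true": 3.0, "y_pred": 1.0},
]
per_drug = evaluate_per_group(rows, "drug_name")
print(per_drug)
```

The example shows why both metrics are worth reporting per group: drug_a is predicted with a constant offset (perfect correlation, nonzero MSE), while drug_b is predicted in reverse order (correlation of -1).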
- drevalpy.visualization.utils.create_html(run_id, test_mode, files, prefix_results)
Create the html file for the given test mode, e.g., LPO.html.
- drevalpy.visualization.utils.create_index_html(custom_id, test_modes, prefix_results)
Create the index.html file.
- drevalpy.visualization.utils.create_output_directories(result_path, custom_id)
If they do not exist yet, make directories for the visualization files.
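Creating directories only if they are missing is commonly done with `Path.mkdir(parents=True, exist_ok=True)`. The sketch below shows that pattern. The helper name `make_output_dirs` and the subdirectory names are hypothetical; the actual layout created by create_output_directories is not documented here.

```python
import tempfile
from pathlib import Path


def make_output_dirs(result_path, custom_id, subdirs):
    """Create result_path/custom_id/<subdir> for each subdir if missing."""
    base = Path(result_path) / custom_id
    created = []
    for sub in subdirs:
        target = base / sub
        # parents=True creates intermediate directories;
        # exist_ok=True makes the call a no-op if the directory exists.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


root = tempfile.mkdtemp()
# Subdirectory names are assumptions made for this example.
first = make_output_dirs(root, "my_run", ["violin_plots", "heatmaps"])
second = make_output_dirs(root, "my_run", ["violin_plots", "heatmaps"])  # rerun is safe
print([str(p) for p in first])
```

Running the helper twice does not raise an error, which matches the "if they do not exist yet" behaviour described above.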
- drevalpy.visualization.utils.draw_algorithm_plots(model, ev_res, ev_res_per_drug, ev_res_per_cell_line, t_vs_p, test_mode, custom_id, result_path)
Draw all plots for a specific algorithm.
- Parameters:
model (str) – name of the model/algorithm
ev_res (DataFrame) – overall evaluation results
ev_res_per_drug (DataFrame|None) – evaluation results per drug
ev_res_per_cell_line (DataFrame|None) – evaluation results per cell line
t_vs_p (DataFrame) – true response values vs. predicted response values
test_mode (str) – test_mode
custom_id (str) – run id passed via command line
result_path (Path) – path to the results
- Return type:
- drevalpy.visualization.utils.draw_test_mode_plots(test_mode, ev_res, ev_res_per_drug, ev_res_per_cell_line, custom_id, path_data, result_path)
Draw all plots for a specific test_mode (LPO, LCO, LDO, LTO).
- Parameters:
test_mode (str) – test_mode
ev_res (DataFrame) – overall evaluation results
ev_res_per_drug (DataFrame|None) – evaluation results per drug
ev_res_per_cell_line (DataFrame|None) – evaluation results per cell line
custom_id (str) – run id passed via command line
path_data (Path) – path to the data
result_path (Path) – path to the results
- Return type:
- Returns:
list of unique algorithms
- Raises:
ValueError – if no evaluation results are found for the given test_mode
- drevalpy.visualization.utils.evaluate_file(pred_file, test_mode, model_name, dataset_name='NO_DATASET_NAME')
Evaluate the predictions from the final models.
- drevalpy.visualization.utils.parse_results(path_to_results, dataset)
Parse the results from the given directory.
- drevalpy.visualization.utils.prep_results(eval_results, eval_results_per_drug, eval_results_per_cell_line, t_vs_p, path_data)
Prepare the results by introducing new columns for algorithm, randomization, test_mode, split, CV_split.
- Parameters:
eval_results (DataFrame) – evaluation results
eval_results_per_drug (DataFrame) – evaluation results per drug
eval_results_per_cell_line (DataFrame) – evaluation results per cell line
t_vs_p (DataFrame) – true vs. predicted values
path_data (Path) – path to the data
- Return type:
tuple[DataFrame, DataFrame, DataFrame, DataFrame]
- Returns:
the same dataframes with new columns
- Raises:
ValueError – if NaiveMeanEffectsPredictor is not found in the evaluation results
- drevalpy.visualization.utils.write_results(path_out, eval_results, eval_results_per_drug, eval_results_per_cl, t_vs_p)
Write the results to csv files.
- Parameters:
path_out (str) – path to the output directory, e.g., results/my_run/
eval_results (DataFrame) – evaluation results
eval_results_per_drug (DataFrame) – evaluation results per drug
eval_results_per_cl (DataFrame) – evaluation results per cell line
t_vs_p (DataFrame) – true vs. predicted values
- Return type: