PharmaFormer
PharmaFormer Model
Contains PharmaFormer, a Transformer-based deep learning model that predicts clinical drug responses by integrating gene expression profiles and drug molecular structures.
Original authors: Zhou et al. (2025, DOI: 10.1038/s41698-025-01082-6). Code adapted from their GitHub repository: https://github.com/zhouyuru1205/PharmaFormer
- class drevalpy.models.PharmaFormer.pharmaformer.PharmaFormerModel
Bases: DRPModel
PharmaFormer model for drug response prediction.
- build_model(hyperparameters)
Builds the PharmaFormer model with the specified hyperparameters.
- cell_line_views = ['gene_expression']
- drug_views = ['bpe_smiles']
- early_stopping = True
- classmethod load(directory)
Load the PharmaFormer model using PyTorch conventions.
This method expects the following files in the given directory:
“pharmaformer_model.pt”: PyTorch state_dict of the model
“hyperparameters.json”: Dictionary of hyperparameters
“gene_scaler.pkl”: Fitted StandardScaler (optional)
“gene_normalizer.pkl”: Fitted MinMaxScaler (optional)
- Parameters:
directory (str) – Path to the directory containing the model files
- Return type:
- Returns:
An instance of PharmaFormerModel with loaded model
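Before calling load, it can help to confirm that the directory has the layout listed above. The following is a minimal sketch using only the standard library. The helper name `check_model_dir` is hypothetical and not part of drevalpy. It treats the two files not marked optional as required, and it checks only that files are present, not what they contain.

```python
from pathlib import Path

# File names are taken from the list above; the helper itself is hypothetical.
REQUIRED_FILES = ("pharmaformer_model.pt", "hyperparameters.json")
OPTIONAL_FILES = ("gene_scaler.pkl", "gene_normalizer.pkl")


def check_model_dir(directory: str) -> dict[str, list[str]]:
    """Report which expected PharmaFormer files are missing from a directory."""
    root = Path(directory)
    return {
        "missing_required": [n for n in REQUIRED_FILES if not (root / n).is_file()],
        "missing_optional": [n for n in OPTIONAL_FILES if not (root / n).is_file()],
    }
```

An empty `missing_required` list suggests the directory is at least structurally ready for load. Missing optional files only mean that no fitted scalers were stored.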
- load_cell_line_features(data_path, dataset_name)
Load cell line features.
- Parameters:
- Return type:
- Returns:
cell line features
- load_drug_features(data_path, dataset_name)
Load drug features (BPE-encoded SMILES).
- Parameters:
- Return type:
- Returns:
drug features
- Raises:
FileNotFoundError – if the BPE SMILES file is not found
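The model components documented further down take SMILES input of shape [batch_size, 128], so each BPE-encoded SMILES string has to end up as exactly 128 token IDs. Below is a minimal sketch of that fixed-length step. It assumes right-padding with a pad ID of 0 and truncation of longer sequences; both are assumptions, since the actual encoding is defined by the BPE SMILES file and is not shown here. The function name is hypothetical.

```python
MAX_SMILES_TOKENS = 128  # matches the documented [batch_size, 128] shape
PAD_ID = 0  # assumption: the real padding ID may differ


def pad_or_truncate(token_ids: list[int], length: int = MAX_SMILES_TOKENS,
                    pad_id: int = PAD_ID) -> list[int]:
    """Return exactly `length` token IDs: truncate long inputs, right-pad short ones."""
    clipped = token_ids[:length]
    return clipped + [pad_id] * (length - len(clipped))
```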
- predict(cell_line_ids, drug_ids, cell_line_input, drug_input=None)
Predicts the response values for the given cell lines and drugs.
- Parameters:
cell_line_ids (ndarray) – list of cell line IDs
drug_ids (ndarray) – list of drug IDs
cell_line_input (FeatureDataset) – input data associated with the cell line
drug_input (FeatureDataset | None) – input data associated with the drug
- Return type:
- Returns:
predicted response values
- Raises:
ValueError – if drug_input is None or if the model is not initialized
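Although drug_input defaults to None in the signature, the documented ValueError means PharmaFormer cannot predict without it. The sketch below wraps the call with the documented keyword arguments and fails early. The wrapper name `predict_with_check` is hypothetical, and `model` is assumed to be any object exposing the predict signature shown above.

```python
def predict_with_check(model, cell_line_ids, drug_ids, cell_line_input, drug_input=None):
    """Call model.predict with the documented arguments, failing early if drug_input is missing."""
    if drug_input is None:
        # Mirrors the documented ValueError before any model code runs.
        raise ValueError("PharmaFormer needs drug_input (BPE-encoded SMILES features).")
    return model.predict(
        cell_line_ids=cell_line_ids,
        drug_ids=drug_ids,
        cell_line_input=cell_line_input,
        drug_input=drug_input,
    )
```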
- save(directory)
Save the PharmaFormer model using PyTorch conventions.
This method stores:
“pharmaformer_model.pt”: PyTorch state_dict of the model
“hyperparameters.json”: All hyperparameters
“gene_scaler.pkl”: Fitted StandardScaler for gene expression
“gene_normalizer.pkl”: Fitted MinMaxScaler for gene expression
- Parameters:
directory (str) – Target directory where the model files will be saved
- Raises:
ValueError – If model is not built
- Return type:
- train(output, cell_line_input, drug_input=None, output_earlystopping=None, model_checkpoint_dir='checkpoints')
Trains the model.
- Parameters:
output (DrugResponseDataset) – training data associated with the response output
cell_line_input (FeatureDataset) – input data associated with the cell line
drug_input (FeatureDataset | None) – input data associated with the drug
output_earlystopping (DrugResponseDataset | None) – early stopping data associated with the response output
model_checkpoint_dir (str) – directory to save the model checkpoint
- Raises:
ValueError – if drug_input is None or if early stopping data is missing
- Return type:
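Because early_stopping is True for this model, train needs output_earlystopping so that a held-out loss can be monitored. The sketch below shows the general patience-based technique in plain Python. It is not the drevalpy training loop, and `patience`, `min_delta`, and the function name are illustrative assumptions.

```python
def best_epoch_with_patience(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, last_epoch_run) for a sequence of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Improvement: remember this epoch and reset the counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch  # stopped early
    return best_epoch, len(val_losses) - 1  # ran through every epoch
```

In a real training loop, the weights from `best_epoch` would typically be the ones kept, which is one plausible use for a checkpoint directory such as model_checkpoint_dir.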
Model utils
Neural network components for PharmaFormer model.
- class drevalpy.models.PharmaFormer.model_utils.CombinedModel(gene_input_size, gene_hidden_size, drug_hidden_size, feature_dim, nhead, num_layers=3, dim_feedforward=2048, dropout=0.1)
Bases: Module
Combined model integrating feature extraction and transformer.
- Parameters:
- forward(gene_expr, smiles)
Forward pass of the combined model.
- Parameters:
gene_expr (Tensor) – Gene expression features [batch_size, gene_input_size]
smiles (Tensor) – BPE-encoded SMILES features [batch_size, 128]
- Return type:
Tensor
- Returns:
Output predictions [batch_size, 1]
- class drevalpy.models.PharmaFormer.model_utils.FeatureExtractor(gene_input_size, gene_hidden_size, drug_hidden_size)
Bases: Module
Feature extractor for gene expression and drug SMILES.
- forward(gene_expr, smiles)
Forward pass of the feature extractor.
- Parameters:
gene_expr (Tensor) – Gene expression features [batch_size, gene_input_size]
smiles (Tensor) – BPE-encoded SMILES features [batch_size, 128]
- Return type:
Tensor
- Returns:
Combined features [batch_size, gene_hidden_size + drug_hidden_size]
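The shapes documented for FeatureExtractor and CombinedModel can be written down as plain tuples, which is handy for asserting tensor shapes in tests. This is a bookkeeping sketch only. The helper name is hypothetical, it does not run the model, and it says nothing about how the layers are implemented.

```python
SMILES_LEN = 128  # fixed SMILES length from the documented shapes


def expected_shapes(batch_size, gene_input_size, gene_hidden_size, drug_hidden_size):
    """Return the documented input and output shapes as plain tuples."""
    return {
        "gene_expr": (batch_size, gene_input_size),
        "smiles": (batch_size, SMILES_LEN),
        # Output of FeatureExtractor.forward
        "combined_features": (batch_size, gene_hidden_size + drug_hidden_size),
        # Output of CombinedModel.forward
        "prediction": (batch_size, 1),
    }
```

A test could then compare `tuple(tensor.shape)` against the matching entry.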
- class drevalpy.models.PharmaFormer.model_utils.TransModel(feature_dim, nhead, seq_len, dim_feedforward=2048, dropout=0.1, num_layers=3)
Bases: Module
Transformer model for processing combined features.
- Parameters:
- forward(x)
Forward pass of the transformer model.
- Parameters:
x (Tensor) – Input tensor [batch_size, seq_len, feature_dim]
- Return type:
Tensor
- Returns:
Output predictions [batch_size, 1]
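When choosing feature_dim and nhead, one constraint is worth checking up front. If TransModel is built on PyTorch's standard multi-head attention (an assumption; the implementation is not shown here), the embedding dimension must be divisible by the number of heads. The sketch below lists compatible head counts. The function name and the candidate values are hypothetical.

```python
def valid_heads(feature_dim: int, candidates=(1, 2, 4, 8, 16)) -> list[int]:
    """Return the candidate head counts that divide feature_dim evenly."""
    if feature_dim <= 0:
        raise ValueError("feature_dim must be positive")
    return [n for n in candidates if feature_dim % n == 0]
```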