I was using GNNExplainer on a GNN-based model: the OpenPOM model together with PyTorch Geometric's GNNExplainer algorithm. I encountered an error while generating an explanation with the code below. Any help would be appreciated.
# Imports assumed from the OpenPOM README example
import deepchem as dc
from openpom.feat.graph_featurizer import GraphFeaturizer, GraphConvConstants
from openpom.utils.data_utils import get_class_imbalance_ratio
from openpom.models.mpnn_pom import MPNNPOMModel

# TASKS is the list of odor descriptor column names, defined elsewhere.
input_file = 'curated_GS_LF_merged_4983.csv'  # or new downloaded file path
featurizer = GraphFeaturizer()
smiles_field = 'nonStereoSMILES'  # column that contains SMILES
loader = dc.data.CSVLoader(tasks=TASKS,
                           feature_field=smiles_field,
                           featurizer=featurizer)
dataset = loader.create_dataset(inputs=[input_file])
n_tasks = len(dataset.tasks)
len(dataset)
randomstratifiedsplitter = dc.splits.RandomStratifiedSplitter()
# train_valid_test_split returns the splits in (train, valid, test) order
train_dataset, valid_dataset, test_dataset = randomstratifiedsplitter.train_valid_test_split(dataset, frac_train=0.8, frac_valid=0.1, frac_test=0.1, seed=1)
print("train_dataset: ", len(train_dataset))
print("valid_dataset: ", len(valid_dataset))
print("test_dataset: ", len(test_dataset))
train_ratios = get_class_imbalance_ratio(train_dataset)
assert len(train_ratios) == n_tasks
learning_rate = dc.models.optimizers.ExponentialDecay(initial_rate=0.001, decay_rate=0.5, decay_steps=32*20, staircase=True)
model = MPNNPOMModel(n_tasks=n_tasks,  # general configuration
                     batch_size=128,
                     learning_rate=learning_rate,
                     class_imbalance_ratio=train_ratios,  # loss calculation
                     loss_aggr_type='sum',
                     node_out_feats=100,  # node and edge feature representation
                     edge_hidden_feats=75,
                     edge_out_feats=100,
                     num_step_message_passing=5,  # message passing and aggregation
                     mpnn_residual=True,
                     message_aggregator_type='sum',
                     mode='classification',
                     number_atom_features=GraphConvConstants.ATOM_FDIM,  # number of input features per atom
                     number_bond_features=GraphConvConstants.BOND_FDIM,  # number of input features per bond
                     n_classes=1,  # binary classification per task: 0 = no odor, 1 = odor
                     readout_type='global_sum_pooling',
                     num_step_set2set=3,
                     num_layer_set2set=2,
                     ffn_hidden_list=[392, 392],
                     ffn_embeddings=256,
                     ffn_activation='relu',
                     ffn_dropout_p=0.12,
                     ffn_dropout_at_input_no_act=False,
                     weight_decay=1e-5,
                     self_loop=False,
                     optimizer_name='adam',
                     log_frequency=32,
                     model_dir='./experiments',
                     device_name='cpu')
import torch
from torch_geometric.explain import Explainer, GNNExplainer

explainer = Explainer(
    model=model,
    algorithm=GNNExplainer(epochs=200),
    explanation_type='model',
    node_mask_type='attributes',
    edge_mask_type='object',
    model_config=dict(
        mode='multiclass_classification',
        task_level='graph',
        return_type='log_probs',
    ),
)
# `data` is a single PyG graph and `label_id` is the target class index (both defined elsewhere).
# All nodes belong to graph 0, so the batch vector is all zeros.
batch = torch.zeros(data.x.size(0), dtype=torch.long)
with torch.no_grad():
    output = model(data.x, data.edge_index, batch)
print(f"Model output shape: {output.shape}")
print("Forward pass successful!")
print("Running explainer...")
exp = explainer(
    x=data.x,
    edge_index=data.edge_index,
    target=label_id,
    batch=batch
)
print(f"Explanation for '{TASKS[label_id]}' odor:")
print(f"Node importance: max={exp.node_mask.max().item():.3f}, min={exp.node_mask.min().item():.3f}")
print(f"Edge importance: max={exp.edge_mask.max().item():.3f}, min={exp.edge_mask.min().item():.3f}")
# Optionally, if you need to visualize the results
print("Explanation generated successfully!")
The explainer is configured with mode='multiclass_classification', whereas the problem OpenPOM solves is multi-label. I have already handled this by extracting the 571 SMILES strings that have exactly one label as their descriptor, so that subset effectively represents a multiclass problem. Assume the dataset has already been converted to PyG format. Please help me resolve this error.
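The single-label filtering described above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual preprocessing: the helper name `single_label_subset` and the toy rows are hypothetical, and it assumes each molecule's labels are stored as a multi-hot list of 0/1 values.

```python
from typing import List, Sequence, Tuple


def single_label_subset(
    smiles: Sequence[str], labels: Sequence[Sequence[int]]
) -> List[Tuple[str, int]]:
    """Keep molecules with exactly one active label.

    Returns (SMILES, class index) pairs, where the class index is the
    position of the single active label in the multi-hot row.
    """
    subset = []
    for smi, row in zip(smiles, labels):
        # Indices of all descriptors marked as present for this molecule
        active = [i for i, v in enumerate(row) if v == 1]
        if len(active) == 1:
            subset.append((smi, active[0]))
    return subset


# Toy data (hypothetical): three descriptors per molecule.
toy_smiles = ["CCO", "CC(=O)O", "c1ccccc1"]
toy_labels = [[0, 1, 0], [1, 1, 0], [0, 0, 1]]
pairs = single_label_subset(toy_smiles, toy_labels)
print(pairs)
```

Each retained molecule then carries a single integer class index, which is the kind of target a multiclass configuration expects.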