Description
Describe the bug
When running the muon.tl.leiden function, leaving random_state at its default or setting it to 0 leads to a non-deterministic outcome. Setting it to a non-zero number seems to set an actual seed. I'm not sure whether this is a bug or by design, but the documentation isn't clear about it, and it would be great if you could add a note.
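One possible explanation, which is only a guess and has not been checked against muon's source, is a truthiness check on the seed: a test like `if random_state:` treats 0 the same as "no seed given". The sketch below shows that pitfall with Python's standard `random` module; the function names are hypothetical and nothing here is muon's actual code.

```python
import random


def make_rng_truthy(random_state=0):
    """Hypothetical pattern: seeds only when random_state is truthy."""
    rng = random.Random()  # seeded from OS entropy by default
    if random_state:  # 0 is falsy, so a seed of 0 is silently ignored
        rng.seed(random_state)
    return rng


def make_rng_explicit(random_state=0):
    """Seeds whenever a value is given, including 0."""
    rng = random.Random()
    if random_state is not None:  # only None means "no seed"
        rng.seed(random_state)
    return rng


# With the truthy check, two "seed 0" generators are seeded independently.
print(make_rng_truthy(0).random(), make_rng_truthy(0).random())

# With the explicit None check, seed 0 is reproducible.
print(make_rng_explicit(0).random(), make_rng_explicit(0).random())
```

If something like this is the cause, it would match what I see: 0 behaves like an unset seed, while any non-zero value is reproducible.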
To Reproduce
Here is an MRE:
import pandas as pd
import numpy as np
import scanpy as sc
import mudata as md
import muon as mu
sampleDataA = pd.DataFrame(np.random.randint(0, 1000, size=(1000, 10)), columns=list('ABCDEFGHIJ'))
sampleDataB = pd.DataFrame(np.random.randint(0, 1000, size=(1000, 10)), columns=list('KLMNOPQRST'))
metadataDFA = pd.DataFrame(index=sampleDataA.columns.tolist())
metadataDFB = pd.DataFrame(index=sampleDataB.columns.tolist())
adObjA = sc.AnnData(X=sampleDataA.values, obs=None, var=metadataDFA)
adObjB = sc.AnnData(X=sampleDataB.values, obs=None, var=metadataDFB)
sc.pp.neighbors(adObjA)
sc.pp.neighbors(adObjB)
mdObj1 = md.MuData({'A': adObjA, 'B': adObjB})
mdObj2 = mdObj1.copy()
totalIterations = 50
seed = 0
isNotIdentical = 0
for i in range(totalIterations):
    mu.tl.leiden(mdObj1, resolution=1, key_added='leiden', random_state=seed)
    clusters1 = mdObj1.obs['leiden'].copy()
    mu.tl.leiden(mdObj2, resolution=1, key_added='leiden', random_state=seed)
    clusters2 = mdObj2.obs['leiden'].copy()
    if not clusters1.equals(clusters2):
        isNotIdentical += 1
print(f"{isNotIdentical} iterations out of {totalIterations} returned non-identical results")
Expected behaviour
I expected random_state=0 to mean a seed of 0, i.e. a deterministic result. In the above MRE, after 50 iterations, I typically get 2-4 instances where the clustering is not identical. I also checked whether the difference was merely the cluster labels being assigned in a different order, but the actual partitioning was different.
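A label-order check of this kind can be sketched as follows. This is an illustrative helper (`same_partition` is a hypothetical name), not necessarily the exact code used for this report. It treats two label sequences as the same partition only if there is a one-to-one renaming between their labels.

```python
def same_partition(labels_a, labels_b):
    """Return True if both label sequences describe the same partition,
    ignoring how the clusters happen to be named."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    a_to_b = {}  # label in A -> the label it maps to in B
    b_to_a = {}  # label in B -> the label it maps to in A
    for a, b in zip(labels_a, labels_b):
        # setdefault records the first pairing; any later mismatch means
        # the mapping is not one-to-one, so the partitions differ.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True


# Same grouping, labels swapped: counts as identical.
print(same_partition(["0", "0", "1", "2"], ["1", "1", "0", "2"]))

# One cell moved to another cluster: counts as different.
print(same_partition(["0", "0", "1", "2"], ["0", "1", "1", "2"]))
```

In the MRE it could be called as `same_partition(clusters1.tolist(), clusters2.tolist())` in place of `clusters1.equals(clusters2)`, so that only genuine partition differences are counted.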
System
- OS: Ubuntu 24.04.1 LTS
- Python version 3.10.15
- Versions of libraries involved: Muon 0.1.7, MuData 0.2.3
Additional context
As I mentioned above, this might not be a bug at all, but I still think it would be nice if you would add a note about this to the documentation. Thanks!