# Hidden Conflicts in Neural Networks
This is a code repository for the paper "Hidden Conflicts in Neural Networks and their Implications for Explainability".
To replicate our experiments, please set up a new Conda environment using the provided `environment.yml` file and install the `txai` package from the `cafe` folder using either conda or pip (e.g., by running `pip install -e .` in that folder). You will also need to install an editable version of Captum from source (by running `pip install -e .` in the cloned Captum repository after checking out the `v0.7.0` version tag) and make the following changes to its code (illustrated in the sketch after the list):
- In the file `captum/attr/_core/lrp.py`, add `nn.Flatten: EpsilonRule` to `SUPPORTED_LAYERS_WITH_RULES` and add `nn.GELU` to `SUPPORTED_NON_LINEAR_LAYERS`. This enables the application of LRP to some of our MLP models.
- In the file `captum/metrics/_core/sensitivity.py`, remove `with torch.no_grad():` at line 309, as it interferes with the computation of CAFE scores when evaluating sensitivity.
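For orientation, the sketch below shows roughly what the two edits amount to. The pre-existing entries of `SUPPORTED_LAYERS_WITH_RULES` and `SUPPORTED_NON_LINEAR_LAYERS` and the `EpsilonRule` import path are reproduced from memory of the Captum v0.7.0 sources and may not match them exactly, so please follow the actual file contents when applying the changes.

```python
# Sketch of the edits to captum/attr/_core/lrp.py; the existing entries shown
# here are indicative only and may differ from the actual v0.7.0 source.
from torch import nn
from captum.attr._utils.lrp_rules import EpsilonRule

SUPPORTED_LAYERS_WITH_RULES = {
    nn.Linear: EpsilonRule,
    nn.Conv2d: EpsilonRule,
    # ... other entries already present in Captum ...
    nn.Flatten: EpsilonRule,  # added: lets LRP propagate through Flatten layers
}

SUPPORTED_NON_LINEAR_LAYERS = [
    nn.ReLU,
    nn.Tanh,
    # ... other entries already present in Captum ...
    nn.GELU,  # added: enables LRP on our GELU-based MLP models
]

# In captum/metrics/_core/sensitivity.py (around line 309), the
# `with torch.no_grad():` context should be removed (and its body dedented)
# so that gradient information stays available to CAFE during the
# sensitivity evaluation.
```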
Taking the above steps should provide you with the basic environment necessary for running the main scripts in the `cafe/experiments` subfolder. Some experiments have additional requirements, detailed below:
- Running the experiments on the OpenXAI models requires an additional installation of the OpenXAI package in the environment.
- To run the MIMIC-IV experiments, you will first need to regenerate our version of the MIMIC-IV data using the provided `MIMIC.ipynb` notebook and the dedicated `mimic.yml` environment, after placing the MIMIC-IV files downloaded from PhysioNet at the locations expected by the code. Unfortunately, we are unable to share our version of the MIMIC-IV data directly due to the restrictions of the PhysioNet Credentialed Health Data License 1.5.0 under which the data is distributed.
- Running the Covertype experiments requires the associated data. You should not need to download it manually, as it can be fetched automatically via the scikit-learn library (see the snippet after this list).
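As a point of reference, the Covertype data can be pulled in through scikit-learn's built-in downloader, as in the minimal sketch below. The experiment scripts may load and preprocess the data differently; this only illustrates where the dataset comes from.

```python
from sklearn.datasets import fetch_covtype

# Downloads the Covertype dataset on first use and caches it locally
# (by default under ~/scikit_learn_data), then returns a Bunch object.
covertype = fetch_covtype()
X, y = covertype.data, covertype.target
print(X.shape, y.shape)  # expected: (581012, 54) (581012,)
```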
If you find our paper or code useful in your research, please consider citing the original work:
@inproceedings{10.1145/3715275.3732100,
author = {Dejl, Adam and Zhang, Dekai and Ayoobi, Hamed and Williams, Matthew and Toni, Francesca},
title = {Hidden Conflicts in Neural Networks and their Implications for Explainability},
year = {2025},
isbn = {9798400714825},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3715275.3732100},
doi = {10.1145/3715275.3732100},
booktitle = {Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency},
pages = {1498–1542},
numpages = {45},
keywords = {conflicts, explainable AI, feature attributions, interpretability, out-of-distribution, trustworthy AI},
series = {FAccT '25}
}