Precedent finder

Example code for the Precedent finder project. The code works with the USPTO dataset.

The code was written to provide an open-source version of the tool. It was never intended to be production-ready, and it will not be maintained.

Prerequisites

Before you begin, ensure you have met the following requirements:

Linux, Windows or macOS platforms are supported - as long as the dependencies are supported on these platforms.
You have installed Anaconda or Miniconda with Python 3.8-3.11.

The tool has been developed on a Linux platform.

Installation

First clone the repository using Git.

Then execute the following commands in the root of the repository:

conda env create -f env.yml
conda activate pfenv

Finally, you need to download the pre-processed USPTO dataset to the data folder. The files are located at Zenodo and can be downloaded with

python data/download_data.py

Usage

To use the Precedent Finder tool, run the precedent_finder.py script:

conda activate pfenv
python precedent_finder.py --smiles REACTION_SMILES --output precedents.csv

or in interactive mode

conda activate pfenv
python precedent_finder.py

In this mode, you will be asked for one or more reaction SMILES, and the results are saved to individual CSV files.
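For several reactions, the non-interactive form can also be scripted. The following is a minimal sketch that assumes only the --smiles and --output options shown above; the helper names and the output file naming are hypothetical and not part of the tool.

```python
import subprocess
import sys
from typing import List


def build_command(rxn_smiles: str, output_csv: str) -> List[str]:
    """Build the command line for one non-interactive search."""
    return [
        sys.executable,          # the Python interpreter of the active environment
        "precedent_finder.py",
        "--smiles",
        rxn_smiles,
        "--output",
        output_csv,
    ]


def run_searches(reactions: List[str]) -> List[str]:
    """Run one search per reaction SMILES and return the CSV paths written."""
    outputs = []
    for index, rxn_smiles in enumerate(reactions, start=1):
        output_csv = f"precedents_{index}.csv"  # hypothetical naming scheme
        # check=True raises CalledProcessError if the script exits with an error
        subprocess.run(build_command(rxn_smiles, output_csv), check=True)
        outputs.append(output_csv)
    return outputs
```

Call run_searches from the root of the repository with the pfenv environment active, so that precedent_finder.py and its dependencies are found.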

Note: the reaction SMILES need to be atom-mapped. This can be accomplished with the rxnmapper project.
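As a sketch of what this preparation could look like, the snippet below first checks whether a reaction SMILES already carries atom-map numbers and, if not, maps it with rxnmapper. The check is a simple regular-expression heuristic, not a chemical validation, and both function names are hypothetical. The mapping function assumes the rxnmapper package is installed separately; it is imported lazily so the check works without it.

```python
import re
from typing import List

# Matches a bracket atom that carries a map number, e.g. [CH3:1] or [O-:2]
_ATOM_MAP = re.compile(r"\[[^\]]*:\d+\]")


def is_atom_mapped(rxn_smiles: str) -> bool:
    """Heuristic: True if both reactants and products contain mapped atoms."""
    parts = rxn_smiles.split(">")
    if len(parts) != 3:
        raise ValueError("expected a reaction SMILES of the form reactants>agents>products")
    reactants, _agents, products = parts
    return bool(_ATOM_MAP.search(reactants)) and bool(_ATOM_MAP.search(products))


def map_reactions(reactions: List[str]) -> List[str]:
    """Atom-map unmapped reaction SMILES with rxnmapper (third-party package)."""
    from rxnmapper import RXNMapper  # imported here so the check above has no dependency

    mapper = RXNMapper()
    results = mapper.get_attention_guided_atom_maps(reactions)
    # Each result is a dict holding the mapped reaction and a confidence score
    return [result["mapped_rxn"] for result in results]
```

A possible workflow is to pass only the reactions for which is_atom_mapped returns False through map_reactions before handing them to precedent_finder.py.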

The examples folder contains worked examples with a Jupyter notebook showing how the output can be analyzed/visualized.
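Before opening the notebook, a quick look at a result file can be useful. The sketch below reads a CSV file with the standard library and reports its column names and row count. It makes no assumption about which columns the tool writes; the function name is hypothetical, and precedents.csv is the output name used in the command above.

```python
import csv
from typing import List, Tuple


def summarize_csv(path: str) -> Tuple[List[str], int]:
    """Return the column names and the number of data rows in a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        n_rows = sum(1 for _row in reader)  # consumes the file, header excluded
        columns = list(reader.fieldnames or [])
    return columns, n_rows
```

For example, `columns, n_rows = summarize_csv("precedents.csv")` gives a first impression of how many precedents were found and which fields are available for analysis.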

Pre-processing data

To reproduce the pre-processing of the USPTO data, do the following:

First change to the data folder

cd data

Second, run the pipelines in the rxnutils package to process the USPTO data set, as described here

Finally, run the pipeline in the data folder with something like this

conda activate pfenv
python setup_pipeline.py run --nbatches 200 --max-workers 8 --max-num-splits 200

Contributors

Samuel Genheden
Christoph Bauer
Thierry Kogej
Per-Ola Norrby

The contributors have limited time for support questions.

License

The software is licensed under the Apache 2.0 license (see LICENSE file), and is free and provided as-is.

MolecularAI/precedent_finder

Folders and files

Latest commit

History

Repository files navigation

Precedent finder

Prerequisites

Installation

Usage

Pre-processing data

Contributors

License

References

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages