This repository contains well-established datasets for interpretable and reliable protein language model (pLM) benchmarking.
All included datasets are listed below. Details and files can be found in the respective folders.
To benchmark a new or existing pLM on these datasets, use one of the following tools:
- biotrainer: autoeval - Automatic evaluation of pLMs on our supervised benchmark datasets. You can find an example notebook here.
- biocentral: plm_eval (beta) - Automatic evaluation of pLMs on all benchmark datasets, including a visual leaderboard and model-to-model comparison.