Welcome to Protein Workshop’s documentation!#
This repository provides the code for the protein structure representation learning benchmark detailed in the paper Evaluating Representation Learning on the Protein Structure Universe.
In the benchmark, we implement numerous featurisation schemes, datasets for self-supervised pre-training and downstream evaluation, pre-training tasks, and auxiliary tasks.
The benchmark can be used as a working template for a protein representation learning research project, a library of drop-in components for use in your projects, or as a CLI tool for quickly running protein representation learning evaluation and pre-training configurations.
Processed datasets and pre-trained weights are made available. Downloading datasets is not required; upon first run all datasets will be downloaded and processed from their respective source.