Simple Contrastive Embedding of the Primary sequence of T cell Receptors

yutanagano/sceptr

Check out the documentation page.


SCEPTR (Simple Contrastive Embedding of the Primary sequence of T cell Receptors) is a small, fast, and accurate TCR representation model for alignment-free TCR analysis, including TCR-pMHC interaction prediction and TCR clustering (metaclonotype discovery). Our preprint demonstrates that SCEPTR can be used for few-shot TCR specificity prediction with improved accuracy over previous methods.

SCEPTR is a BERT-like transformer-based neural network implemented in PyTorch. The default model provides best-in-class performance with only 153,108 parameters (typical protein language models have tens or hundreds of millions), so SCEPTR runs fast, even on a CPU! And if your computer does have a CUDA-enabled GPU, the sceptr package will automatically detect and use it, giving you blazingly fast performance without the hassle.
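
If you want to know in advance whether a CUDA device is visible on your machine, a minimal sketch like the one below may help. It only asks PyTorch what it can see; it does not reproduce sceptr's own detection logic, which is not described here. The helper name report_torch_device is hypothetical and not part of the sceptr package.

```python
import importlib.util


def report_torch_device() -> str:
    """Return "cuda" or "cpu" according to PyTorch, or a note if torch is missing.

    Hypothetical helper for illustration only; sceptr picks its device
    on its own and does not need this to be called.
    """
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"

    import torch  # imported lazily so the sketch also runs without PyTorch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(report_torch_device())
```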

sceptr's API exposes three intuitive functions: calc_vector_representations, calc_cdist_matrix, and calc_pdist_vector. These are all you need to make full use of the SCEPTR models. What's even better is that they are fully compliant with pyrepseq's tcr_metric API, so sceptr will fit snugly into the rest of your repertoire analysis workflow.
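
As a rough illustration, the three functions might be called as sketched below. Several details are assumptions rather than statements from this README: that each function takes a pandas DataFrame with the columns TRAV, CDR3A, TRAJ, TRBV, CDR3B, and TRBJ; that calc_cdist_matrix takes two such tables; and that calc_pdist_vector returns a condensed vector with one entry per unordered pair. The example TCR rows and the helper names run_sceptr and condensed_length are hypothetical. Please check the documentation page for the exact input format.

```python
# Assumed input columns (see the documentation for the authoritative format).
TCR_COLUMNS = ["TRAV", "CDR3A", "TRAJ", "TRBV", "CDR3B", "TRBJ"]

# Illustrative rows only; replace with your own paired-chain TCR data.
EXAMPLE_TCRS = [
    ("TRAV38-1*01", "CAHRSAGGGTSYGKLTF", "TRAJ52*01",
     "TRBV2*01", "CASSEFQGDNEQFF", "TRBJ2-1*01"),
    ("TRAV3*01", "CAVDNARLMF", "TRAJ31*01",
     "TRBV25-1*01", "CASSDGSFNEQFF", "TRBJ2-1*01"),
    ("TRAV13-2*01", "CAERIRKGQVLTGGGNKLTF", "TRAJ10*01",
     "TRBV9*01", "CASSVGDLLTGELFF", "TRBJ2-2*01"),
]


def condensed_length(n: int) -> int:
    """Number of unordered pairs among n TCRs (expected pdist vector length)."""
    return n * (n - 1) // 2


def run_sceptr(rows=EXAMPLE_TCRS):
    """Call the three sceptr functions on a small table of TCRs.

    pandas and sceptr are imported lazily, so they are only required
    when this function is actually called.
    """
    import pandas as pd
    import sceptr

    tcrs = pd.DataFrame(rows, columns=TCR_COLUMNS)

    vectors = sceptr.calc_vector_representations(tcrs)  # one vector per TCR
    cdist = sceptr.calc_cdist_matrix(tcrs, tcrs)        # all-against-all distances
    pdist = sceptr.calc_pdist_vector(tcrs)              # pairwise distances within tcrs
    return vectors, cdist, pdist
```

With sceptr installed, calling run_sceptr() returns the three arrays. Under the assumptions above, the pdist result for the three example TCRs would have condensed_length(3) entries.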

Installation

pip install sceptr

Citing SCEPTR

Please cite our preprint.

BibTeX

@misc{nagano2024contrastive,
      title={Contrastive learning of T cell receptor representations}, 
      author={Yuta Nagano and Andrew Pyo and Martina Milighetti and James Henderson and John Shawe-Taylor and Benny Chain and Andreas Tiffeau-Mayer},
      year={2024},
      eprint={2406.06397},
      archivePrefix={arXiv},
      primaryClass={q-bio.BM}
}