Logan Hallee lhallee

👋 Hi, I’m @lhallee

My name is Logan Hallee, and I am a PhD Candidate in Bioinformatics Data Science at the University of Delaware (Gleghorn Lab) specializing in curating high-dimensional feature spaces for biological data. My primary focus is on protein design and annotation using transformer neural networks. Techniques I developed led to the creation of SYNTERACT, the first large language model approach to protein-protein interaction prediction, which ranks in the top 3% of research outputs by Altmetric.

At the Wolfram Winter School, I collaborated with Stephen Wolfram and other mentors to create "Tetris For Proteins," a shape-based metric for protein-protein interactions that emulates lock-and-key enzyme-substrate dynamics, generating hypotheses about protein aggregation likelihood.

I created the Annotation Vocabulary, a unique set of integers mapped to popular protein and gene ontologies, enabling state-of-the-art protein annotation and generation models when used with its own token embedding.
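
The core idea, giving each ontology term its own integer, can be sketched roughly as below. This is a minimal illustration, not the actual Annotation Vocabulary: the function names `build_annotation_vocab` and `encode`, the choice to reserve 0 for padding, and the example term identifiers are all assumptions made for the sketch.

```python
from typing import Dict, Iterable, List


def build_annotation_vocab(terms: Iterable[str], offset: int = 1) -> Dict[str, int]:
    """Assign each distinct ontology term a unique integer ID.

    IDs start at `offset` so that 0 stays free for padding (an assumption
    of this sketch, not a documented property of the real vocabulary).
    """
    vocab: Dict[str, int] = {}
    for term in terms:
        if term not in vocab:
            vocab[term] = len(vocab) + offset
    return vocab


def encode(annotations: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map a protein's annotation terms to sorted integer tokens, skipping unknown terms."""
    return sorted(vocab[t] for t in annotations if t in vocab)


# Illustrative term identifiers only; duplicates are collapsed to one ID.
terms = ["GO:0005524", "GO:0016301", "EC:2.7.11.1", "GO:0005524"]
vocab = build_annotation_vocab(terms)
ids = encode(["EC:2.7.11.1", "GO:0005524"], vocab)
```

The resulting integer tokens could then index a dedicated embedding table, separate from the amino acid token embedding.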

My work also supports the paradigm of codon usage bias as a key biological phenomenon for phylogenetic analysis. Our models, published in Nature Scientific Reports, highlight codon usage as a unique phylogenetic predictor. Our lab recently produced cdsBERT, showcasing cost-effective techniques to enhance the biological relevance of protein language models using a codon vocabulary.
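
For readers unfamiliar with codon usage, a small sketch of the underlying quantity may help: the relative frequency of each codon in a coding sequence. This is only a generic illustration under simple assumptions (an in-frame DNA sequence, trailing partial codons dropped); the function name `codon_frequencies` is hypothetical and this is not the feature pipeline used in our published models or in cdsBERT.

```python
from collections import Counter
from typing import Dict


def codon_frequencies(cds: str) -> Dict[str, float]:
    """Return the relative frequency of each codon in an in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Toy sequence: ATG GCT GCT GCC TAA
freqs = codon_frequencies("ATGGCTGCTGCCTAA")
```

Here GCT and GCC both encode alanine, so their unequal frequencies in the toy sequence are a tiny example of the synonymous codon preference that codon usage bias refers to.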

In natural language processing, I invented a Mixture of Experts extension for scalable transformer networks adept at sentence similarity tasks. We believe future networks with N experts will perform like N independently trained networks, offering significant time and computational savings for vector retrieval systems and search relying on semantic vector representations.

I also manage lab projects in computer vision, utilizing deep learning to reconstruct anatomically accurate 3D organs from 2D Z-stacks, informing morphometric and pharmacokinetic studies.

Some other stuff I've worked on over the years:

featureranker, a Python package for feature ranking
My textbook section about Protein Language Models
Machine learning to identify cardioprotective molecules in minority groups
Writing about the relationships of Hsp90 and Gamma secretase in cardiac diseases

Norway, ME ➔ Newark, DE
