An index of datasets, SDKs, APIs, and other open-source code created by Microsoft researchers and shared with the broader academic community. We also maintain a collection highlighting some of the tools you’ll find here.
MatterGen
MatterGen is a generative model for inorganic materials design across the periodic table that can be fine-tuned to steer the generation towards a wide range of property constraints.
HeurAgenix
HeurAgenix is an LLM-based framework designed to generate, evolve, evaluate, and select heuristic algorithms for solving combinatorial optimization problems. It leverages the power of large language models to autonomously handle various optimization…
MarS
MarS is a cutting-edge financial market simulation engine powered by the Large Market Model (LMM), a generative foundation model.
Reducio Variational Autoencoder (Reducio-VAE)
Reducio-VAE is a model for encoding videos into an extremely small latent space. It is part of Reducio-DiT, a highly efficient video generation method. Reducio-VAE encodes a 16-frame video clip to T/4∗H/32∗W/32…
TamGen
This is the implementation of the paper “TamGen: Target-aware Molecule Generation for Drug Design Using a Chemical Language Model”.
RAD-DINO model
RAD-DINO is a vision transformer model trained to encode chest X-rays using the self-supervised learning method DINOv2. RAD-DINO is described in detail in RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision (F. Pérez-García, H. Sharma, S.…