"SAIR: Accelerating Pharma R&D with AI-Powered Structural Intelligence"
Episode Synopsis
This September 2025 paper describes SAIR, the Structurally Augmented IC50 Repository, an open-source dataset developed by SandboxAQ in collaboration with NVIDIA. With more than 5 million AI-generated 3D protein-ligand structures, each linked to experimentally measured drug potency data (IC₅₀ values), SAIR is presented as the largest publicly available collection of its kind. The dataset aims to close a critical data gap in AI-powered drug discovery: by pairing structures with potency measurements, it lets researchers accelerate R&D, explore novel drug targets, and improve the accuracy of AI models that predict drug properties. Creating SAIR took over 130,000 GPU hours of high-performance computing, and its structures were validated with industry-standard tools, achieving a 97% pass rate. SAIR is available free of charge for both commercial and non-commercial use on platforms like Hugging Face, with the goal of changing how pharmaceutical, biotech, and tech-bio leaders approach drug design and optimization.

Sources:
https://go.sandboxaq.com/rs/175-UKR-711/images/sair_paper.pdf
https://huggingface.co/datasets/SandboxAQ/SAIR
https://huggingface.co/blog/SandboxAQ/sair-data-accelerating-drug-discovery-with-ai