Inside A-alpha Bio: Negative Data, AlphaSeq & Why Confidence ≠ Affinity

Better protein-design models need better data, and the data we're missing is what fails, not what works. Leo Wan and Michael Holden talk with Joseph (principal scientist) and Natasha (director of applied data science & ML) from A-alpha Bio about AlphaSeq, their library-on-library yeast-display platform that measures millions of protein-protein interactions per experiment; why negative data and "hard negatives" are the next frontier; their AlphaBind dataset of 7.5M antibody-antigen interactions; the SEPIA idea of designing binders against antibodies; and why confidence scores tell you a structure is valid, not whether it will bind.

▸ Work with RANOMICS: https://www.ranomics.com

CHAPTERS
0:00 Welcome & meet A-alpha Bio (Joseph & Natasha)
3:27 What is A-alpha Bio / AlphaSeq
5:36 The case for negative data
8:55 Hard negatives, explained
11:17 AlphaBind: 7.5M binding interactions
12:19 Optimizing antibodies: "run it till it breaks"
16:53 What you actually do with the data
21:43 SEPIA: designing binders against antibodies
28:49 Confidence is not biophysics
33:41 What affinity should you expect?
36:03 The non-specific binding problem
43:34 Benchmarking models & the ATLAS consortium
46:08 One takeaway each
49:19 How to work with A-alpha Bio

#proteindesign #machinelearning #antibodies #drugdiscovery #data