The rapid advancement of protein-ligand structure prediction tools, such as AlphaFold3 and RoseTTAFold All-Atom, presents a transformative opportunity for small-molecule drug discovery.
However, these tools are limited by one thing: data. Today’s models lack access to large, high-quality datasets of protein-ligand structures, which are essential for training AI to predict how drugs interact with the body.
Why OpenBind?
With OpenBind, we’re building the world’s largest open dataset of protein-ligand interactions. Rather than reacting to a particular challenge or serving one company’s drug pipeline, OpenBind will provide data for the entire scientific community. This will unlock:
- Smarter AI models that generalise across diseases and targets
- Better predictions of how molecules bind and behave
- Faster, more reliable drug discovery
OpenBind also tackles a critical gap in the field: the lack of consistent, public affinity data and real-world testing environments. By generating clean, scalable datasets and running regular blind prediction challenges, we’ll accelerate innovation and push AI tools past a key inflection point.
How will it work?
The OpenBind Project is designed to meet complex needs in a highly co-ordinated and streamlined way. Our mission is to deliver high-value open data and open models quickly and cost-effectively, enabling innovation and accessibility for all. The goal of this initiative is to accelerate the development of next-generation structure-based drug discovery tools, helping the technology move beyond a critical inflection point.
Key benefits of the OpenBind project include:
- High-Throughput Access:
  - Using Diamond’s XChem and MX beamlines for rapid fragment screening and structural analysis.
- Two-Way Data Flow:
  - Partners share compounds and metadata.
  - Diamond returns processed structures and FAIR-compliant datasets.
- Secure & Transparent:
  - Tiered-access portals ensuring confidentiality whilst enabling agreed data release timelines.






