AI in Life Science: Weekly Insights

Weekly Insights | March 9, 2025

Welcome back to your weekly dose of AI news for Life Science!

This week, we cover three new models: MDCrow, FusOn-pLM, and DiffPROTACs.

Dive into these game-changing innovations and explore how they are transforming the biotech and healthcare landscapes!

If you are interested in learning how to connect all these tools, reach out to us!

MDCrow: Automating Molecular Dynamics Simulations with AI 🤖

Molecular dynamics (MD) simulations play a crucial role in understanding biomolecular interactions, but their complexity makes them challenging to automate. MDCrow, developed at the University of Rochester, is an AI-driven agent that integrates 40+ expert-designed tools to streamline simulation setup, execution, and analysis. By leveraging large language models like GPT-4o and Llama3-405b, MDCrow automates tedious tasks, allowing researchers to focus on scientific discovery.

🔨 Applications:

Drug Discovery & Protein-Ligand Interactions - Automates binding studies, stability assessments, and force field optimisations to accelerate therapeutic research.
Structural Biology & Biomolecular Simulations - Predicts molecular motion, folding dynamics, and conformational changes to support protein engineering and material science.
Computational Chemistry & High-Throughput Screening - Enables large-scale MD simulations across HPC environments, optimising molecular designs with AI-driven analysis.

📌 Key Insights:

Automates MD Workflows: Handles PDB file preparation, force field selection, parameter optimisation, and analysis without human intervention.
LLM-Powered Flexibility: Uses GPT-4o and Llama3-405b, achieving >72% task completion on complex MD simulations.
Error Handling & Adaptability: Dynamically adjusts parameters, fixes missing force fields, and retrieves relevant literature to improve simulations.
Interactive & Adaptive Execution: Users can pause, resume, and refine simulations interactively, allowing step-by-step guidance or full automation.

FusOn-pLM: A Protein Language Model for Fusion Oncoproteins 🧬

Fusion oncoproteins, which arise from chromosomal translocations, drive pediatric and other cancers but are challenging therapeutic targets because of their intrinsic disorder and lack of druggable pockets. Existing protein language models (pLMs) like ESM-2 and ProtT5, trained on stable proteins, fail to capture fusion-specific features. FusOn-pLM is a pLM fine-tuned on 44,414 fusion oncoprotein sequences (FusOn-DB). It addresses this gap by providing biologically relevant embeddings for fusion oncoproteins, supporting therapeutic discovery for fusion-driven cancers.

🔨 Applications:

Drug Resistance Prediction: FusOn-pLM identifies resistance mutations in kinase fusions, aiding the design of therapies that anticipate resistance.
Disorder-Aware Biologics Design: Accurately predicts disordered regions and IDR properties, informing the design of antibodies, peptides, or degraders targeting fusion-specific features.
Phase Separation Analysis: Predicts puncta formation and localization (nuclear/cytoplasmic), offering insights into condensate-driven oncogenic mechanisms.

📌 Key Insights:

Performance: Outperforms ESM-2, ProtT5, and FOdb embeddings in fusion-specific tasks (e.g., puncta prediction AUROC >0.9) and ranks in the top 5 in CAID2 disorder prediction (AUROC=0.825).
Training Data: Curated FusOn-DB with 44,414 sequences, including rare fusions, validated against AlphaFold2 structures and experimental datasets.
Innovative Training: Cosine-scheduled masking (15%–40%) improves sequence reconstruction (pseudo-perplexity=3.61) and feature extraction, balancing context learning and difficulty.
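
To make the cosine-scheduled masking idea concrete, here is a minimal sketch in plain Python. It assumes the masking rate rises smoothly from 15% to 40% along a half cosine over training; the exact schedule shape and direction used by FusOn-pLM are not stated above, and the function names, the `<mask>` token string, and the toy sequence are hypothetical illustrations rather than the authors' code.

```python
import math
import random


def cosine_mask_rate(step, total_steps, low=0.15, high=0.40):
    """Masking rate that moves from `low` to `high` along a half cosine.

    Assumption: the rate increases with training progress. The source only
    states the 15%-40% range, not the direction or exact formula.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return low + (high - low) * 0.5 * (1.0 - math.cos(math.pi * progress))


def mask_sequence(seq, rate, rng, mask_token="<mask>"):
    """Replace roughly `rate` of the residues in `seq` with a mask token."""
    n_mask = max(1, round(len(seq) * rate))
    positions = set(rng.sample(range(len(seq)), n_mask))
    return [mask_token if i in positions else aa for i, aa in enumerate(seq)]


if __name__ == "__main__":
    rng = random.Random(0)
    toy_sequence = "MKTAYIAKQRQISFVKSHFS"  # hypothetical 20-residue example
    for step in (0, 25, 50, 75, 100):
        rate = cosine_mask_rate(step, total_steps=100)
        masked = mask_sequence(toy_sequence, rate, rng)
        print(f"step {step:3d}  rate {rate:.3f}  masked {masked.count('<mask>')}")
```

Under this assumed schedule the model sees lightly masked sequences early on, when context is easiest to learn from, and progressively harder reconstruction problems later, which matches the "balancing context learning and difficulty" framing above.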

DiffPROTACs: Deep Learning-Based Model for PROTAC Design 🔬

PROTACs (PROteolysis TArgeting Chimeras) are heterobifunctional small molecules consisting of a ligand for the target protein (the warhead), a linker, and a ligand that recruits an E3 ligase. They have emerged as a promising drug discovery strategy, but rational PROTAC design remains challenging, particularly linker generation. DiffPROTACs is a diffusion model that combines Transformers and graph neural networks (GNNs) to generate new PROTAC linkers with high validity, conditioned on the spatial structure of the warhead and the E3 ligand.

🔨 Applications:

Targeting ‘Undruggable’ Proteins – By designing linkers that enable PROTACs to degrade previously inaccessible proteins, DiffPROTACs opens new therapeutic avenues for diseases that currently lack effective treatments.

📌 Key Insights:

Employs the OEGT module, integrating GNN and Transformer architectures to ensure rotational equivariance within the model.
Achieves 93.86% validity for generated PROTACs.
Demonstrates comparable performance to the current state-of-the-art models (such as DiffLinker and DeLinker) on FBDD data.
Patterns learned by DiffPROTACs were used to augment the PROTAC dataset with 1,724,424 unique linkers, yielding a dataset of 2,601,818 PROTACs for further research.
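
The rotational equivariance mentioned in the key insights can be illustrated with a toy check. This is not the OEGT module, only a sketch of the property itself: rotating the input coordinates should rotate derived geometric outputs in the same way (equivariance), while quantities such as interatomic distances stay unchanged (invariance). The coordinates and helper names below are hypothetical.

```python
import math


def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def pairwise_distances(points):
    """All interatomic distances, in a fixed pair order."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]


def centroid(points):
    """Geometric center of a set of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


# Hypothetical toy coordinates standing in for a few atoms.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3)]
angle = math.pi / 3
rotated = rotate_z(atoms, angle)

# Invariance: distances do not change under rotation.
same_distances = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(pairwise_distances(atoms), pairwise_distances(rotated))
)

# Equivariance: the centroid of the rotated atoms equals the rotated centroid.
rotated_centroid = rotate_z([centroid(atoms)], angle)[0]
same_centroid = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(centroid(rotated), rotated_centroid)
)

print("distances invariant:", same_distances)
print("centroid equivariant:", same_centroid)
```

A generative model with this property does not need to see every orientation of a warhead and E3 ligand pair during training, since its output follows the input's orientation by construction.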

Kiin Bio Weekly
