A semantic map of the NeurIPS '24 Workshop on AI for New Drug Modalities

will connell · December 11, 2024

I built a semantic map that organizes research papers by analyzing their titles, abstracts, and keywords. The visualization highlights thematic clusters, which I hope makes it easier to explore related research across domains. The themes that link these papers are diverse, spanning biology, chemistry, problem domain, and technical method.

To explore the papers interactively, select a paper that interests you and discover how it connects to others sharing similar themes within its cluster (toggle ‘Semantic Clusters’). Below the map is a table detailing all the papers, organized by cluster.




A semantic map of AIDrugX Workshop papers

Title Authors Type Link
Cluster 1
DiffER: Categorical Diffusion Models for Chemical Retrosynthesis Sean Current ... Srinivasan Parthasarathy Accept Open Paper
Homomorphism Counts as Structural Encodings for Molecular Property Prediction Linus Bao ... Matthias Lanzinger Accept Open Paper
SmileyLlama: Modifying Large Language Models for Directed Chemical Space Exploration Joe Cavanagh ... Thomas D. Bannister Accept Open Paper
An Efficient Tokenization for Molecular Language Models Seojin Kim ... Jinwoo Shin Accept Open Paper
Chain-of-thoughts for molecular understanding Yunhui Jang ... Sungsoo Ahn Accept Open Paper
Cluster 2
Deep Interactions for Multimodal Molecular Property Prediction Patrick Soga ... Jundong Li Accept Open Paper
Geometry-text Multi-modal Foundation Model for Reactivity-oriented Molecule Editing Haorui Li ... Anima Anandkumar Accept Open Paper
3D Interaction Geometric Pre-training for Molecular Relational Learning Namkyeong Lee ... Chanyoung Park Accept Open Paper
GFlowNet Pretraining with Inexpensive Rewards Mohit Pandey ... Emmanuel Bengio Accept Open Paper
MolKD: Distilling Cross-Modal Knowledge in Chemical Reactions for Molecular Property Prediction Liang Zeng Accept Open Paper
Molphenix: A Multimodal Foundation Model for PhenoMolecular Retrieval Philip Fradkin ... Dominique Beaini Accept Open Paper
Learning Molecular Representation in a Cell Gang Liu ... Shantanu Singh Accept Open Paper
Understanding the Sources of Performance in Deep Drug Response Models Reveals Insights and Improvements Nikhil Branson ... Conrad Bessant Accept Open Paper
Reinforcement Learning for Enhanced Targeted Molecule Generation Via Language Models Salma J. Ahmed, Emad A. Mohammed Accept Open Paper
SMORE-DRL: Scalable Multi-Objective Robust and Efficient Deep Reinforcement Learning for Molecular Optimization Aws Al Jumaily ... Yifeng Li Accept Open Paper
Pharmacophore-based design by learning on voxel grids Omar Mahmood ... Vishnu Sresht Accept Open Paper
MF-LAL: Drug Compound Generation Using Multi-Fidelity Latent Space Active Learning Peter Eckmann ... Rose Yu Accept Open Paper
GNNAS-Dock: Budget Aware Algorithm Selection with Graph Neural Networks for Molecular Docking Yiliang Yuan, Mustafa Misir Accept Open Paper
Similarity-Quantized Relative Difference Learning for Improved Molecular Activity Prediction Karina Zadorozhny ... Colin A Grambow Accept Open Paper
Cluster 3
Improved Off-policy Reinforcement Learning in Biological Sequence Design Hyeonah Kim ... Jinkyoo Park Accept Open Paper
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design Chenyu Wang ... Aviv Regev Accept Open Paper
Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding Xiner Li ... Masatoshi Uehara Accept Open Paper
Designing DNA With Tunable Regulatory Activity Using Discrete Diffusion Anirban Sarkar ... Peter K Koo Accept Open Paper
Latent Diffusion Models for Controllable RNA Sequence Generation Kaixuan Huang ... Mengdi Wang Accept Open Paper
Cluster 4
Modeling Complex System Dynamics with Flow Matching Across Time and Conditions Martin Rohbeck ... Romain Lopez Accept Open Paper
Correlational Lagrangian Schrodinger Bridge: Learning Dynamics with Population-Level Regularization Yuning You ... Yang Shen Accept Open Paper
Cluster 5
Training-Free Guidance with Applications to Protein Engineering Lewis Cornwall ... Aaron Sim Accept Open Paper
ProtPainter: Draw or Drag Protein via Topology-guided Diffusion Zhengxi Lu ... Min Zhang Accept Open Paper
TaxDiff: Taxonomic-Guided Diffusion Model for Protein Sequence Generation Zongying Lin ... Yonghong Tian Accept Open Paper
Structure Language Models for Protein Conformation Generation Jiarui Lu ... Jian Tang Accept Open Paper
MeMDLM: De Novo Membrane Protein Design with Masked Discrete Diffusion Protein Language Models Shrey Goel ... Pranam Chatterjee Accept Open Paper
Improving Antibody Design with Force-Guided Sampling in Diffusion Models Paulina Kulytė ... Pietro Lio Accept Open Paper
JAMUN: Transferable Molecular Conformational Ensemble Generation with Walk-Jump Sampling Ameya Daigavane ... Joseph Kleinhenz Accept Open Paper
Learning Protocols for Non-Equilibrium Conformational Free-Energy Estimation Using Optimal Transport and Conditional Flow Matching Lars Holdijk ... Max Welling Accept Open Paper
Generalized Flow Matching for Transition Dynamics Modeling Haibo Wang ... Yuanqi Du Accept Open Paper
Cluster 6
A Deep Generative Model for the Design of Synthesizable Ionizable Lipids Yuxuan Ou ... José Miguel Hernández-Lobato Accept Open Paper
Generative Model for Synthesizing Ionizable Lipids: A Monte Carlo Tree Search Approach Jingyi Zhao ... José Miguel Hernández-Lobato Accept Open Paper
Cluster 7
Discrete Diffusion Schrödinger Bridge Matching for Graph Transformation Jun Hyeong Kim ... Woo Youn Kim Accept Open Paper
Improving Molecular Graph Generation with Flow Matching and Optimal Transport Xiaoyang Hou ... Shiwei Sun Accept Open Paper
Directly Optimizing for Synthesizability in Generative Molecular Design using Retrosynthesis Models Jeff Guo, Philippe Schwaller Spotlight Open Paper
Generative Flows on Synthetic Pathway for Drug Design Seonghwan Seo ... Woo Youn Kim Accept Open Paper
Molecular Generation with State Space Sequence Models Anri Lombard ... Jan Buys Accept Open Paper
Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation Jeff Guo, Philippe Schwaller Accept Open Paper
Best Practices for Multi-Fidelity Bayesian Optimization in Materials and Molecular Research Victor Sabanza Gil ... Loïc Roch Accept Open Paper
MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks Nayoung Kim ... Sungsoo Ahn Accept Open Paper
Improving Structural Plausibility in 3D Molecule Generation via Property-Conditioned Training with Distorted Molecules Lucy Vost ... Charlotte Deane Accept Open Paper
Applications of Modular Co-Design for De Novo 3D Molecule Generation Danny Reidenbach ... Saee Gopal Paliwal Accept Open Paper
Cluster 8
Modeling variable guide efficiency in pooled CRISPR screens with ContrastiveVI+ Ethan Weinberger ... Ryan Conrad Accept Open Paper
Benchmarking Transcriptomics Foundation Models for Perturbation Analysis: one PCA still rules them all Ihab Bendidi ... Alisandra Kaye Denton Accept Open Paper
PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis Yan Wu ... Thore Graepel Spotlight Open Paper
PertEval-scFM: Benchmarking Single-Cell Foundation Models for Perturbation Effect Prediction Aaron Wenteler ... César Miguel Valdez Córdova Accept Open Paper
Weighted Diversified Sampling for Efficient Data-Driven Single-Cell Gene-Gene Interaction Discovery Yifan Wu ... Zhaozhuo Xu Accept Open Paper
Active learning for efficient discovery of optimal gene combinations in the combinatorial perturbation space Jason Qin ... Yuhan Hao Accept Open Paper
CancerFoundation: A single-cell RNA sequencing foundation model to decipher drug resistance in cancer Alexander Theus ... Valentina Boeva Accept Open Paper
Signals in the Cells: Multimodal and Contextualized Machine Learning Foundations for Therapeutics Alejandro Velez-Arce ... Marinka Zitnik Spotlight Open Paper
Learning multi-cellular representations of single-cell transcriptomics data enables characterization of patient-level disease states Tianyu Liu ... Graham Heimberg Accept Open Paper
Scaling Dense Representations for Single Cell Gene Expression with Transcriptome-Scale Context Nicholas Ho ... Eric P. Xing Accept Open Paper
Cell ontology guided transcriptome foundation model Xinyu Yuan ... Jian Tang Accept Open Paper
Cluster 9
Video Representation Learning of Cardiac MRI for Genetic Discovery Matt Sooknah ... Jun Xu Accept Open Paper
Representation Learning based Target Discovery from UKBB MRI data Sivaramakrishnan Sankarapandian ... Jun Xu Accept Open Paper
Cluster 10
Foundational Model-aided Automatic High-throughput Drug Screening Using Self-controlled Cohort Study Shenbo Xu ... Kenney Ng Accept Open Paper
TrialDura: Hierarchical Attention Transformer for Interpretable Clinical Trial Duration Prediction Ling Yue ... Tianfan Fu Accept Open Paper
Interpretable Causal Representation Learning for Biological Data in the Pathway Space Jesus de la Fuente Cedeño ... Mikel Hernaez Accept Open Paper
Learning to refine domain knowledge for biological network inference Peiwen Li, Menghua Wu Accept Open Paper
FluxGAT: Integrating Flux Sampling with Graph Neural Networks for Unbiased Gene Essentiality Classification Kieren Sharma ... Lucia Marucci Accept Open Paper
Small-cohort GWAS discovery with AI over massive functional genomics knowledge graph Kexin Huang ... Jure Leskovec Accept Open Paper
Leveraging Disease-Specific Topologies and Counterfactual Relationships in Knowledge Graphs for Inductive Reasoning in Drug Repurposing Cerag Oguztuzun ... Rong Xu Spotlight Open Paper
A Foundational Multi-Modal Knowledge Graph for Pancreatic Cancer Drug Effects Prediction Jingwen Hui ... Anima Anandkumar Accept Open Paper
Cluster 11
TCRGenesis: Generation of SIINFEKL-specific T-cell receptor sequences using autoregressive Transformer Yang An ... Benjamin Schubert Accept Open Paper
Modeling CAR Response at the Single-Cell Level Using Conditional OT Alice Driessen ... Marianna Rapsomaniki Accept Open Paper
Disentangling the Peptide Space: A Contrastive Approach with Wasserstein Autoencoders Mihir Agarwal, Progyan Das Accept Open Paper
Epitope Generation for Peptide-based Cancer Vaccine using Goal-directed Wasserstein Generative Adversarial Network with Gradient Penalty Yen-Che Hsiao, Abhishek Dutta Accept Open Paper
Cluster 12
LLMs are Highly-Constrained Biophysical Sequence Optimizers Angelica Chen ... Nathan C. Frey Spotlight Open Paper
Language Models for Text-guided Protein Evolution Zhanghan Ni ... Anima Anandkumar Accept Open Paper
Distilling Structural Representations into Protein Sequence Models Jeffrey Ouyang-Zhang ... Daniel Jesus Diaz Accept Open Paper
LatentDE: Latent-based Directed Evolution accelerated by Gradient Ascent for Protein Sequence Design Thanh V. T. Tran ... Truong Son Hy Accept Open Paper
PLMFit: Benchmarking Transfer Learning with Protein Language Models for Protein Engineering Thomas Bikias ... Sai T. Reddy Accept Open Paper
Harnessing Preference Optimisation in Protein LMs for Hit Maturation in Cell Therapy Katarzyna Janocha ... Nils Yannick Hammerla Accept Open Paper
Alignment-based and protein foundation models for viral evolution, vaccines and vectors Sarah Gurev ... Debora Susan Marks Spotlight Open Paper
Exploring Log-Likelihood Scores for Ranking Antibody Sequence Designs Talip Ucar ... Ferran Gonzalez Accept Open Paper
EnzymeFlow: Generating Reaction-specific Enzyme Catalytic Pockets through Flow Matching and Co-Evolutionary Dynamics Chenqing Hua ... Shuangjia Zheng Accept Open Paper
Evaluating synergies among generative design models for multi-objective optimization of drug-like proteins Jung-Eun Shin ... Ryan Peckner Accept Open Paper
Machine learning enables engineering of potent, specific, and therapeutically developable proteases Jung-Eun Shin ... Ryan Peckner Accept Open Paper
IgBlend: Unifying 3D Structure and Sequence for Antibody LLMs Cedric Malherbe, Talip Ucar Spotlight Open Paper
Antibody Library Design by Seeding Linear Programming with Inverse Folding and Protein Language Models Conor F. Hayes ... Mikel Landajuela Accept Open Paper
Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences Alan Nawzad Amin ... Andrew Gordon Wilson Spotlight Open Paper
Computational Antigen Optimization through Symbolic Optimization and Affinity Maturation Simulation Jonathan G. Faris ... Felipe Leno da Silva Spotlight Open Paper
Cluster 13
DeepADAR: A deep learning approach to model regulatory elements of ADAR-based RNA editing and its application to gRNA design Andrew J Jung ... Brendan Frey Spotlight Open Paper
Detection of RNA Editing Sites by GPT Fine-tuning Zohar Rosenwasser ... Gal Oren Accept Open Paper
Mixture of Experts Enable Efficient and Effective Protein Understanding and Design Ning Sun ... Eric P. Xing Spotlight Open Paper
A Large-Scale Foundation Model for RNA Function and Structure Prediction Shuxian Zou ... Eric P. Xing Spotlight Open Paper
Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale Caleb Ellington ... Le Song Accept Open Paper
Orthrus: Towards Evolutionary and Functional RNA Foundation Models Philip Fradkin ... Bo Wang Spotlight Open Paper
HELM: Hierarchical Encoding for mRNA Language Modeling Mehdi Yazdani-Jahromi ... Rui Liao Accept Open Paper
Beyond Sequence: Impact of Geometric Context for RNA Property Prediction Junjie Xu ... Rui Liao Accept Open Paper
ML-driven design of 3’ untranslated regions for mRNA stability Alyssa Kramer Morrow ... Uri Laserson Accept Open Paper
Cluster 14
AlphaFold3, a secret sauce for predicting mutational effects on protein-protein interactions Wei Lu ... Shuangjia Zheng Accept Open Paper
BindingGYM: A Large-Scale Mutational Dataset Toward Deciphering Protein-Protein Interactions Wei Lu ... Shuangjia Zheng Accept Open Paper
A Deep Learning Approach for RNA-Compound Interaction Prediction with Binding Site Interpretability Haelee Bae, Hojung Nam Spotlight Open Paper
AptaBLE: A Deep Learning Platform for SELEX Optimization Sawan Patel ... Sherwood Yao Accept Open Paper
DeepProtein: Deep Learning Library and Benchmark for Protein Sequence Learning Jiaqing Xie ... Tianfan Fu Spotlight Open Paper
Understanding Protein-DNA Interactions by Paying Attention to Protein and Genomics Foundation Models Dhruva Rajwade ... Anima Anandkumar Accept Open Paper
Cluster 15
Bridging the Gap between Database Search and De Novo Peptide Sequencing with SearchNovo Jun Xia ... Stan Z. Li Spotlight Open Paper
Effective Protein-Protein Interaction Exploration with PPIretrieval Chenqing Hua ... Shuangjia Zheng Accept Open Paper
PQA: Zero-shot Protein Question Answering for Free-form Scientific Enquiry with Large Language Models Eli M Carrami, Sahand Sharifzadeh Spotlight Open Paper
GeneBench: Systematic Evaluation of Genomic Foundation Models and Beyond Zicheng Liu ... Stan Z. Li Accept Open Paper
Diverse Genomic Embedding Benchmark for functional evaluation across the tree of life. Jacob West-Roberts ... Yunha Hwang Accept Open Paper
MV-CLAM: Multi-View Molecular Interpretation with Cross-Modal Projection via Language Model Sumin Ha ... Sun Kim Accept Open Paper
Natural Language Prompts Guide the Design of Novel Functional Protein Sequences Niksa Praljak ... Andrew Ferguson Accept Open Paper
Probing the Embedding Space of Protein Foundation Models through Intrinsic Dimension Analysis Soojung Yang ... Rafael Gomez-Bombarelli Accept Open Paper
PepDoRA: A Unified Peptide Language Model via Weight-Decomposed Low-Rank Adaptation Leyao Wang ... Pranam Chatterjee Accept Open Paper

Methods

  • Data pulled from the AIDrugX Workshop OpenReview portal here
  • Find the workshop website here
  • Semantic embeddings were generated with Sentence-BERT from paper titles, abstracts, and keywords
  • PCA was applied to reduce dimensionality for visualization
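
The embedding and projection steps above can be sketched roughly as follows. This is a minimal illustration, not the exact pipeline behind the map: the model name `all-MiniLM-L6-v2`, the paper dictionary fields, and the helper names `paper_text`, `embed_papers`, and `pca_2d` are all assumptions, and the PCA here is a plain NumPy SVD standing in for whichever implementation was actually used.

```python
import numpy as np


def paper_text(paper):
    """Join title, abstract, and keywords into one string to embed."""
    return " ".join([paper["title"], paper["abstract"], " ".join(paper["keywords"])])


def embed_papers(papers, model_name="all-MiniLM-L6-v2"):
    """Embed each paper with a Sentence-BERT model (model name is an assumption)."""
    # Imported lazily so the PCA helper below works without this dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode([paper_text(p) for p in papers])


def pca_2d(embeddings):
    """Project embeddings onto their top two principal components."""
    X = np.asarray(embeddings, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:2].T
```

With a list of paper dictionaries in hand, `pca_2d(embed_papers(papers))` would return one (x, y) coordinate per paper, ready to plot.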

This is a cross post from my Behind BioML substack. Find the original here.
