Job Summary

We are seeking a scientist to join our team at Iambic Therapeutics, working on data acquisition and curation for Enchant, our multimodal transformer model trained at scale on a wide variety of biomedical data. In this role, you will design and build agentic systems that acquire, clean, format, and quality-control the large-scale datasets that power Enchant training. You will work at the intersection of LLM-based automation and biomedical data engineering, developing AI agents that can navigate heterogeneous data sources, enforce quality standards, and operate reliably at scale. This role is ideal for candidates who combine strong software engineering instincts with scientific understanding of biomedical data, and who are excited about using LLMs as tools to solve practical data problems.

Key Responsibilities

- Design, build, and maintain agentic systems for automated data acquisition from public and proprietary biomedical data sources
- Develop LLM-based pipelines for data cleaning, normalization, and formatting across diverse data modalities (e.g., molecular, genomic, clinical, literature)
- Implement automated quality-control workflows that detect anomalies, flag inconsistencies, and enforce data standards
- Evaluate and iterate on agent architectures, prompting strategies, and tool-use patterns to improve reliability and throughput
- Collaborate with ML scientists on the Enchant team to understand data requirements and translate them into scalable acquisition and processing systems
- Monitor and maintain data pipelines in production, diagnosing failures and improving robustness over time
- Document data provenance, processing decisions, and quality metrics to support reproducibility and auditing

Qualifications

Required:

- Master's or PhD in a computational STEM field, or equivalent industry experience
- Strong Python engineering skills, including experience building and maintaining production-quality software
- Hands-on experience with LLM APIs (e.g., Claude, GPT) and agentic patterns such as tool use, orchestration, and multi-step reasoning
- Familiarity with biomedical or chemical data sources and formats (e.g., PDB, UniProt, ChEMBL, SDF/MOL, FASTA, or similar)
- Comfort with data engineering fundamentals: ETL design, data validation, and working with structured and unstructured data at scale

Desired:

- Experience with agent orchestration frameworks
- Familiarity with cloud infrastructure and workflow orchestration (e.g., AWS, Docker, Kubernetes)
- Knowledge of multimodal biomedical data spanning small molecules, proteins, assays, images, 'omics, and/or clinical records
- Experience with large-scale dataset construction or curation for ML model training

Location

Remote (US or UK). On-site available in Bristol, UK and Boston, US.

ABOUT IAMBIC THERAPEUTICS

Iambic is a clinical-stage life-science and technology company developing novel medicines using its AI-driven discovery and development platform. Based in San Diego and founded in 2020, Iambic has assembled a world-class team that unites pioneering AI experts and experienced drug hunters. The Iambic platform has demonstrated delivery of new drug candidates to human clinical trials with unprecedented speed and across multiple target classes and mechanisms of action. Iambic is advancing a pipeline of potential best-in-class and first-in-class clinical assets, both internally and in partnership, to address urgent unmet patient need. Learn more about the Iambic team, platform, pipeline, and partnerships at iambic.ai.

MISSION & CORE VALUES

Our mission is to deliver better medicines through innovations in AI-based discovery technologies.

The culture and work at Iambic Therapeutics are profoundly strengthened by the diversity of our people and our differences in background, culture, national origin, religion, sexual orientation, and life experiences. We are committed to building an inclusive environment where a diverse group of talented humans work together to discover therapeutics and create technologies.

PAY AND BENEFITS

We offer our team industry-leading competitive pay, company-paid healthcare, flexible spending accounts, voluntary life insurance, 401(k) matching, and uncapped vacation. We are in a brand-new, state-of-the-art facility in beautiful San Diego with an onsite gym, dining, and easy access to great places to live and play.

Salary: $148,000 - $210,000
Location: Boston, MA, US
Total raised: $306.0M
Last stage: Series C