Principal Research Scientist
Overview
We are seeking a highly analytical and mathematically rigorous Principal Research Scientist to develop next-generation algorithms for encoding digital information into DNA. This role sits at the intersection of information theory, computational science, and experimental biology, with a primary focus on translating theoretical constructs into scalable wet-lab and computational workflows.
The successful candidate will play a central role in building a “codex” framework for DNA-based data storage, beginning with reproducing published work and advancing toward optimized, scalable encoding systems.
Key Responsibilities
- Reproduce and validate existing DNA data encoding methods from published literature
- Develop and refine algorithms for encoding digital data into DNA sequences, grounded in Shannon information theory and entropy principles
- Design and execute experiments to test encoding efficiency, error rates, and scalability
- Collaborate across computational and wet-lab domains to integrate theoretical models with experimental workflows
- Optimize protocols for robustness, throughput, and cost-efficiency
- Use available GPU resources for simulation, modeling, and optimization
- Contribute to the development of a scalable “codex” architecture for DNA data storage
- Document findings and support knowledge transfer within a cross-disciplinary team
Candidate Profile
Essential Requirements:
- Strong mathematical foundation, particularly in:
  - Shannon information theory
  - Entropy and coding theory
  - Probabilistic modeling and abstract problem formulation
- Ability to translate theoretical constructs into practical algorithms and experimental designs
- Experience working across disciplines (e.g., physics, mathematics, computer science, or related fields)
- Demonstrated capability to independently reproduce and critically assess published research
- Proficiency in programming (e.g., Python, C++, or similar) and experience working with computational resources such as GPUs
Highly Desirable:
- Background in astrophysics, theoretical physics, or other mathematically intensive domains
- Familiarity with DNA data storage concepts or synthetic biology workflows
- Experience with error-correcting codes or data compression algorithms
- Exposure to machine learning or AI methodologies applied to scientific problems