Deep mutational scanning services
Deep mutational scanning services that map the fitness landscape of your protein — every single amino acid substitution at every position, scored for function in a single pooled experiment.
Discuss your project →
What is deep mutational scanning?
Deep mutational scanning (DMS) is a high-throughput protein engineering method that measures the functional impact of every possible single amino acid substitution in a protein region of interest. A saturation mutagenesis library, typically all 19 non-wild-type substitutions at every position in the target region, is constructed, introduced into a display or selection system, and subjected to functional pressure.
Pre- and post-selection populations are sequenced by NGS, and the enrichment or depletion of each variant quantifies its fitness relative to wild type. The result is a complete mutational landscape: a position-by-substitution matrix that reveals which residues are essential, which tolerate mutation, and which substitutions improve function.
The approach was first demonstrated by Fowler and colleagues in 2010 and consolidated under the name deep mutational scanning in Fowler and Fields' 2014 review. It has since become a standard approach for mapping fitness landscapes in antibody affinity maturation, enzyme engineering, viral epitope mapping, and benchmarking AI-based variant effect predictors. A single pooled experiment can deliver more functional data than years of conventional single-mutant screening.
How a deep mutational scanning experiment works
Every DMS campaign follows the same four-stage logic. The selection step is what makes the score meaningful: choose the wrong selection and you measure the wrong thing.
Library design
Saturation mutagenesis using NNK, NNS, or trimer codons across the target region. Library diversity is matched to selection throughput so every variant is represented at 100-1000x coverage before sorting.
Functional selection
Library transformed into a display or assay system, then sorted under the phenotype of interest: binding, surface expression, catalytic activity, or in vivo fitness. Stringency tunes which substitutions register as gain or loss.
NGS readout
Pre- and post-selection populations sequenced on Illumina or PacBio. UMI-tagged reads and replicate sorting control for sampling noise and PCR bias.
Fitness scoring
Log-enrichment ratios computed per variant, normalized to wild-type and synonymous control variants. Output: a position-by-amino-acid fitness matrix ready for downstream design.
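As a rough illustration of the fitness scoring stage, the sketch below computes a log2 enrichment ratio per variant from pre- and post-selection read counts and then subtracts the wild-type score. It is a minimal example, not the production pipeline: the variant names, read counts, and pseudocount of 0.5 are hypothetical, and it normalizes to wild type only, leaving out synonymous controls and replicates.

```python
import math

def log_enrichment(pre, post, pseudocount=0.5):
    """Return the log2 enrichment ratio of each variant.

    pre, post: dicts mapping variant name -> read count.
    The pseudocount (an assumed value) avoids division by zero
    for variants that drop out after selection.
    """
    pre_total = sum(pre.values())
    post_total = sum(post.values())
    scores = {}
    for variant, pre_count in pre.items():
        post_count = post.get(variant, 0)
        pre_freq = (pre_count + pseudocount) / pre_total
        post_freq = (post_count + pseudocount) / post_total
        scores[variant] = math.log2(post_freq / pre_freq)
    return scores

def normalize_to_wild_type(scores, wt_key="WT"):
    """Shift scores so that wild type sits at exactly 0."""
    wt_score = scores[wt_key]
    return {variant: s - wt_score for variant, s in scores.items()}

# Hypothetical read counts for illustration only.
pre_counts = {"WT": 1000, "A10G": 500, "A10P": 500, "L12V": 400}
post_counts = {"WT": 2000, "A10G": 2000, "A10P": 50, "L12V": 800}

fitness = normalize_to_wild_type(log_enrichment(pre_counts, post_counts))
for variant, score in sorted(fitness.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{variant}: {score:+.2f}")
```

In this toy data set, A10G scores above wild type, L12V is close to neutral, and A10P is strongly depleted.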
What you receive from a DMS experiment
Fitness score matrix
A position-by-amino-acid matrix of log-enrichment ratios. Each cell quantifies how a specific substitution at a specific position affects the selected phenotype relative to wild type. Scores are normalized and reproducible across replicates.
Heatmap visualization
Publication-ready heatmaps showing gain-of-function, loss-of-function, and neutral mutations across the entire scanned region. Immediately identifies functional hotspots, conserved positions, and mutation-tolerant loops.
Beneficial variant list
Rank-ordered list of substitutions that improve the selected phenotype — binding affinity, thermostability, expression level, or catalytic activity depending on your selection pressure. Ready for combinatorial optimization.
Combinatorial design guidance
Beneficial single mutations are candidates for combination. DMS data de-risks combinatorial library design by identifying which positions tolerate simultaneous substitution, guiding the next round of optimization.
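To make the deliverables above concrete, the sketch below arranges per-variant scores into a position-by-amino-acid matrix and extracts a rank-ordered list of beneficial substitutions. The three-residue wild-type sequence, the scores, and the threshold of 0 are hypothetical; a real analysis would set the threshold from replicate variance.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_matrix(scores, wild_type_seq):
    """Return a list of rows, one per position, columns in AMINO_ACIDS order.

    scores: dict mapping (position, amino_acid) -> fitness score,
            with 1-based positions.
    Wild-type cells are set to 0.0; unmeasured variants are None.
    """
    matrix = []
    for pos, wt_aa in enumerate(wild_type_seq, start=1):
        row = []
        for aa in AMINO_ACIDS:
            if aa == wt_aa:
                row.append(0.0)
            else:
                row.append(scores.get((pos, aa)))
        matrix.append(row)
    return matrix

def rank_beneficial(scores, threshold=0.0):
    """Return (position, amino_acid, score) tuples above threshold, best first."""
    hits = [(pos, aa, s) for (pos, aa), s in scores.items() if s > threshold]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)

# Hypothetical scores for a three-residue region.
wild_type = "MKT"
scores = {(1, "L"): -2.1, (2, "R"): 0.8, (3, "S"): 1.5, (3, "P"): -3.0}

matrix = build_matrix(scores, wild_type)
top = rank_beneficial(scores)
print(top)
```

The same matrix can be passed to any plotting library to draw the heatmap; the ranked list is the starting point for choosing positions to combine.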
DMS applications in protein engineering
Affinity maturation
Identify every substitution that improves binding at every position in a binder or antibody. Combine top hits to pursue orders-of-magnitude improvements in Kd without random screening.
Stability engineering
Map thermostability contributions across your protein. Identify stabilizing mutations for formulation development, extended shelf life, or higher expression yields.
Epitope mapping
Determine which target residues are critical for binder interaction. Loss-of-binding mutations on the target surface map the functional epitope at single-residue resolution.
Variant effect prediction benchmarking
Generate ground-truth fitness data to benchmark computational variant effect predictors (ESM, EVE, AlphaMissense). Essential for calibrating in silico models against experimental reality.
Training data for AI protein models
DMS datasets are among the highest-value training inputs for machine learning models that predict variant effects, protein fitness, and structure-function relationships. Systematic, quantitative, and covering the complete single-substitution space of the scanned region, DMS data provides the ground truth that computational models need to generalize.
Enzyme optimization
Score every substitution for catalytic activity, substrate specificity, or product selectivity using activity-coupled selection. Identify positions that decouple activity from stability.
Functional scoring methods for DMS
The phenotype you measure determines the mutations you find. We match the selection system to your engineering objective.
Display-based binding selection
Variants displayed on yeast or mammalian cells, sorted by FACS for target binding. Enrichment ratios quantify the binding fitness of each substitution. Multi-concentration sorting provides apparent affinity rankings across the entire library.
Surface expression as stability proxy
Surface expression level on display platforms correlates with thermodynamic stability. Sorting for high expression enriches stabilizing mutations and depletes destabilizing ones. A rapid, functional proxy for thermal stability measurements.
Activity-based functional selection
Variants selected for enzymatic or biological activity rather than binding. Intracellular selection links protein function to cell survival or reporter output, while microfluidic droplet compartmentalization enables single-variant catalytic readouts. Examples include recombinase activity in cells and polymerase activity via droplet PCR.
Foundational papers in deep mutational scanning
The methodology and downstream analysis behind every DMS campaign rests on a small set of foundational papers. These are the references our protocols and scoring pipelines build on.
Deep mutational scanning questions
What is deep mutational scanning?
Deep mutational scanning (DMS) is a high-throughput protein engineering technique that measures the functional impact of thousands of amino acid substitutions in a single pooled experiment. A saturation mutagenesis library is subjected to functional selection (binding, expression, or catalytic activity), and pre- and post-selection populations are sequenced by NGS to compute enrichment-based fitness scores for every variant.
How many variants can DMS characterize in a single experiment?
A full saturation DMS campaign covers 19 non-wild-type amino acid substitutions at every position in the target region: 19 × N variants for a region of length N. For a 100-residue domain, that is 1,900 variants characterized in a single experiment with quantitative fitness scores.
What protein engineering objectives can DMS address?
Affinity maturation (identifying substitutions that improve binding Kd), stability engineering (finding thermostabilizing mutations), epitope mapping (determining which target residues are critical for binder interaction), enzyme optimization (scoring substitutions for catalytic activity or substrate specificity), and generating training data for AI protein engineering models.
What library sizes do deep mutational scanning experiments require?
A saturation DMS library covers 19 × N variants for a target region of length N. For a 100-residue domain that is 1,900 variants; for a full 250-residue scFv it is 4,750. We oversample at 100-1000x coverage during transformation and sorting so that every variant is sampled enough times to call enrichment confidently after NGS.
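The arithmetic behind those numbers can be sketched as follows. This is a back-of-the-envelope calculation that counts amino-acid variants only; the function names are illustrative, and reading "coverage" as transformants or sorted cells per variant is an assumption.

```python
def library_size(region_length, substitutions_per_position=19):
    """Number of single amino acid variants in a saturation library."""
    return region_length * substitutions_per_position

def cells_required(n_variants, coverage):
    """Cells (or transformants) needed to sample each variant `coverage` times on average."""
    return n_variants * coverage

for length in (100, 250):
    n = library_size(length)
    low = cells_required(n, 100)
    high = cells_required(n, 1000)
    print(f"{length} residues: {n:,} variants, {low:,} to {high:,} cells")
# 100 residues: 1,900 variants, 190,000 to 1,900,000 cells
# 250 residues: 4,750 variants, 475,000 to 4,750,000 cells
```

Because coverage is an average, some variants will fall below it in practice, which is one reason to work toward the upper end of the range.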
What NGS readout do you use for DMS?
Most campaigns use Illumina paired-end sequencing (MiSeq or NextSeq) with UMI tagging to control for PCR amplification bias. For targets longer than the Illumina read length, we use PacBio HiFi sequencing to capture full-length variants in single reads. Replicate sequencing of pre- and post-selection pools is standard for variance estimation.
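As an illustration of how UMI tagging controls for PCR amplification bias, the sketch below counts each distinct UMI and variant pair once, so PCR duplicates of the same original molecule do not inflate a variant's count. The reads and UMI sequences are hypothetical, and the example deliberately ignores sequencing errors within UMIs, which production pipelines typically handle by clustering near-identical UMIs.

```python
from collections import defaultdict

def collapse_umis(reads):
    """Count unique molecules per variant.

    reads: iterable of (umi, variant) tuples, one per sequencing read.
    Reads sharing both UMI and variant are treated as PCR duplicates
    of a single original molecule and counted once.
    """
    seen = set()
    counts = defaultdict(int)
    for umi, variant in reads:
        if (umi, variant) not in seen:
            seen.add((umi, variant))
            counts[variant] += 1
    return dict(counts)

# Hypothetical reads: the first three are PCR duplicates of one molecule.
reads = [
    ("AACG", "A10G"),
    ("AACG", "A10G"),
    ("AACG", "A10G"),
    ("TTGC", "A10G"),
    ("GGAT", "WT"),
]

counts = collapse_umis(reads)
print(counts)
```

Here four A10G reads collapse to two unique molecules, which is the count that should feed into enrichment scoring.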
What is the typical turnaround time for a DMS campaign?
A standard DMS campaign from library design through fitness matrix delivery runs 10-14 weeks. Library construction takes 3-4 weeks, selection and sorting take 2-4 weeks depending on the number of rounds, and NGS plus analysis takes 3-4 weeks. The timeline is specified in the statement of work (SOW) at project start.
When is DMS the right choice vs single-variant assays or alanine scanning?
DMS makes sense when you need to characterize the full mutational tolerance of a protein region, when downstream design depends on knowing which positions are mutable, or when you need ground-truth fitness data for AI model training. Alanine scanning is faster and cheaper but only reveals which positions matter, not which substitutions help. Single-variant assays are appropriate when you already know which mutations you want to test.
How does DMS compare to traditional directed evolution?
Traditional directed evolution tests variants sequentially across screening rounds, producing a handful of improved clones per cycle. DMS measures all single substitutions simultaneously in one experiment, providing a complete fitness landscape that de-risks combinatorial library design for subsequent rounds.
Can DMS data train AI protein design models?
Yes. DMS datasets are among the highest-value inputs for machine learning models of variant effect and protein fitness, both as supervised training data and as the benchmark against which predictors such as ESM, EVE, and AlphaMissense are evaluated. The systematic, quantitative, position-resolved structure of DMS data is what these models need to generalize beyond the natural sequence record.
Technical articles on DMS and directed evolution
Deep mutational scanning: mapping fitness landscapes
How thousands of variants, functional selection, and NGS resolve a protein at single-residue resolution.
DMS for antibody affinity maturation
Using CDR-saturation scanning to build the second-round library from evidence instead of a random walk.
Directed evolution for stability and function
Selection pressure, library design, and iteration strategy for pushing function forward each round.
In vivo mutagenesis for AI training data
Generating labeled variant datasets at scale to feed machine-learning fitness models.
AI and DMS for enzyme engineering
Closing the loop between scanning data and model-guided design across iterative rounds.
How not to build a dataset for AI protein engineering
The data-quality failure modes that quietly sink an otherwise well-designed scanning campaign.
Map your protein's fitness landscape
Send us your target protein and the phenotype you want to optimize. We will design a DMS campaign and deliver a complete mutational landscape.
Start a project →