Capability

Deep mutational scanning services

We map the fitness landscape of your protein: every single amino acid substitution at every position, scored for function in a single pooled experiment.

Overview

What is deep mutational scanning?

Deep mutational scanning (DMS) is a high-throughput protein engineering method that measures the functional impact of every possible single amino acid substitution in a protein region of interest. A saturation mutagenesis library, typically all 19 non-wild-type substitutions at every position in the target region, is constructed, introduced into a display or selection system, and subjected to functional pressure.

Pre- and post-selection populations are sequenced by NGS, and the enrichment or depletion of each variant quantifies its fitness relative to wild type. The result is a complete mutational landscape: a position-by-substitution matrix that reveals which residues are essential, which tolerate mutation, and which substitutions improve function.

DMS was first demonstrated by Fowler and colleagues in 2010 and reviewed as a general method by Fowler and Fields in 2014. It has since become a standard approach for mapping fitness landscapes in antibody affinity maturation, enzyme engineering, viral epitope mapping, and benchmarking AI-based variant effect predictors. One pooled experiment scores thousands of variants, a scale that one-at-a-time mutant screening cannot practically reach.

19×N variants per protein (saturation)

One experiment for complete coverage

Quantitative, enrichment-based fitness scores

Position-level resolution across full sequence

Method

How a deep mutational scanning experiment works

Every DMS campaign follows the same four-stage logic. The selection step is what makes the score meaningful: choose the wrong selection and you measure the wrong thing.

Library design

Saturation mutagenesis using NNK, NNS, or trimer codons across the target region. Library diversity is matched to selection throughput so that every variant is sampled at 100-1000x coverage before sorting.
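As a rough illustration of the sizing arithmetic, the sketch below enumerates the NNK codon set against the standard genetic code and estimates how many transformants a given fold coverage implies. The function names, the 100-residue example region, and the choice to size on codon-level diversity (an upper bound that ignores codons matching wild type) are assumptions for illustration, not a description of a specific production protocol.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def nnk_codons():
    """All codons matching NNK: N = any base, K = G or T."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


def cells_needed(n_positions, codons_per_position, fold_coverage):
    """Upper bound on single-site library members, times the desired fold coverage."""
    return n_positions * codons_per_position * fold_coverage


codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons}
print(len(codons), "NNK codons encoding", len(encoded - {"*"}), "amino acids")
print("stop codons in NNK:", [c for c in codons if CODON_TABLE[c] == "*"])

# Hypothetical 100-residue target region (assumption for illustration).
for fold in (100, 1000):
    print(f"{fold}x coverage:", cells_needed(100, len(codons), fold), "transformants")
```

NNK yields 32 codons that cover all 20 amino acids with a single stop codon (TAG), which is why codon-level diversity is larger than the 19 substitutions per position counted at the amino acid level.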

Functional selection

Library transformed into a display or assay system, then sorted under the phenotype of interest: binding, surface expression, catalytic activity, or in vivo fitness. Stringency tunes which substitutions register as gain or loss.

NGS readout

Pre- and post-selection populations sequenced on Illumina or PacBio. UMI-tagged reads and replicate sorting control for sampling noise and PCR bias.

Fitness scoring

Log-enrichment ratios computed per variant, normalized to wild-type and synonymous control variants. Output: a position by amino acid fitness matrix ready for downstream design.
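A minimal sketch of the scoring step, assuming per-variant read counts are already available for the pre- and post-selection pools. The function name `fitness_scores`, the `"WT"` key, the pseudocount value, and the example counts are all hypothetical. The sketch normalizes to wild type only; normalization against synonymous controls is omitted for brevity.

```python
import math


def fitness_scores(pre_counts, post_counts, wt_key="WT", pseudocount=0.5):
    """Log2 enrichment of each variant, expressed relative to wild type.

    pre_counts / post_counts: dicts mapping variant label -> read count.
    A pseudocount keeps variants with zero post-selection reads finite.
    """
    pre_total = sum(pre_counts.values())
    post_total = sum(post_counts.values())

    def log_ratio(variant):
        pre_freq = (pre_counts.get(variant, 0) + pseudocount) / pre_total
        post_freq = (post_counts.get(variant, 0) + pseudocount) / post_total
        return math.log2(post_freq / pre_freq)

    wt_ratio = log_ratio(wt_key)
    return {v: log_ratio(v) - wt_ratio for v in pre_counts if v != wt_key}


# Made-up counts for illustration only.
pre = {"WT": 1000, "A5G": 100, "L10P": 100}
post = {"WT": 1000, "A5G": 400, "L10P": 10}
for variant, score in fitness_scores(pre, post).items():
    print(variant, round(score, 2))
```

Positive scores indicate enrichment relative to wild type and negative scores indicate depletion. Because both pools are divided by their own totals and by the wild-type ratio, differences in sequencing depth between pools cancel out.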

Fitness landscape

Heatmap legend: Gain, Neutral, Loss

A single DMS experiment delivers a complete position-by-substitution fitness matrix. Conserved positions, mutation-tolerant loops, and gain-of-function substitutions are visible in one view.

Data output

What you receive from a DMS experiment

Fitness score matrix

A position-by-amino-acid matrix of log-enrichment ratios. Each cell quantifies how a specific substitution at a specific position affects the selected phenotype relative to wild type. Scores are normalized and reproducible across replicates.
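To make the shape of this deliverable concrete, the sketch below arranges per-variant scores into a position-by-amino-acid grid. The names `AA_ORDER` and `build_matrix`, the variant label convention (wild-type residue, 1-based position, substituted residue, for example `M1A`), and the choice to set wild-type cells to 0.0 and unmeasured cells to `None` are assumptions for illustration.

```python
AA_ORDER = "ACDEFGHIKLMNPQRSTVWY"  # column order, an arbitrary alphabetical choice


def build_matrix(wt_seq, scores):
    """Return a list of rows, one per position, each with 20 cells.

    wt_seq: wild-type amino acid sequence of the scanned region.
    scores: dict mapping labels such as "M1A" to a fitness score.
    Wild-type cells are 0.0 by definition; missing variants are None.
    """
    matrix = []
    for position, wt_aa in enumerate(wt_seq, start=1):
        row = []
        for aa in AA_ORDER:
            if aa == wt_aa:
                row.append(0.0)
            else:
                row.append(scores.get(f"{wt_aa}{position}{aa}"))
        matrix.append(row)
    return matrix


# Toy three-residue region with two measured variants (made-up values).
matrix = build_matrix("MKV", {"M1A": -2.0, "K2R": 0.3})
print(len(matrix), "positions x", len(matrix[0]), "amino acids")
print("M1A:", matrix[0][AA_ORDER.index("A")])
```

Keeping unmeasured cells distinct from neutral ones matters downstream: a missing variant should not be read as a tolerated substitution.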

Heatmap visualization

Publication-ready heatmaps showing gain-of-function, loss-of-function, and neutral mutations across the entire scanned region. Immediately identifies functional hotspots, conserved positions, and mutation-tolerant loops.

Beneficial variant list

Rank-ordered list of substitutions that improve the selected phenotype: binding affinity, thermostability, expression level, or catalytic activity, depending on your selection pressure. Ready for combinatorial optimization.
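One plausible way to derive such a list is to keep only variants that score above a cutoff in every replicate and rank them by mean score. The sketch below assumes that rule; the function name `beneficial_variants`, the `min_score` cutoff of 0.5, and the example scores are hypothetical, and it expects at least one replicate.

```python
from statistics import mean


def beneficial_variants(replicate_scores, min_score=0.5):
    """Rank variants that exceed min_score in every replicate.

    replicate_scores: list of dicts, one per replicate, variant -> score.
    Returns (variant, mean_score) tuples sorted from best to worst.
    """
    shared = set.intersection(*(set(rep) for rep in replicate_scores))
    hits = []
    for variant in shared:
        values = [rep[variant] for rep in replicate_scores]
        if min(values) >= min_score:
            hits.append((variant, mean(values)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up scores from two replicates.
replicates = [
    {"A5G": 1.9, "S7T": 0.8, "L10P": -3.2},
    {"A5G": 2.1, "S7T": 0.2, "L10P": -3.0},
]
print(beneficial_variants(replicates))
```

Requiring agreement across replicates is a conservative design choice: it drops variants such as `S7T` in the example, which looks beneficial in one replicate but not the other.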

Combinatorial design guidance

Beneficial single mutations are candidates for combination. DMS data de-risks combinatorial library design by showing which positions tolerate substitution on their own, which narrows the positions worth combining in the next round of optimization.

Applications

DMS applications in protein engineering

Affinity maturation

Identify every substitution that improves binding at every position in a binder or antibody. Combine top hits to pursue multi-log improvements in Kd without relying on random screening.

Stability engineering

Map thermostability contributions across your protein. Identify stabilizing mutations for formulation development, extended shelf life, or higher expression yields.

Epitope mapping

Determine which target residues are critical for binder interaction. Loss-of-binding mutations on the target surface map the functional epitope at single-residue resolution.
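As an illustration of how loss-of-binding scores can be reduced to a list of candidate epitope positions, the sketch below flags positions whose median score across measured substitutions falls below a cutoff. The function name, the `loss_cutoff` and `min_variants` defaults, and the example scores are assumptions. In practice such calls are usually checked against an expression or folding readout, since a misfolded target also loses binding.

```python
import re
from collections import defaultdict
from statistics import median

VARIANT_RE = re.compile(r"([A-Z])(\d+)([A-Z])")  # e.g. "Y33A"


def candidate_epitope_positions(binding_scores, loss_cutoff=-1.0, min_variants=3):
    """Positions where the median binding score is at or below loss_cutoff.

    binding_scores: dict mapping labels such as "Y33A" to a binding score.
    Positions with fewer than min_variants measured substitutions are skipped.
    """
    by_position = defaultdict(list)
    for variant, score in binding_scores.items():
        match = VARIANT_RE.fullmatch(variant)
        if match:
            by_position[int(match.group(2))].append(score)
    return sorted(
        position
        for position, values in by_position.items()
        if len(values) >= min_variants and median(values) <= loss_cutoff
    )


# Made-up scores for three positions on a hypothetical target.
scores = {
    "Y33A": -2.5, "Y33F": -1.2, "Y33S": -3.0,
    "K40A": 0.1, "K40R": -0.2, "K40E": -0.4,
    "D50A": -2.0,
}
print(candidate_epitope_positions(scores))
```

Using the median rather than the minimum keeps a single disruptive substitution, such as a proline, from flagging an otherwise tolerant position.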

Variant effect prediction benchmarking

Generate ground-truth fitness data to benchmark computational variant effect predictors (ESM, EVE, AlphaMissense). Essential for calibrating in silico models against experimental reality.
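Benchmarks of this kind commonly report the Spearman rank correlation between predicted and measured scores. The sketch below is a dependency-free version of that metric, assuming the two score lists are already aligned by variant; the function names and example values are hypothetical, and it does not call any predictor's actual API.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(measured, predicted):
    """Pearson correlation of the ranks. Undefined if either list is constant."""
    rx, ry = average_ranks(measured), average_ranks(predicted)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Made-up measured DMS scores and model predictions for four variants.
measured = [-3.1, -0.2, 0.4, 1.8]
predicted = [-5.0, -1.0, -2.0, 0.5]
print(round(spearman(measured, predicted), 3))
```

A rank-based metric is a natural fit here because predictor outputs and DMS scores are on different scales; only the ordering of variants is compared.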

Training data for AI protein models

DMS datasets are among the highest-value training inputs for machine learning models that predict variant effects, protein fitness, and structure-function relationships. Systematic, quantitative, and covering the complete single-substitution space of the scanned region, DMS data provides the ground truth that computational models need to generalize.

Enzyme optimization

Score every substitution for catalytic activity, substrate specificity, or product selectivity using activity-coupled selection. Identify positions that decouple activity from stability.

Selection modes

Functional scoring methods for DMS

The phenotype you measure determines the mutations you find. We match the selection system to your engineering objective.

Binding DMS

Display-based binding selection

Variants displayed on yeast or mammalian cells, sorted by FACS for target binding. Enrichment ratios quantify the binding fitness of each substitution. Multi-concentration sorting provides apparent affinity rankings across the entire library.

Expression DMS

Surface expression as stability proxy

Surface expression level on display platforms correlates with thermodynamic stability. Sorting for high expression enriches stabilizing mutations and depletes destabilizing ones. A rapid, functional proxy for thermal stability measurements.

Functional DMS

Activity-based functional selection

Variants selected for enzymatic or biological activity rather than binding. Intracellular selection links protein function to cell survival or reporter output, while microfluidic droplet compartmentalization enables single-variant catalytic readouts. Examples include recombinase activity in cells and polymerase activity via droplet PCR.

Selected literature

Foundational papers in deep mutational scanning

The methodology and downstream analysis behind every DMS campaign rests on a small set of foundational papers. These are the references our protocols and scoring pipelines build on.

Fowler & Fields (2014).

Deep mutational scanning: a new style of protein science.

Nature Methods 11, 801-807.

Fowler, Araya, Fleishman et al. (2010).

High-resolution mapping of protein sequence-function relationships.

Nature Methods 7, 741-746.

Starr, Greaney, Hilton et al. (2020).

Deep mutational scanning of SARS-CoV-2 receptor binding domain reveals constraints on folding and ACE2 binding.

Cell 182, 1295-1310.

Hietpas, Jensen & Bolon (2011).

Experimental illumination of a fitness landscape.

PNAS 108, 7896-7901.

Bloom (2015).

Software for the analysis and visualization of deep mutational scanning data.

BMC Bioinformatics 16, 168.

Wrenbeck, Faber & Whitehead (2017).

Deep sequencing methods for protein engineering and design.

Current Opinion in Structural Biology 45, 36-44.

Notin, Kollasch, Ritter et al. (2023).

ProteinGym: large-scale benchmarks for protein fitness prediction and design.

NeurIPS Datasets and Benchmarks.

Hilton, Huddleston, Black et al. (2020).

dms-view: interactive visualization tool for deep mutational scanning data.

Journal of Open Source Software.

FAQ

Deep mutational scanning questions

What is deep mutational scanning?

Deep mutational scanning (DMS) is a high-throughput protein engineering technique that measures the functional impact of thousands of amino acid substitutions in a single pooled experiment. A saturation mutagenesis library is subjected to functional selection (binding, expression, or catalytic activity), and pre- and post-selection populations are sequenced by NGS to compute enrichment-based fitness scores for every variant.

How many variants can DMS characterize in a single experiment?

A full saturation DMS campaign covers 19 non-wild-type amino acid substitutions at every position in the target region: 19 × N variants for a protein of length N. For a 100-residue domain, that is 1,900 variants characterized in a single experiment with quantitative fitness scores.

What protein engineering objectives can DMS address?

Affinity maturation (identifying substitutions that improve binding Kd), stability engineering (finding thermostabilizing mutations), epitope mapping (determining which target residues are critical for binder interaction), enzyme optimization (scoring substitutions for catalytic activity or substrate specificity), and generating training data for AI protein engineering models.

What library sizes do deep mutational scanning experiments require?

A saturation DMS library covers 19 × N variants for a target region of length N. For a 100-residue domain that is 1,900 variants; for a full 250-residue scFv it is 4,750. We oversample at 100-1000x coverage during transformation and sorting so that every variant is sampled enough times to call enrichment confidently after NGS.

What NGS readout do you use for DMS?

Most campaigns use Illumina paired-end sequencing (MiSeq or NextSeq) with UMI tagging to control for PCR amplification bias. For targets longer than the Illumina read length, we use PacBio HiFi sequencing to capture full-length variants in single reads. Replicate sequencing of pre- and post-selection pools is standard for variance estimation.
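The idea behind UMI tagging can be sketched as follows: reads that share a UMI and a variant are treated as PCR copies of one original molecule and counted once. The function name and the toy reads are hypothetical, and the sketch assumes reads have already been parsed into (UMI, variant) pairs. It does not correct sequencing errors within the UMI itself, which real pipelines usually handle.

```python
from collections import defaultdict


def umi_collapsed_counts(reads):
    """Count unique UMIs per variant.

    reads: iterable of (umi, variant) pairs, one per sequencing read.
    Returns a dict mapping variant -> number of distinct UMIs observed.
    """
    umis_per_variant = defaultdict(set)
    for umi, variant in reads:
        umis_per_variant[variant].add(umi)
    return {variant: len(umis) for variant, umis in umis_per_variant.items()}


# Toy reads: the A5G molecule tagged ACGT was amplified three times.
reads = [
    ("ACGT", "A5G"), ("ACGT", "A5G"), ("ACGT", "A5G"),
    ("TTAG", "A5G"),
    ("GGCA", "L10P"),
]
print(umi_collapsed_counts(reads))
```

In the toy input, four raw reads of `A5G` collapse to two molecules, so a variant that amplified efficiently is not mistaken for one that was enriched by selection.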

What is the typical turnaround time for a DMS campaign?

A standard DMS campaign from library design through fitness matrix delivery runs 10-14 weeks. Library construction takes 3-4 weeks, selection and sorting takes 2-4 weeks depending on rounds, and NGS plus analysis takes 3-4 weeks. Timeline is specified in the SOW at project start.

When is DMS the right choice vs single-variant assays or alanine scanning?

DMS makes sense when you need to characterize the full mutational tolerance of a protein region, when downstream design depends on knowing which positions are mutable, or when you need ground-truth fitness data for AI model training. Alanine scanning is faster and cheaper but only reveals which positions matter, not which substitutions help. Single-variant assays are appropriate when you already know which mutations you want to test.

How does DMS compare to traditional directed evolution?

Traditional directed evolution tests variants sequentially across screening rounds, producing a handful of improved clones per cycle. DMS measures all single substitutions simultaneously in one experiment, providing a complete fitness landscape that de-risks combinatorial library design for subsequent rounds.

Can DMS data train AI protein design models?

Yes. DMS datasets are among the highest-value training inputs for machine learning models that predict variant effects (ESM, EVE, AlphaMissense) and protein fitness. The systematic, quantitative, position-resolved structure of DMS data is what these models need to generalize beyond the natural sequence record.

Map your protein's fitness landscape

Send us your target protein and the phenotype you want to optimize. We will design a DMS campaign and deliver a complete mutational landscape.
