TEV Protease - Homolog and SSVL Libraries

GROQ-seq function measurements for the TEV protease homolog and SSVL libraries

About This Dataset

The project aims to collect data on thousands of protease sequence variants, with applications in understanding protease specificity and developing therapeutic interventions.

This dataset combines a sequence-diverse TEV protease homolog library (4,022 natural homologs plus 422 AI-generated variants from the SCISOR discrete-diffusion model) with a site-saturation variant library (SSVL) of TEV. The libraries were assayed via split-DHFR at the Living Measurement Systems Foundry (LMSF) at NIST.

Dataset Information

Owning Organization
The Align Foundation
Gene
TEV
Released
May 11, 2026

Experiment Details

Total Records
21,991
Host Organism
Escherichia coli
Antibiotic
Trimethoprim
Experiment Date
December 15, 2025
Collection Site
LMSF

Downloads

TEV_Homolog_SSVL_output_v1.2.zip

File containing final function values for variants

26.7 MB


TEV_Homolog_Supplemental_v1.2.zip

Supplemental data including QC analysis notebooks

1.4 GB


README.md

Dataset README

5.2 KB

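
Inspecting a downloaded archive

This page does not describe the internal layout of the archives, so a reasonable first step after downloading is to list what they contain. The sketch below is a minimal illustration that uses only the Python standard library. The helper name `summarize_archive` is hypothetical, and the script assumes the archive has been saved to the current working directory under the file name shown above. It makes no assumption about the names or formats of the files inside.

```python
import zipfile
from pathlib import Path


def summarize_archive(zip_path):
    """Return (member name, uncompressed size in bytes) for each file in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        ]


if __name__ == "__main__":
    # Assumed local path: the archive from this page, saved to the working directory.
    path = Path("TEV_Homolog_SSVL_output_v1.2.zip")
    if path.exists():
        for name, size in summarize_archive(path):
            print(f"{size:>12,d}  {name}")
    else:
        print(f"{path} not found; download it first")
```

The README.md listed above is the place to look for what each file and column means.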