TEV Protease - Homolog Library

GROQ-seq function measurements for TEV protease homolog libraries

About This Dataset

The project aims to collect data on thousands of protease sequence variants, with applications in understanding protease specificity and developing therapeutic interventions.

This dataset contains GROQ-seq measurements of a sequence-diverse TEV protease library: 4,022 natural homologs (from JackHMMER searches of UniRef100, BFD, and MGnify) plus 422 AI-generated variants from the SCISOR discrete-diffusion model, spanning broad evolutionary distance from TEV protease S219V. The library was assayed via split-DHFR at the Living Measurement Systems Foundry (LMSF) at NIST.

Dataset Information

Owning Organization
The Align Foundation
Gene
TEV
Released
Mar 30, 2026

Experiment Details

Total Records
11,731
Host Organism
Escherichia coli
Antibiotic
Trimethoprim
Experiment Date
Dec 15, 2025
Collection Site
LMSF

Downloads

TEV_Homolog_output_v1.csv

File containing final function values for variants

68.2 MB

TEV_Homolog_Supplemental_v1.zip

Supplemental data including QC analysis notebooks

1.4 GB
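
Because the function-value file is fairly large, it can help to inspect its structure before loading it into an analysis tool. The following is a minimal sketch using only the Python standard library. It assumes that TEV_Homolog_output_v1.csv has already been downloaded to the working directory and that its first row is a header; the column names are not documented on this page, so the script only reports whatever it finds. The helper name summarize_csv is hypothetical.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, data_row_count) for a CSV file, streaming row by row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        # Assumption: the first row holds the column names.
        header = next(reader, [])
        # Count the remaining rows without holding the file in memory.
        row_count = sum(1 for _ in reader)
    return header, row_count


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the download was saved.
    csv_path = Path("TEV_Homolog_output_v1.csv")
    if csv_path.exists():
        columns, n_rows = summarize_csv(csv_path)
        print("columns:", columns)
        print("data rows:", n_rows)
    else:
        print(f"{csv_path} not found; download it first")
```

The reported row count can be compared with the record total listed under Experiment Details, though whether each row corresponds to exactly one record is an assumption that should be checked against the supplemental QC notebooks.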