
TEV Protease - Homolog Library

GROQ-seq function measurements for TEV protease homolog libraries

About This Dataset

The project aims to collect data on thousands of protease sequence variants, with applications in understanding protease specificity and developing therapeutic interventions.

This dataset contains GROQ-seq measurements of a sequence-diverse TEV protease library: 4,022 natural homologs (from JackHMMER searches of UniRef100, BFD, and MGnify) plus 422 AI-generated variants from the SCISOR discrete-diffusion model, spanning a broad range of evolutionary distances from TEV protease S219V. The library was assayed via split-DHFR at the Living Measurement Systems Foundry (LMSF) at NIST.

Dataset Information

Owning Organization
The Align Foundation
Gene
TEV
Released
May 11, 2026

Experiment Details

Total Records
11,909
Host Organism
Escherichia coli
Antibiotic
Trimethoprim
Experiment Date
December 15, 2025
Collection Site
LMSF

Downloads

TEV_Homolog_output_v1.2.zip

File containing final function values for variants

15.8 MB


TEV_Homolog_Supplemental_v1.2.zip

Supplemental data including QC analysis notebooks

1.4 GB


README.md

Dataset README

5.2 KB

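The internal layout of the archives is not described on this page, so a reasonable first step after downloading is simply to list what an archive contains. The following is a minimal sketch using only the Python standard library; it assumes the output archive has been saved in the working directory under the filename listed above, and the helper name `list_archive_members` is hypothetical, chosen for illustration.

```python
import zipfile
from pathlib import Path


def list_archive_members(archive_path):
    """Return (name, uncompressed size in bytes) for each file in a zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        ]


if __name__ == "__main__":
    # Assumption: the archive was downloaded to the current directory
    # and kept under the filename shown in the Downloads list.
    archive_path = Path("TEV_Homolog_output_v1.2.zip")
    if archive_path.exists():
        for name, size in list_archive_members(archive_path):
            print(f"{size:>12,d}  {name}")
    else:
        print(f"{archive_path} not found; download it first.")
```

The sketch makes no assumption about file names or column names inside the archive; README.md is the place to look for the actual schema.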