NOW LET US – AI RAG SaaS Studio TP.HCM

Inside GeneBench-Pro


A closer look at GeneBench-Pro, a benchmark that evaluates AI models on complex genomic and bioinformatics tasks through real-world case studies.

A closer look at the benchmark, its questions, and supporting materials.

These 10 case studies showcase representative questions from GeneBench-Pro. Each case study includes the original prompt, datasets, and supporting materials. For an overview of the benchmark and key findings, see the announcement blog.

Note: File previews show excerpts from the full datasets.

Estimate whether a synthetic TXR1-directed inhibitor has positive clinical utility in tumors whose target activation is driven by a structural variant. TXR1, TXR1i, DLR1, and star-allele labels are synthetic benchmark labels.

The target subgroup has to be recovered from long-read, expression, tumor-quality, and pharmacogenomic evidence before benefit and toxicity can be interpreted as a treatment decision.

Released prompt shown to the model

Files provided to the model

| patient_id | analysis_set | age | sex | site | calendar_period | ecog | tumor_burden | prior_lines | prior_resistance | lineage_class | therapy_class | assessed16 | benefit16 | tox_stop_8wk | time_zero_day |
| MTB0001 | 1 | 73.8 | M | S1 | P2 | 2 | 0.787 | 3 | 1 | A | TXR1i | 0 | 1 | 0 | |
| MTB0002 | 1 | 55.2 | M | S3 | P1 | 1 | 2.637 | 0 | 1 | A | TXR1i | 1 | 0 | 0 | 0 |
| MTB0003 | 1 | 68.8 | F | S4 | P2 | 0 | 0.891 | 2 | 1 | A | TXR1i | 1 | 1 | 1 | 0 |
| MTB0004 | 1 | 82.8 | F | S2 | P2 | 2 | 4.101 | 0 | 0 | B | TXR1i | 1 | 0 | 0 | 0 |
| MTB0005 | 1 | 65.5 | F | S1 | P3 | 1 | 7.0 | 1 | 1 | A | TXR1i | 1 | 0 | 0 | 0 |

Registry covariates, therapy, week-16 assessment, benefit, and early toxicity.

Decide whether an apparent lncRNA dependency is transcript-specific or driven by nearby-locus and neighbor-gene effects.

Transcript-directed evidence has to survive controls for local DNA-locus perturbation, neighbor-gene repression, guide swaps, GC toxicity, and plate effects.

Released prompt shown to the model

Files provided to the model

| guide_id | nominal_target | chr | coord | strand | dist_lnc_tss_bp | dist_neighbor_tss_bp | guide_gc_frac |
| g001 | LINC473 | chr7 | 100014 | + | 14 | 30 | 0.624 |
| g002 | LINC473 | chr7 | 100035 | - | 43 | 67 | 0.584 |
| g003 | LINC473 | chr7 | 100051 | + | 116 | 56 | 0.622 |
| g004 | LINC473 | chr7 | 100066 | - | 59 | 66 | 0.617 |
| g005 | LINC473 | chr7 | 100088 | + | 74 | 77 | 0.715 |

Guide coordinates, targets, distances, and GC features.

Estimate direct disease effects for two nearby proteins using cis multivariable Mendelian randomization (cis-MVMR) while handling assay scale, allele orientation, winner's curse, LD, and residual local pleiotropy.

The two proteins share a correlated locus. The analysis has to move from marginal associations to conditional, LD-aware disease effects on a common protein scale.

Released prompt shown to the model

Files provided to the model

| snp | pos_bp | effect_allele | other_allele | maf | beta | se | pval |
| rs200000 | 50000000 | A | C | 0.42215 | 0.006438668310706808 | 0.003267330091203412 | 0.04876727714241972 |
| rs200001 | 50010126 | A | C | 0.05709 | 0.011008993337581301 | 0.006955239208750407 | 0.11345916603941006 |
| rs200002 | 50020253 | G | T | 0.09021 | 0.009922014757116319 | 0.005633023027015518 | 0.07817048492026045 |
| rs200003 | 50030379 | G | T | 0.48399 | 0.010569215614164573 | 0.0032291419740237445 | 0.0010638520681901973 |
| rs200004 | 50040506 | A | G | 0.37703 | 0.007036551378238654 | 0.0033297592321269802 | 0.034580976884336506 |

Screening-stage protein association summaries for PROTA.
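
One of the handling steps named in this case study, allele orientation, can be sketched in a few lines. The example below is an illustration under stated assumptions, not the benchmark's reference solution: it reuses the column names from the preview (`snp`, `effect_allele`, `other_allele`, `beta`), takes the two exposure rows from the PROTA excerpt, and pairs them with outcome rows that are invented purely for demonstration. It flips the sign of an outcome effect when the alleles are swapped and drops variants whose alleles do not match. Strand flips and palindromic variants are deliberately not handled.

```python
def harmonize(exposure_rows, outcome_rows):
    """Align outcome betas to the exposure effect allele.

    Returns {snp: (exposure_beta, aligned_outcome_beta)}.
    Variants with mismatched alleles are dropped (no strand-flip handling).
    """
    outcome_by_snp = {row["snp"]: row for row in outcome_rows}
    aligned = {}
    for exp in exposure_rows:
        out = outcome_by_snp.get(exp["snp"])
        if out is None:
            continue  # variant missing from the outcome summary
        exp_pair = (exp["effect_allele"], exp["other_allele"])
        out_pair = (out["effect_allele"], out["other_allele"])
        if out_pair == exp_pair:
            beta = out["beta"]            # same orientation
        elif out_pair == exp_pair[::-1]:
            beta = -out["beta"]           # swapped alleles: flip the sign
        else:
            continue                      # incompatible alleles: drop
        aligned[exp["snp"]] = (exp["beta"], beta)
    return aligned


# Exposure rows copied from the preview excerpt.
exposure = [
    {"snp": "rs200000", "effect_allele": "A", "other_allele": "C", "beta": 0.006438668310706808},
    {"snp": "rs200001", "effect_allele": "A", "other_allele": "C", "beta": 0.011008993337581301},
    {"snp": "rs200002", "effect_allele": "G", "other_allele": "T", "beta": 0.009922014757116319},
]
# Outcome rows are hypothetical values, invented for this sketch.
outcome = [
    {"snp": "rs200000", "effect_allele": "C", "other_allele": "A", "beta": 0.02},
    {"snp": "rs200001", "effect_allele": "A", "other_allele": "C", "beta": -0.01},
    {"snp": "rs200002", "effect_allele": "A", "other_allele": "G", "beta": 0.03},
]
harmonized = harmonize(exposure, outcome)
```

Here `rs200000` has its outcome effect negated, `rs200001` is kept as reported, and `rs200002` is dropped because its alleles cannot be reconciled.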

Estimate ancestry-specific carrier frequencies, residual risk after a negative screen, partner carrier frequency, and affected-conceptus risk from carrier-screening assay data.

The residual-risk estimate depends on pseudogene-aware carrier calls, founder-haplotype collapse, ancestry-specific assay calibration, and standardization from tested partners back to the full partner roster.

Released prompt shown to the model

Files provided to the model

| sample_id | collection | ancestry | family_history_tier |
| S_EUR_0001 | screening | EUR | 0 |
| S_EUR_0002 | screening | EUR | 0 |
| S_EUR_0003 | screening | EUR | 0 |
| S_EUR_0004 | screening | EUR | 0 |
| S_EUR_0005 | screening | EUR | 1 |

Screening-roster adults with ancestry and screening context.
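
The residual-risk arithmetic behind this case study can be sketched with Bayes' rule. This is a simplified illustration, not the benchmark's scoring method: it assumes a screen with perfect specificity, a single detection rate per ancestry group, and an autosomal recessive condition for the conceptus calculation. The function names and the numeric inputs (a 1-in-25 carrier frequency and a 90% detection rate) are hypothetical and are not taken from the benchmark data.

```python
def residual_carrier_risk(carrier_freq, detection_rate):
    """P(carrier | negative screen), assuming no false positives."""
    missed = carrier_freq * (1.0 - detection_rate)   # carriers the assay misses
    true_negative = 1.0 - carrier_freq               # non-carriers all test negative
    return missed / (missed + true_negative)


def affected_conceptus_risk(carrier_prob_a, carrier_prob_b):
    """Risk of an affected conceptus, assuming autosomal recessive inheritance."""
    return carrier_prob_a * carrier_prob_b * 0.25


# Hypothetical inputs for illustration only.
prior = 1 / 25
residual = residual_carrier_risk(prior, detection_rate=0.90)
# Screen-negative individual with an untested partner at the population prior.
conceptus = affected_conceptus_risk(residual, prior)
```

With these inputs the residual risk works out to about 1 in 241. Because the detection rate enters the formula directly, a miscalibrated ancestry-specific detection rate shifts every downstream number.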

Estimate a genotype effect on activated-monocyte expression after removing ambient RNA and technical contamination from single-cell RNA-seq data.

Ambient RNA affects both target expression and the marker panel used to call activation state, so correction has to occur before the eQTL model.

Released prompt shown to the model

Files provided to the model

| cell_id | donor | total_umi | HBB | IFI6 | ISG15 | LST1 | CXCL10 |
| D01_C001 | D01 | 1113 | 7 | 3 | 4 | 83 | 5 |
| D01_C002 | D01 | 1103 | 6 | 3 | 3 | 112 | 10 |
| D01_C003 | D01 | 1141 | 9 | 8 | 12 | 63 | 9 |
| D01_C004 | D01 | 1250 | 7 | 60 | 43 | 2 | 17 |
| D01_C005 | D01 | 1045 | 9 | 1 | 2 | 51 | 15 |

Per-cell UMI counts for marker genes, contamination markers, and the target gene.
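
To make the ordering constraint concrete, here is a minimal sketch of marker-based ambient correction applied before any downstream model. It is one simple approach under loud assumptions, not the benchmark's expected pipeline: it treats `HBB` as a contamination marker that the cells of interest do not express, so every `HBB` count is attributed to ambient RNA, and it uses a made-up ambient profile (the fraction of ambient UMIs contributed by each gene). The function names and the profile values are hypothetical; only the cell counts come from the preview.

```python
def estimate_ambient_fraction(cell, ambient_profile, marker="HBB"):
    """Per-cell ambient fraction, assuming all `marker` counts are ambient."""
    # Marker count expected if 100% of the cell's UMIs were ambient.
    expected_if_all_ambient = cell["total_umi"] * ambient_profile[marker]
    return min(cell[marker] / expected_if_all_ambient, 1.0)


def remove_ambient(cell, ambient_profile, genes):
    """Subtract the expected ambient counts from each gene, floored at zero."""
    rho = estimate_ambient_fraction(cell, ambient_profile)
    ambient_umi = rho * cell["total_umi"]
    return {g: max(cell[g] - ambient_umi * ambient_profile[g], 0.0) for g in genes}


# First cell from the preview excerpt.
cell = {"total_umi": 1113, "HBB": 7, "IFI6": 3, "ISG15": 4, "LST1": 83, "CXCL10": 5}
# Hypothetical ambient profile, invented for this sketch.
ambient_profile = {"HBB": 0.05, "IFI6": 0.01, "ISG15": 0.01, "LST1": 0.02, "CXCL10": 0.005}

corrected = remove_ambient(cell, ambient_profile, genes=list(ambient_profile))
```

The corrected counts, rather than the raw ones, would then feed both the activation-state call and the eQTL model, which is the point the case study makes about ordering.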

Estimate whether a nested structural subhaplotype inside an anonymous inversion-like locus has a calibrated clinical association and credible expression support.

A nested copy-dosage signal can be confounded by the broader inversion orientation, so dosage calibration, expression support, and clinical modeling have to remain distinct.

Released prompt shown to the model

Files provided to the model

| sample_id | case | age | age_band | sex | pc1 | pc2 | pc3 | ancestry_group | clinic_stratum | recruitment_stream |
| Q00012 | 1 | 50.45 | 50_64 | 0 | -1.01514 | -0.21032 | -0.08849 | EUR | tertiary | clinic |
| Q00028 | 0 | 57.39 | 50_64 | 0 | -1.25987 | -0.12498 | 0.2344 | EUR | regional | registry |
| Q00029 | 1 | 68.4 | 65_plus | 0 | 0.91598 | 0.62177 | 0.01891 | AFR | tertiary | clinic |
| Q00030 | 1 | 74.07 | 65_plus | 1 | 0.21125 | -0.59634 | -0.08197 | EAS | community | registry |
| Q00032 | 1 | 82.82 | 65_plus | 0 | -1.12034 | -0.24372 | 0.14665 | EUR | community | clinic |

Clinical and covariate data for the full cohort.

Quantify a focal case-control Hi-C loop-strength difference after removing low-mappability and structural-variant artifacts from the expected-contact background.

The target loop is defined at 20 kb resolution, but the expected-contact model is distorted unless low-mappability contacts and a case-only SV stripe are masked first.

Released prompt shown to the model

Files provided to the model

| bin_id | chrom | start | end | gc_content | mappability | re_sites |
| 0 | chr8 | 400000 | 420000 | 0.46199033821572594 | 0.9787574214704273 | 5 |
| 1 | chr8 | 420000 | 440000 | 0.5044124208534677 | 0.8901084943498397 | 5 |
| 2 | chr8 | 440000 | 460000 | 0.43218451584938194 | 0.9056879289326712 | 3 |
| 3 | chr8 | 460000 | 480000 | 0.4733197282681218 | 0.9376529840664789 | 3 |
| 4 | chr8 | 480000 | 500000 | 0.4444956062150748 | 0.8682565517981877 | 4 |

Target-resolution bin annotations.
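
The effect of masking on the expected-contact background can be shown with a toy example. The sketch below is an assumption-laden illustration rather than the benchmark's method: it computes a distance-stratified expected value (the mean contact count per diagonal) using only bin pairs where both bins pass a mappability cutoff. The 0.9 cutoff, the function names, and the 4x4 contact matrix are hypothetical; the mappability values are the first four from the preview, rounded. A case-only SV stripe would need an additional pixel-level mask, which is omitted here.

```python
def expected_by_distance(contacts, keep):
    """Mean contact count per genomic distance, using only kept bin pairs."""
    n = len(contacts)
    expected = []
    for d in range(n):
        vals = [contacts[i][i + d] for i in range(n - d) if keep[i] and keep[i + d]]
        expected.append(sum(vals) / len(vals) if vals else float("nan"))
    return expected


def loop_enrichment(contacts, keep, i, j):
    """Observed over expected for pixel (i, j) against the masked background."""
    expected = expected_by_distance(contacts, keep)
    return contacts[i][j] / expected[abs(i - j)]


# Mappability for bins 0-3, rounded from the preview; cutoff is hypothetical.
mappability = [0.9788, 0.8901, 0.9057, 0.9377]
keep = [m >= 0.9 for m in mappability]

# Toy symmetric contact matrix; the inflated pixel (1, 2) involves the masked bin.
contacts = [
    [10, 4, 2, 1],
    [4, 10, 40, 2],
    [2, 40, 10, 4],
    [1, 2, 4, 10],
]
masked_expected = expected_by_distance(contacts, keep)
unmasked_expected = expected_by_distance(contacts, [True] * 4)
```

In this toy matrix the unmasked distance-1 expectation is 16 while the masked one is 4, so the same observed pixel would look four times weaker against the unmasked background.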

Map a chromosome-1 quantitative-trait locus in an eight-founder recombinant population by reconstructing founder ancestry before testing the phenotype association.

The visible marker data are biallelic, but the biological signal is founder ancestry. A defensible analysis therefore has to reconstruct founder state, check marker orientation, and separate the QTL from a batch-aligned nuisance peak.

Released prompt shown to the model

Files provided to the model

| marker_id | chr | pos_cM |
| m2_065 | 2 | 59.762431265596575 |
| m2_103 | 2 | 94.52656615104739 |
| m2_107 | 2 | 98.18761427503033 |
| m2_079 | 2 | 72.20130244108847 |
| m1_054 | 1 | 49.907510212292195 |

Marker identifiers, chromosomes, and genetic-map positions.

Infer parent-specific ancestry proportions and recent admixture timing from phased local-ancestry tracts after repairing reciprocal artifacts and a chromosome-specific label inversion.

Ancestry fractions and pulse times both change if reciprocal artifacts and label inversions are not properly corrected.
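
A small sketch shows how much a single uncorrected label inversion can move an ancestry fraction. Everything in it is hypothetical: the tract tuples, the ancestry labels `A` and `B`, the choice of `chr2` as the inverted chromosome, and the function name. It computes length-weighted ancestry proportions from tracts on one parental haplotype set and swaps labels on chromosomes flagged as inverted. It does not address reciprocal artifacts or admixture timing.

```python
from collections import defaultdict


def ancestry_fractions(tracts, inverted_chroms=(), swap=None):
    """Length-weighted ancestry proportions from (chrom, start_cM, end_cM, label) tracts.

    Labels on chromosomes listed in `inverted_chroms` are remapped through `swap`.
    """
    swap = swap or {}
    totals = defaultdict(float)
    for chrom, start_cm, end_cm, label in tracts:
        if chrom in inverted_chroms:
            label = swap[label]           # repair the chromosome-specific inversion
        totals[label] += end_cm - start_cm
    grand_total = sum(totals.values())
    return {label: length / grand_total for label, length in totals.items()}


# Hypothetical tracts for one parent's haplotypes.
tracts = [
    ("chr1", 0.0, 60.0, "A"),
    ("chr1", 60.0, 100.0, "B"),
    ("chr2", 0.0, 30.0, "A"),
    ("chr2", 30.0, 100.0, "B"),
]
raw = ancestry_fractions(tracts)
repaired = ancestry_fractions(tracts, inverted_chroms={"chr2"}, swap={"A": "B", "B": "A"})
```

In this toy input the `A` fraction moves from 0.45 to 0.65 once the `chr2` labels are swapped back. Running the same calculation separately on each parent's haplotypes would give parent-specific proportions.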

© 2026 Now Let Us. All rights reserved.

Source: OpenAI News
