frontier-model assessment Q3

Q3 frontier model sweep — claude-3-7, gpt-5, gemini-2.5, llama-4.

Q3 model upgrade · Innovation Lab · Active · 51% docs · Owner: Saanvi Nair
Progress
Kickoff
Discovery
Execution
Sign-off
Linked artifacts: 4
Open work items: 6
Eval runs: 12
Red-team campaigns: 3
Open findings: 5 medium
Doc completeness: 51%
Days to deadline
Team members: 3
Recent activity
  • 2h ago: Vikram Shetty completed work item "Run pre-cert eval suite — claims-copilot-v3 v18"
  • 5h ago: Anjali Krishnan commented on "EU AI Act Article 14 human-oversight wording"
  • 8h ago: Catherine O'Brien attached evidence to "Sign-off chain"
  • 1d ago: Arjun Iyer closed campaign "Pre-deployment red-team — claims-copilot-v3 v18"
  • 2d ago: Fatima Khan opened finding "F-2026-04-1289 indirect injection (critical)"
Members
  • SN
  • MP
  • AK
Linked initiative
Sibling projects in this initiative: 5