ops-summarizer-v1
Model · Critical (EU AI Act) · Limited (SR 11-7)
Model for mortgage workflows.
Owner: Priya Ramaswamy · Team: Meridian Insurance — Lending Risk · Current version: v7 · 7 versions
- Last eval: 94.2 (faithfulness · +1.8)
- Findings: 4 (4 crit · 0 high · 0 med)
- Open findings: 4 (all triaged)
- Re-cert in: 69d (May 21, 2026)
- Runs (90d): 312 (246 eval · 66 RT)
- Cost (30d): $8,940 (judge $4.2k · RT $4.7k)
- Compliance: 91% (EU AI Act readiness)
- Production: Live (14% traffic · 2.3M req/24h)
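The summary figures above can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of the dashboard itself: the variable and function names are hypothetical, the sub-costs are assumed to be rounded to the nearest $0.1k, and the snapshot date it derives is inferred from the re-certification tile, not stated on the page.

```python
from datetime import date, timedelta

# Values copied from the summary tiles; all names here are illustrative.
RUNS_TOTAL, RUNS_EVAL, RUNS_RED_TEAM = 312, 246, 66
COST_TOTAL_USD = 8_940
COST_JUDGE_USD, COST_RED_TEAM_USD = 4_200, 4_700  # shown as $4.2k and $4.7k (rounded)
RECERT_DATE, RECERT_IN_DAYS = date(2026, 5, 21), 69


def runs_consistent() -> bool:
    """Eval runs plus red-team runs should equal the 90-day run total."""
    return RUNS_EVAL + RUNS_RED_TEAM == RUNS_TOTAL


def cost_gap_usd() -> int:
    """Difference between the total cost and the sum of the rounded sub-costs."""
    return COST_TOTAL_USD - (COST_JUDGE_USD + COST_RED_TEAM_USD)


def implied_snapshot_date() -> date:
    """Date on which 'Re-cert in 69d' would point at May 21, 2026."""
    return RECERT_DATE - timedelta(days=RECERT_IN_DAYS)


if __name__ == "__main__":
    print("runs add up:", runs_consistent())
    print("cost gap (USD):", cost_gap_usd())
    print("implied snapshot date:", implied_snapshot_date().isoformat())
```

Under these assumptions the run counts add up exactly, the cost breakdown differs from the total only by rounding, and the re-certification countdown implies a snapshot taken in mid-March 2026, which is consistent with the review dates listed under the compliance posture.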
Health timeline · last 90 days
[Chart. Legend: eval · red-team · finding · deploy · ★ cert. Axis: −90d · −60d · −30d · today]
Composition snapshot
- Underlying model: anthropic.claude-3-5-sonnet-20241022-v2 · Anthropic · Bedrock-served
- Active prompt: claims-extraction-v18 · Internal · v18 of 18
- RAG pipeline: claims-knowledge-rag-v3 · Pinecone · 2.4M chunks · last refresh 6h ago
- Configuration: claims-prod-config-v12 · temp 0.2 · max_tokens 2048
- Tools: 4 connected · claims-database · policy-search · coverage-calculator · claims-photo-analysis-mcp
Recent verdicts
- 4h ago · Eval · Faithfulness eval · 94.2%
- 1d ago · Red Team · Pre-deployment audit · 4 critical findings
- 2d ago · Eval · Hallucination sweep · 0.8% trigger rate
- 3d ago · Eval · Helpfulness — adjuster scenarios · 91.5%
- 5d ago · Red Team · Indirect prompt-injection probe · 2 high findings
- 6d ago · Eval · Latency regression · p95 1.4s · within SLO
Quality & safety trends · last 6 months
[Chart. Series: Faithfulness · Helpfulness · Refusal rate · Hallucination]
Findings by month
[Chart. Months: Oct · Nov · Dec · Jan · Feb · Mar. Severity: critical · high · medium]
Top 5 standing concerns
- #1 Indirect prompt injection via RAG content · critical · 7 occurrences
- #2 Citation hallucination on out-of-policy claims · high · 5 occurrences
- #3 Refusal-bypass via Hinglish encoding · high · 4 occurrences
- #4 Tool argument over-fetching (PII overshare) · medium · 3 occurrences
- #5 Inconsistent disclaimer placement · medium · 2 occurrences
Compliance posture
Evidence pack last assembled Mar 8
- EU AI Act: 91% · Rev. Mar 8
- NIST AI RMF: 88% · Rev. Mar 4
- ISO 42001: 94% · Rev. Mar 11
- DPDP: 96% · Rev. Mar 1
- SR 11-7: 87% · Rev. Feb 22
Open compliance gaps
- Missing: human-oversight model documentation for EU AI Act Article 14
- Outdated: data provenance attestation for claims-knowledge-rag-v3 corpus
Risk classification rationale
Classified High under EU AI Act Annex III §5(b) (insurance underwriting & claims) due to material impact on consumer financial outcomes.
Classified Limited under SR 11-7 — model is decision-supporting only; adjuster retains final authority and reviews 100% of outputs.
AI-assisted classification by gpt-4o · confidence 0.92
Human override by Catherine O'Brien on Jan 14 — confirmed High tier with additional human-oversight guardrail (mandatory adjuster sign-off).