EUCLID

for oscar

A second set of eyes on every insurance lead you ingest.

AI-powered insurance lead fraud & anomaly detection. Euclid scores every inbound lead across three risk engines, surfaces the patterns that matter, and routes only what needs a human to your analysts.

Product Overview · Confidential

Prepared for
Oscar Health

Savings delivered · $1M+ saved for customers to date, across deployed carriers

Monthly volume · 500K lead records processed per month, and growing

False-positive rate · < 11% (industry average ~30%)

Score distribution · last 90 days · 2.1M leads scored (all time, cross-carrier)

Low · ~1.43M · fast-track

Medium · ~441K · standard review

High · ~168K · analyst queue

Critical · ~63K · escalate now

Distribution stays consistent across deployed carriers · < 11% false-positive rate on flagged leads, vs. ~30% industry average

01 / pipeline

How a lead moves through Euclid

CSV upload & layout detection

Auto-maps unknown carrier formats via LLM.

Validate & enrich

Email + Phone + Address · PII tokenized and decentralized before processing.

3-engine risk scoring

Identity · Fraud · Health, run in parallel.

Composite score & tier

Single 0-100 Euclid Score · Low / Med / High / Critical.

Results, reports & analyst review

Routed queues · audit trail · top-5 reasons.
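The "PII tokenized" step in the pipeline above could look something like the following. This is a minimal sketch, not Euclid's implementation: it assumes keyed hashing (HMAC-SHA256) as the tokenization scheme, and `SECRET_KEY`, `tokenize`, and `tokenize_lead` are hypothetical names introduced only for illustration.

```python
import hashlib
import hmac

# Hypothetical demo key; a real deployment would load this from a key manager.
SECRET_KEY = b"demo-key-not-for-production"


def tokenize(value: str, field: str) -> str:
    """Return a deterministic, non-reversible token for one PII value."""
    normalized = value.strip().lower()
    message = f"{field}:{normalized}".encode("utf-8")
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"tok_{field}_{digest[:16]}"


def tokenize_lead(lead: dict, pii_fields=("email", "phone", "address")) -> dict:
    """Copy a lead, replacing non-empty PII fields with tokens."""
    return {
        key: tokenize(value, key) if key in pii_fields and value else value
        for key, value in lead.items()
    }
```

Because the token is deterministic, the same email or phone still matches itself across a batch, so duplicate-detection rules can run on tokens rather than raw values.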

Live job tracker · 4 active · 312 jobs today

JOB-7842 · carrier_intake_may26.csv · layout: auto-detected · 47,500 rows · Processing · 04:11
JOB-7841 · broker_book_batch_q2.csv · scored: 22,000 / 22,000 · 22,000 rows · Completed · 2m ago
JOB-7840 · ffm_extract_q2_2026.csv · scored: 38,000 / 38,000 · 38,000 rows · Completed · 4m ago
JOB-7839 · monthly_agent_review.csv · queued behind 3 jobs · 15,000 rows · Pending

Celery async workers · 15-min stuck-job retry · All PII tokenized at ingestion
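The 15-minute stuck-job retry can be pictured as a periodic sweep over the job table. The sketch below is plain Python rather than Celery's own API, and it assumes each job records a `last_heartbeat` timestamp; the `Job` class and `requeue_stuck_jobs` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

STUCK_AFTER = timedelta(minutes=15)  # threshold taken from the tracker note


@dataclass
class Job:
    job_id: str
    status: str  # "Pending", "Processing", or "Completed"
    last_heartbeat: datetime  # assumed field, not confirmed by the source


def requeue_stuck_jobs(jobs: list[Job], now: datetime) -> list[str]:
    """Move jobs that have been Processing without progress back to Pending."""
    requeued = []
    for job in jobs:
        if job.status == "Processing" and now - job.last_heartbeat >= STUCK_AFTER:
            job.status = "Pending"
            requeued.append(job.job_id)
    return requeued
```

Completed and healthy jobs are left untouched; only a job that has sat in Processing for 15 minutes or more is handed back to the queue.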

02 / scoring engines

Three engines, one Euclid Score

IR · Identity Risk · w = 0.80

Synthetic identity detection

Checks every lead for fake PII, recycled contact details, and identities that show up under different names across the book. When email and phone APIs return results, those signals feed directly into the score. When they don't, heuristics fill the gap.

FS · Fraud Signal · w = 0.20

Organized fraud patterns

Looks across the whole batch for patterns that only appear when something organized is happening: the same phone on five different names, an agent submitting 300 policies in a week, a cluster of addresses that don't exist. 35 documented rules, derived from real forensic reviews.

Graph · 35 catalogue rules · 6 pattern families
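One of the batch-level patterns mentioned above, the same phone appearing under several different names, can be sketched in a few lines. This is an illustration only: the function name, the field names `phone` and `name`, and the default threshold of five are assumptions, not entries from the rule catalogue.

```python
from collections import defaultdict


def phones_shared_across_names(leads, min_names=5):
    """Return {phone: sorted names} for phones used under min_names+ distinct names."""
    names_by_phone = defaultdict(set)
    for lead in leads:
        # Compare on digits only so "(555) 010-0000" and "555-010-0000" match.
        phone = "".join(ch for ch in lead["phone"] if ch.isdigit())
        names_by_phone[phone].add(lead["name"].strip().lower())
    return {
        phone: sorted(names)
        for phone, names in names_by_phone.items()
        if len(names) >= min_names
    }
```

The check only makes sense across a whole batch: any single lead with that phone looks normal in isolation.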

HR · Health Outlook Risk · w = 0.10

Comorbidity scoring

When health data is present, age, BMI, smoker status, and chronic conditions are scored against a weighted comorbidity matrix. This layer is optional. Carriers without health columns skip it entirely, and the composite re-weights automatically.

Clinical · weighted comorbidity matrix

Composite Score Formula

IRC = 0.8 × IR + 0.2 × FS

Euclid Score = 0.9 × IRC + 0.1 × HR

Low · < 35

Medium · 35 – 64

High · 65 – 84

Critical · ≥ 85
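The two-step formula and the tier cut-offs translate directly into code. The sketch below uses only the base weights; it does not model the evidence-based flexing described in the note on weights. It also makes one assumption the source does not spell out: when health data is absent, the score falls back to IRC alone. Function names are hypothetical. Expanding the formula gives effective weights of 0.72 for IR, 0.18 for FS, and 0.10 for HR.

```python
from typing import Optional


def irc(ir: float, fs: float) -> float:
    """Identity-risk composite: IRC = 0.8 * IR + 0.2 * FS."""
    return 0.8 * ir + 0.2 * fs


def euclid_score(ir: float, fs: float, hr: Optional[float] = None) -> float:
    """Euclid Score = 0.9 * IRC + 0.1 * HR, on a 0-100 scale."""
    base = irc(ir, fs)
    if hr is None:
        # Assumption: without health columns the composite is IRC alone.
        return base
    return 0.9 * base + 0.1 * hr


def tier(score: float) -> str:
    """Map a score to its tier; fractional scores round down into the lower tier."""
    if score >= 85:
        return "Critical"
    if score >= 65:
        return "High"
    if score >= 35:
        return "Medium"
    return "Low"
```

With made-up engine outputs IR = 80, FS = 90, HR = 40, IRC is 82 and the composite is 0.9 × 82 + 0.1 × 40 = 77.8, which lands in the High tier.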

Note on weights · Weights flex with evidence strength. Multiple weaker flags balance each other and keep the composite moderate. When fraud signals concentrate heavily on a single agent or lead, FS pushes past its base 0.20 share and dominates the score, outweighing the identity signal.

Sample lead · score breakdown · 5 reasons surfaced · routed to analyst queue

LEAD-04812 · anonymized · 78 / 100 · High Risk

Per-engine breakdown: IR · Identity Risk, FS · Fraud Signal, HR · Health Outlook

Top 5 weighted risk reasons

$0 premium concentration · FFM ID reused across policies · Auto-generated email fingerprint · Cross-state agent licensing mismatch · Multi-carrier termination footprint

03 / human + machine

LLM orchestration meets analyst judgement

LLM Orchestration

Theo

Theo handles the parts of the job that don't fit neatly into a rule: reading a carrier CSV format it has never seen before, turning a score into a sentence an analyst can act on, and answering plain-English questions over a scored book. Theo manages prompt routing and PII tokenization, and ties every AI output to a versioned audit record so every flag is traceable.

Layout detection · Risk explanations · NL Q&A

Predictive Models

Euclid-trained models

The scoring models were trained on real broker books with known outcomes, including what clean, legitimate books look like, not just fraudulent ones. A dedicated comorbidity layer adds clinical context on top of identity and fraud signals for carriers whose leads include health data.

BoB corpus · Partner-carrier data · Medical & comorbidity

Manual Review

Analyst queues

Not everything gets auto-decided. High-risk and borderline leads land in a structured queue where a human makes the final call. Euclid shows the five highest-weight reasons for the flag so analysts spend their time investigating, not re-deriving the score. Overrides feed back into future model improvements.

Top-5 reasons · Audit trail · Override & feedback
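Surfacing the five highest-weight reasons for a flag amounts to a top-k selection over the reasons that fired. A minimal sketch, assuming each fired reason carries a numeric weight; the function name is hypothetical and any weights passed in are illustrative, not Euclid's.

```python
import heapq


def top_reasons(reasons: dict[str, float], k: int = 5) -> list[str]:
    """Return the k reason labels with the highest weights, heaviest first."""
    ranked = heapq.nlargest(k, reasons.items(), key=lambda item: item[1])
    return [name for name, _ in ranked]
```

Anything below the cut stays in the audit record but off the analyst's screen, which keeps the queue view short.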

Benchmark Intelligence

What powers every Euclid Score

Every Euclid Score can be walked back to a specific rule. The pattern catalogue is derived from three real-world corpora: actual broker-book forensic reviews, carrier-side investigation records, and cross-industry suspension watchlists. Theo processes everything with PII tokenized before any prompt is assembled, and every catalogue version that produced a score is written into the audit trail.

Documented patterns · 35 · Real forensic-review provenance, not synthetic

Pattern families · 6 · Policy, broker, and agency levels

Agency watchlist · 180d · Rolling cross-industry suspension window

Catalogue version · v1.2.2 · Last refresh 2026-05-10

w · IR / FS / HR

Scoring formula

Hybrid composite. Algorithmic logic enriched by AI pattern recognition and external validation APIs.

IRC = 0.8·IR + 0.2·FS
Score = 0.9·IRC + 0.1·HR
LLM enrichment · weighted by pattern confidence
Four tiers: Low / Med / High / Critical

0–100 composite

35 · 6 families

Pattern catalogue

Documented fraud signatures derived from real broker-book forensic reviews, not theoretical edge cases.

Per-policy signatures (subsidy & eligibility)
Per-broker concentration & velocity
Per-agency coordination fingerprints
Every flag links back to a rule

Quarterly catalogue refresh

Cross-industry

Brokerage oversight layer

Cross-industry signal that elevates risk priors for agents and agencies with recent suspension, termination, or compliance history, independent of any single carrier's view.

Agency reputation watchlist
Multi-carrier suspension footprint
Coaching & compliance history
Agent-network linkage signals

Industry-wide signal · agent-level resolution

v1.2.2 · 2026-05-10

Versioning & governance

Catalogue versions are immutable. Every score is stamped with the version that produced it.

PII tokenized pre-prompt (names, emails, phones)
Score-version audit trail
Historical scores preserved on update
Published diff with every catalogue release

Audit-grade explainability
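Version stamping with preserved history can be modelled as an append-only log of immutable records. This is a sketch under that assumption; `ScoreRecord` and `AuditTrail` are hypothetical names, and the storage behind them is not described in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreRecord:
    """One score, stamped with the catalogue version that produced it."""
    lead_id: str
    score: float
    catalogue_version: str


class AuditTrail:
    """Append-only log: re-scoring adds a record and never edits an old one."""

    def __init__(self) -> None:
        self._records: list[ScoreRecord] = []

    def record(self, lead_id: str, score: float, catalogue_version: str) -> ScoreRecord:
        entry = ScoreRecord(lead_id, score, catalogue_version)
        self._records.append(entry)
        return entry

    def history(self, lead_id: str) -> list[ScoreRecord]:
        """All scores ever issued for a lead, oldest first."""
        return [r for r in self._records if r.lead_id == lead_id]
```

Re-scoring a lead under a newer catalogue adds a second record next to the first, so an analyst can see which rule set produced each number.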

The model also knows what a clean book looks like. Legitimate broker books have a spread of premium tiers, varied FPL distributions, multi-carrier diversity, and real engagement signals like autopay and account creation. Teaching the model that boundary is what keeps the false-positive rate where it is.

< 11% · False positives

Four fraud categories Euclid detects · Surfaced via top-5 reason chips on every flagged lead

Identity fraud

Invalid contact data, suspicious or non-residential addresses, auto-generated email fingerprints, disposable phone numbers.

IR · w=0.80 · Primary engine

Inflated households

≥3 lives on the majority of policies, ≥6 lives always investigated, FFM application IDs reused across distinct policies.

FS · w=0.20 · Signal layer
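The two household-size checks can be expressed over a single broker's book as follows. The sketch assumes "majority" means strictly more than half of the policies and takes a plain list of covered lives per policy as input; the function name and return keys are hypothetical.

```python
def household_flags(lives_per_policy: list[int]) -> dict:
    """Apply the >=3-lives majority check and the >=6-lives check to one book."""
    total = len(lives_per_policy)
    large = sum(1 for lives in lives_per_policy if lives >= 3)
    return {
        # True when more than half of the policies cover three or more lives.
        "majority_3_plus": total > 0 and large > total / 2,
        # Positions of policies that are always sent for investigation.
        "always_investigate": [
            index for index, lives in enumerate(lives_per_policy) if lives >= 6
        ],
    }
```

The first flag describes the book as a whole, while the second points at individual policies, which is why the two are returned separately.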

Premium & subsidy

$0 premium with positive APTC across the book, FPL clustering at subsidy sweet-spots, single-plan-SKU funneling.

FS · w=0.20 · Signal layer

Agent red flags

Excessive policy volume per state (250+), cross-state licensing mismatches, future-effective enrollment surges, multi-carrier termination footprint.

FS · w=0.20 + Brokerage oversight

04 / market

Why now

Insurance fraud detection market

$7.2B (2025) → $20.2B by 2031 · 19% CAGR

ML deployment among global insurers

62% in 2025 · and rising