Product Status · Internal Review

What we are building, where it stands, and what it taught us.

2026-07-03 · companion to the AI Product Guide
01 · THE PRODUCT

One system, one shape.

A human-in-the-loop pipeline that turns messy inbound customer communication into structured commercial output. Every client is a configuration of this same pipeline, never a rebuild.

1 Gather → 2 Identify → 3 Parse → 4 Match → 5 Human loop → 6 Act & deliver
Foundation: LLM gateway, tenancy, trust, cost control, observability
01 · THE PRODUCT

Six areas in a chain, one shared foundation.

01
Gather
Every channel a client's customers use, in one place
02
Identify & tag
What is this: request, clarification, noise?
03
Parse
Items requested, plus the context around them
04
Match
Request × system of record, schema in the middle
05
Human loop
Humans fix cases; their behavior teaches the product
06
Act & deliver
Branded quote, ERP write, delivery
02 · AREA 01

Gather

Shipped
Outlook + Gmail OAuth pollers, 5-min beat
Attachment extraction: PDF, Excel, CSV, ZIP, image OCR
Dedup by fingerprint + provider message-id
Manual .eml upload
Partial / missing
#268         Body truncated at 10k chars, no flag
#270         Scanned PDFs never reach the vision path
#252 P0      Reply has no structural link to its case
not tracked  WhatsApp, phone calls
Priority: #268 → #270 → the #252 thread model the moment its workflow questions are answered.
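
As an illustration of the dedup rule and of the #268 fix, here is a minimal sketch. It assumes a message is a duplicate if either its provider message-id or its content fingerprint has been seen before. The `InboundMessage` fields, the fingerprint recipe, and `truncate_with_flag` are hypothetical and are not taken from the shipped code.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Tuple


@dataclass(frozen=True)
class InboundMessage:
    provider: str    # e.g. "gmail" or "outlook"
    message_id: str  # provider-assigned id; may be empty for a manual .eml upload
    sender: str
    subject: str
    body: str


def fingerprint(msg: InboundMessage) -> str:
    """Content hash over normalized sender, subject and body (assumed recipe)."""
    parts = [
        msg.sender.strip().lower(),
        " ".join(msg.subject.split()),
        " ".join(msg.body.split()),
    ]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


@dataclass
class Deduplicator:
    seen_ids: set = field(default_factory=set)
    seen_fingerprints: set = field(default_factory=set)

    def is_new(self, msg: InboundMessage) -> bool:
        """Return True and record the message if neither key was seen before."""
        id_key = (msg.provider, msg.message_id) if msg.message_id else None
        fp = fingerprint(msg)
        if (id_key is not None and id_key in self.seen_ids) or fp in self.seen_fingerprints:
            return False
        if id_key is not None:
            self.seen_ids.add(id_key)
        self.seen_fingerprints.add(fp)
        return True


def truncate_with_flag(body: str, limit: int = 10_000) -> Tuple[str, bool]:
    """Cut the body at `limit` characters and report whether anything was lost."""
    truncated = len(body) > limit
    return (body[:limit] if truncated else body), truncated
```

Returning the flag alongside the text lets downstream stages see that the body was cut instead of parsing a partial email as if it were complete.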
02 · AREA 02

Identify and tag

Shipped
Classifier with confidence gate
Full audit log of every decision
Customer match at intake
Partial / missing
#252         Clarification reply tagged INQUIRY, silently discarded
#198         No email-direction awareness: supplier mail misread as RFQ
future       Case types beyond RFQ (orders, returns, after-sales)
Worst silent failure in the product: a client asking a question gets no row, no extraction, no signal, just silence.
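
One way to remove the silent path is to make every classifier outcome land on an explicit route. The sketch below is an assumption about how such a gate could look: the label set, the route names, and the 0.8 threshold are illustrative and do not describe the shipped classifier.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    RFQ = "rfq"
    CLARIFICATION = "clarification"
    INQUIRY = "inquiry"
    NOISE = "noise"


class Route(Enum):
    PIPELINE = "pipeline"          # continue to Parse
    HUMAN_REVIEW = "human_review"  # visible row for a person to triage
    ARCHIVE = "archive"            # leaves the queue, still carries a reason


@dataclass(frozen=True)
class Decision:
    label: Label
    confidence: float
    route: Route
    reason: str


def gate(label: Label, confidence: float, threshold: float = 0.8) -> Decision:
    """Map a classification to a route; no branch returns without a route and reason."""
    if confidence < threshold:
        return Decision(label, confidence, Route.HUMAN_REVIEW, "below confidence threshold")
    if label is Label.RFQ:
        return Decision(label, confidence, Route.PIPELINE, "confident request")
    if label is Label.NOISE:
        return Decision(label, confidence, Route.ARCHIVE, "confident noise")
    # Clarifications and inquiries come from a customer, so they always surface.
    return Decision(label, confidence, Route.HUMAN_REVIEW, "customer message needs a reply")
```

Under these assumptions a clarification tagged INQUIRY still produces a row in the review queue, whatever the confidence.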
02 · AREA 03

Parse

Parse output is not a list of items. It is items plus a context ledger. "Outdoor project" constrains every item's IP rating. "Like this SKU but horizontal" is a reference plus a modification, not free text.
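
To make "items plus a context ledger" concrete, here is a minimal data-structure sketch. All class and field names are hypothetical, the scoping rule (email-wide entries first, section entries override) is an assumption, and the IP values in the usage note are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContextEntry:
    scope: str        # "email" for the whole message, or a section/header name
    attribute: str    # e.g. "ip_rating_min"
    value: str
    source_text: str  # the phrase that justified the constraint


@dataclass
class ItemRequest:
    description: str
    quantity: int
    scope: str = "email"
    reference_sku: Optional[str] = None                          # "like this SKU ..."
    modifications: Dict[str, str] = field(default_factory=dict)  # "... but horizontal"


@dataclass
class ParseOutput:
    items: List[ItemRequest]
    ledger: List[ContextEntry]

    def constraints_for(self, item: ItemRequest) -> Dict[str, str]:
        """Email-wide entries apply first; entries scoped to the item's section override them."""
        merged: Dict[str, str] = {}
        for entry in self.ledger:
            if entry.scope == "email":
                merged[entry.attribute] = entry.value
        for entry in self.ledger:
            if entry.scope != "email" and entry.scope == item.scope:
                merged[entry.attribute] = entry.value
        return merged
```

For example, a ledger entry `ContextEntry("email", "ip_rating_min", "IP65", "outdoor project")` would constrain every item in the email, while `ItemRequest("linear fixture", 12, reference_sku="SKU-1", modifications={"orientation": "horizontal"})` keeps the reference and the modification as structured fields.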

Shipped
Line-item extraction + prompt-injection defence
Spec normalizer + extractors (watt, lumen, kelvin, IP, dims, UGR)
Partial / missing
#271         One whole-email context only: mixed projects unmatchable
not tracked  The context ledger · relational requests
#271 is the flagship known bug: headers detected, then dropped. It gates the extraction restructure and is where the items-plus-ledger belief was born.
02 · AREA 04

Match

Parsed request on one side, the client's system of record on the other. The attribute schema is the bridge. This is where the technology's value concentrates.

Shipped
Cheap-first chain: code → register → competitor → retrieval → judge
LLM catalog enrichment with human review
Partial / missing
#277         Schema still lives in code, not a written artifact
#277/278     2 live invariant violations: CRI, UGR judged with no data
Method: attribute-schema-bootstrap-playbook.md. Derive it once, checkably, never through incidents.
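
The cheap-first idea can be sketched as an ordered list of resolvers where the first hit wins and later, more expensive stages never run. This is a simplification: it treats each stage as an independent resolver, whereas the real retrieval stage presumably feeds candidates to the judge. `run_chain` and the `Matcher` type are hypothetical names.

```python
from typing import Callable, List, Optional, Tuple

# A matcher takes the request text and returns a SKU, or None if it cannot decide.
Matcher = Callable[[str], Optional[str]]


def run_chain(request: str, stages: List[Tuple[str, Matcher]]) -> Tuple[Optional[str], List[str]]:
    """Try stages in order; return the first SKU found and the stage names attempted."""
    tried: List[str] = []
    for name, matcher in stages:
        tried.append(name)
        sku = matcher(request)
        if sku is not None:
            return sku, tried
    return None, tried
```

Returning the list of attempted stages alongside the result gives the audit log a record of how far down the chain each request had to travel, which is also a rough proxy for its cost.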
02 · AREA 05

Human loop: two loops share the screen

Loop 1: fix this case
Two-panel review UI, quote builder
Per-item override, assignment, locking
Loop 2: teach the product
44-case eval corpus, 3-phase reporter
not tracked  Behavioral signal taxonomy · coverage KPI
Loop 2 is the biggest gap. Only the final SKU correction is captured. The coverage KPI (diffing Odoo orders against platform-created ones to catch bypass) is cheap to build and unmeasured today.
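
The coverage KPI reduces to a set difference. The sketch below assumes both sides can be exported as lists of order references; the function name and the reference format are hypothetical, and no Odoo API call is shown.

```python
from typing import Iterable, Set, Tuple


def coverage(all_order_refs: Iterable[str], platform_order_refs: Iterable[str]) -> Tuple[float, Set[str]]:
    """Share of ERP orders that originated on the platform, plus the bypassed ones.

    Returns (0.0, empty set) when the ERP export is empty.
    """
    all_refs = set(all_order_refs)
    if not all_refs:
        return 0.0, set()
    bypassed = all_refs - set(platform_order_refs)
    return (len(all_refs) - len(bypassed)) / len(all_refs), bypassed
```

The bypassed set is as useful as the ratio: each reference in it is an order someone chose to create outside the platform, which is a behavioral signal worth reviewing.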
02 · AREA 06

Act and deliver

Shipped
Quote PDF · Excel export
Idempotent Odoo draft order, never auto-confirms
Partial / missing
Delivery stub: hardcoded to an internal address
Automated send · fiscal compliance · translated output
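
A common way to get an idempotent draft write is to derive a key from the case and quote version and look it up before creating anything. The sketch below uses an in-memory `FakeErp` as a stand-in; it is not the Odoo API, and the key recipe and function names are assumptions.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FakeErp:
    """In-memory stand-in for the ERP, for illustration only."""
    orders_by_key: Dict[str, dict] = field(default_factory=dict)

    def create_draft(self, key: str, lines: list) -> dict:
        order = {"id": len(self.orders_by_key) + 1, "state": "draft", "lines": lines, "key": key}
        self.orders_by_key[key] = order
        return order


def idempotency_key(case_id: str, quote_version: int) -> str:
    """Stable key for one quote version of one case (assumed recipe)."""
    return hashlib.sha256(f"{case_id}:{quote_version}".encode("utf-8")).hexdigest()[:16]


def write_draft_order(erp: FakeErp, case_id: str, quote_version: int, lines: list) -> dict:
    """Create the draft once; a retry with the same key returns the existing order."""
    key = idempotency_key(case_id, quote_version)
    existing = erp.orders_by_key.get(key)
    if existing is not None:
        return existing
    return erp.create_draft(key, lines)
```

The order is only ever created in the draft state, so confirmation stays a human action.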
02 · FOUNDATION

The trust cluster: pending, urgent

The boring substrate every product shares, built once. Four urgent audit findings sit here; none can be deferred if we want to make a security claim to a client.

Shipped
Async workers, retry, cost tracking per call
JWT auth, error monitoring, prod Docker profile
Missing, urgent
LLM gateway · EU data residency · ZDR contract
Field encryption · secret manager · tenant_id · rate limiting
03 · ARONLIGHT

Where we are

Live in production at aronlight.adaptto.ai, running the full chain end to end on an all-Haiku pipeline with per-call cost accounting.

Now
#262 deterministic baseline: shipped
#297/#299 lifecycle status: shipped
#258 mounting: shipped end-to-end
#309 adjustability retrieval (P0, live edge)
Next
#315 retrieval-data-path CI test
#277/278 canonical schema + CI
#261 hybrid retrieval
#227 correction flywheel
The live edge right now: #309, adjustability retrieval. Extraction and catalog data both existed, the judge could reason about it, and the system still quoted the wrong fixture, because retrieval never surfaced the correct one for the judge to see. Next slide: why that keeps happening.
03 · ARONLIGHT

Three invariants, each found one incident deeper

The attribute schema records, per attribute: can we extract it, does the catalog have it, can retrieval act on it. All three came from the same original case, resurfacing one layer deeper each time.

01
Discriminator invariant. If two products differ only by attribute X, X must be extractable. LUKE (fixed) vs. Torq (rotatable): nothing extracted adjustability. RFQ #120.
02
Judge data-path invariant. Every attribute the judge rules on needs data on at least one side. Fixed, but the wrong fixture still shipped.
03
Retrieval data-path invariant. The judge can only rank what retrieval hands it. Extraction and catalog data existed; retrieval never used them. #309, found a day after the prior fix was closed with this case cited as resolved, without re-running the eval.
The pattern underneath the pattern: an attribute can satisfy every invariant written down so far and still fail, because each invariant was discovered only when the previous ones proved insufficient. Expect a fourth. The list is a floor, not a ceiling.
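
The three invariants can be expressed as checks over a per-attribute schema record, which is what would let a CI job fail before a wrong quote ships. The sketch below is one possible encoding, not the project's schema: field names are hypothetical, and the third check (data on both sides implies retrieval must use it) is an interpretation of the slide.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AttributeSpec:
    extractable: bool        # can we extract it from a request?
    in_catalog: bool         # does the catalog carry it?
    used_by_retrieval: bool  # can retrieval act on it?
    discriminates: bool      # do two products differ only by this attribute?
    judged: bool             # does the judge rule on it?


def violations(schema: Dict[str, AttributeSpec]) -> List[str]:
    """Return one message per broken invariant, in attribute-name order."""
    found: List[str] = []
    for name, spec in sorted(schema.items()):
        if spec.discriminates and not spec.extractable:
            found.append(f"{name}: discriminator invariant")
        if spec.judged and not (spec.extractable or spec.in_catalog):
            found.append(f"{name}: judge data-path invariant")
        if spec.extractable and spec.in_catalog and not spec.used_by_retrieval:
            found.append(f"{name}: retrieval data-path invariant")
    return found
```

A CI step could call `violations` on the written schema and exit non-zero when the list is not empty. A fourth invariant, when it is found, becomes one more check in the loop.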
03 · ARONLIGHT

What building it taught us

01
Schema first, never incidents. Four same-class matching bugs in a week were all knowable in week zero.
02
Silent failure is the default state. Truncated emails, blank OCR, discarded replies: none threw an error.
03
You cannot improve what you cannot measure. ±10-point eval noise was hiding every real regression.
04
Catalog data quality beats matcher cleverness. 1,000+ uncategorized products explain more misses than any ranking bug.
05
Closed does not mean done. Security issues closed without implementation created false confidence.
06
Structure must survive parsing. Flattened headers and columns make downstream matching structurally impossible.
07
The human loop only teaches if it's wired. Corrections were captured, not fed back. Learning is built, not assumed.
08
Closing an issue isn't closing the gap. Twice now, closed without implementation or without re-running the eval. Verification needs evidence, not a merged diff.
09
An invariant list is a floor, not a ceiling. Two invariants shipped; a production incident found a third. Expect a fourth.
04 · HORIZONTAL + VERTICAL, ONE WORK LIST

Where the guides disagree is the roadmap.

The System Guide is horizontal: what good looks like, for any product. The Product Guide is vertical: what exists, for this one. The gap between them is the work list, and this week an audit of the two-guide system itself found two more "closed but not done" incidents in the guides' own upkeep.

Why both exist: the Product Guide means the next build starts from what we already know, not from zero. The System Guide means moving fast never quietly becomes moving carelessly, on any product, not just this one.
01
Adjustability retrieval
#309: the live edge of the schema work
02
The trust cluster
Gateway, ZDR, encryption, secrets, corpus anonymization: still the top unresolved finding
03
Schema formalization
#277/#278: still lives in code, not in docs/attribute-schema.md
04
Stable IDs
55 checklist items, 3-and-growing invariants: neither has a durable citation anchor
05
Adaptto Core decision
#55: lands once, hard to retrofit
05 · WHAT MATTERS MOST

Not just "an AI that reads emails."

01
Context is the moat
Items are easy. Understanding that "outdoor" constrains every item, or that a tender implies specs nobody wrote, is the hard and valuable part.
02
The schema is the technology
Between a messy request and a 4,300-product catalog sits a checkable, portable schema that turns a wrong quote into a build failure. Checkable turned out to mean three invariants deep so far, not one.
03
The product learns from use
Every accept, override, and abandonment is a verdict on one part of the chain. Wiring these signals is what compounds.
06 · NEW CLIENTS

What onboarding looks like, area by area

Because every client is a configuration of the same chain, coverage reads per area.

Area         Second lighting client   New vertical
Gather       Strong                   Good
Identify     Strong                   Good
Parse        Good                     Partial
Match        Strong                   Partial
Human loop   Strong / Weak            Same
Act          Good                     Partial
Foundation   Weak                     Weak
The two weak cells are the same for every client. That's exactly why they rank high on the roadmap. Paid once, every client benefits.
Full detail

The registry behind this story