Product Manager, Human Data Platform
Declared semantics beat inferred semantics. I have built this at five scales. This is the argument that the same discipline works here, at the most consequential scale I have encountered.
Every platform that relies on human judgment to improve AI has the same underlying problem: the humans guess what the AI needs, and the AI guesses what the humans meant. I have spent a decade designing systems that remove the guess. The work is not clever. It is patient and structural: find the unnamed thing, name it, and build a surface that makes the name load-bearing.
The pattern across every role is the same move at a different scale. Take a domain where people are guessing. Decompose the uncertainty into a named schema. Ship an interface that converts the guess into a declaration.
Five receipts:
- Oracle · FieldSync (Principal PM). First of more than 250 teams. Built a semantic event store for construction project management. AI traverses declared relationships rather than guessing from text. Presented to VP of Construction. Framework: Vendor Abdication, Dark by Default, Human Authority Preserved.
- Oracle · Procore Imports (PM). 94% adoption year one. The adoption came from removing the ambiguous error surface. Once the import schema was explicit, the guessing stopped.
- Amazon · Connections (PM). Rebuilt the daily sentiment instrument for approximately two million employees. Participation tripled. The fix was removing ambiguity from the question itself, not improving how answers were analyzed.
- Instill (Founding PM). Research-to-product interface. NPS climbed from 80 to 95. Chose growth potential over flight risk at the individual level. Refused to ship AI that treats people as liabilities.
- ORÍ Central (Builder). Personal AI operating system. Five emission types. Multi-agent orchestration. HITL gating on every destructive operation. More than 700 structured outputs. My closest evidence for "uses AI regularly."
The HDP PM role is the canonical version of this work. The platform I would build ships rubrics, not instructions; contracts, not guidelines; feedback loops, not hope.
Get in touch: tobiolofintuyi@gmail.com
The market is structuring human judgment at scale for the first time. The problem no platform has solved is the interface between the researcher who needs training data and the contractor who creates it. Every gap in that interface is a quality defect in the model.
I have spent my career closing that gap in adjacent domains. The HDP PM who closes it at Anthropic sets the ceiling on what these models can know.
The technical argument is one move applied at every layer: replace inference with declaration. Vectorless RAG proved it for enterprise configuration data: when you declare relationships rather than embedding and retrieving, the query surface becomes deterministic and auditable. The same move applies to labeling: when you replace "use your judgment" with a versioned, typed rubric, quality becomes measurable. One is a configuration grammar. The other is a data contract. Same primitive, different domain.
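A minimal sketch of what "a versioned, typed rubric" could look like in Python. The names here (`Criterion`, `Rubric`, `agreement`, and the example criteria) are hypothetical illustrations, not taken from any existing platform. The point is that once the allowed answers are declared, a label either conforms or it does not, and agreement between labelers becomes a number.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    allowed: tuple[str, ...]  # the only values a labeler may declare


@dataclass(frozen=True)
class Rubric:
    version: str
    criteria: tuple[Criterion, ...]

    def validate(self, label: dict[str, str]) -> list[str]:
        """Return a list of violations; an empty list means the label conforms."""
        errors = []
        for c in self.criteria:
            if c.name not in label:
                errors.append(f"missing: {c.name}")
            elif label[c.name] not in c.allowed:
                errors.append(f"invalid value for {c.name}: {label[c.name]!r}")
        declared = {c.name for c in self.criteria}
        errors.extend(f"undeclared: {k}" for k in sorted(set(label) - declared))
        return errors


def agreement(rubric: Rubric, a: list[dict[str, str]], b: list[dict[str, str]]) -> dict[str, float]:
    """Per-criterion fraction of items on which two labelers gave the same value."""
    return {
        c.name: sum(x[c.name] == y[c.name] for x, y in zip(a, b)) / len(a)
        for c in rubric.criteria
    }


# Hypothetical rubric, for illustration only.
RUBRIC = Rubric(
    version="1.0.0",
    criteria=(
        Criterion("helpfulness", ("low", "high")),
        Criterion("harmful", ("yes", "no")),
    ),
)
```

With "use your judgment," a label like `{"helpfulness": "great"}` is silently accepted. Here `RUBRIC.validate` rejects it and names both the invalid value and the missing criterion.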
Every interface that relies on human judgment has an implicit schema. Most of the time, that schema lives in the labeler's head. When it lives in the labeler's head, you cannot version it, you cannot diff it, and you cannot debug it when the model behaves unexpectedly.
The systems I have built make the schema explicit and load-bearing:
- CQL (Configuration Query Language). Declarative grammar for Vectorless RAG configuration relationships. Ten instances. No embedding layer.
- The 4xx error contract for Procore Imports. A typed schema for the import surface. 94% adoption. The contract replaced the error message.
- The semantic event store for Oracle Unifier Configuration Services. Declared relationships that NL queries traverse without inference. Built as a business case before the role that would have owned it existed.
- The "rather not answer" option on Amazon Connections. A declared signal about data quality, not an absence of data. Trust as system property, not feature.
The HDP platform needs this primitive at the labeling layer: a rubric schema that is versioned, diffable, and provenance-tracked. The research team gets ground truth they can audit. The labeler gets instructions they can follow. The gap closes.
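To make "versioned, diffable, and provenance-tracked" concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the `RubricVersion` fields, the `diff_rubrics` helper, and the example criteria are hypothetical and do not describe an existing system.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RubricVersion:
    version: str
    author: str                # provenance: who made the change
    reason: str                # provenance: why the change was made
    criteria: dict[str, str]   # criterion name -> instruction shown to the labeler


def diff_rubrics(old: RubricVersion, new: RubricVersion) -> dict[str, list[str]]:
    """List which criteria were added, removed, or reworded between two versions."""
    old_keys, new_keys = set(old.criteria), set(new.criteria)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "changed": sorted(
            k for k in old_keys & new_keys if old.criteria[k] != new.criteria[k]
        ),
    }


# Hypothetical versions, for illustration only.
V1 = RubricVersion(
    version="1.0.0",
    author="researcher-a",
    reason="initial rubric",
    criteria={"tone": "Is the reply polite?", "accuracy": "Is the reply correct?"},
)
V2 = RubricVersion(
    version="1.1.0",
    author="researcher-b",
    reason="labelers disagreed on tone; accuracy moved to a separate task",
    criteria={"tone": "Is the reply polite and concise?", "safety": "Is the reply safe?"},
)

CHANGES = diff_rubrics(V1, V2)
```

When a model starts behaving unexpectedly, a diff like `CHANGES` plus the `author` and `reason` fields gives the research team somewhere to look, which a schema living in the labeler's head never can.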
I got interested in this problem because of therbligs. Frank Gilbreth decomposed physical work into 17 atomic motions in 1908. He did it not to dehumanize workers but to make their skill explicit, teachable, and improvable. The therblig said: here is what you are actually doing, in terms that survive your departure.
I have been doing that for knowledge work for a decade. At Procore it was import errors. At Amazon it was engagement signals. At Oracle it was configuration relationships. At Instill it was psychological trait indicators. The work is the same each time: find the unnamed thing, name it, and build a surface that makes the name useful to the next person.
At Anthropic, the unnamed things are: what makes a good annotation, what signals a bad rubric, what constitutes a quality defect in human-generated training data. I want to name those. I want to build the surface. I would rather explain why we shipped late than why we shipped harm.
This page exists because a portfolio URL on a resume is a signal. It says: there is a body of work, it is maintained, and it is specific to this application.
If you arrived outside the expected modes, the short version: I make things explicit that used to be tacit. I think that is what most good product work is. The Anthropic Human Data Platform is where that work happens at the highest stakes I have encountered.
The work in long form
The receipts above point at deeper case studies. Each opens with an Anthropic-specific reading lens.
Founder-PM with 12+ years across Procore, Amazon, Instill, and Oracle. Background: I-O Psychology. Through-line: declared semantics beat inferred semantics, and trust is a system property you architect, not a tag you apply. Currently building AJAO Studio and ORÍ Central.