This is a new breed of role, sitting at the intersection of product management and agentic engineering. You will define, build, and own the harness layer that makes AI agents actually work in our suite of advertising products. The harness is everything between the model and the user: the infrastructure, connectors, evaluation systems, and feedback loops that determine whether our agents are reliable, useful, and safe.
You are not a traditional PM who hands specs to engineers, nor a pure engineer who ships features without a user lens. You are both, with a deep enough technical foundation to get your hands dirty and a sharp enough product instinct to know what matters. This is a greenfield function: there is no existing playbook and no inherited team structure. The right person sees that as the opportunity, not the risk.
Harness Design & Ownership
Design and own the agent harness framework — the full layer between model and user, including connectors, execution environments, orchestration scaffolding, and evaluation pipelines. This is a greenfield build; you define the architecture and set the standards.
Systems & Connector Integration
Build and maintain integrations with internal systems (e.g., Slack, HubSpot, Google Drive, codebases), managing access permissions and ensuring integrations do not decay over time. Own the reliability of the agent's ability to act in our environment.
Evaluation & Performance Benchmarking
Develop evaluation pipelines and benchmarking systems to measure agent performance, reliability, and safety in both production and simulated environments. Define what "the agent is genuinely helping more people in more scenarios" means — and own the metrics that prove it.
Feedback Loops & Signal Extraction
Act as the team's primary source of signal: use real-world internal tasks as training feedback, and engage the user community to extract product signals at scale. Translate raw signal into prioritised harness improvements and roadmap decisions.
Research Collaboration & Roadmap
Collaborate closely with model/ML researchers to co-evolve the harness alongside the model — influencing what gets trained, not just what gets shipped. Own the agent harness product roadmap, connecting researchers, engineers, and users around a coherent vision.
Diagnostics & Engineering Contribution
Diagnose and troubleshoot agent failures, hallucinations, orchestration errors, and integration issues — with enough technical depth to root-cause, not just escalate. Contribute to architecture discussions, code reviews, and engineering ceremonies.