AI Harness Engineering PM — A65 Careers

About the job

This is a new breed of role — sitting at the intersection of product management and agentic engineering. You will define, build, and own the harness layer that makes AI agents actually work in our suite of advertising products. The harness is everything between the model and the user: the infrastructure, connectors, evaluation systems, and feedback loops that determine whether our agents are reliable, useful, and safe. You are not a traditional PM who hands specs to engineers, nor a pure engineer who ships features without a user lens. You are both — with a deep enough technical foundation to get your hands dirty, and a sharp enough product instinct to know what matters. This is a greenfield function. There is no existing playbook, no inherited team structure. The right person sees that as the opportunity, not the risk.

Key responsibilities

Harness Design & Ownership

Design and own the agent harness framework — the full layer between model and user, including connectors, execution environments, orchestration scaffolding, and evaluation pipelines. This is a greenfield build; you define the architecture and set the standards.

Systems & Connector Integration

Build and maintain integrations to internal systems (e.g. Slack, HubSpot, Google Drive, codebases), managing access permissions and ensuring integrations do not decay over time. Own the reliability of the agent's ability to act in our environment.

Evaluation & Performance Benchmarking

Develop evaluation pipelines and benchmarking systems to measure agent performance, reliability, and safety in both production and simulated environments. Define what "the agent is genuinely helping more people in more scenarios" means — and own the metrics that prove it.

Feedback Loops & Signal Extraction

Act as the primary signal source: use internal real-world tasks as training feedback, and engage the user community to extract product signals at scale. Translate raw signal into prioritised harness improvements and roadmap decisions.

Research Collaboration & Roadmap

Collaborate closely with model/ML researchers to co-evolve the harness alongside the model — influencing what gets trained, not just what gets shipped. Own the agent harness product roadmap, connecting researchers, engineers, and users around a coherent vision.

Diagnostics & Engineering Contribution

Diagnose and troubleshoot agent failures, hallucinations, orchestration errors, and integration issues — with enough technical depth to root-cause, not just escalate. Contribute to architecture discussions, code reviews, and engineering ceremonies.

What we're looking for

4+ years of software engineering experience, with full-stack or platform engineering preferred. A background focused on ML/LLM model training is not the target profile for this role.
Hands-on experience building testing frameworks, evaluation systems, or developer tooling. Strong grasp of CI/CD, source control, and Agile development practices.
Familiarity with LLM-based systems, AI agents, or autonomous workflows — and genuine working knowledge of: LLM API, KV Cache, Agent Loop, Tool Use, Reasoning, Planning, MCP, Memory, Subagent, Multi-Agent, Prompt Engineering, and Context Engineering.
Product instinct developed through 2+ years of product thinking (formal PM title not required). Demonstrated ability to define metrics, design systematic data collection, and translate user signals into prioritised decisions.
Deep power user of agent products — Claude Code, Cursor, Manus, Codex, GitHub Copilot, Cowork, or similar. Strong intuition for model behaviour, developer experience, and end-user experience across all forms of agent product.
Comfort operating as a sole contributor in a greenfield function with no existing team or playbook to inherit. Pragmatic builder who ships rather than theorises, with a strong sense of ownership.
Genuine enthusiasm for agentic AI as the future of how products are built; ambivalence about AI-assisted development is a disqualifier. Personal tinkering or side projects with AI tooling are a strong signal.
UI/UX design sensibility — able to produce product prototypes and design briefs with AI assistance. Strong English communication for cross-functional and external collaboration.

Nice to haves

Experience with AI agent frameworks (LangChain, LangGraph, CrewAI, AutoGen, or similar).
Experience with LLM evaluation frameworks and prompt/context testing.
Knowledge of distributed systems, Kubernetes, cloud infrastructure, or observability tooling for AI systems.
Experience building simulation environments or synthetic test datasets.
Open-source community involvement, or experience leading a product roadmap as the sole PM in a small team.
Deep collaboration experience with ML researchers.

What we offer

Cutting-edge technology: Work at the frontier of agentic AI, building systems that don't yet have a playbook. You won't be making incremental improvements to existing products; you'll be defining a new category.
Career growth potential: With A65's expansion plans and eventual IPO, you will grow alongside a company building the future of marketing intelligence, with a seat at the table from day one.