Ask your AI agent to check Salesforce for the latest deal status. Simple request. The agent can write code, reason about architecture, plan multi-step workflows. But it opens Salesforce and stares at a login page. No session. No cookies. No SSO token. It doesn't know your password, and even if it did, there's a 2FA prompt waiting. The most capable model in the world can't get past a login form it's never seen before.
That's an infrastructure problem, not a model problem. And it turns out to be one of the biggest gaps in AI agent tooling right now.
The headless assumption
Playwright, Puppeteer, and Selenium were all built with automated testing in mind. You spin up a fresh, headless browser instance, navigate to a URL, interact with elements, assert that things look correct. By default, the browser starts empty every time. No cookies, no saved sessions, no history. For testing, that's a feature. You want deterministic, isolated runs.
For agent workflows, it's a dead end.
The web that matters to a working professional — their CRM, their project management tool, their analytics dashboard, their internal admin panels — sits entirely behind authentication. And not simple authentication. Enterprise SSO, SAML redirects, OAuth flows, hardware security keys, time-based one-time passwords. These systems were specifically designed to prevent automated access by unknown clients. They're doing exactly what they're supposed to do.
Giving an agent a headless browser is like giving a new employee a laptop with no accounts configured and asking them to get work done.
A different bet
When I started building Pagerunner, the first design decision was to invert the headless assumption entirely. Instead of launching a blank browser, Pagerunner connects to your real Chrome installation. Your actual profiles, your saved sessions, your cookies, your logged-in state. The agent inherits the access you already have, the same way a colleague would if you handed them your laptop and said "check this for me."
Under the hood, it's Chrome DevTools Protocol (CDP) — the same debugging interface Chrome's own dev tools use. Pagerunner launches Chrome with remote debugging enabled, connects over CDP, and exposes browser control as an MCP server. Navigation, clicks, form fills, screenshots, JavaScript evaluation. All within a session that's already authenticated to your services.
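To make the CDP layer a little more concrete, here is a minimal sketch of what talking to Chrome over the protocol looks like. It is not Pagerunner's code. It assumes Chrome was started with --remote-debugging-port=9222, and the names CommandBuilder, page_targets, and fetch_targets are hypothetical helpers made up for illustration. The /json/list endpoint and the Page.navigate and Runtime.evaluate methods are part of CDP itself.

```python
import itertools
import json
import urllib.request

# Assumption: Chrome was started with --remote-debugging-port=9222.
DEBUG_ENDPOINT = "http://127.0.0.1:9222"


class CommandBuilder:
    """Builds CDP commands; each needs a unique id so replies can be matched."""

    def __init__(self):
        self._ids = itertools.count(1)

    def build(self, method, **params):
        # A CDP command is a plain JSON object: id, method, params.
        return json.dumps(
            {"id": next(self._ids), "method": method, "params": params}
        )


def page_targets(targets):
    """Keep only real tabs from the target list Chrome reports."""
    return [t for t in targets if t.get("type") == "page"]


def fetch_targets(endpoint=DEBUG_ENDPOINT):
    """Read the open targets from Chrome's HTTP debugging endpoint."""
    with urllib.request.urlopen(f"{endpoint}/json/list", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    builder = CommandBuilder()
    # These strings are what would be sent over a tab's WebSocket.
    print(builder.build("Page.navigate", url="https://example.com"))
    print(builder.build("Runtime.evaluate", expression="document.title",
                        returnByValue=True))
```

Each target returned by fetch_targets carries a webSocketDebuggerUrl field. Sending the built commands over that WebSocket, with whatever WebSocket client you prefer, is what drives the tab. An MCP server sits one level above this and turns tool calls into commands like these.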
I think of this as a philosophical choice as much as a technical one. The browser is already the universal interface to human software. Your SaaS tools, your internal dashboards, your admin panels — they all have a browser interface. Why build API integrations for each one when the agent can use the same interface you do?
The privacy question
If the agent has access to your real browser, it has access to everything in it. Your email. Your bank. Your messages. That's a real concern, and I won't gloss over it.
In Pagerunner, when you open a session, you can enable PII anonymisation. The system scans page content for emails, phone numbers, credit card numbers, IBANs, and other sensitive patterns, then redacts them before the agent sees the page. An optional ONNX-based named entity recognition model catches person and organisation names too. The agent gets the page structure and the relevant information, but personal data is masked.
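The pattern-based part of that redaction can be sketched in a few lines. This is an illustration of the general technique, not Pagerunner's implementation: the regular expressions are deliberately simple, the placeholder labels such as [EMAIL] are made up, and the NER step for names is left out entirely. Card numbers are only masked when they pass a Luhn checksum, which cuts down on false positives from other long digit runs.

```python
import re

# Illustrative patterns only; a real scanner would need broader coverage.
EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
IBAN = re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,4})?\b")
CARD = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
PHONE = re.compile(r"(?<!\w)\+?\d[\d ().-]{7,}\d(?!\w)")


def luhn_ok(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def _mask_card(match):
    # Strip separators, then mask only if the checksum is valid.
    digits = re.sub(r"\D", "", match.group())
    return "[CARD]" if luhn_ok(digits) else match.group()


def redact(text):
    """Replace sensitive patterns with placeholders before the agent sees them."""
    # Order matters: IBANs and cards go first so the looser phone
    # pattern does not swallow their digits.
    text = EMAIL.sub("[EMAIL]", text)
    text = IBAN.sub("[IBAN]", text)
    text = CARD.sub(_mask_card, text)
    text = PHONE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    page = ("Contact jane.doe@example.com or +44 20 7946 0958. "
            "Card 4111 1111 1111 1111, IBAN GB82 WEST 1234 5698 7654 32.")
    print(redact(page))
```

The agent still sees that the page contains a contact, a card, and an account number, which is usually enough to navigate and reason about it. It just never sees the values.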
Session snapshots get similar treatment. Pagerunner can save the state of a browser session (cookies, local storage, everything needed to resume later) and encrypts it with AES-256-GCM. The encryption keys live in the macOS Keychain, never on disk as plaintext. Snapshot a session on Monday, resume it on Wednesday, and the sensitive state stays protected at rest the whole time.
Does this solve every privacy concern? No. Any tool that gives an AI agent access to your authenticated browser sessions requires trust in the tool, in the model, and in the workflow you've designed. But agents that can't access any of your real tools aren't actually safer. They just push the work back onto you.
What you can actually do with this
Once agents can access real browser sessions, the work they handle looks different. A few workflows we've built:
One monitors a client's analytics dashboard and sends a summary every morning. No API integration needed. It opens the dashboard, reads the numbers, takes a screenshot for context, composes the summary. When the dashboard vendor changes their UI, the agent adapts. When they change their API, an API integration breaks. The browser just keeps working.
Another fills out compliance forms in a vendor portal. The portal has no API. It's a web form behind SSO, with dropdown menus and conditional fields. A human spends forty minutes on it. The agent does it in two.
A third cross-references information across a ticketing system, a knowledge base, and a CRM to prepare a customer call brief. Each tool requires separate authentication. The agent switches between tabs the way you would.
No APIs required for any of these. No custom integrations. The browser is already the integration layer — every tool speaks HTML, CSS, and JavaScript. Agents that can use a real browser inherit decades of web infrastructure for free. That's genuinely exciting to me, because it means the problem isn't "how do we connect to everything" — it's "how do we give the agent a seat at the table."
Where this sits
Pagerunner isn't the only project in this space. Browser-use takes a vision-model approach, interpreting screenshots to decide what to click. Stagehand augments Playwright with natural-language primitives. Others record and replay human actions. The approach I took — connecting to real profiles over CDP — optimises for practical access over abstraction. It works well when authentication is the primary barrier, less well when the agent needs to bounce across dozens of unfamiliar sites.
Honestly, all of this is early. Every approach has trade-offs. None are mature enough to run unattended forever. But the direction is clear: agents that can't interact with the web as it actually exists — behind login walls, inside enterprise portals, across real sessions — can't do most of the work that matters.
The tools will improve. That requirement won't change.