03 Healthcare Heroku · Salesforce · Integration

No API? No problem. Automating patient data sync with a browser robot on Heroku.

A pediatric therapy clinic had two systems that couldn't talk to each other: Salesforce for patient intake and an EMR for clinical records. Every new patient meant staff manually entering the same information twice: names, dates of birth, insurance, parent contacts, case notes. The EMR vendor provided no public API. CloudAlgo built a Heroku-hosted integration that closes the gap: a queue-backed Node.js worker that uses Puppeteer to drive a headless Chrome session through the EMR portal, populating every field exactly as a human would, but continuously, automatically, and without transcription errors.

0 manual data entry required — every patient sync fully automated

3-month engagement

The challenge

What needed solving.

Healthcare operations teams rarely choose which software they use — the EMR is mandated, the CRM is the enterprise standard, and the integration gap between them becomes a staffing problem. For this clinic, every patient intake created a dual-entry burden: clinical and administrative data recorded in Salesforce had to be manually transcribed into the EMR portal, field by field, form by form.

Every patient intake required staff to open two systems and enter the same information twice — patient details, parent/guardian contacts, insurance coverage, and clinical case notes.
Transcription errors in patient records carry real risk in a clinical setting. A wrong date of birth, an incorrect insurance ID, or a missing parent phone number creates downstream problems in billing, scheduling, and care coordination.
Staff time spent on dual data entry could not be spent on patient care, scheduling, or clinical support — the work the clinic actually hired for.
The EMR vendor provided no public API, no webhook support, and no data export endpoint. The only integration surface was the web portal itself.

Why standard tools failed

Why off-the-shelf wasn't enough.

The absence of an API is not a configuration problem — it's an architectural one. No integration platform, no middleware connector, and no low-code tool can connect to a system that doesn't expose an integration interface. Zapier, MuleSoft, and Heroku Connect all require at minimum a REST API or a database connection. When the only interface is a JavaScript-heavy web portal, the only viable bridge is a programmatic browser that can interact with it as a human would.

Tool	Category	What it does well	Why it wasn't enough
Zapier	No-code automation	Simple trigger-action workflows between systems with native connectors	Requires an API or a native app connector. No EMR connector exists. Cannot drive a web portal.
MuleSoft	iPaaS	Enterprise-grade API connectivity with deep integration logic	Only as capable as the APIs available. No API means MuleSoft has nothing to connect to on the EMR side.
Heroku Connect	Database sync	Bidirectional Salesforce ↔ Postgres sync with no custom code	Syncs to a Postgres database, not to a web portal. The EMR has no database access layer.
Manual entry	Status quo	Always works — no technical risk or upfront investment	Staff time per patient, transcription errors, and no scaling path as patient volume grows.

The solution

What we built.

CloudAlgo built a two-process Heroku application: a web process that manages Salesforce OAuth credentials and a worker process that listens to a RabbitMQ queue. When a patient record is created or updated in Salesforce, an Apex trigger publishes a job to the queue. The worker picks it up, launches a Puppeteer-controlled headless Chrome session, operates the EMR portal to create or update the patient record, and writes the resulting EMR patient ID back to Salesforce via an Apex REST callback.

Salesforce Triggers the Queue

When a patient record is created or updated in Salesforce, an Apex trigger publishes a job payload to a RabbitMQ queue hosted on CloudAMQP. The queue decouples Salesforce event timing from the sync operation — Salesforce doesn't wait for the browser to finish, job bursts don't stack up Puppeteer sessions, and failed jobs are nack'd without data loss.
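The acknowledge-or-requeue behaviour described above can be sketched as a small message handler. This is a minimal sketch under stated assumptions, not the production worker: the `PatientJob` fields, the `handleMessage` and `parseJob` names, and the choice to drop malformed payloads are all hypothetical. The `AckChannel` interface only mirrors the `ack` and `nack` calls of an amqplib-style channel so the example stays self-contained.

```typescript
// Hypothetical shape of the job the Apex trigger publishes.
interface PatientJob {
  salesforceId: string;
  action: "create" | "update";
}

// Minimal slice of a queue message and channel, modelled on amqplib's
// manual-acknowledgement calls (assumption: the worker does not auto-ack).
interface QueueMessage {
  content: { toString(): string };
}
interface AckChannel {
  ack(msg: QueueMessage): void;
  nack(msg: QueueMessage, allUpTo?: boolean, requeue?: boolean): void;
}

// Parse and validate the payload; returns null for anything unusable.
function parseJob(raw: string): PatientJob | null {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return null;
    const record = data as Record<string, unknown>;
    const salesforceId = record["salesforceId"];
    const action = record["action"];
    if (typeof salesforceId !== "string" || salesforceId === "") return null;
    if (action === "create" || action === "update") {
      return { salesforceId, action };
    }
    return null;
  } catch {
    return null;
  }
}

async function handleMessage(
  channel: AckChannel,
  msg: QueueMessage,
  syncPatient: (job: PatientJob) => Promise<void>,
): Promise<"acked" | "requeued" | "rejected"> {
  const job = parseJob(msg.content.toString());
  if (job === null) {
    // A malformed payload can never succeed, so it is nack'd without requeue
    // (discarded, or dead-lettered if the queue is configured for that).
    channel.nack(msg, false, false);
    return "rejected";
  }
  try {
    await syncPatient(job);
    channel.ack(msg);
    return "acked";
  } catch {
    // The browser run failed: nack with requeue so the job is not lost.
    channel.nack(msg, false, true);
    return "requeued";
  }
}
```

Requeueing unconditionally can loop forever on a job that always fails, so a real worker would likely cap retries or route repeated failures to a dead-letter queue.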

OAuth Credentials Cached in Redis

A web process handles the Salesforce OAuth 2.0 flow and stores access tokens, refresh tokens, and instance URLs in Heroku Redis. The jsforce library automatically refreshes expired access tokens, and the cache is updated with the new token, so the integration stays connected without manual re-authentication for as long as the refresh token remains valid.
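The caching pattern can be illustrated with a small token store. This is a sketch under assumptions rather than the clinic's code: the `sf:tokens:` key prefix and the `saveTokens`, `loadTokens`, and `applyRefresh` helpers are hypothetical, and an in-memory map stands in for Heroku Redis. The `KeyValueStore` interface covers only the string `get` and `set` calls that a typical Redis client offers.

```typescript
interface SalesforceTokens {
  accessToken: string;
  refreshToken: string;
  instanceUrl: string;
}

// Minimal key-value surface. A Redis client's string get/set should fit
// this shape; the in-memory version below exists only for the example.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<unknown>;
}

function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    get: async (key) => data.get(key) ?? null,
    set: async (key, value) => {
      data.set(key, value);
    },
  };
}

// Hypothetical key layout: one JSON blob per Salesforce org.
const tokenKey = (orgId: string): string => `sf:tokens:${orgId}`;

async function saveTokens(
  store: KeyValueStore,
  orgId: string,
  tokens: SalesforceTokens,
): Promise<void> {
  await store.set(tokenKey(orgId), JSON.stringify(tokens));
}

async function loadTokens(
  store: KeyValueStore,
  orgId: string,
): Promise<SalesforceTokens | null> {
  const raw = await store.get(tokenKey(orgId));
  return raw === null ? null : (JSON.parse(raw) as SalesforceTokens);
}

// Called when a new access token is issued: only the access token changes,
// the refresh token and instance URL are carried over from the cache.
async function applyRefresh(
  store: KeyValueStore,
  orgId: string,
  newAccessToken: string,
): Promise<boolean> {
  const existing = await loadTokens(store, orgId);
  if (existing === null) return false; // nothing cached for this org
  await saveTokens(store, orgId, { ...existing, accessToken: newAccessToken });
  return true;
}
```

With jsforce, a connection emits a `refresh` event carrying the new access token, so a handler registered with `conn.on("refresh", ...)` is one plausible place to call `applyRefresh`.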

Puppeteer Drives the EMR Portal

For each job, the worker launches a headless Chrome browser via the Google Chrome buildpack on Heroku. Puppeteer navigates to the EMR portal, logs in, and programmatically interacts with the interface: clicking buttons, selecting dropdown options, filling form fields, and waiting for network idle before each step. The worker detects and resolves idle lock screen re-authentication automatically, and checks field values before writing — only updating fields that have actually changed.
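The "only update what changed" check can be pictured as a diff between what the portal currently shows and what Salesforce wants. A minimal sketch, assuming the current field values have already been read out of the page as strings; `planFieldWrites`, `FieldValues`, and the whitespace-trimming comparison are illustrative choices, not the production logic.

```typescript
// Field name -> value, both for what the portal shows and what Salesforce holds.
type FieldValues = Record<string, string>;

interface FieldWrite {
  field: string;
  from: string; // value currently in the portal ("" if the field is empty or absent)
  to: string;   // value that should be typed in
}

// Assumption: leading and trailing whitespace is not a meaningful difference.
function normalise(value: string): string {
  return value.trim();
}

// Returns only the fields whose portal value differs from the desired value,
// in the order the desired values were given.
function planFieldWrites(current: FieldValues, desired: FieldValues): FieldWrite[] {
  const writes: FieldWrite[] = [];
  for (const [field, to] of Object.entries(desired)) {
    const from = current[field] ?? "";
    if (normalise(from) !== normalise(to)) {
      writes.push({ field, from, to });
    }
  }
  return writes;
}
```

The worker would then type into only the fields in the returned list, so a re-sync of an unchanged record results in no writes at all.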

EMR Patient ID Written Back to Salesforce

After creating or updating the patient record, the worker extracts the EMR patient ID from the portal and calls a Salesforce Apex REST endpoint to write it back. The Salesforce record is now linked to the EMR record by ID, enabling future updates to target the correct patient without re-searching. On any failure, the worker sends an error callback so the Salesforce record reflects the failure rather than remaining silently stale.
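The success-or-failure callback can be sketched as a wrapper that always reports back, whatever the browser run does. This is an illustration under assumptions: the `SyncCallback` payload shape and the `syncAndReport` name are hypothetical, and the actual Apex REST resource and its fields are not described in this case study.

```typescript
// Hypothetical callback body: either the EMR patient ID or an error message.
type SyncCallback =
  | { status: "success"; salesforceId: string; emrPatientId: string }
  | { status: "error"; salesforceId: string; errorMessage: string };

async function syncAndReport(
  salesforceId: string,
  // Runs the browser automation and resolves with the EMR patient ID.
  syncToEmr: () => Promise<string>,
  // Delivers the callback body to Salesforce.
  sendCallback: (body: SyncCallback) => Promise<void>,
): Promise<SyncCallback> {
  let body: SyncCallback;
  try {
    const emrPatientId = await syncToEmr();
    body = { status: "success", salesforceId, emrPatientId };
  } catch (err) {
    // A failed run still produces a callback, so the Salesforce record
    // shows the failure instead of staying silently stale.
    const errorMessage = err instanceof Error ? err.message : String(err);
    body = { status: "error", salesforceId, errorMessage };
  }
  await sendCallback(body);
  return body;
}
```

In a jsforce-based worker, `sendCallback` could wrap a `conn.apex.post(...)` call against the Apex REST resource; the resource path is left out here because the source does not give it.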

Engineering depth

Technical highlights.

Browser Automation as Integration Layer

Puppeteer driving a full Chromium instance is not the first-choice integration pattern — but when there is no API, it is the only viable one. The architecture treats the browser as a typed, programmatic interface: CSS selectors as contracts, network idle waits as synchronisation points, and field-level read-before-write logic to prevent unnecessary mutations.

Queue-Backed, Decoupled Architecture

A synchronous Salesforce → EMR call would mean Salesforce waits for Puppeteer to complete — a multi-second operation that can time out, stall on a lock screen, or encounter unexpected DOM state. RabbitMQ decouples the trigger from the execution: the queue absorbs bursts, failed jobs are nack'd without data loss, and the worker processes at its own pace without blocking the Salesforce transaction.

Lock Screen and Idempotent Field Writes

The EMR portal shows a password re-entry dialog after idle periods. The worker detects this overlay at every interaction point and resolves it before continuing. Field writes are also idempotent: the current value is read before typing, and the field is only updated if the value differs — preventing spurious writes and reducing the risk of triggering portal-side validation errors on unchanged fields.
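Lock screen handling can be sketched as a guard that runs before every portal interaction. This is a minimal sketch under assumptions: the `LOCK_SELECTORS` values are hypothetical placeholders, not the real portal's markup, and `PortalPage` is a hand-written slice of the Puppeteer page methods used here (`$`, `type`, `click`, `waitForSelector`) so the example runs without a browser.

```typescript
// Subset of Puppeteer's Page surface that this sketch relies on.
interface PortalPage {
  $(selector: string): Promise<unknown>;
  type(selector: string, text: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitForSelector(selector: string, options?: { hidden?: boolean }): Promise<unknown>;
}

// Hypothetical selectors for the password re-entry dialog.
const LOCK_SELECTORS = {
  dialog: "#lock-screen",
  password: "#lock-screen input[type=password]",
  submit: "#lock-screen button[type=submit]",
};

// Returns true if a lock screen was found and dismissed, false if none was showing.
async function unlockIfNeeded(page: PortalPage, password: string): Promise<boolean> {
  const overlay = await page.$(LOCK_SELECTORS.dialog);
  if (overlay === null) return false;
  await page.type(LOCK_SELECTORS.password, password);
  await page.click(LOCK_SELECTORS.submit);
  // Wait until the overlay is gone before touching the form underneath.
  await page.waitForSelector(LOCK_SELECTORS.dialog, { hidden: true });
  return true;
}

// Wraps a single interaction so the lock screen is checked first.
async function withLockScreenGuard<T>(
  page: PortalPage,
  password: string,
  action: () => Promise<T>,
): Promise<T> {
  await unlockIfNeeded(page, password);
  return action();
}
```

Wrapping each click or field write in `withLockScreenGuard` keeps the check at every interaction point without scattering lock screen logic through the form-filling code.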

Heroku Add-Ons Eliminate Infrastructure

The Google Chrome buildpack makes Chrome available on the dyno without Docker. Heroku Redis provides token caching, CloudAMQP provides the managed RabbitMQ queue, and Papertrail aggregates logs. There is no container orchestration and no infrastructure for the team to provision or patch. The stack deploys with a git push.
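In practice, add-ons and buildpacks surface as environment variables on the dyno, so wiring them together is mostly configuration. A minimal sketch, assuming the conventional variable names `REDIS_URL` (Heroku Redis), `CLOUDAMQP_URL` (CloudAMQP), and `GOOGLE_CHROME_BIN` (the Chrome buildpack); the names on a given app may differ, and `loadConfig` and `WorkerConfig` are hypothetical.

```typescript
interface WorkerConfig {
  redisUrl: string;
  amqpUrl: string;
  // Path to the Chrome binary, if the buildpack exposes one.
  chromePath: string | undefined;
}

// Takes the environment as a plain object so it can be checked without a dyno.
function loadConfig(env: Record<string, string | undefined>): WorkerConfig {
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value === "") {
      // Fail at boot rather than on the first job.
      throw new Error(`Missing config var ${name}`);
    }
    return value;
  };
  return {
    redisUrl: required("REDIS_URL"),         // assumed name set by Heroku Redis
    amqpUrl: required("CLOUDAMQP_URL"),      // assumed name set by CloudAMQP
    chromePath: env["GOOGLE_CHROME_BIN"],    // assumed name set by the Chrome buildpack
  };
}
```

On a dyno this would be called once at startup with `process.env`, and `chromePath` would typically be handed to Puppeteer's `executablePath` launch option.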

The results

Measurable outcomes.

0

Manual data entry: every patient sync is fully automated

24/7

Continuous coverage — the worker runs around the clock without supervision

Transcription errors — Salesforce is the source of truth, written once

The CloudAlgo difference

What this shows about how we work.

We find the integration path, even when there isn't one.

No API doesn't mean no solution. It means the solution requires a different kind of engineering. Recognising that a headless browser is a legitimate, architecturally sound integration layer — not a workaround — is what made this problem solvable.

Architecture decisions prevent operational debt.

A synchronous direct-call integration would have worked initially and broken under any load or timeout. Queue decoupling, Redis token caching, and error callbacks to Salesforce are not over-engineering — they're the difference between something that works at 3am on a Monday and something that only works when someone is watching.

Healthcare constraints require defensive engineering.

Patient data accuracy is not a UX concern — it's a clinical one. Idempotent writes, lock screen detection, and failure callbacks were built because silent errors in a medical context are not acceptable. The integration behaves defensively by design.

Platform choice is an operational cost decision.

Heroku's add-on ecosystem eliminated a significant infrastructure footprint: no managed Redis to provision, no AMQP cluster to configure, no Chrome runtime to containerise. That operational simplicity is a cost advantage that compounds over the lifetime of the integration.

Technology stack

Runtime: Node.js · TypeScript

Web Framework: Express.js

Browser Automation: Puppeteer (headless Chromium)

Salesforce API: jsforce (OAuth 2.0, Apex REST callbacks)

Job Queue: RabbitMQ via CloudAMQP

Token Cache: Heroku Redis (mini)

Process Management: Throng (clustered worker processes)

Hosting: Heroku (web + worker dyno formation)

Chrome Runtime: Google Chrome buildpack

Logging: Papertrail
