
How to Integrate Multiple ATS Platforms (Greenhouse, Lever, Workable)

Learn how to integrate Greenhouse, Lever, and Workable APIs with concrete field mapping tables, cursor translation patterns, and LLM-ready response normalization via a unified ATS API.

Roopendra Talekar · 29 min read

Your product doesn't integrate with Greenhouse. The $200K ARR deal that was "basically closed" just died in legal review because the prospect's talent acquisition team won't adopt a tool that doesn't plug into their ATS. Your engineering lead says it's a two-week project. They're wrong — and you've probably heard this before if you've dealt with HRIS integrations.

Building the initial HTTP request to fetch a candidate is easy. Managing OAuth token lifecycles, translating inconsistent data models, handling brutal rate limits, and maintaining webhooks when a vendor pushes a breaking change is where the real time goes.

This guide breaks down what it actually takes to integrate with Greenhouse, Lever, and Workable simultaneously — the schema mismatches, auth headaches, pagination traps, and webhook edge cases — and how to ship these integrations without burning your team's entire quarter.

The Enterprise Deal Blocker: Why You Need Greenhouse, Lever, and Workable Integrations

The ATS market is not slowing down. According to MarketsandMarkets, the applicant tracking system market is projected to grow from $3.28 billion in 2025 to $4.88 billion by 2030, at a CAGR of 8.2%. That growth means more companies buying more ATS platforms, and more prospects expecting your product to work with whichever one they've chosen.

Your prospects aren't all on the same ATS. Greenhouse dominates structured hiring at scale — companies like DoorDash and Wayfair use it. Lever combines ATS and CRM into LeverTRM, popular with mid-market sourcing-heavy teams. Workable appeals to growing companies that want something faster to deploy. Your sales pipeline probably has prospects on all three.

HR tech stacks are notoriously disjointed. According to HR.com's 2025 State of HR Technology report, the majority of organizations use multiple paid HR solutions, with roughly 62% running two to four paid systems and only about 15% relying on a single platform. The same report found that only 10% of organizations say their HR solutions integrate extremely well, while about 21% describe integration as poor or very poor.

If your product is one more system in that stack, the customer expects you to be the part that doesn't make the mess worse. Recruiting teams don't buy your product in isolation. They buy it as a node in a bigger workflow that already includes an ATS, an HRIS, interview scheduling, assessments, and reporting. If your tool can't ingest jobs, attach candidates to applications, react to stage movement, or push outcomes back where they're needed, you're asking them to create a parallel system by hand.

The business case is binary: either you support the ATS platforms your prospects use, or you lose deals. The question is how you get there without destroying your engineering velocity.

The Hidden Costs of Building ATS API Integrations In-House

The two-week estimate your engineer gave you covers the happy path for one integration. That's roughly 10% of the actual work.

Initial build cost per integration: Development costs for a well-scoped SaaS API integration range from roughly $5,000 to $15,000 for standard workflows, but complex integrations with bi-directional sync, custom field mapping, and webhook handling regularly push into the $20,000–$50,000+ range. Multiply that by three ATS platforms and you're looking at a significant chunk of your engineering budget before you've written a line of core product code.

The maintenance multiplier: The number that kills you isn't the build cost — it's the ongoing maintenance. Annual upkeep typically runs 10–20% of the initial development investment. API versioning changes, deprecated endpoints, authentication flow updates, and undocumented behavioral changes mean someone on your team is permanently assigned to babysitting these integrations. Greenhouse recently shipped a v3 Harvest API with a completely different pagination model (cursor-based instead of their older page-number approach). If you'd built against v1, that's a rewrite.

Opportunity cost is the real killer: Every sprint spent wrestling with ATS API pagination quirks is a sprint not spent on the features that differentiate your product. If you're building an interview intelligence platform, your competitive advantage is in AI analysis of interview transcripts — not in figuring out why Greenhouse returns a 429 at 50 requests per 10 seconds while Workable uses a completely different rate limit header scheme.

If someone tells you ATS integration is just a few API calls, they're pricing the demo, not the system. The system is everything around the call:

Workstream | What gets estimated | What production actually needs
--- | --- | ---
Jobs sync | List jobs | Department and office mapping, archived states, pagination, caching, visibility rules
Candidate sync | List candidates | Candidate vs. application separation, attachments, custom fields, re-hydration, dedupe
Realtime updates | Add webhooks | Signature verification, retries, replay protection, queueing, dead-letter handling
Write actions | Move stage | Permissions, actor identity, auditability, provider-specific required IDs
Support | Read the docs | Surviving when the docs are wrong, stale, or too vague to matter

There's also a nasty organizational cost. Once you build one ATS connector, sales immediately asks for a second. The moment you build a second, product assumes a third is cheap. By the time you have three, you don't have three integrations — you have the start of an integration platform whether you meant to build one or not. As we've seen in countless integration horror stories, the build vs. buy calculation almost always favors buy for non-core integrations.

Schema Normalization: The Hardest Part of ATS Integration

Schema normalization means taking vendor-specific objects and translating them into one model your product can trust — without erasing the relationships that make the data meaningful. In the ATS domain, this is notoriously difficult because every vendor has a different philosophy on how recruiting data should be structured. If you want the full argument, Why Schema Normalization is the Hardest Problem in SaaS Integrations goes deeper.

The Candidate vs. Application Data Model Split

The most fundamental divergence is how these platforms model the relationship between a person and a job application.

Concept | Greenhouse | Lever | Workable
--- | --- | --- | ---
Person record | Candidate (separate from applications) | Opportunity (person + job context merged) | Candidate (nested under job)
Job application | Application (links Candidate to Job) | Part of the Opportunity object | Candidate record scoped to a job
Pipeline stages | JobInterviewStages per Job | Stages on the Opportunity | Stages per job pipeline
Moving between stages | PATCH on Application with stage_id | Stage change on Opportunity | Move action on Candidate
Rejection | Reject Application with RejectReason | Archive Opportunity with reason | Disqualify Candidate

This isn't a cosmetic difference. Greenhouse treats the recruitment process as highly structured and requisition-driven: a Candidate represents the person, an Application is the transaction linking that person to a specific Job, and a single candidate can have multiple applications across different roles over time. When you query the Greenhouse API for a candidate, you receive an array of application IDs — if you want to know what stage of the interview process the candidate is in, you must make a secondary request to fetch the specific application data.
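In code, that secondary fetch looks something like the sketch below. `ghFetch` is a hypothetical authenticated client, the `/v1/candidates/:id` and `/v1/applications/:id` paths follow the Harvest convention, and `application_ids` matches the array of application IDs described above; treat the exact shapes as assumptions to verify against the Harvest docs.

```typescript
// Two-step Greenhouse fetch: candidate first, then one request per
// application to learn the current interview stage.
type GreenhouseCandidate = { id: number; application_ids: number[] }
type GreenhouseApplication = {
  id: number
  current_stage: { id: number; name: string } | null
}

async function getCandidateStages(
  ghFetch: (path: string) => Promise<any>, // hypothetical authenticated client
  candidateId: number
): Promise<string[]> {
  const candidate: GreenhouseCandidate = await ghFetch(`/v1/candidates/${candidateId}`)
  // The candidate payload only carries application IDs, so each stage
  // lookup costs an extra round trip
  const apps = await Promise.all(
    candidate.application_ids.map(
      (id): Promise<GreenhouseApplication> => ghFetch(`/v1/applications/${id}`)
    )
  )
  return apps.map((a) => a.current_stage?.name ?? 'unknown')
}
```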

Lever was built with a CRM-first philosophy (bringing many of the same challenges we've discussed regarding native CRM integrations). Its Opportunity object conflates what Greenhouse treats as two distinct entities (Candidate and Application). Lever's API heavily centers on this Opportunity — if you want to move a candidate through a pipeline or add a note, you're almost always interacting with the Opportunity, not the core Contact record.

Workable takes a flatter approach where candidates are often tied directly to jobs in specific endpoints. Candidate creation is job-scoped on /jobs/:shortcode/candidates, and fetching candidates frequently requires filtering by job shortcode or specific pipeline stages. Workable also notes that the IDs exposed in the API are not the same as the ones users see in the application UI — a detail that trips up support debugging.

If your product's data model assumes a Greenhouse-style separation, bolting on Lever support means rethinking your relational assumptions from the ground up.

Response Shape Differences

Here's a concrete example of how the same concept — a candidate's name and email — looks across all three APIs:

// Greenhouse Harvest API response
{
  "id": 12345,
  "first_name": "Jane",
  "last_name": "Doe",
  "emails": [
    { "value": "jane@example.com", "type": "personal" }
  ]
}
 
// Lever API response
{
  "id": "abc-def-123",
  "name": "Jane Doe",
  "emails": ["jane@example.com"],
  "phones": [{"value": "+15551234567"}]
}
 
// Workable API response
{
  "id": "ABCD1234",
  "firstname": "Jane",
  "lastname": "Doe",
  "email": "jane@example.com"
}

Three vendors, three ID formats (integer, UUID, alphanumeric string), three approaches to names (split fields, single field, differently-cased split fields), and three approaches to email (array of objects, flat array, single string). Every one of these differences is a mapping you need to build, test, and maintain.
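Those mappings are mechanical enough to capture in small per-vendor adapter functions. A sketch, assuming the sample payloads above (the canonical shape and function names are our own; the Greenhouse branch reads `email_addresses`, the Harvest field name, with a fallback to a flat `emails` key):

```typescript
// One normalizer per vendor, all converging on a single canonical shape.
type CanonicalPerson = { id: string; first_name: string; last_name: string; emails: string[] }

function fromGreenhouse(raw: any): CanonicalPerson {
  return {
    id: String(raw.id), // integer → string
    first_name: raw.first_name,
    last_name: raw.last_name,
    // unwrap {value, type} objects into flat strings
    emails: (raw.email_addresses ?? raw.emails ?? []).map((e: any) => e.value ?? e),
  }
}

function fromLever(raw: any): CanonicalPerson {
  const space = raw.name.indexOf(' ') // naive split; keep raw.name around in practice
  return {
    id: raw.id,
    first_name: space === -1 ? raw.name : raw.name.slice(0, space),
    last_name: space === -1 ? '' : raw.name.slice(space + 1),
    emails: raw.emails ?? [], // already flat strings
  }
}

function fromWorkable(raw: any): CanonicalPerson {
  return {
    id: raw.id,
    first_name: raw.firstname, // no underscore in the source field
    last_name: raw.lastname,
    emails: raw.email ? [raw.email] : [], // single string → array
  }
}
```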

Custom Fields: Where Enterprise Deals Get Complicated

Every enterprise customer has custom fields in their ATS. Greenhouse exposes custom fields through their API, but there's a catch: custom fields on the Application object are gated behind Enterprise-tier accounts. If your customer is on a lower Greenhouse plan, those fields simply won't appear in the API response, and your integration needs to handle that gracefully instead of breaking.

Lever and Workable handle custom fields differently again — different naming conventions, different data types, different nesting structures. Normalizing company_size from Greenhouse's custom field format, Lever's tag-based system, and Workable's custom question format into a single schema is exactly the kind of work that eats sprints alive.

The Normalization Nightmare in Practice

If your application needs to display a simple list of "Active Candidates and their Current Interview Stage," your backend logic looks like this:

  • For Greenhouse: Fetch candidates → Extract application IDs → Fetch applications → Extract stage IDs → Fetch stages → Map to UI.
  • For Lever: Fetch opportunities → Extract stage IDs → Map to UI.
  • For Workable: Fetch candidates filtered by job → Read stage directly from the candidate object → Map to UI.

Writing if (provider === 'greenhouse') logic throughout your codebase creates a fragile, unmaintainable mess. A sane canonical ATS model starts with these entities: Jobs, Candidates, Applications, JobInterviewStages, Interviews, Scorecards, Offers, RejectReasons, Activities, and Attachments.

Do not collapse candidate and application into one table just because one vendor makes it tempting. A person can exist before they apply, apply to multiple jobs, get moved, get rejected, re-enter the funnel, and eventually receive an offer — all as separate events and relationships.

A few implementation rules save a lot of pain later:

  1. Keep canonical fields small and opinionated. First name, last name, emails, phones, current stage, applied date, offer date — top level.
  2. Preserve raw provider payloads. If you throw away vendor-native data too early, custom enterprise requirements come back as emergency tickets.
  3. Treat custom fields as a typed bag, not a landfill. Keep metadata like provider field ID, type, and label alongside the value.
  4. Map stage IDs, not just stage names. Stage names get renamed by customers constantly. IDs are what your write actions need.
  5. Expect hydration passes. Some list endpoints are discovery endpoints, not final truth. You list first, then fetch full detail where necessary.
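The rules above can be captured in a small canonical type. This is our own illustrative shape, not any vendor's schema:

```typescript
// A minimal canonical candidate following the implementation rules.
type CanonicalCandidate = {
  id: string
  first_name: string
  last_name: string
  emails: { email: string; type?: string }[] // rule 1: small, opinionated top level
  phones: { phone: string }[]
  applications: {
    id: string
    stage_id?: string   // rule 4: IDs, not names, drive write actions
    stage_name?: string // names are display-only and get renamed
  }[]
  custom_fields: { key: string; label: string; type: string; value: unknown }[] // rule 3
  remote_data: unknown // rule 2: raw vendor payload, preserved untouched
}

const exampleCandidate: CanonicalCandidate = {
  id: '12345',
  first_name: 'Jane',
  last_name: 'Doe',
  emails: [{ email: 'jane@example.com', type: 'personal' }],
  phones: [],
  applications: [{ id: 'app-1', stage_id: 'stage-7', stage_name: 'Onsite' }],
  custom_fields: [],
  remote_data: { /* original provider response goes here */ },
}
```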

Field-Level Mapping: From Vendor Schemas to a Canonical Model

The response shape examples above show the surface differences. Below are complete field mapping tables for the candidate object across all three platforms - the kind of reference you'd build an actual adapter from. These mappings are also what you need if you're standardizing ATS responses for LLM consumption, where consistent field names and types directly affect model output quality.

Greenhouse → Canonical Candidate

Greenhouse Harvest Field | Canonical Field | Transform
--- | --- | ---
id (integer) | id (string) | $string(id)
first_name | first_name | Direct
last_name | last_name | Direct
email_addresses[].value | emails[].email | Unwrap from {value, type} objects
email_addresses[].type | emails[].type | Direct
phone_numbers[].value | phones[].phone | Unwrap from {value, type} objects
applications[].id | applications[].id | $string() - integer to string
applications[].current_stage.name | applications[].stage_name | Flatten nested object
tags | tags | Direct string array
keyed_custom_fields | custom_fields | Pivot: key is immutable field_key, value includes name, type, and value
attachments[].url | attachments[].url | Direct (URLs expire in 7 days)
attachments[].type | attachments[].category | Map resume / cover_letter / admin_only
created_at (ISO 8601) | created_at | Direct
updated_at (ISO 8601) | updated_at | Direct

Key gotcha: Greenhouse exposes custom fields in two formats. The custom_fields hash uses human-readable names as keys, while keyed_custom_fields uses the immutable field_key that won't change even if an admin renames the field in the UI. Always map from keyed_custom_fields - it includes the field type (e.g. single_select, short_text, number) which your canonical model needs for proper type coercion. Also note that attachment URLs are pre-signed and expire after 7 days, so your sync pipeline should download and store them rather than caching URLs.
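The pivot from keyed_custom_fields into a typed bag is a few lines. The `{name, type, value}` entry shape matches the Greenhouse format described above; the output shape is our own:

```typescript
// Pivot Greenhouse's keyed_custom_fields hash into a typed bag.
type GHKeyedField = { name: string; type: string; value: unknown }

function pivotKeyedCustomFields(
  keyed: Record<string, GHKeyedField>
): { key: string; label: string; type: string; value: unknown }[] {
  return Object.entries(keyed).map(([fieldKey, f]) => ({
    key: fieldKey, // immutable field_key: survives UI renames
    label: f.name, // display name: can change at any time
    type: f.type,  // e.g. single_select, short_text, number
    value: f.value,
  }))
}
```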

Lever → Canonical Candidate

Lever Opportunity Field | Canonical Field | Transform
--- | --- | ---
id (UUID string) | id | Direct
name | first_name | $substringBefore(name, " ")
name | last_name | $substringAfter(name, " ")
contact | person_id | Unique person ID across opportunities
emails[] (flat strings) | emails[].email | Wrap each string: {email: $}
phones[].value | phones[].phone | Rename key
tags[] | tags | Direct
sources[] | sources | Direct
stage.text | applications[].stage_name | Extract from stage object
origin | source_type | Map to enum
createdAt (epoch ms) | created_at (ISO 8601) | $fromMillis(createdAt)
updatedAt (epoch ms) | updated_at (ISO 8601) | $fromMillis(updatedAt)
archived.reason | applications[].reject_reason | Only present when archived

Lever's biggest traps: timestamps are Unix epoch in milliseconds, not seconds. Dividing by 1000 when you should use the raw value (or vice versa) is a silent data corruption bug. The name field is a single string - splitting on the first space works for "Jane Doe" but produces wrong results for "Mary Jane Watson." Store the raw name alongside the split fields. And critically, the contact field is the unique person identifier that persists across multiple opportunities - use it as your canonical person ID, not the opportunity id.
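The three traps above are easy to encode defensively. A sketch (the output field names are our own; the input fields match the mapping table):

```typescript
// Lever normalization: epoch-ms timestamps, naive name split with the
// raw name preserved, and `contact` used as the stable person ID.
function normalizeLever(opp: {
  id: string
  contact: string
  name: string
  createdAt: number // epoch milliseconds, not seconds
}) {
  const space = opp.name.indexOf(' ')
  return {
    person_id: opp.contact,     // stable across multiple opportunities
    opportunity_id: opp.id,
    first_name: space === -1 ? opp.name : opp.name.slice(0, space),
    last_name: space === -1 ? '' : opp.name.slice(space + 1),
    raw_name: opp.name,         // keep the unsplit original
    created_at: new Date(opp.createdAt).toISOString(), // already ms: no ÷1000
  }
}
```

Note how "Mary Jane Watson" splits into "Mary" / "Jane Watson", which is exactly why the raw name is kept alongside the split fields.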

Workable → Canonical Candidate

Workable Field | Canonical Field | Transform
--- | --- | ---
id (alphanumeric) | id | Direct
firstname | first_name | Rename (no underscore in source)
lastname | last_name | Rename
email (single string) | emails[].email | Wrap in array: [{email: $}]
phone (single string) | phones[].phone | Wrap in array: [{phone: $}]
stage | applications[].stage_name | From job-scoped context
skills[] | skills | Direct array
tags[] | tags | Direct array
education_entries[].school | education[].institution | Rename key
experience_entries[].company | experience[].company | Direct
social_profiles[].url | links[].url | Flatten with type
resume_url | attachments[].url | Wrap as {url, category: "resume"}
created_at (ISO 8601) | created_at | Direct

Workable's single-value email and phone fields mean you lose multi-value support. If a candidate has two email addresses, only one survives in the API response. Your canonical model should still use arrays - just expect them to contain a single element for Workable sources. Also, Workable's list endpoint returns a reduced candidate JSON - fields like education_entries, experience_entries, and answers only appear when you fetch the full candidate detail via /candidates/:id. Plan for hydration passes.
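A list-then-hydrate pass for Workable might look like the sketch below. `wkFetch` is a hypothetical authenticated client; the paths match the endpoints mentioned above, and the assumption that the detail response nests under a `candidate` key should be verified against Workable's docs.

```typescript
// Workable hydration: the list payload is reduced, so fetch full
// detail per candidate and merge the detail-only fields in.
async function hydrateWorkableCandidates(
  wkFetch: (path: string) => Promise<any>, // hypothetical authenticated client
  shortcode: string
): Promise<any[]> {
  const list = await wkFetch(`/jobs/${shortcode}/candidates`)
  return Promise.all(
    list.candidates.map(async (c: any) => {
      const detail = await wkFetch(`/candidates/${c.id}`)
      return {
        ...c,
        emails: c.email ? [{ email: c.email }] : [], // single string → array
        education: detail.candidate?.education_entries ?? [],   // detail-only field
        experience: detail.candidate?.experience_entries ?? [], // detail-only field
      }
    })
  )
}
```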

Handling Custom Fields and Multi-Value Properties

Custom fields are where the "simple mapping table" approach breaks down. Each ATS stores them differently:

  • Greenhouse: The keyed_custom_fields hash uses immutable field keys and includes name, type, and value for each field. Types include single_select, multi_select, short_text, long_text, number, date, url, and user. The simpler custom_fields hash uses display names as keys and contains just the values.
  • Lever: Tags are flat strings. Structured custom data lives in form responses on the Opportunity, accessed through the /opportunities/:id/forms endpoint - a separate API call per candidate. Custom requisition fields have their own type and options properties.
  • Workable: Custom fields appear as answers to custom questions, retrieved from /jobs/:shortcode/questions for the schema and from the answers key on the full candidate object for values. Each answer is an {id, label, value} pair.

For your canonical model, treat custom fields as a typed bag:

type CanonicalCustomField = {
  key: string          // normalized key, e.g. "hiring_urgency"
  label: string        // human-readable display label
  value: unknown       // the actual value
  type: 'string' | 'number' | 'date' | 'boolean' | 'select' | 'multi_select' | 'url'
  provider_id?: string // original field ID in the vendor system
}

Multi-value properties like skills, tags, sources, and social profiles need consistent handling:

  • Always normalize to string arrays at the canonical level for tags, skills, and sources.
  • If a vendor provides objects (e.g., Greenhouse skill objects with id and name), extract the display value and preserve the full object in remote_data.
  • For attachments, normalize to {url, filename, category, mime_type} where category is one of resume, cover_letter, portfolio, other.
  • Attachment URLs from ATS platforms are often pre-signed and temporary. Greenhouse URLs expire in 7 days. Your sync pipeline should download and store attachments rather than caching URLs that will break.
  • Social profiles vary wildly: Lever uses links [] as flat URL strings, Workable provides structured social_profiles with type, name, and url. Normalize to {url, type} pairs.

Authentication, Pagination, and Rate Limits Across Recruiting Platforms

Beyond data models, each ATS has its own authentication scheme, pagination strategy, and rate limiting policy. Getting any of these wrong means your integration silently fails in production.

Authentication: Three Platforms, Three Approaches

Greenhouse uses HTTP Basic Auth for its Harvest API. Your API key is the username, the password is blank. You Base64-encode your_api_key: (note the trailing colon) and send it as an Authorization: Basic header. Greenhouse also requires that API keys be granted specific endpoint permissions individually — access is binary per endpoint.
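Constructing that header is a one-liner, but the trailing colon is easy to forget. A minimal sketch:

```typescript
// Greenhouse Harvest Basic Auth: the API key is the username, the
// password is blank, so we Base64-encode "key:" (trailing colon included).
function greenhouseAuthHeader(apiKey: string): string {
  const encoded = Buffer.from(`${apiKey}:`).toString('base64')
  return `Basic ${encoded}`
}
```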

Lever supports OAuth 2.0, which means managing the full authorization code flow: redirect URLs, authorization codes, token exchange, and refresh token rotation. When a token expires, you need the refresh token to get a new pair. If the refresh fails, you need to prompt the customer to re-authenticate. As we've written in OAuth at Scale, token refresh is one of those problems that looks solved until you hit 500 connected accounts.

Workable offers both bearer token authentication and OAuth 2.0 for partners. The OAuth flow requires pre-authorization by Workable to receive client_id and client_secret. Scopes are explicit and additive — r_jobs, r_candidates, w_candidates — meaning you request exactly the permissions you need. Workable also has no CORS support, which matters if anyone on your team was planning to shortcut with browser-side calls.

Pagination: The Silent Complexity Multiplier

Pagination sounds simple until you're debugging why your sync job stopped halfway through a customer's 10,000 candidates.

Greenhouse Harvest API v3 uses cursor-based pagination with opaque cursors returned in the Link header (RFC 5988). The cursor value is Base64-encoded, and Greenhouse explicitly warns you not to parse or construct it yourself. When you pass a cursor, it must be the only query parameter — you can't combine it with filters. Greenhouse v1/v2 uses the older Link header approach with next, prev, and last URLs plus page and per_page query parameters (max 500 per page). If you built against v1 and need to migrate to v3, your pagination logic changes completely.

Lever uses cursor-based pagination with a next token in the response body rather than in headers. A paginated list response includes a next attribute containing a token, which you pass as the offset parameter in your subsequent request.

Workable paginates differently again, using limit and since_id (or offset depending on the endpoint) with pagination metadata inline in the response.

Three pagination strategies mean three sets of iteration logic, three sets of rate-limit-aware retry policies, and three different ways your sync can silently stop returning data.

Rate Limits: No Two Vendors Agree

Platform | Rate Limit | Header Format
--- | --- | ---
Greenhouse | ~50 requests per 10 seconds | X-RateLimit-Limit, X-RateLimit-Remaining
Lever | ~10 req/sec steady state, bursts to 20 | Response headers (varies by endpoint)
Workable | ~10 requests per 10 seconds | X-Rate-Limit-Limit, X-Rate-Limit-Remaining, X-Rate-Limit-Reset

Note the inconsistency: Greenhouse writes RateLimit as one word (X-RateLimit-Limit), while Workable hyphenates it (X-Rate-Limit-Limit). This kind of trivial-but-breaking inconsistency is the daily reality of multi-vendor API integration. If you write a simple while loop to fetch all candidates without backoff logic, you will get throttled almost immediately — and Workable's punishing 10-requests-per-10-seconds limit means your sync jobs will fail outright without an exponential backoff queue.
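A retry wrapper along these lines keeps sync jobs alive across all three vendors. Everything here is illustrative; the assumption that X-Rate-Limit-Reset carries an epoch-seconds value should be checked against Workable's docs before relying on it:

```typescript
// Retry on 429 with exponential backoff plus jitter, preferring a
// reset hint from the response headers when one is present.
async function fetchWithBackoff(
  doFetch: () => Promise<{ status: number; headers: Record<string, string> }>,
  maxRetries = 5,
  baseDelayMs = 1000
) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await doFetch()
    if (res.status !== 429) return res
    // Assumed: reset header is epoch seconds (Workable-style)
    const reset = res.headers['x-rate-limit-reset']
    const waitMs = reset
      ? Math.max(0, Number(reset) * 1000 - Date.now())
      : 2 ** attempt * baseDelayMs + Math.random() * baseDelayMs
    await new Promise((r) => setTimeout(r, waitMs))
  }
  throw new Error('rate limited after retries')
}
```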

Cursor Translation: Unifying Pagination Across Vendors

When you expose a single next_cursor parameter to your callers - or to an LLM agent making tool calls - you need a way to encode and decode vendor-specific pagination state behind an opaque token.

The approach: your cursor is a Base64url-encoded JSON payload containing the vendor name and whatever state that vendor needs to resume pagination.

type VendorCursor =
  | { vendor: 'greenhouse'; link_next: string }   // opaque Link header cursor
  | { vendor: 'lever'; offset: string }            // Lever's next token
  | { vendor: 'workable'; since_id: string }       // Workable's since_id
 
function encodeCursor(state: VendorCursor): string {
  return Buffer.from(JSON.stringify(state)).toString('base64url')
}
 
function decodeCursor(cursor: string): VendorCursor {
  return JSON.parse(Buffer.from(cursor, 'base64url').toString())
}

When your proxy receives a request with next_cursor, it decodes the cursor, determines the vendor, and translates it into the vendor-native pagination parameter:

async function listCandidates(accountId: string, nextCursor?: string) {
  const account = await getIntegrationAccount(accountId)
  const vendor = account.vendor
 
  let vendorParams: Record<string, string> = { per_page: '100' }
 
  if (nextCursor) {
    const state = decodeCursor(nextCursor)
    switch (state.vendor) {
      case 'greenhouse':
        // Greenhouse v3: cursor replaces all other query params
        vendorParams = { cursor: state.link_next }
        break
      case 'lever':
        vendorParams.offset = state.offset
        break
      case 'workable':
        vendorParams.since_id = state.since_id
        break
    }
  }
 
  const raw = await fetchFromVendor(account, '/candidates', vendorParams)
  const candidates = raw.data.map(normalize)
 
  // Build the next cursor from the vendor response
  let newCursor: string | null = null
  if (vendor === 'greenhouse' && raw.headers.link?.includes('rel="next"')) {
    const linkNext = parseLinkHeader(raw.headers.link).next
    newCursor = encodeCursor({ vendor: 'greenhouse', link_next: linkNext })
  } else if (vendor === 'lever' && raw.body.next) {
    newCursor = encodeCursor({ vendor: 'lever', offset: raw.body.next })
  } else if (vendor === 'workable' && raw.body.paging?.next) {
    newCursor = encodeCursor({
      vendor: 'workable',
      since_id: extractSinceId(raw.body.paging.next)
    })
  }
 
  return { data: candidates, next_cursor: newCursor, has_more: newCursor !== null }
}

The caller sees a clean interface: an array of normalized candidates and an opaque next_cursor. They never need to know whether the underlying ATS uses Link headers, body tokens, or offset parameters. This is especially important for LLM agents, which should never be parsing pagination headers or managing vendor-specific state between tool calls.

How to Handle ATS Webhooks Without Losing Events

Polling APIs for changes is inefficient and eats your rate limits. The correct approach is webhooks — you subscribe to events like candidate_created or application_stage_changed, and the ATS pushes data to your server. But webhooks introduce their own set of engineering challenges.

Signature verification varies by vendor. Greenhouse uses HMAC-style signature verification and retries up to 7 attempts over roughly 15 hours on failures. Lever signs webhook requests using SHA-256 HMAC over the request token and timestamp, and requires HTTPS endpoints. Workable exposes webhook subscription endpoints with its own verification scheme. Each vendor uses a different hashing algorithm and header format.
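A verification sketch in the Greenhouse style, HMAC-SHA256 over the raw body compared in constant time, looks like this. The "sha256 <hexdigest>" header format is an assumption to verify against each vendor's docs; Lever and Workable sign different inputs with different header names, so treat this as a template, not a drop-in:

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

// Verify an HMAC-SHA256 webhook signature over the raw request body.
function verifyWebhookSignature(
  rawBody: string,
  signatureHeader: string, // assumed format: "sha256 <hexdigest>"
  secret: string
): boolean {
  const expected = createHmac('sha256', secret).update(rawBody).digest('hex')
  const received = signatureHeader.replace(/^sha256\s+/, '')
  // timingSafeEqual requires equal-length buffers, so length-check first
  if (received.length !== expected.length) return false
  return timingSafeEqual(Buffer.from(received), Buffer.from(expected))
}
```

Always verify against the raw body bytes, not a re-serialized JSON object, since key ordering and whitespace differences will break the digest.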

Event ordering is not guaranteed. You might receive an application_rejected event before the candidate_created event due to network latency. Your database logic must handle upserts and missing foreign keys gracefully.

Events will get dropped. If your server goes down during a deployment and returns 500 errors, some vendors will retry a few times and then permanently drop the event. You need a dedicated message queue (SQS, Kafka, or similar) sitting in front of your application to ingest webhooks instantly, returning a 200 OK before any processing happens.

The best practice here is boring, and boring is good: webhooks should reduce latency, not carry correctness on their backs. Your webhook pipeline should follow this pattern:

  1. Verify signature.
  2. Reject replays.
  3. Put the event on a queue.
  4. Re-fetch the affected object from the vendor API.
  5. Upsert into your own store.
  6. Run periodic reconciliation anyway.

A tiny dedupe check saves you from replay storms and vendor retries:

const inserted = await redis.set(`ats:${vendor}:${deliveryId}`, '1', {
  NX: true,
  EX: 86400,
})
 
if (!inserted) return
await queue.publish('ats-sync', payload)

Treat every webhook as a hint that something changed, not as the only source of truth.

Standardizing ATS Responses for LLM Consumption

If you're building LLM-powered recruiting tools - AI sourcing agents, interview copilots, automated screening systems - the quality of your ATS data pipeline directly determines the quality of your model's outputs. Inconsistent field names, mixed timestamp formats, and unpredictable nesting structures aren't just annoying for engineers. They actively degrade LLM performance.

Why Canonical Schemas Matter for LLM Agents

LLM agents consume ATS data in two primary ways: as tool call responses during agentic workflows, and as context documents in retrieval-augmented generation (RAG) pipelines. Both demand predictable structure.

Tool call responses: When an agent calls list_candidates, the response schema defines what the model can reason about. If Greenhouse returns email_addresses [].value and Lever returns emails [] and Workable returns email, the model either needs three different tool definitions (which bloats the system prompt and confuses tool selection) or you normalize before the response reaches the model. Normalization wins every time.

RAG context: When candidate profiles are embedded into a vector store, inconsistent field names mean the same semantic concept gets embedded as different tokens. A query for "candidate email" might miss records where the field was stored as email_addresses vs email. Normalizing before embedding ensures consistent retrieval quality across all ATS sources.

The LLM Response Envelope

Wrap every ATS API response in a consistent envelope that gives the model the context it needs to interpret the data:

type LLMResponseEnvelope<T> = {
  source: {
    vendor: string           // 'greenhouse' | 'lever' | 'workable'
    account_id: string       // which customer's ATS
    resource: string         // 'candidates' | 'jobs' | 'applications'
    fetched_at: string       // ISO 8601 timestamp
  }
  data: T[]                  // normalized records
  pagination: {
    has_more: boolean
    next_cursor: string | null
    total_count?: number     // when the vendor provides it
  }
  warnings?: string[]        // e.g. "3 records had null email fields"
}

The source block tells the agent where the data came from without leaking vendor-specific implementation details. The warnings array surfaces data quality issues - missing required fields, truncated responses, rate limit slowdowns - that the agent can report to the user rather than silently ignoring.

Here's how a proxy layer translates a vendor list endpoint into this normalized envelope:

async function proxyListCandidates(
  account: IntegrationAccount,
  cursor?: string
): Promise<LLMResponseEnvelope<CanonicalCandidate>> {
  const raw = await listCandidates(account.id, cursor)
 
  const warnings: string[] = []
  for (const c of raw.data) {
    if (!c.emails?.length) warnings.push(`Candidate ${c.id} has no email`)
  }
 
  return {
    source: {
      vendor: account.vendor,
      account_id: account.id,
      resource: 'candidates',
      fetched_at: new Date().toISOString(),
    },
    data: raw.data,
    pagination: {
      has_more: raw.has_more,
      next_cursor: raw.next_cursor,
      total_count: raw.total_count ?? undefined,
    },
    warnings: warnings.length ? warnings : undefined,
  }
}

This envelope is what your LLM agent's tool definition should declare as its return type. The model sees the same shape regardless of which ATS backs the connected account. The fetched_at timestamp also lets the agent reason about data freshness - if a tool response is 3 hours old, the agent can decide to re-fetch before acting on it.
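A hypothetical tool definition exposing that envelope to an agent might look like the sketch below. The JSON Schema shape follows common function-calling conventions; all names are illustrative:

```typescript
// A tool definition whose response contract is the normalized envelope,
// so the model never sees vendor-specific pagination or field names.
const listCandidatesTool = {
  name: 'list_candidates',
  description:
    'List candidates from the connected ATS. Returns a normalized envelope; ' +
    'pass next_cursor from a previous call to continue pagination.',
  parameters: {
    type: 'object',
    properties: {
      account_id: { type: 'string', description: 'Connected ATS account' },
      next_cursor: {
        type: ['string', 'null'],
        description: 'Opaque cursor from the previous response',
      },
    },
    required: ['account_id'],
  },
}
```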

How a Unified ATS API Solves the N-to-1 Integration Problem

The pattern should be clear by now: building point-to-point integrations with each ATS vendor means maintaining N separate connectors with N different authentication flows, N different data models, N different pagination strategies, and N different webhook formats. When N is 3, it's painful. When your sales team needs Ashby, iCIMS, and JazzHR next quarter, it becomes unsustainable.

A Unified ATS API collapses this N-to-1. Instead of integrating with each ATS directly, you integrate once with a standardized interface that normalizes all ATS data into a common schema. The unified API handles the per-vendor translation behind the scenes.

graph TD
    A[Your Application] -->|GET /unified/ats/candidates| B(Unified API Router)
    B --> C{Resolve Account}
    C -->|Load Credentials| D[Fetch Integration Config]
    D --> E[Map Request via JSONata]
    E --> F[Proxy Layer HTTP Client]
    F -->|Basic Auth / Link Headers| G((Greenhouse))
    F -->|OAuth / Offset Token| H((Lever))
    F -->|Bearer Token / Limit-Offset| I((Workable))
    G -->|Raw JSON| J[Response Mapper]
    H -->|Raw JSON| J
    I -->|Raw JSON| J
    J -->|JSONata Transform| K[Normalized Candidate Object]
    K --> A

With this approach:

  • One data model — Candidates, Applications, Jobs, Interview Stages, Offers, and Scorecards all follow a single schema regardless of the source ATS.
  • One auth integration — You authenticate with the unified API once; the platform manages per-vendor OAuth tokens, API key lifecycles, and refresh flows.
  • One pagination interface — Consistent cursor-based pagination with a next_cursor parameter, regardless of whether the underlying ATS uses Link headers, body cursors, or offset pagination.
  • One webhook format — Real-time events normalized into a consistent payload structure.
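The single pagination interface is what callers actually touch. As a sketch, draining any unified list endpoint becomes one cursor loop, regardless of the vendor underneath (the `Page` type and `fetchAll` helper are illustrative):

```typescript
type Page<T> = { data: T[]; has_more: boolean; next_cursor?: string }

// Illustrative: drain a unified list endpoint with one cursor loop,
// whether the vendor uses Link headers, body cursors, or offsets underneath
async function fetchAll<T>(
  listPage: (cursor?: string) => Promise<Page<T>>
): Promise<T[]> {
  const all: T[] = []
  let cursor: string | undefined
  do {
    const page = await listPage(cursor)
    all.push(...page.data)
    cursor = page.has_more ? page.next_cursor : undefined
  } while (cursor)
  return all
}
```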

A practical canonical interface might look like this:

type UnifiedApplication = {
  id: string
  candidateId: string
  jobId: string
  stageId?: string
  status?: string
  appliedAt?: string
  rejectedAt?: string
  hiredAt?: string
  customFields?: Record<string, unknown>
  remoteData?: unknown
}

The remoteData field is critical. It preserves the raw vendor response so you can always access vendor-specific fields that the unified schema doesn't cover.
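In practice, that fallback looks like checking the normalized fields first and only then reaching into the raw payload. A sketch, re-declaring just the relevant fields of the UnifiedApplication shape above (the hiring_urgency field is a hypothetical custom field, not part of any vendor schema):

```typescript
type UnifiedApplication = {
  customFields?: Record<string, unknown>
  remoteData?: unknown
}

// Illustrative fallback: prefer the normalized field, then reach into the
// preserved raw vendor payload when the unified schema doesn't cover it
function getHiringUrgency(app: UnifiedApplication): unknown {
  return (
    app.customFields?.['hiring_urgency'] ??
    (app.remoteData as Record<string, unknown> | undefined)?.['hiring_urgency']
  )
}
```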

The Honest Trade-offs

Unified APIs are not magic bullets. The trade-offs are real, and you should understand them before committing.

You lose some vendor-specific features. A unified schema is inherently a lowest-common-denominator view. Greenhouse's structured scorecard system is richer than what most unified schemas expose. Lever's CRM-specific nurture campaign data might not map cleanly. Any good unified API should preserve the raw vendor response alongside the normalized data.

You add a dependency. Your integration now depends on a third party. If the unified API provider has an outage, all your ATS integrations go down simultaneously. Evaluate uptime SLAs and architectural resilience before committing.

Schema coverage varies by resource and method, not by logo. Not every field from every ATS will be mapped. Enterprise customers with heavily customized ATS instances will need per-account mapping overrides. If a unified API vendor talks only in terms of supported logos, ask for support by resource and method. "Jobs list" is not the same as "applications move-stage." Candidate reads are not the same as attachment uploads.

For a deeper comparison of integration approaches, see 3 Models for Product Integrations: A Choice Between Control and Velocity.

Integrating Multiple ATS Platforms at Once with Truto

Truto's Unified ATS API covers the full recruiting data model: Organizations, Departments, Offices, Jobs, Candidates, Applications, Interview Stages, Interviews, Scorecards, Offers, Reject Reasons, EEOC data, Activities, Attachments, Tags, and Users. Every entity follows a standardized schema that works identically whether the underlying ATS is Greenhouse, Lever, Workable, or Ashby.

Here's what a request to list candidates looks like:

curl -X GET "https://api.truto.one/unified/ats/candidates?integrated_account_id=YOUR_ACCOUNT_ID&limit=10" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

That same request works for any supported ATS. The integrated_account_id determines which ATS and which customer account the request targets. The response shape is identical regardless of source.
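The same call from application code makes the point concrete: the only per-customer variable is the account id. A thin client sketch (the helper names are illustrative, the endpoint and parameters match the curl example above):

```typescript
function buildCandidatesUrl(integratedAccountId: string, limit = 10): URL {
  const url = new URL('https://api.truto.one/unified/ats/candidates')
  url.searchParams.set('integrated_account_id', integratedAccountId)
  url.searchParams.set('limit', String(limit))
  return url
}

// Same function serves Greenhouse, Lever, and Workable accounts alike;
// the connected account behind the id determines which ATS is queried
async function listUnifiedCandidates(apiToken: string, accountId: string) {
  const res = await fetch(buildCandidatesUrl(accountId), {
    headers: { Authorization: `Bearer ${apiToken}` },
  })
  if (!res.ok) throw new Error(`Unified API error: ${res.status}`)
  return res.json()
}
```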

Zero Integration-Specific Code

Most unified API platforms maintain separate code paths per integration — essentially if (provider === 'greenhouse') { ... } branches throughout their codebase. Every new integration requires new code, new deployments, and new opportunities for regressions.

Truto takes a different approach. Integration-specific behavior — how to authenticate with Greenhouse's Basic Auth, how to parse Lever's cursor pagination, how to map Workable's flat candidate object into the normalized schema — is defined entirely as data: JSON configuration for API communication patterns, and JSONata transformation expressions for field mapping. The runtime is a generic execution engine that loads the appropriate mapping and evaluates it.

Here's how a JSONata expression maps a raw Greenhouse response into a unified candidate object:

response.{
  "id": $string(id),
  "first_name": first_name,
  "last_name": last_name,
  "name": first_name & ' ' & last_name,
  "emails": email_addresses.{
    "email": value,
    "type": type
  },
  "applications": applications.{
    "id": $string(id),
    "job_id": $string(job_id)
  },
  "custom_fields": custom_fields
}

While a Lever mapping for the same unified output looks like:

response.{
  "id": $string(id),
  "first_name": $substringBefore(name, " "),
  "last_name": $substringAfter(name, " "),
  "emails": emails.[{"email": $}],
  "created_at": createdAt,
  "updated_at": updatedAt
}

Both produce the same unified output shape. The same code path handles both — it just evaluates different transformation expressions. Adding a new ATS is a data operation, not a code deployment. When a bug in the pagination logic gets fixed, every ATS integration benefits immediately. There's no risk that fixing Greenhouse pagination breaks Lever.
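To illustrate the "mappings as data, one generic engine" idea without pulling in a JSONata runtime, here is a toy version where the per-vendor mappings live in a lookup table and a single normalize function evaluates whichever one the account needs (a simplified stand-in, not Truto's actual engine):

```typescript
type UnifiedCandidate = { id: string; first_name: string; last_name: string }

// Toy stand-in for the JSONata approach: per-vendor mappings are data in a
// table, and one generic code path applies whichever mapping is loaded
const mappings: Record<string, (raw: any) => UnifiedCandidate> = {
  greenhouse: (raw) => ({
    id: String(raw.id),
    first_name: raw.first_name,
    last_name: raw.last_name,
  }),
  lever: (raw) => ({
    id: String(raw.id),
    first_name: raw.name.split(' ')[0],
    last_name: raw.name.split(' ').slice(1).join(' '),
  }),
}

function normalize(vendor: string, raw: unknown): UnifiedCandidate {
  const map = mappings[vendor]
  if (!map) throw new Error(`No mapping for vendor: ${vendor}`)
  return map(raw)
}
```

Adding a vendor means adding an entry to the table; `normalize` itself never changes.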

Per-Customer Customization Without Code Changes

Enterprise ATS instances are heavily customized. Your customer at a Fortune 500 might have 30 custom fields in Greenhouse that are critical to their hiring workflow. A rigid unified schema that ignores custom fields is useless to them.

Truto handles this with a three-level override system:

  1. Platform-level mappings — The default schema mapping that works for most customers.
  2. Environment-level overrides — Your product can customize mappings for specific deployment environments.
  3. Account-level overrides — Individual connected accounts can have their own mapping customizations.

If one customer's Greenhouse instance uses a custom field called hiring_urgency that's critical to your product, you can map it into the unified response for that specific account without affecting any other customer's integration and without waiting for a code deployment.
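The resolution order of those three levels can be sketched as a simple merge, with account overrides beating environment overrides, which beat platform defaults (the Mapping type and function are illustrative, not Truto's internal representation):

```typescript
type Mapping = Record<string, string> // field name -> source expression (illustrative)

// Account-level overrides win over environment-level, which win over the
// platform defaults - mirroring the three levels described above
function resolveMapping(
  platform: Mapping,
  environment?: Mapping,
  account?: Mapping
): Mapping {
  return { ...platform, ...environment, ...account }
}
```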

The raw vendor response is always preserved alongside the normalized data (in a remote_data field), so you can always access the full original payload when the unified schema doesn't cover a specific field.

Syncing ATS Data at Scale with RapidBridge

If you're building an AI sourcing agent, you can't query the ATS API in real time for every prompt. You need to sync the entire candidate database to your own vector store or relational database.

Truto provides RapidBridge, a pipeline engine that pulls data from third-party APIs and syncs it to your datastores via webhooks. You can configure a sync job to incrementally fetch candidates updated since the last run, automatically handling rate limits and pagination quirks.

curl -X POST "https://api.truto.one/sync-job" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "integration_name": "greenhouse",
    "resources": [
      { "resource": "ats/candidates", "method": "list" },
      { "resource": "ats/jobs", "method": "list" },
      {
        "resource": "ats/applications",
        "method": "list",
        "query": {
          "updated_at": { "gt": "{{previous_run_date}}" }
        }
      }
    ]
  }'

The previous_run_date placeholder automatically tracks when the last successful sync completed. If Greenhouse throws a rate limit error, Truto's proxy layer catches it, reads the Retry-After header, pauses, and resumes without any intervention from your engineering team. Swap integration_name to lever or workable and the rest stays the same.
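If you were implementing that pause-and-resume behavior yourself, the core loop is: on a 429, honor the Retry-After header, then retry the same request. A hedged sketch (fetchPage is an illustrative callback, not a Truto API):

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Sketch of rate-limit handling: on 429, read Retry-After (seconds),
// pause, and retry the same request up to a bounded number of attempts
async function fetchWithRetryAfter(
  fetchPage: () => Promise<Response>,
  maxRetries = 3
): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetchPage()
    if (res.status !== 429) return res
    const retryAfterSeconds = Number(res.headers.get('Retry-After') ?? '1')
    await sleep(retryAfterSeconds * 1000)
  }
  throw new Error('Rate limited after retries')
}
```

With a managed sync pipeline, this logic lives in the proxy layer instead of your codebase.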

You can schedule syncs on a cron, run them on-demand, or trigger them automatically when a new customer connects their ATS account. Error handling defaults to resilient mode — individual record errors are reported via webhook events without stopping the entire sync.

Post-Connection Configuration with RapidForm

Different customers want to sync different subsets of their ATS data. A staffing agency might have 500 active jobs but only want to sync candidates for specific departments. Truto's RapidForm lets you present a dynamic form to customers immediately after they connect their ATS account. The form fetches live data from the connected ATS — departments, offices, job lists — and lets the customer choose exactly what to sync. Those selections are stored as context variables and automatically used in subsequent sync jobs, with no code changes required per customer.

AI Agent Integration via MCP

If you're building LLM-powered recruiting tools (using frameworks like LangChain or LangGraph), giving your agent access to an ATS has traditionally been difficult: you'd need to write custom tool definitions for every API endpoint.

Truto solves this by automatically exposing connected integrations as Model Context Protocol (MCP) servers. When a customer connects their Lever account, Truto dynamically generates tool definitions based on the API schemas. Your AI agent can immediately call tools like list_all_lever_opportunities or create_a_lever_candidate through a standardized JSON-RPC 2.0 endpoint. You can filter tools to read-only operations (safe for autonomous agents) or scope them to specific resource groups using tags. The same connected account serves as the foundation for both your app integration and your AI agent — no need to build two separate integration surfaces.
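For orientation, MCP is built on JSON-RPC 2.0, and tool invocations use the tools/call method defined by the protocol. A sketch of what the agent sends for the tool named above (the arguments object is illustrative):

```typescript
// Illustrative MCP tool invocation per the JSON-RPC 2.0 framing that the
// Model Context Protocol specifies; the tool name matches the example above
const rpcRequest = {
  jsonrpc: '2.0' as const,
  id: 1,
  method: 'tools/call',
  params: {
    name: 'list_all_lever_opportunities',
    arguments: { limit: 10 },
  },
}

const body = JSON.stringify(rpcRequest) // POSTed to the MCP server endpoint
```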

Testing and Rollout Checklist for LLM Agents

Shipping an LLM agent that reads and writes ATS data requires more testing rigor than a standard API integration. Models are non-deterministic, and a malformed tool response can cause the agent to hallucinate actions or silently skip records.

  1. Validate tool response schemas end-to-end. Generate sample responses for each ATS vendor and run them through your tool's JSON schema validator. A single unexpected null where the model expects a string can derail a chain-of-thought.

  2. Test with sparse data. Create test accounts with candidates missing emails, jobs with no applications, and applications with no stage. Real ATS data is messy - your agent needs to handle absent fields without crashing or inventing values.

  3. Cap response sizes for token budgets. A list_candidates call returning 100 full records will blow past most context windows. Set sensible defaults (10-25 records per page) and include has_more so the agent can paginate intentionally.

  4. Scope write permissions tightly. Start with read-only tool definitions. When you enable writes (stage moves, candidate creation), require explicit user confirmation in the agent loop before executing. An autonomous agent that accidentally rejects 50 candidates is a career-ending bug.

  5. Test tool selection with ambiguous prompts. Ask your agent "show me recent candidates" and verify it picks the right tool and the right ATS account. If a user has Greenhouse and Lever connected, the agent needs logic to determine which account to query - or to ask.

  6. Simulate webhook-driven updates during agent sessions. If a candidate's stage changes mid-conversation, the agent should detect stale data. Include fetched_at timestamps in tool responses so the agent can reason about freshness.

  7. Log every tool call and response. You need a full audit trail of what the agent read and wrote. This is non-negotiable for recruiting workflows where compliance and candidate data privacy are at stake.

  8. Run reconciliation tests. After an agent session that made write operations, compare the ATS state against expected outcomes. Verify that stage moves landed, created candidates have correct fields, and no duplicate records were created.

  9. Load test against rate limits. An eager agent can fire off dozens of tool calls in seconds. Verify that your proxy layer's backoff logic prevents 429 cascades, and that the agent receives clear error messages it can reason about.

  10. Test across all three ATS vendors before launch. Don't assume that passing tests on Greenhouse means Lever and Workable will work. Each vendor's edge cases are different - run the full suite against each.
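Checklist items 1 and 2 can be automated with even a small validator that flags missing fields before the model ever sees them. A toy sketch (the ToolCandidate shape is illustrative; a real implementation would validate against your full tool schema):

```typescript
type ToolCandidate = { id?: unknown; emails?: unknown[] }

// Toy validator for checklist items 1-2: surface missing fields in a tool
// response instead of letting the model see an unexpected null
function validateToolResponse(records: ToolCandidate[]): string[] {
  const problems: string[] = []
  records.forEach((record, i) => {
    if (typeof record.id !== 'string') problems.push(`record ${i}: id missing or not a string`)
    if (!Array.isArray(record.emails) || record.emails.length === 0)
      problems.push(`record ${i}: no emails`)
  })
  return problems
}
```

Run it against sample payloads from each vendor; problems become warnings in the response envelope rather than silent gaps.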

A Practical Rollout Plan

If you're trying to ship multi-ATS support this quarter, here's the order that works:

  1. Quantify the revenue at risk. Count the deals in your pipeline that require Greenhouse, Lever, or Workable integrations. Attach dollar amounts. This is the number you bring to your engineering prioritization meeting.

  2. Normalize the read path first. Ship jobs, candidates, applications, and stages before touching writes. Get data flowing into your product and validate the integration value with customers before investing in write operations.

  3. Store raw payloads and provider IDs. Future you will need them for support and custom enterprise asks. Don't throw away vendor-native data.

  4. Add webhook ingestion plus reconciliation. Low latency from webhooks, correctness from periodic sync. Don't trust webhooks alone — treat them as hints, not sources of truth.

  5. Hydrate detail selectively. Lists are for discovery. Full object fetches are for correctness. Some vendors (including Workable) return partial responses on list endpoints that require follow-up reads.

  6. Externalize customer-specific mapping. Don't fork code because one enterprise account added custom requisition fields. Use per-account overrides instead.

  7. Add writes after you understand actor identity and permissions. Stage movement, rejection, and offer flows are where vendor differences get expensive. Lever needs candidate_id for stage moves, Workable requires user_id, Greenhouse wants application_id. Ship reads first, add writes once you've mapped the requirements.

  8. Instrument everything. Track webhook lag, sync lag, 429 rates, hydration failures, and per-vendor error counts. If you can't observe it, you can't fix it.
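Step 4's reconciliation pass can be as simple as comparing local updated_at values against a fresh fetch and reporting the drift that webhooks missed. A sketch under illustrative types (id-to-timestamp snapshots, not any vendor's actual shape):

```typescript
type Snapshot = Record<string, string> // record id -> updated_at timestamp

// Sketch of a reconciliation pass: anything present remotely but missing or
// stale locally is drift that a dropped webhook would otherwise hide
function findDrift(local: Snapshot, remote: Snapshot): string[] {
  const drifted: string[] = []
  for (const [id, updatedAt] of Object.entries(remote)) {
    if (local[id] !== updatedAt) drifted.push(id)
  }
  return drifted
}
```

Drifted ids get re-hydrated with full object fetches, which also feeds the sync-lag metrics from step 8.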

If you're a PM trying to get engineering buy-in for adopting an external integration tool, The PM's Playbook walks through exactly how to frame that conversation.

The ATS integration problem doesn't get easier by waiting. Every quarter you delay is another set of enterprise deals that go to competitors who already support the platforms your prospects use. Stop arguing about whether your team can build a Greenhouse connector. They can. The real question is whether maintaining OAuth tokens, tracking API deprecations, and writing JSON parsing logic for a growing list of ATS platforms is the best use of their sprint capacity.

FAQ

How much does it cost to build an ATS API integration?
A single well-scoped ATS integration costs $5,000–$15,000 for standard workflows, but complex integrations with bi-directional sync, custom field mapping, and webhook handling regularly push into the $20,000–$50,000+ range. Annual maintenance adds 10–20% of the initial cost. Multiply by the number of ATS platforms you need to support.
What is a unified ATS API?
A unified ATS API is a single standardized interface that normalizes data from multiple applicant tracking systems (Greenhouse, Lever, Workable, etc.) into a common schema. You integrate once instead of building separate connectors for each platform. The unified API handles per-vendor authentication, pagination, rate limiting, and schema normalization behind the scenes.
What are the main differences between Greenhouse and Lever APIs?
Greenhouse uses HTTP Basic Auth and Link header pagination, while Lever uses OAuth 2.0 and body-cursor pagination. Their data models differ fundamentally: Greenhouse separates Candidates and Applications as distinct entities, while Lever merges them into an Opportunity object. Greenhouse uses integer IDs; Lever uses UUIDs.
Are webhooks enough to keep ATS data in sync?
No. Use webhooks for lower latency, but keep periodic reconciliation jobs and hydration reads in place. Greenhouse documents webhook retries up to 7 attempts, Lever signs webhook payloads and expects verification, and Workable exposes subscription endpoints — but none of that removes the need for periodic re-sync. Treat webhooks as hints, not as your only source of truth.
How do you handle custom fields across different ATS platforms?
Each ATS exposes custom fields differently — Greenhouse gates some behind Enterprise-tier plans, Lever uses tags, Workable uses custom questions. A unified API with per-account override capabilities lets you map vendor-specific custom fields into your normalized schema without code changes for each customer.
