Implementing Human-in-the-Loop Approval Workflows for AI Agent SaaS Actions

Learn how to architect HITL approval workflows for AI agents, with a Plaid Transfer playbook covering risk tiers, Python examples, audit logs, and re-auth patterns.

Nachi Raman · 20 min read

Your agent is one tool call away from emailing 14,000 customers, deleting a Salesforce opportunity worth $400K, or pushing an unreviewed payroll entry in NetSuite. You have built an impressive AI prototype. It reasons correctly, plans multi-step workflows, and executes function calls exactly as designed. The model is confident. Your CISO is not.

You cannot let a non-deterministic LLM execute consequential SaaS API actions without human oversight. Implementing human-in-the-loop (HITL) approval workflows for AI agents is the difference between a demo and a production deployment that survives an enterprise security review. You need to pause the agent, request human approval, wait for the response, and resume execution.

This guide is for engineers and PMs who have already discovered that wrapping a tool call in if confirm == 'y' does not scale past a Tuesday afternoon. The hard part is not the prompt. It is the distributed systems plumbing underneath: pausing a non-deterministic process, persisting state safely, surviving expired OAuth tokens, and resuming days later without replaying side effects.

The Danger of "Confirmation Fatigue" in AI Agent Tool Calling

Confirmation fatigue is a documented security vulnerability in AI agents. When you require a human to approve every single minor API action—fetching a contact, updating an internal status, reading a calendar event—users quickly become overwhelmed.

Treating every tool call as equally risky is not just safety theater; it is an active vulnerability. Security researchers note that confirmation fatigue is the primary obstacle to effective human oversight at scale. When users are bombarded with approval requests, they stop reading the payloads. They blindly click "Approve" just to clear their notifications and get back to work. After the tenth "Are you sure?" dialog, your operations lead is just clicking yes, including on the one that wipes a production object.

The stakes are not hypothetical. A Gartner survey found that 74% of IT application leaders believe AI agents represent a new attack vector into their organization, and only 13% strongly agreed that they had the right governance structures in place to manage them. Gartner also projects that 40% of enterprise applications will embed task-specific AI agents by 2026, up from less than 5% in 2025. That gap between adoption and governance is exactly where production incidents live.

The fix is risk tiering, not blanket confirmation. Most actions an agent takes are read-only, idempotent, or trivially reversible. Pinging a human for those wastes attention. You must classify SaaS API endpoints into distinct risk tiers and reserve interruption for the consequential class:

  • Tier 0 (Auto-Execute): GET requests, idempotent reads, internal-only writes (logs, embeddings). The agent can execute these autonomously, provided it operates strictly within the boundaries of the end-user's authorized data access.
  • Tier 1 (Notify, Do Not Block): Internal CRM notes, draft creation, status flips on owned records. These might require a daily digest review rather than a synchronous interruption.
  • Tier 2 (Synchronous Approval Required): Outbound emails, deal-stage changes, record deletion, bulk updates >N rows. These require explicit, state-managed human-in-the-loop interruptions.
  • Tier 3 (Multi-Party Approval): Payment writes, contract execution, customer-facing communications, or anything touching regulated data.
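
One way to make these tiers enforceable is a lookup the tool-dispatch layer consults before executing anything. The sketch below is illustrative only: the tool names, the `RiskTier` enum, and the rule that unknown tools default to Tier 2 are assumptions, not part of any agent framework.

```python
from enum import IntEnum


class RiskTier(IntEnum):
    AUTO_EXECUTE = 0    # reads and internal-only writes
    NOTIFY = 1          # non-blocking, reviewed in a digest
    SYNC_APPROVAL = 2   # pause and wait for one approver
    MULTI_PARTY = 3     # pause and wait for several approvers


# Hypothetical tool names; replace with your own tool registry.
TOOL_TIERS = {
    "crm.get_contact": RiskTier.AUTO_EXECUTE,
    "crm.add_internal_note": RiskTier.NOTIFY,
    "email.send_to_customer": RiskTier.SYNC_APPROVAL,
    "crm.delete_record": RiskTier.SYNC_APPROVAL,
    "payments.create_transfer": RiskTier.MULTI_PARTY,
}


def classify(tool_name: str) -> RiskTier:
    """Return the tier for a tool, failing closed for unknown tools."""
    return TOOL_TIERS.get(tool_name, RiskTier.SYNC_APPROVAL)


def needs_blocking_approval(tool_name: str) -> bool:
    """True when the agent must pause before running this tool."""
    return classify(tool_name) >= RiskTier.SYNC_APPROVAL
```

Defaulting unregistered tools to a blocking tier is a design choice: a newly added tool interrupts a human until someone deliberately classifies it.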

In highly regulated industries, these tiers are non-negotiable. In healthcare and life sciences, AWS highlights that GxP regulations require strict human oversight for sensitive operations like modifying clinical trial protocols. Your agent framework must be able to enforce these boundaries deterministically. The interesting engineering problem is everything at Tier 2 and above. That is where state, time, and distributed failure modes collide.

Why Synchronous API Calls Fail for Human Approvals

The architectural flaw in most early AI agent deployments is relying on synchronous HTTP requests for human approvals.

Standard agent frameworks ship with synchronous tool calling. The model decides to use a tool, the framework invokes a local function, and that function makes an HTTP request to a third-party API. The process blocks until the HTTP response returns. If you implement HITL as a blocking HTTP call—the agent calls a function, the function sends a Slack message with an "Approve" button, and awaits human input—you will hit production failure within a week.

The reasons are unsurprising once you list them:

  • HTTP Timeouts: Cloud load balancers, API gateways, and serverless runtimes enforce hard timeouts. AWS API Gateway has a default integration timeout of 29 seconds. Vercel serverless functions time out after 10 to 300 seconds depending on your tier. If your reviewer is in a meeting, the connection drops, and the agent framework receives a 504 Gateway Timeout error.
  • Process Volatility: A pod restart, deploy, or autoscaler eviction destroys the in-memory call stack the agent was suspended in. The human eventually clicks "Approve" three hours later, but the system that requested the approval no longer exists in memory.
  • Token Expiry: A short-lived OAuth access token (Salesforce, Google Workspace, HubSpot) routinely expires while you wait. The refresh token might too if the wait is long enough.
  • Cursor Invalidation: Pagination cursors, scroll IDs, and bulk export job IDs become stale or invalid after a few minutes or hours of inactivity, breaking any "resume where we left off" logic.
  • Cost: Holding an LLM context warm for hours while a human deliberates is a token-budget disaster.

This is the same architectural mistake covered in our guide on how to handle long-running SaaS API tasks in AI agent workflows—synchronous execution models break the moment real-world latency enters the picture. HITL is just the human-driven version of the same problem.

State-Managed Interruptions: The LangGraph and Temporal Pattern

To solve the timeout problem, you must decouple the agent's reasoning loop from the execution of the API call. The standard pattern is state-managed interruption: pause the agent, serialize its complete state to durable storage, return control to the caller, and resume from the exact checkpoint when an approval signal arrives. The execution stops being an in-memory call tree and becomes a row in a database.

Two families of tooling dominate here. LangGraph acts as a safety guardrail, allowing human supervisors to reconfigure state before irreversible actions occur. Instead of blocking a thread, the framework pauses execution, persists the current state (including the LLM's context and the proposed API payload) to a durable database, and shuts down the compute resources.

When the human provides the asynchronous approval signal, the framework re-hydrates the state from the database and resumes execution exactly where it left off.
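
To make the pause, persist, and resume cycle concrete without any framework, here is a minimal sketch backed by SQLite from the standard library. The table name and the `open_store`, `pause`, and `resume` helpers are hypothetical; in practice a framework checkpointer plays this role.

```python
import json
import sqlite3
import uuid
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the durable store and make sure the checkpoint table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending_actions ("
        "id TEXT PRIMARY KEY, state TEXT NOT NULL, status TEXT NOT NULL)"
    )
    return conn


def pause(conn: sqlite3.Connection, state: dict) -> str:
    """Persist the agent state and return an ID the approval UI can echo back."""
    action_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO pending_actions VALUES (?, ?, 'pending')",
        (action_id, json.dumps(state)),
    )
    conn.commit()
    return action_id


def resume(conn: sqlite3.Connection, action_id: str, approved: bool) -> Optional[dict]:
    """Resolve a checkpoint exactly once; return None for unknown or duplicate IDs."""
    new_status = "approved" if approved else "rejected"
    cur = conn.execute(
        "UPDATE pending_actions SET status = ? WHERE id = ? AND status = 'pending'",
        (new_status, action_id),
    )
    conn.commit()
    if cur.rowcount == 0:
        return None  # already resolved, or never existed
    row = conn.execute(
        "SELECT state FROM pending_actions WHERE id = ?", (action_id,)
    ).fetchone()
    return {"approved": approved, "state": json.loads(row[0])}
```

The conditional `UPDATE` is what makes a double-clicked "Approve" harmless: only the first callback finds a row still marked pending.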

graph TD
    A[Agent Proposes Consequential Action<br>via Tool Call] --> B[Pause Execution]
    B --> C[Persist State to Postgres]
    C --> D[Dispatch Approval Request<br>Slack / Email / UI]
    D --> E{Human Review}
    E -->|Approved| F[Webhook Callback Received]
    E -->|Rejected| G[Agent Receives Denial Context]
    F --> H[Re-hydrate State]
    H --> I[Execute SaaS API Call]

The Node-Based Breakpoint Pattern

Historically, this was handled by creating explicit breakpoint nodes in a state graph. Here is a conceptual example of how this is handled using LangGraph's node interruption mechanics:

from langgraph.graph import StateGraph, END
from typing import TypedDict
 
class AgentState(TypedDict):
    proposed_action: dict
    approval_status: str
    api_response: dict
 
def propose_action(state: AgentState):
    # Agent decides to delete a CRM record
    return {"proposed_action": {"endpoint": "DELETE /crm/contacts/123"}}
 
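def human_approval_node(state: AgentState):
    # This node acts as a breakpoint. With interrupt_before set at compile
    # time, execution pauses before it runs and control returns to the caller.
    # The approval handler then writes approval_status into the checkpoint
    # (for example via app.update_state) and re-invokes the graph to continue.
    pass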
 
def execute_action(state: AgentState):
    if state.get("approval_status") == "approved":
        # Execute the actual API call
        return {"api_response": {"status": 200}}
    return {"api_response": {"status": 403, "reason": "Human denied action"}}
 
workflow = StateGraph(AgentState)
workflow.add_node("propose", propose_action)
workflow.add_node("human_approval", human_approval_node)
workflow.add_node("execute", execute_action)
 
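workflow.set_entry_point("propose")
workflow.add_edge("propose", "human_approval")
workflow.add_edge("human_approval", "execute")
workflow.add_edge("execute", END)

# Compile with a checkpointer to enable state persistence.
# postgres_saver is assumed to be an already-configured checkpointer,
# such as a PostgresSaver from the langgraph-checkpoint-postgres package.
app = workflow.compile(checkpointer=postgres_saver, interrupt_before=["human_approval"])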

The Modern Interrupt Primitive

More recently, LangGraph ships interrupt() as the primitive, paired with a checkpointer that persists state across pauses. The interrupt function pauses graph execution and returns a value to the caller. When called within a node, LangGraph saves the current graph state and waits for you to resume execution with input.

When interrupt is called, it pauses execution of the graph, marks the thread as interrupted, and puts whatever you passed to interrupt into the persistence layer. You can check the thread status, see that it's interrupted, and then invoke the graph again with graph.invoke(Command(resume="Your response here"), thread) to pass your response back in.

A minimal approval node using the modern primitive looks like this:

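from langgraph.types import interrupt, Command

# run_tool and preview_changes are application-defined helpers (not part of
# LangGraph): one executes the tool call, the other renders a before/after diff.

def execute_consequential_action(state):
    proposed = state["tool_call"]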
 
    # Auto-execute Tier 0/1
    if proposed["risk_tier"] <= 1:
        return {"result": run_tool(proposed)}
 
    # Pause for Tier 2+
    decision = interrupt({
        "action": proposed["name"],
        "args": proposed["args"],
        "diff": preview_changes(proposed),
        "requested_by": state["thread_id"],
    })
 
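    if decision["approved"]:
        # Apply approver edits to the arguments only, so an override cannot
        # change the tool name or the risk tier.
        args = {**proposed["args"], **decision.get("overrides", {})}
        return {"result": run_tool({**proposed, "args": args})}
    return {"result": {"status": "rejected", "reason": decision.get("reason")}}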

Temporal takes the same idea further with durable execution: workflows are deterministic functions whose entire history is replayed from an event log, so a workflow can await a signal for weeks. The trade-off is heavier infrastructure and a programming model that disallows non-determinism inside workflow code.

The deeper architectural discussion lives in our piece on architecting AI agents with LangGraph, LangChain, and the SaaS integration bottleneck—both frameworks solve the same problem at different abstraction levels.

sequenceDiagram
    participant Agent
    participant Graph as Agent Runtime<br/>(LangGraph/Temporal)
    participant Store as Durable State Store
    participant UI as Approval UI<br/>(Slack/Jira/Web)
    participant SaaS as Third-Party SaaS API

    Agent->>Graph: Propose Tier 2 action
    Graph->>Store: Checkpoint state + interrupt payload
    Graph->>UI: Render approval request
    Note over Graph: Process exits.<br/>No resources held.
    UI->>Graph: Resume(approved=true, overrides={...})
    Graph->>Store: Load checkpoint
    Graph->>SaaS: Execute API call with fresh token
    SaaS-->>Graph: Result
    Graph->>Agent: Continue reasoning

The SaaS Integration Bottleneck: Tokens, Webhooks, and State

Pausing the agent's state is only half the battle. The agent is safely asleep in your database. But what happens to the integration layer while the agent sleeps? If a human takes 72 hours to approve a Salesforce contact merge, the underlying infrastructure connecting your agent to the third-party SaaS platform degrades. This manifests in several specific failure modes:

1. OAuth Access Tokens Expire. Standard OAuth 2.0 access tokens expire quickly. Salesforce access tokens default to 2 hours. Google to 1 hour. HubSpot to 30 minutes. If your agent stashed an access token in its state at pause time, that token is dead by the time approval arrives. Your resume logic must re-fetch credentials, not reuse what it captured.

2. Refresh Tokens Rotate or Revoke. Some providers rotate refresh tokens on every use. Others revoke them after an admin password change or a Marketplace re-install event. Without a token-management layer that handles rotation server-side, your "resume after approval" path needs to gracefully detect the invalid_grant case and surface a re-auth flow. See handling OAuth token refresh failures in production for the failure modes you must plan for.

3. Idempotency Keys Must Outlive the Pause. Generate the idempotency key before the interrupt, persist it in state, and reuse it on resume. Otherwise, a duplicate-resume (human clicks Approve twice; webhook delivered twice) creates two records.

4. Rate Limit Context is Stale. The remaining-request budget you observed at pause time is meaningless on resume. Whatever execution layer drives the API call after approval must re-read live rate limit headers.

5. Pagination Cursor Invalidation. If the agent was in the middle of paginating through a massive dataset when it hit an action requiring approval, the cursors provided by the third-party API might expire. Attempting to use the old cursor typically fails with a client error such as 400 Bad Request, so resume logic should restart pagination rather than trust a saved cursor.

6. Data Retention and Compliance Risks. When you pause an agent, you must persist its state. If the agent is proposing to create a new employee record in an HRIS, that state contains highly sensitive Personally Identifiable Information (PII). Storing that payload in your intermediate database for days while waiting for approval expands your compliance footprint. To mitigate this, you must rely on pass-through architectures. See zero data retention AI agent architecture for the specific security controls required.

7. The Approval Signal Itself is a Webhook. Slack button clicks, Jira ticket transitions, and DocuSign envelope completions all arrive as third-party webhooks with bespoke shapes, signatures, and verification handshakes. Normalizing these into a single "approval received" event is its own substantial integration project.
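
Several of these failure modes come down to one rule: checkpoint only what stays valid across the pause, and fetch everything else at resume time. The sketch below illustrates that split. The `Idempotency-Key` header is a common convention rather than something every provider supports, and `get_fresh_token` and `send` are hypothetical callables standing in for your credential store and HTTP client.

```python
import uuid
from typing import Callable


def prepare_for_pause(proposed_action: dict) -> dict:
    """Build the state to checkpoint.

    Deliberately absent: access tokens, rate limit counters, and pagination
    cursors, all of which can go stale while the approval is pending.
    """
    return {
        "proposed_action": proposed_action,
        "idempotency_key": str(uuid.uuid4()),  # generated once, before the pause
    }


def execute_on_resume(
    state: dict,
    get_fresh_token: Callable[[], str],
    send: Callable[[dict, dict], dict],
) -> dict:
    """Run the approved action with live credentials and the persisted key."""
    headers = {
        "Authorization": f"Bearer {get_fresh_token()}",  # fetched at resume time
        "Idempotency-Key": state["idempotency_key"],     # reused on every retry
    }
    return send(state["proposed_action"], headers)
```

If the approval callback is delivered twice, both executions carry the same key, so a provider that honors idempotency keys creates a single record.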

Architecting a Risk-Tiered Approval Workflow

A workable framework, ordered by the questions you should answer in code:

| Question | Implementation |
| --- | --- |
| Is the action reversible? | Tier down. Reversible writes can use post-hoc review rather than pre-execution approval. |
| Does it touch external parties? | Tier up. Sending an email to a customer is always Tier 2+. |
| Is it in regulated scope? | Force Tier 3 with named-approver lists. Gartner recommendations include enforcing zero trust and least privilege for agents, mandating human oversight for high-stakes actions, and limiting tool access to what each task strictly requires. |
| What is the blast radius? | Bulk operations >N records always escalate one tier. |
| Who owns the data? | The approver should be the record owner or a delegated reviewer, not whoever happens to be on call. |
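
These questions can be encoded as adjustments on top of a base tier. The following is a minimal sketch; the `ActionContext` fields, the `BULK_THRESHOLD` value standing in for "N", and the order in which the rules apply are all assumptions to adapt to your own policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionContext:
    base_tier: int                 # tier from the static tool registry (0-3)
    reversible: bool
    touches_external_party: bool
    regulated_scope: bool
    affected_records: int


BULK_THRESHOLD = 50  # assumed value for "N"; tune per object type


def effective_tier(ctx: ActionContext) -> int:
    """Apply the tiering questions in order; later rules override earlier ones."""
    tier = ctx.base_tier
    if ctx.reversible:
        tier -= 1                  # reversible writes can use post-hoc review
    if ctx.affected_records > BULK_THRESHOLD:
        tier += 1                  # bulk operations escalate one tier
    if ctx.touches_external_party:
        tier = max(tier, 2)        # external parties are always Tier 2+
    if ctx.regulated_scope:
        tier = 3                   # regulated scope forces Tier 3
    return max(0, min(tier, 3))
```

Applying the floor rules last means reversibility can never pull a customer-facing or regulated action below its minimum tier.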

A few engineering rules that have saved us repeatedly:

  • Render a diff, not a request. Showing the human the literal JSON body is useless. Show the before/after of the affected fields, the count of impacted rows, and the dollar amount if any.
  • Log the proposal hash. Persist a hash of the proposed action at interrupt time. On resume, verify the executed action matches. This prevents prompt-injection attacks that try to mutate the arguments between approval and execution.
  • Set an approval TTL. A pause is not an open invitation. After 7 days (or 24 hours for sensitive ops), expire the interrupt and require the agent to re-propose. The world has moved on.
  • Always provide an "approve with edits" path. Approvers should be able to mutate the args ("approve, but cap the discount at 10%") rather than only answer yes or no. This is what Command(resume={...}) is built for.
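
The hash, TTL, and edit rules fit together in a few lines. This is a sketch under stated assumptions: the proposal and decision dictionaries, the `approved_hash` and `overrides` keys, and the helper names are illustrative, not a LangGraph API.

```python
import hashlib
import json
import time
from typing import Optional

APPROVAL_TTL_SECONDS = 7 * 24 * 3600  # use 24 * 3600 for sensitive operations


def proposal_hash(action: dict) -> str:
    """Stable SHA-256 over the canonical JSON form of the action."""
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def open_proposal(action: dict, now: Optional[float] = None) -> dict:
    """Record the action, its hash, and the time it was proposed."""
    return {
        "action": action,
        "hash": proposal_hash(action),
        "created_at": time.time() if now is None else now,
    }


def resolve(proposal: dict, decision: dict, now: Optional[float] = None) -> dict:
    """Return the action to execute, or raise if the approval is unusable."""
    now = time.time() if now is None else now
    if now - proposal["created_at"] > APPROVAL_TTL_SECONDS:
        raise TimeoutError("approval expired; the agent must re-propose")
    if decision.get("approved_hash") != proposal["hash"]:
        raise ValueError("approved payload does not match the proposal")
    # Approver edits are layered on top of the approved arguments.
    args = {**proposal["action"]["args"], **decision.get("overrides", {})}
    return {**proposal["action"], "args": args}
```

The approval UI echoes back the hash it displayed, so the executor can refuse any payload the human did not actually see. Overrides should be logged separately so the executed arguments remain attributable to the approver.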

Connecting AI Agents to Plaid: A Risk-Tiered HITL Playbook

Plaid is one of the highest-stakes integration targets for AI agents in fintech. The API surface spans read-only balance checks all the way to initiating ACH debits from end-user bank accounts. Getting the risk tier wrong here is not a data quality issue - it is a regulatory and financial liability.

Plaid's credential model also differs from the standard OAuth pattern discussed earlier in this guide. A Plaid access_token is used to make API requests related to a specific Item, and access tokens do not expire, although they may require updating. That means the "token expiry" failure mode is not a timer - it is event-driven, which introduces its own complications for long-running approval workflows.

When to Require Human-in-the-Loop: Reads vs. Writes

Plaid products split cleanly along the read/write boundary, and your HITL tier should follow that split:

| Plaid Product | Endpoint | Recommended HITL Tier |
| --- | --- | --- |
| Transactions | /transactions/sync | Tier 0 - Auto-execute |
| Balance | /accounts/balance/get | Tier 0 - Auto-execute |
| Investments | /investments/holdings/get | Tier 0 - Auto-execute |
| Liabilities | /liabilities/get | Tier 0 - Auto-execute |
| Identity | /identity/get | Tier 1 - Notify + audit log |
| Auth | /auth/get | Tier 1 - Notify + audit log |
| Transfer Authorization | /transfer/authorization/create | Tier 2 - Approval required |
| Transfer Create | /transfer/create | Tier 3 - Multi-party approval |
| Recurring Transfer | /transfer/recurring/create | Tier 3 - Multi-party approval |

Two Plaid-specific nuances matter here:

Identity and Auth are reads, but they expose sensitive data. Plaid Identity verifies account holder names, addresses, phone numbers, and email addresses for KYC and fraud prevention workflows. Plaid Auth verifies bank account and routing numbers for ACH payments and direct deposit switching. An AI agent autonomously pulling this data and passing it into an LLM context expands your PII exposure surface. Tier 1 (notify + audit) is the floor; some compliance teams will push these to Tier 2.

Transfer has a built-in two-phase approval pattern. The core developer flow for Plaid Transfer is four steps: Link (account connection and verification), Authorization (create the auth object before transfer), Transfer (submit the transfer on your chosen rail), and Monitor (track state via webhooks and dashboard). You call /transfer/authorization/create first, which returns an authorization_id and a decision. The decision values are: approved (the proposed transfer has been approved for processing by Plaid), declined (Plaid reviewed and declined processing), or user_action_required (an action is required before Plaid can assess the risk). This is a natural HITL insertion point: let the agent create the authorization, pause for human review, then call /transfer/create with the approved authorization_id only after the human confirms.

Step-by-Step Approval Flow for Plaid Transfer

Here is the concrete flow for an AI agent initiating an ACH debit through Plaid with human approval:

sequenceDiagram
    participant Agent
    participant Runtime as Agent Runtime
    participant Plaid as Plaid API
    participant UI as Approval UI
    participant Store as State Store

    Agent->>Runtime: Propose: debit $500 from account
    Runtime->>Plaid: POST /transfer/authorization/create
    Plaid-->>Runtime: authorization_id + decision
    alt decision == approved
        Runtime->>Store: Persist authorization_id,<br>amount, account_id, payload_hash
        Runtime->>UI: Agent wants to debit $500<br>from ****4829. Approve?
        Note over Runtime: Process exits.<br>No compute held.
        UI->>Runtime: Human approves
        Runtime->>Store: Verify payload_hash matches
        Runtime->>Plaid: POST /transfer/create (authorization_id)
        Plaid-->>Runtime: Transfer created
        Runtime->>Agent: Resume with result
    else decision == declined
        Runtime->>Agent: Transfer denied by Plaid risk engine
    else decision == user_action_required
        Runtime->>UI: Prompt end user to<br>re-authenticate via Link update mode
    end

The corresponding Python implementation using LangGraph's interrupt() primitive and the plaid-python client:

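import hashlib
import json

from langgraph.types import interrupt
from plaid.model.ach_class import ACHClass
from plaid.model.transfer_authorization_create_request import (
    TransferAuthorizationCreateRequest,
)
from plaid.model.transfer_authorization_user_in_request import (
    TransferAuthorizationUserInRequest,
)
from plaid.model.transfer_create_request import TransferCreateRequest
from plaid.model.transfer_network import TransferNetwork
from plaid.model.transfer_type import TransferType

# Assumes plaid_client is a configured plaid_api.PlaidApi instance and that
# proposed["amount"] is already a decimal string such as "500.00".
# Wire the two nodes in sequence: authorize_transfer -> approve_and_create.


def authorize_transfer_node(state):
    """Step 1: get Plaid's risk decision before bothering a human."""
    proposed = state["proposed_transfer"]
    auth_request = TransferAuthorizationCreateRequest(
        access_token=state["plaid_access_token"],
        account_id=proposed["account_id"],
        type=TransferType("debit"),
        network=TransferNetwork("ach"),
        amount=str(proposed["amount"]),
        ach_class=ACHClass("web"),
        user=TransferAuthorizationUserInRequest(
            legal_name=proposed["legal_name"]
        ),
    )
    auth_response = plaid_client.transfer_authorization_create(auth_request)
    # to_dict() yields plain values that are safe to store in the checkpoint.
    authorization = auth_response.to_dict()["authorization"]
    rationale = authorization.get("decision_rationale") or {}

    update = {
        "authorization": {
            "id": authorization["id"],
            "decision": authorization["decision"],
            "rationale_code": rationale.get("code"),
        }
    }
    if authorization["decision"] == "declined":
        update["result"] = {
            "status": "declined",
            "reason": rationale.get("description"),
        }
    elif authorization["decision"] == "user_action_required":
        update["result"] = {
            "status": "user_action_required",
            "message": "End user must re-authenticate via Plaid Link",
        }
    return update


def approve_and_create_transfer_node(state):
    """Steps 2 and 3: pause for human review, then create the transfer.

    LangGraph re-runs a node from its first line when it resumes, so this
    node performs no side effects before interrupt(). The authorization was
    created and checkpointed by the previous node.
    """
    authorization = state["authorization"]
    if authorization["decision"] != "approved":
        return {}  # result was already set by authorize_transfer_node

    proposed = state["proposed_transfer"]

    # Step 2: Plaid approved - now pause for human review
    payload = {
        "authorization_id": authorization["id"],
        "amount": proposed["amount"],
        "account_id": proposed["account_id"],
        "description": proposed["description"],
    }
    payload_hash = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()
    ).hexdigest()

    decision = interrupt({
        "type": "plaid_transfer_approval",
        "payload": payload,
        "payload_hash": payload_hash,
        "plaid_risk_decision": authorization["decision"],
        "rationale_code": authorization["rationale_code"],
    })

    if not decision["approved"]:
        return {"result": {"status": "rejected", "reason": decision.get("reason")}}

    # Step 3: Verify payload integrity and execute
    if decision.get("payload_hash") != payload_hash:
        return {"result": {"status": "error", "reason": "Payload tampered"}}

    transfer_request = TransferCreateRequest(
        access_token=state["plaid_access_token"],
        account_id=proposed["account_id"],
        authorization_id=authorization["id"],
        amount=str(proposed["amount"]),
        description=proposed["description"],
    )
    transfer_response = plaid_client.transfer_create(transfer_request)

    return {
        "result": {
            "status": "created",
            "transfer_id": transfer_response.to_dict()["transfer"]["id"],
        }
    }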

Key implementation details:

  • Get Plaid's authorization before pausing. The Transfer Risk Engine performs a real-time balance check of the end-user account when evaluating debit transfers, alongside a mandatory set of risk and compliance checks and a configurable set of rules displayed in the Dashboard. If Plaid declines the transfer, there is no reason to bother the human approver.
  • The 1-hour authorization window is your constraint. Approved authorizations are valid for 1 hour by default, unless otherwise configured by Plaid support. Your approval UI must enforce this TTL. If the approver does not respond in time, the agent needs to re-create the authorization.
  • Persist the authorization_id in checkpoint state. This is your idempotency anchor. If the human clicks "Approve" twice, both attempts carry the same authorization_id, and an authorization can back only one transfer, so the second /transfer/create call does not produce a duplicate debit.
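
A small guard on the resume path can enforce the authorization window before calling /transfer/create. This sketch assumes you stored the authorization's creation time in checkpoint state; the two-minute safety margin and the step names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

AUTHORIZATION_TTL = timedelta(hours=1)   # default window described above
SAFETY_MARGIN = timedelta(minutes=2)     # assumed buffer for skew and latency


def authorization_is_usable(created_at: datetime, now: Optional[datetime] = None) -> bool:
    """True while the authorization is comfortably inside its validity window."""
    now = now or datetime.now(timezone.utc)
    return now < created_at + AUTHORIZATION_TTL - SAFETY_MARGIN


def next_step(created_at: datetime, approved: bool, now: Optional[datetime] = None) -> str:
    """Decide what the resume path should do with an approval decision."""
    if not approved:
        return "reject"
    if authorization_is_usable(created_at, now):
        return "create_transfer"
    return "recreate_authorization"
```

Because a re-created authorization has a new ID, the hash the approver signed off on no longer matches. Whether that requires a fresh human approval is a policy decision; for Tier 3 actions, re-approval is the conservative choice.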

Audit Log Schema for Plaid Agent Actions

Every Plaid action your agent takes - read or write - should produce an audit record. This is not optional for fintech. Here is a minimal schema:

| Field | Type | Description |
| --- | --- | --- |
| event_id | UUID | Unique identifier for this audit event |
| timestamp | ISO 8601 | When the action was proposed or executed |
| agent_id | string | Identifier for the agent instance |
| thread_id | string | Agent runtime thread/conversation ID |
| plaid_product | string | transactions, identity, auth, transfer, etc. |
| plaid_endpoint | string | /transfer/create, /identity/get, etc. |
| plaid_item_id | string | The Plaid Item involved |
| account_id | string | The Plaid account ID (masked in logs) |
| action_type | enum | read, write, authorization |
| risk_tier | int | 0-3 per your tier classification |
| approval_required | boolean | Whether HITL was triggered |
| approver_id | string | Identity of the human who approved (null if auto-executed) |
| approval_timestamp | ISO 8601 | When approval was granted (null if auto-executed) |
| payload_hash | string | SHA-256 of the proposed action payload |
| executed_payload_hash | string | SHA-256 of the actually executed payload |
| plaid_request_id | string | Plaid's request_id from the API response |
| outcome | enum | success, declined_by_plaid, rejected_by_human, expired, error |
| error_code | string | Plaid error code if the call failed |
| ttl_seconds | int | Seconds elapsed between proposal and execution |

Store these records in append-only storage with encryption at rest. For compliance audits, the fields that matter most are approver_id, approval_timestamp, payload_hash, and executed_payload_hash - these prove that a specific human approved a specific action and that the action was not tampered with post-approval.

Do not log the full Plaid API response for Identity or Auth calls. Those responses contain account numbers, routing numbers, and PII. Log only the plaid_request_id and outcome. The request ID lets you correlate the call with Plaid's own records, for example when working with Plaid support during an investigation.
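
A record builder keeps these rules in one place. The sketch below covers a subset of the schema. The `AuditRecord` class, the `mask` and `sha256_of` helpers, and the "Tier 2 and above requires approval" rule are assumptions for illustration.

```python
import hashlib
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


def sha256_of(payload: dict) -> str:
    """Hash the canonical JSON form so key order does not change the digest."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def mask(identifier: str) -> str:
    """Keep only the last four characters of an identifier for logging."""
    return "****" + identifier[-4:]


@dataclass(frozen=True)
class AuditRecord:
    event_id: str
    timestamp: str
    thread_id: str
    plaid_endpoint: str
    account_id: str
    risk_tier: int
    approval_required: bool
    approver_id: Optional[str]
    payload_hash: str
    executed_payload_hash: Optional[str]
    plaid_request_id: Optional[str]
    outcome: str


def build_record(thread_id: str, endpoint: str, account_id: str, risk_tier: int,
                 proposed: dict, executed: Optional[dict] = None,
                 approver_id: Optional[str] = None,
                 plaid_request_id: Optional[str] = None,
                 outcome: str = "success") -> AuditRecord:
    """Build one audit event; only hashes of the payloads are stored."""
    return AuditRecord(
        event_id=str(uuid.uuid4()),
        timestamp=datetime.now(timezone.utc).isoformat(),
        thread_id=thread_id,
        plaid_endpoint=endpoint,
        account_id=mask(account_id),
        risk_tier=risk_tier,
        approval_required=risk_tier >= 2,
        approver_id=approver_id,
        payload_hash=sha256_of(proposed),
        executed_payload_hash=sha256_of(executed) if executed is not None else None,
        plaid_request_id=plaid_request_id,
        outcome=outcome,
    )


def to_log_line(record: AuditRecord) -> str:
    """Serialize for an append-only sink (one JSON object per line)."""
    return json.dumps(asdict(record), sort_keys=True)
```

Comparing `payload_hash` with `executed_payload_hash` on each record gives auditors a mechanical check that the executed action is the one that was approved.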

Re-Authentication Approaches for Long-Running Agent Workflows

Because Plaid access tokens do not expire on a timer, the failure mode that matters for HITL is a broken bank connection. An access_token becomes unusable when the user changes their bank password, the bank changes its MFA requirements, or consent expires (for example, at European institutions that comply with PSD2's 90-day consent window).

If, for any reason, an Item ever does need re-authentication, any API call will return the ITEM_LOGIN_REQUIRED error. You may also receive the ITEM_LOGIN_REQUIRED error via the ITEM: ERROR webhook, or receive a PENDING_EXPIRATION or PENDING_DISCONNECT webhook. The only fix is sending the end user through Link in update mode, which is used when an existing Item requires input from a user, such as to update credentials or to grant additional consent.
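
Acting on these webhooks while an approval is still pending lets you warn the approver before they sign off on a transfer that cannot execute. The sketch below inspects a parsed Plaid webhook body. The `paused_threads` structure is hypothetical, and webhook signature verification is omitted for brevity.

```python
# Plaid ITEM webhook codes that signal an upcoming or current need for update mode.
REAUTH_WEBHOOK_CODES = {"PENDING_EXPIRATION", "PENDING_DISCONNECT"}


def item_needs_update_mode(webhook: dict) -> bool:
    """True if this webhook means the Item needs, or will soon need, Link update mode."""
    if webhook.get("webhook_type") != "ITEM":
        return False
    code = webhook.get("webhook_code")
    if code in REAUTH_WEBHOOK_CODES:
        return True
    if code == "ERROR":
        error = webhook.get("error") or {}
        return error.get("error_code") == "ITEM_LOGIN_REQUIRED"
    return False


def flag_paused_threads(webhook: dict, paused_threads: list) -> list:
    """Return IDs of paused approval threads that depend on the affected Item."""
    if not item_needs_update_mode(webhook):
        return []
    item_id = webhook.get("item_id")
    return [t["thread_id"] for t in paused_threads if t["plaid_item_id"] == item_id]
```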

For an AI agent waiting on human approval, this creates a specific failure mode:

  1. Agent proposes a transfer at 10:00 AM. Plaid access_token is healthy.
  2. Human does not approve until 10:45 AM, still inside the 1-hour authorization window.
  3. Between 10:00 AM and 10:45 AM, the bank revoked the session.
  4. Agent resumes and calls /transfer/create. Plaid returns ITEM_LOGIN_REQUIRED.

Your resume logic must handle this:

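import json
import plaid

# execute_plaid_transfer and create_link_token_for_update are
# application-defined helpers, not part of the plaid-python client.

def resume_after_approval(state):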
    try:
        result = execute_plaid_transfer(state)
        return {"result": result}
    except plaid.ApiException as e:
        error = json.loads(e.body)
        if error["error_code"] == "ITEM_LOGIN_REQUIRED":
            # Cannot fix this server-side.
            # The end user must re-authenticate via Plaid Link update mode.
            link_token = create_link_token_for_update(
                state["plaid_access_token"]
            )
            return {
                "result": {
                    "status": "re_auth_required",
                    "link_token": link_token,
                    "message": "Bank connection expired. User must re-authenticate.",
                }
            }
        raise

For step-up verification before high-value transfers, consider these patterns:

  • Session re-check: Before executing a Tier 3 action, verify that the approver's session is still active and authenticated. Do not accept an approval from a session that was established 8 hours ago.
  • Step-up MFA for the approver: Require the human approver to complete an MFA challenge (TOTP, push notification) before their approval is accepted for transfers above a dollar threshold. This is your application logic, not Plaid's - Plaid handles the end-user's bank authentication, but the approver verification is your responsibility.
  • Time-bound approval tokens: Issue a short-lived, signed approval token when the approver confirms. The agent's resume logic validates this token before executing. This prevents replay attacks where an old approval is reused for a new transfer.
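
The time-bound approval token can be sketched with the standard library alone. The token format here (base64-encoded JSON claims plus an HMAC-SHA256 signature) is an assumption for illustration; a JWT library would serve equally well.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_approval_token(secret: bytes, approver_id: str, payload_hash: str,
                         ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Sign a short-lived approval bound to one specific payload hash."""
    claims = {
        "approver_id": approver_id,
        "payload_hash": payload_hash,
        "expires_at": (time.time() if now is None else now) + ttl_seconds,
    }
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{signature}"


def verify_approval_token(secret: bytes, token: str, payload_hash: str,
                          now: Optional[float] = None) -> dict:
    """Return the claims if the token is authentic, unexpired, and for this payload."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(body))
    if (time.time() if now is None else now) > claims["expires_at"]:
        raise ValueError("approval token expired")
    if claims["payload_hash"] != payload_hash:
        raise ValueError("token was issued for a different payload")
    return claims
```

Binding the token to the payload hash stops an old approval from authorizing a different transfer. Blocking reuse for the same payload within the TTL would additionally require recording consumed tokens server-side.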

Minimal Plaid Product Mappings for Common Agent Use Cases

When initializing Plaid Link for an AI agent integration, request only the products your agent actually needs. Plaid pricing is structured around product selection and API call volume, and the platform offers multiple products - including Auth, Balance, Identity, Transactions, Investments, and Income - each with its own per-call or per-user pricing model. Over-permissioning expands your compliance surface for zero benefit.

Agent Use Case Required Plaid Products HITL Tier Notes
Transaction categorization and expense tracking transactions 0 Use /transactions/sync for incremental updates
Balance monitoring and cash flow alerts transactions, balance 0 balance for real-time; transactions for history
Account verification for payments auth 1+ Returns account + routing numbers. Log all access.
Identity verification / KYC identity 1-2 Returns PII (name, address, email, phone)
ACH debit initiation transfer, signal 3 signal for pre-transfer risk scoring
Investment portfolio analysis investments 0 Read-only holdings and transactions
Loan underwriting / income verification income_verification, assets 2+ Asset reports are point-in-time snapshots
Recurring payment setup transfer, signal 3 Use /transfer/recurring/create. Always Tier 3.
Fraud detection and account monitoring transactions, identity 1 identity component pushes tier up

No products should be added to the products array and no product-specific request parameters should be specified when creating a link_token for update mode, unless you are adding specific products. If you are using update mode to add consented products, the products must be added to the additional_consented_products array instead. Use this mechanism if your agent needs to expand access later rather than creating a duplicate Item.
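
As a sketch of that rule, the helper below builds a /link/token/create request body for update mode as a plain dictionary. The helper name, `client_name`, and country code are placeholders. Credentials (`client_id`, `secret`) are left to your Plaid client configuration.

```python
from typing import Iterable, Optional


def update_mode_link_token_request(
    access_token: str,
    client_user_id: str,
    additional_products: Optional[Iterable[str]] = None,
) -> dict:
    """Build a request body that launches Link in update mode for an existing Item."""
    request = {
        "client_name": "Agent Console",            # placeholder display name
        "language": "en",
        "country_codes": ["US"],                   # placeholder; match your Item
        "user": {"client_user_id": client_user_id},
        "access_token": access_token,              # supplying this selects update mode
    }
    # No "products" key in update mode. Newly consented products go here instead.
    if additional_products:
        request["additional_consented_products"] = list(additional_products)
    return request
```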

How Truto Simplifies HITL Integrations for AI Agents

Building a robust, state-managed HITL workflow requires an integration layer that abstracts away the volatility of third-party APIs. Truto does not run your agent graph—LangGraph, Temporal, or your own framework owns that. What Truto handles is the SaaS-side surface area that breaks while your agent is paused, so resumption is reliable.

Proactive OAuth Token Refresh

Truto handles credential lifecycles entirely server-side. Before every API call routed through Truto's proxy or unified API layer, the platform checks the token expiration. It also refreshes OAuth tokens proactively: a scheduled alarm renews the credential 60 to 180 seconds before it expires. When your agent resumes 48 hours after pausing, the next API call against Salesforce or Xero uses fresh credentials without your code touching token management.

Standardized Rate Limit Headers

When an agent resumes after a long pause, it might hit an API that is currently experiencing heavy load. Truto normalizes upstream rate limit information into standardized headers following the IETF draft specification for rate limit header fields (ratelimit-limit, ratelimit-remaining, ratelimit-reset).

Info

Note on Rate Limits: Truto does not automatically retry, throttle, or absorb rate limit errors. When an upstream API returns an HTTP 429, Truto passes that error directly back to the caller. This is an intentional architectural decision, ensuring your agent framework retains deterministic control over retry logic and exponential backoff.
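Because the 429 reaches your code unchanged, the retry policy lives in your agent framework. The following is a minimal sketch of one way to compute a wait time. It assumes ratelimit-reset carries the number of seconds until the window resets, and it falls back to capped exponential backoff when the header is missing or malformed. The function name and default values are illustrative, not part of any Truto SDK.

```python
from typing import Mapping, Optional


def retry_delay_seconds(
    status_code: int,
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no retry is needed."""
    if status_code != 429:
        return None

    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    reset = normalized.get("ratelimit-reset")
    if reset is not None:
        try:
            # Assumption: the value is a delta in seconds, not a timestamp.
            return max(float(reset), 0.0)
        except ValueError:
            pass  # Malformed header: fall through to exponential backoff.

    # Fallback: base * 2^attempt, capped so a long outage cannot stall forever.
    return min(base * (2 ** attempt), cap)
```

Returning the delay instead of sleeping inside the helper lets a durable workflow engine schedule the retry as a timer rather than holding a worker open.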

Webhook Normalization for Asynchronous Callbacks

When the approval signal comes from a third party (Slack interactive message, Jira transition, DocuSign envelope completion), Truto's webhook normalization layer handles signature verification, payload transformation, and event mapping into a unified shape. Your system listens to a single, predictable webhook format from Truto, which you use to signal your agent framework to resume execution without writing per-provider parsers.

Zero Data Retention

Truto's pass-through architecture means sensitive third-party data is not stored in an intermediary database while your agent waits. This drastically reduces your compliance footprint when handling PII during long-running approvals.

Architecting for Resilience

Deploying AI agents into enterprise environments requires acknowledging the harsh realities of distributed systems. Synchronous HTTP requests will drop. Humans will take days to approve actions. OAuth tokens will expire while workflows sit idle.

By implementing state-managed interruptions, categorizing API endpoints by risk tier to prevent confirmation fatigue, and relying on a resilient integration layer to manage the complexities of SaaS authentication and webhooks, you can build AI agents that execute consequential actions safely and reliably.

Warning

The most common HITL bug in production: the approval UI shows the user one thing, but the args mutate between approval and execution because the agent re-runs an LLM call on resume. Always pin the approved payload by hash at interrupt time and refuse to execute if the hash drifts.
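One way to pin the approved payload is sketched below. The helper names are hypothetical, and the sketch assumes tool arguments are JSON-serializable. It hashes a canonical JSON encoding at interrupt time, stores the digest with the approval record, and recomputes it immediately before execution.

```python
import hashlib
import hmac
import json
from typing import Any, Callable, Dict


class PayloadDriftError(RuntimeError):
    """Raised when the payload about to run differs from what was approved."""


def payload_hash(tool_name: str, args: Dict[str, Any]) -> str:
    # Sorted keys and fixed separators make the digest independent of key order.
    canonical = json.dumps(
        {"tool": tool_name, "args": args},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def execute_if_unchanged(
    tool_name: str,
    args: Dict[str, Any],
    approved_hash: str,
    execute: Callable[[str, Dict[str, Any]], Any],
) -> Any:
    """Run the tool only if the payload still matches the approved digest."""
    current = payload_hash(tool_name, args)
    if not hmac.compare_digest(current, approved_hash):
        raise PayloadDriftError(f"Payload for {tool_name!r} changed after approval")
    return execute(tool_name, args)


# At interrupt time: hash what the approver is shown and persist the digest.
approved = payload_hash("create_transfer", {"amount": "25.00", "account_id": "acc_1"})
```

On resume, pass the arguments the agent intends to execute through execute_if_unchanged. If a re-run LLM call altered them, the call raises instead of moving money.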

If you are designing this from scratch, start with three things: write down your action taxonomy before any code, pick a checkpointer backed by durable storage from day one, and treat third-party token and webhook normalization as a buy decision unless integrations are your product.

FAQ

How do I safely connect an AI agent to Plaid financial data?
Classify every Plaid product by risk tier. Read-only products like Transactions and Balance can auto-execute at Tier 0. Identity and Auth expose PII and account numbers, so they need audit logging at minimum (Tier 1). Transfer operations that move money must always require explicit human-in-the-loop approval (Tier 2-3). Use a state-managed interruption framework like LangGraph to pause the agent while a human reviews Transfer authorization requests.
Do Plaid access tokens expire?
Plaid access tokens do not expire on a timer like standard OAuth tokens. They become invalid when the underlying bank connection breaks, for example when the user changes their bank password or when consent expires under PSD2 regulations. When this happens, Plaid returns an ITEM_LOGIN_REQUIRED error, and the end user must re-authenticate through Plaid Link in update mode.
What Plaid products should I request for an AI agent integration?
Request only the products your agent needs. For transaction categorization, request only transactions. For payment initiation, request transfer and signal. Over-permissioning increases costs (Plaid bills by product) and expands your compliance surface. Use Plaid Link's additional_consented_products parameter if you need to expand access later.
Why do synchronous API calls fail for human-in-the-loop approvals?
HTTP timeouts, process restarts, and token expiry all break synchronous HITL flows. Cloud load balancers typically drop idle connections within seconds to minutes, but a human might take hours or days to approve. You need state-managed interruptions that persist the agent's state to durable storage, release compute, and resume only when the approval signal arrives.
How do I prevent confirmation fatigue in AI agent workflows?
Implement risk tiering instead of blanket confirmation. Classify API endpoints into tiers: auto-execute for reads and idempotent operations, notify-only for low-risk writes, and require synchronous approval only for consequential actions like payment writes, record deletion, or bulk updates.
