Agent Steps
Agent Steps are the main step type for AI-driven work in an Action.
Use an Agent Step when an Action Flow needs an AI model to reason over inputs, call tools, inspect intermediate results, and produce a useful business output.
Common uses include:
- extracting structured values from documents,
- classifying invoices or requests,
- matching records against reference data,
- deciding which tool to call next,
- summarizing or validating information before downstream steps.
For general input and tool concepts, see Inputs and Tools.
How an Agent Step Runs
An Agent Step combines several pieces of context:
- the selected AI model,
- the selected Agent Role,
- the step prompt,
- assigned step inputs,
- assigned tools,
- outputs from previous steps referenced in the prompt.
The Agent Role provides the high-level instructions. The prompt tells the agent what to do in this specific step. Inputs and previous step outputs provide the data. Tools let the agent retrieve or update external information when needed.
Agent Role
The Agent Role defines the system instructions for the agent.
Use it for stable behavior rules that should apply across many steps, such as:
- what kind of specialist the agent is,
- how it should reason about the task,
- what output style or constraints it should follow,
- what it must not do.
If a role has versions, you can either use the latest version or pin the step to a specific version. Pinning is useful when you want repeatable production behavior even if the role is updated later.
Keep role instructions reusable. Put step-specific details in the step prompt instead.
Prompt
The prompt is the task instruction for this specific Agent Step.
Good prompts are explicit about:
- the business goal,
- the fields or result the step should produce,
- what evidence the agent should use,
- how to handle missing or conflicting data,
- the expected output shape.
Example:
Review the invoice details and determine whether this invoice is ready for automatic posting.
Return:
- postingDecision: "approve", "review", or "reject"
- reason: short explanation
- vendorId: matched vendor id if available
Use the OCR text and vendor lookup result below. If the vendor match is uncertain, choose "review".
Referencing Values in Prompts
Use double curly braces to reference input values and previous step outputs.
Examples:
Invoice id: {{input.invoiceId}}
Vendor name: {{inputs.dooapEventPayload.vendorName}}
OCR result: {{Read invoice OCR}}
Matched vendor id: {{steps.Match vendor.vendorId}}
Supported patterns include:
- {{input.someValue}} for trigger or run input parameters,
- {{inputs.someObject.nestedValue}} for nested input values,
- {{Step Name}} for the full output of a previous step,
- {{Step Name.field}} for a field from a previous step output,
- {{steps.Step Name.field}} as an explicit previous-step reference.
Default Values
If a referenced value might be missing or null, add a default with the ?? operator. Studio uses the default when the value cannot be resolved, so the placeholder never leaks into the prompt as raw text:
Vendor name: {{inputs.dooapEventPayload.vendorName ?? "Unknown vendor"}}
Matched vendor id: {{steps.Match vendor.vendorId ?? "none"}}
Without a default, an unresolved reference is left in the prompt as its literal {{...}} text. Use an empty default ({{... ?? ""}}) when you want the placeholder to simply disappear if the value is absent.
The default applies when the value is missing or null. A value that resolves to an empty string is treated as present, so the default is not used.
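The resolution rules above can be illustrated with a small sketch. This is not Studio's implementation: the resolvePlaceholders function, its regular expression, and the shape of the context object are assumptions made for illustration, and only double-quoted string defaults are handled.

```javascript
// Hypothetical helper, not Studio's implementation: resolves {{path}} and
// {{path ?? "default"}} placeholders against a plain context object.
function resolvePlaceholders(template, context) {
  const pattern = /\{\{\s*(.+?)\s*(?:\?\?\s*"([^"]*)"\s*)?\}\}/g;
  return template.replace(pattern, (match, path, fallback) => {
    // Walk the dotted path; a missing segment yields undefined.
    let value = context;
    for (const key of path.split(".")) {
      value = value == null ? undefined : value[key];
    }
    if (value !== undefined && value !== null) {
      // An empty string counts as present, so the default is not used.
      return typeof value === "object" ? JSON.stringify(value) : String(value);
    }
    // Missing or null: use the default if one was given, else keep the literal text.
    return fallback !== undefined ? fallback : match;
  });
}

// Assumed context shape: vendorName is an empty string, vendorId is null.
const exampleContext = {
  inputs: { dooapEventPayload: { vendorName: "" } },
  steps: { "Match vendor": { vendorId: null } },
};
console.log(
  resolvePlaceholders('Matched vendor id: {{steps.Match vendor.vendorId ?? "none"}}', exampleContext)
);
```

In this sketch the null vendorId falls back to its default, while the empty vendorName would resolve to an empty string rather than to a default.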
The prompt editor can suggest available step references when you type {{.
How Step Inputs Affect the Prompt
Step Inputs decide what input data is available to the agent.
If your prompt includes a placeholder for an assigned input, Studio replaces that placeholder with the input content.
Example:
Use this OCR text:
{{Invoice OCR Text}}
If an assigned input has data but the prompt does not reference it directly, Studio can append the input under an Input Data section so the agent still receives it.
JSON input is formatted to make it easier for the model to read. File inputs are included as text when possible. Non-text files, such as images or PDFs, are provided as file context or metadata depending on the input type and model capability.
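As a rough sketch of the appending behavior described above: the function name, the reference check, and the exact layout of the Input Data section are assumptions, not Studio's actual output format. Placeholder replacement itself is left out here.

```javascript
// Hypothetical sketch: append assigned inputs that the prompt does not reference.
function appendUnreferencedInputs(prompt, assignedInputs) {
  const sections = [];
  for (const [name, data] of Object.entries(assignedInputs)) {
    // Simplified check: treats any placeholder starting with the input name as a reference.
    const referenced = prompt.includes("{{" + name);
    const hasData = data !== undefined && data !== null && data !== "";
    if (referenced || !hasData) continue;
    // Pretty-print JSON values so they are easier for the model to read.
    const body = typeof data === "string" ? data : JSON.stringify(data, null, 2);
    sections.push(name + ":\n" + body);
  }
  if (sections.length === 0) return prompt;
  return prompt + "\n\nInput Data\n" + sections.join("\n\n");
}

console.log(
  appendUnreferencedInputs("Use this OCR text:\n{{Invoice OCR Text}}", {
    "Invoice OCR Text": "Invoice 1001 ...",
    "Vendor lookup": { vendorId: "V-1" },
  })
);
```

Here only the Vendor lookup input is appended, because the prompt already references Invoice OCR Text directly.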
Use outputFilter on input assignments when the input response is larger than the agent needs. Smaller, focused input improves cost, clarity, and reliability.
Tools
Tools let the agent call external capabilities during the step.
Use tools when the agent needs current or trusted system data, for example:
- looking up a vendor,
- checking an invoice status,
- retrieving purchase order data,
- updating a business record after validation.
Prefer tools for facts and external operations. Do not ask the model to guess values that a tool can retrieve reliably.
Maximum Iterations
Maximum Iterations limits how many reasoning and tool-use cycles the agent can perform during the step.
The default value is 10. The editor allows values from 1 to 50.
Use lower values when:
- the task is simple,
- the agent should call at most one or two tools,
- you want stricter cost and latency control.
Use higher values when:
- the agent may need several tool calls,
- the task requires multi-step investigation,
- the agent must inspect and compare multiple results.
Avoid setting this higher than necessary. More iterations can increase runtime and credit usage, and may make failures harder to debug.
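To make the effect of the limit concrete, here is a minimal sketch of a bounded reasoning and tool-use loop. The callModel and runTool functions, the reply shape, and the iteration_limit_reached status are all hypothetical stand-ins. The sketch is synchronous for brevity, whereas real model and tool calls would be asynchronous.

```javascript
// Hypothetical agent loop; callModel and runTool are stand-ins, not Studio APIs.
function runAgentLoop({ callModel, runTool, maxIterations = 10 }) {
  // Mirror the editor's allowed range of 1 to 50.
  if (!Number.isInteger(maxIterations) || maxIterations < 1 || maxIterations > 50) {
    throw new RangeError("maxIterations must be an integer from 1 to 50");
  }
  const history = [];
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const reply = callModel(history);
    if (reply.type === "final") {
      return { status: "completed", iterations: iteration, output: reply.output };
    }
    // Otherwise the model asked for a tool: run it and feed the result back.
    const result = runTool(reply.toolName, reply.args);
    history.push({ toolName: reply.toolName, result });
  }
  // Assumed outcome when the limit is reached without a final answer.
  return { status: "iteration_limit_reached", iterations: maxIterations };
}
```

Each extra iteration is another model call and possibly another tool call, which is why a higher limit can raise runtime and credit usage.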
Required Confidence Level
Required Confidence Level is a confidence gate for the Agent Step.
When no requirement is selected, the step can complete regardless of its calculated confidence. When a requirement is selected, Studio compares the step confidence to the required level after the agent produces its result.
Available levels are:
- LOW: accepts any confidence level,
- MED: requires at least 60% confidence,
- HIGH: requires at least 85% confidence.
If the confidence does not meet the requirement, the step does not pass. Studio records a confidence-not-met failure and uses the configured confidence error Action if one is set. If no confidence-specific error Action is configured, normal step or Action error handling applies.
Use HIGH for business-critical automation where uncertain output should stop or route to review. Use MED when moderate uncertainty is acceptable but low-confidence outputs should be caught. Use no requirement or LOW for exploratory or non-critical steps.
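The gate can be pictured as a simple threshold comparison. In the sketch below the thresholds come from the levels listed above, while the checkConfidence function, the result shape, and the representation of confidence as a fraction between 0 and 1 are assumptions for illustration.

```javascript
// Thresholds follow the documented levels; everything else here is hypothetical.
const CONFIDENCE_THRESHOLDS = { LOW: 0, MED: 0.6, HIGH: 0.85 };

function checkConfidence(stepConfidence, requiredLevel) {
  // No requirement selected: the step completes regardless of confidence.
  if (requiredLevel == null) return { passed: true };
  const threshold = CONFIDENCE_THRESHOLDS[requiredLevel];
  if (threshold === undefined) {
    throw new Error("Unknown confidence level: " + requiredLevel);
  }
  // "At least" means a confidence equal to the threshold passes.
  return stepConfidence >= threshold
    ? { passed: true }
    : { passed: false, failure: "confidence-not-met", required: threshold, actual: stepConfidence };
}

console.log(checkConfidence(0.7, "MED"));
console.log(checkConfidence(0.7, "HIGH"));
```

With a confidence of 0.7, the step would pass a MED requirement but not a HIGH one.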
Designing for Confidence
The confidence gate works best when the prompt asks the agent to be conservative.
Helpful prompt guidance:
- tell the agent what evidence is required,
- define when it should return a review outcome,
- require a short reason for the decision,
- avoid forcing a positive answer when data is missing,
- provide focused input instead of large unrelated payloads.
Example:
Detect whether this document is a vehicle maintenance invoice.
Return isVehicleMaintenanceInvoice = true only when the document clearly includes maintenance or repair work for a vehicle, such as service labor, spare parts, tires, inspection, oil change, or workshop fees.
Return isVehicleMaintenanceInvoice = false when the document is for fuel, parking, tolls, leasing, insurance, general office purchases, or when the evidence is unclear.
Include a short reason and cite the fields or text snippets that support the decision. If the evidence is mixed or missing, choose false and explain what is missing.
Structured Output
Require structured output forces the Agent Step to return its final result as JSON that matches a schema you define. Studio validates the result against that schema, so downstream steps can rely on a predictable shape.
Enable it with the Require structured output checkbox, just below Required Confidence Level. When it is off, the step behaves as it always has and the agent returns free-form text. When it is on, you define the fields the final response must contain.
Use structured output when:
- a later step needs specific fields from this step,
- you want to fail fast if the agent returns an unexpected shape,
- you are passing the result to a tool, Code Step, or another Action that expects strict JSON.
Defining the Output Schema
By default Studio shows a simple field list, so you can describe the result without writing any JSON. Each field has a name and a type, and you can switch between the Fields and JSON views at any time.
To add a field, select Add field, enter a field name, choose a type, and optionally add a short description that helps the agent understand what to put there. The type defaults to Text.
Available field types are:
- Text for any text value,
- Number for decimals,
- Whole number for integers,
- Yes / No for true/false values,
- List of text and List of numbers for simple lists.
You can change a field's type from the dropdown on its row, and remove a field with the trash icon. Every field you add is required in the result.
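As an illustration of how a field list might correspond to a JSON Schema, the sketch below converts fields into a schema object. The mapping from field types to JSON Schema types is an assumption based on the type descriptions above, and fieldsToSchema is a hypothetical helper, so the schema Studio actually generates may differ.

```javascript
// Assumed mapping from field-list types to JSON Schema types.
const FIELD_TYPE_MAP = {
  "Text": { type: "string" },
  "Number": { type: "number" },
  "Whole number": { type: "integer" },
  "Yes / No": { type: "boolean" },
  "List of text": { type: "array", items: { type: "string" } },
  "List of numbers": { type: "array", items: { type: "number" } },
};

function fieldsToSchema(fields) {
  const properties = {};
  for (const field of fields) {
    const base = FIELD_TYPE_MAP[field.type ?? "Text"]; // the type defaults to Text
    if (!base) throw new Error("Unsupported field type: " + field.type);
    properties[field.name] = field.description
      ? { ...base, description: field.description }
      : { ...base };
  }
  // Every field in the list is required in the result.
  return { type: "object", properties, required: fields.map((f) => f.name) };
}

console.log(JSON.stringify(fieldsToSchema([
  { name: "postingDecision", description: "approve, review, or reject" },
  { name: "reason" },
  { name: "lineCount", type: "Whole number" },
]), null, 2));
```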
Advanced: editing the raw JSON Schema
Switch to the JSON view to read or edit the underlying JSON Schema directly. This is useful for advanced cases the field list does not cover, such as nested objects. When a schema uses features the field list cannot represent, Studio keeps it in the JSON view.
Generating a schema from an example
If you already have an example of the result, choose Generate from JSON example, paste a representative JSON value, and select Generate schema. Studio infers the fields from the example, which then appear in the field list for you to adjust.
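Schema inference of this kind can be sketched in a few lines. The inferSchema function below is a hypothetical illustration of the general technique, not Studio's generator. It marks every key as required, takes list item types from the first element only, and cannot tell a whole-valued decimal such as 12.0 apart from an integer.

```javascript
// Hypothetical sketch: infer a JSON Schema from a representative JSON value.
function inferSchema(example) {
  if (example === null) return { type: "null" };
  if (Array.isArray(example)) {
    // Use the first element as the representative item, if there is one.
    return example.length > 0
      ? { type: "array", items: inferSchema(example[0]) }
      : { type: "array" };
  }
  switch (typeof example) {
    case "string":
      return { type: "string" };
    case "boolean":
      return { type: "boolean" };
    case "number":
      return { type: Number.isInteger(example) ? "integer" : "number" };
    case "object": {
      const properties = {};
      for (const [key, value] of Object.entries(example)) {
        properties[key] = inferSchema(value);
      }
      return { type: "object", properties, required: Object.keys(example) };
    }
    default:
      throw new Error("Value is not representable in JSON: " + typeof example);
  }
}

console.log(JSON.stringify(inferSchema({
  postingDecision: "review",
  reason: "Vendor match is uncertain",
  vendorId: "V-1",
}), null, 2));
```

Because inference only sees one example, it is worth reviewing the inferred types afterwards.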
Studio validates the schema when you save the step. If the schema is missing or is not valid JSON Schema, the step cannot be saved.
How Structured Output Behaves at Runtime
When structured output is enabled, Studio instructs the model to return JSON that matches the schema, then validates the agent's final output against it.
- If the output matches the schema, the step completes and the parsed result is available to later steps.
- If the output does not match the schema, the step fails and normal step or Action error handling applies.
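The pass or fail decision above can be sketched as follows. This is a deliberately minimal check for flat schemas. validateAgentOutput and matchesType are hypothetical helpers, and a complete JSON Schema validator covers far more of the specification than this does.

```javascript
// Hypothetical minimal check of a single value against a flat schema type.
function matchesType(value, schema) {
  switch (schema.type) {
    case "string": return typeof value === "string";
    case "number": return typeof value === "number" && Number.isFinite(value);
    case "integer": return Number.isInteger(value);
    case "boolean": return typeof value === "boolean";
    case "array":
      return Array.isArray(value) &&
        (!schema.items || value.every((item) => matchesType(item, schema.items)));
    default: return false; // nested objects and other keywords are out of scope here
  }
}

// Parse the agent's final text and check required fields and field types.
function validateAgentOutput(rawText, schema) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch (err) {
    return { ok: false, error: "Output is not valid JSON" };
  }
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    return { ok: false, error: "Output is not a JSON object" };
  }
  for (const name of schema.required ?? []) {
    if (!(name in parsed)) return { ok: false, error: "Missing required field: " + name };
  }
  for (const [name, fieldSchema] of Object.entries(schema.properties ?? {})) {
    if (name in parsed && !matchesType(parsed[name], fieldSchema)) {
      return { ok: false, error: "Wrong type for field: " + name };
    }
  }
  return { ok: true, structured: parsed };
}
```

A result with ok set to false corresponds to the failing branch, where normal step or Action error handling would take over.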
Referencing Structured Output in Later Steps
When structured output is enabled, the parsed result is also exposed as a real object under a structured field on the step output, in addition to the usual message text.
This means later steps can reference individual fields directly, with their original types preserved:
Decision: {{Review invoice.structured.postingDecision}}
Reason: {{steps.Review invoice.structured.reason}}
Vendor id: {{Review invoice.structured.vendorId}}
In a Code Step, read the same object through the steps bridge:
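// Read the parsed structured output of the "Review invoice" step.
const result = steps["Review invoice"].structured;
// Return only the fields that later steps need.
return { decision: result.postingDecision, vendorId: result.vendorId };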
Because structured is a parsed object rather than a JSON string, later steps do not need to parse it themselves. The autocomplete in the prompt and configuration editors suggests these field paths after the Action has been run at least once.
Debugging Agent Steps
Use Runs to inspect Agent Step behavior.
In run details, review:
- the resolved prompt,
- the input data sent to the step,
- tool calls and tool results,
- the step output,
- confidence details,
- any confidence-not-met or tool failure messages.
When behavior is not what you expected, first check whether the prompt received the values you intended. Most Agent Step issues come from missing inputs, overly broad input data, unclear prompt instructions, or a confidence threshold that is stricter than the prompt and data can support.
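One quick check when reviewing a resolved prompt is to look for placeholders that were left as literal text. The helper below is a hypothetical sketch for use outside Studio, for example on a prompt copied from run details. It simply scans for any remaining double-curly-brace tokens.

```javascript
// Hypothetical helper: list placeholders that remain unresolved in a prompt.
function findUnresolvedPlaceholders(resolvedPrompt) {
  // Match any {{...}} token that contains no nested braces.
  const matches = resolvedPrompt.match(/\{\{[^{}]*\}\}/g);
  return matches ?? [];
}

const leftovers = findUnresolvedPlaceholders(
  "Invoice id: 1001\nVendor name: {{inputs.dooapEventPayload.vendorName}}"
);
console.log(leftovers);
```

Any token it returns points to a reference that did not resolve and has no default.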
Best Practices
- Put task-specific instructions in prompts.
- Reference important values explicitly instead of relying only on appended input data.
- Keep inputs small with outputFilter where possible.
- Use tools for trusted facts and external operations.
- Set Maximum Iterations based on expected tool-use complexity.
- Use Required Confidence Level for automation boundaries and review routing.
- Enable Require structured output when later steps depend on specific fields, and reference them through the structured object.
- Inspect several representative runs before enabling high-impact automation.