Agent Response Error: "The response exceeded the maximum token limit" — How to resolve this?

Hi Everyone,

I’m working on an agentic automation use case where I’m sending a large unstructured dataset to an LLM-powered agent to extract structured data in JSON format.

However, I’m encountering this error:

The response exceeded the maximum token limit. Please increase the max tokens limit or reduce the desired output size via prompting.
ErrorCode: Agent.BaseError

  • How can I avoid hitting the token limit?
  • Is there a way to increase the max_tokens limit for the response?
  • If not, how should I split or chunk the input/output to avoid this error?
  • Are there prompt engineering best practices to reduce output size without losing key field values?

Thank you in advance.

Make sure you split the large data into multiple prompts. You can also summarize the larger data first and then pass the summary to the prompt.
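
A minimal Python sketch of that chunk-and-merge approach; `call_agent` is a hypothetical stand-in for however you invoke your agent, and the chunk size and prompt wording are assumptions to adapt:

```python
import json

def chunk_text(text: str, max_chars: int = 8000) -> list[str]:
    # Naive fixed-size chunking; in practice, split on record or paragraph
    # boundaries so that no field is cut in half.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def extract_all(raw_data: str, call_agent) -> list[dict]:
    results = []
    for chunk in chunk_text(raw_data):
        prompt = ("Extract records as a JSON array of objects. "
                  "Return only JSON, no explanations.\n\n" + chunk)
        response = call_agent(prompt)         # hypothetical LLM call
        results.extend(json.loads(response))  # merge the partial results
    return results
```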

No, the max_tokens limit is defined by the LLM itself. For example, GPT-4o mini has a context window of 128,000 tokens, and that window caps the input and output combined.
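
Since the limit is fixed, it helps to estimate the token count locally before sending anything. A minimal sketch using OpenAI's tiktoken library, assuming an OpenAI-family model (other models tokenize differently) and a hypothetical dataset.txt input file:

```python
import tiktoken

def count_tokens(text: str, model: str = "gpt-4o-mini") -> int:
    # Maps the model name to its tokenizer and counts tokens in the text.
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(text))

raw_data = open("dataset.txt").read()  # your large unstructured input
if count_tokens(raw_data) > 100_000:   # leave headroom under the 128k window
    print("Too large for a single call; chunk the input first.")
```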

Try using GenAI activities for this.


Hi @nisha.sonawane,

Please find my answers to your questions below:

  1. How can I avoid hitting the token limit?
    Answer: Each LLM accepts a specified context window, which limits how long an input you can pass. To avoid this error, break your input into smaller chunks. As of now, UiPath does not offer many LLM models.

  2. Is there a way to increase the max_tokens limit for the response?
    Answer: No, you can't increase it above the limit provided by the model.

  3. If not, how should I split or chunk the input/output to avoid this error?
    Answer: Break the input down into multiple chunks and design a multi-agent system to process them. You can make use of sequential agents or router agents.

  4. Are there prompt engineering best practices to reduce output size without losing key field values?
    Answer: You can make use of sequential agents, where the output of one agent acts as the input to the next. Or you can use a parallel agentic architecture, where each agent processes a smaller chunk of the input and all the outputs are finally fed to a single agent for the final response; a sketch of this pattern follows below.
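
A minimal Python sketch of the parallel pattern from answer 4; `call_agent` is again a hypothetical stand-in for your LLM invocation:

```python
import json
from concurrent.futures import ThreadPoolExecutor

def worker(chunk: str, call_agent) -> list[dict]:
    # "Map" step: one agent extracts records from a single chunk.
    prompt = "Extract records as a JSON array. Return only JSON.\n\n" + chunk
    return json.loads(call_agent(prompt))

def map_reduce_extract(chunks: list[str], call_agent) -> list[dict]:
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(lambda c: worker(c, call_agent), chunks))
    # "Reduce" step: merge in code, or feed the partial outputs to one
    # final agent for de-duplication and validation.
    return [record for part in partials for record in part]
```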

Mark as solution if it solves your queries. :slight_smile:

Br,
NikZ


Thank you @ashokkarale for your valuable and timely response; I will surely try this.

Thank you @Nikhil_Zambare1 for your valuable and timely response; I will surely try this.

@nisha.sonawane

I believe by now you have already understood that you cannot increase the limit but can only reduce your output size, so let me give you a few best practices:

Be precise with your instructions:

Instead of: Summarize this document.
Use: Summarize the document in 5 points under 100 words, focusing only on key decisions and dates.

Ask for structured output: Return only a JSON array of {name, value} pairs for the fields you need.

Use field-specific constraints:
Example: Only include fields: id, status, and amount. Exclude descriptions and metadata.

Use minimal formatting:
Request: Compact output, no extra whitespace or explanations.

Or: Always suppress headers, footers, and friendly closing lines. No summaries or engagement prompts.

These are a few to start with.
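
Putting several of these together, one possible extraction prompt (the field names id, status, and amount are placeholders for your own schema):

```python
# Placeholder field names; substitute your own schema.
PROMPT_TEMPLATE = """Extract data from the text below.
Return ONLY a compact JSON array of objects with fields: id, status, amount.
Exclude descriptions, metadata, headers, footers, and closing remarks.
No markdown, no explanations, no extra whitespace.

Text:
{document_text}
"""

prompt = PROMPT_TEMPLATE.format(document_text="...your chunk here...")
```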

Cheers