Generative extractor not working as expected

Loveness_Chinake · December 1, 2025, 10:36am

Hi all.
I am making use of the Extract Document Data activity in a workflow. I want to use it to extract certain values from a document. I have added one prompt for the field that I want to extract but the activity returns the entire text in the document and I am not sure how to fix it. I would really appreciate your help.

I have attached screenshots of the configuration in the activity.

ashokkarale · December 1, 2025, 10:38am

@Loveness_Chinake

Please share more details, such as the field you are trying to extract and the prompt you are using for it. Also show what is being returned, as seen in the Locals panel.

Ruhi_Sayyad · December 1, 2025, 10:41am

Hello. This happens because the Extract Document Data activity needs a very specific prompt; if the prompt is too general, the model returns the whole document text.

Try rewriting your prompt to be very direct, e.g.: “Extract only the value after the label ‘Invoice Number:’ and return nothing else.” Also enable Return as JSON and define the expected field name. If the label is always the same, use an anchor-based or regex-based extraction instead of a pure prompt.

When the prompt is precise and the expected output format is clear, the model will stop returning the full document. It would help if you shared the input you are providing and the output you get.
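For the regex-based route, here is a minimal sketch in Python, assuming the document text is already available as a plain string and the value follows the label on the same line. The function name `extract_after_label` and the sample text are hypothetical and only for illustration; inside a workflow you would apply the same pattern with whatever regex facility you have available.

```python
import re
from typing import Optional

# Hypothetical label; replace it with the label that appears in your document.
LABEL = "Invoice Number:"


def extract_after_label(text: str, label: str = LABEL) -> Optional[str]:
    """Return the first whitespace-delimited token after `label`, or None."""
    # re.escape keeps characters such as ':' or '.' in the label literal.
    pattern = re.escape(label) + r"\s*(\S+)"
    match = re.search(pattern, text)
    return match.group(1) if match else None


# Made-up sample text standing in for the document's OCR output.
sample = "ACME Ltd\nInvoice Number: INV-00123\nDate: 2025-12-01"
print(extract_after_label(sample))
```

This only works when the label text is stable; if the label wording varies between documents, a prompt-based extraction is likely the better fit.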

Alex17pat · December 1, 2025, 11:50am

I ran into this a lot during early testing. The key points are the ones already mentioned:

Extreme specificity: You have to treat the prompt like a command-line instruction, not a friendly request. The model needs to be told exactly what to extract and, crucially, exactly what NOT to return (“return nothing else”).
Return as JSON: Enforcing a specific, predictable output structure using the Return as JSON checkbox is the most reliable way to force the model to limit its scope to only the fields you defined.
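As a rough illustration of why a fixed JSON structure helps, the sketch below post-processes a model response and keeps only the fields that were asked for. It assumes the response arrives as a JSON string; the field name `invoice_number` and the helper `keep_expected_fields` are hypothetical, and the actual output shape of the activity may differ.

```python
import json

# Hypothetical field name; use the field names defined in your own activity.
EXPECTED_FIELDS = {"invoice_number"}


def keep_expected_fields(raw: str, expected=EXPECTED_FIELDS) -> dict:
    """Parse a JSON response and return only the expected fields.

    Raises ValueError if the response is not a JSON object or a field is
    missing. json.JSONDecodeError is a subclass of ValueError, so a response
    that is plain document text instead of JSON is rejected as well.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = set(expected) - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Drop anything the model returned beyond the requested fields.
    return {name: data[name] for name in sorted(expected)}
```

A check like this makes the failure visible right away: if the model falls back to returning the whole document as text, parsing fails instead of the full text silently flowing into later steps.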

If you can post the exact prompt you’re using and the output you’re getting, it would make debugging a lot faster for the community!

Loveness_Chinake · December 1, 2025, 12:24pm

Thank you for your response. I actually want to extract table data. The table looks like so:

Loveness_Chinake · December 1, 2025, 12:26pm

p.s: I can only send one picture at a time.
This is the prompt. But as you mentioned, I need to make it very very specific.

Loveness_Chinake · December 1, 2025, 12:26pm

And this is the result:

I will try out your suggestions and see what I get.

Anil_G · December 1, 2025, 1:34pm

@Loveness_Chinake

Did you happen to use a table-type model? It is best suited for tables.

cheers

Loveness_Chinake · December 2, 2025, 1:05pm

I am not working with IXP, though.
I am just using the Extract Document Data activity in a workflow.

Anil_G · December 2, 2025, 1:42pm

@Loveness_Chinake

What model are you using?

I ask because, for tables, normal models might not give the expected results.

cheers
