Extracting only tables from a scanned PDF

Hi, I have a scanned PDF that contains multiple pages of invoices, and I want to extract them into Excel.
I have tried using a taxonomy, but it does not detect the different invoice formats or the whole page.

Thanks for your help in advance!! :blush:

@Mohamed_Ameer1

Please try using the ML extractor. It can handle different formats as well.

You can also classify the documents and set up extraction separately for each document type.

Or you can retrain an ML model in AI Center and use the retrained model as needed.
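
To make the classify-then-extract idea above a bit more concrete, here is a minimal Python sketch of the routing logic only. It is not UiPath code: `classify`, `extract`, `process`, `to_csv` and the `FIELDS_BY_TYPE` table are hypothetical stand-ins for the classifier and extractor steps, and the keyword matching is a toy assumption in place of a real ML model.

```python
import csv
import io

# Hypothetical mapping: which fields to pull for each document type.
FIELDS_BY_TYPE = {
    "invoice": ["invoice no", "total"],
    "receipt": ["store", "total"],
}


def classify(page_text):
    """Toy keyword classifier (assumption); a real setup would use a trained model."""
    text = page_text.lower()
    if "invoice" in text:
        return "invoice"
    if "receipt" in text:
        return "receipt"
    return "unknown"


def extract(page_text, fields):
    """Pick 'Key: value' lines whose key is one of the wanted fields."""
    found = {}
    for line in page_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in fields:
            found[key.strip().lower()] = value.strip()
    return found


def process(pages):
    """Classify each page, then extract with the field set for that type."""
    rows = []
    for page in pages:
        doc_type = classify(page)
        fields = FIELDS_BY_TYPE.get(doc_type)
        if fields is None:
            continue  # unknown type: skipped here, would go to manual review
        row = {"type": doc_type}
        row.update(extract(page, fields))
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize rows as CSV text, which Excel can open directly."""
    columns = ["type", "invoice no", "store", "total"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


pages = [
    "INVOICE\nInvoice No: 123\nTotal: 50.00",
    "Receipt\nStore: ACME\nTotal: 9.99",
    "Cover letter",
]
rows = process(pages)
print(to_csv(rows))
```

The point of the sketch is just the shape of the flow: one classification step, then a per-type extraction setup, with everything collected into a single table at the end.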

Cheers

Thanks for the reply Anil,
Will try that!!

Can you help me with that using ML, for example with a sample workflow? That would be great!


Hi @Mohamed_Ameer1 ,

Since it is an invoice document, Studio already includes a pre-built Document Understanding template that you could refer to.

It should give you an example of how invoice, receipt and form data are processed with the Document Understanding public endpoints.

You would need to pass the API key from your cloud tenant and create the necessary assets and storage buckets in Orchestrator.
