Extracting table data with multiple words and multiple lines in each column

Hi all,

I am facing an issue extracting table data. A detailed description follows.

Issue: I am having trouble extracting data from a table whose cells contain multiple words spread over multiple lines. The size of a table cell changes as the number of words in the field increases. Because of this, the DU Form Extractor, which I tried first, is not the right approach, since the cell size is not fixed.

Next I tried the anchor base data scraping method, but the selector selects the whole PDF content instead of the specific data field.

Also, since the PDF table data is not structured, I have not used regex/string manipulation here.

I also cannot share the PDF, since it contains confidential customer data.

Please suggest a solution for this.

Example of the table, for understanding:

image

Thanks.

Hi @Yogita,

Can you please try the AI Center approach: create an ML model and deploy it in your workflow.

Please find the attachment for reference.

Please go through the series of videos. Let me know if you have any queries.

Happy Automation!!

Hi @suraj.setty,

The client has not purchased an AI Fabric license, so I need to find another way around this issue.

Thanks

Hi @Yogita

Have you tried using the “Intelligent Form Extractor” to train the template and extract the data?

If so, please try out the “Machine Learning Extractor” by passing the “Endpoint” and API key. A screenshot of the same is attached.

Refer to the link to find the endpoints, and try using the Invoices endpoint.

image

Once all the details are entered, click “Configure Extractor”, then click the “settings” icon as highlighted, and then click “Get Capabilities”. A screenshot of the same is attached.

Then you can map the fields defined in the taxonomy to the ML Extractor using the dropdown shown in the screenshot below.

Please let me know if this method works.

Hi @suraj.setty,

I have not tried the Intelligent Form Extractor; I will try that.

I have only one doubt: can we use the Machine Learning Extractor without an AI Fabric license?

Thanks

Yes, you can use it without the AI licenses if you have API keys available.

Hi @suraj.setty

Which API key do we need to use for the Machine Learning Extractor: the Document Understanding API key or the Computer Vision API key?

Also, is there any restriction on the number of documents we can process when we use the Machine Learning Extractor?

Thanks

Hi @Yogita

Please use the Document Understanding API key.

Yes, there is a restriction of around 2k pages per day.
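
If the volume is large, it may help to plan the runs so that no single day goes over the page limit. Below is a minimal sketch of that idea in Python, not anything provided by UiPath: the function name `plan_daily_batches`, the `(file name, page count)` input format, and the exact value of 2000 are all assumptions for illustration, so please check the real limit for your key.

```python
# Assumption: "around 2k" pages per day, taken from the reply above.
DAILY_PAGE_LIMIT = 2000


def plan_daily_batches(documents, daily_limit=DAILY_PAGE_LIMIT):
    """Group (name, pages) pairs, in order, into per-day batches.

    Each batch keeps its total page count at or below daily_limit.
    This is a hypothetical helper, not part of any UiPath package.
    """
    batches = []
    current = []
    used = 0
    for name, pages in documents:
        if pages > daily_limit:
            # A single document larger than the limit cannot fit in any day.
            raise ValueError(f"{name} has {pages} pages, above the daily limit")
        if current and used + pages > daily_limit:
            # Close the current day and start a new one.
            batches.append(current)
            current = []
            used = 0
        current.append(name)
        used += pages
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    docs = [("a.pdf", 1200), ("b.pdf", 900), ("c.pdf", 800)]
    for day, batch in enumerate(plan_daily_batches(docs), start=1):
        print(f"Day {day}: {batch}")
```

The documents are kept in their original order and a new day starts as soon as the next document would not fit, which is simple but not necessarily the tightest packing.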
