Extracting data from PDF Invoices from ma

You cannot customize the taxonomy – at least not with the community edition endpoint. The models/taxonomies used for the community endpoints are pre-determined by the UiPath team.

You should still be able to get the fields you mentioned, though. If you open the Taxonomy Manager, you’ll see that the invoice taxonomy already has some, if not all, of those fields.

If you want to customize the model used by the AI server, you can deploy a local machine learning server, though that is more complex to set up.

You can also take a totally different approach and digitize the entire document with OCR, then use regular expressions to parse out parts of the invoice. That will not involve the UiPath machine learning server.
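To give a rough idea of the regex route, here is a minimal sketch in Python (just for illustration; the same patterns could be adapted to whatever regex facility your workflow uses). The sample text, the field names and the patterns are all made up for this example. Real OCR output is messier, and you would need to tune the patterns to your own invoice layouts.

```python
import re

# Hypothetical OCR output; in practice this is the text produced by the OCR step.
ocr_text = """ACME Corp
Invoice Number: INV-2041
Invoice Date: 2020-03-15
Total Due: $1,234.56
"""

# Hypothetical patterns, one per field. Each has a single capture group
# holding the value we want. Real invoices will need layout-specific tuning.
PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:Number|No\.?|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "invoice_date": re.compile(
        r"Invoice\s*Date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
    ),
    "total": re.compile(
        r"Total(?:\s*Due)?\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def parse_invoice(text):
    """Return a dict of field name -> extracted string, or None if not found."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


if __name__ == "__main__":
    print(parse_invoice(ocr_text))
```

Returning None for a missing field, rather than raising an error, makes it easy to spot which invoices need a manual check or a better pattern.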

Another approach is to use an OCR engine in conjunction with custom extractors, but that’s something I can’t yet help with – I’m still trying to understand how this works myself haha.
