Hello all,
I want to extract data from a PDF. I am using Document Understanding, but I can't work out which type of machine learning (ML) extractor I should use.
Best regards
If it's a structured document, use the Form Extractor instead of the ML Extractor.
Hi Nora,
If you have to extract data from a fixed-format template (a structured document), you can train keyword-based or intelligent keyword-based extractors.
If the layout varies from file to file but the documents are still structured, you can use the same option as in the first scenario.
If the document is semi-structured, you could use ML extractors suited to the document type (receipts, utility bills, etc.).
Note: You could also use multiple extractors in parallel. For example, some values could be extracted by keyword-based extractors while others are extracted by ML extractors.
You can also set a minimum confidence as a percentage for each extractor. For example, if you set the confidence level to 85%, only values that the extractor identifies with at least 85% confidence will be taken.
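To make the confidence idea concrete, here is a minimal sketch in plain Python. It is not the Document Understanding API: the `Extraction` class, the `apply_threshold` and `merge_by_priority` helpers, and the sample values are all made up for illustration. It only shows how a minimum confidence filters each extractor's results and how two extractors can fill in for each other.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    """One extracted value (hypothetical structure, for illustration only)."""
    field: str
    value: str
    confidence: float  # 0.0 to 1.0
    extractor: str


def apply_threshold(results, min_confidence):
    """Keep only the results at or above the minimum confidence."""
    return [r for r in results if r.confidence >= min_confidence]


def merge_by_priority(*result_lists):
    """Keep one value per field; earlier lists win over later ones."""
    merged = {}
    for results in result_lists:
        for r in results:
            merged.setdefault(r.field, r)
    return merged


# Made-up sample output from two extractors run on the same document.
keyword_results = [
    Extraction("invoice_number", "INV-1001", 0.97, "keyword"),
    Extraction("total", "120.00", 0.60, "keyword"),
]
ml_results = [
    Extraction("total", "125.00", 0.91, "ml"),
    Extraction("vendor", "Acme", 0.80, "ml"),
]

# Apply an 85% minimum confidence to both, keyword extractor first.
accepted = merge_by_priority(
    apply_threshold(keyword_results, 0.85),
    apply_threshold(ml_results, 0.85),
)

for name, r in accepted.items():
    print(name, r.value, r.extractor)
```

With these sample values, the keyword extractor's low-confidence `total` is dropped and the ML extractor's value is used instead, while `vendor` is left empty because neither extractor reaches 85% for it.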
Hope it helps.
Kind Regards,
Sagar Rana