Hello, I have created two separate robots to extract data from PDF invoices. Each one extracts data from invoices in a different language, and each works fine on its own. But when I try to extract the data from all invoices at once in a single robot, I get this message.
I believe it might be because not all of the languages are taken into consideration.
Could you let us know what configuration was used in the separate workflows to handle the different languages? Or which method was used for the extraction?
Yes, as I mentioned before, I have created two separate workflows, one for English and one for Arabic. Both work fine on their own, but not together. All the regexes have been checked.
Could you check whether adding an If activity like the one below would help? (I could not test it, as I had no data.)
I believe the error happens because the regex used for the Arabic invoice doesn't produce any matches when an English invoice is passed in, and vice versa. So handling like that described above would probably be needed.
Egyptian Holand.pdf (61.0 KB) Invoice3.pdf (54.0 KB)
The first one is Arabic, the second is English. Thanks for your great support, I really appreciate it.
As shown in the screenshot provided earlier, I have added the If activity to check whether the regex expression associated with the document produces any matches. If it does, I add the matched value; if not, I assign an empty string.
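For readers without access to the screenshot, the same guard can be sketched outside UiPath. This is a minimal Python illustration of the idea, not the actual workflow: the function name `extract_or_empty`, both patterns, and the sample text are hypothetical and are not taken from the attached invoices.

```python
import re


def extract_or_empty(pattern: str, text: str) -> str:
    """Return the first captured value, or "" when the pattern does not match."""
    match = re.search(pattern, text)
    # Same role as the If activity: read the value only when a match exists,
    # otherwise fall back to an empty string instead of raising an error.
    return match.group(1) if match else ""


# Hypothetical patterns, one per invoice language (assumptions for illustration).
ENGLISH_INVOICE_NO = r"Invoice No[.:]?\s*(\d+)"
ARABIC_INVOICE_NO = r"رقم الفاتورة[.:]?\s*(\d+)"

english_text = "Invoice No: 10452  Date: 01/02/2023"

print(repr(extract_or_empty(ENGLISH_INVOICE_NO, english_text)))  # '10452'
print(repr(extract_or_empty(ARABIC_INVOICE_NO, english_text)))   # ''
```

Without the guard, reading the first match of the Arabic pattern on an English invoice is the step that fails, which is consistent with each workflow working alone but not the combined one.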
For the new field value to be extracted, I checked and found the regex below to be working:
Check the below updated workflow : Main (1).xaml (20.3 KB)
I would encourage you to check the screenshots and correct your own workflow accordingly, since the package versions used in my environment may sometimes cause conflicts in yours.
The error message you’re encountering when trying to extract data from invoices with different languages in a single robot is likely related to the language-specific settings or configurations within your robot. To resolve this issue and enable your robot to handle multiple languages, you can consider the following steps:
Language Detection: Make sure your data extraction process includes language detection for each invoice. This can help your robot determine the language of the invoice and apply the appropriate language-specific rules and configurations.
Multilingual OCR: If your invoices contain text in various languages, you should use Optical Character Recognition (OCR) software that supports multiple languages. OCR tools like Tesseract are capable of recognizing text in various languages. Ensure that your robot uses an OCR engine that’s equipped to handle the languages present in your invoices.
Language-Specific Rules: Define language-specific extraction rules and configurations. For example, create separate templates or rules for each language that your invoices might be in. These rules should account for variations in layout and language-specific data extraction patterns.
Conditional Processing: In your robot, implement conditional logic that checks the language of each invoice and applies the corresponding language-specific rule set. This allows your robot to adapt to different languages on the fly.
Testing and Validation: Test your robot with invoices in various languages to ensure it correctly identifies the language, applies the appropriate rules, and extracts the data accurately.
Error Handling: Implement error handling and logging to identify and address any issues that may arise during the data extraction process.
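The language detection, language-specific rules, and conditional processing steps above could be combined roughly as follows. This is a sketch under stated assumptions rather than a UiPath implementation: it treats any text that is not mostly Arabic script as English, and the names `detect_language`, `RULES`, and `extract_fields`, the 0.3 threshold, and the patterns are all hypothetical.

```python
import re

# Basic Arabic Unicode block; enough to tell Arabic text from Latin text.
ARABIC_CHAR = re.compile(r"[\u0600-\u06FF]")


def detect_language(text: str, threshold: float = 0.3) -> str:
    """Classify text as 'arabic', 'english', or 'unknown' by its share of Arabic letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return "unknown"
    arabic = sum(1 for ch in letters if ARABIC_CHAR.match(ch))
    # Assumption: anything that is not mostly Arabic is handled as English.
    return "arabic" if arabic / len(letters) >= threshold else "english"


# Hypothetical per-language rule sets: field name -> regex with one capture group.
RULES = {
    "english": {"invoice_no": r"Invoice No[.:]?\s*(\d+)"},
    "arabic": {"invoice_no": r"رقم الفاتورة[.:]?\s*(\d+)"},
}


def extract_fields(text: str) -> dict:
    """Pick the rule set for the detected language and extract each field safely."""
    language = detect_language(text)
    fields = {"language": language}
    for name, pattern in RULES.get(language, {}).items():
        match = re.search(pattern, text)
        # A field with no match becomes an empty string instead of an error.
        fields[name] = match.group(1) if match else ""
    return fields


print(extract_fields("Invoice No: 10452 Total 50"))
```

Only the rule set for the detected language is run, so an Arabic pattern is never applied to an English invoice. In a workflow, the equivalent would be a branch on the detected language before the regex activities.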
By following these steps, you should be able to create a more versatile and adaptable robot that can handle invoices with different languages effectively. If you encounter any specific issues or error messages, please provide more details, and I’d be happy to offer further assistance.