I have tested the ML extractor activity and below are some of the major issues that I faced:
Unable to extract “item” information, i.e. information present in tabular form, from the invoice.
If we process multiple invoices with the same template, it sometimes does not extract complete information from all of them. For example, out of 10 invoices with the same template, it extracts “due-date” from 9 invoices but not from the remaining one.
Please suggest a solution or fix for these issues, if there is one.
Data Extraction Scope: Request is unauthorized. Please make sure that a correct API Key was provided.
I am getting the API Key from Settings → Deployment → API Key (I am using the enterprise Orchestrator). I am not able to find the “Templateless Invoice Extraction” key under the enterprise Orchestrator.
The ML extractor is not working for some of the invoices. It failed to extract the invoice number, invoice date, due date, and item data from this invoice; it extracts only name and address related fields. What confidence score should I use for accurate results? I am currently using 40%. Please suggest a solution.
Sample Invoice 3.pdf (16.3 KB)
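To make the confidence question concrete, here is a minimal sketch of what a threshold does to a set of extracted fields. The field names, values, and scores are made up, and the dictionary layout is a hypothetical stand-in, not the actual UiPath extraction result schema. It only illustrates that fields scoring below the threshold are the ones dropped or sent for review, so lowering the threshold keeps more fields at the cost of more wrong values.

```python
from typing import Dict, List, Tuple

# Hypothetical extraction result: field name -> (value, confidence in [0, 1]).
# This is an illustration only, not the UiPath ExtractionResult format.
Extraction = Dict[str, Tuple[str, float]]


def split_by_confidence(
    fields: Extraction, threshold: float
) -> Tuple[Dict[str, str], List[str]]:
    """Return (accepted values, names of fields below the threshold)."""
    accepted = {name: value for name, (value, conf) in fields.items() if conf >= threshold}
    needs_review = sorted(name for name, (_, conf) in fields.items() if conf < threshold)
    return accepted, needs_review


# Made-up sample scores for one invoice.
sample: Extraction = {
    "vendor-name": ("Acme Ltd", 0.93),
    "invoice-no": ("INV-001", 0.38),
    "due-date": ("2020-01-31", 0.55),
}

accepted, needs_review = split_by_confidence(sample, threshold=0.40)
print(accepted)
print(needs_review)
```

Running the same split at a few thresholds over a batch of known invoices is one way to see which value suits a given set of documents.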
Hi Team,
I am using the ML extractor to extract data from invoice images, but it only extracts the predefined fields. I also need to extract other fields such as GSTIN, Job Card No, Job Card Date, etc. I am also unable to extract table data: every invoice has different fields, and only the predefined fields are extracted. We are trying to extract data without the Present Validation Station, for an unattended robot. Please help.
@atul.trikha You have to use the API key provided with your cloud Community version, specifically the one for Templateless Invoice Extraction: https://demo.uipath.com/[Your User ID]/portal_/licensing
You should have an API key for each of Computer Vision, Templateless Invoice Extraction, and Templateless Receipt Extraction.
This seems like a great resource, but I am unsure where to go from here. Any help would be greatly appreciated.
RemoteException wrapping UiPath.MachineLearningExtractor.Activities.Exceptions.MLRequestException: Invalid server response. ---> RemoteException wrapping System.Net.Http.HttpRequestException: Response status code does not indicate success: 400 (BAD REQUEST).
at System.Net.Http.HttpResponseMessage.EnsureSuccessStatusCode()
at UiPath.MachineLearningExtractor.Activities.Services.MLRequester.d__3.MoveNext()
--- End of inner exception stack trace ---
at UiPath.MachineLearningExtractor.Activities.DataExtraction.MachineLearningExtractor.EndExecute(AsyncCodeActivityContext context, IAsyncResult result)
at System.Activities.AsyncCodeActivity.System.Activities.IAsyncCodeActivity.FinishExecution(AsyncCodeActivityContext context, IAsyncResult result)
at System.Activities.AsyncCodeActivity.CompleteAsyncCodeActivityData.CompleteAsyncCodeActivityWorkItem.Execute(ActivityExecutor executor, BookmarkManager bookmarkManager
I have tested the given sample program with attached sample Invoice. 01.pdf (145.2 KB)
It has Romanian and GBP totals. By default, it selects the Romanian currency value; when I tried to choose the GBP value instead, the JSON output still picked the Romanian value.
Same issue for Vendor Address as well. Updated values are not reflected in the JSON. Could you please validate?
Can you please check that the machine learning extractor is correctly configured? If you can’t get this working, can you please attach a workflow and a sample file to test against?
I am assuming that you are using the sample project. If that is the case, you need to make some small changes to capture the ValidatedOutput from the Present Validation Station Activity.
The sample project only outputs the ML Extracted Output - you need to add a variable in the Output of the Present Validation Station activity and then use a Write File activity and serialize that variable. That will contain the human validated results.
Data Extraction Scope: The results of extractor ‘MachineLearningExtractor’ contain duplicates ‘date’. The extractor should return a single ResultsDataPoint entry for each requested field.
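As a rough illustration of what this error message describes, the sketch below looks for field names that occur more than once in a list of extraction results. The list-of-dicts layout and the sample values are hypothetical and are not the UiPath ResultsDataPoint type; the point is only that a field such as ‘date’ returned twice is what triggers the complaint.

```python
from collections import Counter
from typing import Dict, List


def find_duplicate_fields(results: List[Dict[str, str]]) -> List[str]:
    """Return the names of fields that appear more than once, sorted."""
    counts = Counter(entry["field"] for entry in results)
    return sorted(name for name, count in counts.items() if count > 1)


# Made-up results in a hypothetical layout: one dict per returned data point.
results = [
    {"field": "date", "value": "2019-03-01"},
    {"field": "total", "value": "120.00"},
    {"field": "date", "value": "2019-03-15"},
]

print(find_duplicate_fields(results))  # prints ['date']
```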
@alexcabuz I am getting this error in the Digitize Document activity when using a scanned PDF.
RemoteException wrapping System.Exception: An unexpected error has occurred ---> RemoteException wrapping UiPath.SmartData.Digitization.Tokenization.TokenizationException: all-scanned-invoices-as-of-march.pdf ---> RemoteException wrapping BitMiracle.Docotic.Pdf.UnexpectedStructureException: Unexpected PDF structure. This exception usually indicates that PDF document is malformed but also may indicate a bug in Docotic.Pdf library. Please send us the file for review.
at .()
at .()
at .(IPdfStreamProvider , String , Boolean )
at .(Stream , String )
at UiPath.SmartData.Digitization.PDF.PdfTokenizer.TokenizeDocument(String correlationId, Stream content, CancellationToken token)
at UiPath.SmartData.Digitization.PDF.PdfTokenizer.Tokenize(String correlationId, Content content, CancellationToken token)
--- End of inner exception stack trace ---
at UiPath.SmartData.Digitization.PDF.PdfTokenizer.Tokenize(String correlationId, Content content, CancellationToken token)
at UiPath.SmartData.Digitization.ContentTokenizer.GetTokenPages(Content content, CancellationToken token)
at UiPath.SmartData.Digitization.DocumentDigitizer…ctor(Content content, IOcrEngine engine, Int32 degreeOfParallelism, CancellationToken token)
at UiPath.IntelligentOCR.Activities.Digitization.DigitizeDocument.d__31.MoveNext()
--- End of inner exception stack trace ---
at UiPath.Shared.Activities.AsyncTaskNativeImplementation.BookmarkResumptionCallback(NativeActivityContext context, Object value)
at UiPath.Shared.Activities.AsyncTaskNativeActivity.BookmarkResumptionCallback(NativeActivityContext context, Bookmark bookmark, Object value)
at System.Activities.Runtime.BookmarkCallbackWrapper.Invoke(NativeActivityContext context, Bookmark bookmark, Object value)
at System.Activities.Runtime.BookmarkWorkItem.Execute(ActivityExecutor executor, BookmarkManager bookmarkManager)
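Since the exception says the PDF may be malformed, a quick sanity check on the file before digitizing can help narrow things down. The sketch below is a simple heuristic in plain Python, not part of UiPath or Docotic.Pdf: it only looks for the `%PDF-` header near the start of the file and the `%%EOF` marker near the end. A file that fails it is likely truncated or not a PDF at all; a file that passes can still be rejected by the parser. The function names and the 1024-byte window are choices made for this example.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Cheap structural check: PDF header near the start, EOF marker near the end."""
    has_header = b"%PDF-" in data[:1024]
    has_eof_marker = b"%%EOF" in data[-1024:]
    return has_header and has_eof_marker


def check_pdf_file(path: str) -> bool:
    """Read a file from disk and apply the same heuristic."""
    with open(path, "rb") as handle:
        return looks_like_pdf(handle.read())


# In-memory examples, so the sketch runs without any file on disk.
print(looks_like_pdf(b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n%%EOF\n"))
print(looks_like_pdf(b"%PDF-1.4\n1 0 obj\n<< >>"))
```

If the file fails this check, re-scanning or re-exporting the document is probably a better next step than debugging the workflow.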
@alexcabuz @loginerror If I keep running the model on the same PDF and keep correcting one single field, will it learn?
Or will it keep making the same mistakes?
@lakshman I don’t think so. I am running it on one PDF only; I have executed it five times and didn’t see any change in the result. Is there any other way to make it learn?
Hello Everyone,
I am using the Machine Learning Extractor to read one of my sample invoices, but the extractor detects only some specific fields, not all the fields on the invoice. For example, “Project No”, “Project Title”, “Tax Id No”, “Terms”, and “Payment Instructions” are not detected or extracted. Can anyone please help, or am I missing something? Any help would be really appreciated. Thanks! @loginerror