Classify Document Based on Its Name

Hello,

I’m trying to classify multiple documents based on their names. For example, I have two PDFs in a folder, one named ‘Invoice_2571’ and one named ‘Ticket_2571’.
I want to somehow manually assign the ‘ClassificationResult’ output variable of ‘Classify Document Scope’ based on the names of the PDFs and use it further in the extraction scope.
I need to do this because in this process I can encounter invoices that look exactly the same but from which I need to extract different data (basically load a different document type taxonomy). Hope that makes sense…
Does anyone know a solution for this?

Thank you!

Hi @Chiru_Vlad.

You can use a simple If condition on the file name, where docType is a String variable:

If strFileName.StartsWith("Invoice")

Then docType = taxonomy.DocumentTypes.Where(Function(x) x.Name.Equals("Invoice")).Select(Function(y) y.DocumentTypeId).FirstOrDefault

Else If strFileName.StartsWith("Ticket")

Then docType = taxonomy.DocumentTypes.Where(Function(x) x.Name.Equals("Ticket")).Select(Function(y) y.DocumentTypeId).FirstOrDefault

In the extraction scope you can pass the document type ID as a String instead of the ClassificationResult.

Try the above solution and let me know if you face any issues.

Thank you Venkata,

This is a good solution, but if we have multiple (5+) document types, I think the code will be hard to read with that many If conditions in it.
I also found a way to manually assign the document type between ‘Classify Document Scope’ and ‘Data Extraction Scope’ with this line:

classifResult(0).DocumentTypeId = classifResult(0).DocumentTypeId.Replace(Split(classifResult(0).DocumentTypeId, ".").Last, "Ticket")

Here classifResult is the ClassificationResults array output of ‘Classify Document Scope’. We use Split because the DocumentTypeId has the form ‘[Group].[Category].[DocumentType]’, so the last segment is the document type.

We put this assignment in a Switch activity.
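One caveat with the Replace-based line above: String.Replace swaps every occurrence of the last segment, so an ID whose group or category contains the same text as the document type would be changed there too. Below is a minimal sketch of an alternative that rewrites only the last segment. It uses plain strings only; the function name WithDocumentType and the sample IDs are made up for illustration and are not part of UiPath.

```vb
Imports System

Module DocumentTypeIdDemo
    ' Returns documentTypeId with only its last dot-separated segment
    ' replaced by newType. Hypothetical helper, not a UiPath API.
    Function WithDocumentType(documentTypeId As String, newType As String) As String
        Dim lastDot As Integer = documentTypeId.LastIndexOf("."c)
        ' If there is no dot, lastDot is -1, Substring(0, 0) is empty,
        ' and the whole ID is replaced by newType.
        Return documentTypeId.Substring(0, lastDot + 1) & newType
    End Function

    Sub Main()
        ' Typical case: prints Finance.Documents.Ticket
        Console.WriteLine(WithDocumentType("Finance.Documents.Invoice", "Ticket"))

        ' Category and type share a name: prints Invoice.Invoice.Ticket
        Console.WriteLine(WithDocumentType("Invoice.Invoice.Invoice", "Ticket"))

        ' The Replace-based approach on the same ID: prints Ticket.Ticket.Ticket
        Console.WriteLine("Invoice.Invoice.Invoice".Replace("Invoice", "Ticket"))
    End Sub
End Module
```

In an Assign activity the same idea could probably be written as one expression: classifResult(0).DocumentTypeId.Substring(0, classifResult(0).DocumentTypeId.LastIndexOf("."c) + 1) & "Ticket"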

You can keep a mapping data table with two columns: the first column, "filename", holds the starting keyword of the file name (for example, Invoice), and the second column, "documentname", holds the matching document type name. Based on the file name, look up the document name dynamically with a LINQ query, then use that name in the query below to get the document type ID.

docType (String) = taxonomy.DocumentTypes.Where(Function(x) x.Name.Equals(documentname)).Select(Function(y) y.DocumentTypeId).FirstOrDefault

LINQ query: documentname = dt.AsEnumerable().Where(Function(x) filename.StartsWith(CStr(x("filename")))).Select(Function(y) CStr(y("documentname"))).FirstOrDefault
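To show how the two queries chain together, here is a small self-contained sketch. It is only an illustration under assumptions: DocTypeStub is a hypothetical stand-in for the taxonomy's document type objects (modelling just Name and DocumentTypeId), and the mapping rows and IDs are sample values. It also checks for the case where no mapping row matches, since FirstOrDefault then returns Nothing.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

' Hypothetical stand-in for a taxonomy document type; only the two
' properties used in this thread are modelled.
Class DocTypeStub
    Public Property Name As String
    Public Property DocumentTypeId As String
End Class

Module MappingLookupDemo
    ' Returns the "documentname" of the first mapping row whose "filename"
    ' keyword is a prefix of fileName, or Nothing when no row matches.
    Function FindDocumentName(mapping As DataTable, fileName As String) As String
        Return mapping.AsEnumerable().
            Where(Function(row) fileName.StartsWith(CStr(row("filename")))).
            Select(Function(row) CStr(row("documentname"))).
            FirstOrDefault()
    End Function

    ' Returns the DocumentTypeId whose Name equals documentName, or Nothing.
    Function FindDocumentTypeId(docTypes As IEnumerable(Of DocTypeStub), documentName As String) As String
        Return docTypes.
            Where(Function(t) t.Name.Equals(documentName)).
            Select(Function(t) t.DocumentTypeId).
            FirstOrDefault()
    End Function

    Sub Main()
        ' Sample mapping table: file name keyword -> document type name.
        Dim mapping As New DataTable()
        mapping.Columns.Add("filename", GetType(String))
        mapping.Columns.Add("documentname", GetType(String))
        mapping.Rows.Add("Invoice", "Invoice")
        mapping.Rows.Add("Ticket", "Ticket")

        ' Sample document types with made-up IDs.
        Dim docTypes As New List(Of DocTypeStub) From {
            New DocTypeStub With {.Name = "Invoice", .DocumentTypeId = "Group.Category.Invoice"},
            New DocTypeStub With {.Name = "Ticket", .DocumentTypeId = "Group.Category.Ticket"}
        }

        Dim documentName As String = FindDocumentName(mapping, "Ticket_2571.pdf")
        If documentName Is Nothing Then
            Console.WriteLine("No mapping row matches this file name")
        Else
            ' Prints Group.Category.Ticket
            Console.WriteLine(FindDocumentTypeId(docTypes, documentName))
        End If
    End Sub
End Module
```

With this layout, adding a sixth or seventh document type means adding a row to the mapping table rather than another If branch. Note that StartsWith is case-sensitive here, so the keywords in the table need to match the file names exactly.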
