Do files with different extensions need separate document types in the Taxonomy?

Hello All,

I have 20 input files. All of them are purchase orders, but they come in different file types: some are JPG files, some are scanned PDFs, and some are image PDFs.

Do I need to classify them into three types in the taxonomy? The files are mostly semi-structured.

I also want to confirm whether taxonomy classification is based on content type or on file extension. I am quite confused.

When I tried to classify a few of the files using the Intelligent Keyword Classifier, I got the error below.

Hi @Anusuya_R

If all of your documents are purchase orders, classification can be skipped.
For extraction, use the ML Extractor.

Here you can get the endpoints

In the taxonomy you only have to create one document type, for example:
Group → Semi-Structured
Category → Finance (what you put here is up to you)
Document type → Purchase Orders, and then you add the fields that you want to extract
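
To make the idea concrete, here is a rough Python sketch of that single taxonomy entry. It is only an illustration of the structure, not the format UiPath actually stores the taxonomy in. The `DocumentType` class, the field names, and the file names are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocumentType:
    """Illustrative stand-in for one taxonomy entry (not UiPath's own schema)."""
    group: str
    category: str
    name: str
    fields: List[str] = field(default_factory=list)


# One document type covers every purchase order, whatever the file format.
purchase_order = DocumentType(
    group="Semi-Structured",
    category="Finance",
    name="Purchase Orders",
    fields=["PO Number", "Vendor Name", "Total"],  # hypothetical field names
)

# Hypothetical input files: the extension differs, the document type does not.
input_files = ["po_001.jpg", "po_002.pdf", "po_003.pdf"]
assigned = {path: purchase_order.name for path in input_files}
```

The point is that the taxonomy describes what the document is, so a JPG and a scanned PDF of a purchase order both end up under the same document type.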

As for the file extension:
These are the file types supported in DU, and you don't have to specify the file type anywhere:
.png, .gif, .jpe, .jpg, .jpeg, .tiff, .tif, .bmp, and .pdf.
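
If you want to filter your input folder before sending files to DU, a small check like the one below could work. This is just a sketch in plain Python, outside of UiPath. The `is_supported` helper is my own name, and the extension set is simply copied from the list above.

```python
from pathlib import Path

# Extensions copied from the supported list above.
SUPPORTED_EXTENSIONS = {
    ".png", ".gif", ".jpe", ".jpg", ".jpeg", ".tiff", ".tif", ".bmp", ".pdf",
}


def is_supported(file_name: str) -> bool:
    """Return True if the file's extension is in the supported list (case-insensitive)."""
    return Path(file_name).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported("po_001.JPG")` is true because the comparison ignores case, while a `.docx` file would be rejected.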

Hope this helps