I need to extract data from a scanned invoice. What is the best approach: Read PDF With OCR, or a taxonomy?
Use a taxonomy if the invoice follows a standard template with fields like the ones below.
Invoice model available field names:
- “name”
- “vendor-addr”
- “billing-name”
- “billing-addr”
- “shipping-addr”
- “invoice-no”
- “payment-terms”
- “due-date”
- “po-no”
- “date”
- “net-amount”
- “tax”
- “total”
- “currency”
- “items”
- “line-no”
- “description”
- “item-po-no”
- “quantity”
- “unit-price”
- “line-amount”
Receipt model available field names:
- “name”
- “total”
- “vendor-addr”
- “date”
- “phone”
- “currency”
- “expense-type”
- “items”
- “description”
- “line-amount”
- “unit-price”
- “quantity”
Otherwise, use Read PDF With OCR.
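To make the rule of thumb concrete, here is a minimal sketch in Python. The field names are copied from the two lists above; the `choose_approach` helper and its return strings are hypothetical and are not part of any UiPath API. It only illustrates the decision: if every field you need is covered by a model, go with the taxonomy, otherwise fall back to OCR.

```python
# Field names copied from the invoice and receipt lists in this post.
INVOICE_FIELDS = {
    "name", "vendor-addr", "billing-name", "billing-addr", "shipping-addr",
    "invoice-no", "payment-terms", "due-date", "po-no", "date", "net-amount",
    "tax", "total", "currency", "items", "line-no", "description",
    "item-po-no", "quantity", "unit-price", "line-amount",
}
RECEIPT_FIELDS = {
    "name", "total", "vendor-addr", "date", "phone", "currency",
    "expense-type", "items", "description", "line-amount", "unit-price",
    "quantity",
}


def choose_approach(required_fields):
    """Hypothetical helper: pick an approach from the fields you need."""
    required = set(required_fields)
    # Every required field is covered by the invoice model.
    if required <= INVOICE_FIELDS:
        return "taxonomy (invoice model)"
    # Every required field is covered by the receipt model.
    if required <= RECEIPT_FIELDS:
        return "taxonomy (receipt model)"
    # At least one field is not covered by either model.
    return "Read PDF With OCR"


print(choose_approach(["invoice-no", "due-date", "total"]))
print(choose_approach(["bank-account"]))
```

The second call asks for a field that neither list contains, so the sketch falls back to OCR.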
Regards,
Karthik Byggari
Hi @Ricardo_Franco,
Read PDF With OCR will be the easiest and simplest way to extract data from a scanned invoice.
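Keep in mind that OCR gives you back plain text, so you still have to pick the values out yourself. A rough sketch of that step in Python follows; the `parse_invoice_text` helper, the regex patterns, and the sample text are all assumptions made for illustration, and real invoices will need patterns tuned to their layout.

```python
import re


def parse_invoice_text(text):
    """Hypothetical helper: pull a few fields out of raw OCR text."""
    # Illustrative patterns only; adjust them to the actual invoice layout.
    patterns = {
        "invoice-no": r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*(\S+)",
        "total": r"\bTotal\s*[:\-]?\s*\$?\s*([\d,]+\.\d{2})",
    }
    result = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        # Store None when the field is not found in the text.
        result[field] = match.group(1) if match else None
    return result


# Made-up sample standing in for the OCR output of a scanned invoice.
sample = "ACME Corp\nInvoice No: INV-1042\nSubtotal 90.00\nTotal: 100.00"
print(parse_invoice_text(sample))
```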
Hi @Santhosh_S,
Use the OmniPage OCR engine. In my experience it is the best one for extracting all the data.
Regards