Invoice details

I need to extract data from a scanned invoice. What is the best approach: Read PDF With OCR, or a taxonomy?

Use a taxonomy if the invoice follows a standard template whose fields match the ones listed below.
Invoice model available field names:

  • “name”
  • “vendor-addr”
  • “billing-name”
  • “billing-addr”
  • “shipping-addr”
  • “invoice-no”
  • “payment-terms”
  • “due-date”
  • “po-no”
  • “date”
  • “net-amount”
  • “tax”
  • “total”
  • “currency”
  • “items”
    • “line-no”
    • “description”
    • “item-po-no”
    • “quantity”
    • “unit-price”
    • “line-amount”
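
If the extracted values are post-processed outside the workflow, the invoice fields above could be held in a typed record. This is only a minimal Python sketch: the class names `Invoice` and `LineItem`, the helper `invoice_from_dict`, the choice to keep every value as an optional string, and the shape of the input dictionary are all assumptions made for illustration, not something the invoice model is confirmed to return.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class LineItem:
    # One entry of the "items" list; every field is optional because
    # a scanned invoice may not contain all of them.
    line_no: Optional[str] = None
    description: Optional[str] = None
    item_po_no: Optional[str] = None
    quantity: Optional[str] = None
    unit_price: Optional[str] = None
    line_amount: Optional[str] = None


@dataclass
class Invoice:
    # Top-level invoice fields, named after the list above with
    # hyphens replaced by underscores.
    name: Optional[str] = None
    vendor_addr: Optional[str] = None
    billing_name: Optional[str] = None
    billing_addr: Optional[str] = None
    shipping_addr: Optional[str] = None
    invoice_no: Optional[str] = None
    payment_terms: Optional[str] = None
    due_date: Optional[str] = None
    po_no: Optional[str] = None
    date: Optional[str] = None
    net_amount: Optional[str] = None
    tax: Optional[str] = None
    total: Optional[str] = None
    currency: Optional[str] = None
    items: List[LineItem] = field(default_factory=list)


def _to_kwargs(raw: Dict[str, Any]) -> Dict[str, Any]:
    # Convert hyphenated field names ("vendor-addr") to identifiers ("vendor_addr").
    return {key.replace("-", "_"): value for key, value in raw.items()}


def invoice_from_dict(raw: Dict[str, Any]) -> Invoice:
    # Hypothetical input: a plain dict keyed by the field names listed above.
    kwargs = _to_kwargs(raw)
    items = [LineItem(**_to_kwargs(item)) for item in kwargs.pop("items", [])]
    return Invoice(items=items, **kwargs)
```

For example, `invoice_from_dict({"invoice-no": "INV-1", "items": [{"line-no": "1"}]})` yields an `Invoice` whose `invoice_no` is set and whose other fields stay `None`.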

Receipt model available field names:

  • “name”
  • “total”
  • “vendor-addr”
  • “date”
  • “phone”
  • “currency”
  • “expense-type”
  • “items”
    • “description”
    • “line-amount”
    • “unit-price”
    • “quantity”

Otherwise, use Read PDF With OCR.
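
The rule of thumb above can be sketched as a simple check: if every field you need appears in the model's field list, the taxonomy route is a candidate; if anything is missing, fall back to OCR. The Python below is only an illustration. The function name `choose_method`, the return labels, and the example field `"iban"` are hypothetical, and only the top-level field names from the two lists are included.

```python
from typing import FrozenSet, Iterable

# Top-level field names copied from the invoice list above.
INVOICE_FIELDS: FrozenSet[str] = frozenset({
    "name", "vendor-addr", "billing-name", "billing-addr", "shipping-addr",
    "invoice-no", "payment-terms", "due-date", "po-no", "date",
    "net-amount", "tax", "total", "currency", "items",
})

# Top-level field names copied from the receipt list above.
RECEIPT_FIELDS: FrozenSet[str] = frozenset({
    "name", "total", "vendor-addr", "date", "phone",
    "currency", "expense-type", "items",
})


def choose_method(required_fields: Iterable[str], available: FrozenSet[str]) -> str:
    # Any required field the model does not offer means the template
    # does not fit, so the plain OCR route is suggested instead.
    missing = set(required_fields) - available
    return "taxonomy" if not missing else "read-pdf-with-ocr"


# Usage: both fields exist in the invoice list, so the taxonomy route fits.
print(choose_method(["invoice-no", "total"], INVOICE_FIELDS))
# Usage: "iban" (a made-up requirement) is not in the list, so OCR is suggested.
print(choose_method(["invoice-no", "iban"], INVOICE_FIELDS))
```

This only compares field names; it says nothing about how well either approach reads a particular scan, so treat the result as a starting point.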

Regards,
Karthik Byggari

Hi @Ricardo_Franco,
Read PDF With OCR will be the easiest and simplest way to extract data from a scanned invoice. :sweat_smile:

Hi @Santhosh_S,
Use the OmniPage OCR engine; it is the best option for extracting all the data.
Regards
