Hi,
I have extracted data from a PDF to text and am now trying to get that data into Excel using VB code with regex only.
I am attaching the extracted txt file and the expected output file:
8-103406 INVOICE_OmniPage_.txt (1.2 KB) Book1.xlsx (9.3 KB)
@indrajit.shah - did you try Document Understanding to extract the details from your invoice?
Please share the PDF as well, or set the preserve-format option to True while reading the PDF and share the updated text file.
Hi @prasath17, DU is out of scope; I have to achieve this with regex only.
I have attached the text extracted from the PDF.
Thank you in advance.
Hello @indrajit.shah ,
So I think you need only the table from the middle. You know my video :
In your case, just pass your data from the regex step directly into the Invoke Code activity from my structure (as an input argument); the rest remains the same.
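' strin (input) holds the extracted invoice text; strout (output) receives the pipe-delimited table.
Dim strtmp As String
' Take the text between the "TOTAL(USD)" column header and the last "Total" line.
Dim tableStart As Integer = strin.IndexOf("TOTAL(USD)") + 11
strtmp = strin.Substring(tableStart, strin.LastIndexOf("Total") - tableStart).Trim
' Re-attach the "INPP" token so it is not split off as its own column.
strtmp = strtmp.Replace(" INPP", "INPP")
' Every remaining space becomes a column separator.
strtmp = strtmp.Replace(" ", "|")
strout = "col1|col2|col3|col4|col5|col6|col7|col8|col9" & Environment.NewLine & strtmp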
Thanks,
Cristian Negulescu
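For anyone replicating this in plain VB outside UiPath, the pipe-delimited string still has to become a table before it can be written to Excel. The following is only a minimal sketch of that step, not the code from the video: the module name `PipeTextToTable` and the function `ToDataTable` are hypothetical, and it assumes the first line holds the column names and that no cell value contains a "|" character.

```vb
Imports System
Imports System.Data
Imports Microsoft.VisualBasic

Module PipeTextToTable
    ' Hypothetical helper: builds a DataTable from text whose first line holds
    ' "|"-separated column names and whose remaining lines hold "|"-separated cells.
    Function ToDataTable(strout As String) As DataTable
        Dim table As New DataTable()
        ' Split on CR and LF so both Windows and Unix line endings work.
        Dim lines As String() = strout.Split(
            New Char() {ControlChars.Cr, ControlChars.Lf},
            StringSplitOptions.RemoveEmptyEntries)
        If lines.Length = 0 Then Return table

        For Each name As String In lines(0).Split("|"c)
            table.Columns.Add(name.Trim())
        Next

        For i As Integer = 1 To lines.Length - 1
            Dim cells As String() = lines(i).Split("|"c)
            Dim row As DataRow = table.NewRow()
            ' Copy only as many cells as there are columns; extra cells are ignored.
            For c As Integer = 0 To Math.Min(cells.Length, table.Columns.Count) - 1
                row(c) = cells(c).Trim()
            Next
            table.Rows.Add(row)
        Next
        Return table
    End Function

    Sub Main()
        ' Placeholder values, not taken from the invoice.
        Dim demo As DataTable = ToDataTable("col1|col2" & Environment.NewLine & "a|b")
        Console.WriteLine(demo.Rows.Count & " row(s), " & demo.Columns.Count & " column(s)")
    End Sub
End Module
```

Rows with fewer cells than columns are left with empty trailing cells, which makes a mis-split line easy to spot once the table is in Excel.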
@indrajit.shah - okay, thanks… I see we would have to write a lot of regex patterns to get the output you are looking for…
For example, the regex pattern below covers 3 of your values, but you would still have to filter out the others…
Yes, I have to write loads of regex, and that's why I need help, buddy.
Thank you @Cristian_Negulescu, I am trying to write the whole code in VB first, just to replicate it, and will then do it in the UiPath tool.
Can you help me with the extraction below?
INVOICE NO. / INVOICE DATE 8000103406 / 20.01.2021
LC NUMBER /CONT# 2700016007
FOB VALUE (USD) : 1,532.10
ADJUSTMENT - I (ADD FREIGHT + INS 0.00
ADJUSTIMER -2 (ADJ AGAINST CREDIT 0.00
TOTAL INVOICE VALUE (USD) FOB: 1,532.10
@indrajit.shah this is not a table from my point of view so you will use just substring and Split like this:
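' strtmp is declared once; a second Dim in the same scope would not compile.
Dim strtmp As String
'FOB VALUE (USD): the text between "(USD) :" and the last "ADJUSTMENT" label
Dim fobStart As Integer = strin.IndexOf("(USD) :") + 7
strtmp = strin.Substring(fobStart, strin.LastIndexOf("ADJUSTMENT") - fobStart - 1).Trim
'INVOICE NO. / INVOICE DATE: the text between "INVOICE DATE" and the last "LC"
Dim invStart As Integer = strin.IndexOf("INVOICE DATE") + 12
strtmp = strin.Substring(invStart, strin.LastIndexOf("LC") - invStart - 1).Trim
' The value looks like "8000103406 / 20.01.2021": number before the slash, date after it.
innumber = strtmp.Split("/"c)(0).Trim
inDate = strtmp.Split("/"c)(1).Trim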
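Since the original question asked for regex, the same header fields can also be pulled out with `Regex.Match` and one capture group per field. This is only a sketch under assumptions: the patterns were written against the six lines quoted in this thread, so the real text file may need looser whitespace or label handling, and the helper `FirstGroup` and the module name `InvoiceHeaderRegex` are hypothetical. The two adjustment lines are left out because their labels are inconsistent in the extracted text.

```vb
Imports System
Imports System.Text.RegularExpressions

Module InvoiceHeaderRegex
    ' Hypothetical helper: returns capture group 1 of the first match, trimmed,
    ' or an empty string when the pattern does not match.
    Function FirstGroup(input As String, pattern As String) As String
        Dim m As Match = Regex.Match(input, pattern, RegexOptions.IgnoreCase)
        If m.Success Then Return m.Groups(1).Value.Trim()
        Return String.Empty
    End Function

    Sub Main()
        ' Sample text copied from the lines quoted in this thread; the real file may differ.
        Dim strin As String = String.Join(Environment.NewLine, {
            "INVOICE NO. / INVOICE DATE 8000103406 / 20.01.2021",
            "LC NUMBER /CONT# 2700016007",
            "FOB VALUE (USD) : 1,532.10",
            "ADJUSTMENT - I (ADD FREIGHT + INS 0.00",
            "ADJUSTIMER -2 (ADJ AGAINST CREDIT 0.00",
            "TOTAL INVOICE VALUE (USD) FOB: 1,532.10"})

        ' Invoice number: the digits between "INVOICE DATE" and the slash.
        Dim innumber As String = FirstGroup(strin, "INVOICE DATE\s+(\d+)\s*/")
        ' Invoice date: dd.MM.yyyy after that slash.
        Dim inDate As String = FirstGroup(strin, "INVOICE DATE\s+\d+\s*/\s*(\d{2}\.\d{2}\.\d{4})")
        ' LC number: the digits after "CONT#".
        Dim lcNumber As String = FirstGroup(strin, "LC NUMBER\s*/\s*CONT#\s*(\d+)")
        ' Amounts: digits with thousands separators and two decimals.
        Dim fobValue As String = FirstGroup(strin, "FOB VALUE \(USD\)\s*:\s*([\d,]+\.\d{2})")
        Dim totalValue As String = FirstGroup(strin, "TOTAL INVOICE VALUE \(USD\) FOB\s*:\s*([\d,]+\.\d{2})")

        Console.WriteLine(String.Join("|", {innumber, inDate, lcNumber, fobValue, totalValue}))
    End Sub
End Module
```

Anchoring each pattern on its label rather than on character offsets means a change in spacing does not shift the result, and a field that is missing comes back as an empty string instead of throwing as `Substring` would.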