Doubt in PDF automation

Hi @anjani_priya

System.Text.RegularExpressions.Regex.Match(Text,"(?<=Due Date).*").Value

Sequence2.xaml (22.0 KB)

Cheers!!

Hi @anjani_priya

Here is the attached XAML:
Sequence5.xaml (7.0 KB)

regards

@anjani_priya

Sequence.xaml (10.7 KB)
Use the Write Cell activity to write the data into Excel.
Whether the value is blank or not, it will be written to Excel as-is.

I have written the regex for the due date only.
You can write regexes for the remaining fields in the same way.
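
As a rough illustration of the approach above (one pattern per field, with a blank written when nothing is found), here is a minimal Python sketch. It is not UiPath code: the sample text, the field labels other than Due Date, and the helper name `extract_field` are all hypothetical. In the workflow, the same idea would be one `Regex.Match(...).Value` assign per field followed by a Write Cell.

```python
import re

# Hypothetical sample of text as it might come out of a PDF read step.
SAMPLE_TEXT = (
    "Invoice Number INV-001\n"
    "Due Date January 5, 2023\n"
    "PO Number \n"
    "Total 250.00\n"
)

def extract_field(text, label):
    """Return the text that follows `label` on the same line, or "" if blank or missing."""
    # [^\S\n]* skips spaces/tabs but not newlines, so an empty field
    # does not accidentally pick up the next line.
    match = re.search(re.escape(label) + r"[^\S\n]*(.*)", text)
    return match.group(1).strip() if match else ""

# Hypothetical list of labels; "Customer" is deliberately absent from the sample.
fields = ["Invoice Number", "Due Date", "PO Number", "Customer"]
row = {label: extract_field(SAMPLE_TEXT, label) for label in fields}
print(row)
```

Each value in `row` is either the captured text or an empty string, so it can be written to a cell without a separate check.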

The PDF file is an image; I have sent a sample PDF file. Can you please tell me how to capture an empty field and, if text is present, how to capture the text?

@anjani_priya

It works with any PDF file.

Hi @anjani_priya

You can use the same method, but you need to change the regular expression to match your data.

regards

@anjani_priya

Main.zip (3.1 KB)


DuteDate = System.Text.RegularExpressions.Regex.Match(text,"Due Date\s+[A-Z a-z]+\d{1,2}\,\s*\d{4}").Value
If: System.Text.RegularExpressions.Regex.IsMatch(DuteDate,"[A-Z a-z]+\d{1,2}\,\s*\d{4}")
Message Box: System.Text.RegularExpressions.Regex.Match(DuteDate,"(?<=Due Date\s+)[A-Z a-z]+\d{1,2}\,\s*\d{4}").Value

Okay @anjani_priya

Below is a workflow that shows how to check the condition with regular expressions.

Step-by-step process:
→ Use the Read PDF Text activity to read the PDF and store the result in a String variable called Input_Text.
→ After that, use an If activity to check whether the due date is available.

- Condition -> System.Text.RegularExpressions.Regex.IsMatch(Input_Text,"(?<=Due Date\s)[\w\s\d,]+(?=\n+)")

→ The bot goes to the Then block if the due date is available in the PDF; store it in a variable there.

- Assign -> DueDate = System.Text.RegularExpressions.Regex.Match(Input_Text,"(?<=Due Date\s)[\w\s\d,]+(?=\n+)").Value

→ The bot goes to the Else block if there is no due date in the PDF.
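
The If/Else check in the steps above can be sketched in Python as follows. This is an illustration under assumptions, not the attached workflow: the sample strings and the function name `get_due_date` are hypothetical. One thing worth checking in the pattern above: because `\s` inside `[\w\s\d,]+` also matches line breaks, a blank Due Date may still produce a match taken from the following line, so the sketch restricts the match to the same line.

```python
import re

# Same-line variant: [^\S\n] is "whitespace except newline", so the match
# cannot run on into the next line when the field is blank.
DUE_DATE_SAME_LINE = re.compile(r"Due Date[^\S\n]+([\w ,]+)")

# The pattern from the steps above, kept only for comparison.
DUE_DATE_ORIGINAL = re.compile(r"(?<=Due Date\s)[\w\s\d,]+(?=\n+)")

def get_due_date(input_text):
    """Return the due date text, or "" when the field is blank or absent."""
    match = DUE_DATE_SAME_LINE.search(input_text)  # the "If" condition
    if match:
        return match.group(1).strip()              # "Then" branch
    return ""                                      # "Else" branch

# Hypothetical extracted text, once with a value and once with a blank field.
filled = "Invoice 42\nDue Date January 5, 2023\nTotal 100\n"
blank = "Invoice 42\nDue Date \nTotal 100\n"

print(repr(get_due_date(filled)))
print(repr(get_due_date(blank)))
print(repr(DUE_DATE_ORIGINAL.search(blank).group(0)))
```

With these samples, the same-line pattern returns an empty string for the blank field, while the comparison pattern matches text from the next line.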

See the attached workflow for a better understanding:
Regex_Practice.xaml (12.2 KB)

Output -

The PDF contains a due date, which is why the output shows it.

Hope it helps!!

I have done invoice text extraction by watching this video,
but I have a condition: if the field is empty, it should indicate the empty field; if the field has text, it should capture the text.

If I need a field other than the due date, how do I use that?

@anjani_priya

You can only indicate the element if it is present in that file.

Use another file (where the required value is not blank) to indicate the element.

It will output the value if it is present in the PDF, or a blank if it is not.

@anjani_priya

If you want to extract some other field instead of the due date, you have to write a regex for it.

My due date regex above handles both situations: if the due date is present it writes the date; otherwise it prints an empty value.

I am getting the read text as empty because the PDF contains an image.

I am getting an empty result from Read PDF Text.

Read PDF Text is not working; I am getting an empty field.

Hi @anjani_priya

Is your document a scanned file?

Regards

Yes, the document is a scanned file.

I have used regex and Read PDF With OCR,
but the regex result is empty.
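
One possible reason for an empty regex result on OCR output is that the recognized text does not match the label exactly (different case, extra spaces, a colon after the label). It can help to log the raw OCR string first and compare it with the pattern. The Python sketch below only illustrates a more tolerant pattern; the sample OCR string and the function name `due_date_from_ocr` are hypothetical, and real OCR output may differ.

```python
import re

# Hypothetical OCR output: inconsistent case, doubled spaces, a colon after the label.
ocr_text = "INVOICE\nDUE  DATE :  January 5,2023\nTOTAL 100\n"

# Tolerant pattern: ignore case, allow any run of spaces/tabs inside and after
# the label, and an optional colon. The value is whatever remains on that line.
TOLERANT_DUE_DATE = re.compile(
    r"due[^\S\n]*date[^\S\n]*:?[^\S\n]*(.*)", re.IGNORECASE
)

def due_date_from_ocr(text):
    """Return the value after the Due Date label, or "" if blank or not found."""
    match = TOLERANT_DUE_DATE.search(text)
    return match.group(1).strip() if match else ""

print(repr(ocr_text))                     # inspect the raw text first
print(repr(due_date_from_ocr(ocr_text)))
```

If the raw string printed in the first step is itself empty, the problem is in the OCR step rather than in the regex.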