How to get output of regex into notepad

Hi, I am doing PDF automation and using regex. Can anyone help me get the output of the regex into Notepad?

Hi @dipon1112000

Read the PDF with Read PDF Text or Read PDF with OCR
Store the output in a string variable
Use Find Matching Patterns with the regex for what you want to extract
Use For Each to iterate over the output of Find Matching Patterns

Use the Append Line activity and pass currentItem.ToString
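
For readers who want to see the logic of the steps above outside the workflow designer, here is a minimal Python sketch. It is not UiPath code: `append_matches`, the sample text, and the pattern are hypothetical stand-ins for the Read PDF Text output, Find Matching Patterns, and Append Line.

```python
import re
import tempfile
from pathlib import Path

def append_matches(text, pattern, out_path):
    """Append every regex match found in `text` to `out_path`, one per line."""
    count = 0
    with out_path.open("a", encoding="utf-8") as out:
        for match in re.finditer(pattern, text):  # plays the role of For Each
            out.write(match.group(0) + "\n")      # plays the role of Append Line
            count += 1
    return count

# Hypothetical text standing in for the string returned by Read PDF Text.
sample_text = "Account: 1001 Plan: Gold\nAccount: 1002 Plan: Silver"
out_file = Path(tempfile.mkdtemp()) / "matches.txt"
written = append_matches(sample_text, r"Account: \d+", out_file)
```

The resulting text file can then be opened in Notepad.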

Hope it helps!!

@dipon1112000

If you want to do it without a For Each, you can also use the expression below:

String.Join(Environment.NewLine, OutputofMatches.Cast(Of Match)().Select(Function(m) m.Value.Trim()))
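
As a rough illustration of what that expression does (trim each match, then join the values with newlines), here is the same idea as a Python sketch. The function name, sample text, and pattern are hypothetical and only there to make the example runnable.

```python
import re

def join_matches(text, pattern):
    """Trim each match and join all match values with newlines."""
    return "\n".join(m.group(0).strip() for m in re.finditer(pattern, text))

# The pattern deliberately captures a trailing space so the trim is visible.
joined = join_matches("Plan: Gold \nPlan: Silver ", r"Plan: \w+ ")
```

The joined string can then be written to the text file in a single step instead of one line at a time.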

Hope it helps!!

Thank you for replying. I have done that, but it seems the output is getting repeated.

Example:
I have 2 PDFs and I am using Read PDF Text.
Steps performed:

  1. Assign (pdfpath = file path)

  2. Assign (pdffile = Directory.GetFiles(pdfpath, "*.pdf"))

  3. For Each (file In pdffile)

  4. Read PDF Text (file.ToString)

  5. Match (output bundle)

  6. For Each (item In bundle)

  7. Assign (Bundleplan = item.ToString)

  8. Add Data Row {CurrentRow("Account").ToString, Bundleplan}

  9. Write CSV

But the output I am getting is

If you look at the output, there are 2 problems:

  1. There should be only 2 entries (as the input has 2 PDFs).
  2. The second account number is printed twice, and if you look closely, the second row contains the Bundleplan data of the first account.

I don’t know how to fix this.

Use the Remove Duplicate Rows activity before converting the datatable into text

OR

Initialize the datatable before the For Each and use Add Data Row inside it.
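
To show the idea behind both suggestions, here is a small Python sketch, not UiPath code. The row list is created once before any loop, and `remove_duplicate_rows` is a hypothetical helper that keeps only the first occurrence of each row, which is an assumption about how duplicate removal is meant to behave here. The account numbers and plan names are made up.

```python
def remove_duplicate_rows(rows):
    """Return rows with exact duplicates dropped, keeping first occurrences in order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# The table is created once, before the loop, and rows are added inside the loop.
table = []
for account, plan in [("1001", "Gold"), ("1002", "Silver"), ("1002", "Silver")]:
    table.append([account, plan])

deduplicated = remove_duplicate_rows(table)
```

Note that this only drops rows that are identical in every column. A row that pairs one account with another account's data would survive, so it is still worth checking why the extra row is produced.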

Thank you for helping. Can you provide a screenshot of how to use the Remove Duplicate Rows activity? It would be a great help.

@dipon1112000

Can you share a screenshot of the workflow?


@dipon1112000

  1. First, you are using both Get Row Item and an Assign with currentrow(0).ToString. Either one is enough; you don’t need both.
  2. How many PDF files are there? As per the workflow, a new row is added for each PDF file.
  3. Maybe the old PDF is still in the folder, which is why both PDFs are read the second time. You can delete or move each PDF once it has been read, so that only new PDFs are in the folder when you call Directory.GetFiles.

Basically, after Add Data Row, use a Delete File or Move File activity.
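
The effect of moving each file after it is processed can be sketched in Python as follows. This is only an illustration of the logic, not UiPath code: the folder names, the `handle` callback, and the empty placeholder files are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

def process_and_move(inbox, processed, handle):
    """Handle each PDF in `inbox`, then move it to `processed` so it is not read again."""
    processed.mkdir(parents=True, exist_ok=True)
    done = []
    # sorted() takes a snapshot of the folder before any file is moved.
    for pdf in sorted(inbox.glob("*.pdf")):
        handle(pdf)  # stands in for read text, match, add data row
        shutil.move(str(pdf), str(processed / pdf.name))
        done.append(pdf.name)
    return done

# Two empty placeholder files stand in for the input PDFs.
root = Path(tempfile.mkdtemp())
inbox = root / "inbox"
inbox.mkdir()
(inbox / "a.pdf").write_bytes(b"")
(inbox / "b.pdf").write_bytes(b"")

first_run = process_and_move(inbox, root / "processed", lambda p: None)
second_run = process_and_move(inbox, root / "processed", lambda p: None)
```

On the second run the input folder is empty, so files that were already processed are not picked up again.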

Hope this helps

Cheers

Thank you, moving the file worked.
