Building a DataTable with RegEx

Hi! I am extracting certain text from a PDF. Imagine something that almost looks like a table. To get the value for the next column, I have to use RegEx. There is a fixed list of possible values in the previous column, but sometimes only a few of them need information filled in. So I built a data table that looks something like this: image (this was from a YouTube video I was following along). The idea is to loop through each row, get the information from the text, and write it into the next column.

However, how can I do this if, let's say, “Due Date” is not in the PDF? Will it leave that row blank?

Hi,

It depends on your workflow, but probably yes (more precisely, the value will be set to an empty string, not null).
If you use System.Text.RegularExpressions.Regex.Match(args).Value and nothing matches, it returns String.Empty.
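To illustrate the behaviour described above, here is a minimal console sketch. The input text and the pattern are made-up placeholders, not taken from the actual PDF.

```vb
Imports System
Imports System.Text.RegularExpressions

Module MatchDemo
    Sub Main()
        ' Hypothetical input text that does not contain a "Due Date" entry.
        Dim text As String = "Invoice No: 12345"

        ' Regex.Match never returns Nothing; on failure it returns an
        ' unsuccessful Match whose Value is an empty string.
        Dim m As Match = Regex.Match(text, "Due Date: \S+")

        Console.WriteLine(m.Success)                ' False
        Console.WriteLine(m.Value = String.Empty)   ' True
    End Sub
End Module
```

Because the result is an empty string rather than Nothing, calling .Value on it is safe even when the pattern is absent from the text.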

Regards,

Hi Yoichi!

What has been happening is that it iterates over the first few rows, whose data is in the PDF file. Then it goes on to the next row, whose value is not in the PDF file, and it throws this error: Object reference not set to an instance of an object.

If I use Regex.Match(args).Value, do I put that into a variable? (Also, sorry, I only know the very basics of RegEx.)

Hi,

If possible, can you share your workflow?

Regards,

I cannot unfortunately, it is for work. But this is what I did (they all have pretty generic names):

The datatable concept is the same as what I posted earlier. I added a Write Line to see how the value would look before I write it back into the datatable. That was when it hit a row whose information is not in the PDF and threw the error.


Hi,

Perhaps you should use System.Text.RegularExpressions.Regex.Match(args).Value in an Assign activity instead of the Matches activity, as follows.

strMatch = System.Text.RegularExpressions.Regex.Match(pdfString, row("Regex").ToString).Value
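As a rough sketch of how this assignment behaves inside the row loop, here is a standalone console version. Only pdfString, strMatch, and the "Regex" column come from this thread; the "Field" and "Value" columns, the sample text, and the patterns are assumptions made up for illustration.

```vb
Imports System
Imports System.Data
Imports System.Text.RegularExpressions

Module LoopDemo
    Sub Main()
        ' Hypothetical stand-in for the text read from the PDF (no "Due Date").
        Dim pdfString As String = "Invoice No: 12345 | Total: 99.50"

        ' Hypothetical lookup table: one regex pattern per field.
        Dim dt As New DataTable()
        dt.Columns.Add("Field", GetType(String))
        dt.Columns.Add("Regex", GetType(String))
        dt.Columns.Add("Value", GetType(String))
        dt.Rows.Add("Invoice No", "(?<=Invoice No: )\d+", String.Empty)
        dt.Rows.Add("Due Date", "(?<=Due Date: )\S+", String.Empty)
        dt.Rows.Add("Total", "(?<=Total: )[\d.]+", String.Empty)

        For Each row As DataRow In dt.Rows
            ' Same expression as in the Assign activity above.
            Dim strMatch As String = Regex.Match(pdfString, row("Regex").ToString).Value
            row("Value") = strMatch
            Console.WriteLine(row("Field").ToString & " = [" & strMatch & "]")
        Next
        ' Prints:
        ' Invoice No = [12345]
        ' Due Date = []
        ' Total = [99.50]
    End Sub
End Module
```

The "Due Date" row simply ends up with an empty value, and the loop continues to the next row without throwing.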

Regards,

It works!!! Thank you so much!!! :smiley:

