Delimiter text using regex

Hello everyone,
I have a txt file in the following format and I need to export it to Excel. How can I do it?

Code Description Price
122952 MOBIL 1 ESP FORMULA 5W-30 6X1L 811,968
121463 M-SHC CIBUS 32 PAIL 20L 833,140
100569 MOBIL BRAKE DOT4 24x0.5L 13,200

This looks like a CSV file. Did you try the Read CSV activity?

@mcampu I agree with @bcorrea that it looks like a CSV file. Did you remove the commas? It seems to be a delimited file, but the delimiter can't be " " because some of the values contain spaces. Is it possible the delimiter is the tab character? If so, you can split the string by newline, then split each line by the tab character and pass the resulting array into an ‘Add Data Row’ activity for your datatable. Once every line has been processed, you can use a ‘Write Range’ activity to write the datatable to your Excel file.
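
To make the newline-then-tab split concrete, here is a rough sketch in Python rather than UiPath activities. It assumes the text really is tab-delimited (the sample string is hypothetical, with tabs added by hand), and `rows_from_tab_text` is just an illustrative name, not anything built in.

```python
import csv
import io

# Hypothetical sample: rows like the ones posted, ASSUMING tabs between columns.
text = (
    "Code\tDescription\tPrice\n"
    "122952\tMOBIL 1 ESP FORMULA 5W-30 6X1L\t811,968\n"
    "121463\tM-SHC CIBUS 32 PAIL 20L\t833,140\n"
)

def rows_from_tab_text(raw):
    """Split by newline, then split each non-empty line by the tab character."""
    return [line.split("\t") for line in raw.splitlines() if line.strip()]

rows = rows_from_tab_text(text)

# Write the rows as CSV into a buffer; the csv module quotes fields such as
# "811,968" because they contain the comma delimiter.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Each inner list plays the role of the array you would hand to ‘Add Data Row’, and the CSV output stands in for the ‘Write Range’ step.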

The file I showed you is the result of reading a PDF. As you can see, the file is not delimited.
And I need to get it into Excel.

Did you try Data Scraping to see if UiPath can grab that table directly from the PDF as a datatable?

PDF data

Not really, because a friend recommended converting the PDF to a txt file and going from that to Excel.

@mcampu

You can turn that into a data table using Generate Data Table, passing the required delimiter so that the data ends up in the required format. Then you can write it directly to Excel.

@HareeshMR
I did the following:
I read the PDF
I wrote it to a txt file
Then with “Generate Data Table” I delimited my file by “space” (I also tried “tab”)
and I wrote it to an Excel file
The result was the following

@HareeshMR

Change that. Don't read the PDF as text, because that is not doing you any good. Try to read the PDF and extract the table directly.

@bcorrea

I tried to read the table with “Data Scraping” but it did not work. The error is as follows: “This control does not support data extraction”

There are several ways to do that. Is your PDF something you can share here?

@bcorrea

I am not allowed to share it. Would you have an example of how to do it?

Just wanted to mention that if it is tab-delimited, you don’t need to go through any special logic. Just use Write Text File with ‘.XLS’ as the extension, then Read Range as normal. Of course, you would first want to test opening the file in Excel to see if the delimiter took.

This isn’t necessarily a solution to the problem, just mentioning it.

@mcampu
If it comes down to it, you could use some string manipulation to delimit your text.

Let’s say countBefore is the number of columns before the description, and countAfter is the number of columns after the description. We can use those to split around the description. (To make these values dynamic, you would need to determine them from the column headers in the text.)
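
As a rough illustration of reading those counts from the header row, here is a Python sketch (not a UiPath expression). It assumes every header is a single word and that the free-text column is called "Description"; `counts_around` is a hypothetical helper name.

```python
def counts_around(header_line, wide_column="Description"):
    """Return (count_before, count_after) for the free-text column.

    Assumes single-word headers separated by whitespace.
    """
    headers = header_line.split()
    position = headers.index(wide_column)  # raises ValueError if the column is missing
    return position, len(headers) - position - 1

count_before, count_after = counts_around("Code Description Price")
print(count_before, count_after)  # prints: 1 1
```

If a header itself contains spaces, this simple count would be off and the numbers would have to be set by hand.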

You could do this as an array.
String.Join(System.Environment.NewLine, pdfTxt.Split({System.Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries).Select(Function(x) String.Join("|",x.Trim.Split(" "c).Take(countBefore))+"|"+String.Join(" ",x.Trim.Split(" "c).Skip(countBefore).Take(x.Trim.Split(" "c).Skip(countBefore).Count-countAfter))+"|"+String.Join("|",x.Trim.Split(" "c).Skip(x.Trim.Split(" "c).Count-countAfter)) ) )

Note: this has not been tested and could have mistakes, but they can be resolved if errors are posted.

Then, you can use Generate Data Table with the new delimiter. I used “|” but you can use another character including the Tab character. If you use Tab, you could also just write directly to an .XLS using Write Text File.

Essentially, this string manipulation splits the text by newline to break it up by line, then splits each line by space and joins the pieces back together in three sections. The first countBefore items are joined with the delimiter. The middle items (skipping countBefore, then taking the remaining item count minus countAfter) are joined with spaces to rebuild the description. Finally, the last countAfter items are joined with the delimiter, and the three sections are joined to each other with the delimiter.
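
For anyone who wants to check the logic outside UiPath, here is the same idea as a small Python sketch. It is an illustration under assumptions, not a translation of the expression above: `delimit_line` is a hypothetical name, and `split()` collapses repeated spaces, which splitting on a single space character does not.

```python
def delimit_line(line, count_before, count_after, delimiter="|"):
    """Keep the leading and trailing columns and rejoin the middle as one field."""
    parts = line.split()  # splits on any run of whitespace
    middle_end = len(parts) - count_after
    before = parts[:count_before]
    description = " ".join(parts[count_before:middle_end])
    after = parts[middle_end:]
    return delimiter.join(before + [description] + after)

# Sample rows copied from the question.
pdf_txt = """Code Description Price
122952 MOBIL 1 ESP FORMULA 5W-30 6X1L 811,968
121463 M-SHC CIBUS 32 PAIL 20L 833,140
100569 MOBIL BRAKE DOT4 24x0.5L 13,200"""

delimited = "\n".join(
    delimit_line(line, count_before=1, count_after=1)
    for line in pdf_txt.splitlines()
    if line.strip()
)
print(delimited)
# First data row becomes: 122952|MOBIL 1 ESP FORMULA 5W-30 6X1L|811,968
```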

EDIT: Also output the text after the string manipulation and view it to make sure it was delimited correctly.
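
If eyeballing the output is not enough, a quick check like the following Python sketch can flag lines that did not end up with the expected number of fields. The function name `find_bad_lines` and the sample text are made up for illustration, and it assumes the delimiter never appears inside a value.

```python
def find_bad_lines(delimited_text, expected_fields, delimiter="|"):
    """Return (line_number, line) pairs whose field count differs from expected."""
    bad = []
    for number, line in enumerate(delimited_text.splitlines(), start=1):
        if line.strip() and len(line.split(delimiter)) != expected_fields:
            bad.append((number, line))
    return bad

# Hypothetical output where the third line is missing one delimiter.
sample = (
    "Code|Description|Price\n"
    "122952|MOBIL 1 ESP FORMULA 5W-30 6X1L|811,968\n"
    "100569 MOBIL BRAKE|13,200"
)
print(find_bad_lines(sample, expected_fields=3))
# prints: [(3, '100569 MOBIL BRAKE|13,200')]
```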

Regards.

Thanks for typing this all out. I started, but realized it was going to be more complicated when skipping the first/last when joining and just got too lazy haha. @mcampu This is how I would go about it as well

This PDF image looks a lot like an Excel table exported as PDF. Are you sure you can't ask the person who is exporting it and just get the Excel file to work with?

@bcorrea
The PDF is received by email and deposited in a folder. From there, the robot must take the PDF, pass the data to Excel, and make the invoice with the data.

@ClaytonM
I’m going to try what you mentioned, thanks

Here you go, if you really want to do it like this. See my attached solution: import.xaml (9.6 KB)