Read Range - Trim Spaces before Output?

datatable
excel
readrange

#1

I am using the Excel Activity Scope to read a spreadsheet regularly provided by an outside vendor to populate internal data. The issue is that the column headers can have a varying number of trailing spaces at the end. To be on the safe side, I’d like to trim all extraneous spaces from the output datatable. I’m not sure how I would go about doing this; could anyone provide assistance?

Thanks!


#2
//Pseudocode 
For each column in dt.columns
Column.ColumnName = Column.ColumnName.Trim

#3

Thank you; I’m not following where I put this code, however. Does this get added somewhere within the Excel Application Scope, or am I adding another activity that does this after that sequence?


#4

You can translate his pseudocode in your workflow by using a generic For Each loop with DataColumn as the TypeArgument, then inside use an Assign activity to perform the assignment like he posted. (Invoke Code might also be an option to speed up the process and decrease CPU usage)

Give that a shot.

Alternatively, you can streamline that down to a one-liner using a query expression, but might complicate it slightly because you would need to go through the rows as a string and concatenate the columns back together after you trim them.
For example,
String.Join(vblf,(From line In txtTable.Split(vblf(0)) Select line.Split({line.Split(",“c)(columnIndex).ToString},System.StringSplitOptions.None)(0) + line.Split(”,“c)(columnIndex).ToString.Trim + line.Split({line.Split(”,"c)(columnIndex).ToString},System.StringSplitOptions.None)(1)))

The first option is easier to understand though, lol.

Thanks.


#5

EDIT:
I’ve mixed up InvokeCode and InvokeMethod in this post. See further replies. Leaving original text to make the thread still make sense.

Pre-edit:

InvokeCode is vastly inefficient, as it needs to use reflection for the method calls. Outside of fringe scenarios it will never come close.

It seems we also have quite a different definition of streamlining :wink: that one-liner call would wreak havoc with heap allocations.

Make World a better place - KISS everything :wink:


#6

Yeah, I wasn’t sure on Invoke Code cause I haven’t used it but letting us use VB.net code in it interested me :). In my mind, I’m thinking it goes through For Loops super fast like your normal vb code for example.

lol, streamlining was wrong word. I mean it performs the change instantly instead of needing to wait for thousands of rows to loop through. I really want to learn how to utilize the Array.ForEach() or similar function so I can perform an action as such instead of needing to use .Select and split/concatenate the string.


#7

Hi Andrzej,

I would partially disagree with you on that. I do believe not all methods from your “code” is called by reflection but rather your code as Method, which is slightly different.

When it comes to comparing Invoke Code VS For each activities, I believe that when dealing with consistently large collection (where the perf effectively matter), activities will have slightly lower perfs for what i have tested.

I guess it is due to the fact that on each iteration inside the “body”, wwf needs to pass through the activity context of each scope (ex: Body=Asign) for each instruction you provide, which implies executign some .Net Code in the back.

I have small benchmark bulding a datatable one column and 1000000 rows (yes that what you need to see a difference…)

I’m invoking those two tests in both approaches :

For Each r As DataRow In dt.Rows
r(0) = r(0).ToString.Trim
Next

For Each r As DataRow In dt.Rows
r(0) = Guid.NewGuid.ToString
r(0) = r(0).ToString.Replace("a","")
r(0) = r(0).ToString.Trim
r(0) = Guid.NewGuid.ToString
r(0) = r(0).ToString.Replace("a","")
r(0) = r(0).ToString.Trim
Next

Invoke Code Test1: 00:03
Invoke Code Test2: 00:06

For each Activity Test1: 00:03
For each Activity Test2: 00:17+

Same results between for each and for each row.

Finally i did a last test using a one statement comparison using a an expensive expression:

New System.Text.RegularExpressions.Regex("\d{2}").Replace(String.Join("",Enumerable.Repeat(dt.Rows(0)(0),500000)).Replace(dt.Rows(0)(0).ToString.First,"y").Replace("y","z").Replace("z","a").Replace("a","b"),"\d{2}")

i get the same result (5 secs) for both

If you get different results or omitted on my comparisons let me know.

Main.xaml (24.8 KB)

Cheers :slight_smile:

My since apologize to the OP who just wanted a trim on its header and end up with weather with debate if you should rather use invoke code, linq or a for each to put a rocket into space…


#8

@Florent_Salendres @ClaytonM
:man_facepalming:
Of course. InvokeCode, not InvokeMethod… My appologies, I’m still in 2016.2 mindset. You’re absolutely correct, I completely mixed those two.
Kudos for fact checking :slight_smile:

I’d still stand by that for trimming column headers what I wrote is the simplest approach, but rest was incorrect.


#9

Thank you all - I haven’t tried this out yet, but I am following now! And since I got some good guidance, I will say that it is also illuminating to mix in a debate about more advanced usage. Hopefully I can pick up some understanding through that as well. :grinning:


#10

Using a generic For Each seemed like the way that made the most sense to me, but I get the below compiler error in this method: pic deleted

It looks like everything should correspond between the DataTable variable and the For Each, so I’m not sure why this is coming back.


#11

The Type in the Properties should be set to the type for each item you look at. In your case, it would be DataColumn, not DataTable. So I think changing the Type will resolve that. Let us know if it doesn’t.

Thanks.


#12

Clayton - you mean the Type within the For Each activity, correct? That is set to System.Data.DataColumn. The Properties pane in the screenshot is for the DataTable variable.


#13

Hi,

You need to iterate through datatable.columns (right textbox of the foreach activity)

Cheers


#14

That’s it, thanks!


#15

I was using the “For Each” activity version of this successfully, but I am having a problem with a second sheet in the same project. After I pull the data from the second sheet, I added the same activity modified appropriately for that sheet. However, when the process gets to that state it throws an error:

Assign: Exception has been thrown by the target of an invocation.

There aren’t any validation errors, and everything appears to be correct. Googling makes me think this isn’t an issue with the project as much as a memory issue with UiPath or something like that? Anyone else experienced something similar?


#16

No idea jcarr. You might place your Sequence in a Try/Catch then in the Catch output something like the row number or something that will confirm it’s not your data causing the error. So when the error is thrown you can see where in your table it’s happening basically.


#17

I was able to figure it out - the second source spreadsheet has two columns with identical column names and that was throwing the error.