I have a read range activity on an excel and after reading it, the new table contains my columns but it adds some new ones called Columns 1,2,3…In the original table i have for example, only 5 columns containing text, so not sure where those Columns 1,2,3,4 are getting in after read range. Is there a way to stop them being read or what’s the best solution to remove them after read range?
Don’t have to much experience, so it’s a bit tricky for me having those unexpected empty columns.
@Mario9016, click on your Read Range Activity to select it and on the properties pane make sure the button Add Headers is checked. if it is then UiPath will understand that your excel has headers.
I think what happens sometimes is the Excel file has hidden content somewhere which causes it to be read and column names get generated as you show in your image.
One option is to highlight everything to the right of the Excel data and use the “Clear All” in the Ribbon to erase all content. However, this requires that the file be maintained like this and I am not sure how consistent it would be.
The method which I have used is to just simply run the columns through a loop and remove the ones that StartsWith(“Column”)
For example:
For each col In dt.Columns.Cast(Of DataColumn).Where(Function(c) c.ColumnName.StartsWith("Column") ).ToArray
Remove Data Column activity //use col.ColumnName for columnname property
Yep, run in to this plenty of times. FWIW my default habit is to just delete the last few column/s whenever I finish working on an excel file a robot will touch (works like “Clear All”):
If this file will be handled by humans or other applications you will risk the phantom data issue.
This is the first time I’ve thought about solving this problem, as it hasn’t popped up outside of unit testing before. Checking the columns schema might be better as @ClaytonM suggests? My first thought is use Rows.All() LINQ to find any columns which have entirely NULL values in all rows, then delete those. The fact that we are dealing with phantom data makes me skeptical of this approach though.
If you can be absolutely sure the phantom columns will only appear at the end of the column list, then perhaps consider @ClaytonM suggestion, or keep track of the number of columns you expect, and delete the rest (pseudocode, untested):
If Columns.Count > NumberOfExpectedColumns Then
For X = Columns.Count - (Columns.Count - NumberOfExpectedColumns) To Columns.Count - 1
Remove Column(x)
Next
If you have columns being inserted before or in the middle of your spreadsheet then a combination of approaches might be suitable.
My suggestion is with the assumption that any phantom data gets added to a column that starts with “Column”. The order doesn’t matter at all.
Basically, I have created an array of the columns that start with “Column”, then run that in the loop to remove them.
dt.Columns.Cast(Of DataColumn).Where(Function(c) c.ColumnName.StartsWith("Column") ).ToArray
this line of code sets an array with only the columns that starts with “Column”. Alternatively, you can run an IF inside that checks the StartsWith() instead, however, that is less efficient.
where the For each is of TypeArgument String, although, you can change it to DataColumn and remove the .Select() part out of it, if you wanted. Then, use col.ColumnName
Here is the example without the IF which I think is better: