Separate value of same cell but different row into matching cell?

Hi all,

I want to split the values in a single cell into separate cells/rows in Excel.
All values from the other columns need to be inherited by each new row. Anything separated by any kind of delimiter should be split (I believe this example uses commas and hyphens as delimiters). I put an example in the screenshots below. The merged cells need to be accounted for too.

input file:
Input (1).xlsx (10.2 KB)

Before:
image

What I am trying to achieve:
image

Any advice or solution would be appreciated!

What you’re describing is usually referred to as “flattening”, even outside of the UiPath world. The flattening here consists of two parts: a) rows that have merged cells and b) rows that have cells with multiple values separated by delimiters such as commas.

For a), it’s key to realize that the actual rows/columns look like this when you read the range into a datatable in Studio:

| Row | Office | Appropriation Account | CAT_B_APOR | Cam1 |
|---|---|---|---|---|
| 2 | OCFO | 4598 | HUDWFRB, HUDWFDB | HUDNPS000000 |
| 3 | | | | HUDWCF000000 |
| 4 | OCIO | 4586 | HUDWCES | HUDITFP00000 |
| 5 | | | | HUDITFP25000 |
| 6 | … | … | … | … |

You notice that rows 3 and 5 are empty except for one column.

My rough conceptual approach would be:

  1. Get the data and store in a datatable.
  2. Address scenario a) described above by looping through the rows in the datatable: For each row, fill any empty columns/fields with the value the preceding row had for that column/field. This is assuming that the actual data would not have any empty cells in it.
  3. Address scenario b) by looping through the rows in the datatable (you could probably do 2. and 3. in the same iteration): For each row, look at every column/field and determine whether it contains any of the separator characters (comma, hyphen, slash). If it does, extract the separate values. Then, for each of the values, insert a duplicate of the current row and set the column/field at hand to that value. I would recommend reading the separator characters from a config or argument and then constructing a regular expression from them.
  4. Write the datatable back to Excel.
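
The config-driven splitting suggested in step 3 could be sketched roughly as follows. This is only a minimal illustration, not a finished workflow: the module and function names (SplitSketch, BuildPattern, SplitCell) are hypothetical, and the delimiter list is assumed to arrive as a string array from a config or argument.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module SplitSketch
    ' Builds an alternation pattern such as ",|-|/" from the delimiter list.
    ' Regex.Escape protects delimiters that are regex metacharacters (e.g. "|" or ".").
    Function BuildPattern(delimiters As String()) As String
        Return String.Join("|", delimiters.Select(Function(d) Regex.Escape(d)))
    End Function

    ' Splits one cell value, trims each part and drops empty parts.
    Function SplitCell(value As String, pattern As String) As String()
        Return Regex.Split(value, pattern).
            Select(Function(p) p.Trim()).
            Where(Function(p) p.Length > 0).
            ToArray()
    End Function

    Sub Main()
        ' Assumed delimiter set; in a real workflow this would come from config.
        Dim pattern = BuildPattern({",", "-", "/"})
        For Each part In SplitCell("HUDWFRB, HUDWFDB", pattern)
            Console.WriteLine(part)
        Next
        ' Prints:
        ' HUDWFRB
        ' HUDWFDB
    End Sub
End Module
```

Keep in mind that treating the hyphen as a delimiter will also split legitimate values that contain a hyphen, so the delimiter list should match the actual data.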

You need to hash out the details of course. Let me know if you’re running into any issues that you can’t figure out. Happy to help further.

Thank you, I will have a go at this. I am sure I will run into issues, as this is a learning curve for me. If you have a solution I can refer to when I am done, I would appreciate that as well!

Do you have a template you started building from, so I can follow along?

Sorry, @Michael_L. I’m not aware of any template for this.

Which step are you struggling with? We can try to work through this together.

Hi @Michael_L ,

Can you confirm whether, for the empty rows like the ones below, we can fill each empty cell with the value from the previous row?

Also, let me know whether genuinely empty cell values can occur in the input data.

Thanks,
Sagar

In this example, the result should look like this:

image

Please refer back to the original post and take a look at the input sheet to see the before and after. Let me know if you have any additional advice or a solution!

Hi @Michael_L ,

Here is a solution. Please try it out and post here if you hit any issues.

I am assuming that any empty cell should be replaced by the value of the cell above it (the previous row).

Solution Approach:

  1. Format the table by filling in the empty cell values.
  2. Use a LINQ query to split on the delimiters and produce the output rows.
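
Step 1 is not shown in the post itself (it is presumably part of the attached xaml). A minimal sketch of one way to do it is below. It assumes, as stated above, that a blank cell never represents real data, and the names FillDownSketch and FillDown are hypothetical.

```vbnet
Imports System
Imports System.Data

Module FillDownSketch
    ' Replaces every blank cell with the value from the row directly above it.
    ' Assumption: a blank cell is always the result of a merged cell, never real data.
    Sub FillDown(dt As DataTable)
        For r As Integer = 1 To dt.Rows.Count - 1
            For c As Integer = 0 To dt.Columns.Count - 1
                Dim cell = dt.Rows(r)(c)
                If cell Is DBNull.Value OrElse String.IsNullOrWhiteSpace(cell.ToString()) Then
                    dt.Rows(r)(c) = dt.Rows(r - 1)(c)
                End If
            Next
        Next
    End Sub

    Sub Main()
        ' Tiny stand-in for the table read from Excel.
        Dim dt As New DataTable()
        dt.Columns.Add("Office")
        dt.Columns.Add("Cam1")
        dt.Rows.Add("OCFO", "HUDNPS000000")
        dt.Rows.Add(Nothing, "HUDWCF000000")
        FillDown(dt)
        Console.WriteLine(dt.Rows(1)("Office")) ' Prints: OCFO
    End Sub
End Module
```

Because the loop runs top to bottom, several blank rows in a row all inherit the last non-blank value above them.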

LINQ Query Code:

(From row In dtFormatData.AsEnumerable
Let col_ColmB = row(1).ToString.Split(",-/".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
Let col_ColmC = row(2).ToString.Split(",-/".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
Let col_ColmD = row(3).ToString.Split(",-/".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
From itemB In col_ColmB
From itemC In col_ColmC
From itemD In col_ColmD
Select dtOutput.Rows.Add({row(0).ToString,itemB,itemC, itemD})
).CopyToDataTable
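
To see what the stacked From clauses do, here is a small stand-alone sketch on plain strings instead of a datatable. The values "A1, A2" and "X1/X2" and the names CrossJoinSketch and Expand are made up for illustration. Each additional From multiplies the rows, so a cell with two values combined with another cell with two values yields four output rows.

```vbnet
Imports System
Imports System.Linq

Module CrossJoinSketch
    ' Expands one input row into every combination of the split values.
    Function Expand(office As String, catB As String, cam1 As String) As String()
        Dim seps = ",-/".ToCharArray()
        Return (From b In catB.Split(seps, StringSplitOptions.RemoveEmptyEntries)
                From c In cam1.Split(seps, StringSplitOptions.RemoveEmptyEntries)
                Select office & " | " & b.Trim() & " | " & c.Trim()).ToArray()
    End Function

    Sub Main()
        For Each line In Expand("OCFO", "A1, A2", "X1/X2")
            Console.WriteLine(line)
        Next
        ' Prints:
        ' OCFO | A1 | X1
        ' OCFO | A1 | X2
        ' OCFO | A2 | X1
        ' OCFO | A2 | X2
    End Sub
End Module
```

Two things to watch with the original query: CopyToDataTable throws an InvalidOperationException when the query returns no rows, and a cell that is empty after splitting contributes zero items, which drops the whole row from the output.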

FYI, see the output screenshot.

The xaml is attached:
SplitRowsForDelimeters.xaml (20.4 KB)

Thanks,
Sagar

In your LINQ statement:

image

What are these referring to?

Hi @Michael_L ,

The number in parentheses after row is the zero-based index of the table column.

Instead of the column index, you can also use the column name where required, e.g. row(1) -> row("Appropriation Account").

One example is given below.

(From row In dtFormatData.AsEnumerable
Let col_ColmB = row("Appropriation Account").ToString.Split(",-/".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
Let col_ColmC = row("CAT_B_APOR").ToString.Split(",-/".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
Let col_ColmD = row("Cam1").ToString.Split(",-/".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
From itemB In col_ColmB
From itemC In col_ColmC
From itemD In col_ColmD
Select dtOutput.Rows.Add({row(0).ToString,itemB,itemC, itemD})
).CopyToDataTable

Note: please clean up your column header names. The code above can fail if a header contains a line break (Enter/newline).

If the solution works fine for you, please mark as solved.

Thanks,
Sagar

That makes sense. I am used to the format row("xxxx"), but thank you for the clarification. Let me run this on a larger set of data and I will reach back out if I have any issues. I really appreciate your help.

Also, are you using the exact same input sheet I sent over, or did you re-format it yourself?

There is one more column I forgot to add to the input sheet, with the same kind of contents as the Office column. Where would I change the code to incorporate the new column and get the same result?

I tried to modify the linq code:
image

and this is the output. I am getting a blank column for Cam1:
A spreadsheet contains columns titled "location," "Office," "Appropria," "CAT_B_APOR," and an empty column labeled "Cam1," with rows populated by various alphanumeric codes, and a section highlighted in yellow on the right side. (Captioned by AI)

Also, for column CAT_B_APOR in the output sheet, is there a way to delete leading spaces?
image

Here is the updated input sheet with the added column Location:
Input (1).xlsx (10.9 KB)

I appreciate your help again!

Hi @Michael_L ,

Thank you for validating the previous code.

Your new table format has one added column, which I am assuming is of the same type as the “Office” column. Below is the new code:

(From row In dtFormatData.AsEnumerable
Let col_ColmB = row(2).ToString.Split(",-/".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
Let col_ColmC = row(3).ToString.Split(",-/".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
Let col_ColmD = row(4).ToString.Split(",-/".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
From itemB In col_ColmB
From itemC In col_ColmC
From itemD In col_ColmD
Select dtOutput.Rows.Add({row(0).ToString,row(1).ToString,itemB,itemC, itemD})
).CopyToDataTable

If you compare this with the previous code, I shifted the column index for the first collection (col_ColmB) from row(1) to row(2), and likewise for the following columns. On the second-to-last line, where the data row is added to the final output table, I added a value for the new “Location” column (Location → row(0) and Office → row(1)).

Thanks,
Sagar

Yep, I figured that I needed to adjust the row index and add another item to the Select statement. I will keep going and let you know if I run into issues. Also, how do I delete leading/trailing spaces for all output cells?

Sure, try it from your end as well if you think any modification is needed.

Regarding your question:
how do i delete leading/trailing spaces for all output cells? Append “.ToString().Trim()” to the value, as in the example below.

row(2).ToString().Trim()
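
As far as I can tell, trimming the whole cell before the split does not remove the space that follows a comma inside the cell (e.g. "HUDWFRB, HUDWFDB" still yields " HUDWFDB"). Inside the query it is probably easier to trim each split item, e.g. itemB.Trim in the Select. To clean up every cell of the finished table in one go, something like the sketch below could work; TrimSketch and TrimAllCells are hypothetical names.

```vbnet
Imports System
Imports System.Data

Module TrimSketch
    ' Trims leading/trailing whitespace in every string cell of the table.
    ' Non-string columns and null cells are left untouched.
    Sub TrimAllCells(dt As DataTable)
        For Each row As DataRow In dt.Rows
            For Each col As DataColumn In dt.Columns
                If col.DataType Is GetType(String) AndAlso Not row.IsNull(col) Then
                    row(col) = row(col).ToString().Trim()
                End If
            Next
        Next
    End Sub

    Sub Main()
        Dim dt As New DataTable()
        dt.Columns.Add("CAT_B_APOR")
        dt.Rows.Add(" HUDWFDB ")
        TrimAllCells(dt)
        Console.WriteLine("[" & dt.Rows(0)(0).ToString() & "]") ' Prints: [HUDWFDB]
    End Sub
End Module
```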

Cheers
