FILTER THE DATATABLE_A INTO DATATABLE_B AND DATATABLE_C

Hi Team,
(@aksh1yadav, @arivu96, @KarthikByggari, @Karthick_Settu, @ClaytonM, @pathrudu, @PrankurJoshi, @Priya_Pandey, @Divyashreem, )

I have a datatable (let’s say dtA) which has 10 columns and 1000 rows with various values (Columns are:- Server name, OS version, Path, Port name, etc)

And as I mentioned, there are 1000 such rows.

Now from the datatable I want to filter it such that this dtA is divided into two different datatables (dtB and dtC).

The criteria is :-
if (OS version contains “Windows”, “MIcrosoft”, “MS”, etc) then add this data (all the details Server name, OS version, Path, Port name, etc) needs to be entered into dtB

else
if (OS version consists “Redhat”, “Sun”, “Linux”, “Kali”, “Ubuntu” etc ) then add this data (all the details Server name, OS version, Path, Port name, etc) needs to be entered into dtC

I have performed this operation as below:-

build data table dtB and dtC (I already have dtA in hand)

for each row in dtA
if (row(1).tostring = “MS”, “Windows”, etc) then add a new datarow and enter each row (each and every detail into dtB using assign activity)

else if (row(1).tostring = “Redhat”, “sun”, etc) then add a new datarow and enter each row (each and every details into dtC using assign activity)

Now, as I mentioned, there are 1000 rows, but there might be 100000 or more rows, and using a for loop is not very convenient for me.

I also know that I can directly use filter data table activity to input dtA and output a dtB
and another filter datatable activity where I input dtA and output a dtC

But I need to know how do I mention the Column name and value pair in the filter datatable and retrieve the dtB and dtC directly.

I need to put in the filter datatable activity as dtA.columns.contains(“MS”) or dtA.colunms.contains(“Microsoft”)

because the actual value cell might contain (MS sql server verisio bla bla bla) instead of just “MS” in that cell
or it might contain (Windows server 2008 r2 bla bla bla) insteaad of just “Microsoft”

Please help, Its urgent,

Thanks and Regards,
@hacky

Well,

If row count might turn out to be more than 100000 then better to import it as a database and query it.

1000 to 10000 are fine, more than 100000 I am not so sure your bot will run without crashing sometimes.

Regards

Hi.

So, you say you have a list of words that you want to use to identify an OS for two different tables. Let’s say we store these lists in an array variable.
osWords1 = {"Windows","Microsoft","MS"}
osWords2 = {"Redhat","Sun","Linux","Kali","Ubuntu"}
That way, you can adjust these lists (add more if you want)

You can look in these arrays if a word is contained in a string to return a Boolean using .All() or .Any(), like this:
osWords1.Any(Function(x) str.Contains(x))
.Any() looks in the array and returns True if atleast one item is True, whereas .All() returns True if all items are True.

So using this with LINQ, we can formulate your two datatables.

Like this assuming “OS version” is the column name:

dtB = dtA.AsEnumerable.Where(Function(r) osWords1.Any(Function(x) r("OS version").ToString.Contains(x)) ).CopyToDataTable

dtC = dtA.AsEnumerable.Where(Function(r) osWords2.Any(Function(x) r("OS version").ToString.Contains(x)) ).CopyToDataTable

and there ya have it, two datatables with your criteria. I hope this helps solve your question. Note: I did not test this and this code comes directly from memory, just in case there are mistakes.

Also, if you need to change a cell value in the datatable, you will need to run a ForEach, but processing usage has improved quite a bit now, so 100000 rows may not take very long any more.

Regards.

4 Likes

@ClaytonM
Thank you very much for your response. I am pleased to learn it from you.

I guess your approach seems worth trying. You rightly guessed that (os version) is the column name.

However i had couple of questions in my mind:-

  1. Is there a way I can use filter data table directly to achieve this.??? If so please let me know how can I achieve this.

  2. As you suggested I can use arrays. But FYI, I need to use “ms”, " Microsoft", “windows” etc as a keywords. So i want to use these keywords as dtA.column.contains(“ms”) instead of dtA.column.equals(ms") since the OS version column has handful of words in some cases. Will this be satisfied in your approach??

  3. Also will your approach fetch all the columns into dtB and dtC respectively? Hope this is the case.

Please get back to me at the earliest.

Regards.

1 Like

Yes, if your logic doesn’t require any expressions like to convert or change values as you compare them, then this will work too. In your case, it will work. Here is a sample where you just need to add every single word in the comparison, and click the plus sign to add more and use “Or”:

Yes, my approach for storing the words in an array let’s you add all words you want. So, you can have “ms”, “MS”, and so on - just add them to the array, and you don’t need to change any code in your project at all.

If you use my approach, it keeps the exact data table structure inplace, so all columns will be there. If you use the Filter Data Table activity, then you need to specify every column you want to keep - I think.

I hope this answers your questions.

Regards.

1 Like

I’ll add one more thing to this. If you use expressions like the above example in my approach, you should compare the values as all lowercase or all uppercase, so it is not case-sensitive.

Here is the added .ToUpper to accomplish this:
dtB = dtA.AsEnumerable.Where(Function(r) osWords1.Any(Function(x) r("OS version").ToString.ToUpper.Contains(x.Trim.ToUpper)) ).CopyToDataTable

dtC = dtA.AsEnumerable.Where(Function(r) osWords2.Any(Function(x) r("OS version").ToString.ToUpper.Contains(x.Trim.ToUpper)) ).CopyToDataTable

EDIT: if it is not case-sensitive like this, then you don’t need to add all case-variations like “ms” and “MS”. If you use the Filter Data Table, you can’t use expressions, so you can’t do this I believe and will need to add all variations. Thanks.

Regards.

1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.