I have a DataTable “dt” which contains a table from Excel. The table contains some informations about some postal numbers, however, one user case we have to look out for is if people enter letters in their postal code.
Example of good input is " 99950" for Alaska.
Example of bad input is " 99950 Alaska" for Alaska
We only accept cells which ONLY contains letters, every cells that contains anything else than numbers has to be copied from dt over to another data table, IncorrectFormat, which stores all the the cells with incorrect input. IncorrectFormat has to store all the incorrect postal codes from top to down in one column.
Is there a easy and efficient way to do this?
Here is what I know so far. Doing “Regex.Matches(yourstring,“[a-zA-Z]”).Count” should compare a string with my regular expression in my brackets which is all the letters, however, how do I do so it checks for columns from an excel spreadsheet?
You may assume I am using Read Range and have the correct file path (“test.xlsx”), sheetname(“page1”) and range("A2:C2)
Attached to this is a screenshot of a For Each loop which (hopefully) illustrates what I want to achieve.
Ps. The In in my For Each loop is Enumerable.Range(0,IncorrectFormat.Columns.Count).ToArray()
Read range with A1:C1 will read only first row and if you have checked headers true then it would have zero rows ideally…If you want to read whole range then you can use A1 only
So what you can do is use a loop on rows first and then loop on columns to check all cells
The question here is if a number is found only cell is to be copied or whole row?
A sample input and output would help in understanding better
First of, I just made a change to my post, I do indeed have headers. I changed range to (“A2:C2”)
Second of, I am sorry, but I do not fully comprehend what you are asking in your reply? I want to check every single cell in all of my columns. Assume every column has 3 rows (realistically I have more than 100+ coloumns and rows), so
A1, A2, A3
B1, B2, B3
C1, C2, C3
I now want to check all of the cells if their contain a letter, don’t matter how many, just a single letter is enough. If a cell contains a letter I want that single cell to be copied over to my other Data Table IncorrectFormat which will then paste all of the incorrect postal codes in another excel spreadsheet, let’s just call that spreadsheet for “IncorrectFormatTest.xlsx”.
So an example could be this
A1 = 123
A2 = 12C3
A3 = 12 DF 5
B1 = FG
B2 = 1234J
B3 = 123478
C1 = 555
C2 = B13B
C3 = 2432
I now want to store A2, A3, B1, B2 and C2 ONLY in my new data table IncorrectFormat which I then will use to paste in my new excel spreadsheet “IncorrectFormatTest.xlsx”.
I hope this makes sense, else, ask as many questions as you want for clarification.
Read the data into datatable(dt) (Again…only giving A2:C2 will read only one row…if you want to read all give only A1 or leave it blank it will read whole sheet)
Pass this expression in assign and the output would be a string with only required values. Here I am selecting rows containing letters and from then getting each cell or column containing letters and concatenating them with New line character
That is so close to what I want. Let me be a bit more pricise in what I want. I actually ONLY want to save the numbers. If there is some text/ characters in my string then I don’t want to save the cell. In your exaple, I only want to save cell(Column2, row 6) and cell(Column1, row 7). Those two cells still need to be saved in a data table which only has one column nammed Incorrect Format, the heading I will do myself. Does this make sense for you?
I don’t need ALL of the rows or ALL of the columns. I need the specific cells where there are text. I will show you a screenshot of the excel document.
Watch how some cells only have numbers, which I do not mind, and some have text ind them. Those cells which has text in them I want to copy and put in another seperate excel document called Incorrect Format.xlsx. How do I do this?
If you see the screenshots I am getting only cells which have text in them…If you want the value that is the way…If not let me know what you want…like the Excel cell A2 is having text then you want A2 as output or row1 column 2 as output
Or you mean you need only number then
str = String.Join(Environment.NewLine,dt.AsEnumerable.Where(function(x) x.ItemArray.Any(function(m) Not System.Text.RegularExpressions.Regex.IsMatch(m.ToString,"[a-zA-z]+"))).Select(function(x) String.Join(Environment.NewLine,x.ItemArray.Where(function(y) Not System.Text.RegularExpressions.Regex.IsMatch(y.ToString,"[a-zA-z]+")))))
Yes, that is excatly what I want. You see how everything else got removed and you now only have numbers? Where do I put that argument? Is that inside my for each loop or where?
And please, can you tell me if your str is a string or a data table value?