While scraping data I am getting some bad data that needs cleansing. When it happens, there is a lot of it.
My script builds up a dt row over the course of several web scrapes, then at the end of the loop writes to a .CSV file before starting the scrape process again. I do it this way so progress is not lost on crash/recovery.
The file it writes to is very, very large, so for resource efficiency sake I don’t want to try to cleanse the whole data table or the CSV every loop. What I’d prefer to do is to cleanse just the latest dt row between the last update to the dt row and the write of the dt to CSV.
// Last dt row update. Mark as Done
dtASlist.Rows(dtASlist.Rows.IndexOf(row))(“GotAllData”) = “Done”
// Write updated DT to the CSV
Write dtASlist (include headers) to strFilePath_AScsv
Between those two lines, what can I add to clean the dt row of bad data without impacting the good data.
dt row with bad data example:
83.06%, $14.32, ###, ---, -, $11.30, -9.57, -3.81%, -$0.05
Need to end up with:
83.06, 14.32, , , , 11.30, -9.57, -3.81, -0.05
I need to remove “-” and “–” only if it is the entire content of the column (i.e. The cleanse must not change a negative integer into a positive integer), I also need to remove dollar signs and percent signs where ever they appear
How can this dt row data cleanse be done? Please provide code/screenshots as I’m inexperienced with UIpath. Thanks