How to remove garbage characters from a string


#1

Hi,

I am extracting a string using a dynamic position index, where the length of the text is also not known.
So I am extracting extra characters to its left and right as well. When I put this in a message box, it shows blank spaces, but when I trim it and use it in a selector, it shows bad characters. For example:
I have to extract one string from the full page, e.g. ‘UIPath’, but this string can be of variable length, i.e. it can also be ‘UiPath12’ or ‘Studio’ or ‘Robot’ etc., which means it could be anything.

So I am extracting additional characters before and after that position, like ‘*UIPath*’, where * is a blank space. But when I use it in a selector, it shows up as bad characters, like ‘#ADUIPath&D#’.

Can you please suggest how I can track down and remove those bad characters?



#2

Any Help?


#3

It’s not clear based on your explanation what exactly is happening. Can you provide an example workflow? Your example doesn’t make any sense, because you say you’re looking for ‘UiPath’ but then say you’d also accept ‘Studio’ or ‘Robot’ etc. Not sure how that association is being made.

Here are my assumptions, please correct me if I’m wrong:

  • You have a string that is variable in length and format
  • You want to get a substring from that string
  • You know the start index, but not the length of the substring
  • Inside of your substring, there are characters that need to be removed

Your first step is identifying how to get the initial substring. Is there a clear end index? If so, you can use the start index and end index to pull out the substring. Otherwise, what patterns can you use to search? A pattern would allow you to split the string or use regex to pull out your substring. If no pattern can be found, then there is nothing reliable to work with.
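As a rough illustration of those two options, here is a minimal Python sketch (UiPath itself runs on .NET, so this is only meant to show the idea). The function names, the sample page text, and the assumption that the wanted value is a run of letters and digits are all hypothetical; adjust the pattern to whatever your real data looks like.

```python
import re

# Assumed pattern: the wanted value is an unbroken run of ASCII letters/digits.
TOKEN = re.compile(r"[A-Za-z0-9]+")

def extract_by_indices(text: str, start: int, end: int) -> str:
    """Option 1: both the start and the end index are known."""
    return text[start:end]

def extract_token_at(text: str, start: int) -> str:
    """Option 2: only the start index is known.

    Searches from `start` onward and returns the first run of letters/digits,
    so surrounding blanks or invisible characters are never captured.
    """
    match = TOKEN.search(text, start)
    return match.group(0) if match else ""

# Hypothetical page text with a non-breaking space (U+00A0) around the value.
page = "Product: \u00a0UiPath12\u00a0 | Version"
print(extract_by_indices(page, 10, 18))
print(extract_token_at(page, 8))
```

Matching only the characters you expect, rather than grabbing a fixed-width window and trimming afterwards, tends to avoid picking up stray characters in the first place.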

Your second step is to repeat step 1, but with your newly found substring. Again, if a pattern is not found, then it may not be possible to automate.

For the garbage characters, where are you pulling from? Is it encoded in any way?
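One way to answer that question is to look at the actual code points instead of how the characters render. The sketch below is in Python purely for illustration and uses hypothetical helper names; it lists each character's code point, then strips whitespace, control, and invisible format characters from both ends. Which characters count as "garbage" here is an assumption, so check the listing against your real data first.

```python
import unicodedata

def describe_chars(s: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for every character, visible or not."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in s]

def is_garbage(ch: str) -> bool:
    """Assumed definition: whitespace, control (Cc), or format (Cf) characters."""
    return ch.isspace() or unicodedata.category(ch) in ("Cc", "Cf")

def strip_garbage(s: str) -> str:
    """Remove garbage characters from both ends, leaving the middle untouched."""
    start, end = 0, len(s)
    while start < end and is_garbage(s[start]):
        start += 1
    while end > start and is_garbage(s[end - 1]):
        end -= 1
    return s[start:end]

# Hypothetical raw value: CR, LF, non-breaking space, zero-width space, tab.
raw = "\r\n\u00a0UiPath\u200b\t"
for code_point, name in describe_chars(raw):
    print(code_point, name)
print(repr(strip_garbage(raw)))
```

A message box can show such characters as blanks even though they are not ordinary spaces, and a plain trim may leave some of them behind. In Python, for instance, `str.strip()` does not remove the zero-width space U+200B.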