Encoding Problem

Hi everyone,

I have a problem with this encoding:
What I have:
marie_jã©rã´
What I want to get:
marie_jérôme
Do you have any solution for me, please?
I think that I have to convert the text from ISO-8859-1 to UTF-8.
I tried the solution below but it didn’t work:
Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(Message);
byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
string msg = iso.GetString(isoBytes);

Best regards,

What is the source of the text (text file, CSV…)? How is the data read in (read text, read CSV…)?

Hello @ppr
The text is a String obtained by scraping the HTML source of the web page.
This is one of the URLs that I got
https://am.oddo-bhf.com/content/img/marie_jã©rã´me_blackwhite_print.jpg
when I scrape the page:
https://am.oddo-bhf.com/france/fr/investisseur_professionnel/ad/identite/1010/media_kit/1030

I got it as follows from Extract Data:
image

Can you share some more details on your implementation, e.g. screenshots of the relevant part?

Hi @ppr
Thank you for your response.
Below you can find screenshots from my workflow, where I’m trying to extract the links.



image

It looks like something is going wrong with the encoding:
The website is served as UTF-8.
Write Text uses UTF-8 by default.
The internal code does not work with ISO-8859-1.
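To illustrate the point about the conversion direction, here is a minimal sketch, assuming the garbled text is UTF-8 that was decoded as ISO-8859-1 somewhere along the way. The attempt in the first post encodes to UTF-8 and converts to ISO-8859-1, which goes the opposite way. The class and method names (`MojibakeDemo`, `Repair`) are made up for this example and are not part of any library.

```csharp
using System;
using System.Text;

class MojibakeDemo
{
    // Hypothetical helper: undo "UTF-8 bytes read as ISO-8859-1".
    static string Repair(string garbled)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        // Step 1: turn each garbled character back into the single byte it came from.
        byte[] rawBytes = latin1.GetBytes(garbled);

        // Step 2: decode those bytes as what they really were, namely UTF-8.
        return Encoding.UTF8.GetString(rawBytes);
    }

    static void Main()
    {
        // "marie_jÃ©rÃ´me" written with escapes so the source file encoding does not matter.
        string garbled = "marie_j\u00C3\u00A9r\u00C3\u00B4me";

        string repaired = Repair(garbled);

        // Compare against the expected text, also written with escapes (é = U+00E9, ô = U+00F4).
        Console.WriteLine(repaired == "marie_j\u00E9r\u00F4me");
    }
}
```

One caveat: the sample in this thread shows a lowercase "ã" rather than "Ã", which suggests the string was also lowercased after it was mis-decoded. In that case the round trip above would most likely not recover the original bytes, and the more reliable fix is to decode the page as UTF-8 at the point where it is read.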

Also take note of the following:
image
Or the API

For encoding analysis we recommend Notepad++ to check the results, as the encoding is displayed at the bottom right of the editor.

For retrieval there are also a lot of alternatives that can help you fix your approach:

But let us know what the main goal is and we will look into solution approaches.

@ppr I want to extract only the links that are in “href”.
I’m loading the HTMLText in an HTML document to be able to extract the linkNodes which is a collection.

I cannot use the other alternatives because I’m scraping all the links of the site https://am.oddo-bhf.com/france/fr/investisseur_professionnel/home and not only the link I provided in my problem description.
Can you please share with me the workflow you used to get the result below?


Thank you :)

OK, once all the links are scraped, where/how do you want to store them?

There are alternatives:

  • find children and post processing
  • the approach that you used - Agility Pack
  • Get Attribute and XML processing, e.g. with LINQ or other .NET XML APIs

In a DataTable.
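For the Agility Pack route, a rough sketch of extracting the href values into a DataTable could look like the following. It assumes the HtmlAgilityPack NuGet package is installed and that the page really is UTF-8, as noted earlier in the thread. The names `LinkScraper` and `ExtractLinksAsync` and the column name "Href" are made up for this example. The bytes are decoded explicitly as UTF-8 so that no encoding is guessed along the way.

```csharp
using System;
using System.Data;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

class LinkScraper
{
    // Hypothetical helper: download a page, decode it as UTF-8, collect all href values.
    static async Task<DataTable> ExtractLinksAsync(string pageUrl)
    {
        using var client = new HttpClient();

        // Fetch raw bytes and decode them ourselves (assumption: the page is UTF-8).
        byte[] raw = await client.GetByteArrayAsync(pageUrl);
        string html = Encoding.UTF8.GetString(raw);

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var table = new DataTable("Links");
        table.Columns.Add("Href", typeof(string));

        // SelectNodes returns null when nothing matches, so guard against that.
        HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes != null)
        {
            foreach (HtmlNode node in linkNodes)
            {
                table.Rows.Add(node.GetAttributeValue("href", string.Empty));
            }
        }

        return table;
    }

    static async Task Main()
    {
        DataTable links = await ExtractLinksAsync(
            "https://am.oddo-bhf.com/france/fr/investisseur_professionnel/home");

        foreach (DataRow row in links.Rows)
        {
            Console.WriteLine(row["Href"]);
        }
    }
}
```

Note that the image URL from the first post would sit in an `img` tag's `src` attribute rather than in an `href`, so the XPath would need adjusting (for example `//img[@src]`) if those are wanted as well.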

I found the solution in this link
