Save decoded PDF after HTTP Request

Hello everyone :slight_smile:

I have a problem saving a PDF file using the HTTP Request activity.
I can save the file, but when I try to open it, it shows an error message.
Our IT told me that the downloaded file is base64 encoded, so I have to decode it to UTF-8.

I was able to extract the DocumentContent from the HTTP Response.
Then I try to decode it with:
System.Text.Encoding.UTF8.GetString(Convert.FromBase64String(strDokumentContent))

I’m getting a string that seems to be well decoded to UTF-8.
Using “Write Text File” I’m able to save this String as a PDF - but I’m unable to open it. :frowning:

The error I’m receiving in Acrobat Reader is:


“There was an error processing a page. Invalid ColorSpace.”

Opening it in the Edge Browser returns a blank sheet.

When I open it with an editor it looks like a valid UTF-8 string to me, but I’m not experienced in this field.

Here are 2 screenshots of the document opened in an editor:



Do you have any suggestions how to decode and save the PDF?
I would gratefully appreciate your ideas :slight_smile:

@Krissi

Can you confirm this: is the response that you are saving text?

I guess you are getting a base64-encoded string as text, and you need to decode it and save it as a PDF.

But you are trying to save the content as a file and then read and decode it.

And to save it as a PDF, try using the Word document activities.

Cheers

I’m not saving the PDF, then reading it again and trying to decode it afterwards.
I’m using the response of the HTTP Request. Then I’m using Deserialize JSON. Afterwards I’m trying to get the document content through: CStr(jsonDeserialized.GetValue("dokumentContent"))
There I get the base64 string of the content.
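
For readers following along, the extraction step described here could look roughly like the sketch below. It is only an illustration under assumptions: the sample response body is made up, and it assumes the deserialized object is a Newtonsoft.Json JObject (which the GetValue call suggests). Only the key name dokumentContent comes from the post.

```vbnet
Imports System
Imports Newtonsoft.Json.Linq

Module ExtractBase64Sketch
    Sub Main()
        ' Hypothetical response body; "JVBERi0xLjQK" is the base64 form of "%PDF-1.4" plus a line feed.
        Dim responseText As String = "{""dokumentContent"": ""JVBERi0xLjQK""}"

        ' Equivalent of the Deserialize JSON step.
        Dim jsonDeserialized As JObject = JObject.Parse(responseText)

        ' Returns Nothing when the key is missing, so check before decoding.
        Dim strBase64 As String = jsonDeserialized.Value(Of String)("dokumentContent")
        If String.IsNullOrEmpty(strBase64) Then
            Throw New InvalidOperationException("dokumentContent is missing or empty.")
        End If

        Console.WriteLine(strBase64)
    End Sub
End Module
```

At this point the value is still base64 text; it only becomes PDF content once it is decoded to bytes.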

@Krissi
We had a similar issue.
In the HTTP Request activity, use the "Filename for Response Attachment" option and then refer to the saved output file.


I’ve tried to use the Word activities, but it didn’t work.
There I tried to append the string to a Word document and then export it as a PDF.
But it was just the UTF-8 text, like one can see in the editor.
Maybe I was using it the wrong way? :question:

Thank you a lot for this suggestion.
When I’m using this option, I’m not able to decode the file from base64 to UTF-8 before opening it.
So when I open the file it shows me the error:
could not open “” because it is either not a supported file type or because the file has been damaged. :frowning:

Yes, as mentioned, we had the same issue.
While receiving the content from the web/app server, the encoding was broken.

So the best is to sort it out step by step

Step 1: when using the option from above a file is created. What is the file name / content.type?

still it can be handled

I was trying with
Test.pdf and test.txt.
What’s better?
What would be your suggestion? :slight_smile:

Use test.txt, then open it in Notepad++ and inspect the content.

BTW, is it a public URL, so we can crosscheck from our end as well?

Unfortunately it’s not a public URL.
It’s a PDF file hosted on internal servers.

I’ll try it with test.txt.
What should I look for with notepad++?

the content in general (maybe share some screenshots) and the encoding info bottom right

Unfortunately we don’t have Notepad++ anymore :sweat_smile:
I’ll use another tool to inspect the .txt - maybe these screenshots can help.



ok, so in general we would

  • read in the file - maybe handling the encoding
  • parse it to Json
  • extract the content value
  • decode Base64 (if it is base64)
  • save it as a file with a .pdf extension in the filename
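
The steps in the list above can be sketched in VB.NET roughly as follows. This is a minimal sketch under assumptions, not the exact workflow from the thread: the helper name SavePdfFromResponseFile, the sample file and the temp paths are hypothetical, and Newtonsoft.Json is assumed for the parsing step.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports Newtonsoft.Json.Linq

Module PipelineSketch
    ' Hypothetical helper: reads a saved JSON response, extracts one base64 field
    ' and writes the decoded bytes to pdfPath.
    Sub SavePdfFromResponseFile(responsePath As String, fieldName As String,
                                pdfPath As String, fileEncoding As Encoding)
        ' 1. Read in the file with an explicit encoding.
        Dim responseText As String = File.ReadAllText(responsePath, fileEncoding)

        ' 2. Parse it to JSON and 3. extract the content value.
        Dim json As JObject = JObject.Parse(responseText)
        Dim base64 As String = json.Value(Of String)(fieldName)
        If String.IsNullOrEmpty(base64) Then
            Throw New InvalidDataException("Field '" & fieldName & "' is missing or empty.")
        End If

        ' 4. Decode Base64; this throws a FormatException if the value is not valid base64.
        Dim bytes As Byte() = Convert.FromBase64String(base64)

        ' 5. Save the raw bytes with a .pdf extension (no text encoding involved).
        File.WriteAllBytes(pdfPath, bytes)
    End Sub

    Sub Main()
        ' Made-up sample file standing in for the saved HTTP response.
        Dim responsePath As String = Path.Combine(Path.GetTempPath(), "test.txt")
        Dim pdfPath As String = Path.Combine(Path.GetTempPath(), "Test.pdf")
        File.WriteAllText(responsePath, "{""dokumentContent"": ""JVBERi0xLjQK""}", Encoding.UTF8)

        SavePdfFromResponseFile(responsePath, "dokumentContent", pdfPath, Encoding.UTF8)
        Console.WriteLine(New FileInfo(pdfPath).Length)
    End Sub
End Module
```

The sample payload only contains a PDF header, so the resulting file is not a viewable document; it just shows the order of the steps.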

Shortcut: use an online decoder and test the content (when it works, fine; when it fails, ignore the result, as too many factors can falsify the test).

When possible feel free to share with us the text file


When the project is set to Compatibility: Windows, handle the encoding as described here:
https://forum.uipath.com/t/infoset-legacytowindows-migration-demo-of-a-prototypical-migration/555711#fix-unsupported-encoding-name-11

Trying these steps:

  • read Text File - without handling the encoding

  • parsing it to Json - is it ok to use the Deserialize JSON to create a json Object from the string?

Saved it like you suggested, but I still get the same error as when I’m getting the content straight from the http Response

No, you should handle the encoding when reading in the file.

Do I have to enter the encoding of the source format, or the one of the target format?
I can’t convert it there, I just enter the current encoding format, is that right?
Thank you so much for your patience!

Is it even possible to receive PDFs that contain pictures with this method?

From your screenshot we got the hint that the file is encoded in Windows-1252, so we will use this for the first step. Kindly note: we already shared some additional information on it with you.
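
Reading a file as Windows-1252 could look roughly like the sketch below. It assumes a project running on .NET Core / .NET 5 or later, where code page encodings are only available after registering CodePagesEncodingProvider (depending on the project this may need the System.Text.Encoding.CodePages package). The file name and sample content are made up.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module ReadWindows1252Sketch
    Sub Main()
        ' Without this registration, Encoding.GetEncoding(1252) is not supported on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)
        Dim win1252 As Encoding = Encoding.GetEncoding(1252)

        ' Made-up stand-in for the saved response file.
        Dim responsePath As String = Path.Combine(Path.GetTempPath(), "test.txt")
        File.WriteAllText(responsePath, "{""dokumentContent"": ""JVBERi0xLjQK""}", win1252)

        ' Read the file with the encoding the editor reported.
        Dim responseText As String = File.ReadAllText(responsePath, win1252)
        Console.WriteLine(responseText)
    End Sub
End Module
```

Base64 itself only uses ASCII characters, so the encoding chosen here mainly matters for the JSON around the payload.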

unfortunately I wasn’t able to extract the PDF with your suggestion.
Meanwhile we found a solution:

We use the following steps:

  • extract the Data from HTTP Request
  • deserialize the Response-String
  • get the document Content from the json → returns the base64 String
  • use the Invoke Code activity to save the base64 string into a PDF file

The Script is:

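' Decode the base64 string into the raw PDF bytes
Dim bytes As Byte() = Convert.FromBase64String(in_strBase64)
' Write the bytes unchanged; no text encoding is applied
System.IO.File.WriteAllBytes(in_strPath, bytes)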

The arguments used are the received base64 string (in_strBase64) and the path where the PDF is saved (in_strPath).
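
A short sketch of why writing the bytes works where the UTF-8 route from the first post did not: a PDF with pictures contains binary data, and turning arbitrary bytes into a UTF-8 string replaces every invalid sequence with a replacement character, so the original bytes cannot be recovered. That is probably what led to the "Invalid ColorSpace" error, although this is an assumption. The four sample bytes are made up for illustration.

```vbnet
Imports System
Imports System.Linq
Imports System.Text

Module BytesVersusTextSketch
    Sub Main()
        ' Made-up binary sample; these bytes are not valid UTF-8.
        Dim original As Byte() = {&HFF, &HD8, &HFF, &HE0}
        Dim base64 As String = Convert.ToBase64String(original)

        ' Route from the first post: base64 -> bytes -> UTF-8 string -> bytes written as text.
        Dim asText As String = Encoding.UTF8.GetString(Convert.FromBase64String(base64))
        Dim viaText As Byte() = Encoding.UTF8.GetBytes(asText)

        ' Route from the solution: base64 -> bytes, written unchanged.
        Dim direct As Byte() = Convert.FromBase64String(base64)

        Console.WriteLine(original.SequenceEqual(viaText)) ' False: the bytes were altered
        Console.WriteLine(original.SequenceEqual(direct))  ' True: the bytes are intact
    End Sub
End Module
```

As a quick sanity check before saving, the decoded bytes of a real PDF should start with the characters %PDF-.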