Save decoded PDF after HTTP Request

Krissi · June 21, 2023, 9:04am

Hello everyone

I have a problem saving a PDF file using the HTTP Request activity.
I can save the file, but when I try to open it, I get an error message.
Our IT team told me that the downloaded file is base64-encoded, so I have to decode it to UTF-8.

I was able to extract the DocumentContent from the HTTP Response.
Then I try to decode it with:
System.Text.Encoding.UTF8.GetString(Convert.FromBase64String(strDokumentContent))

I’m getting a string that seems to be well decoded to UTF-8.
Using “Write Text File” I’m able to save this String as a PDF - but I’m unable to open it.
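
A likely explanation, sketched in Python for illustration only (the thread itself uses VB.NET expressions in UiPath): a PDF is binary data, and many of its bytes are not valid UTF-8. Decoding those bytes to a string replaces the invalid ones, so writing the string back out does not reproduce the original file. The sample bytes below are a made-up stand-in for real PDF content.

```python
import base64

# Hypothetical stand-in for PDF content: an ASCII header followed by raw binary bytes.
original = b"%PDF-1.7\n" + bytes([0x25, 0xE2, 0xE3, 0xCF, 0xD3, 0x0A, 0x00, 0xFF, 0x80])
encoded = base64.b64encode(original).decode("ascii")

# Step 1: base64 -> bytes. This step is lossless.
raw = base64.b64decode(encoded)

# Step 2: bytes -> text as UTF-8. Invalid byte sequences become U+FFFD.
as_text = raw.decode("utf-8", errors="replace")

# Step 3: text -> bytes again, which is what a text-file writer effectively does.
round_tripped = as_text.encode("utf-8")

print("bytes after base64 decode match original:", raw == original)
print("bytes after text round trip match original:", round_tripped == original)
```

The ASCII parts of the file survive, which may be why the result still looks plausible in an editor, while the binary parts are altered.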

The error I’m receiving in Acrobat Reader is:

“There was an error processing a page. Invalid ColorSpace.”

Opening it in the Edge Browser returns a blank sheet.

When I open it with an editor it looks like a valid UTF-8 string to me, but I’m not experienced in this field.

Here are two screenshots of the document opened in an editor:

…

Do you have any suggestions how to decode and save the PDF?
I would greatly appreciate your ideas.

Anil_G · June 21, 2023, 9:07am

@Krissi

Can you confirm whether the response you are saving is text?

I guess you are getting a base64-encoded string as text, and you need to decode it and save it as a PDF.

But it looks like you are saving the content as a file, then reading it back and trying to decode it.

To save it as a PDF, try using the Word document activities.

Cheers

Krissi · June 21, 2023, 9:13am

I’m not saving the PDF, reading it back, and then trying to decode it.
I’m using the response of the HTTP Request. Then I use Deserialize JSON. Afterwards I get the document content with: CStr(jsonDeserialized.GetValue("dokumentContent"))
That gives me the base64 string of the content.
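
A quick way to check that the extracted value really is a base64-encoded PDF: a PDF file normally starts with the bytes `%PDF-`, and the base64 form of that header begins with `JVBERi0`. The sketch below is in Python for illustration; the JSON text is a hypothetical stand-in for the real response, and only the field name `dokumentContent` comes from the thread.

```python
import base64
import json

# Hypothetical response body; a real one would carry the full document.
response_text = '{"dokumentContent": "JVBERi0xLjcK"}'

payload = json.loads(response_text)
b64_content = str(payload["dokumentContent"])

# "%PDF-" followed by a version digit encodes to this base64 prefix.
looks_like_pdf = b64_content.startswith("JVBERi0")

# Decoding gives bytes; the first five should be the PDF header.
header = base64.b64decode(b64_content)[:5]

print(looks_like_pdf, header)
```

If the prefix check fails, the field may hold something other than a base64 PDF, and decoding it will not give a usable file.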

ppr · June 21, 2023, 9:16am

@Krissi
We had a similar issue.
In the HTTP Request activity, use the "Filename for Response Attachment" option and then refer to the saved output file.

Krissi · June 21, 2023, 9:16am

I’ve tried to use the Word activities, but it didn’t work.
I tried to append the string to a Word document and then export it as a PDF.
But the result was just the UTF-8 text, like the one you can see in the editor.
Maybe I was using it the wrong way?

Krissi · June 21, 2023, 9:26am

Thank you a lot for this suggestion.
When I’m using this option, I’m not able to decode the file from base64 to UTF-8 before opening it.
So when I open the file it shows me the error:
could not open “” because it is either not a supported file type or because the file has been damaged.

ppr · June 21, 2023, 9:29am

Yes, as mentioned, we had the same issue.
When receiving the content from the web/app server, the encoding was broken.

So the best approach is to sort it out step by step.

Step 1: when using the option above, a file is created. What is the file name / content type?

ppr · June 21, 2023, 9:30am

still it can be handled

Krissi · June 21, 2023, 9:33am

Was trying with
Test.pdf and test.txt
What’s better?
What would be your suggestion?

ppr · June 21, 2023, 9:35am

Use test.txt, then open it in Notepad++ and inspect the content.

BTW, is it a public URL, so we can cross-check from our end as well?

Krissi · June 21, 2023, 9:37am

Unfortunately it’s not a public URL.
It’s a PDF file hosted on internal servers.

I’ll try it with test.txt.
What should I look for with notepad++?

ppr · June 21, 2023, 9:38am

the content in general (maybe share some screenshots) and the encoding info bottom right

Krissi · June 21, 2023, 9:43am

Unfortunately we don’t have Notepad++ anymore.
I’ll use another tool to inspect the .txt - maybe these screenshots can help.

ppr · June 21, 2023, 9:49am

ok, so in general we would

read in the file - maybe handling the encoding
parse it to JSON
extract the content value
decode Base64 (if it is Base64)
save it as a file with a .pdf extension on the filename
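
The steps above can be sketched as follows, in Python for illustration rather than as UiPath activities. The function name, the file names, and the sample data are hypothetical, and the `cp1252` default is only an assumption; use whatever encoding the saved response file actually has.

```python
import base64
import json
import tempfile
from pathlib import Path


def save_pdf_from_response_file(response_path, pdf_path,
                                encoding="cp1252", field="dokumentContent"):
    """Read a saved JSON response and write its base64 field out as a binary file."""
    text = Path(response_path).read_text(encoding=encoding)   # read in the file
    payload = json.loads(text)                                 # parse it to JSON
    b64_content = payload[field]                               # extract the content value
    # validate=True raises an error if the value is not clean base64.
    pdf_bytes = base64.b64decode(b64_content, validate=True)   # decode Base64
    Path(pdf_path).write_bytes(pdf_bytes)                      # save as bytes, not as text
    return len(pdf_bytes)


# Demo with made-up data standing in for a real response file.
with tempfile.TemporaryDirectory() as tmp:
    response_file = Path(tmp) / "test.txt"
    pdf_file = Path(tmp) / "test.pdf"
    sample = b"%PDF-1.7\n\xe2\xe3\xcf\xd3\n"
    body = json.dumps({"dokumentContent": base64.b64encode(sample).decode("ascii")})
    response_file.write_text(body, encoding="cp1252")

    written = save_pdf_from_response_file(response_file, pdf_file)
    restored = pdf_file.read_bytes()

print(written, restored == sample)
```

The important design choice is the last step: the decoded content is written as bytes and never converted to a string.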

Shortcut: use an online decoder and test the content (when OK - fine; when failing - ignore, as too many factors can falsify the test).

If possible, feel free to share the text file with us.

ppr · June 21, 2023, 10:00am

When the project is set to Compatibility: Windows, handle the encoding as described here:
https://forum.uipath.com/t/infoset-legacytowindows-migration-demo-of-a-prototypical-migration/555711#fix-unsupported-encoding-name-11

Krissi · June 21, 2023, 10:05am

Trying these steps:

read Text File - without handling the encoding
parsing it to Json - is it ok to use the Deserialize JSON to create a json Object from the string?

I saved it as you suggested, but I still get the same error as when I take the content straight from the HTTP response.

ppr · June 21, 2023, 10:21am

No, you should handle the encoding when reading in the file.

Krissi · June 21, 2023, 10:30am

Do I have to enter the encoding of the source format, or the one of the target format?
I can’t convert it there, I just enter the current encoding, right?
Thank you so much for your patience!

Is it even possible to receive PDFs that contain pictures with this method?

ppr · June 21, 2023, 10:37am

From your screenshot we got the hint that the file is encoded as Windows-1252, so we will use this for the first step. Kindly note: we already shared some additional information on it with you.

Krissi · June 21, 2023, 1:08pm

unfortunately I wasn’t able to extract the PDF with your suggestion.
Meanwhile we found a solution:

We use the following steps:

extract the Data from HTTP Request
deserialize the Response-String
get the document Content from the json → returns the base64 String
use Invoke Code to save the base64 string to a PDF file

The Script is:

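' Decode the base64 string to raw bytes; do not convert them to text.
Dim bytes As Byte() = Convert.FromBase64String(in_strBase64)
' Write the bytes unchanged to the target path.
System.IO.File.WriteAllBytes(in_strPath, bytes)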

The arguments used are the received base64 string (in_strBase64) and the path where the PDF is saved (in_strPath).
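
As a rough sanity check on the saved file, one can test for the PDF header at the start and an `%%EOF` marker near the end. This is a minimal sketch in Python for illustration; the function name and the sample files are hypothetical. It catches a file that is still base64 text or was cut off, but it does not catch every kind of corruption, since a file damaged by a text round trip can keep both markers.

```python
import tempfile
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file starts with the PDF header and has an EOF marker near its end."""
    data = Path(path).read_bytes()
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


# Demo with made-up files.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.pdf"
    bad = Path(tmp) / "bad.pdf"

    # Minimal stand-in with the expected markers (not a complete PDF).
    good.write_bytes(b"%PDF-1.7\n1 0 obj\nendobj\n%%EOF\n")
    # Content that was saved while still base64-encoded.
    bad.write_bytes(b"JVBERi0xLjcK")

    good_ok = looks_like_pdf(good)
    bad_ok = looks_like_pdf(bad)

print(good_ok, bad_ok)
```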
