I’m trying to reduce the size of a file: if the file is larger than 5 MB, I want to bring it under 5 MB. I can get the size of the file through FileInfo.Length, but when the file is larger than 5 MB I can’t find a query or activity to reduce its size and save it in the same directory.
This way you can use the Invoke PowerShell activity in UiPath, pass the input and output file paths to the script, and the file should be reduced to a smaller size.
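For readers who do not have the script at hand, here is a minimal sketch of what such a Ghostscript call could look like. It is written in Python rather than PowerShell so it can be tried outside UiPath, and it assumes the commonly used pdfwrite options; the function names and file names are illustrative and are not taken from the original script.

```python
import subprocess


def build_gs_command(gs_exe, input_pdf, output_pdf, pdf_setting="ebook"):
    """Build the Ghostscript argument list that rewrites a PDF with a quality preset."""
    return [
        gs_exe,                             # e.g. gswin64c.exe on 64-bit Windows
        "-sDEVICE=pdfwrite",                # write a new PDF
        "-dCompatibilityLevel=1.4",
        "-dPDFSETTINGS=/" + pdf_setting,    # screen, ebook, printer or prepress
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        "-sOutputFile=" + output_pdf,
        input_pdf,
    ]


def compress_pdf(gs_exe, input_pdf, output_pdf, pdf_setting="ebook"):
    """Run Ghostscript; raises CalledProcessError if it exits with an error code."""
    subprocess.run(
        build_gs_command(gs_exe, input_pdf, output_pdf, pdf_setting),
        check=True,
    )
```

Keeping the argument list separate from the call makes it easy to print and inspect the command before anything is executed.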
Thanks for the reply. I downloaded Ghostscript from the link you provided, installed it on my system, and ran the command in PowerShell with the setting replaced by /ebook.
The error means that PowerShell does not recognize the command you are using. This happens because the IsScript option of UiPath's Invoke PowerShell activity needs to be checked.
Point 1: In the Invoke PowerShell activity, tick the IsScript box in the Properties panel.
Point 2: Is your input.pdf the correct path of the file?
You will see that the script text is a little different, but just replace its contents. Before running all of this in UiPath, it is also worthwhile to check whether the command you use runs in PowerShell itself. That way you know you are closer to the solution.
You can modify the variables to match your use-case.
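Along the same lines, one cheap pre-check is whether the Ghostscript executable can be found at all before the command is run. The sketch below is in Python and only an illustration; the helper name find_ghostscript is made up, and the candidate names assume a standard Ghostscript install (gswin64c or gswin32c on Windows, gs elsewhere).

```python
import os
import shutil


def find_ghostscript(candidates=("gswin64c", "gswin32c", "gs")):
    """Return the full path of the first Ghostscript executable found, or None."""
    for name in candidates:
        # Accept an explicit file path as well as a bare name looked up on PATH.
        if os.path.isfile(name):
            return os.path.abspath(name)
        found = shutil.which(name)
        if found:
            return found
    return None
```

If this returns None, the full path to gswin64c.exe has to be passed explicitly, which is what the workflow argument for the Ghostscript location is for.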
After about 2 hours of trying to get the string formatting right for the script, I have a working example for you and others who are interested in reducing the size of PDFs.
In UiPath this is not that straightforward because of the string formatting requirements.
The workflow takes the following arguments, including the PDFSettings option.
I have annotated them in the workflow, which you can refer to. The important argument is the location of Ghostscript (gswin64c.exe); yours may differ, so change it before you run the attached workflow.
Since we are sending strings to PowerShell in script format, we check the IsScript option.
I added some FileInfo details of the input and output files, which you can use to verify whether the reduction was successful. The output would look like
I have not included any exception handling in the PowerShell script or in the workflow, but I am assuming you can add it. I have a write-up on creating error-proof workflows which you can refer to.
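The before/after FileInfo comparison mentioned above can be sketched roughly as follows. This is Python, not the attached workflow, and the function name and dictionary keys are my own; the idea is simply that the reduction counts as successful only if an output file exists and is smaller than the input.

```python
import os


def size_report(input_pdf, output_pdf):
    """Compare file sizes before and after compression."""
    before = os.path.getsize(input_pdf)
    # A missing output file counts as a failed reduction.
    after = os.path.getsize(output_pdf) if os.path.isfile(output_pdf) else 0
    reduced = 0 < after < before
    return {"before_bytes": before, "after_bytes": after, "reduced": reduced}
```

A robot can branch on the reduced flag and log a failure instead of assuming the job went through.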
I have a question: in the arguments section you have used in_PDFSetting. What is this argument for, and why is its default value “screen”? Can you explain its purpose and use?
It is up to you to choose which one suits your use case best. Let's say your PDF is used further by other robot or human processes; you may then want a higher quality.
Examples:
If you feed the output file to OCR engines, the OCR results would be better with the printer or prepress options than with screen, as they retain better quality (contrast in this case).
If the PDF will just be archived, then use screen; this way you have good enough file quality and save space in the long run.
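The choice described in these examples can be written down as a small lookup. The sketch below is Python and purely illustrative: the preset names are Ghostscript's -dPDFSETTINGS values, but the purpose labels ("archive", "reading", "ocr") and the function name are hypothetical.

```python
# Ghostscript presets for -dPDFSETTINGS, roughly from lowest to highest quality.
PRESETS = ("screen", "ebook", "printer", "prepress")

# Hypothetical mapping from "what happens to the PDF next" to a preset.
PRESET_FOR_PURPOSE = {
    "archive": "screen",   # smallest file, lowest image resolution
    "reading": "ebook",    # medium resolution
    "ocr": "printer",      # higher resolution, better input for OCR engines
}


def choose_preset(purpose, default="screen"):
    """Return the preset for a purpose, falling back to the workflow default."""
    return PRESET_FOR_PURPOSE.get(purpose, default)
```

The default of "screen" mirrors the default value of in_PDFSetting in the workflow.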
That is strange. I used a PDF smaller than 1 MB and it does work. If this happens, I think you need to read the PDF content and do some exception handling so that the robot does not conclude the job was successful.
Are you sure the input file was correctly set or has content in it?
The command goes through the file and processes each page, so I am not sure why this behaviour occurs. My guess is that the PDF contains images and not rich text. Just a wild guess.
Yes, if my file size is around 1.6 MB, it gives me the required output without any issues. But if the file is about 91 KB or 313 KB, or if it contains only a single page, I get an output file with a blank page. I have tried changing the Ghostscript settings and still get the same issue; I only get a proper output if the file is larger than 1 MB. Also, the pages in my PDF files are scanned images, yet a file larger than 1 MB gets processed while a smaller one ends up with a blank page.
I guess this is an issue you need to explore on your own. The forum can only help you find a way to your solution and cannot offer a 100% working solution (at least not for complex automations).
Your use case in the question was for PDFs over 5 MB, right? Why don't you filter out PDFs under 5 MB first? That way you do not need to reduce their file size. You will save robot execution time and avoid this failure altogether.
Also, adding a sequence to check whether output.pdf has no characters will be necessary if you continue to use the suggested approach on smaller files.
As for the reason for the blank PDFs, there are a bunch of similar questions on Stack Overflow and the Ghostscript forum dating from 2008 to 2019, which you can study.
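As a starting point for such an output check, here is a minimal Python sketch. It only catches a missing, empty or non-PDF output by looking at the file size and the %PDF- header; it does not detect a blank page, which would require reading the page content with a PDF library. The function name is made up for illustration.

```python
import os


def output_looks_valid(output_pdf):
    """Return True if the file exists, is non-empty and starts with the PDF header."""
    if not os.path.isfile(output_pdf) or os.path.getsize(output_pdf) == 0:
        return False
    with open(output_pdf, "rb") as handle:
        # Every PDF file begins with the bytes "%PDF-" followed by a version.
        return handle.read(5) == b"%PDF-"
```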
I’m facing an issue when I invoke this workflow for multiple files in a For loop: when the input files are in different locations and the output files need to be generated in a different location, the output file is not generated. If both are in the same location, it works fine.
Waiting for your reply, sir. When I tried to loop through the files, I did not get the desired output; I have tried to sort it out in many ways but have not resolved it.
As far as I know, both input and output file paths can be anywhere.
I had also checked whether the code runs when the file paths have spaces in them.
I would ask you to check the command that is generated.
Copy the string pasted in the Invoke PowerShell activity into a Message Box or Log Message while debugging. Inspect the command and try running it in PowerShell. If it does not work there, then you know why it does not work.
So, debug inside the For loop with a Message Box or Log Message of the command string.
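To illustrate what "inspect the command" means in practice, the Python sketch below builds a single PowerShell line from an executable and its arguments and prints it. It assumes single-quoted PowerShell strings (an embedded single quote is doubled) and the & call operator; every path shown is hypothetical and only there to show how spaces are handled.

```python
def ps_quote(text):
    """Wrap text in PowerShell single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


def to_powershell_command(args):
    """Join an executable and its arguments into one line using the & call operator."""
    return "& " + " ".join(ps_quote(arg) for arg in args)


# Hypothetical paths containing spaces, just to show the quoting.
command = to_powershell_command([
    r"C:\Program Files\gs\bin\gswin64c.exe",
    "-sDEVICE=pdfwrite",
    "-dPDFSETTINGS=/screen",
    "-dNOPAUSE",
    "-dBATCH",
    r"-sOutputFile=D:\out files\small.pdf",
    r"D:\in files\big.pdf",
])
print(command)  # log this string, then paste it into a PowerShell window
```

If the printed line runs in PowerShell for one file but the loop still fails, the difference is most likely in the paths the loop generates rather than in the command itself.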
1. This one I tried through a For loop, getting the path of the folder and taking only the PDFs from it for further processing.
2. With the sample input file but without the loop, giving the file name directly and writing the output to a different location, I am able to get the desired output.
Likewise, when I try multiple PDF files as in the first image, I get an output with a blank page, or sometimes the output is not generated at all.
I have also tried using Log Message, and found that only this message is printed in the Output panel: “Failure : The size reduction seem to have not worked as required.”
Kindly check whether you have any idea why I am getting this issue.
I have even tried passing only the files larger than 5 MB into the loop.
The process of the workflow:
1. Getting the file directory
2. Assigning the maximum file size value
3. Iterating through the directory
4. Checking whether the file size is less than the maximum file size
5. If false, the file is larger than 5 MB, so I need to compress it.
5.1 The Invoke PowerShell activity runs
But in the end I get the log message: “Failure : The size reduction seem to have not worked as required.”
and the output file is not generated, since it throws an error.
Kindly check whether there is any mistake in the logic.
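For comparison, the steps listed above can be sketched in Python as follows. This is not the UiPath workflow, only an outline of the same logic under a few assumptions: the folder names are placeholders, the function name is made up, and the 5 MB limit is taken as 5 * 1024 * 1024 bytes. Two things worth confirming in the real loop are that a full output path is built for each file, and that the output folder exists before Ghostscript is called.

```python
from pathlib import Path

MAX_BYTES = 5 * 1024 * 1024  # the 5 MB limit from the question


def plan_compression(input_dir, output_dir, max_bytes=MAX_BYTES):
    """Return (input, output) path pairs for every PDF larger than max_bytes."""
    pairs = []
    for pdf in sorted(Path(input_dir).glob("*.pdf")):
        if pdf.stat().st_size > max_bytes:
            # Build a full output path per file so outputs never overwrite each other.
            pairs.append((pdf, Path(output_dir) / pdf.name))
    return pairs
```

Logging each pair before invoking PowerShell shows at a glance whether the loop hands the workflow the paths you expect.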