Font Style, Font Color and Font Size of Text in PDF

Hi Friends,

I want to extract text from a PDF file and get the font style, font color and font size of the extracted text.

Extracting text from a PDF is pretty easy.

How do I perform text-format-related activities?

Thanks

Hi, if you extract the text with formatting, what will you need to do with it later?

Hi @bcorrea,

The PDF I am talking about is a report.
I would like to check that the report is formatted as expected.

e.g. if the header of a paragraph should be in the “Verdana” font at size 12, I need to verify this.

Thanks
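
A check like the one described above could be sketched in Python, once the font information has been pulled out of the PDF by some other means. This is only a sketch under assumptions: each piece of text is taken to be a dictionary with `text`, `font`, `size` and `color` keys (the shape some PDF libraries report per text span, for example PyMuPDF's `page.get_text("dict")`, where `color` is an sRGB integer). The sample data, the `check_span` helper and the size tolerance are hypothetical.

```python
from typing import Optional


def color_to_hex(srgb: int) -> str:
    """Turn an sRGB integer such as 0xFF0000 into '#ff0000'."""
    return f"#{srgb:06x}"


def check_span(span: dict, font: str, size: float,
               color: Optional[str] = None, tol: float = 0.1) -> list:
    """Return a list of mismatch messages; an empty list means the span passes."""
    problems = []
    # Embedded fonts are often subset-prefixed (e.g. 'ABCDEF+Verdana'),
    # so compare on a substring rather than strict equality.
    if font.lower() not in span["font"].lower():
        problems.append(f"font is {span['font']!r}, expected {font!r}")
    if abs(span["size"] - size) > tol:
        problems.append(f"size is {span['size']}, expected {size}")
    if color is not None and color_to_hex(span["color"]) != color.lower():
        problems.append(
            f"color is {color_to_hex(span['color'])}, expected {color.lower()}"
        )
    return problems


# Hypothetical sample data standing in for spans extracted from a report.
spans = [
    {"text": "Quarterly Report", "font": "ABCDEF+Verdana", "size": 12.0, "color": 0x000000},
    {"text": "Summary", "font": "Arial", "size": 11.0, "color": 0x000000},
]

for span in spans:
    issues = check_span(span, font="Verdana", size=12, color="#000000")
    print(span["text"], "->", "OK" if not issues else "; ".join(issues))
```

The same function could be called in a loop over many reports, so nothing has to be checked by eye.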

I would go for automating the PDF reader itself with those documents opened and see if you can see the font name. PDF activities that read the document as text will not help you…

@bcorrea,
How do you automate the PDF reader itself?

Thanks

Just make the robot open your PDF reader like any other application… For example:
image

@bcorrea,

I did this earlier, but it does not help me extract the text formats.

This is just to open it… you will need to do the rest with common activities…

Hi @loginerror,

Is there a solution to what I posted here?

Thanks

Hi, I'm sorry you could not find a solution yet. Let me see if I can help you a little more. Once you have your PDF file open in your reader, can you extract this text info as a human? For example, if you select the text from the header, copy it to the clipboard and then paste it into a Word file, Word will show you the format used… So, my idea would be to either do it like my example, or export the whole file to Word format and see its formats there…

No worries.
So what are the activities I can use for checking the text format in Word?
Or do I need to check the format with the naked eye? I have 100 different PDFs and it’s not a good idea to check them manually.

Thanks

With all respect, I think you are not making a lot of effort here. See, I opened a PDF file in my Word application, selected some text, and it shows me its current format:

I hope you read and understood my statement above: “do I need to check the format with the naked eye”.
If so, then I am not automating this. I have 100 different PDF forms and it will not be easy to check each and every one of them.

Of course you wouldn’t be doing it with your eyes. That is why we use UiPath in the first place, so we don't have to use our own eyes :) If you see my last post and can't understand how you can do this with UiPath, then I recommend that you go through our Academy, so you will learn the basics and start to build your projects. The forum is here to assist when there are specific problems you are stuck on, but we can't build everyone’s project from scratch every time…

Of course I am not asking anyone to build my project from scratch.
I asked in my earlier post: what are the Word activities I can use to verify the text format? It is that simple.

Hi @AjitNayak,
I also want to test a PDF report. I want to verify the font style, color and size of the header, footer and tables present in a report in PDF format. Did you find any tools that can help me validate my report in PDF format?

@sangeetha_Narayanan

I believe there is no way to do this in the PDF itself.
The only way I can think of is to convert the PDF to Word; then, when you highlight the text in Word, you can see the text format details.

Thanks
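
Once the PDF has been converted to Word, the format details do not have to be read off the screen: a .docx file is a zip archive whose `word/document.xml` stores font, size and color per text run. The sketch below uses only the Python standard library; the `run_formats` name is hypothetical. It reads only formatting set directly on a run, so anything inherited from a style comes back as `None`, and how faithfully a PDF-to-Word conversion preserves the original fonts is a separate assumption.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def run_formats(docx_path):
    """Yield (text, font, size_pt, color) for each text run in a .docx file."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    for run in root.iter(f"{{{W}}}r"):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        if not text:
            continue  # skip runs that carry no visible text
        fonts = run.find("w:rPr/w:rFonts", NS)
        size = run.find("w:rPr/w:sz", NS)
        color = run.find("w:rPr/w:color", NS)
        yield (
            text,
            fonts.get(f"{{{W}}}ascii") if fonts is not None else None,
            # w:sz is stored in half-points, so 24 means 12 pt
            int(size.get(f"{{{W}}}val")) / 2 if size is not None else None,
            color.get(f"{{{W}}}val") if color is not None else None,
        )
```

With a converted file named, say, `report.docx` (a hypothetical name), `for row in run_formats("report.docx"): print(row)` would list each run with its font name, point size and hex color, which can then be compared against the expected values for all 100 files in a loop.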

@sangeetha_Narayanan @AjitNayak … If you have Adobe Acrobat Professional, you can follow the steps below:

  1. Open your PDF.
  2. Go to TOOLS -> Advanced Editing and select the “TouchUp Text Tool”.
  3. Click on the text that you wish to extract the typeface from and a bounding box should appear.
  4. Highlight a portion of the text and right click to bring up a menu. Select “Properties”.
  5. Information about your font should be displayed under the “Text” tab, including the font name.

Thank you, but I am looking for a freeware tool. I don't have Adobe Acrobat Professional.