Is it possible to extract PDF text with its font information (size, boldness, and font type)?

Hello All,

I am stuck trying to find patterns in some PDF samples.
Is it possible to extract PDF text together with its font information (size, boldness, and font type)?

Regards
Aditya

Hello everyone,

Does anyone have a solution for this?

Regards
Aditya

Hey Tiberiu,

Thank you very much for your response. I went through that post; it suggests inspecting the font to determine the font type and boldness.

My requirement here is to classify the document based on its font type and boldness. My document contains multiple fonts, with normal, bold, and extra-bold text.

So my question now is: can we consume the Adobe Reader API from UiPath in order to get the font details?

Regards
Aditya

Hello All,

I found an alternate way:

First, convert the PDF to Word with Balarewa.PDF.Activities, or with any other activity that can do the conversion.

Then you can write a Python function that reads the text's font information.

The code is:
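import docx  # provided by the python-docx package

path = '/home/karamveer/Downloads/222.docx'  # your .docx file path
doc = docx.Document(path)
for p in doc.paragraphs:
    # Font name and size as defined by the paragraph's style;
    # either value can be None when it is inherited from a parent style.
    name = p.style.font.name
    size = p.style.font.size
    print(name, size)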
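Note that the loop above only reports what the paragraph style defines, so bold or font changes applied directly to part of a paragraph would not show up. The following is a minimal sketch of a run-level alternative, not the method used above: it assumes the converted .docx stores direct formatting in the usual WordprocessingML run properties (w:rFonts, w:sz, w:b) and reads them with the standard library only. The function name run_fonts and the sample path are hypothetical, and a value of None only means the property is not set on the run itself (it may still be inherited from a style).

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def run_fonts(docx_path):
    """Yield (text, font_name, size_pt, bold) for every text run.

    Hypothetical helper: only formatting set directly on the run is
    reported; None means "not set here, possibly inherited from a style".
    """
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    for run in root.iter(f"{{{W}}}r"):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        if not text:
            continue

        name = size = bold = None
        rpr = run.find("w:rPr", NS)
        if rpr is not None:
            fonts = rpr.find("w:rFonts", NS)
            if fonts is not None:
                name = fonts.get(f"{{{W}}}ascii")

            sz = rpr.find("w:sz", NS)
            if sz is not None:
                val = sz.get(f"{{{W}}}val")
                if val and val.isdigit():
                    size = int(val) / 2  # w:sz is stored in half-points

            b = rpr.find("w:b", NS)
            if b is not None:
                # <w:b/> with no value means bold is on
                bold = b.get(f"{{{W}}}val", "true") not in ("0", "false", "off")

        yield text, name, size, bold


# Example usage (hypothetical path):
# for text, name, size, bold in run_fonts("sample.docx"):
#     print(repr(text), name, size, bold)
```

Each tuple can then be used for the classification step, for example by counting how many runs are bold or by grouping runs by font name.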

Happy Automation!

Regards
Aditya

