Extracting all headers in a PDF

Hi, I want to extract all the headers that are available in a PDF file and store the headers and their values in an Excel file. I am new to this. Can anyone help me with this issue?

Hi!

If you are using a native PDF, it is easy to get the values from it.

https://epsilonai.com/how-to-extract-table-from-pdf-in-uipath

Regards,
NaNi

@THIRU_NANI Hi Nani, I checked it, but I don’t have any tables in my PDF file. I want to extract the headers in the PDF file. For example, if headers such as name, date, and loan issued are present, I want to extract those headers along with their values into an Excel sheet.

If the PDF is a native PDF, use the Read PDF Text activity.
If the PDF is a scanned PDF, use the Read PDF With OCR activity, with Tesseract OCR as the engine.

Regards,
NaNi

images
If you look at the image above, it has headers such as rebrand, poster series, total, tax, and subtotal. I want to extract all those headers along with their values into an Excel file.

I got that, Nani. My problem is that I only know how to get a single value from a PDF, not a bunch of values at a time.

In that case, you can use the Get Text activity!

Regards,
NaNi

Hi @Funky_Monks

Is it possible to attach the PDF file? Let me try with regex.

Thanks,
Boopathi
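
For reference, here is a minimal sketch of the regex idea, written in Python rather than as a UiPath workflow. It assumes the PDF text has already been read into a string (for example, by a read-PDF step) and that each header sits on its own line in the form "Header: value". The sample text, the extract_pairs and pairs_to_csv helpers, and the CSV output are all hypothetical; a real PDF layout will most likely need a different pattern.

```python
import csv
import io
import re

# Hypothetical sample of text read from a PDF; real output will differ.
SAMPLE_TEXT = """Name: John Smith
Date: 04/15/2013
Loan Issued: 162,000
"""

# Assumes one "Header: value" pair per line; the header is everything
# before the first colon, the value is the rest of the line.
PAIR_PATTERN = re.compile(
    r"^[ \t]*([^:\n]+?)[ \t]*:[ \t]*(\S.*?)[ \t]*$",
    re.MULTILINE,
)


def extract_pairs(text):
    """Return a list of (header, value) tuples found in the text."""
    return PAIR_PATTERN.findall(text)


def pairs_to_csv(pairs):
    """Serialize the pairs as CSV text, which Excel can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Header", "Value"])
    writer.writerows(pairs)
    return buffer.getvalue()


pairs = extract_pairs(SAMPLE_TEXT)
print(pairs_to_csv(pairs))
```

This collects every pair in one pass instead of one value at a time. Lines without a colon are simply skipped, so it will miss headers that are laid out in any other way.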

201311_cfpb_kbyo_closing-disclosure.pdf (61.0 KB)
This is the PDF file. I need to extract all the headers in its 5 pages and store them in Excel. I know how to get a single value, but I have never extracted a bunch of headers at a time.

Hi @Funky_Monks

Regex extraction would not be appropriate for this, as the PDF contains a lot of information across multiple pages. Also, do the field headers remain constant in every PDF, or do they change?

Or please check whether this activity helps you.

Thanks,
Boopathi
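
To illustrate why it matters whether the headers stay constant: if they do, each one can be looked up by name instead of being discovered in the text. Below is a rough Python sketch of that lookup, not a UiPath workflow. The header list, the sample text, and the lookup_headers helper are hypothetical, and it assumes each header and its value share one line, which may not hold for a multi-page form.

```python
import re

# Hypothetical fixed header list, based on the invoice example in this thread.
KNOWN_HEADERS = ["Subtotal", "Tax", "Total"]

# Hypothetical text as it might come out of a PDF read; real text will differ.
SAMPLE_TEXT = """Rebrand 500.00
Subtotal 500.00
Tax 50.00
Total 550.00
"""


def lookup_headers(text, headers):
    """Map each known header to the value that follows it on the same line."""
    found = {}
    for header in headers:
        # Anchor at the start of a line so "Total" does not match "Subtotal".
        pattern = re.compile(
            r"^[ \t]*" + re.escape(header) + r"[ \t:]+(\S.*?)[ \t]*$",
            re.MULTILINE | re.IGNORECASE,
        )
        match = pattern.search(text)
        # Headers that are not found are recorded as None rather than skipped.
        found[header] = match.group(1) if match else None
    return found


result = lookup_headers(SAMPLE_TEXT, KNOWN_HEADERS)
print(result)
```

If the headers change from one PDF to the next, a fixed list like this no longer works, which is the reason for asking.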