Want to extract different forms of data from PDFs

I want to extract tables from PDFs, but the tables are laid out differently in each PDF. I tried an ML model in AI Center with a taxonomy, but that worked based on position. My client's requirement is that values be extracted based on the name of the table or field, rather than by a position-based ML model (for example, from drawings we need to extract the latest revision number, date, and title).

@Reshmita_Vemulapalli

The model does not work on position alone.

It also takes the column names and similar cues into consideration.

Cheers

Thank you for your reply.

But it will not take the column names into account, right? In AI Center we mark the position and only indicate which value to take. What I need is this: if we give a field name (for example, stdrollno), it should return the corresponding number.
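To make the requirement above concrete, here is a minimal Python sketch of label-based lookup, assuming the PDF text has already been extracted by some other step. The function name `extract_by_label` and the sample strings are hypothetical and are not part of AI Center or any UiPath activity.

```python
import re

def extract_by_label(text, label):
    """Return the value that follows `label` in `text`, or None if the label is absent."""
    # Match the label, an optional ':' / '-' / '=' separator,
    # then capture the rest of that line as the value.
    pattern = re.compile(
        rf"{re.escape(label)}\s*[:\-=]?\s*(.+)",
        re.IGNORECASE,
    )
    match = pattern.search(text)
    return match.group(1).strip() if match else None

# Hypothetical text, as if extracted from two differently laid out PDFs.
page_a = "Name: A. Kumar\nstdrollno: 4521\nClass: 10"
page_b = "Class 10   stdrollno - 7788\nName = R. Singh"

print(extract_by_label(page_a, "stdrollno"))
print(extract_by_label(page_b, "stdrollno"))
```

Note that this captures everything up to the end of the line, so it only suits layouts where the value is the last thing on its line. Real documents would likely need a stricter value pattern per field.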

@Reshmita_Vemulapalli

So you are saying the column names will change?

The model does use the specified column to identify the value, but if the names are changing then you might need to use an NER model or similar. There is no direct way.

cheers

The column names will be the same, but they will be in different positions in each PDF (for example, 100 PDF layouts). We cannot train the machine learning model on every document, right? That would be very difficult.

Could you please guide me on using the NER model you mentioned?
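As an illustration of the case described above (same column names, different positions), here is a rough Python sketch that finds a column by its header text rather than by its index. It assumes each table has already been extracted into a list of rows; `column_by_header` and the sample tables are hypothetical.

```python
def column_by_header(rows, header):
    """Return the values under `header`, wherever that column sits in the table."""
    wanted = header.strip().lower()
    for i, row in enumerate(rows):
        cells = [str(cell).strip().lower() for cell in row]
        if wanted in cells:
            col = cells.index(wanted)
            # Everything below the header row, skipping rows that are too short.
            return [r[col] for r in rows[i + 1:] if col < len(r)]
    return []  # header not found in any row

# Hypothetical revision tables from two drawings with different column orders.
table_a = [
    ["Rev No", "Date", "Title"],
    ["A", "2023-01-10", "Issued for review"],
    ["B", "2023-03-02", "Issued for construction"],
]
table_b = [
    ["Title", "Rev No", "Date"],
    ["Pump layout", "1", "2022-11-05"],
]

for table in (table_a, table_b):
    revisions = column_by_header(table, "Rev No")
    print(revisions[-1])  # last row, assuming revisions are listed oldest first
```

Taking the last row as the latest revision is itself an assumption about how the drawings list their revisions.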

@Reshmita_Vemulapalli

You need not train on every document. Train on similar documents, and the model should then be able to pick up the columns as needed. AI Center can handle those layout changes.

Some insight on NER

cheers