I want to extract tables from PDFs, but the format differs from one PDF to the next. I tried an ML model in AI Center with a taxonomy, but that worked based on position. My client's requirement is that values be extracted based on the name of the table or field, rather than by a position-based ML model (for example, from drawings we need to extract the latest revision number, date, and title).
That does not work on position…
It takes the column names, etc. into consideration.
Cheers
Thank you for your reply,
But it will not take the column names, right? In AI Center we mark the position and only indicate what has to be taken, but what I require is that if we give a field name (e.g. stdrollno), it returns the corresponding number.
So you are saying the column names will change?
It does use the specified column to identify the value, but if the columns are changing then you might need to use an NER model, etc. There is no direct way.
cheers
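To illustrate what "use the specified column to identify" can mean outside of a trained model, here is a minimal Python sketch. It assumes the table has already been pulled out of the PDF as a list of rows (by whatever extraction tool is in use), with the header as the first row. The helper names `normalize` and `column_by_name` are hypothetical and are not part of AI Center or any UiPath API.

```python
import re


def normalize(name: str) -> str:
    """Lowercase the name and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def column_by_name(table, header_name):
    """Return the values under `header_name`, wherever that column sits.

    `table` is assumed to be a list of rows, with the header as row 0.
    """
    header, *rows = table
    wanted = normalize(header_name)
    for index, cell in enumerate(header):
        if normalize(cell) == wanted:
            # Skip rows that are too short to have this column.
            return [row[index] for row in rows if index < len(row)]
    raise KeyError(f"no column named {header_name!r}")


# Two hypothetical layouts: same column name, different position.
table_a = [["Name", "StdRollNo", "Grade"], ["Asha", "101", "A"], ["Ravi", "102", "B"]]
table_b = [["Std Roll No", "Grade", "Name"], ["201", "C", "Meena"]]

print(column_by_name(table_a, "stdrollno"))
print(column_by_name(table_b, "stdrollno"))
```

Because the lookup goes through the header text, the column can move between documents without any change to the code. It only helps when the header wording stays stable, which is the case described in this thread.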
The column names will be the same, but they will be in different positions in each PDF (e.g. 100 PDF layouts). We cannot train the machine learning model on every document, right? That would be very difficult.
Can you please guide me on using an NER model, as you mentioned?
You need not train on every document. You need to train on similar documents, and then it should be able to pick up the columns as needed. Those changes can be handled by AI Center.
Some insight on NER
cheers
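For the drawing example from the first post (revision number, date, title), a rough Python sketch of label-based extraction is shown here. This is a plain keyword-anchored lookup, not an NER model, and it is only meant to show the idea of finding a value by its label instead of its position. It assumes the page text has already been extracted from the PDF and that each field appears on its own line as `label: value` or `label - value`. The function `extract_fields` and the alias lists are hypothetical.

```python
import re

# Hypothetical label aliases; longer aliases come first so they win.
LABELS = {
    "revision": ["Revision No", "Rev No", "Rev"],
    "date": ["Date"],
    "title": ["Title"],
}


def extract_fields(text, labels=LABELS):
    """Find each field by its label, wherever the label appears in the text."""
    result = {}
    for field, aliases in labels.items():
        for alias in aliases:
            pattern = re.compile(
                rf"^[ \t]*{re.escape(alias)}[ \t]*[:\-][ \t]*(.+?)[ \t]*$",
                re.IGNORECASE | re.MULTILINE,
            )
            match = pattern.search(text)
            if match:
                result[field] = match.group(1)
                break  # first matching alias is enough for this field
    return result


# Two hypothetical title blocks with the same fields in a different order.
sheet_a = "TITLE: Pump House Layout\nDate: 12-03-2023\nRevision No: C\n"
sheet_b = "Rev - 04\nTitle : Cable Tray Routing\nDATE: 2024-01-15\n"

print(extract_fields(sheet_a))
print(extract_fields(sheet_b))
```

Picking the latest revision from a revision history table is not covered by this sketch. A trained NER or document understanding model would be the more robust route when the labels themselves vary or the values are not on the same line as the label.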