Hi all,
I need help extracting data from unstructured text read from a PDF.
I have converted the PDF to .txt format, and the result is this:
University of Strathclyde, UK Master of Business Administration with Highest Distinction
National University of Singapore Bachelor of Accountancy - 2nd Lower Honours
Pioneer Junior College GCE A Level
Beatty Secondary School GCE O Level
University of London Bachelor of Science(Artifical Intelligent) - 2nd Lower Honours
Jurong Junior College GCE A Level
Nanhua High School GCE O Level
Nanyang Technological University Bachelor of Information (Computer Science)
NUS High School NUS High School Diploma
National University of Singapore PhD in Electrical and Computer Science Engineering
Nanyang Technological University Bachelor of Engineering (Electrical and Electronic Engineering) - 2nd Upper Honours
Ngee Ann Polytechnic Diploma in Electronics, Computer and Communication Engineering
Queensway Secondary School GCE ‘O’ Level
I need the bot to split each line into institution and course of study, and output the following:
University of Strathclyde, UK
Master of Business Administration with Highest Distinction
National University of Singapore
Bachelor of Accountancy
Pioneer Junior College
GCE A Level
Beatty Secondary School
GCE O Level
University of London
Bachelor of Science(Artifical Intelligent)
Jurong Junior College
GCE A Level
Nanhua High School
GCE O Level
Nanyang Technological University
Bachelor of Information (Computer Science)
NUS High School
NUS High School Diploma
National University of Singapore
PhD in Electrical and Computer Science Engineering
Nanyang Technological University
Bachelor of Engineering (Electrical and Electronic Engineering)
Ngee Ann Polytechnic
Diploma in Electronics, Computer and Communication Engineering
Queensway Secondary School
GCE ‘O’ Level
Can anyone help with the workflow for this? The split should be based on institution and course of study.
Institution names generally contain keywords such as "Secondary School", "Polytechnic", "University", "High School", or "Junior College".
Please help.
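
One possible way to approach the split, as a rough Python sketch rather than a finished workflow: match each line lazily up to the first institution keyword, optionally extend the institution through an "of ..." tail (for names like "University of London"), and treat the rest as the course. The list of course-start words (Master, Bachelor, PhD, Diploma, GCE) is my own assumption taken from the sample lines above, as are the names `split_line`, `LINE_RE`, and `HONOURS_RE`. The sketch also assumes that a trailing " - ... Honours" classification should be dropped, since the expected output omits it.

```python
import re

# Keywords from the post that mark the end of an institution name.
INSTITUTION_KEYWORDS = ["Secondary School", "Polytechnic", "University",
                        "High School", "Junior College"]
# Assumption: words that start a course, inferred from the sample data only.
COURSE_STARTS = ["Master", "Bachelor", "PhD", "Diploma", "GCE"]

_inst = "|".join(map(re.escape, INSTITUTION_KEYWORDS))
_course = "|".join(map(re.escape, COURSE_STARTS))

LINE_RE = re.compile(
    # Institution: shortest prefix ending in an institution keyword ...
    rf"^(?P<institution>.*?\b(?:{_inst})\b"
    # ... optionally followed by "of <place>", up to a course-start word.
    rf"(?:\s+of\s+.*?(?=\s+(?:{_course})\b))?)"
    # Course: everything after the institution.
    rf"\s+(?P<course>.+)$"
)

# Assumption: a trailing " - 2nd Lower Honours" style suffix is not wanted.
HONOURS_RE = re.compile(r"\s+-\s+.*Honours\s*$")


def split_line(line):
    """Return (institution, course), or None if no keyword is found."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    course = HONOURS_RE.sub("", match.group("course"))
    return match.group("institution"), course


if __name__ == "__main__":
    samples = [
        "University of Strathclyde, UK Master of Business Administration with Highest Distinction",
        "National University of Singapore Bachelor of Accountancy - 2nd Lower Honours",
        "NUS High School NUS High School Diploma",
    ]
    for sample in samples:
        institution, course = split_line(sample)
        print(institution)
        print(course)
```

A known limitation: if an institution has an "of ..." tail and the course does not begin with one of the assumed course-start words, the tail ends up in the course, so the word list would need extending for other data.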