I have an HTML file (think of it as an invoice) from which I need to extract a lot of information, such as company code, company name, payment type, date, etc. I am able to extract the data, but it takes a long time for a single file. Can you please suggest the best way to extract the data so that extraction is faster?
I have a similar question, though slightly different. I am processing HTML files that reside on my computer. I need to extract data from a grid within the file, and I'm looking for the best way to pull it out. Should I use web scraping? Should I iterate through the source HTML (60K lines)? The grid is a small subset of the information in the HTML document, so once I have pulled that data, I would like to end processing.
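One possible approach for a local file, sketched here under assumptions: stream the file through Python's standard-library `html.parser` and stop reading as soon as the grid has been closed, so the rest of the 60K lines is never parsed. The sketch assumes the grid is a `<table>` element with `id="grid"`; that id, the class name `GridExtractor`, the function `extract_grid`, and the file name `invoice.html` are all hypothetical and would need adjusting to the real markup.

```python
from html.parser import HTMLParser


class GridExtractor(HTMLParser):
    """Collect cell text from the first <table> with a given id, then stop."""

    def __init__(self, table_id):
        super().__init__(convert_charrefs=True)
        self.table_id = table_id  # hypothetical id of the grid table
        self.depth = 0            # <table> nesting depth inside the target
        self.in_cell = False      # True while inside a <td>/<th> of the target
        self.rows = []            # one list of cell strings per <tr>
        self.done = False         # set once the target table has closed

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "table":
            if self.depth:
                self.depth += 1   # nested table inside the grid
            elif dict(attrs).get("id") == self.table_id:
                self.depth = 1    # entered the grid
        elif self.depth == 1:
            if tag == "tr":
                self.rows.append([])
            elif tag in ("td", "th") and self.rows:
                self.rows[-1].append("")
                self.in_cell = True

    def handle_endtag(self, tag):
        if self.done or not self.depth:
            return
        if tag == "table":
            self.depth -= 1
            if self.depth == 0:
                self.done = True  # grid finished; caller can stop feeding
        elif tag in ("td", "th") and self.depth == 1:
            self.in_cell = False

    def handle_data(self, data):
        # Only text directly inside the grid's own cells is kept.
        if self.in_cell and self.depth == 1 and not self.done:
            self.rows[-1][-1] += data


def extract_grid(lines, table_id="grid"):
    """Feed lines one at a time and stop once the grid has been read."""
    parser = GridExtractor(table_id)
    for line in lines:
        parser.feed(line)
        if parser.done:
            break  # end processing; remaining lines are never parsed
    return [[cell.strip() for cell in row] for row in parser.rows]
```

A file object can be passed directly, since iterating it yields lines:

```python
with open("invoice.html", encoding="utf-8") as f:
    rows = extract_grid(f)
```

Whether this is actually faster than a full-document parser depends on where the grid sits in the file; the saving comes only from skipping whatever follows it.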