I have multiple PDFs and I'm gathering data from tables in those PDFs. The PDFs are very similar but not the same, especially the selectors. For example, in one PDF the table has an index in its selector:
In another PDF it's just a table without any index:
<role='table' />
Anyway, that's not the problem: I've added an If condition with Element Exists that checks whether "idx=2" exists, and this works fine. I mention it in case you can point me to a better solution (I haven't tested all the PDFs yet, and more conditions may be needed).
The problem is that scraping the first PDF works fine and the data is scraped correctly, but when the next PDF is opened and scraped, the result gets messed up. For example, it selects the whole document, or it scrapes the table but with the column order reversed (the first column comes last and the last comes first). I can't work out any logic for it because the scraping is unpredictable. Could you please help me find a solution? The tables always have the same headers and the same number of columns, but I don't know whether I can use any anchors for them. Scraping works fine, but only for the first PDF (no matter which one); the following ones are the problem. Example:
PDFs A and B
Scraped first: A - scraped correctly
Scraped second: B - scraped incorrectly
Scraped first: B - scraped correctly
Scraped second: A - scraped incorrectly
It could be linked to variable initialization: when you scrape the first PDF, all variables are empty, but when you scrape the second, they still contain values from the previous scrape.
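To illustrate the idea, here is a minimal sketch in Python (not the tool from the question) of a loop that creates its per-file state from scratch on every iteration. The names `scrape_all`, `extract_table` and `fake_extract` are hypothetical and only stand in for whatever step actually reads the table; the same principle would apply to the variables in your workflow.

```python
from typing import Callable, Dict, List

Row = List[str]


def scrape_all(
    pdf_paths: List[str],
    extract_table: Callable[[str], List[Row]],
) -> Dict[str, List[Row]]:
    """Scrape each PDF with freshly initialized per-file state."""
    results: Dict[str, List[Row]] = {}
    for path in pdf_paths:
        # Create the per-file container inside the loop so nothing
        # from the previous PDF leaks into this iteration.
        rows: List[Row] = []
        rows.extend(extract_table(path))
        results[path] = rows
    return results


# Hypothetical stand-in for the real table extraction step.
def fake_extract(path: str) -> List[Row]:
    return [["Header1", "Header2"], [path, "value"]]


if __name__ == "__main__":
    print(scrape_all(["A.pdf", "B.pdf"], fake_extract))
```

If `rows` were declared once outside the loop and never cleared, the second PDF would start with the first PDF's data already in it, which matches the "first one works, second one breaks" pattern described above.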