I have a problem with scraping information from a web page.
The web page contains many different pieces of information.
From this page, I only need two details:
the name of the project manager (PM) and the name of the loaded offer.
I have no problem picking up the PM’s name.
But I find it very difficult to recognize and extract the name of the offer.
Unfortunately the possible scenarios are “infinite”.
The PM enters free text and has no precise rules to follow.
I only have a few “hooks” for recognizing the name of the offer.
Often the name of the offer is preceded by these letters:
These letters are followed by a -, a _, or a space, and then by a variable number of characters.
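To make the hooks above concrete, here is a rough sketch of how they could be written as a regular expression in Python. It is only an illustration under assumptions: the prefixes ABC and XYZ are hypothetical placeholders (the real letters are not listed here), the function name find_offer_name is made up, and I am assuming the offer name ends at the next whitespace, which may not hold for free text.

```python
import re

# Hypothetical placeholder prefixes: replace with the real letters.
PREFIXES = ["ABC", "XYZ"]

# A prefix, then exactly one separator (-, _ or whitespace),
# then a run of non-whitespace characters (assumed end of the offer name).
OFFER_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, PREFIXES)) + r")[-_\s]\S+",
    re.IGNORECASE,
)


def find_offer_name(text):
    """Return the first offer-like token in text, or None if nothing matches."""
    match = OFFER_PATTERN.search(text)
    return match.group(0) if match else None


print(find_offer_name("Loaded offer ABC-2021_07 by PM"))
```

If offer names can contain spaces, the \S+ part would need a different stopping rule, for example a fixed maximum length or a known trailing keyword.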
Below is the Excel file containing the complete extraction of the web page.
The name of the offer is in row 22.
Is there any way to handle this situation?
Test.xlsx (8,9 KB)