Hi all,
How can I extract the line that contains "URL:" from a PDF?
“An updated List is accessible on the Committee’s website at the following
URL: www.un.org/securitycouncil/sanctions/materials.”
Thanks and Regards,
Supriya Galentic
It is fetching the entire PDF.
I need to extract only the string that contains the URL, not the text above and below it.
E.g. the entire item 2. should be fetched,
and the entire item 3. should be fetched.
I hope my query is clear.
Thanks and Regards,
Supriya Yenaganti
I need to extract the URL from the string that contains the keyword "updated" or "latest".
Thanks and Regards,
Supriya Galentic
If you are dealing with documents (scanned or native PDFs), the best approach would be to look into Document Understanding. This way, you can:
@supu123 - that is because these 2 URLs are not on the same line as the word "URL"…
Could you please give the first few letters of the URLs? I saw you masked them, but for your information, URLs on the forms are not PII, so I don’t think you need to mask them…
If they always start with www., then please try www\.\S+ (the dot is escaped so that it matches a literal ".").
This will fetch all the URLs starting with www.
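To illustrate how that pattern behaves, here is a minimal Python sketch run against the sentence quoted at the top of the thread. Python is only an assumption for the demo; the pattern itself should carry over to other regex engines. The helper name extract_urls is hypothetical, and stripping trailing punctuation is an extra step that is not part of the pattern suggested above.

```python
import re

# Sample text taken from the first post in this thread.
text = (
    "An updated List is accessible on the Committee's website at the following\n"
    "URL: www.un.org/securitycouncil/sanctions/materials."
)

# Literal "www." followed by any run of non-whitespace characters.
URL_PATTERN = re.compile(r"www\.\S+")


def extract_urls(s):
    """Return every www.* match, with trailing sentence punctuation removed."""
    # \S+ also swallows a closing full stop or quote, so strip those afterwards.
    return [match.rstrip(".,;:\"'”)") for match in URL_PATTERN.findall(s)]


print(extract_urls(text))
```

Note that \S+ stops at the first whitespace, so a URL that wraps onto the next line of the PDF text would be cut off at the line break.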
Hi @prasath17
I want to extract the entire statement with the URL in it.
Item 3. has the keyword "latest", so I want to extract only those 2 URLs.
Thanks and Regards,
Supriya
@supu123 - I am not clear on your requirement. If you are looking to extract the URLs, the pattern below would work…
Hi @prasath17
I also want the text above the URLs, starting from 3. through the end of 3.
The latest versions of the Sanctions lists are accessible on the UN Security
Council’s website at the following URL:
a) List of individuals and entities issued by the UNSC ISIL (Da’esh) and Al-Qaida
Sanctions Committee:
b) List issued by the UNSC Committee established pursuant to
resolution 1988 (2011) of individuals and entities linked to Taliban
Thanks and Regards,
Supriya
@supu123 - I am not sure that this can be achieved using Regex…
You may have to try a different approach.
Hi @prasath17
For @supu123’s work, we had to implement two regex patterns: one to find the lines containing the word "updated", and then another to extract the URL from those lines.
We had to do it that way to achieve this.
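A rough sketch of that two-pattern idea, written in Python purely for illustration (the actual workflow and patterns we used are not shown in this thread). It assumes the PDF text is already available as a string and that each paragraph starts on a new line with a number such as "2." or "3."; the www.example.org URLs and the function name urls_from_keyword_items are hypothetical placeholders, since the real URLs were masked.

```python
import re

# Pattern 1: one numbered item, from "N. " at a line start up to the next item or the end.
ITEM = re.compile(r"^\d+\.\s.*?(?=^\d+\.\s|\Z)", re.MULTILINE | re.DOTALL)
# Keyword filter: keep only items that mention "updated" or "latest".
KEYWORD = re.compile(r"\b(?:updated|latest)\b", re.IGNORECASE)
# Pattern 2: the URL pattern suggested earlier in the thread.
URL = re.compile(r"www\.\S+")


def urls_from_keyword_items(text):
    """Return (item_text, urls) for every numbered item containing a keyword."""
    results = []
    for item in ITEM.findall(text):
        if KEYWORD.search(item):
            urls = [u.rstrip(".,;:") for u in URL.findall(item)]
            results.append((item.strip(), urls))
    return results


# Hypothetical sample; only the un.org URL comes from the thread.
sample = """1. This paragraph has no keyword: www.example.org/ignored
2. An updated List is accessible on the Committee's website at the following
URL: www.un.org/securitycouncil/sanctions/materials.
3. The latest versions of the Sanctions lists are accessible at the following URL:
a) First list: www.example.org/list-a
b) Second list: www.example.org/list-b
"""

for item_text, urls in urls_from_keyword_items(sample):
    print(item_text)
    print(urls)
```

Splitting into items first keeps the whole of "2." or "3." available alongside its URLs, which is what was asked for; if the PDF does not number its paragraphs this way, the first pattern would need to change.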
Sorry for the late response, @supu123.
Regards
Nived N
Happy Automation