Can anyone help me with the logic for finding the latest link on a webpage?

I have a webpage that has hyperlinks to a few Excel files. Each link contains the name of the file. The initial part of the filename is the same for all links, but a date value is appended at the end.
So how do I get the topmost (most recent) link to click?

Eg: Summary-Report March 22 2022-03-22 12_10.xlsx
Summary-Report March 22 2022-03-19 12_12.xlsx

Kindly help
TIA

Hi @shilpa_p,

The links are in table format?

The new link will be always on the top?

Can you please attach a screenshot if possible?

If it’s in a table format, you can design the workflow to click on the first row/column.


Sorry, I cannot attach the page for privacy reasons. There is a small column on the left side where the links are listed one below the other. It's a web page, and the links are not in a table format.
But I can share a picture of a page that looks similar:
p1
It's the part I have marked with a box. My data is similar to that, except that the links contain file names.

Ander Jensen’s page hah? :grinning:

Kind Regards,
Ashwin A.K

No :slight_smile: . It's my work-related page, which looks similar to the one I have attached.

@shilpa_p Try the Click activity. First, select any one of the hyperlinks and check whether the selector has an aaname attribute. If it does, pass the date value in through a variable. The variable should contain the expression below:

Str Variable = DateTime.Now.ToString("yyyy-MM-dd")

The aaname in the selector should then be Summary-Report March 22 “Variable” .xlsx (the file names also carry a time after the date, so that part would need a wildcard such as *).


This should work, but the date could be any date, not necessarily today’s. If the filename contains yesterday’s date and that is the latest one, it should click on that link.

You can check the selector for an attribute that identifies the first row, such as row=1, and remove the aaname.

Sure, I will check this and update. Thank you :slight_smile:

Hi,

Can you try the following steps?

First, extract the list as a DataTable using Data Scraping.

Then, get the latest file name using the following expression:

targetName = ExtractDataTable.AsEnumerable.OrderBy(Function(r) DateTime.ParseExact(System.Text.RegularExpressions.Regex.Match(System.IO.Path.GetFileNameWithoutExtension(r("ColumnName").ToString),"\d{4}-\d{2}-\d{2} \d{2}_\d{2}$").Value,"yyyy-MM-dd HH_mm",System.Globalization.CultureInfo.InvariantCulture)).Last().Item("ColumnName").ToString

Finally, click the file using a dynamic selector, as in the following screenshot.

img20220404-3-2
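
The ordering expression above can be unpacked into separate steps, which may make it easier to debug. This is only a minimal sketch as a standalone VB.NET console module, not the attached workflow: the in-memory table stands in for the Data Scraping output, "ColumnName" is the same placeholder column name used in the expression, and the two rows are the sample file names from the question.

```vb
Imports System
Imports System.Data
Imports System.Globalization
Imports System.IO
Imports System.Linq
Imports System.Text.RegularExpressions

Module LatestLinkSketch
    ' Pulls the trailing "yyyy-MM-dd HH_mm" stamp out of a file name and parses it.
    Function ParseStamp(fileName As String) As DateTime
        Dim baseName As String = Path.GetFileNameWithoutExtension(fileName)
        Dim stampText As String = Regex.Match(baseName, "\d{4}-\d{2}-\d{2} \d{2}_\d{2}$").Value
        Return DateTime.ParseExact(stampText, "yyyy-MM-dd HH_mm", CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        ' Stand-in for the scraped DataTable (assumption: one text column).
        Dim table As New DataTable()
        table.Columns.Add("ColumnName", GetType(String))
        table.Rows.Add("Summary-Report March 22 2022-03-19 12_12.xlsx")
        table.Rows.Add("Summary-Report March 22 2022-03-22 12_10.xlsx")

        ' Sort ascending by the parsed stamp; the last row is then the newest.
        Dim ordered = table.AsEnumerable().OrderBy(Function(r) ParseStamp(r("ColumnName").ToString()))
        Dim targetName As String = ordered.Last()("ColumnName").ToString()

        Console.WriteLine(targetName) ' Summary-Report March 22 2022-03-22 12_10.xlsx
    End Sub
End Module
```

Inside an Assign activity the same logic has to be written as a single expression, which is why the version above is one long line.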

Regards,

Thanks a lot :). I will try with this


My DataTable after data scraping looks like this. I need to look at the line marked in yellow.

I used the same expression as you stated, but it gives me an error saying the string was not recognised as a valid DateTime.
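
One possible cause of this error, offered as an assumption because the scraped table is not visible here: if a row has no date stamp in the expected position, Regex.Match returns an empty Value, and DateTime.ParseExact throws on an empty string. A minimal sketch of a more tolerant version, using DateTime.TryParseExact and skipping rows that do not parse, could look like this. The sample rows are hypothetical stand-ins for the scraped text.

```vb
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module SafeParseSketch
    ' Returns True and sets stamp only when the text contains a parsable "yyyy-MM-dd HH_mm" stamp.
    Function TryGetStamp(text As String, ByRef stamp As DateTime) As Boolean
        Dim m As Match = Regex.Match(text, "\d{4}-\d{2}-\d{2} \d{2}_\d{2}")
        If Not m.Success Then Return False
        Return DateTime.TryParseExact(m.Value, "yyyy-MM-dd HH_mm",
                                      CultureInfo.InvariantCulture, DateTimeStyles.None, stamp)
    End Function

    Sub Main()
        ' Hypothetical scraped rows: two file names, one non-file row, one blank row.
        Dim rows As String() = {
            "Summary-Report March 22 2022-03-19 12_12.xlsx",
            "3 day(s) ago",
            "",
            "Summary-Report March 22 2022-03-22 12_10.xlsx"
        }

        Dim latestName As String = Nothing
        Dim latestStamp As DateTime = DateTime.MinValue

        For Each text As String In rows
            Dim stamp As DateTime
            ' Rows without a valid stamp are skipped instead of raising an exception.
            If TryGetStamp(text, stamp) AndAlso stamp > latestStamp Then
                latestStamp = stamp
                latestName = text
            End If
        Next

        Console.WriteLine(latestName) ' Summary-Report March 22 2022-03-22 12_10.xlsx
    End Sub
End Module
```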

Hi,

For now, can you share how we can tell that #7, not #18, is the latest? Is there a specific keyword?

Regards,

Yes, row #7 is the latest. "Summary support-BOT" is the keyword.

Hi,

But #18 also has "Summary Support" at the beginning of the text. We need some condition to exclude #18.

Regards,

We identify the latest one based on the appended date. For example, #7 has a more recent date than #8.
#18 should be ignored.
I only need to consider the files whose name runs up to "BOT" and is then followed by the date.

Hi,

Row #18 is 2022-03-23 10_42, which seems newer than row #7 (2022-03-22 12_10).
Or can we remove the rows after the blank row?

Regards,

Yes, we can remove those rows. We don't need the older links anyway.
I cannot consider #18 because it has a few more words after the word "bot" (e.g. _DEF…LLL).
I only need to identify the files whose name is “Summary Support-BOT” followed by the date.
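
The rule described here, the fixed name followed directly by the date, could be expressed as an anchored regular expression. The following is only a sketch under stated assumptions: the exact name format "Summary Support-BOT yyyy-MM-dd HH_mm.xlsx" is guessed from this thread, and all three sample names, including the "_EXTRA" suffix, are hypothetical.

```vb
Imports System
Imports System.Globalization
Imports System.Linq
Imports System.Text.RegularExpressions

Module AnchoredNameSketch
    Sub Main()
        ' Hypothetical link texts; the "_EXTRA" name stands for a row with extra words after BOT.
        Dim names As String() = {
            "Summary Support-BOT 2022-03-19 12_12.xlsx",
            "Summary Support-BOT 2022-03-22 12_10.xlsx",
            "Summary Support-BOT_EXTRA 2022-03-23 10_42.xlsx"
        }

        ' ^ and $ force the whole name to match, so anything between BOT and the date is rejected.
        ' Group 1 captures the date stamp. The assumed format may need adjusting to the real page.
        Dim pattern As New Regex("^Summary Support-BOT (\d{4}-\d{2}-\d{2} \d{2}_\d{2})\.xlsx$",
                                 RegexOptions.IgnoreCase)

        Dim latest As String = names.
            Where(Function(n) pattern.IsMatch(n)).
            OrderBy(Function(n) DateTime.ParseExact(pattern.Match(n).Groups(1).Value,
                                                    "yyyy-MM-dd HH_mm", CultureInfo.InvariantCulture)).
            LastOrDefault()

        Console.WriteLine(latest) ' Summary Support-BOT 2022-03-22 12_10.xlsx
    End Sub
End Module
```

With this filter the suffixed name is dropped even though its date is newer, and LastOrDefault returns Nothing rather than throwing when no name matches.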

Hi,

For now, can you try the following sample? It removes the rows from the first “…day(s) ago” onward, then extracts the string that has the latest date.

img20220404-4

 targetName = ExtractDataTable.AsEnumerable.Take(ExtractDataTable.AsEnumerable.Where(Function(r) r(0).ToString.Contains("day(s) ago")).Select(Function(r) ExtractDataTable.Rows.IndexOf(r)).First()).Where(Function(r) System.Text.RegularExpressions.Regex.IsMatch(r(0).ToString,"\d{4}-\d{2}-\d{2} \d{2}_\d{2}")).OrderBy(Function(r) DateTime.ParseExact(System.Text.RegularExpressions.Regex.Match(r(0).ToString,"\d{4}-\d{2}-\d{2} \d{2}_\d{2}").Value,"yyyy-MM-dd HH_mm",System.Globalization.CultureInfo.InvariantCulture)).Last().Item(0).ToString

Sequence1.xaml (7.6 KB)

Regards,