How to avoid an error on the last page when data scraping

[What I want to do]
I am using data scraping with the "Next" button specified as the "next link selector", and I want to collect data that spans multiple pages.

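[Problem]
On the last page of the scraping target, the "Next" button element can no longer be found. At that point I want data scraping to stop automatically and the workflow to move on to the next activity.
Instead, when I run it, the error "Cannot find the UI element corresponding to this selector" is raised and the process stops.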

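[What I tried]
Setting the "Continue on error" property to True let the process continue.

However, with that approach other errors that occur would go unhandled, so I would like to know whether there is another way.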

Hi @111148

Welcome to the community!!

If you are using the Data Scraping wizard to move through the pages with the Next button, the wizard should handle this scenario automatically, because it asks you to indicate the next-page element. Handled that way, scraping should stop at the last page without throwing an error.

However, if you are navigating through the pages manually in a loop, then you need to incorporate that logic in your flow.

Assuming this is done manually, without the wizard's page-change feature, we need to use the page-number buttons on the page to determine whether a given page is available.

Run the following in a loop.
Keep a number variable that holds the page number we are currently looking at, and increment it on each pass. Let's say it starts at 1.
Use an Element Exists activity to check whether that particular page is available. Usually the first page is loaded by default, so the selector for the first page might be a bit different from the rest. Make the selector dynamic so that it checks using the number variable.

If Element Exists returns True, the next page is available. Use an If activity with a Click activity inside it to load that page.

Once the page is loaded, use an On Element Appear activity to make sure the table, or some other unique element, has loaded on the screen. This verifies that the page is ready before the data scraping runs.

Inside the On Element Appear activity, use an Extract Structured Data activity to extract the data you need from the table on the screen.
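The loop described above can be sketched in Python to make the control flow explicit. This is only an illustration, not UiPath code: `FAKE_SITE`, `element_exists`, and `extract_structured_data` are hypothetical stand-ins for the website and for the Element Exists and Extract Structured Data activities.

```python
from typing import Dict, List

# Hypothetical stand-in for the website: page number -> rows in its table.
FAKE_SITE: Dict[int, List[str]] = {
    1: ["row A", "row B"],
    2: ["row C", "row D"],
    3: ["row E"],
}


def element_exists(page_number: int) -> bool:
    """Stand-in for Element Exists with a dynamic page-number selector."""
    return page_number in FAKE_SITE


def extract_structured_data(page_number: int) -> List[str]:
    """Stand-in for Extract Structured Data on the loaded page."""
    return list(FAKE_SITE[page_number])


def scrape_all_pages() -> List[str]:
    rows: List[str] = []
    page_number = 1  # the counter variable, starting at 1
    # The existence check is the loop condition, so a missing page ends
    # the loop normally instead of raising a selector error.
    while element_exists(page_number):
        # In a real workflow: Click the page button, then wait for the
        # table to appear before extracting.
        rows.extend(extract_structured_data(page_number))
        page_number += 1
    return rows


if __name__ == "__main__":
    print(scrape_all_pages())
```

Because the loop ends on a plain True/False check, nothing has to be suppressed with ContinueOnError, and any other error raised during extraction would still surface.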

Just wanted to correct you on this point. The Data Scraping wizard configures an "Extract Structured Data" activity with ContinueOnError set to True. The way it ends on the last page is that it keeps looking for the Next button until TimeoutMS is reached. So, if TimeoutMS is left at the default of 30000 ms, it will actually wait 30 seconds on the last page before moving to the next activity.

ContinueOnError should be set to True so that no error is thrown. However, you will want to set TimeoutMS much lower, just long enough to cover page loading time. The activity does not stop automatically on the last page; it waits there until TimeoutMS is reached.

Since it relies on ContinueOnError, this can cause issues: scraping may end before the last page if finding the Next button fails or if TimeoutMS is too short for the page to load.

But it should definitely not throw any errors when ContinueOnError is set to True, if that makes sense…
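To illustrate the behavior described in this reply, here is a small Python simulation. It is a sketch under assumptions, not how UiPath is implemented: `PAGES`, `SelectorNotFoundError`, `find_next_button`, and `scrape` are all hypothetical names, and the wait is only counted rather than actually slept.

```python
from typing import List, Tuple


class SelectorNotFoundError(Exception):
    """Hypothetical stand-in for the 'cannot find the UI element' error."""


# Hypothetical two-page data set.
PAGES: List[List[str]] = [["row 1", "row 2"], ["row 3"]]


def find_next_button(page_index: int, timeout_ms: int) -> None:
    """Succeed if a next page exists; otherwise fail after the timeout."""
    if page_index + 1 >= len(PAGES):
        raise SelectorNotFoundError(
            f"Next button not found after {timeout_ms} ms"
        )


def scrape(timeout_ms: int, continue_on_error: bool) -> Tuple[List[str], int]:
    """Return the scraped rows and the simulated time spent waiting."""
    rows: List[str] = []
    waited_ms = 0
    page_index = 0
    while True:
        rows.extend(PAGES[page_index])
        try:
            find_next_button(page_index, timeout_ms)
        except SelectorNotFoundError:
            waited_ms += timeout_ms  # the full timeout is spent on the last page
            if not continue_on_error:
                raise  # this is the error reported in the question
            break  # with continue_on_error the failure is swallowed
        page_index += 1
    return rows, waited_ms


if __name__ == "__main__":
    print(scrape(timeout_ms=30000, continue_on_error=True))
```

In this model the only end-of-data signal is the swallowed failure, which is why a Next button that fails to appear for any other reason would also end the scrape early without any visible error.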

Regards. @111148


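How about using ExistElement to capture the result when the "Next" button element can no longer be found, and then controlling the flow with a conditional branch afterwards?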
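A rough Python sketch of this suggestion, checking for the "Next" button itself and branching on the result. `FakeBrowser` and its methods are hypothetical stand-ins for the target site and the UiPath activities, not a real API.

```python
from typing import List


class FakeBrowser:
    """Hypothetical stand-in for the target site; not a UiPath API."""

    def __init__(self, pages: List[List[str]]) -> None:
        self._pages = pages
        self._index = 0

    def next_button_exists(self) -> bool:
        # Plays the role of the existence check on the "Next" button.
        return self._index + 1 < len(self._pages)

    def click_next(self) -> None:
        self._index += 1

    def scrape_current_page(self) -> List[str]:
        # Scrapes one page only, with no next link selector configured.
        return list(self._pages[self._index])


def scrape_with_next_button(browser: FakeBrowser) -> List[str]:
    rows: List[str] = []
    while True:
        rows.extend(browser.scrape_current_page())
        # Conditional branch: stop cleanly when there is no "Next" button.
        if not browser.next_button_exists():
            break
        browser.click_next()
    return rows


if __name__ == "__main__":
    print(scrape_with_next_button(FakeBrowser([["a"], ["b", "c"]])))
```

Here the last page is scraped before the check, so a single-page result is handled the same way as a multi-page one.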

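Thank you all for your answers!
Here is how I understand it.

★ When the "next link selector" is set in data scraping, the only option is to set the "Continue on error" property to True.

★ If that is not acceptable, another approach is to run data scraping inside a loop without setting the "next link selector", and decide whether to move to the next page by, for example, checking whether an element exists.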
