Improvements to the data scraper

Hi,

The data scraper is a great tool; however, some improvements would be nice.

List of improvements

  • The ability to change the data scraper configuration after running the wizard, for example the pattern and the column names when the data scraper has found a table.

  • Provide documentation of the “Data Definition”

  • Change the default behaviour of the “find next page” feature. Currently the data scraper seems to rely internally on a “Find element” activity for the next-page link. If I change the ContinueOnError setting to False, the activity doesn’t work when browsing multiple pages, because I get an exception on the last page.

Note that the data scraper will never fail as long as ContinueOnError is True, so you cannot do proper error handling.
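To make clear what I mean by proper error handling, here is a rough sketch in Python (not UiPath code). It assumes a hypothetical `ElementNotFound` error for the missing next-page link, and `scrape_all_pages` and `FakeSite` are made-up names for illustration only. The idea is that only the missing link on the last page ends the loop quietly, while any other failure reaches the caller.

```python
class ElementNotFound(Exception):
    """Hypothetical error: the next-page link is not present on the page."""


def scrape_all_pages(extract_rows, go_to_next_page, max_pages=100):
    """Collect rows page by page.

    extract_rows() returns the rows of the current page.
    go_to_next_page() raises ElementNotFound when there is no next-page link.
    """
    rows = []
    for _ in range(max_pages):  # safety cap so a looping site cannot run forever
        # Errors raised here are NOT swallowed, so the caller can handle them.
        rows.extend(extract_rows())
        try:
            go_to_next_page()
        except ElementNotFound:
            # The only expected failure: we are on the last page.
            break
    return rows


class FakeSite:
    """Tiny in-memory stand-in for a paginated web page (illustration only)."""

    def __init__(self, pages):
        self.pages = pages
        self.current = 0

    def extract_rows(self):
        return list(self.pages[self.current])

    def go_to_next_page(self):
        if self.current + 1 >= len(self.pages):
            raise ElementNotFound("no next-page link")
        self.current += 1


site = FakeSite([["a", "b"], ["c"]])
print(scrape_all_pages(site.extract_rows, site.go_to_next_page))  # ['a', 'b', 'c']
```

With a blanket ContinueOnError set to True, the equivalent of the `try` block would wrap the whole loop, so a broken selector and the end of the list would look the same.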

I agree with you on this point. It would help us filter out many objects that we don’t want to get.

Hi,
Thank you for your suggestion. I added it to our internal ideas tracker for our team to consider.

@Pablito Thanks. I would recommend that you prioritize this, as the issue below appears to be a design flaw that could cause false-positive scraping results.

@denndk Thanks for your feedback. The behavior you describe looks like a bug to me. However, I was not able to reproduce it by scraping data on regular web pages. Can you please give some more details on the target pages you used? Thank you.

@gheorghestan

Try using the ACME test site and scraping all the work items.

Please see attached file for an example and description of the problem.

Main.xaml (11.3 KB)