I have been trying to convert the page title string in the Extract Structured Data selector into a variable that I can set dynamically at run time, but it is not clickable. Can anyone help me with this? Thank you so much.
Try editing the selector from here without opening the text editor.
Regards
Roshan
Mark as solution if you found this helpful
@Penganimation
It looks like you have used Attach Browser, so the grayed-out selector is inherited from Attach Browser.
To make it dynamic, do it at the Attach Browser level. I would also suggest splitting the dynamic parts into separate pieces, such as title and url, with the approach below:
If you also want to make the browser type dynamic, you will also need to set the Browser enum type. This can be done with this approach:
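For the title/url split suggested above, a minimal sketch of what the Attach Browser selector might look like with variable templates. The variable names `pageTitle` and `pageUrl` are hypothetical String variables, and `chrome.exe` is only an assumption about the browser in use:

```xml
<!-- Sketch only: pageTitle and pageUrl are hypothetical String variables defined in the workflow -->
<html app='chrome.exe' title='{{pageTitle}}' url='{{pageUrl}}' />
```

Keeping title and url in separate variables means each one can be changed at run time without rebuilding the whole selector string.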
Thank you, PPR. Changing it at the Attach Browser level works. Another question: is there a way to automatically handle these symbols?
I could not use the webpage title string directly because I get ‘“”’ instead of &quot;
thank you for your help, I think the answer from PPR works for me
@Penganimation
With the variable-template approach using {{VariableName}} (see the first link above), you can achieve more, e.g. selector validation using the variable's default value. So it is highly recommended to use this approach instead of the …" + Var + "… string-concatenation approach.
The quote comes from the title, and the entity used for it should not block the selector. Otherwise, handle this with a variable as well.
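To illustrate the point about the entity, here is a sketch of a selector whose title contains double quotes written as `&quot;`. The title text is a made-up example, not the real Shopee page title, and `chrome.exe` is again an assumption:

```xml
<!-- Sketch only: hypothetical page title; &quot; is the XML entity for a double quote inside the attribute value -->
<html app='chrome.exe' title='Results for &quot;fans&quot; | Example Shop' />
```

An XML parser reads `&quot;` as a plain double quote, which is why the entity inside the selector is not expected to be a problem by itself.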
OMG, this works really well. Last question: when I am scraping the page
https://shopee.vn/search?keyword=fans
is there a way to save an image in a folder, or together in the same CSV file, for each item that has been recorded in my CSV file?
@Penganimation
I assume the following: you are scraping some product info, and the image for each product should be saved as well.
Images need to be saved into a filesystem folder and cannot be saved within the CSV.
To do this, check the following:
- modify the ExtractMetadata with an additional correlated data column:
- indicate the first 2 images and save
- the column will be empty, BUT when you edit the ExtractMetadata XML and change the attribute setting from text to src, it retrieves the image URL
Let me know once this step is successful. Then we will help you with the save-image part.
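The attribute change in the last step can be sketched roughly as follows. The column name `ImageUrl` is hypothetical and the `webctrl` path is shortened for illustration; the real path is whatever the wizard generated for your page:

```xml
<!-- Sketch only: the wizard generates attr='text', which stays empty for an img element -->
<!-- Editing it by hand to attr='src' makes the column return the image URL instead -->
<column exact='1' name='ImageUrl' attr='src'>
  <webctrl tag='div' class='col-xs-2-4 shopee-search-item-result__item'/>
  <webctrl tag='img' idx='1'/>
</column>
```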
Thank you, Peter. I have done according to your instructions for this page
https://shopee.vn/search?keyword=qoobee
And here is the result.
XML:
extracted csv.zip (1.1 KB)
It seems the image URL is not the one I want. And there are also some empty rows that appear from nowhere. Really strange.
Since the XML didn’t appear, I have attached the raw UiPath file for you here.
DataScrapping.zip (18.7 KB)
Here you go. It seemed to work now; it's just that there are some empty rows that contain only image links but no post links.
<extract>
<row exact='1'>
<webctrl tag='div' idx='1'/>
<webctrl tag='div' class='shopee-page-wrapper' idx='1'/>
<webctrl tag='div' class='container _2_Y1cV' idx='1'/>
<webctrl tag='div' class='jrLh5s' idx='1'/>
<webctrl tag='div' class='shopee-search-item-result' idx='1'/>
<webctrl tag='div' class='row shopee-search-item-result__items' idx='1'/>
<webctrl tag='div' class='col-xs-2-4 shopee-search-item-result__item'/>
<webctrl tag='div' idx='1'/>
<webctrl tag='a' idx='1'/>
<webctrl tag='div' class='_1gkBDw _2O43P5' idx='1'/>
</row>
<column exact='1' name='Column1' attr='text' name2='Column2' attr2='href'>
<webctrl tag='div' idx='1'/>
<webctrl tag='div' class='shopee-page-wrapper' idx='1'/>
<webctrl tag='div' class='container _2_Y1cV' idx='1'/>
<webctrl tag='div' class='jrLh5s' idx='1'/>
<webctrl tag='div' class='shopee-search-item-result' idx='1'/>
<webctrl tag='div' class='row shopee-search-item-result__items' idx='1'/>
<webctrl tag='div' class='col-xs-2-4 shopee-search-item-result__item'/>
<webctrl tag='div' idx='1'/>
<webctrl tag='a' idx='1'/>
<webctrl tag='div' class='_1gkBDw _2O43P5' idx='1'/>
<webctrl tag='div' class='_3eufr2' idx='1'/>
<webctrl tag='div' class='O6wiAW' idx='1'/>
<webctrl tag='div' class='_1NoI8_ _16BAGk' idx='1'/>
</column>
<column exact='1' name='Column4' attr='src'>
<webctrl tag='div' idx='1'/>
<webctrl tag='div' class='shopee-page-wrapper' idx='1'/>
<webctrl tag='div' class='container _2_Y1cV' idx='1'/>
<webctrl tag='div' class='jrLh5s' idx='1'/>
<webctrl tag='div' class='shopee-search-item-result' idx='1'/>
<webctrl tag='div' class='row shopee-search-item-result__items' idx='1'/>
<webctrl tag='div' class='col-xs-2-4 shopee-search-item-result__item'/>
<webctrl tag='div' idx='1'/>
<webctrl tag='a' idx='1'/>
<webctrl tag='div' class='_1gkBDw _2O43P5' idx='1'/>
<webctrl tag='div' class='_3ZDC1p _1tDEiO' idx='1'/>
<webctrl tag='img' class='_1T9dHf _3XaILN' idx='1'/>
</column>
</extract>
<column exact="0" name="Column1" attr="src">
<webctrl tag="div" idx="1"/>
<webctrl tag="div" class="shopee-page-wrapper" idx="1"/>
<webctrl tag="div" class="container _2_Y1cV" idx="1"/>
<webctrl tag="div" class="jrLh5s" idx="1"/>
<webctrl tag="div" class="shopee-search-item-result" idx="1"/>
<webctrl tag="div" class="row shopee-search-item-result__items" idx="1"/>
<webctrl tag="div" class="col-xs-2-4 shopee-search-item-result__item"/>
<webctrl tag="div" idx="1"/>
<webctrl tag="a" idx="1"/>
<webctrl tag="div" class="_1gkBDw _2O43P5" idx="1"/>
<webctrl tag="div" class="_3ZDC1p _1tDEiO" idx="1"/>
<webctrl tag="img"/>
</column>
It looks like some of the class names are dynamic. Try removing them or making them dynamic, then test again.
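As a rough sketch of the "remove it" option, here is the image column with the generated-looking class values (such as `_1gkBDw _2O43P5`) dropped and only the readable ones kept. This is untested against the live page, and the `idx` values may need adjusting once the classes are gone:

```xml
<!-- Sketch only: hashed-looking class attributes removed, readable class names kept -->
<column exact='1' name='Column4' attr='src'>
  <webctrl tag='div' idx='1'/>
  <webctrl tag='div' class='shopee-page-wrapper' idx='1'/>
  <webctrl tag='div' idx='1'/>
  <webctrl tag='div' idx='1'/>
  <webctrl tag='div' class='shopee-search-item-result' idx='1'/>
  <webctrl tag='div' class='row shopee-search-item-result__items' idx='1'/>
  <webctrl tag='div' class='col-xs-2-4 shopee-search-item-result__item'/>
  <webctrl tag='div' idx='1'/>
  <webctrl tag='a' idx='1'/>
  <webctrl tag='div' idx='1'/>
  <webctrl tag='div' idx='1'/>
  <webctrl tag='img' idx='1'/>
</column>
```

The same change would need to be applied to the row and the other column definitions so that all of them match the same items.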
ok thank you so much Peter. let me give it a try.