WhiteHatBox
Timbalord

How to scrape only numbers

2016/10/27 07:20:47

Hi everyone,


I am completely new to Botchief. At the moment I am trying to figure out how to build a bot for a chat site. The login works, and several ifs work to get to the chat part of the web page, but then:


More than 300 users have sent me a chat message, so I am trying to figure out how to scrape a list of all the user IDs and save them to a table.


In the list view I see 20 users per page who have written me a message.

When I use the scraper to get the user IDs, it finds the tag below 20 times, so that part is perfect.

But I am not able to scrape only the value of rel="", which has to be the user ID. I only get the whole tag or nothing.

<span class="b-link im_user" rel="479853413" data-uid="479853413"></span>


Thank you in advance

zhoucongcq
2016/10/27 23:23:12

I am very sorry, but Botchief can't scrape the value of the rel property yet; we will add it in the next update.

I think you can use other ways to extract the user_id after you scrape the tag.

Way 1: If the length of the user_id is always 9, you can use Variable Operate->StringInterception to get the user_id.

Here is a sample module: Way1_StringInterception.dat

Way 2: Use the Replacement and Delimiters functions in the Variable Operate action.

Here is a sample module: Way2_ReplacementAndDelimiters.dat
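
For readers without the sample modules, the string logic behind the two ways can be sketched outside Botchief. The Python below is only an illustration and does not use any Botchief API; the function names are made up, and how the .dat modules actually work internally is not shown in this thread, so treat the details as assumptions.

```python
# The tag exactly as it was scraped in the first post.
TAG = '<span class="b-link im_user" rel="479853413" data-uid="479853413"></span>'

MARKER = 'rel="'


def by_fixed_length(tag: str, length: int = 9) -> str:
    """Way 1 idea: cut a fixed number of characters after rel=".

    Only valid while every user_id has exactly `length` characters.
    """
    start = tag.index(MARKER) + len(MARKER)
    return tag[start:start + length]


def by_delimiters(tag: str) -> str:
    """Way 2 idea: keep what sits between rel=" and the next quote.

    Works for any id length, as long as the id contains no quote.
    """
    after_marker = tag.split(MARKER, 1)[1]
    return after_marker.split('"', 1)[0]


if __name__ == "__main__":
    print(by_fixed_length(TAG))
    print(by_delimiters(TAG))
```

The delimiter variant is probably the safer choice if the IDs are not guaranteed to stay 9 digits long.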

zhoucongcq
2016/10/27 23:41:34

I made a mistake. When you find the SPAN tag, you can type rel into the property field yourself. Then you can scrape the user_id.
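
For comparison, reading an attribute value directly instead of cutting it out of the tag text looks roughly like this in plain Python. This is a sketch using the standard library's html.parser, not Botchief itself; the class name RelCollector and the helper extract_user_ids are made up for the example, and the im_user class filter is taken from the tag in the first post.

```python
from html.parser import HTMLParser


class RelCollector(HTMLParser):
    """Collects the rel value of every <span> that has the im_user class."""

    def __init__(self):
        super().__init__()
        self.user_ids = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "span" and "im_user" in classes and attr_map.get("rel"):
            self.user_ids.append(attr_map["rel"])


def extract_user_ids(html: str) -> list:
    """Return the rel values in document order."""
    collector = RelCollector()
    collector.feed(html)
    return collector.user_ids


if __name__ == "__main__":
    page = (
        '<span class="b-link im_user" rel="479853413" data-uid="479853413"></span>'
        '<span class="other" rel="ignored"></span>'
    )
    print(extract_user_ids(page))
```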

Timbalord
2016/10/28 04:49:48

Thank you so much. I didn't realize that I can type my own input into the properties dropdown. Now it's working.


But another question has come up:

The website only shows me 20 of about 300 incoming messages. I have to scroll down the list, and then AJAX/JavaScript loads the next messages. How can I automate this?


If I scrape, then scroll, then scrape again, I get most of the user IDs three or four times, because the scraper starts from the beginning each time.

zhoucongcq
2016/10/30 23:20:39
Maybe you can try scrolling down the list first and then scraping. There is no better way to scrape content on an AJAX/JavaScript page.
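
If scraping once after all the scrolling is not practical, another option (not something suggested in this thread, just a common workaround) is to keep scraping after each scroll and drop the IDs that were already seen. The Python below only sketches that filtering step; merge_new_ids is a hypothetical helper, and whether Botchief offers an equivalent built-in action is not confirmed here.

```python
def merge_new_ids(seen: set, scraped: list) -> list:
    """Return only the IDs not seen before, in scrape order.

    `seen` is updated in place, so it can be reused across scroll passes.
    """
    fresh = []
    for user_id in scraped:
        if user_id not in seen:
            seen.add(user_id)
            fresh.append(user_id)
    return fresh


if __name__ == "__main__":
    seen = set()
    table = []
    # Each inner list stands for one scrape pass after a scroll;
    # later passes repeat the earlier IDs because the scraper restarts at the top.
    passes = [
        ["479853413", "100000001"],
        ["479853413", "100000001", "100000002"],
    ]
    for scraped in passes:
        table.extend(merge_new_ids(seen, scraped))
    print(table)
```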