I want to run lots of searches (any search engine will do) on certain words: one search per word, and I have a huge list of words.
The result should be a list of URLs for each word.
I need two kinds of maximums to set up:
-Only 1 URL per domain for each word
-Maximum 3 URLs for each domain overall
The script should run 24/7/365 on a rented VPS, and it should not be blocked by the search engine:
You can use random waiting time between runs
You can limit the hourly/daily/weekly runs
You can use rotating IPs, proxies, etc.
Or whatever else makes sure it does not get blocked.
In your offer, please specify how many searches per day you can guarantee without being blocked, and which methods you will use to avoid blocking.
BR
Istvan
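The two caps above (one URL per domain per word, at most three URLs per domain overall) could be enforced with a small post-processing filter. A minimal sketch, assuming results arrive as a dict mapping each word to its list of result URLs (the function name and cap parameter are illustrative):

```python
import urllib.parse

def filter_urls(results_per_word, max_per_domain_total=3):
    """Apply both caps: one URL per domain for each word, and at most
    `max_per_domain_total` URLs per domain across all words."""
    domain_totals = {}            # domain -> URLs kept so far, all words
    filtered = {}
    for word, urls in results_per_word.items():
        seen_for_word = set()     # domains already used for this word
        kept = []
        for url in urls:
            domain = urllib.parse.urlsplit(url).netloc.lower()
            if domain in seen_for_word:
                continue          # cap 1: only 1 URL per domain per word
            if domain_totals.get(domain, 0) >= max_per_domain_total:
                continue          # cap 2: max 3 URLs per domain overall
            seen_for_word.add(domain)
            domain_totals[domain] = domain_totals.get(domain, 0) + 1
            kept.append(url)
        filtered[word] = kept
    return filtered
```

Running the filter after each batch of searches keeps the output lists within both limits regardless of how many raw results the engine returns.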
Hi,
We've done Yahoo searches for a large number of queries in the past. We have more than a year of experience with web-scraping techniques and have already set up scripts to scrape data from listing sites, e-commerce sites, etc. We use rotating proxy IPs, a Firefox user-agent, random time delays between requests, and similar measures to avoid being blocked.
Thanks
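The measures named in this bid (rotating proxy IPs, a Firefox user-agent, random delays) could be combined along these lines. A sketch only: the proxy addresses, user-agent strings, and delay bounds are illustrative placeholders, not values the bidder specified:

```python
import itertools
import random
import time
import urllib.request

# Illustrative pools; a real deployment would load its own proxy list
# and a larger set of user-agent strings.
PROXIES = itertools.cycle(["203.0.113.1:8080", "203.0.113.2:8080"])
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
]

def build_request(url):
    """Return a (Request, opener) pair using the next proxy in rotation
    and a randomly chosen Firefox user-agent."""
    proxy = next(PROXIES)
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    req = urllib.request.Request(
        url, headers={"User-Agent": random.choice(USER_AGENTS)})
    return req, opener

def polite_sleep(lo=5.0, hi=20.0):
    """Random delay between requests; the bounds are illustrative."""
    time.sleep(random.uniform(lo, hi))
```

Each search would then call `build_request`, fetch via `opener.open(req)`, and call `polite_sleep()` before the next query.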
Hi,
I have already written a similar script for a client to parse Google SERP results.
The current script can parse 2000 keywords per day, scanning the first 5 pages of search results for each keyword.
The current script does not use a proxy; each additional IP routed through a proxy doubles the number of results that can be parsed. The script can even parse country-specific results.
The script randomizes each request so it appears to come from a different browser type, making it difficult for Google to identify as script automation.
Please let me know in case I can be of any help.
Thanks,
Arun.
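A figure like 2000 keywords per day implies a pacing budget of roughly 43 seconds between searches. One way to spread a daily quota over 24 hours with randomized gaps, as both bids suggest, is sketched below; the function name and jitter factor are illustrative assumptions, not part of either bid:

```python
import random

def schedule_delays(queries_per_day, jitter=0.5, day_seconds=86400):
    """Return one randomized inter-query delay (in seconds) per query,
    centered on an even spread of the daily quota over 24 hours.
    E.g. 2000 queries/day gives a base gap of 86400/2000 = 43.2 s,
    jittered to anywhere between 50% and 150% of that."""
    base = day_seconds / queries_per_day
    return [base * random.uniform(1 - jitter, 1 + jitter)
            for _ in range(queries_per_day)]
```

The worker loop would sleep for each delay in turn; because the jitter is symmetric, the total still averages out to one day's worth of queries.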