I will provide 20k URLs. I need you to access the AWIS ALEXA API and download the URL Information, Historical Web Traffic, Sites Linking In, and Browse Category data as provided by the API. We need this within 1 week. Your bid should include the AWIS API download cost.
AWIS ALEXA API:
[login to view URL]
Alexa Web Information Service
Sign Up For AWIS
The Alexa Web Information Service API makes Alexa’s vast repository of information about the web traffic and structure of the web available to developers.
This page contains the following categories of information. Click to jump down:
Service Highlights
Pricing
Detailed Description
Intended Usage and Restrictions
Service Highlights
Gather information about web sites, including historical web traffic data, contact information, related links and more.
Access historical web traffic data for web sites to analyze growth and understand the effects of specific events on web site traffic
Build a web directory into your web site or service using an Alexa API enhanced DMOZ-based browse service
Access the list of sites linking to any given site
Pricing
Pay only for what you use. There is no minimum fee, and no start-up cost.
Up to 1,000 requests/month - Free
1,001 to 1,000,000 requests/month - $0.00045 per request
Over 1,000,000 requests/month - $0.00030 per request
(Alexa Web Information Service is sold by Amazon Web Services, Inc.)
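The tiered pricing above can be turned into a quick cost estimate for this job. A minimal sketch in Python, using only the tier boundaries from the list above (the 80,000-request figure assumes one request per action per URL, i.e. 20k URLs times the 4 actions requested):

```python
def awis_cost(requests: int) -> float:
    """Estimate the monthly AWIS bill for a given request count, using
    the published tiers: first 1,000 free, $0.00045 each up to
    1,000,000, $0.00030 each beyond that."""
    free, tier1_cap = 1_000, 1_000_000
    cost = 0.0
    if requests > free:
        cost += (min(requests, tier1_cap) - free) * 0.00045
    if requests > tier1_cap:
        cost += (requests - tier1_cap) * 0.00030
    return round(cost, 2)

# 20k URLs x the 4 actions requested in this job = 80,000 requests
print(awis_cost(20_000 * 4))   # 35.55
```

Note this counts one request per URL per action; actions that need paging or long date ranges (see the bids below) multiply the request count.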
Detailed Description
AWIS provides the following operations, or "actions":
URL Information
The URL Information action gives developers direct access to information pertaining to web pages and sites on the web that Alexa Internet has gathered through its extensive web crawl and web usage analysis. Examples of information that can be accessed are site popularity, related sites, detailed usage/traffic stats, supported character-set/locales, and site contact information. This is most of the data that can be found on the Alexa web site and in the Alexa toolbar, plus additional information that is being made available for the first time with this release.
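As a rough sketch of what a UrlInfo call looks like: the action name and its parameters go into the query string of a GET request to the AWIS endpoint. The `ResponseGroup` values below are illustrative choices, not taken from this page, and a real request must additionally carry AWS authentication (a signature), which is omitted here:

```python
from urllib.parse import urlencode

def urlinfo_query(url: str) -> str:
    """Build the (unsigned) query string for an AWIS UrlInfo request.
    The ResponseGroup names are illustrative assumptions; a real
    request must also include AWS authentication parameters."""
    params = {
        "Action": "UrlInfo",
        "Url": url,
        "ResponseGroup": "Rank,LinksInCount,ContactInfo",  # assumed groups
    }
    return "https://awis.amazonaws.com/?" + urlencode(params)

print(urlinfo_query("example.com"))
```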
Historical Web Traffic
The Historical Traffic action gives programmatic access to web site traffic rank, reach, and page views going back to August 2007. Use this action to compare a web site’s popularity over time, identify trends, or display graphs of traffic.
Sites Linking In
The Sites Linking In action returns the sites linking to a specified web site.
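Sites linking in come back a page at a time, so collecting the full list means looping over a start offset. A hedged sketch of that loop shape (the 20-results-per-request cap and the start/count paging scheme are assumptions about the API, not stated on this page):

```python
def sites_linking_in_pages(total_wanted: int, page_size: int = 20):
    """Yield (start, count) pairs for paging through SitesLinkingIn
    results. The 20-per-request page size is an assumption; adjust it
    to whatever limit the API actually enforces."""
    start = 0
    while start < total_wanted:
        yield start, min(page_size, total_wanted - start)
        start += page_size

print(list(sites_linking_in_pages(50)))   # [(0, 20), (20, 20), (40, 10)]
```

The practical point for this job: each URL may cost several requests for this action, not one, which matters for the pricing above.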
Browse Category
The Browse Category action allows developers to access all of the information available at the Open Directory without the need to download or host the directory database on their own systems. This service returns web pages and sub-categories within a specified category. The returned URLs are filtered through the Alexa traffic data and then ordered by popularity.
Intended Usage and Restrictions
Your use of this service is subject to the Amazon Web Services Customer Agreement.
Hi, expert web/data scraper here with over 17 years' experience in programming and RDBMS - please see my reviews.
I use Perl for this kind of job.
I already have a Perl script prepared for this job.
I'm able to extract data fast.
£150 GBP in 3 days
5.0 (1 review)
2.4
9 freelancers are bidding on average £220 GBP for this job
Hi sir,
I am a scraping expert and I have done many similar projects - please check my feedback and you will see.
Can you tell me more details? Then I will provide demo data for you.
Thanks,
Kimi
Dear Sir,
I'm very much delighted to let you know that I have done data scraping
with PHP-cURL from many sites. I scraped the data from web sites
and then wrote it to a MySQL database or to Excel, CSV, or XML files.
I have worked on many similar projects and have extensive experience in data-mining projects.
I can finish this task in a short time, with the best quality, and I can assure 100% accuracy.
Please give me the opportunity to do the work.
With Kind Regards,
Debdulal Roy Proshanta
Hello,
First, to have an exact estimate of the API cost, we would need to decide exactly what you need from the available data. To give an example with traffic history: if you need the last 30 days of data for each site, that is 20k requests for history. If you need the entire history (2007 to now), then the number of requests just for traffic is up to 8 years * 12 months * 20k URLs, so almost 2 million requests (probably slightly fewer, as not all sites will have data for the full interval).
So the API cost can range from a few tens to a few hundred dollars, depending on exactly which data is collected.
My proposal should therefore be considered without the API price included. I do use AWS, and I have fast internet connectivity and access to more machines if needed, but I think the clearest and most straightforward arrangement would be for me to write the collection code against the API and for you to run it under your own AWS account, using mine only during development and testing. This is also for legal reasons: if the data were later used in improper ways, I would be the one responsible to Amazon, and "it was a freelance project" would be hard to justify.
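The request arithmetic in the bid above can be checked directly. This sketch keeps the bid's own assumption that one Historical Traffic request covers roughly one month of data, and prices the result with the tiers from the listing:

```python
urls = 20_000
months = 8 * 12            # ~August 2007 to the time of this posting
history_requests = urls * months   # one request per URL per month (assumed)
print(history_requests)    # nearly 2 million requests just for traffic history

# Price with the listing's tiers: first 1,000 free,
# $0.00045 each up to 1,000,000, $0.00030 each beyond.
cost = (1_000_000 - 1_000) * 0.00045 + (history_requests - 1_000_000) * 0.00030
print(round(cost, 2))
```

This lands in the "few hundred dollars" range the bid mentions for the full-history case, which is why the date range requested drives the budget far more than the URL count does.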