I need to develop code that can be placed on a server to scrape information from a website. The website is a public site which posts energy prices. Specifically, it is www.newyorkpowertochoose.com. I need all utility provider results and other fields to be scraped from the website at regular intervals (e.g. every 6 hours, every day) and placed into a Microsoft SQL Server database. The website requires a zip code before it displays pricing information. I will provide approximately 16 target zip codes. Each scrub would capture all information for all specified zip codes at the interval specified.
The data sought from the site includes:
- Service Type (Electricity or Gas)*
- Zip Code*
- Utility Name*
- Supplier Name
- Rate Type (Fixed or Variable)
- Rate (Numeric Only)
- Green Offer (Yes/No)
- Minimum Term (Months Only)
- Cancellation Fee (Numeric Only)
- Cancellation Fee Text (used only when qualifying text is present)
- Comments (Text)
- Date/Time Stamp of Scrub
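As a rough illustration of how the fields above might be normalized before they are written to the database, here is a minimal Python sketch. The class name `OfferRecord`, the helper names, and the keys of the raw dictionary are all hypothetical; the actual page layout and the final column names would be settled during the site review and database design milestones.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def first_number(text: str) -> Optional[float]:
    """Return the first numeric value found in text, or None if there is none."""
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


@dataclass
class OfferRecord:
    """One scraped offer; field names are illustrative, not an agreed schema."""
    service_type: str                      # "Electricity" or "Gas"
    zip_code: str
    utility_name: str
    supplier_name: str
    rate_type: str                         # "Fixed" or "Variable"
    rate: Optional[float]                  # numeric only
    green_offer: bool                      # Yes/No
    minimum_term_months: Optional[int]     # months only
    cancellation_fee: Optional[float]      # numeric only
    cancellation_fee_text: Optional[str]   # only when qualifying text is present
    comments: str
    scraped_at: datetime


def build_record(raw: dict, scraped_at: datetime) -> OfferRecord:
    """Convert a dict of raw page strings (hypothetical keys) into a typed record."""
    fee_raw = raw.get("cancellation_fee", "").strip()
    # Keep the raw fee text only when it contains words beyond the amount itself.
    has_words = bool(re.search(r"[A-Za-z]", fee_raw))
    term = first_number(raw.get("minimum_term", ""))
    return OfferRecord(
        service_type=raw.get("service_type", "").strip(),
        zip_code=raw.get("zip_code", "").strip(),
        utility_name=raw.get("utility_name", "").strip(),
        supplier_name=raw.get("supplier_name", "").strip(),
        rate_type=raw.get("rate_type", "").strip(),
        rate=first_number(raw.get("rate", "")),
        green_offer=raw.get("green_offer", "").strip().lower() == "yes",
        minimum_term_months=int(term) if term is not None else None,
        cancellation_fee=first_number(fee_raw),
        cancellation_fee_text=fee_raw if has_words else None,
        comments=raw.get("comments", "").strip(),
        scraped_at=scraped_at,
    )
```

Keeping the parsing separate from the page-fetching code means the numeric-only and months-only rules can be checked against sample strings without touching the live site.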
Lastly, I would also like a screenshot of each scrubbed page saved as an image file.
I will provide the Microsoft SQL Server database IP address and login credentials, a directory to save the page images, and the zip codes to be used for each scrub (you can use 10003 as a test zip code).
I need a way to adjust the scrub interval (e.g. daily / every 6 hours) and the zip codes in the future if necessary. This adjustment mechanism doesn't need to be sophisticated; it only needs to be quick and easy to use. To be specific, I am reasonably tech savvy and could go into the code and adjust it if the settings are located in only one or two places, though a database-driven interval and zip code table might work best.
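To make the idea of a single adjustment point concrete, here is a minimal Python sketch that keeps the interval and zip codes in one small JSON setting. The setting names `interval_hours` and `zip_codes` and the function names are assumptions for illustration only; the same two values could just as well live in a database table, as suggested above.

```python
import json
import re
from datetime import datetime, timedelta

# Hypothetical settings text; in practice this could be a file or a database row.
DEFAULT_CONFIG = '{"interval_hours": 6, "zip_codes": ["10003"]}'


def load_config(text: str) -> dict:
    """Parse and validate the scrub settings from a JSON string."""
    config = json.loads(text)
    interval = config["interval_hours"]
    if isinstance(interval, bool) or not isinstance(interval, (int, float)) or interval <= 0:
        raise ValueError("interval_hours must be a positive number")
    zip_codes = [str(z) for z in config["zip_codes"]]
    for z in zip_codes:
        # Expect plain 5-digit zip codes such as 10003.
        if not re.fullmatch(r"\d{5}", z):
            raise ValueError(f"invalid zip code: {z!r}")
    return {"interval": timedelta(hours=interval), "zip_codes": zip_codes}


def next_run(last_run: datetime, interval: timedelta) -> datetime:
    """Return the time at which the next scrub is due."""
    return last_run + interval
```

Changing the schedule from every 6 hours to daily would then mean editing one value (`6` to `24`), and adding a zip code would mean appending one entry to the list.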
To help ensure we remain aligned throughout the development process, I propose the following gates, but I am open to your input:
Milestone 1: Developer Review Target Site & Verify Feasibility of Scrubs (each service type, each zip code, at specified intervals)
Milestone 2: Agree on Target Database Structure
Milestone 3: Test Single Scrub (1 zip code) with input sent to target Microsoft SQL Server database & page image sent to specified directory
Milestone 4: Full Scrub Code Finalized
The budget is moderately flexible. I am willing to pay a reasonable price, without overpaying, to get it done right, in a timely manner, and by a reliable coder with whom I could do business in the future. I am estimating this project to be in the "Small" category ($200-$1000 USD), but please indicate if you feel reasonable compensation should be higher or lower. I will consider all reasonable bids.
Hi, I am an expert in web scraping and coding. Let me do this project, and I assure you that the code will be 100% accurate and built according to your requirements. Thanks.
Hi, I've built a number of similar web scraping applications before, either as Windows scheduled tasks or as fully autonomous Windows services, so they can run in the background on a regular schedule without any user input. I've checked the target website with the example zip code you provided and confirmed that it is suitable for web scraping. All my code is fully commented and compliant with Microsoft's StyleCop design and development guidelines.