Part 1 Python Application
Using Google App Engine and Python, create a simple web crawler which goes to the following
Wikipedia article [login to view URL]
Once you have accessed the website. Use BeautifulSoupto search the content of the web
page that has been downloaded. Loop over all of the links on the page, and perform a check to
see if the link you are currently processing contains any of the following terms:
US
President
America
States
At the end of the page, once all of the processing has completed. Print out an output of how
many times each of terms has been found while looping through the link titles.
An example of the output at the end of the process:
US Found 4 times
President Found 1 times
America Found 2 times
States Found 7 times
During the task of looping over the links, every 20th link that you encounter,you should print
out the link number you are currently on and the link itself to the web page.
An example of this would look like this:
US Found 4 times
President Found 1 times
America Found 2 times
States Found 7 times
Link 20: [login to view URL]
Link 40: [login to view URL]
Link 60 [login to view URL]
….
….
...
Your project must be hosted on Google App Engine. Add comments to your source code and
add a link to where your project is hosted on [login to view URL]
Part 2 Technical Report
Assignment Description
A company has contacted you and asked you to compile a document outlining the
requirements needed for hosting a python application that requires a database to serve
500 different users in the company. The company also needs a method for storing files,
backup files. As an element of the application hosting process they are interested in
having redundancy built into their application, so if one server fails another is available
to continue on where the other left off.
Create a 5 page document outlining the technologies that are available to host this
application in a cloud environment. Make reference to the optimizations that can be
made from the database up to the hosting of the application.
In your document, make specific reference to the different specifications of services
available and how much they would cost the client in total.
This report should be coherent, outlining each element of the hosting process and
outlining exactly what is needed by the client at every step of the process.
This document should be written in Times New Roman font, size 11pt. All text used
should write written in your own words. Remember plagiarism detection is done on all
submissions.
Hi there,
All requirements are pretty clear and straightforward to implement. I'll use python2.7 and bs4 branches, if you want to target different versions like py3 branch let me know. I can start as soon as you award the project and expect to be done within 5 days.
Thanks
Ibrahim