This mini project has two steps:
- First, retrieve the first XX (50 to 100) URLs returned by Google for specific keywords (French results to begin with; proxies may be used).
- Second, scrape the URLs obtained from Google and extract the main content of each page as text, excluding menus, footers, etc.
The output file should be plain text, 1 sentence per line.
The script must be able to run on Linux (Scrapy framework?).
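To make the second step and the output format concrete, here is a minimal sketch of one possible approach. It uses only the Python standard library rather than Scrapy, and every name in it (`SKIP_TAGS`, `BLOCK_TAGS`, `MainTextParser`, `html_to_sentences`) is hypothetical. It assumes that menus and footers are marked up with tags such as `nav` and `footer`, which real pages do not always do, and it splits sentences naively on `.`, `!` and `?`, so abbreviations will be split incorrectly.

```python
import re
from html.parser import HTMLParser

# Assumption: content inside these tags is boilerplate. Tune per site.
SKIP_TAGS = {"script", "style", "noscript", "nav", "header", "footer", "aside", "form"}
# Tags that mark the boundary between two blocks of text.
BLOCK_TAGS = {"p", "div", "li", "br", "tr", "section", "article",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class MainTextParser(HTMLParser):
    """Collects visible text block by block, ignoring SKIP_TAGS content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0   # > 0 while inside a skipped element
        self.current = []     # text pieces of the block being read
        self.blocks = []      # finished, whitespace-normalised blocks

    def flush_block(self):
        text = re.sub(r"\s+", " ", "".join(self.current)).strip()
        if text:
            self.blocks.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.flush_block()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.flush_block()

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.current.append(data)


def html_to_sentences(html):
    """Return the page's main text as a list of sentences."""
    parser = MainTextParser()
    parser.feed(html)
    parser.close()
    parser.flush_block()  # keep any trailing text after the last block tag
    sentences = []
    for block in parser.blocks:
        # Naive split: break after ., ! or ? when followed by whitespace.
        sentences.extend(s for s in re.split(r"(?<=[.!?])\s+", block) if s)
    return sentences


if __name__ == "__main__":
    sample = ("<html><body><nav><a href='/'>Accueil</a></nav>"
              "<article><h1>Titre</h1>"
              "<p>Bonjour le monde. Ceci est\nun test. Fin.</p></article>"
              "<footer>Mentions</footer></body></html>")
    # One sentence per line, as the output file requires.
    print("\n".join(html_to_sentences(sample)))
```

Writing the result of `"\n".join(html_to_sentences(page))` to a file gives the plain text, one sentence per line layout. Pages that put their menus in plain `div` elements would need site-specific rules or a dedicated content-extraction library.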
42 freelancers are bidding on average €150 for this job
Hey, I can build this tool in Python to 1) search Google by keyword and 2) filter the results based on your criteria. I would love to talk more about this in chat.
Hello there, Python scraping expert here. I'm used to Scrapy. What specific content exactly? Are the webpages chaotically different from one another? Let me know more about the task in chat. Pandelis