PHP command-line (CLI) script to crawl or spider a website and log data
$30-250 USD
In Progress
Posted about 11 years ago
Paid on delivery
I need a very small PHP CLI (command-line PHP) script that crawls a website and finds every web page on that site, starting from the site's default home page.
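To make the crawl requirement concrete, here is a minimal sketch of how same-site link discovery and a breadth-first crawl could look in PHP. The function names `extractSameHostLinks` and `crawlSite`, the `$fetch` callback, and the `$maxPages` limit are all hypothetical and not part of the request. The sketch assumes the `dom` extension is enabled, treats "same site" as "same host name", and resolves relative links only in a simplified way (no `../` handling, no `<base>` tag, no port numbers).

```php
<?php
// Hypothetical helper: return the unique same-host links found in one page's HTML.
function extractSameHostLinks(string $html, string $pageUrl): array
{
    $base = parse_url($pageUrl);
    if (trim($html) === '' || empty($base['scheme']) || empty($base['host'])) {
        return [];
    }
    $origin = $base['scheme'] . '://' . $base['host']; // assumption: default port

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate messy real-world HTML
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        // Drop the #fragment so the same page is not queued twice.
        $href = explode('#', trim($a->getAttribute('href')), 2)[0];
        if ($href === '' || preg_match('/^(mailto|tel|javascript):/i', $href)) {
            continue;
        }
        if (preg_match('#^https?://#i', $href)) {
            $url = $href;                          // already absolute
        } elseif (strpos($href, '//') === 0) {
            $url = $base['scheme'] . ':' . $href;  // protocol-relative
        } elseif ($href[0] === '/') {
            $url = $origin . $href;                // root-relative
        } else {
            // Relative to the current page's directory (simplified).
            $path = $base['path'] ?? '/';
            $url = $origin . substr($path, 0, strrpos($path, '/') + 1) . $href;
        }
        if (strcasecmp((string) parse_url($url, PHP_URL_HOST), $base['host']) === 0) {
            $links[$url] = true; // array keys de-duplicate
        }
    }
    return array_keys($links);
}

// Hypothetical breadth-first crawl. $fetch is any callable that returns the
// HTML for a URL as a string, or a non-string (e.g. false) on failure.
function crawlSite(string $startUrl, callable $fetch, int $maxPages = 500): array
{
    $queue = [$startUrl];
    $seen = [$startUrl => true];
    $pages = [];
    while ($queue && count($pages) < $maxPages) {
        $url = array_shift($queue);
        $html = $fetch($url);
        if (!is_string($html)) {
            continue; // skip pages that could not be fetched
        }
        $pages[$url] = $html;
        foreach (extractSameHostLinks($html, $url) as $link) {
            if (!isset($seen[$link])) {
                $seen[$link] = true;
                $queue[] = $link;
            }
        }
    }
    return $pages; // map of URL => HTML
}
```

For a real run, `$fetch` could be a thin wrapper around `file_get_contents` or cURL. Passing the fetcher in as a callback keeps the crawl logic testable without network access.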
As it follows links to crawl the entire site, I need it to record the following data in a CSV file:
1. url
2. meta title
3. meta description
4. meta keywords
5. H1 tags separated by pipes (in one field)
6. H2 tags separated by pipes (in one field)
7. H3 tags separated by pipes (in one field)
8. If simple and affordable (this is on a tiny budget): the images on each page that are not layout-related and do not also appear on the other pages of the site. Ideally, only images unique to that page. Record the image file names in one cell separated by pipes and, in the last cell, the image alt text separated by pipes.
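As an illustration of the fields listed above, the following is a minimal sketch of extracting them from one page and producing a CSV row. The names `extractPageFields` and `writeCsvRow` are hypothetical. The sketch assumes that "meta title" means the page's `<title>` element and that the `dom` extension is available. It records every `<img>` on the page; filtering down to images unique to a page would need a second pass that counts how many pages each file name appears on, which is not shown here.

```php
<?php
// Hypothetical helper: pull the requested fields out of one page's HTML.
function extractPageFields(string $url, string $html): array
{
    $row = [
        'url' => $url, 'title' => '', 'description' => '', 'keywords' => '',
        'h1' => '', 'h2' => '', 'h3' => '', 'images' => '', 'alts' => '',
    ];
    if (trim($html) === '') {
        return $row;
    }

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate invalid markup
    $dom->loadHTML($html); // note: non-ASCII pages may need explicit charset handling
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Collapse runs of whitespace so each value fits cleanly in one CSV cell.
    $clean = function (string $s): string {
        return trim((string) preg_replace('/\s+/', ' ', $s));
    };

    $titles = $dom->getElementsByTagName('title');
    if ($titles->length > 0) {
        $row['title'] = $clean($titles->item(0)->textContent);
    }

    foreach ($dom->getElementsByTagName('meta') as $meta) {
        $name = strtolower($meta->getAttribute('name'));
        if ($name === 'description' || $name === 'keywords') {
            $row[$name] = $clean($meta->getAttribute('content'));
        }
    }

    // Headings: all texts of each level joined with pipes in one field.
    foreach (['h1', 'h2', 'h3'] as $tag) {
        $texts = [];
        foreach ($dom->getElementsByTagName($tag) as $node) {
            $texts[] = $clean($node->textContent);
        }
        $row[$tag] = implode('|', $texts);
    }

    // Images: file names and alt texts kept in the same order.
    $files = [];
    $alts = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src === '') {
            continue;
        }
        $files[] = basename((string) parse_url($src, PHP_URL_PATH));
        $alts[] = $clean($img->getAttribute('alt'));
    }
    $row['images'] = implode('|', $files);
    $row['alts'] = implode('|', $alts);

    return $row;
}

// Hypothetical helper: append one row to an open CSV file handle.
function writeCsvRow($handle, array $row): void
{
    fputcsv($handle, array_values($row), ',', '"', '\\');
}
```

A caller would open the output once with `fopen('pages.csv', 'w')` (the file name is only an example), write a header row, then call `writeCsvRow` for each crawled page. One caveat: a heading or alt text that itself contains a pipe character would be indistinguishable from a separator.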
Please message me if there's anything you need.
Hello, we would be very happy to help you with this project. We have done many data processing and scraping jobs, including handling complex JavaScript-based sites and building multi-threaded solutions, so that each project gets a fast and efficient result. Please check your PM for further details on the project. Thank you.