This can be implemented in PHP, Perl, or Linux shell scripts. The script must scale to handle large XML feeds, typically 250 to 350 MB.
The project involves the following:
1. Download the XML feed from a given location.
2. The feed will be zipped, so the program must unzip it to extract the XML file.
2a. Verify that the XML feed has been updated (this can usually be determined from the file name). If it has been updated, go to the next step.
3. The XML DTD and fields will be provided to you. Based on business logic, for each item in the XML feed, the program will do one of the following:
a. Insert a new row into the database.
b. Update an existing row in the database.
c. Deactivate an existing row in the database (by setting a deactivation date).
4. Once this is done, generate a report stating how many records were updated, how many were new, and how many were deactivated.
5. Handle duplicates based on a business rule to be provided.
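As a rough illustration of steps 1, 2 and 2a, the following is a minimal shell sketch, not a required design. It assumes `curl` and `unzip` are installed and that the extracted XML file name changes whenever the feed is updated. The function names (`feed_is_new`, `mark_feed_processed`, `fetch_feed`), the state file and the directory layout are hypothetical.

```shell
#!/bin/sh
# Sketch of steps 1, 2 and 2a. All names and paths are hypothetical.

# Succeeds (returns 0) when feed_name differs from the name recorded in
# state_file by the previous run, or when no run has been recorded yet.
feed_is_new() {
    feed_name=$1
    state_file=$2
    if [ ! -f "$state_file" ]; then
        return 0
    fi
    [ "$(cat "$state_file")" != "$feed_name" ]
}

# Records the name of the feed that was just processed.
mark_feed_processed() {
    printf '%s\n' "$1" > "$2"
}

# Downloads the zipped feed, extracts it, and prints the path of the XML
# file if its name has changed since the last run. Prints nothing when
# the feed has not been updated.
fetch_feed() {
    feed_url=$1
    work_dir=$2
    state_file=$3

    rm -rf "$work_dir/extracted"
    mkdir -p "$work_dir/extracted" || return 1

    # Step 1: download; --fail makes curl exit non-zero on HTTP errors.
    curl --fail --silent --show-error \
        --output "$work_dir/feed.zip" "$feed_url" || return 1

    # Step 2: unzip quietly into a clean directory.
    unzip -q "$work_dir/feed.zip" -d "$work_dir/extracted" || return 1

    # Step 2a: compare the extracted file name with the previous one.
    xml_file=$(find "$work_dir/extracted" -name '*.xml' | head -n 1)
    [ -n "$xml_file" ] || return 1
    if feed_is_new "$(basename "$xml_file")" "$state_file"; then
        printf '%s\n' "$xml_file"
    fi
}
```

In this sketch, `mark_feed_processed` would be called only after the database import has finished successfully, so that a failed run is retried on the next schedule.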
The job will be scheduled and must be schedulable using Linux cron. The environment will be Linux with a MySQL database.
Proper error handling and code documentation must be included in the code.
The details of the XML DTD, business rules and DB structure will be provided once a bid is approved.
This is not a complex project, and a person with the right knowledge and skills should be able to complete it quickly.
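One way the cron scheduling could look is sketched below. The crontab line, the paths and the `run_locked` helper are hypothetical. The lock is an assumption added here, not a stated requirement: processing a feed of this size may take long enough that two scheduled runs could overlap.

```shell
#!/bin/sh
# Example crontab entry (hypothetical paths): run every night at 02:30
# and append all output, including errors, to a log file.
#   30 2 * * * /opt/feed_import/run_import.sh >> /var/log/feed_import.log 2>&1

# Runs a command only if no other run holds the lock. mkdir either
# creates the directory or fails, so two overlapping cron runs cannot
# both acquire the lock.
run_locked() {
    lock_dir=$1
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "another import is still running; skipping this run" >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$lock_dir"
    return $status
}
```

A wrapper script could then call, for example, `run_locked /tmp/feed_import.lock ./import_feed.sh`, where `import_feed.sh` stands in for the actual import program. If the process is killed before `rmdir` runs, the stale lock directory has to be removed by hand.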
## Deliverables
The script must be coded, tested and deployed by the coder within 10 days of bid approval. Any deviation from this timeline must be communicated and agreed upon by both buyer and seller.
The coder will communicate the status of the project regularly.