resume parser
$750-1500 USD
Paid on delivery
Resume Parser or Parsing
We need a resume parsing application. It needs to run on Linux, and we would prefer it written in Perl or PHP. If you want to use something else, please let us know.
This parser will be used to parse millions of existing resumes in html, word, rtf, text and pdf formats. Most of our resumes are in unstructured html but we have thousands in word, rtf, text and pdf as well. We also have many in the body of emails so if we can parse those as well it is an added bonus but this is not a firm requirement.
The parser needs to be able to extract the following data from the resumes:
------
1. candidate first name
2. candidate last name
3. candidate address
4. candidate city
5. candidate state
6. candidate zip code
7. candidate country
8. candidate email address
9. resume job category (accounting, sales, legal, insurance, etc.) - we will supply a list of possible categories. A resume may fit more than one category, so the parser should make a best guess at the correct category
10. resume title
11. candidate career objective
12. years of professional experience
13. employment history
14. education history
15. licenses and certifications
16. military history
17. foreign languages
18. security clearances
19. references
20. skills keywords
21. complete resume in text format. Parser needs to remove all html tags and non-resume information (such as headers, footers, side bars, etc.) in an intelligent way to produce a clean and readable resume in text format.
------
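As a rough illustration of a few of the fields above (21, 8 and 9), here is a minimal sketch in Python, chosen only for illustration since the posting prefers Perl or PHP. It strips HTML tags with the standard library, finds the first email address, and guesses a category by counting keywords. The names `html_to_text`, `find_email`, `guess_category` and the `CATEGORY_KEYWORDS` table are hypothetical; the real category list would come from the client. It only removes tags and script/style content and does not attempt the "intelligent" removal of headers, footers or side bars.

```python
import re
from html.parser import HTMLParser

# Hypothetical keyword lists; the posting says the real category list will be supplied.
CATEGORY_KEYWORDS = {
    "accounting": ["accounting", "accountant", "ledger", "audit"],
    "sales": ["sales", "quota", "territory", "account executive"],
    "legal": ["legal", "attorney", "paralegal", "litigation"],
}

# Deliberately simple pattern; it will miss some valid addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "table"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")  # block-level tags become line breaks

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML document, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


def find_email(text):
    """Return the first email-like string in the text, or None."""
    match = EMAIL_RE.search(text)
    return match.group(0) if match else None


def guess_category(text):
    """Return the category with the most keyword hits, or None if there are no hits."""
    lowered = text.lower()
    scores = {
        cat: sum(lowered.count(word) for word in words)
        for cat, words in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Keyword counting is only a baseline for the "best guess" requirement; a trained classifier would likely do better on resumes that fit more than one category.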
Output of the parser should be an xml tagged file, one xml file for each parsed resume, output file name to be the same as the input file name with extension changing from [login to view URL] to [login to view URL] or [login to view URL] to [login to view URL], etc.
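One way the one-file-per-resume output could look is sketched below in Python (for illustration only). The function `write_resume_xml`, the `<resume>` root element, the tag names passed in `fields`, and the `.xml` output extension are all assumptions; the posting does not specify a schema.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_resume_xml(input_path, fields, output_dir=None):
    """Write one XML file for one parsed resume and return the output path.

    `fields` maps hypothetical tag names (e.g. "first_name") to values.
    """
    input_path = Path(input_path)
    root = ET.Element("resume")
    for name, value in fields.items():
        child = ET.SubElement(root, name)
        child.text = "" if value is None else str(value)
    out_dir = Path(output_dir) if output_dir else input_path.parent
    # Assumption: same base name as the input file, with an .xml extension.
    out_path = out_dir / input_path.with_suffix(".xml").name
    # ElementTree escapes characters such as & and < when serializing.
    ET.ElementTree(root).write(str(out_path), encoding="utf-8", xml_declaration=True)
    return out_path
```

For example, `write_resume_xml("resumes/jane.html", {"first_name": "Jane"})` would write `resumes/jane.xml` under these assumptions.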
All of the parsed fields will be used to upload into a mysql database. Parser may be asked to do the database insertion as part of the parsing process.
We will supply a sample set of resumes, as many as you need to be successful.
Resumes are unstructured, so formats and content vary widely. The ability to score the parsing performance would be beneficial. It would be helpful to be able to look at a parsing report that indicates which resumes the parser thinks it did poorly on, so we can manually revisit the parsed resumes that have the highest probability of parsing errors.
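A simple form of such a score could be the weighted share of expected fields the parser managed to fill, with low-scoring resumes listed first for manual review. The sketch below (Python, for illustration) assumes hypothetical field names, weights and a 0.7 threshold; none of these come from the posting, and a real scorer would need tuning against hand-checked resumes.

```python
# Hypothetical field weights; more important fields count more toward the score.
FIELD_WEIGHTS = {
    "first_name": 2,
    "last_name": 2,
    "email": 2,
    "employment_history": 3,
    "education_history": 2,
    "skills": 1,
}


def confidence(fields):
    """Return the weighted fraction of expected fields that are non-empty (0.0 to 1.0)."""
    total = sum(FIELD_WEIGHTS.values())
    found = sum(
        weight
        for name, weight in FIELD_WEIGHTS.items()
        if str(fields.get(name) or "").strip()
    )
    return found / total


def review_report(parsed, threshold=0.7):
    """Return (filename, score) pairs scoring below the threshold, lowest score first."""
    scored = [(name, confidence(fields)) for name, fields in parsed.items()]
    flagged = [item for item in scored if item[1] < threshold]
    return sorted(flagged, key=lambda item: item[1])
```

A filled field is not necessarily a correct one, so this measures completeness rather than accuracy; it is a starting point for deciding which resumes to revisit.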
Parsing will be done in a batch on all our resumes (millions) and will also need to be able to parse resumes that are added to our system every day. So we would need to be able to integrate the parser with our existing perl and php website applications.
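One way to cover both the initial batch and the daily additions with the same entry point is to process only inputs that do not yet have an output file. The sketch below (Python, for illustration) assumes a hypothetical `pending_resumes` helper, a fixed set of input extensions, and `.xml` outputs collected in a single directory.

```python
from pathlib import Path

# Assumption: input formats are identified by these file extensions.
RESUME_SUFFIXES = {".html", ".htm", ".doc", ".rtf", ".txt", ".pdf"}


def pending_resumes(input_dir, output_dir):
    """Yield resume files that have no matching .xml output yet, in sorted order."""
    output_dir = Path(output_dir)
    for path in sorted(Path(input_dir).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in RESUME_SUFFIXES:
            continue
        if not (output_dir / (path.stem + ".xml")).exists():
            yield path
```

Run from a scheduled job, this would pick up only new resumes each day. Two inputs sharing a base name (for example `jane.html` and `jane.pdf`) would collide under this naming assumption and would need separate handling.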
Passing acceptance testing with several thousand resumes will be required at project completion.
Thanks!
P.S. Our budget is somewhat flexible, so please submit a bid even if it exceeds the posted budget. We are looking for the best, most robust solution possible. Thanks.
Hi,
Some samples are attached in the zip file. Please PM for the password. Please keep in mind that 99+% of our resumes are in html format. The ability to accurately parse html resumes is mission critical; being able to parse other formats (word, pdf, rtf, text, email, etc.) is nice to have but not a requirement for success.
Thank you.
Project ID: #1350154
About the project
39 freelancers are bidding on average $1425 for this job
Hi, I have over 13 years of experience in software design, development and implementation of various commercial applications in Client/Server environment, Web and ERP applications using C# 1.1/2.0/3.5, ASP.Net, VB ...
It's a whole text mining system. I'm a text/data mining and machine learning researcher. I can develop a scalable text mining system for you.
I should be able to deliver complete working code even if I don't have any rating to show for it ...
Dear Sir, We have a team of technology experts at our company working in technologies such as PHP, Joomla, Smarty, .NET, and C. Kindly check your PMB for more details.
Hello Sir, We can confidently complete the project. Please check PMB for listing. Warm Regards
Hello. I'm an expert Perl coder. I have very good experience with many Perl applications. I have already developed a parser in Perl.