
Create algorithm to extract data from the web, deliver the pdf files and extract data from a pdf to an Excel file

R$90-750 BRL

Closed
Posted over 6 years ago

Paid on delivery
Create an algorithm that downloads the banns of marriage of Rio de Janeiro State (Brazil) from the site [login to view URL], covering 09/2008 to 08/2017. The data should contain all personal information available about the couples, such as: date of the banns of marriage, banns of marriage number, city and state where the marriage will be performed, full names of the bride and groom, place of birth, date of birth, age, address, CPF (Brazilian individual taxpayer registry number), ID (RG/identidade in Portuguese): number and issuing institution/place, parents' names, occupation, the process number (Proc.), the process id and name of the city (“comarca” in Portuguese), type of marriage contract, and marital status. The data should be saved in an xlsx spreadsheet. The personal information available can vary from city to city. The details of each stage are given below:
Stage 1: Access the form on the following website: [login to view URL]. Click the word “ÍNDICE” (in Portuguese). Select the date (“DATA DA PUBLICAÇÃO” in Portuguese). Dates in Portuguese are presented in this format: DD/MONTH/YEAR ** (see File 1)
Stage 2: Select the information by clicking the word “CADERNO” (in Portuguese) and choose the option “V- Editais e demais publicações” (in Portuguese). Data is available daily (except weekends and holidays). The form will appear as shown in the following website for September 2017 (“Setembro” in Portuguese): [login to view URL]
Stage 3: Click the word “CONSULTAR” (in Portuguese). This procedure will allow you to view the files.
Stage 4: Validate the page by entering the numbers and words requested on the website. The form will appear as shown in the following website: [login to view URL] After that, the form will appear as shown in the following website: [login to view URL]
Stage 5: Download all the pages of “V- Editais e demais publicações” in “Diário da Justiça Eletrônico” and save them by date. The form will appear as shown in the following website: [login to view URL]
Stage 6: Look for the words “casar”, “casamento” or “habilitam-se” on each page. Save the pages that contain any of them as PDF, by day, month and year.
Stage 7: After saving all PDF pages (Stage 6), create a single PDF file per day, that is, one file that combines all saved pages for a given day.
Stage 8: Using the files saved in Stage 7, look for all the information available about the couple, as shown in File 2: [login to view URL] An example from “Diário Oficial”: [login to view URL] Download the file at the following website, which contains an example and comments about variables 1 to 19: [login to view URL]
Stage 9: Extract the data described in Stage 8 from the files created in Stage 7. The algorithm must repeat this procedure for every day in the period 01/09/2008 to 31/08/2017, except for holidays and weekends.
Stage 10: Create an Excel worksheet to save all the information available about the couples by day, month and year. Each single file must contain all cities (“Comarca” in Portuguese) by day, month and year. An example: [login to view URL]
File 1: [login to view URL]
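As a rough illustration of the parts of this brief that do not depend on the website itself, the sketch below enumerates the candidate publication days (Stage 9), checks a page's text for the Stage 6 keywords, and pulls CPF-like numbers out of that text. It assumes the page text has already been extracted from the PDF by some other tool. The function names, the holiday set and the CPF layout are all hypothetical choices for this example; form navigation, the validation step in Stage 4 and the PDF handling are not covered.

```python
import re
from datetime import date, timedelta

# Keywords named in Stage 6 of the brief.
KEYWORDS = ("casar", "casamento", "habilitam-se")

# Assumed CPF layout: 000.000.000-00 or 11 bare digits.
# The gazette may print CPFs differently, so treat this as a starting point.
CPF_PATTERN = re.compile(r"\b\d{3}\.?\d{3}\.?\d{3}-?\d{2}\b")


def publication_days(start, end, holidays=frozenset()):
    """Yield each weekday from start to end (inclusive), skipping holidays.

    The holiday set must be supplied by the caller; none is built in.
    """
    day = start
    while day <= end:
        if day.weekday() < 5 and day not in holidays:  # 0-4 = Monday-Friday
            yield day
        day += timedelta(days=1)


def mentions_marriage(page_text):
    """Return True if the page text contains any Stage 6 keyword."""
    lowered = page_text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def find_cpfs(page_text):
    """Return every CPF-like number found in the page text, in order."""
    return CPF_PATTERN.findall(page_text)


if __name__ == "__main__":
    # Period requested in the brief: 01/09/2008 to 31/08/2017.
    days = list(publication_days(date(2008, 9, 1), date(2017, 8, 31)))
    print(len(days), "weekdays to query before removing holidays")
```

The keyword check is a plain substring match, so “casar” also matches longer words that contain it; a stricter word-boundary match may be preferable once real pages have been inspected.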
Project ID: 15156051

About the project

10 proposals
Remote project
Active 6 yrs ago

10 freelancers are bidding on average R$524 BRL for this job
I can build a script for you in R to fill in the forms, download the PDFs and extract the data into the required format.
Relevant Skills and Experience: I have experience using R for web crawling/scraping as well as structured data extraction from PDFs. Please see my previous projects.
Proposed Milestones: R$300 BRL - Script to fill forms and download PDFs; R$450 BRL - All PDF data extraction
R$750 BRL in 10 days
4.6 (2 reviews)
4.6
Dear Sir, I have read your job posting and am interested in performing this assignment. It is manual and very sensitive work. I will try to address every instruction you describe in your job details.
Relevant Skills and Experience: I have six years of professional experience in the data entry field, covering all kinds of data entry.
Proposed Milestones: R$833 BRL - Initial Milestone
I can give you fast and accurate work. Thanks. Regards
R$833 BRL in 15 days
5.0 (16 reviews)
3.7
Hi. I can create automated scripts to scrape websites, auto-click, and format txt, csv, xls, xlsx, doc, docx, rtf, json, xml and database files as you request. I can start right now.
Relevant Skills and Experience: I am an expert in VBA, VBScript, Visual Basic, C#, F#, C, C++, ASM, Delphi, Java, iMacros, Flash, ASP, ASP.NET, Access, MySQL, MSSQL, QuickBooks, Oracle
Proposed Milestones: R$277 BRL - complete
R$277 BRL in 3 days
4.9 (12 reviews)
3.6
Hello there, greetings! I hope you are doing well. I am Aditya. I have carefully read and analyzed your project requirements for "Create algorithm to extract data from the web".
Relevant Skills and Experience: I have 5+ years of experience in website design and development, and we are 15+ well-experienced designers and developers. Our skills: JAVA, JAVASCRIPT, HTML, HTML5, PHP, LARAVEL, WORDPRESS, CSS.
Proposed Milestones: R$466 BRL - web development
Please share more details about your project. Looking forward to working with you. Thanks, Aditya
R$466 BRL in 2 days
0.0 (0 reviews)
0.0
I have experience in querying and scraping web pages, and delivering results in the format of your choice. I'm eager to hear more about the scope and timeline of this project.
Relevant Skills and Experience: web scraping, ETL, Python, data mining
Proposed Milestones: R$20 BRL - Align on scope; R$702 BRL - Deliver project
R$722 BRL in 7 days
0.0 (0 reviews)
0.0

About the client

Sao Paulo - SP, Brazil
4.8
3
Payment method verified
Member since Oct 19, 2016

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)