Database management / entry / macro / script / application CSV files
$30-100 USD
Paid on delivery
see advanced summary
## Deliverables
Hello,
I have 55 CSV files with millions of entries in them. One of the fields in each entry is a reference.
I want the 55 CSV files split up into different CSV files, categorized by the "start" part of the reference field.
I want an experienced programmer to use code/application/macro to do this.
There are 2849 different possible "start" parts of the reference. I will send you all 2849 possible reference starts.
So when the work is completed I will have 2849 CSV files named after the individual reference starts, each containing the entries whose reference starts with whatever that file is called. For example, file DL15 will contain all of the entries which have a reference starting with DL15.
(I want the 2849 files delivered in 3 different formats.)
There are 5 types of reference:
2 letters and 2 numbers at the start, always followed by a number, e.g. DL15 7QZ
2 letters and 1 number at the start, always followed by a number, e.g. NE3 4RR
1 letter and 1 number at the start, always followed by a number, e.g. B1 7DE
1 letter and 2 numbers at the start, always followed by a number, e.g. B15 9FX
2 letters and 1 number followed by a letter, then always followed by a number, e.g. WC1A 1DR
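For illustration, the five reference types above could be recognised with a regular expression along these lines. This is only a sketch. The names `START_RE` and `reference_start` are made up for the example, and it assumes the reference is compared in upper case and that the "start" ends just before the number that begins the second part (with or without a space in between). Anything outside the five listed types is rejected.

```python
import re

# Hypothetical pattern covering only the five listed reference types.
START_RE = re.compile(
    r"^\s*("
    r"[A-Z]{1,2}[0-9]{1,2}"   # types 1-4: 1-2 letters then 1-2 numbers
    r"|[A-Z]{2}[0-9][A-Z]"    # type 5: 2 letters, 1 number, 1 letter
    r")(?=\s*[0-9])"          # the start is always followed by a number
)


def reference_start(reference):
    """Return the "start" part of a reference, or None if it fits no listed type."""
    match = START_RE.match(reference.upper())
    return match.group(1) if match else None


for example in ["DL15 7QZ", "NE3 4RR", "B1 7DE", "B15 9FX", "WC1A 1DR"]:
    print(example, "->", reference_start(example))
```

In practice the extracted start would probably also be checked against the supplied list of 2849 starts, so that malformed references do not create unexpected files.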
I think the program/code/application will have to function like this:
1) Pick reference start from 2849 options.
2) Search all 55 CSV files for entries which have that reference start.
3) Put all data together in one place/temporary CSV file.
4) Remove duplicates. (I then want a copy of the file at this stage saved.)
5) Then I want to remove some of the fields and reformat one of the fields (I'll go into more detail on this when the job is accepted). This will leave one field and the reference.
6) Remove duplicates again at this point. (I want a copy of the file at this stage.)
7) Remove one of the fields, leaving only 1 field. (I want a copy of the file at this final stage.)
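As a rough sketch of steps 1 to 4, the following Python reads each input file once and routes every row to a file named after its reference start, instead of searching all 55 files once per start (2849 searches). It is one possible approach under stated assumptions, not a finished solution. The names `split_by_start`, `remove_duplicates`, `ref_index`, `valid_starts` and `batch_rows` are hypothetical. It assumes UTF-8 files, a space between the two parts of the reference (as in the examples), and that a "duplicate" means a row identical in every field. Steps 5 to 7 are left out because the fields involved are not specified here.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_start(input_paths, out_dir, ref_index, valid_starts, batch_rows=100_000):
    """Steps 1-3 in one pass: append each row to <out_dir>/<start>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pending = defaultdict(list)  # start -> rows waiting to be written
    buffered = 0

    def flush():
        # Open each output file only while writing, so thousands of
        # files never have to be open at the same time.
        for start, rows in pending.items():
            with open(out_dir / f"{start}.csv", "a", newline="", encoding="utf-8") as f:
                csv.writer(f).writerows(rows)
        pending.clear()

    for path in input_paths:
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.reader(f):
                if len(row) <= ref_index:
                    continue  # row is too short to hold a reference
                parts = row[ref_index].split()
                start = parts[0].upper() if parts else ""
                if start in valid_starts:  # rows with unknown starts are skipped
                    pending[start].append(row)
                    buffered += 1
                    if buffered >= batch_rows:
                        flush()
                        buffered = 0
    flush()


def remove_duplicates(src, dst):
    """Step 4: copy src to dst, keeping the first copy of each identical row."""
    seen = set()
    with open(src, newline="", encoding="utf-8") as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        writer = csv.writer(fout)
        for row in csv.reader(fin):
            key = tuple(row)
            if key not in seen:
                seen.add(key)
                writer.writerow(row)
```

Because `split_by_start` appends to its output files, the output folder should be empty before a run. `remove_duplicates` keeps one file's distinct rows in memory, which should be workable when the data is spread over 2849 files but may need a different approach for very large groups.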
You should know exactly how to do this and process millions of entries - I do not want to have to provide any help or advice with this.
----------
I nearly forgot to add - there are 122 different reference starts when categorized by ONLY THE LETTERS, so I also want 122 CSV files delivered, each combining all of the entries whose reference start has the same letters, e.g. all of the entries beginning with B, or all of the entries beginning with DL.
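The letters-only grouping could be derived from the list of reference starts roughly as follows. This is a sketch with hypothetical names (`letters_only`, `group_starts_by_letters`). It assumes the letters-only key is simply the run of letters at the very beginning of a start, so the trailing letter in a start such as WC1A is ignored.

```python
import re
from collections import defaultdict


def letters_only(start):
    """Return the leading letters of a reference start, e.g. 'DL15' -> 'DL'."""
    match = re.match(r"[A-Za-z]+", start)
    return match.group(0).upper() if match else None


def group_starts_by_letters(starts):
    """Map each letters-only key to the list of starts that share it."""
    groups = defaultdict(list)
    for start in starts:
        key = letters_only(start)
        if key is not None:
            groups[key].append(start)
    return dict(groups)


print(group_starts_by_letters(["DL15", "DL1", "B1", "B15", "WC1A"]))
# {'DL': ['DL15', 'DL1'], 'B': ['B1', 'B15'], 'WC': ['WC1A']}
```

Each of the 122 combined files could then be built by concatenating the per-start files listed under its key.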
Project ID: #2726322