Hello,
I am a data scientist and R programmer.
I am a full-time freelancer who does not outsource work; I do my projects myself.
I have read your description and I can relate to the problem you are having: writing R code and actually making it work on LARGE datasets are two different things. I recently (about a month ago) worked on a dataset that was 108 GB. Datasets of that size can be tricky to manage, but there are ways to do it.

I haven't yet looked in depth into the code to see what kind of analysis you are doing, but I can think of some potential solutions. Generally speaking, R isn't great with very wide datasets: a data frame is essentially a list of column vectors, so per-column overhead becomes computationally heavy when a dataset has a large number of columns. One way around this is to subset the data or transpose it, though not with the traditional approaches, since those are heavy too. Another is to reduce dimensionality or drop irrelevant variables, but it all depends on the data. A rough sketch of a few of these ideas is below.
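Just to illustrate the kind of thing I mean, here is a minimal sketch in R using data.table. The file name and column names are made up, and I'm assuming the data sits in a CSV; the same ideas carry over to other formats.

library(data.table)

# Read only the columns the analysis needs; fread()'s `select`
# skips the rest at parse time, so unused columns never hit memory.
needed <- c("id", "timestamp", "outcome")     # hypothetical column names
dt <- fread("big_data.csv", select = needed)  # hypothetical file name

# If a wide table really has to be flipped, data.table::transpose()
# avoids the extra copies that t(as.matrix(df)) makes on a data frame.
flipped <- transpose(dt, keep.names = "variable")

# A cheap first pass at dropping uninformative variables: remove
# numeric columns whose variance is (near) zero.
keep <- sapply(dt, function(col) !is.numeric(col) || isTRUE(var(col, na.rm = TRUE) > 1e-8))
dt_small <- dt[, names(dt)[keep], with = FALSE]

For data that truly doesn't fit in RAM, packages like arrow can run the same kind of column selection out of memory, but whether that's needed depends on your setup and on what the analysis actually does.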
I would like to know what field this data is from, and what type it is. That would help me a lot in deciding how to approach the problem.
I will be waiting for your reply so we can discuss further details about this project, which genuinely interests me a lot! It has the feel of a serious real-life problem to tackle, and I love challenges :D
Looking forward to hearing from you.
Thank you,
Mohamed Yassir K.