Hi there,
I would like to implement the following requirements for your project on the history of Occupy Wall Street on Twitter:
a. All the data should be in chronological order, so if I want the data for a certain day I can get it.
b. I also need the list of hashtags related to Occupy Wall Street, from Jan 9th, 2011 to October 1st, 2012, and, for these hashtags, the number of active people and the number of tweets, comments, likes, and retweets.
c. I have to know the times of discussion action take part every day (discussion action means A @ B, and then B replies A on the same day). And the top100 words which are mentioned in their tweets in that period, again, I want to know the times they were mentioned each day.
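For the top100 words in item c, here is a minimal sketch of how the daily counts could be produced. It assumes each tweet is available as a `(created_at, text)` pair and that a simple lowercase split is good enough to define a word. The function name `daily_top_word_counts` and the input format are only my assumptions for illustration, not the final design.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Assumption: a "word" is a run of lowercase letters or apostrophes.
WORD_RE = re.compile(r"[a-z']+")

def daily_top_word_counts(tweets, top_n=100):
    """tweets: iterable of (created_at: datetime, text: str) pairs.

    Returns {day: {word: count}} for the top_n words of the whole period,
    with days in chronological order.
    """
    total = Counter()                # word counts over the whole period
    per_day = defaultdict(Counter)   # word counts per calendar day
    for created_at, text in tweets:
        words = WORD_RE.findall(text.lower())
        total.update(words)
        per_day[created_at.date()].update(words)
    top_words = [word for word, _ in total.most_common(top_n)]
    return {
        day: {word: counts[word] for word in top_words}
        for day, counts in sorted(per_day.items())
    }
```

Because the result is keyed by day, the same structure also covers item a: the data for any single day can be looked up directly.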
I have just reviewed some sample Python code written for this kind of task. Could you explain the following in more detail: "times of discussion action take part every day (discussion action means A @ B, and then B replies A on the same day)"? Also, what do you mean by "mentioned in their tweets in that period"? Who does "their" refer to? And in the phrase "times they were mentioned", does "they" refer to the "top100 words"?
I have been in the web development business for 10+ years and have been doing Python + Django projects for 3+ years.
Your project is not a difficult one, but it will take time to get the result you want. Testing and fixing take more time than the actual implementation, because I need to be sure that the statistics are correct.
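To make the question about "discussion action" concrete, here is one possible reading as a short Python sketch: A mentions B in a tweet, and later on the same calendar day B posts a reply addressed to A. The dictionary keys `user`, `created_at`, `mentions`, and `reply_to_user` are hypothetical field names I chose for illustration, not the actual Twitter data format. Please tell me if this is not what you mean.

```python
from collections import Counter
from datetime import datetime

def count_discussion_actions(tweets):
    """Count, per day, mentions A -> B that B answered with a reply to A
    later on the same day.

    tweets: iterable of dicts with hypothetical keys
      "user", "created_at" (datetime), "mentions" (list of user names),
      and "reply_to_user" (user name or None).
    """
    counts = Counter()
    open_mentions = set()  # (day, author, mentioned_user) not yet answered
    for tweet in sorted(tweets, key=lambda t: t["created_at"]):
        day = tweet["created_at"].date()
        author = tweet["user"]
        target = tweet.get("reply_to_user")
        # A reply from B to A closes an earlier same-day mention A -> B.
        if target is not None and (day, target, author) in open_mentions:
            open_mentions.discard((day, target, author))
            counts[day] += 1
        # Any mention in this tweet (including one inside a reply)
        # opens a new possible discussion action.
        for mentioned in tweet.get("mentions", []):
            open_mentions.add((day, author, mentioned))
    return dict(counts)
```

Under this reading a reply on the following day is not counted, and a longer back-and-forth between A and B on one day is counted once per answered mention.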
Thanks
Max