I’m as excited as the next guy about the possibilities of “Big Data!” but possibly more excited about the opportunities presented by plain old “Modest Data”. I believe there is plenty of scope for useful analysis on fairly moderate data sets with the right approach and tools.
I’d go as far as to say that many of the “Big Data!” stories and analyses currently doing the rounds are really plain old statistical analysis with a few new touches from the ever-expanding list of R libraries.
For example, it seems that papers with shorter titles get more citations from other researchers. Although the research considered 140,000 papers, there is nothing especially “Big Data!” about the analysis. The authors suggest several possible causes related to the quality of the journal, the time period and so on. Disappointingly, they don’t seem to have modelled these possible effects directly to understand whether any residual effect of title length remains once they are accounted for.
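To make "modelling the effects directly" concrete, here is a minimal sketch in plain Python. It uses simulated data only, not the paper's data, and every number and name in it (`quality`, `title_len`, `citations`, the coefficients) is an assumption of mine chosen for illustration. In this made-up world journal quality drives both title length and citations, and title length has no effect of its own. The naive regression still shows shorter titles "earning" more citations, while the adjusted estimate (obtained by regressing both variables on quality first and then comparing the residuals) shows roughly nothing left over.

```python
import random


def slope(x, y):
    """Least-squares slope of y on a single predictor x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    n = len(x)
    b = slope(x, y)
    a = sum(y) / n - b * sum(x) / n
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def simulate(n=5000, seed=1):
    """Simulated papers. All coefficients are invented for illustration."""
    rng = random.Random(seed)
    # Hypothetical standardised journal-quality score.
    quality = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: better journals tend to carry shorter titles.
    title_len = [12 - 2 * q + rng.gauss(0, 2) for q in quality]
    # Assumption: citations depend on journal quality only,
    # with no direct effect of title length.
    citations = [20 + 5 * q + rng.gauss(0, 3) for q in quality]
    return quality, title_len, citations


quality, title_len, citations = simulate()

# Naive view: citations against title length, ignoring the journal.
naive = slope(title_len, citations)

# Adjusted view: strip the journal-quality effect from both variables,
# then see whether title length still explains anything.
adjusted = slope(residuals(quality, title_len),
                 residuals(quality, citations))

print(f"naive slope:    {naive:.3f}")
print(f"adjusted slope: {adjusted:.3f}")
```

With real data the adjusted slope might well stay clearly negative, which would be the interesting finding. The point is only that the question can be asked directly, and on a laptop.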
There is scope for great analysis without “Big Data!” and plenty of scope for poor analysis with all the data in the world.