November 21, 2011
Earlier in the year, I wrote an opinion column on TechCrunch arguing that big data “needs to think bigger.” At the time, I kept hearing the term “big data” over and over, and wondered how much of the emerging insights and techniques would be applied toward the Internet versus the larger problems society faces, such as detecting fraud in financial markets, finding new deposits of natural resources, or helping discover the next big pharma drug.
Yet in some of my experiences monitoring the space since then, I’ve come to the conclusion, for now, that my March 2011 column meant well, but that reality is much further behind than we’d like to think. One would assume, for instance, that big drug companies would be aggressive in adopting new, external, cutting-edge techniques to analyze their own data for new insights, especially with a dangerous patent cliff looming in 2012. It turns out that drug companies often aren’t willing to share data with third parties, a step that is usually necessary to take advantage of big data infrastructure. While I believe that eventually the best data science will emerge to help these industries grow in new ways, for now at least, the best opportunities lie in the one area I wanted to gloss over last time: the consumer and mobile web.
Investors see the wave coming. Over the past few months, the top-tier funds have begun to make their moves. Benchmark Capital brought in Craig Weissman from Salesforce as an EIR and invested in Josh James’ new company, Domo; Accel Partners recently announced the creation of a “Big Data Fund” by reallocating monies from existing funds, which will improve data dealflow; and of course, there’s Greylock Partners, which was one of the earliest investors in this space through numerous companies and, most recently, recruited DJ Patil to be its “Data Scientist in Residence.”
Since March, I’ve continued to hear the term “big data” uttered by so many, yet so few seem to grasp what it means for us and the web (yours truly included). We all know that the major social networks (like Facebook), broadcast engines (like Twitter), self-expression tools (like Tumblr and Pinterest), and services (like Dropbox) generate ridiculous amounts of data. Add to this the growing Quantified Self movement, where connected devices from companies like Fitbit, Runkeeper, and Jawbone let us track our offline movements and analyze them online.