The term "Big Data" seems to be all the rage these days. Everyone uses it in some way to describe what their software company does. I often goad friends who work at startups claiming to be Big Data companies because they are handling the Twitter stream or running X thousand transactions per second. I tell them to give me a call when their company gets up to scale a bit more.
In the online advertising world, especially with programmatic buying of ads, we have to handle nearly a million requests per second (and that is just the start of our scaling), make decisions in 5 ms, track everything that happens across the globe, and be able to report it to our clients within 5 minutes of it happening. At that kind of scale our budgeting systems have to be accurate to within seconds, or you could overspend by thousands of dollars. When you start to find that Amazon and Rackspace cloud environments can't handle your systems because of network speeds, you may have achieved a decent level of scale (not kidding, we actually took down an entire Rackspace data center at one point).
So with the large number of transactions that we handle and the massive amount of data that we store and process every second, what is important to us, and what do we care about in our "Big Data"? There's not really much you can do with the mass of data as a whole, except maybe donate it to a research university to use in its studies, or keep it all somewhere and pay massive storage costs every month. No, it's not the Big Data that matters. What really matters is the "Micro Data" within that large data set, and without the large amount of data it's not really possible to find micro data trends.
In online advertising, the more micro the trend is, the more valuable it is. If we know that every left-handed race car driver in eastern Iowa is guaranteed to buy your product, then you are going to be willing to pay a lot of money to show that one person an ad (we of course care about a bit larger audience than that one guy). If we can find millions of micro trends that are valuable to advertisers, then we can really help guide their advertising budgets and make really good decisions about how much to spend on showing an ad for any given request among the million we see each second.
I don't claim to have coined the term micro data. I first heard about it from a friend over beers one evening. He is a researcher at the University of Colorado and works in a research lab with Big Data in the name. One of the bodies of data that they use in their research is the US Census data, which contains a massive number of data points on every household in the United States. He told me that there is nothing interesting about saying that the average annual salary of each household in the US is $X, or that the average family in the country has 1.8 children and 2.3 dogs. Those statistics are meaningless to all but politicians who want to use meaningless data for whatever purpose they need. Instead, these researchers look at the micro trends in the data to understand things better. It's much more meaningful to know the average or median income of a specific block in downtown Boulder, or the average number of people living in each household in a single block in South Boulder.
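To make that contrast a bit more concrete, here is a minimal Python sketch of the difference between one big average and per-block numbers. The household records, block names, and the `income_by_block` helper are all invented for illustration; this is not real Census data, and it is not how my friend's lab (or our own systems) actually do it.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical household records: (block, annual_income).
# These values are made up purely to illustrate the idea.
households = [
    ("downtown-boulder-block-1", 42_000),
    ("downtown-boulder-block-1", 58_000),
    ("downtown-boulder-block-1", 51_000),
    ("south-boulder-block-7", 95_000),
    ("south-boulder-block-7", 120_000),
    ("south-boulder-block-7", 310_000),
]


def income_by_block(records):
    """Group incomes by block and return {block: (mean, median)}."""
    grouped = defaultdict(list)
    for block, income in records:
        grouped[block].append(income)
    return {block: (mean(v), median(v)) for block, v in grouped.items()}


# The "big" number: a single average over every household.
overall_mean = mean(income for _, income in households)

# The "micro" numbers: one (mean, median) pair per block.
per_block = income_by_block(households)

print(f"overall mean income: {overall_mean:.0f}")
for block, (avg, med) in per_block.items():
    print(f"{block}: mean={avg:.0f} median={med:.0f}")
```

The single overall average describes neither block well, while the per-block figures show two very different neighborhoods. The same grouping idea applies to any key you care about, as long as there is enough data behind each group to trust the result.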
That type of micro data helps in the advertising world as well, since that is the level at which advertisers want to control their spending on ad buys. We spend a huge amount of our time looking for and processing "Big Data" to discover the "Micro Data" within, which is so much more interesting and valuable.