Monday, November 23, 2015

Agency vs. procurement: The quantification gap

Prior to an annual media house negotiation meeting a few years ago, a marketing client shared some sales data with me to make the point that it was more important for her brand to be on-air than which specific TV stations it was on. While that was just her being the excellent hard-nosed negotiator that she is, using a somewhat incomplete picture as a negotiating chip, it was nonetheless a perspective.
I'm reminded of it today, when there's unprecedented pressure on agency remuneration. The focus on the advertisers' side seems to be on procurement, while agencies point to the value they bring to the table, both from 'hard' strategy and from 'soft' passion.
It's becoming less and less of a contest - procurement is the order of the day, and the rule-proving exceptions are becoming rarer.

There was interesting news last week (read here) about PepsiCo eliminating its procurement department, but to me the nub lay in the last para of the piece: removing a department is not the same as removing its function, or at least its approach.
The whole difficulty arises from what is fundamentally a quantification gap: the difference in perceived value between the agency seller and the procurement buyer (ref. diagram below). Both sides basically agree on the value of advertising, the very tangible difference between 'on' and 'off' (it's why they're sitting across the table in the first place!). The disagreement is about the various scenarios of 'on'. From a stated negotiation position at least, the advertiser sees diminishing marginal value between, say, a 'low strategy' media plan and a 'high strategy' one.
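Putting the same idea into rough symbols (purely illustrative notation standing in for the diagram, not a formal model): let s denote the strategic quality of a media plan, and let each side attach a perceived value V(s) to it.

```latex
% Illustrative notation only - s is the strategic quality of the plan.
\[
\mathrm{Gap}(s) \;=\; V_{\mathrm{agency}}(s) \;-\; V_{\mathrm{procurement}}(s),
\qquad
V'_{\mathrm{procurement}}(s) \to 0 \ \text{ as } s \ \text{rises},
\]
\[
\text{while } V'_{\mathrm{agency}}(s) \ \text{stays well above zero, so the gap widens with } s.
\]
```

The agency's quantification problem, in effect, is to put a credible number on Gap(s) at the 'high strategy' end of the scale.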
Unfortunately, it's a gap which is exceedingly difficult for the agency to quantifiably equate with a fair price, even though it knows better. (The unwillingness of procurement to be persuaded is a non-factor here: it's their job not to be!) The agency has, in effect, to manage a particularly nasty strain of the 'which-half-works' virus, even as that virus has already been declared invincible! As tough ones go, this one is a whopper. But it's imperative to keep trying. Perhaps a good starting point would be to size up the challenge for what it is and work forward from there.
Even then, the quantification gap problem is unlikely to be solved soon, if at all! More likely is an evolutionary change of the entire ecosystem - perhaps even less gradual than supposed. Given the economics of the business, something has to give.
One hopes that both parties in their evolved forms - agency and advertiser, including, yes, procurement! - arrive at a better and fairer place on value when that happens.

Thursday, September 10, 2015

The Medium Data world of audience research

Mayer-Schonberger and Cukier's Big Data... book (my review here) sparked a few thoughts on how its N=All argument - the redundancy of sample extrapolation when all of the datapoints are available - relates to media audience research.
Audience measurement today uniquely straddles both the big data world of digital media, where the entire audience is captured in real time - indeed, in the N=All sense, it is one of the purest-play big data situations there is - and the 'small data' world of offline media with sample-extrapolated audiences.
I say 'uniquely' not because other domains are exclusively one or the other but because of two distinctive characteristics. The first is the asymmetry between data availability and each part's share of the pie: half or more of ad spend is based on 'small data' measurements. Too much is riding on sample extrapolation for it to go away anytime soon.
The second characteristic is the less obvious one: the interdependency between the two types of data. Sample-based 'small' data is a critical requirement for unlocking the full value of the available big data itself.
At the heart of this is the need for machine-level (server / IP / browser) data to be translated into human, target-audience-level data. Sample surveys are the crucial bridge. And the interdependency flows in both directions: big data is also being used to improve sample-based systems.
These dynamics, together with concerns about whether current systems adequately measure today's complex media consumption, are driving 'hybrid' research - the fusion of 'census' (N=All) big data and 'survey' or 'panel' sample data to estimate audiences in a way that neither could adequately do alone.
Three key applications here would be (a) target audience reach/frequency metrics for online campaigns, wherein clickstream data (impressions, uniques, etc.) is matched against panel demographics through website tags / cookies (a simplified sketch of this follows below); (b) supplementing existing TV audience measurement with Return Path Data (RPD) from set-top boxes, thereby augmenting the sample size, especially for vehicles like Pay TV which may be under-represented in the normal panel - RPD will also be a crucial part of 'Addressable TV' advertising (the digital insertion of one-to-one adverts on the TV set); and (c) 'Total Video' measurement - combined TV + online video viewership at both levels: the total audience of TV stations across standard and online channels AND the total video exposure of ad campaigns across various online and offline channels.
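To make (a) a little more concrete, here is a deliberately simplified sketch of how cookie-level ('census') impression logs might be fused with panel-derived demographics to estimate target audience reach and frequency. Every site name, probability and factor below is a made-up assumption for illustration, and the averaging and cookie-to-person rules are crude stand-ins for what a real hybrid methodology would model far more carefully.

```python
# Illustrative fusion of census impression logs with panel-derived demographics.
# All inputs are hypothetical; this is a sketch of the principle, not any
# measurement agency's actual methodology.

from collections import defaultdict

# Census side: cookie-level ad impressions (cookie_id, site) from ad-server tags.
impressions = [
    ("c1", "news_site"), ("c1", "news_site"), ("c2", "sports_site"),
    ("c3", "news_site"), ("c3", "sports_site"), ("c4", "sports_site"),
]

# Panel side: estimated probability that a visitor of each site belongs to the
# target audience (say, Adults 25-44), derived from a calibration panel.
p_target = {"news_site": 0.40, "sports_site": 0.55}

# Crude assumption: a cookie's target-audience probability is the average over
# the sites it was seen on.
sites_seen = defaultdict(list)
for cookie, site in impressions:
    sites_seen[cookie].append(site)
p_cookie = {c: sum(p_target[s] for s in sites) / len(sites)
            for c, sites in sites_seen.items()}

# Expected target-audience impressions and (cookie-level) reach.
ta_impressions = sum(p_cookie[c] for c, _ in impressions)
ta_cookie_reach = sum(p_cookie.values())

# Cookie-to-person deflation (people use multiple cookies/devices) - again an
# assumed factor that a real system would estimate from the panel.
persons_per_cookie = 0.7
ta_person_reach = ta_cookie_reach * persons_per_cookie
avg_frequency = ta_impressions / ta_person_reach

print(f"Expected TA impressions     : {ta_impressions:.2f}")
print(f"Expected TA reach (persons) : {ta_person_reach:.2f}")
print(f"Average frequency           : {avg_frequency:.2f}")
```

The point of the exercise is simply that neither side works alone: the census log supplies the exhaustive exposure counts, while the panel supplies the demographic meaning and the cookie-to-person correction.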
Given the size of TV and the growth of multiscreen online video, 'Total Video' as a combination of both is arguably the Holy Grail of audience research today. This is where most of the industry focus currently is, and where much is indeed underway. Many projects are being directed by Joint Industry Committees (JICs), and at the forefront of most are the current TV measurement agencies - Nielsen and Rentrak in the US, BARB in the UK, AGF/AGOF in Germany, Mediametrie in France, etc. Among the notable others are comScore and (reflecting the broadening stakeholder base of this field) Google, which has projects with measurement agencies like Mediametrie and Kantar. Closer home in the MENA region, Ipsos Connect has a Fusion project in its initial stages. These are only a handful of examples, mostly from the US and Western Europe, but other similar projects are in progress there and elsewhere.
All of which is not to say that there is a magic button in place now, or even around the corner. These projects are very much works in progress. Adding to the methodological complexities and logistical difficulties (increasingly including issues of viewability and bot fraud) are the operational challenges of working with multiple bodies, stakeholders and partners, each with their own expectations and interests. Costs are a major challenge - especially when there's often not enough clarity in the first place about the demand.
Given these challenges and complexities, the turnaround into usable planning software is bound to lag the rapid evolution of media consumption. That should not, however, prevent planners from applying their understanding of these measurement issues to think creatively about the available data and develop their own back-of-the-envelope planning guidelines.
In the MENA region, for example, one needs to go beyond E-GRPs, which are relatively simple, and into calibration against TV GRPs, wherein the differences between sub-optimal measurement systems add significant complexity. As it stands, it's an apples-to-oranges comparison: 'actual' online video ad views captured in real time versus sample-extrapolated TV viewing based on next-day telephone interviews (CATI). The latter relies on the respondent recalling the previous day's viewing in 15-minute slots and does not capture ad views within commercial breaks.
An apples-to-apples calibration could work by estimating the 'accuracy losses' between the two systems, benchmarked against electronically measured TV systems such as peoplemeters, which do capture ad viewership. While such a calibration is indicative at best, it can nonetheless provide useful guidelines for allocating video budgets between online and offline channels, rather than having this decided more arbitrarily.
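By way of illustration only, a back-of-the-envelope version of such a calibration might look like the sketch below. Every number in it - the GRP levels, the recall inflation factor and the ad-break retention factor - is a hypothetical assumption made up for the example; in practice they would have to be estimated by benchmarking the CATI system against a peoplemeter-measured market.

```python
# Hypothetical back-of-the-envelope calibration of CATI-based TV GRPs onto a
# footing roughly comparable with online video ad-view E-GRPs. All values are
# illustrative assumptions, not published calibration factors.

# Campaign-level inputs for a given target audience.
tv_grps_cati = 800.0        # GRPs from next-day recall (CATI) programme viewing
online_video_egrps = 150.0  # E-GRPs from measured online video ad views

# Assumed accuracy-loss corrections, ideally derived by comparing CATI against
# a peoplemeter market (which captures minute-level and ad-break viewing):
recall_inflation = 1.10     # assume recall overstates programme audiences by ~10%
break_retention = 0.80      # assume 80% of the programme audience stays for ad breaks

# Convert programme-level CATI GRPs into approximate ad-exposure GRPs.
tv_ad_grps_est = tv_grps_cati / recall_inflation * break_retention

total_video_grps = tv_ad_grps_est + online_video_egrps
print(f"Estimated TV ad-exposure GRPs : {tv_ad_grps_est:.0f}")
print(f"Online video E-GRPs           : {online_video_egrps:.0f}")
print(f"Online share of total video   : {100 * online_video_egrps / total_video_grps:.1f}%")
```

Crude as it is, a guideline of this sort at least makes the assumptions explicit and adjustable, which is a step up from splitting the video budget on gut feel alone.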
To conclude by returning to the starting point of this piece: big data in media audience research in the strictest N=All sense is limited mainly to online campaign analytics, wherein all of the exposures are captured. The rest needs to be qualified as N=Nearly All. Whether because most census big data is actually proprietary (e.g. websites' server data), or difficult to extract (e.g. all STB data across all operators in a specific market), or unrepresentative of the total market (e.g. IPTV STB data in the MENA market), or any combination of these, all of the data points are simply not available. What census big data does is add to panel sample data, giving it more depth and breadth.
In other words, be it with the largest formal systems being developed at an industry level or the smallest in-house calibration projects, the alleged death of sampling in the big data era simply does not hold up to the reality of media audience measurement. Sampling quality affects not only the offline media it directly measures but also the relative value - and therefore budget share - of the online media it shares the pie with. Hence there's a real need to ensure that the old-fashioned checks are in place: that samples are robust, random and representative, and that the analyses are rigorous. This is particularly true for markets where those checks were much less in focus to begin with. The big data era, far from making it redundant, accords arguably greater importance than ever before to 'small data'.
Welcome to the Medium Data world of audience research! 

Wednesday, June 10, 2015

On Big Data: A Revolution That Will Transform How We Live, Work and Think

Big Data: A Revolution That Will Transform How We Live, Work and Think, the 2013 book by Viktor Mayer-Schonberger and Kenneth Cukier which I read recently, can be divided into three broad parts (division mine).

Part 1: "By changing the amount, we change the essence"
The authors begin by arguing that the sheer volume of data today marks three fundamental transformations, which they cover in three succinctly titled chapters, namely: (a) "More": the ability to collect all or nearly all of the datapoints makes sampling-based extrapolation redundant. Today, with ever more data available, the sample is the universe, or N=All. (b) "Messy": with this, the need for exactitude recedes as error margins become relatively insignificant. You can live with messy datasets as long as they are big enough to deliver insights and results that smaller ones can't. (c) "Correlation": what matters are associations, not their explanations. Predictive quality is everything and causality nothing. It is good enough to know that there's a correlation between variables even if it isn't clear why.

Part 2: Data as the 'oil of the information economy'
The second part covers the increasing 'datafication' of our world in general, the value created by the datafication of business specifically, and the implications of that for the ecosystem. The book draws an interesting distinction between 'datafication' as the process of quantifying the world in analyzable formats and 'digitization' as the means that "turbocharges" that process. The value of data is likened to an iceberg: most of it is below the surface. Value accrues both from the primary use for which it was collected and from its reuse and extension beyond that purpose. This is the "option value of data" and is a key driver of the ecosystem today. 'Data exhaust' - information from users' usage and interactions online - is being used to "train" systems and drive improvements in areas like speech recognition, translation, etc. The book identifies three types of players in the data value chain: those who own the data (often only incidentally), the analytics experts who apply their skills to others' data, and those with the 'mindset' - the entrepreneurs who see the opportunity and build businesses around data. All are trying to position themselves at the point of maximum leverage, and data owners are unlocking value by processing and selling information to outside parties. The authors argue that over time, as data skills and mindsets become more common, it's the data itself and the data owners who will be the winners in the chain. The authors also discuss the end of domain specialists, with data scientists 'letting the data speak' and making the decisions.

Part 3: The dark side of big data
The third section covers privacy, data protection and related legal / regulatory issues. It talks about the risks a big data world poses to individual liberty. On the controls side, since it cannot be known in advance exactly how individuals' data is going to be used, the authors argue for a move from "privacy by consent" to "privacy by accountability", whereby data users take on the responsibility of ensuring that it is not misused. Interesting, but iffy! This last section didn't grip me as much as the first two.

The book is laced throughout with interesting examples which fuel the arguments and concepts very well. Some are the familiar generic types - recommendations based on predictive analysis on e-commerce sites like Amazon or Netflix, sentiment analysis of social media data, etc. Then there are specific well-known examples - Google and flu trends, Walmart stocking Pop-Tarts before hurricanes, Target and the pregnant teenager. And there are some lesser-known but equally fascinating examples. Consider how the pressure a person applies to a car seat can be mapped through sensors to assign a unique digital ID and used as an anti-theft feature (and for other purposes like safety, to boot). Or how in 2008 the Billion Prices Project at MIT used web-crawling software to track five times as many product prices in a day as the official CPI system did in a month, and predicted the post-Lehman deflationary swing a couple of months in advance. A standout example was the automaker which, having discovered a faulty part through data collected from its cars, went on to sell the patent for the fix to the supplier!

On the flip side, the book tends to be a bit repetitive. I can also see how it could perhaps be too basic for hard-core practitioners. Finally, the book has a bit of an all-or-nothing ring to it, especially with some of the bolder arguments like the deprioritization of causality or the end of specialists. That these hold only within a context is not always made clear, and the exceptions are perhaps glossed over. Even so, the book largely presents those contexts as they are and doesn't tip over into exaggeration, as a book like this easily could have. A paradigm shift is undeniably on the way, and that this is its starting point is very well brought out by the authors.

Overall, I found the book to be an excellent layperson's introduction to this important and very current topic - and a very enjoyable read at that!