Monday, May 27, 2013

Musings on Big Data and Privacy

Big Data and Privacy. Or should a Big Box store figure out if someone is pregnant?  

Is that Private?


So what is Big Data? Is it the latest 'fashion statement' from the IT world? A bunch of numbers and letters that represent something or someone? Something of an asset?

All of the above and more. Basically it is the information, or data, that is generated by everyone and everything. Examples of Big Data include this particular blog post on the web, the decoding of the human genome, the buying habits of your customers, your credit score, etc.

It's 'stuff'.
Google’s CEO Eric Schmidt stated: “From the dawn of civilization until 2003, humankind generated five exabytes of data. Now we produce five exabytes every two days…and the pace is accelerating.”

So that is Big Data. But how does it concern privacy? Before we go there, let's reflect on the issue.

Companies are generating great mounds of data: everything from what groceries you purchase (those customer loyalty cards) to which credit cards you use and where.

This is an asset to the company. It is something that can be analyzed, inspected, and reported on, all for the purpose of gaining an edge over competitors, understanding customers better, and learning how to market to them for the best results. What tickles their fancy, so to speak? Maybe the company can get that same customer to buy milk from it as well as the clothing they buy now.

While doing the research for this blog I came across an interesting case study concerning this issue.
A major Big Box chain’s (not Wal-Mart) department of thinkers (not a real department, but it could well have been named that) got together to see if they could 'predict' which of their customers were pregnant.

The reasoning was that if they could get that pregnant customer to start buying the 'stuff' needed for the happy occasion, they could influence her buying patterns in the future. A better 'bottom' line (pun intended).

They had all this raw data about their clients and their buying habits. They could mine the information (Big Data) and determine whether there were any patterns. And the results were, to say the least, eye-opening.

Now, this blog is not the place to have a detailed discussion about this, but suffice it to say that the mathematical model that was developed was more than 87% successful in predicting, based solely on buying habits, which of their clients were pregnant. They were then able to target the pregnant customers with coupons, flyers, etc. in hopes of getting them to buy more ‘STUFF’.

This was done without a client filling out a form letting the company know she was expecting; Ms. Jane Doe, customer, had yet to buy a single diaper. The mining of this client’s information from the company database, which indicated her buying habits, was the only determining factor.
That is what Big Data is, and what it can do.
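
To make the idea concrete, here is a minimal sketch of scoring customers from purchase history alone. It is not the chain's actual model, which is not described here. The item names, weights, threshold, and customer IDs are all invented for illustration.

```python
# Hypothetical signal weights: the items and numbers are invented for
# illustration and do not come from any real retailer's model.
SIGNAL_WEIGHTS = {
    "unscented lotion": 0.30,
    "prenatal vitamins": 0.45,
    "cotton balls": 0.15,
}


def likelihood_score(purchases):
    """Sum the weights of the distinct signal items bought, capped at 1.0."""
    seen = set(purchases)
    score = sum(w for item, w in SIGNAL_WEIGHTS.items() if item in seen)
    return min(score, 1.0)


def flag_customers(histories, threshold=0.5):
    """Return the sorted IDs of customers whose score meets the threshold."""
    return sorted(
        cid for cid, items in histories.items()
        if likelihood_score(items) >= threshold
    )


# Hypothetical purchase histories keyed by an invented customer ID.
histories = {
    "customer_a": ["milk", "unscented lotion", "prenatal vitamins"],
    "customer_b": ["milk", "steak"],
    "customer_c": ["cotton balls", "unscented lotion"],
}

print(flag_customers(histories))  # ['customer_a']
```

Notice that nobody in this sketch ever told the store anything. The flag comes entirely from what was in the shopping cart.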

Can you see the issues in privacy in all this? Actually, there are really three different issues when dealing with Big Data.

Is what the company is doing legal?
Is it ethical?
Is it acceptable to the general public? 
Let's tackle the legality first.
It’s not a simple answer. There are a lot of variables involved. Where does the customer live? Did he/she give permission to the company to use the data collected for internal (and maybe external) use? These are but two questions that privacy officers need to deal with, address and ultimately sign off on.

Generally speaking, we can assume that when a customer signs up for a loyalty card, there is some form of authorization to use the data. Or at least best practice demands some sort of disclosure, if nothing else. And this may be the easiest of the three questions.

Is it ethical? 

PhD theses have been written about this very question for years. There is no government review panel to determine whether it is ethical or not, but the question is still very valid. One education site states that 'ethics refers to standards of behavior that tell us how human beings ought to act in the many situations...'

http://www.scu.edu/ethics/practicing/decision/framework.html
While there are no hard and fast rules on what is and is not ethical, one can, if nothing else, look into the mirror and ask the question: is this OK?

Is it then acceptable? 

Going back to the story above, let’s see what happened. After the store created the model, they started sending flyers and coupons that targeted the would-be moms. Diaper coupons, flyers featuring cribs, etc. were sent out to the targeted group.

Well, you can imagine what happened next. Many irate customers wondered, first of all, how this company knew they were expecting. Even more damaging to the company’s reputation was the fact that they were sending baby-oriented coupons to non-pregnant clients. And what if those targeted customers were teenagers, and/or single, and/or religious?
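
A bit of back-of-the-envelope arithmetic shows why so many coupons can land on the wrong doorstep even with a model that sounds accurate. This is only a sketch under loud assumptions: the customer count and the 3% pregnancy rate are invented, and reading the 87% figure as both the hit rate on pregnant customers and the correct-rejection rate on everyone else is my own simplification, not something from the case study.

```python
def coupon_outcomes(customers, pregnant_rate, sensitivity, specificity):
    """Estimate correctly and wrongly targeted customers for a mailing.

    sensitivity: share of pregnant customers the model flags.
    specificity: share of non-pregnant customers the model leaves alone.
    """
    pregnant = customers * pregnant_rate
    not_pregnant = customers - pregnant
    true_positives = pregnant * sensitivity
    false_positives = not_pregnant * (1 - specificity)
    # Precision: of everyone who gets a coupon, what share is pregnant?
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


# All four inputs are hypothetical numbers chosen for illustration.
tp, fp, precision = coupon_outcomes(
    customers=100_000, pregnant_rate=0.03, sensitivity=0.87, specificity=0.87
)
print(f"correctly targeted: {tp:.0f}")
print(f"wrongly targeted:   {fp:.0f}")
print(f"share of coupons reaching pregnant customers: {precision:.1%}")
```

Under these made-up numbers, most of the baby coupons go to people who are not expecting, simply because the non-pregnant group is so much larger.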

A public relations nightmare. In fact, while doing the research, I was surprised that this had not been thought out more thoroughly in the marketing department of the company.

All these factors play in the realm of Big Data. And privacy is just one of those factors.

Ultimately, the people responsible for privacy need to assure themselves that the use of the data is within legal constraints. 

It can be more complicated if the data being analyzed is sent out to another company. There are 'mounds' of companies whose only job is to massage the data and make sense of it. They can then market to those clients (the pregnant ladies from the above example) with targeted campaigns as successfully as possible, to get the best return on the data (the Big Data).

Big Data means being able to see trends and patterns, not determining individuals' buying habits per se.

No one at Costco cares if the individual named Robert will buy a steak or a bottle of milk. What they do care about is the group that Robert ‘belongs to’, so they can somehow influence that targeted group to buy both products (as an example).

So an argument concerning privacy can go something like this:

It's not the PII of a particular person that is being used (for the most part) for this type of analysis, but the fact that a customer bought an item and that he is middle-aged, six feet tall, lives in a middle-class area, etc., and that he belongs to a statistical group that represents 25% of the customer base in a particular region.
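
The group-level view in that argument can be sketched in a few lines. The records, field names, and segment labels below are all hypothetical. The point is only that the name is never read when the segments are built.

```python
from collections import Counter


def segment_shares(customers):
    """Return each (age_band, area) segment's share of the customer base.

    Only the two segment fields are read; names are ignored entirely.
    """
    counts = Counter((c["age_band"], c["area"]) for c in customers)
    total = sum(counts.values())
    return {segment: count / total for segment, count in counts.items()}


# Hypothetical customer records, invented for illustration.
customers = [
    {"name": "Robert", "age_band": "middle-aged", "area": "middle class"},
    {"name": "Alice", "age_band": "young", "area": "urban"},
    {"name": "Carol", "age_band": "young", "area": "urban"},
    {"name": "Dave", "age_band": "senior", "area": "rural"},
]

shares = segment_shares(customers)
print(shares[("middle-aged", "middle class")])  # 0.25
```

Of course, the name is still sitting in the record, which is exactly why the 'maybe' that follows matters.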

Maybe. But then again is that the only usage of these great mounds of data?

The debate on Big Data, how to handle it, and the ramifications on privacy will continue. What we need to do is have the dialog, ask the questions, and figure out what can and should be done.

The concerns won't go away, and ignoring the issues will only make them worse. We all need to first understand the issues and then try to make 'a go at it', while at the same time making sure we don't shoot ourselves in the foot.

