In a previous post I covered the scope of Big Data and touched upon some of its uses. In this second post I look at how it can be misused both to compromise your privacy and to limit your choices as a consumer.

Do you, like me, get annoyed by those pesky little adverts which seem to follow you around from website to website? You may not realise it, but that obstinate little advert is targeted at you because you probably searched for the particular product brand or genre, or mentioned it on social media or even in an innocuous email. I once tried an experiment by changing my age on Facebook, and the usual adverts were replaced by a completely different type aimed at my (new) age group. This is just the thin end of the wedge of what is coming as more and more companies get up to speed with Big Data. The trivial stuff can usually be dispatched with just a little knowledge, such as turning off third-party cookies in your browser settings, as these are the ones that follow you around from site to site. There is also the ‘do not track’ setting in browsers, although some websites ignore it or only partially abide by it. Less well known is that you can actually see some of the companies which are customising ads specifically for you by heading over to the AdChoices website, and while you are there, why not make your own choice of those you want (if any) targeting you?
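
To make the third-party cookie point concrete, here is a minimal sketch in Python. It is a toy simulation and not how any real ad network is implemented: the `Browser` and `AdNetwork` classes, the `adnet` cookie name and the example sites are all made up for illustration.

```python
from collections import defaultdict


class Browser:
    """Toy browser: just a cookie jar and one privacy setting."""

    def __init__(self, third_party_cookies=True):
        self.third_party_cookies = third_party_cookies
        self.cookies = {}


class AdNetwork:
    """Hypothetical ad server embedded on many unrelated websites."""

    def __init__(self):
        self.next_id = 1
        self.history = defaultdict(list)  # cookie id -> pages it was seen on

    def serve_ad(self, browser, site, topic):
        # The cookie belongs to the ad network, not to the site being visited,
        # so the same id comes back whichever site embeds the advert.
        cookie = browser.cookies.get("adnet") if browser.third_party_cookies else None
        if cookie is None:
            cookie = self.next_id
            self.next_id += 1
            if browser.third_party_cookies:
                browser.cookies["adnet"] = cookie
        self.history[cookie].append((site, topic))
        return cookie


def longest_profile(browser, network):
    """Visit three unrelated (invented) sites; return the longest trail the network holds."""
    visits = [
        ("news.example", "politics"),
        ("shop.example", "running shoes"),
        ("blog.example", "marathon training"),
    ]
    for site, topic in visits:
        network.serve_ad(browser, site, topic)
    return max(len(pages) for pages in network.history.values())


if __name__ == "__main__":
    print(longest_profile(Browser(third_party_cookies=True), AdNetwork()))   # prints 3
    print(longest_profile(Browser(third_party_cookies=False), AdNetwork()))  # prints 1
```

With third-party cookies allowed, one identifier links all three visits into a single profile. With them blocked, each visit looks like a stranger to the network.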

Did you know that, unless you have opted out, Google scans your Gmail for the purposes of targeted adverts? To disable the annoying ‘interest based’ ads in Gmail, head over to the settings page and turn them off. This won’t, of course, stop Google from scanning your email, and if that is of concern then you should change your email provider, although that doesn’t guarantee your emails won’t be scanned by some government agency or other. Then again, you may consider a small loss of privacy a small price to pay for free email.

As covered in a previous post, Big Data refers to large, diverse, complex, distributed datasets, unstructured or semi-structured, generated from a variety of sources including Internet transactions, email, video, click streams and pretty much every other digital source. In itself Big Data is not a threat to your privacy; it is what can be done with it through analytics, in some instances in (nearly) real time, that poses the biggest threat.

Ever wondered about those silly little discount vouchers you get handed along with your shopping receipt? Catalina Marketing delivers point-of-sale offers to individual customers on behalf of retailers. Its database is huge and is growing by over 300 million retail transactions every week! No wonder the supermarkets love it, as it produces redemption rates of up to 25% compared to an industry average of 6-10%. On its website Catalina says its Vision is to:

‘unleash the potential to know, engage and empower the consumers of the world’

It then goes on to list how it goes about this: Acquire (customers), Maximise (basket size and trips) and Retain (customers). However, it doesn’t expand on the third part of its vision, namely how it ‘empowers the consumers of the World’. I don’t particularly feel empowered by a 20p discount off my next purchase of product xyz if I buy it before the day after tomorrow.

With the wide variety of potential uses for Big Data analytics, there must be discussion about the legal, ethical and social implications, and about the protection of privacy and other values in a Big Data world. Of course not all Big Data is personal, but where it is, it should be obligatory to notify consumers about what use is made of their data and to provide a way for them to opt out, and not just by the smallest of tick boxes at the bottom of the page.

Perhaps a more worrying aspect is that of dynamic pricing, where prices are changed according to the availability of or demand for a product, or where individuals are even offered different prices based on their profile as someone who is either ‘affluent’ or ‘budget-conscious’. There is some evidence that this is already taking place, although it is probably rare at this time because of the lack of data, but it will undoubtedly become more prevalent unless legislation is introduced. Legislation, though, will always be playing catch-up, especially with such a fast-developing technology.
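
As a rough illustration of how little logic profile-based pricing needs, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `quote_price` function, the percentage adjustments and the profile labels are invented and do not describe any real retailer’s system.

```python
# Percentage adjustments per profile label (invented figures, for illustration only).
PROFILE_ADJUSTMENT = {"affluent": 10, "budget-conscious": -5}


def quote_price(base_pence, stock_low, profile=None):
    """Return a price in pence after a demand step and a profile step.

    Integer pence and whole percentages are used so the arithmetic is exact.
    """
    # Step 1: availability or demand. Scarce stock pushes the price up for everyone.
    demand_pct = 20 if stock_low else 0
    price = base_pence * (100 + demand_pct) // 100

    # Step 2: the individual's profile. Unknown profiles get no adjustment.
    profile_pct = PROFILE_ADJUSTMENT.get(profile, 0)
    return price * (100 + profile_pct) // 100


if __name__ == "__main__":
    print(quote_price(1000, stock_low=True, profile="affluent"))           # prints 1320
    print(quote_price(1000, stock_low=False, profile="budget-conscious"))  # prints 950
```

In this toy example two shoppers looking at the same 1000p item are quoted 1320p and 950p, and neither has any way of knowing it.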

Presently any data that can be used to identify an individual is classed as personal data and is regulated by the Data Protection Act. But what about those businesses that operate in a number of countries, or use cloud services, so that they don’t know or even care where that data is held and therefore which regulatory regime applies? Although there is a proposed EU General Data Protection Regulation (GDPR) in the pipeline, will it go far enough to protect the individual? One positive is that businesses will have to notify any breaches resulting in the release, corruption or loss of personal data, which is not currently the case.

Anonymous data, that is data in a format such that no individuals can be identified, is not subject to the Data Protection Act. Privacy protection technology can strip the data of any individual identity, but equally powerful technology is able to bring it back together and re-identify individuals. The ‘mosaic effect’ can likewise pull together diverse datasets which do not include personal identifiers, perhaps from entirely different sources or locations, to build a profile of an individual. More resource is being used to re-identify ‘anonymous’ data than is used to enhance privacy. This raises the question as to what rights the individual will have over data derived from multiple datasets.
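
Re-identification can be surprisingly simple in principle. The sketch below, in Python, joins an ‘anonymous’ dataset to a public one on a few shared attributes. It is only an illustration under stated assumptions: the `reidentify` function, the choice of postcode, birth year and sex as the linking fields, and all of the records are invented for the example.

```python
def reidentify(anonymous_rows, public_rows, keys=("postcode", "birth_year", "sex")):
    """Link rows that share the same combination of attributes.

    Only unambiguous links are returned: if two or more named people share
    a combination, that record stays anonymous.
    """
    # Index the public dataset by the chosen attribute combination.
    index = {}
    for person in public_rows:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])

    matches = {}
    for row in anonymous_rows:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:
            matches[names[0]] = row["diagnosis"]
    return matches


# Invented 'anonymised' medical records: no names, but some attributes remain.
ANONYMOUS = [
    {"postcode": "AB1 2CD", "birth_year": 1961, "sex": "F", "diagnosis": "diabetes"},
    {"postcode": "EF3 4GH", "birth_year": 1985, "sex": "M", "diagnosis": "asthma"},
]

# Invented public register: names alongside the same attributes.
PUBLIC = [
    {"name": "Alice Example", "postcode": "AB1 2CD", "birth_year": 1961, "sex": "F"},
    {"name": "Bob Example", "postcode": "EF3 4GH", "birth_year": 1985, "sex": "M"},
    {"name": "Carl Example", "postcode": "EF3 4GH", "birth_year": 1985, "sex": "M"},
]

if __name__ == "__main__":
    print(reidentify(ANONYMOUS, PUBLIC))  # prints {'Alice Example': 'diabetes'}
```

Neither dataset on its own ties a name to a diagnosis, yet together they do for anyone whose combination of attributes is unique. Every extra dataset added to the mix makes more people unique.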

A recent example of the use of such technology may have been when a certain intelligence agency was able to identify a British terrorist in Syria from social media. It is known that Raytheon has developed an algorithm that is capable of a high level of insight into an individual from his or her social media landscape. Presumably this can be done under the current legal framework, despite recent EU guidelines stipulating that personal data cannot be processed for purposes incompatible with those for which it was initially collected. Yet this same data is invariably being passed on to marketing firms, as consent is not always needed despite the common belief that it is. A system whereby you have to tick a box to either give or withhold consent is largely unworkable, particularly given that only around 7% of people ever read the terms and conditions in their entirety, although many more suffer for not having done so. Within the GDPR framework a tightening of the law on this issue is proposed, in that explicit consent would have to be given, although there is much opposition to this, with opponents arguing that ‘legitimate interest’ should be used as the de facto standard.

Data-tagging is one technique that can be used to control access, but only a few per cent of the data mountain is actually tagged. Similarly, only a small fraction of the data that should be protected actually is. There is a wide disparity between the control consumers have over information about them and the control exercised by the companies that hold it, and consequently a risk that vulnerable populations will be exploited.

The profiling of consumers and their purchasing patterns is often done by unregulated data brokers, using data collected when a consumer interacts with a brand or via online ad network interactions, social media and a whole host of other services. This may then be enhanced with data from public records or from other commercial sources to provide an exceptionally detailed profile of a consumer, which is then brokered to marketers for targeting that individual with interest-based ads.

Most people know about credit scores, but what most don’t realise is that there exist dozens of other consumer scores, created by data brokers, analytics firms and retailers, which are under the radar and can have a big impact on your daily life as a consumer. Apart from the intrusion, the most worrying aspect is that you have very little say in what a score says about you and little ability to correct any inaccuracies caused, say, by identity theft or inaccurate conclusions. The scope of these ‘secret’ scores is very wide, from household wealth and financial habits through to the likelihood of you defaulting on a loan or losing your job, and the state of your health. The context in which such scores are used is very important, and legislation must be passed that limits their use. If you talk to the agencies that assemble such profiles they will tell you it is all for the benefit of the consumer in providing targeted advertising, and emphasise that it is all perfectly legal. In reality all such data can have a dual purpose. Take, for instance, the fusing together of medical research analysis with medical records and genomic information: it may give rise to earlier and better treatment, but could also be used to assess insurance risk and employment suitability.

There is undoubtedly an overwhelming balance of power in favour of the vendors and users of such data. As consumers and citizens we should have the right to transparency about the extent of the information that is held and bartered about us as individuals, and the opportunity to challenge any inaccuracies in it.

It is only going to get worse. There is a whole tsunami of exclusions and targeted marketing heading your way as more companies get on board with Big Data, considering that so far only a small proportion of the accumulated data has been analysed and only around half of that which should be protected has been. Throw into the mix that such data will double around every two years and you have a likely scenario in which Big Data will be watching you in almost every aspect of your life, not only as a consumer but also with the potential to compromise your privacy as a citizen.

Fortunately not all is doom and gloom, since Big Data can also be very beneficial in finding the proverbial needle in the haystack, such as by uncovering meaningful genetic variants for a disease, or by alerting law enforcement in real time and directing them to the location of gunshots in American cities. I will be covering both beneficial and more worrying applications (depending on your viewpoint) in a future post, so stay tuned.

