The downside of open data

(Anirudh Dinesh) #1

I found an interesting case about the possible misuse of open data. Does anyone know of any other similar case studies?

(Maria Hermosilla) #2

@anirudh which case are you referring to? Can you put a direct link please?

(Anirudh Dinesh) #3

@MariaHermosilla This is the United States Eight Maps case study. @andrew worked on this. Any more examples, Andrew?

(Andrew Young) #4

There’s an interesting discussion of the privacy implications of data sharing (including but not limited to open data) in this paper: A decision model for data sharing.

(Bernhard Krabina) #5

I have been asked recently about examples of the negative effects of Open Government Data. Honestly, I could not think of anything…
I would be interested to hear other experiences, though.

(Jan Suchal) #6

(Beth Noveck) #7

Not on open data per se, but check out the Seven Deadly Myths of Transparency.

Wide range of resources on the side about the downsides of transparency for legislative effectiveness.

(Justin Longo) #8

I suggested once that there is a potential downside to open government data, but it requires two premises:

  • that those advocating for or implementing open government data initiatives have a particular objective with respect to transparency and accountability (e.g., that if parents have data on school performance, we’ll get better schools as a result);
  • and that NPM (new public management) has been revealed to be an ineffective way to run government.

As Johnny Carson said: “if you buy the premise(s), then you buy the bit.” Most people I’ve heard from on this don’t accept the premises.

The paper is available here.

(J. Albert Bowden) #9

misuse is subjective; if data is only supposed to be used for certain things, it’s not open at all. i’ll assume that misuse in this case is something with negative outcomes, but even then i disagree with its usage along the lines of open data. even the eight maps example shared isn’t misused…it was poorly executed. that data should have been anonymized, and had it been, could never have been misused in this manner.
one case in particular i remember arguing about: someone felt that using traffic camera locations to avoid them/not get a ticket was a misuse of data. i fundamentally disagree with that opinion on all accounts. that data is provided for citizens to use, but more importantly, traffic cameras are routinely gamed to make profits off of citizens. all’s fair in love and open data.

the real arena where misuse happens is with government; selling open data, denying access to open data, etc. that is where the misuse occurs.
another area of heavy abuse is for-profit entities consuming massive amounts of open data and selling them as a service, while contributing nothing back to the ecosystem.

that said, here is one example of seemingly misused open data:
businesses mining student directories and sending political ads/messages to students.

(Sebastian Haselbeck) #10

Personal information isn’t “open data”. By most definitions, data that can be opened up does not include any personal or sensitive information (including security-related data, etc.).

(Greg Bloom) #11

It certainly does depend on what your values and premises are, but – assuming those values are things like equity, fairness, and representation (as opposed to, say, helping businesses make money) – then yes the case for skepticism about the use of open data is well made and substantial.

Conceptually, it goes like this:

  1. The amount of use one can make of open data is correlated with various factors like data literacy, aroused incentives, know-how, connections, etc. The more of all of these things you have, the more likely you are to find and use open data for your benefit.

  2. The most likely actors to have high degrees of all of those factors are corporations (health insurance, pharmaceuticals, etc), agents of capital (real estate developers, etc), and people from upper classes generally.

  3. Oftentimes the interests of those actors in #2 are not aligned with public interests, especially the interests of people and communities who are least likely to have high degrees of the factors in #1 that yield benefit from open data.

So the most likely uses of open data are ways in which already-powerful interests act to increase their power.

To the extent that most advocates for open data seem to care about democratic processes, public interests, etc, this disconnect suggests the likelihood of significant, systemic misalignment between intentions and outcomes.

There are plenty of documented instances of this, especially in land use and housing (see Tom Slee and Jessica McKenzie in Civicist), and I assume you’d find other examples in health, criminal justice, etc. I’ve personally seen well-intentioned civic hackers use open data on education to help upper-middle-class parents accelerate a process of public education segregation and disenfranchisement.

In other words, the real misuse of “open data” may not be about bad actors so much as the misuse of the concept itself as a flimsy stand-in for what otherwise should be broader and more robust civic strategies for effective democratic governance and collective action for public interest.

(Jan Suchal) #12

That’s not true at all. See, for example, data from company registers, which contain personal data of statutory bodies, personal data of owners, etc. Information on who is benefiting from companies is also personal data.

Property owners in land registries - e.g. New Zealand

Procurement data containing winners and bidders (these could be also natural persons, not only companies = personal data)

and many, many more. All of these have tremendous value for transparency and make it possible to create a healthy open data business ecosystem.

(Jan Suchal) #13

BTW - GDPR (the upcoming EU regulation regarding personal data protection) is causing a lot of confusion in the Open Data / Open Research data world - see

(Anna Kuliberda) #14

I find the example of this post-mortem quite interesting:
Should Schools Be Closed? Learning from Schooloscope, an OpenData post-mortem

(Sebastian Haselbeck) #15

You are right, my statement was a bit sloppy. A lot of the “open government data” advocacy, however, concentrates on the “open by default, except…” philosophy, wherein all data is to be opened unless it contains sensitive personal information or personally identifiable data, or sensitive data regarding, for example, national security or business secrets. The other cases you name are of course valid. Ownership data is especially contentious; however, different EU member states seem to follow very different approaches on this (compare e.g. the UK vs Germany). Thanks for the link to the okfn discussion thread.