Edited to add: there’s a follow-up post:
What follows is the on-the-fly translation of this week’s installment of my “DataKnightmare” podcast. It aired as 1×28 on April 24th, 2017.
Please look beyond prose and style, this is serious, it’s happening now and I am furious –W
These are strange times: you’re barely done seeing a problem through your traditional black-tinted glasses when the facts whiz by in a sports car, head out of the window, lampooning you and giving you the finger.
You’ll remember four weeks ago we talked about Google Deep Mind and the Royal Free NHS Trust. We noticed how five years of health data about some 700 thousand patients had been quietly handed over in a secret deal to Google for its AIs to play with. OK, for its AIs to train.
And you’ll remember just three weeks ago we were reading JM Porup’s “95 Theses of Cyber“. These two are now especially appropriate:
- To own a nation’s data is to enslave that nation.
- Extracting a nation’s data is a modern form of imperial conquest.
So you remember all this, you remember the distinct conspiracy theory aftertaste it had, eh?
Very well. Let me welcome you now to the mother of all data grabs.
Imagine a reasonably advanced country, maybe decaying a little but still a civilised one. Imagine its Prime Minister willing to attract the investment of a large multinational corporation. Imagine he is willing to offer the usual financial incentives. Imagine also that, incentives not being enough, the PM offers the corporation free, unrestrained access to his entire nation’s health data. Not just anonymous health data, but personally identifiable health data too.
Every prescription, every clinical record, each patient’s entire diagnostic history, all the data in the Substance Rehab Agency archives, each and every Emergency Room case, each and every appointment with a specialist. And of course all genetic data.
Everything. With no control. No warranty of limited and proper use (no use limitation, either). No return, not even financial. And, to top it off, no tender either.
Good. Now look out the window and wave hello: that country is our country, Italy.
As it happens, I just stumbled across this story. It hit the news a couple of months ago, and completely failed to raise a single eyebrow. It’s an investigation (behind a paywall on the newspaper’s site: article1, article2; full text on Mr. Barbacetto’s blog: article1, article2) by Gianni Barbacetto for the national newspaper il Fatto Quotidiano. Mr. Barbacetto is one of the best investigative journalists we have left, and the fact that in two months no one has picked up his investigation is strangely telling.
So let’s have a few facts.
March 2016. Then Prime Minister Matteo Renzi announces he has convinced IBM to invest in Italy. On the site where Expo 2015 was held, IBM is going to build a brand new research centre for Watson Health. The project, at a cost of about 150 million dollars, will house some 400 researchers. The announcement is promptly picked up by the media: Panorama, Repubblica, Corriere Comunicazioni.
Just like in the Royal Free / Google Deep Mind case, there is no contract. Only a Memorandum of Understanding. Which is kept confidential, that’s to say secret, hidden from questioning eyes, for over a year.
In February 2017, Mr. Barbacetto manages to get hold of a copy. Here’s what the MoU says, quoted from his own article of February 19, 2017 (translation mine):
“As a prerequisite for the project and the investment, IBM (including its controlling, controlled, affiliate and linked companies, where necessary) expects to gain access –in ways that will be defined– to the treatment of the health data of the roughly 61 million Italian citizens (i.e. historical, present and future health data) both in anonymous and personally identifiable form, for specific project purposes, including a right to secondary use of the aforementioned data for purposes that go beyond the present project.
By way of example, and the list is by no means meant to be exhaustive, it is crucial to access Patient Data, Pharmacological Data, Tumor Register Data, Genomic data, care data, regional data, State Drug Agency data on drugs, undergoing clinical studies, enrollment and demographic data, diagnostic histories, Refunds and Costs of Use, Conditions and Medical Procedures, Outpatient settings, pharmacological treatments inclusive of costs, Emergency Room records, hospital discharge records, information on specialist appointments, including times and attendance, and other health information.”
End quote. Fantastic, no? Very interesting, bordering on the comical, especially that “in ways that will be defined”. I guess it means that first and foremost we state that IBM will access all data and then, maybe, we’ll define exactly how it will do it. If time allows, of course, because we don’t want to get in the way of business, here. And anyway, it’s just an MoU, not a contract. Which would be fine, if this were Sweden. This being Italy, one year down the line there’s still no contract in sight.
So, let’s recap.
IBM expects to access all of our health data. In anonymized form, says the MoU, but since the data are exhaustive, anonymization is little more than a farce. And anyway the MoU says that the data will also be accessed in personally identifiable form. So much for anonymization.
And what is the goal of this access, apart from letting Watson (IBM’s AI) play with it?
In a follow-up article two days later, Mr. Barbacetto unveils an “Industrial Development Contract Proposal”, also confidential, dated 26 January 2017. Nine months after the MoU was signed. The proposal is between an IBM company called SoftLayer Technologies Italia srl, and the Ministry for Economic Development and its investment attraction agency, Invitalia.
The goal of the contract is to launch in Milan the first European center for Watson Health, the health-market version of IBM’s famous AI.
The project envisions bold goals, as befits IBM (translation mine, again):
“To generate strategies for appropriate, coordinated care”; “to improve the management of high-risk, high-need patient clusters, lowering service costs and improving patient results”; “to give citizens and businesses easier access to the data patrimony owned by the public administration”; and even “develop research projects on big data, infectious diseases, elder care, predictive precision oncology”.
“All information contained in the present document or anyway related to the Program and the Projects in the Industrial Development Contract Proposal are strictly confidential between IBM and Invitalia.” “Information may not be disclosed to third parties and/or made publicly available without prior signed approval by IBM and without the signing of a Non-Disclosure Agreement between the parties.”
How come all this secrecy in a public contract, a contract related to the treatment of the health data for an entire country? What happens to the much-praised transparency? Or maybe, just maybe, transparency may allow somebody to wonder why IBM and not someone else or even, God forbid!, why IBM and not every other interested, qualified party, just to see some market competition, for a change?
Or maybe transparency may allow one to wonder what’s Italy’s interest in the deal, what does it get in return? Here’s the point: there is nothing in return.
In return for using this data trove, secretly and without giving any warranty as to proper use and data protection, IBM will give… absolutely nothing.
Again, Mr. Barbacetto on the Contract Proposal:
“IBM will keep the intellectual property of the pre-existing cognitive platform (IBM Watson) as well as that of the new Watson solutions and of any tools that will be developed.”
“will own research results and will provide other potential project partners with a User License”
And of course IBM will be
“free to reuse the collected data for other uses beyond the project scope”.
Basically, IBM can do whatever it wants with the data. Not only is IBM not paying to access the data, but any development made possible by those data will be its exclusive property! But, since they are nice guys after all, they may “provide other potential partners” with a User License.
What the FUCK, eh? WHAT THE FUCKING FUCK?
Italy is an integral partner of the project from day one, since the project can only start because of Italy’s data. As a Brit might say, “potential my arse”.
This is beyond absurd already, but that’s not all. There’s also the small detail that for this operation IBM will be incentivised with 60 million euros, 30 from the Lombardy Region (where Milan is) and 30 more from MISE, the Ministry for Economic Development.
So, let’s see: if you want to take a picture of your daughter’s school recital you are required to get signed consent from all the other parents, because privacy. And God forbid a shot ends up on the local news, or you may be sued for privacy infringement.
But if you just want to use the nation’s health data without control, without guarantee, without accountability, without declaring your purpose and without even telling anyone, please feel free. Would you like a cup of coffee, one lump or two?
Well, we need to talk about this.
Lombardy, which according to the draft proposal should shell out 30 million, does not seem too enthused. Giving away citizen health data for free? Who gives a fuck, we just work here. But actually paying money, well, maybe we can do better.
So Lombardy requested an opinion from the Data Protection Authority. The Authority, of course, replied with the legalese equivalent of “WTF!?!” and requested an official clarification from Lombardy.
Because, you know, if you are the largest, richest, most advanced region in Italy you cannot really go to the Data Protection Authority saying “Look, we’d like to give away our citizens’ health data and while we’re at it we’re promising Italy will do the same, without contract, oversight or return and without knowing how they will be used, are you cool with it?” and expect the Authority to give you a pat on the back and an attaboy.
The requested clarification should have been delivered by March 20th. I heard nothing about it, which is OK since I’m just an ordinary Joe. Mr. Barbacetto hasn’t published a new piece, which is more worrying. What, if anything, is happening?
Can our health data be given away like this, without anybody asking permission? Without anybody questioning?
Legally, health data can be shared in some cases. For reasons of medical, biomedical or epidemiological research, our health data can change hands without first requiring our written consent, if there is a law that allows the transfer.
This is why your doctor can consult another doctor regarding your treatment without your consent, and of course you need not give consent to receive a friendly reminder for a prostate cancer screening examination if you are a male entering your 50s.
But there is no law stating that the health data of the entire population can be passed on to a commercial entity without that entity providing specific reasons and goals for using those data.
So, really, what is going on here? From the outside, this looks worryingly like colonisation, as Porup rightly framed it. Giving your data in bulk to a commercial entity like IBM (but if it were Google it would make no difference) means becoming that commercial entity’s colony.
And this cannot happen. On this, we need to take to the streets with our voices and our signs. And if signs are not enough, maybe pitchforks will be, because what we have here on the one hand is political incompetence at a criminal level. On the other hand, the level of economic stupidity is unprecedented.
Because if data is the new oil, as it is, you may be Prime Minister, but you don’t get to give away all of the nation’s health data for free. Because if you try that, you are criminally incompetent at your job, which should be safeguarding the nation’s interest; rather than valorise the nation’s resources, you give them away for free with nothing in return.
It is not inconceivable to initially give data for free if the goal is to develop a new diagnostic product or service. I can understand that, because correctly valorising data is difficult. How much are those data worth? A hundred thousand euros a kilo? And how much data make a kilo? So I can understand if the initial cost of data is set to zero. But, since I’m giving you the data for free, anything that you develop from those data will be our shared property, not just yours.
No way it remains your sole property and maybe if you’re nice you may… they say they may give a user license! Which means maybe you choose to charge me for it instead!
Now, it does not take a PhD to understand this: I give you something you need to develop your product. You either pay up front, or we share the property of the end product. No way you give me a “user license” and even get a chance to choose to charge me for that –I’m not a user, I’m your fucking peer.
There. I don’t understand anymore… Or, better: I do understand, and what I understand is not nice.
At any rate, this does not end here.
Be vigilant, because this cannot end here.
How we choose to handle this thing will decide whether we remain a civilised country or we become a Banana Republic, another third world country ripe for plucking.
Our choice will make the difference. Remember this, and keep your eyes open.
This was DataKnightmare, the look on the dark side of data; I am Walter Vannini. If you are listening, you are the Resistance.