I was excited back in 2010 when HHS started posting breaches on what some would call the “wall of shame.” I knew that we’d only learn about breaches involving HIPAA-covered entities, but at least we were finally starting to get some actual data. Now, more than 6 years later, it’s become clear to me that it’s probably best to just call time of death on the breach tool, despite its popularity with marketers who look for numbers to support their sales pitches.
In this post, I review some of what we are not seeing on HHS’s breach tool, and why it’s really not a source of accurate or helpful information for those who want to understand breaches and incidents involving health or medical data.
Have You Checked the Dark Web Recently?
Last June, when TheDarkOverlord made headlines by advertising patient databases for sale at absurd prices, it was a bit of a wake-up call for me. I had never checked any dark web marketplaces for patient data that might be up for sale. Nowadays, I do, and I occasionally find evidence of breaches that have never appeared on HHS’s breach tool.
But since I’ve already mentioned TheDarkOverlord, let’s start with him/them. Can you explain why none of the three databases recently dumped has appeared on HHS’s breach tool, even though two of those claimed hacks were disclosed by the hacker(s) last year? If you read this site or Protenus’s Breach Barometer, you knew about those two incidents when they were first disclosed. But if you relied solely on HHS’s breach tool for your stats on hacking of patient data, you didn’t know about these incidents – and still don’t.
In contrast, an incident involving Behavioral Health Center in Maine is on HHS’s breach tool, but probably only because I discovered it on the dark web and notified the covered entity, who, in turn, notified HHS.
Here are some more dark web listings that still haven’t shown up – and may never show up – on the breach tool:
A listing for pediatric patients’ information, which I’ve previously noted on this site. Although there have been one or two pediatrics offices reporting incidents, none have reported any that would quite correspond to what the vendor is claiming to possess.
And then there’s this listing:
Where did these data come from? Were these data from the Nevada incident reported by Justin Shafer – the one where the state said it had no indications of misuse of data and that private patient information was secure? The Nevada incident does not appear on HHS’s breach tool. Unfortunately, the dark web vendor would not provide me with a sample of the data, so I couldn’t confirm whether the data for sale were from Nevada, but there is no incident on HHS’s breach tool that appears to correspond to this listing.
Then there was a dark web listing for 5,200 patients’ records from an incident in Minnesota that almost certainly should be on HHS’s breach tool – except that it’s not:
The listing first appeared in a vetted forum with an asking price of $100.00. It was subsequently listed in a second forum with a $99.00 list price. The second listing identified the data as coming from a particular clinic in Minnesota: LifeMedical. DataBreaches.net was able to obtain all the data, but when I contacted LifeMedical in Minnesota and spoke to their outsourced tech firm, PriorityOne Technologies, they denied that the patient data was theirs. DataBreaches.net also contacted eClinicalWorks because there were references to them in the database. They were of no real help, however. Nor did the dark web vendor respond to a private message I sent them on the marketplace seeking further information.
So we have data that can be partially verified by public sources such as Google searches or by calling the patients directly, but whose patient data were these? If no one has accepted responsibility/ownership of this database, no one has notified HHS and probably no one has notified or warned the 5,200 patients that their data was not only acquired by criminals, but was up for sale on the dark web in at least two marketplaces.
And Oh, Those Third-Party Breaches
OK, what might happen if a service that handles documents with sensitive information relating to workers’ compensation cases has a misconfigured server (and no, I’m not talking about the Systema Software leak but a much more recent incident)? DataBreaches.net was so concerned by the exposed reports that @s7nsins was finding and sharing with me that I called and sent messages to a law firm whose clients’ personal and medical information were among the numerous files that were exposed.
To give you a sense of the scope of the problem, here’s just one file exposed on the server that has been redacted by DataBreaches.net. Note all the types of information that were included.
Other files included financial information about settlements, W-9 information, radiological findings, and more. Now maybe some of these records could rightfully be considered public records if they’re evidence in litigation, but even courts generally require some redaction or sealing of sensitive information, don’t they?
So how many hundreds – or thousands – or millions – of individuals may have had their personal and medical information exposed accidentally by this vendor? You wouldn’t have even known about it except for the fact that a researcher contacted me and I just mentioned it here. How many curious researchers – or worse, criminals – may have downloaded all the data? Yes, DataBreaches.net will be following up on this one, but I cannot tell you whether you will ever see it on HHS’s breach tool.
No, I’m Not Done. Not By a Long Shot.
When you think about massive exposed databases or servers like the one mentioned above, it should serve as a sobering reminder that we are only hearing about a fraction of compromised records if our only source of data is HHS’s breach tool.
How many of the misconfigured MongoDB incidents or misconfigured rsync incidents have you seen reported on HHS’s public breach tool? Only a handful at the most, probably, because the data are often owned by entities that are not covered by or subject to HIPAA. In other cases, you may not hear about it because researchers contact me and ask me to just handle a notification but not report what they found – such as the time a researcher contacted me about prisoner medical records that were exposed due to a misconfigured server. The prison was very grateful for my call, but I never saw any report from them to HHS. Had anyone else accessed the server? I’ll likely never know – and neither will you.
And while we’re at it: how about all the SharePoint breaches that entities confessed to in a survey? Where are those reports on HHS’s breach tool?
Researchers continue to report tremendous amounts of data exposed by misconfigured databases, servers, and backup devices. But you probably can’t tell that from HHS’s public breach tool.
Do you remember how the FBI issued a private industry notification in March about all those public FTP incidents it claimed it was aware of? Where are those incidents on HHS’s public breach tool? Even when Shafer reported the incidents to HHS, HHS did nothing to investigate most of them and never added them to the breach tool.
Misconfigured MongoDB databases, misconfigured rsync backups, public FTP servers exposing data... tons and tons of leaking medical or health information that is never reported on HHS’s breach tool. And for the most part, this is not HHS’s fault. But can you really have any confidence in conclusions based on HHS’s public breach tool? I don’t think so.
In addition to the breach tool necessarily omitting incidents involving non-HIPAA-covered entities, DataBreaches.net has already investigated and documented a second problem or limiting factor with HHS’s breach tool: it significantly underestimates and under-reports third-party incidents. A third major problem with the breach tool is that some codes/categories are so ambiguous as to be unhelpful in trying to understand the threat landscape. Is a case of “unauthorized access/disclosure” due to human error on the part of an employee, or to willful and malicious sharing of information by an employee? Is a “hacking/IT incident” on a “network server” really a hack, or is it a case in which an employee forgot to restore a firewall after an upgrade and a search engine indexed the data?
When all the problems are taken together, it’s time to call time of death on using HHS’s breach tool as an analytic tool. It’s time for vendors to stop just rehashing numbers from the site to provide “headlines” about breaches to support their marketing when there is just so much missing or ambiguous in the breach tool data.
We need more reliable and more complete data.
Every month, DataBreaches.net provides data to Protenus, Inc. about breach incidents that were disclosed or first made public during the month. While the data includes incidents reported on HHS’s breach tool, the data goes well beyond the breach tool to provide additional details and incidents.
If you are not subscribing to Protenus’s Breach Barometer, you might want to try it. The Breach Barometer and Verizon’s DBIR are probably the two most useful tools for understanding breaches involving health or medical data, although they employ slightly different methodologies. And of course, Verizon has tons of resources, and I’m just a solo blogger/researcher. But if you’re still just rehashing numbers from HHS’s breach tool, you’re not adding to the conversation and are part of what may just be a major distraction from discussions of the more serious risks of data loss or compromise.