There’s also MIB Group (Medical Information Bureau), which collects healthcare data under an exemption to the Fair Credit Reporting Act, along with an extensive astroturfing campaign to hide their activities (the Wiki article on them is useless).
> In addition to an individual's credit history, data collected by MIB may include medical conditions, driving records, criminal activity, and participation in hazardous sports, among other facts. MIB's member companies account for 99 percent of the individual life insurance policies and 80 percent of all health and disability policies issued in the United States and Canada.
Tangent, but ... they picked the acronym MIB? I had to look it up, and the term Men in Black in the lay usage dates to 1947.[1] But the group MIB dates to 1902, so just a coincidence.
> He started by reaching out to the biggest corporations. If they would agree to give him data on their employees’ paid medical claims, he would return to them an analysis of their cost drivers, benefit designs, and manageable risks that would give them leverage in negotiations with insurers
What? Isn't this exactly the sort of thing HIPAA is supposed to ban? What happened to doctor-patient confidentiality? Why do employers even have that information?
The clues are a bit earlier in the article and in the full name of HIPAA:
> The financial trajectory of MarketScan was perhaps unimaginable in 1981, when a former insurance executive named Ernie Ludy founded the company. His idea was to simply collect patients’ data and parcel it out to big companies that were seeking to control costs by getting a more granular view of their employees’ health care use.
> The Health Insurance Portability and Accountability Act of 1996
Ahh, thanks. Should have kept reading. However, a paragraph or so later, it says HIPAA doesn't apply to de-identified data, and that it's easy for researchers to buy the data set.
Hopefully, some security researchers will get their hands on it, de-anonymize the data set, and then regulators will burn the industry to the ground.
Although I'd point out that very little is needed to un-deidentify medical records if you want to. For example, see some of the work Latanya Sweeney has done.
> Hopefully, some security researchers will get their hands on it, de-anonymize the data set, and then regulators will burn the industry to the ground.
Computer science evolves in a circle. Analyzing data for meteorological predictions was one of the first uses of a digital computer; it just wasn't hip at the time and involved 0 pandas.
“Hot” conditions with high conversions are tracked in near real-time. I learned this when we received via FedEx a box of Enfamil on what should have been the due date of our daughter. Unfortunately, we miscarried.
But I can assure you that your pregnancy was not revealed to a marketing organization by your doctor or insurance company. HIPAA prohibits that kind of information transfer, and the consequences of violating the law are severe enough that physicians and insurers are highly unlikely to risk it, for the little bit of dough they'd get by selling the fact of a pregnancy.
However, HIPAA only protects information about you gathered by your doctor or insurance company in the course of providing medical care. It does not protect you against data aggregators inferring your condition based on non-medical activities. In the case of pregnancy, it's not unlikely that your condition was inferred from credit card activity or online retail activity. (There is a well-known case of a retailer, Target, building a model that inferred a pregnancy in a household based on retail activity; they started sending flyers/adverts to households they had identified as pregnant, and in the process revealed pregnancies of wives or daughters in the household to others who had not been read in to the news; it did not end well for Target.)
My wife’s pregnancy and expected due date were available on a list sold by zip code, which I know because I purchased it after it was identified by Enfamil. I also learned that my neighbor has type-2 diabetes and another has a child with ADHD, among other things.
In my wife’s case, she suffered from an ectopic pregnancy that resulted in a near-death experience and subsequent admission to the OB floor of the hospital. The admission and drugs prescribed are sold/reported in near real-time to various data aggregators, who in turn are able to link them to the ob/gyn who performed the emergency surgery. We think that the admission to the obstetric floor, duration of hospital stay and certain prescriptions triggered a false hit of “pregnant”.
This data is poorly anonymized and can be trivially reconstructed. Both an attorney I hired and an investigator from the state health department confirmed that. There was no retail behavioral tracking because the events happened before my wife was aware that she was pregnant.
Insurance company and medical data is used for all sorts of practices. The state of Georgia, for example, has an anti-opioid surveillance system that uses Medicaid, private insurance, and behavioral data to identify pregnant women at risk of delivering an opioid-addicted child (which will cost state Medicaid $1M or more).
Edit: I see now that technically they say it’s anonymized. I’m assuming that’s the answer.
So basically HIPAA is privacy theater, the same as we have security theater at the airport now and financial accountability theater and the latest: data privacy theater via GDPR.
It is a clear cut HIPAA violation if the data sold to the data aggregator contained personal identifiers of any kind. The standard is not that data be "anonymized" but that all personal identifiers be redacted from the data. This includes dates, addresses (typically the finest grain of address information you can include is the first 3 digits of the zipcode, and sometimes not that), birthdays.
The redaction does not mean that data cannot under some circumstances be re-identified, of course. But it's not trivial.
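To make the redaction idea concrete, here is a minimal sketch of that kind of generalization: drop direct identifiers, cut the zip to three digits, and keep only the year of a date. The record layout, the field names, and the `SPARSE_ZIP3` values are all made up for illustration; this is not the full HIPAA identifier list and not anyone's production pipeline.

```python
from datetime import date

# Placeholder set of 3-digit zip prefixes treated as too sparsely populated
# to publish. The values here are assumptions for the example only.
SPARSE_ZIP3 = {"036", "059"}

def generalize(record):
    """Return a redacted copy of a (hypothetical) claims record.

    Direct identifiers such as the name are simply not copied over.
    """
    zip3 = record["zip"][:3]
    # Rough age from the year difference; good enough for a sketch.
    age = record["admit_date"].year - record["birth_date"].year
    return {
        "zip3": "000" if zip3 in SPARSE_ZIP3 else zip3,
        "admit_year": record["admit_date"].year,  # full date reduced to year
        "age": "90+" if age >= 90 else age,       # very old ages are pooled
        "diagnosis": record["diagnosis"],
    }

record = {
    "name": "Jane Doe",
    "zip": "78201",
    "birth_date": date(1970, 5, 1),
    "admit_date": date(2021, 3, 14),
    "diagnosis": "M17.11",
}
print(generalize(record))
```

What survives is still a row of quasi-identifiers (area, year, age, diagnosis), which is why redaction and re-identification risk are separate questions.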
HIPAA (one P, two As) is chiefly for data portability between doctors; not privacy. HIPAA is not intended to and does not protect your medical information from insurers.
Sorry to hear about your daughter. That is really tough.
Correct, though it's interactions between any of doctors, payors (insurers), and information brokers (more like clearing houses if I remember correctly) - between themselves or each other.
So if any of them or their business associates got the information and sold it, that would be a violation. But if, say, Target figured it out because she was buying a lot more orange juice and lotion (true story, Target’s ability to figure out who’s pregnant is legendary) and sold that info, it would not be covered under HIPAA.
There's a weird thing going on where everyone pronounces it as if it were spelled "hippa", and then everyone believes it must be spelled that way because of how it's pronounced.
It doesn't seem to have occurred to anyone to pronounce it in a way that's compatible with the spelling.
English has barely any words with "aa", so you can't differentiate the vowel in a normal way.
There's no standard way to exaggerate a vowel that would work either. The closest I can think of would be an extra-long pronunciation or sticking a glottal stop in to make it sound like two vowels, but those would both be confusing and sound too unlike a real word.
And no matter what you do it's going to sound like hippo.
> The most common pronunciation of “aa”, especially when its not in initial position, seems to be IPA ɑː
> Which is how pretty much everyone in the field pronounces it in “HIPAA”.
Something's gone wrong. People in the field pronounce it /ə/, as in the second syllable of "comma". This is a fairly common pronunciation of "aa", but only as in "Isaac". It doesn't match Aaron, aardvark, baa, aah, argh, waah, bazaar, salaam, or Quaalude.
Your list mostly consists of words that cannot reasonably be said to exist in English. But of the real ones:
START / PALM [ɑ]: ah [spelling variable; attested in the list as 'aah'], argh [spelling variable; attested in the list as 'aargh', 'aarrgh', and 'aarrghh'], bazaar, aardvark, aardwolf. Also salaam, which is a foreign word that is common enough to have a conventional spelling. (So is "niqab", which appears on the list as "niqaab"...)
TRAP [æ]: baa, waah.
FACE [eɪ]: Quaalude, which is a proper noun. Why is it in a list of Scrabble words? Isaac and Aaron aren't there.
Personally, I read it in my head as /hipɑːː/, with the artificially lengthened vowel that you note. But if you wanted to wear that down into a form that was easier to say, /hipɑ/ is still very distinct from /hɪpə/, and /haɪpɑ/ would be too.
It's only three letters, it's hard to know what sound the C would make, and pronouncing an acronym with only one consonant is a bit iffy for clarity. Those are small factors but they add up.
HIPAA has the double A acting against pronouncing it, but everything else pushes toward pronouncing and not worrying about that part.
> It's only three letters, it's hard to know what sound the C would make, and pronouncing an acronym with only one consonant is a bit iffy for clarity. Those are small factors but they add up.
URL?
I can't keep this up, but I'm being limited more by my acronym vocabulary than by any tendency for people to avoid pronouncing them letter by letter.
>> HIPAA can't be easily pronounced
> Most people don't seem to agree with you.
This is how we started the thread, by observing that people have enormous problems spelling HIPAA because they believe the spelling should correspond to the pronunciation.
A precondition for this observation is that the spelling isn't being pronounced.
> This is how we started the thread, by observing that people have enormous problems spelling HIPAA because they believe the spelling should correspond to the pronunciation
People who read HIPAA have no problem pronouncing it.
People who hear hɪpɑː sometimes have trouble spelling it, coming up with “HIPPA” instead of “HIPAA” (this often isn’t because it is the most natural read of the phonetics, but because they back-figure the acronym from the best-known focus of the law as “Health Insurance Privacy Protection Act” rather than “Health Insurance Portability and Accountability Act”.)
URL is only three letters, like CIA, but everything else you said about CIA doesn't apply. There is no ambiguity as to how it would sound, and there are multiple consonants.
The rules of English spelling don't allow for the sequence "aa" at all. That's why the scrabble list consists of (in small part) onomatopoetic expressions of variable duration, where repeating the "a" shows longer duration, and (mostly) foreign words.
As a fedex driver I always dreaded delivering those. Not because of the possibility of unfortunate outcomes such as yours (never crossed my mind), but because I knew they were unsolicited "gifts" that came from big data schemes like this. Just like how marketing companies know that you're pregnant even before you do.
I will never believe someone when they say info is "de-identified", because even if it is, it is shockingly easy to pinpoint 1 person out of millions based on a very small number of unique factors.
I think it’s because the word “anonymous” permeates our understanding of the concept, and – even for people who didn’t actually study Greek – the tie to “name” is clear.
We need to understand that “knowing someone’s identity” is not coextensive with “knowing their name”, and that in fact knowing someone has a rare medical condition may be more identifying[0] than knowing their name.
[0] Or k-deanonymising, for anyone who’s pedantic about identity being an absolute.
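For anyone who wants to see what the k in that footnote measures, here is a rough sketch: group rows by their quasi-identifiers and look at the smallest group. The rows and column names are invented for the example, and real assessments use far more columns than this.

```python
from collections import Counter

def _groups(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)

def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group; k == 1 means someone is unique. Expects non-empty rows."""
    return min(_groups(rows, quasi_identifiers).values())

def unique_rows(rows, quasi_identifiers):
    """Rows whose quasi-identifier combination appears exactly once."""
    groups = _groups(rows, quasi_identifiers)
    return [r for r in rows
            if groups[tuple(r[q] for q in quasi_identifiers)] == 1]

# Invented example data: no names anywhere.
rows = [
    {"zip3": "782", "age_band": "50-54", "sex": "M", "diagnosis": "knee replacement"},
    {"zip3": "782", "age_band": "50-54", "sex": "M", "diagnosis": "hypertension"},
    {"zip3": "782", "age_band": "50-54", "sex": "F", "diagnosis": "rare disease X"},
]

print(k_anonymity(rows, ["zip3", "age_band"]))         # three rows share one group
print(k_anonymity(rows, ["zip3", "age_band", "sex"]))  # adding one column isolates a row
```

Each added column can only shrink the groups, so a table that looks safe on two attributes can single someone out on three.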
A lot of people simply don't know this. Politicians either don't know or claim to not know.
This is a good example of the law and popular conception of a concept being badly out of date, to the advantage of industry and disadvantage of regular people. Therefore industry has a vested interest in keeping the public perception focused on "de-identified" with a narrow definition of PII.
I spent a good deal of my professional life in the last decade dealing with the problem of de-identifying medical data. You are correct that it's hard, but the HIPAA rule is actually not a bad go at it. See https://www.hhs.gov/hipaa/for-professionals/privacy/special-....
For the kind of data in this particular database (mostly insurance claims data), it's highly unlikely that you could learn much through re-identification of the data.
> What happened to doctor-patient confidentiality?
It was cast by the wayside long ago. First it was the "mental health" exemptions, then various law enforcement provisions, then third-party "office solutions", then transcriptionist services, then private medical databases...
You might as well assume that anything you tell your doctor is going to be recorded and (eventually) used against you.
And then, as was alluded to earlier, there's the problem of incorrect information in those databases... and your only recourse is to give them more-accurate information to sell.
All the dice are loaded against the patient. And as anyone online for long knows, data is forever.
The data is de-identified, and thus not subject to HIPAA restrictions.
It's not made entirely clear in the article, but most of this data is insurance claims data, not medical records per se. That's why employers have it. If your employer underwrites your medical claims directly - which most do nowadays - when you or your doctor submits an insurance claim, they are submitting it to your employer. It may go through a health insurance company - since most employers hire one to administer their plans - but that insurance company is collecting the information on behalf of the plan owned by your employer. The fact that it's insurance claims and not raw medical records is one of the challenges IBM had in making a business out of analyzing it. There is a lot less medical data in insurance claims than IBM hoped, and it is of lower quality.
> Medical data mining companies have made a business of scraping the details of consumers’ daily lives into medical dossiers that, if combined with MarketScan’s de-identified information, could be used to re-identify the individuals within its databases.
De-identification is unreliable. If you have enough context, then the patients can be re-identified.
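A toy sketch of what "enough context" means in practice: join the de-identified rows against some outside list that carries names plus the same quasi-identifiers, and keep the matches that are unambiguous. Everything here (names, fields, both data sets) is invented; it only illustrates the linkage idea, not how any real broker operates.

```python
def link(deidentified, auxiliary, keys):
    """Pair de-identified rows with named auxiliary rows that match on `keys`.

    Only matches with exactly one candidate are returned; ambiguous ones
    are skipped because they do not pin down a single person.
    """
    index = {}
    for person in auxiliary:
        index.setdefault(tuple(person[k] for k in keys), []).append(person)

    matches = []
    for row in deidentified:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:
            matches.append((candidates[0]["name"], row["diagnosis"]))
    return matches

# Invented "de-identified" claims rows.
claims = [
    {"zip3": "782", "birth_year": 1970, "sex": "M", "diagnosis": "knee replacement"},
    {"zip3": "782", "birth_year": 1985, "sex": "F", "diagnosis": "rare disease X"},
]
# Invented outside list with names (think marketing or voter-style data).
outside = [
    {"name": "A. Example", "zip3": "782", "birth_year": 1970, "sex": "M"},
    {"name": "B. Example", "zip3": "782", "birth_year": 1970, "sex": "M"},
    {"name": "C. Example", "zip3": "782", "birth_year": 1985, "sex": "F"},
]

print(link(claims, outside, ["zip3", "birth_year", "sex"]))
```

The first claim stays ambiguous because two people share its attributes; the second resolves to one name. How often that happens depends entirely on how rich the outside data is.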
I agree that de-identification is imperfect, and re-identification is a risk (re-identification is, however, illegal in several states). However, the notion that there is big business in re-identifying medical records to target individuals is largely a myth, because there isn't a great market for individual-level information. Life insurance companies can't use it - they are required to get information they use for underwriting from the applicant, or with the applicant's permission. Companies who market medical products generally won't or can't use it (for one thing, it's typically not very timely - by the time your diagnosis appears in a re-identified data set, the time to get your attention is pretty well past for most conditions).
The companies that came seeking medical records (I was until recently Chief Technology Officer for a clinic/hospital system with 10M active patient records) were satisfied with de-identified data, because they were looking for insights to develop new, or re-purpose existing, therapies and technologies, or to create marketing strategies for their tech. They had no need or desire to know who a particular patient was, because they were looking for broader insights.
Yep, and covered entities are usually related to billing for medical care. As an example, almost all life insurance companies are not even HIPAA compliant because they aren’t covered entities.
In net - HIPAA doesn’t protect medical information generally - only the subset that’s usually visible to doctors. And even then, it stops working as soon as the info is outside a covered entity.
If you video chat with doctors, look at the privacy policies of the video chat platform. IME, it doesn't differ from most other applications: We collect what we want, including medical data; we share it with partners for marketing, etc. And of course the only clause that matters: We can change our policy at any time by posting a new one to this page (i.e., without notification, effectively), so effectively we have no policy and can do whatever we want.
Someone I know needed a prescription medication that they didn't want in public records, for understandable reasons (I would be concerned too). It wasn't a 'controlled substance' like a narcotic; what they wanted was perfectly legal and appropriate; it was just a prescription drug that implied a certain underlying condition and related events, and they wanted privacy. They asked me for suggestions. I suggested that the best they could do was to find a small, independent pharmacy; call ahead to verify that the pharmacy didn't share its data; and pay with cash (apparently ID was required to obtain any prescription drug). They did all that and reconfirmed all that at the pharmacy. The next time they were at a doctor in the giant local health co, the nurse asked about the drug, which appeared in the health system's medical records. Later, for unrelated reasons, they obtained their own credit report and it included the drug (!).
Epilogue: When I go to the doctor these days, as I'm signing in they tell me to sign on a small touch-sensitive pad. The pad is about the size of a signature line and displays nothing. What am I signing? That I received their privacy notifications. (I didn't omit a step - they never mentioned or offered them otherwise.) My pharmacy is similar - 'sign', they tell me.
I would need hard evidence to believe a credit report had someone’s prescription medication listed on it, or anything else unrelated to borrowing, identity, or address history.
Even the Medical Information Bureau report should not list medications:
> Doctors, hospitals, pharmacies, and other health professionals can’t submit information to MIB. The report won’t include every diagnosis, blood test, or a list of your medicines.
I actually saw it. It was in a separate section of the report. Also, how did their hospital chain obtain the data? Connecting dots - a dangerous form of reasoning - I guessed it was via this report.
How can I find out what’s in my medical payment history?
You should contact the consumer reporting agencies that specialize in medical records or payments. These agencies may supply reports on your prescription drug purchase histories, medical conditions, data from your insurance applications, and data from other sources. Life insurance companies, for example, commonly use these reports to evaluate policy applications from potential customers.
Credit reports come from the credit reporting agencies, of which there are 3 - Equifax, Experian, and TransUnion. You are legally entitled to get one free credit report per credit reporting agency per year from annualcreditreport.com.
The above quote:
> the consumer reporting agencies that specialize in medical records or payments.
indicates to me that the consumerfinance.gov website is not talking about the credit reporting agencies because they do not specialize in medical records, and hence medication information would not appear on a credit report.
As I understand, a healthcare provider would have to sell the unpaid bill to a debt collector and only then would the debt show up in a credit report (but it would not have details about what you purchased, just like any other debt does not on your credit report).
Perhaps you did see details of medications on a credit report from Equifax/Experian/Transunion, but it makes no sense to me, so I would need extra proof.
It was a pretty standard-looking credit report, though it was long and detailed.
IME, 'what makes sense to me' is an unreliable signal of truth. It does strongly signal my limits, however. If I treat it as a signal of truth, I limit the world to what I already believe - a very narrow subset that is also subject to my biases and misunderstandings, and one that excludes the most ripe areas for learning: things I have no idea about, that are a corrective for my many errors, or that I had no idea even existed.
I wonder what they plan to do with our data? I found this:
- Francisco Partners is a leading global investment firm that specializes in partnering with technology and technology-enabled businesses.
- FP's current and past investments include such companies as BeyondTrust, ClickSoftware, GoodRx, Ichor Systems, iconectiv, LegalZoom, Quest and Verifone.
and from the article:
- Francisco Partners had previously purchased stakes in the telemedicine and drug coupon company GoodRx, the virtual appointment booking company ZocDoc, and Edifecs, a company that builds software to enable a more seamless exchange of data.
- The firm declined to comment on the acquisition or its plans for the MarketScan database.
Don’t overreact to this. It’s really not a big deal. I have worked with data of this kind for years.
First of all, there are a bunch of datasets like this or very similar to it both public and private. Example: have you ever been to a hospital in California? There is a state agency called OSHPD which makes inpatient discharge records available to academic researchers (like me) and to others like insurers. Texas, Oklahoma, New York (iirc) and many many other states make similar data sources available.
What’s actually in the data? Generally all of the ICD-9 or -10 codes generated by a visit. There may be very limited demographic information - race, ethnicity, zip code, age. That’s about it. (That’s what de-identified means here.)
Sometimes it will tell you the source of admission (eg “arrived by ambulance”), and generally will tell you the source of the payment (eg “patient has private insurance”). Some of these datasets will let you track patients over time (TX has a version like that) within a year.
It is extremely extremely difficult to match this information to actual people, assuming anyone even cared to do so (which almost certainly no one does!).
For one thing, most zip codes are large and contain many people (generally only three digits of the zip are reported if there are very few records from that zip).
More importantly, you just don’t have that much information about anyone in the data - you would have a row which essentially says: “Hispanic male age 50-54 from zip code 78201 who was billed for a knee replacement.” Good luck figuring out who that is!
Even for very rare illnesses it doesn’t work… if you already knew someone in 78201 had X Rare Disease and went to Y hospital, you might be able to find the record in the data with some work. But what additional information would you get? Maybe how much his insurance was billed?? You already had to know the guy had a rare disease just to find him!
Also - observe the fact that this dataset keeps getting sold. That tells you something about how useful it actually is in practice. I get that some MBA at IBM thought it would be great for “generating insights” about cost drivers and saving money. It’s really not.
If you look for the “cost drivers” in this data you will see that long visits and complicated procedures are costly. And then your next step is…??? This data does not give you anything which will help you predict or avert such episodes. Neither IBM nor anyone else has actually come remotely close to solving this problem, right? I am not aware of a single effort which has even moved the needle on spending by 1%! After holding on to it for a few years, IBM or whoever owns it next will talk up its value and sell it to another sucker. (If it were really worth that much to IBM, they wouldn’t sell it!)
If you want proof that there isn’t a privacy concern here, get exactly this data from Texas (which makes it publicly available). Open it up and let me know how easy a time you have finding anyone you can identify. Even if you have friends and family in the state who you know went to the hospital you’re going to have an extremely hard time.
> IBM’s efforts to use the repository to transform broad swaths of the health care system ultimately fizzled. The company struggled to create the cloud storage and computing infrastructure needed to combine all the data so it could be analyzed by its AI and analytics machinery.
Wow, still? I'm a little surprised by this, as lots of cloud offerings these days seem designed for massive scale.
> Under data protection law, you have several rights over your personal information. You can exercise any of your rights by contacting us at InformationRights@UKHSA.gov.uk
> COVID-19 data may need to be shared with WHO for research purposes and where required to help trace contacts internationally. These are restricted transfers made on the basis of being important for reasons of public interest, where we rely on one of the derogations under Article 49(1)(d) of the GDPR.
> Personal information includes ... name, DOB, address, employer, locations visited, travel itinerary, mental health, lifestyle, social circumstances, ethnic origin, DNA/biometric data ... We send personal information to ... Amazon, AWS, Deloitte, MoD, NCSC, ONS, Police, Palantir, Serco, WHO [and others]
> In addition to an individual's credit history, data collected by MIB may include medical conditions, driving records, criminal activity, and participation in hazardous sports, among other facts. MIB's member companies account for 99 percent of the individual life insurance policies and 80 percent of all health and disability policies issued in the United States and Canada.
https://www.ftc.gov/news-events/press-releases/1995/06/medic...
You may request the data they have on you and allegedly you can dispute the information.