
> That’s a tiny percentage: about 0.02% of your genome. So no, they don’t have your genome, but they do have a small sample of it.

What kind of reasoning is that? Fine, they're not doing whole genome sequencing on you (yet), but having a detailed chip profile of several million informative SNPs absolutely can and will be used to profile you.

Very quickly and easily I might add.

Classical linkage analysis has been used quite effectively to profile people since the 80s using only a handful of (polymorphic) markers, because the power of the analysis is driven more by the number of related members than by the number of markers of an individual.

23andMe has a customer base of more than 10 million people(!!)



> Fine, they're not doing whole genome sequencing on you (yet).

We do Whole Genome Sequencing, and sometimes we outsource the sequencing. We always get the excess of DNA back, and it is stored in our own freezers. Even in this scenario we can't be 100% sure they don't store the DNA or the files for their own purposes, but that's the risk we assume. The DNA we send is only identified by a number.

I can 100% imagine a company such as 23andMe storing DNA for later sequencing, or even doing WGS to do their side business, while sending you back only the genotype. Did you request your excess DNA back? No, you didn't, because you didn't even know how much you sent or how much is needed for genotyping. What you did was link your DNA to your real name and some extra data, so further data augmentation is trivial.


> I can 100% imagine a company such as 23andMe storing DNA for later sequencing

They do, as far as I know. Most genealogical DNA testing companies do, and they tell you so. In case you want to upgrade the analysis later.

> doing WGS to do their side business

That would land them in hot water with the EU. Per GDPR, you can't ask for PII for one purpose and use it for something else down the line. 23andMe customers didn't consent to WGS.

But there's another reason I think they wouldn't do that: WGS is time-consuming and expensive. Some random person's DNA data isn't that valuable. There's a reason payment is part of their business model, and if that's true for cheap microarray tests, how much more is it true for terribly expensive WGS tests?


> terribly expensive WGS tests

https://services.bgi.com/wgs-sequencing

Making a report is expensive. The WGS itself may be cheaper than you think (I don't know how much you were thinking). After that, running a quick admixture and some PRS analysis on the data yourself, with the aid of ChatGPT or one of its friends if you have zero clue, can be surprisingly affordable.

BGI is Chinese, but there are other companies out there offering similar services (e.g. Macrogen in Korea).


I've always been interested in having a full genome sequencing, do you have any other recommended resources for DIY/à la carte sequencing analysis?


There are lots of analyses you can do with a WGS, and many of them require a high level of expertise. But the basics are not that difficult, if you accept that you might be wrong (for example, you can try to do basic PGS for some conditions, but don't freak out if you score too high in some of them: take it as a hint that it's worth asking for professional guidance).
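To give an idea of what "basic PGS" means in practice: at its simplest, a polygenic score is a weighted sum of how many copies of each effect allele you carry. Here is a minimal sketch in Python. The SNP IDs, weights, and dosages are all made up for illustration; real weights would come from a published scoring file, and the dosages from your own variant calls.

```python
# Hypothetical per-allele effect sizes for three made-up SNP IDs.
# Real values would come from a published scoring file.
weights = {"rs_example_1": 0.12, "rs_example_2": -0.05, "rs_example_3": 0.30}

# Hypothetical genotype dosages: copies of the effect allele carried (0, 1 or 2).
dosages = {"rs_example_1": 2, "rs_example_2": 1, "rs_example_3": 0}


def polygenic_score(weights, dosages):
    """Return the weighted sum of dosages and the number of SNPs used.

    SNPs that have a weight but no genotype call are skipped.
    """
    used = [snp for snp in weights if snp in dosages]
    score = sum(weights[snp] * dosages[snp] for snp in used)
    return score, len(used)


score, n_used = polygenic_score(weights, dosages)
print(f"raw score = {score:.2f} from {n_used} SNPs")
```

A raw score like this means nothing on its own; it only becomes interpretable when compared against the distribution of scores in a reference population, which is one more reason not to freak out over a single number.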

Assume you have your WGS files. BGI/Macrogen can already provide you with basic analysis (like the variant calling), but the next thing you can do is head to https://nf-co.re/ and look for pipelines that might interest you: sarek, raredisease, oncoanalyser, hlatyping... and have some fun. Then head to https://snakemake.github.io/snakemake-workflow-catalog/ and see if there is something you might be interested in (grenepipe, dna-seq-gatk-variant-calling, PopGLen...). With some help from a decent LLM you should be able to run the "admixture" that some people love (the "I'm 3% Asian, 24% African, 10% Nordic..."). I don't know anyone who does à la carte analysis; we do all of them ourselves, but I'm sure there are some.

Note that you are entering a difficult world, and there are no absolutes there: it's fun to play with, the same way it's fun to track your own sugar levels with an app. But you shouldn't be diagnosing yourself or taking radical actions based on the results. You are not debugging yourself. Soon enough you'll be wondering about your RNA-seq under X or Y conditions.


YSEQ is a German mom and pop lab which does next generation sequencing for genealogists. It has rather little utility for genealogists, but some people are really into figuring out their place in the great Y chromosome haplotree, so apparently there is a market. Family Tree DNA also does some next generation sequencing, of the Y chromosome and mitochondria only - they have bigger trees than YSEQ, but YSEQ I believe gives you the option to get even more resolution on the Y chromosome.

YSEQ will do full genome NGS if you ask for it but it's expensive, can take six months or more, and the owners have apparently been known to cancel your order and refund you if you complain about the wait times.


> but having a detailed chip profile of several million informative SNPs absolutely can and will be used to profile you.

Yes, that was 23andMe's business model. They thought so too. Since they went bankrupt, I think it's safe to say the commercial utility of such profiles was pretty overrated.


Of course they don't store your entire genome; 99.9% of that is identical for all humans. That has no value to them at all. It's only the 0.1% that can vary between humans that's of any interest.

(Note that there are very different ways to measure that percentage and they can mean very different things. I'm not intending these percentages to be accurate, but I'm sure you get my point.)


They don't have all your personal information, they just have your name and address.


Which makes it trivial to buy a database and correlate everything.


I think GP was making a joke about the (small!=unimportant) information.


i.e. they can get the rest of it by sending ICE to pick you up.


Yes, the 0.02% thing is a bit disingenuous because he knows better: the bases the chip covers were specifically picked because they are variable in the human genome. They don't have "your genome", but as most of it is the same for everyone, those 640k SNPs give much more information than 0.02% of the letters of a book would.
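A rough back-of-the-envelope version of that point, as a Python sketch. The assumptions are mine and deliberately idealized: a genome of about 3.1 billion bases, a world population of about 8 billion, biallelic SNPs in Hardy-Weinberg proportions with a 50% allele frequency (the best case), and SNPs that are independent of each other (they are not, because of linkage disequilibrium). So treat the last number as a lower bound on how few well-chosen markers it takes, not as a real estimate.

```python
import math

GENOME_BASES = 3.1e9      # approximate haploid human genome length (assumption)
CHIP_SNPS = 640_000       # the 640k figure quoted in the comment above
WORLD_POPULATION = 8e9    # rough figure (assumption)


def genotype_entropy_bits(p):
    """Entropy in bits of one biallelic SNP genotype under Hardy-Weinberg,
    where p is the frequency of one of the two alleles."""
    freqs = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]
    return -sum(f * math.log2(f) for f in freqs if f > 0)


# Share of the genome's positions that the chip reads.
fraction = CHIP_SNPS / GENOME_BASES

# Bits needed to single out one person among everyone alive.
bits_to_identify = math.log2(WORLD_POPULATION)

# Best case: allele frequency 0.5, SNPs treated as independent (idealized).
bits_per_snp = genotype_entropy_bits(0.5)
snps_needed = math.ceil(bits_to_identify / bits_per_snp)

print(f"chip covers {fraction:.4%} of positions")
print(f"{bits_to_identify:.1f} bits single out one person")
print(f"{bits_per_snp:.2f} bits per ideal SNP, so about {snps_needed} ideal SNPs suffice")
```

Real SNPs carry less than the ideal 1.5 bits each and are correlated with their neighbours, but the gap between a few dozen and 640k is the whole point: the positions were chosen to be the informative ones.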



