23andMe Pulls Off Massive Crowdsourced Depression Study

Mtinie · on Aug 2, 2016

I'm very conflicted about trying a service like the one that 23andMe offers. Part of me would love to see what comes back, the other (currently louder) part of my brain is worried about throwing away any ideas of future genetic privacy if I do.

The worst part is, I'm not sure which is a more rational desire. Is my my privacy concern undue paranoia?

jasode · on Aug 2, 2016

It's true that 23amdMe doesn't want submissions to be anonymous[1]. However, that doesn't stop people from hacking their way around it.[2][3]

[1]https://customercare.23andme.com/hc/en-us/articles/202907890...

[2]http://venturebeat.com/2013/09/20/how-to-use-23andme-without...

[3]https://news.ycombinator.com/item?id=2431433

nommm-nommm · on Aug 2, 2016

[2] is just an advertisement for a company called Abine who apparently makes tools called DoNotTrackMe and MaskMe. Using your work address for a shipping/billing address is not "hacking [your] way around" anything.

jasode · on Aug 2, 2016

>is just an advertisement for a company called Abine

It's true the writer works for Abine (and she discloses that fact) but it's not relevant. Your comment (unintentionally) makes it sound like Abine is selling an intermediary privacy mailbox for receiving 23andMe data and that's not the case. Copying her steps requires zero interaction with Abine.

>Using your work address for a shipping/billing address is not "hacking [your] way around" anything.

She was using a fake name in combination with a work address which can be an adequate hack for many people if they are not under active FBI surveillance. If the work address has a thousand employees, that might be anonymous "enough". That's probably better anonymity than combining a fake name with your relative's (parent's/sibling's) address.

If one needs more anonymity than the fake name + work address, he/she would have to get way more creative. Possibly get on Silk Road 3.0 to buy a counterfeit government driver's license with an alias -- and use that ID to rent a USPS mailbox for $50. Also maybe get a wig and mustache to avoid face recognition from USPS cameras on the premises. Etc,

maga · on Aug 2, 2016

> Is my my privacy concern undue paranoia?

And are there genes responsible for it?

On a more serious note, I remember hearing about microchips the size of USB sticks that were supposed to sequence a DNA in hours costing "only" few grands back then couple of years ago. I'm surprised they didn't yet flood the market considering the privacy concerns when it comes to services like 23andMe.

Sure, those services do more just sequencing, but for those who would just like to check for a few hereditary deseases running in their family while sticking to their tinfoil hat that would be a bargain. In my case, though, my body is so tough that I'm afraid if my DNA is revealed it may be used for making a clone army one day.

dekhn · on Aug 2, 2016

You're referring to Oxford Nanopore's devices, MinION: https://en.wikipedia.org/wiki/Oxford_Nanopore_Technologies

Its error rate is too high to be important; the technology is overhyped.

yread · on Aug 2, 2016

Sequencing is the easy part. Going from an unordered set of 100-character reads (with 10% error rate) to a list of somatic variants (mutations, insertions and deletions that have a biological effect) is the difficult and costly part

maga · on Aug 2, 2016

> Going from an unordered set of 100-character reads (with 10% error rate) to a list of somatic variants (mutations, insertions and deletions that have a biological effect) is the difficult and costly part

Isn't it a software/data science problem? And if so, what makes if difficult? Computational complexity or the availability of reference data, or both?

dekhn · on Aug 2, 2016

If you assume the hardware is fixed, then yes, it's a data problem.

What makes it difficult is several things: the genome itself has many repetitive regions whose lengths are greater than 100 characters. If your sequencing technology produces reads less than 100 characters, you don't have enough information to place those reads within their proper location in the genome.

Next: There are regions that aren't exact duplicates of each other, but are very similar. If you have a 100 character read, and you know it probably contains errors (a necessary consequence of the sequencing technology), it can be hard to assign it precisely to a single region, as the read will match numerous regions equally well. A huge amount of effort is currently put into mapping these reads to the most appropriate location.

Next, because of the high cost of doing the assembly properly, heuristics are used. Often, the heuristics will be based on greedy algorithms, that try to tile overlapping reads to extend the reads into longer segments. However, due to the read error rate, you might accidentally tile two unrelated reads; this wil prevent you from finding the true, optimal solution. To correct for this ambiguity, you typically have to sample a large combinatorial set of possible solutions to find the best ranking one. This is a major area of research.

Mapping reads to reference data is often done (instead of wholesale assembly) but it suffers from the same problems. The reference is highly biased, it was based on a small number of individuals; when you try to map reads from an individual who is genetically distinct from the reference nindividual, there will be large regions that don't map (for example, I believe the reference was a European-American, if you try to map African genomes to that reference, you'll find there are entire regions missing from one that are in the other.

I don't know a lot of CS, but there is probably a term for taking a bunch of reads and mapping them in a way that maximizes the probability of the mapped reads representing the true solution (IE what is actually in the person's genome). Most people come up with heuristics that (IMHO) have serious deficiencies. When I ran Exacycle at Google we used it to do a "real" assembly (full n2 comparison of all read pairs) and we found it was far more accurate than previously assembled genomes. Subsequent to that, Gene Myers (who designed shotgun assemblers) used these reults to find a heuristic that produced assemblies that were nearly as good but at a much lower cost. I'm personally still skeptical.

If sequencers produced 10KB reads with 0.01% error rate, I'd be happy.

maga · on Aug 2, 2016

Thanks for detailed reply! This really puts things into perspective.

From the NLP side of the data munching isle, it sounds like a lot more complicated variations on language processing problems.

Comparing reads to each other is like approximate string matching (or fuzzy matching) we use to account for spelling errors in words, but genome chunks are longer and you don't have dictionaries to check against.

And finding the best arrangement of those reads is akin to language modeling where we assign probabilities to word sequences which later can be used to predict the most likely sequence. In case of genomes, though, with so little data, no standard "words", and the error rates in reads, it's like trying to put back a shredded book of in illiterate author only having few other books of illiterate authors as reference.

dekhn · on Aug 2, 2016

my introduction to DNA sequence analysis was "the linguistics of DNA" by David Searlers: https://www.jstor.org/stable/29774782?seq=1#page_scan_tab_co...

it applies the chomsky hierarchy to sequence analysis, there is in fact a ton of interesting literature around graphical models like HMMs, see https://www.amazon.com/Biological-Sequence-Analysis-Probabil... for more details.

dekhn · on Aug 2, 2016

Here's the reference from Myers regarding Exacycle and the subsequent DALIGNER: https://dazzlerblog.files.wordpress.com/2015/11/daligner.pdf

It took exacycle 404,000 CPU-hours to do the full n2 comparison (it's an embarassingly parallel problem), but since Exacycle supplied over 600,000 CPU-hours per hour, it took less than one hour to compute. However, that wasn't "scalable" if you wanted to do full de novo assembly on every human (2.8e15 CPU hours) so Myers came up with a heuristic.

ufukbay · on Aug 2, 2016

I'm in the same boat like you. For example I would love to see where my ancestors come from but I also have privacy concerns. (I would be willing to pay more for an anonymous service.)

losteverything · on Aug 2, 2016

I have concluded that I want to be a friend to my ancestors, not a voyeur. After long self debate I won't continue my love of genealogy with any of the 3 types of DNA tests- submissions.

But at some point I think it will be irrelevant because 1,2,3 Generations will triangulate me anyway

Fun reads on reddit genealogy

charlieflowers · on Aug 2, 2016

I reached the same conclusion. There is the risk that insurance companies will somehow use my own genome against me in the future.

I want to learn everything my own genome can teach me, but have complete control over which parts of that knowledge I share.

orionsbelt · on Aug 2, 2016

In the U.S, the federal law GINA prevents genetic discrimination by health insurance companies. It could theoretically be repealed one day, but I think that is extremely unlikely.

There are, however, certain gaps in the law. For example, I believe life insurance is not currently covered.

nommm-nommm · on Aug 2, 2016

>It could theoretically be repealed one day, but I think that is extremely unlikely.

For reference on how unlikely - https://www.congress.gov/bill/110th-congress/house-bill/493/...

04/25/2007 Passed/agreed to in House: On motion to suspend the rules and pass the bill, as amended Agreed to by the Yeas and Nays: (2/3 required): 420 - 3 (Roll no. 261).(text: CR H4083-4094)

04/24/2008 Passed/agreed to in Senate: Passed Senate with an amendment by Yea-Nay Vote. 95 - 0. Record Vote Number: 113.

05/01/2008 Resolving differences -- House actions: On motion that the House agree to the Senate amendment Agreed to by the Yeas and Nays: 414 - 1 (Roll no. 234).(text as House agreed to Senate amendment: CR H2961-2972)

05/19/2008 Presented to President.

05/21/2008 Signed by President.

[The one Nay in the House was Ron Paul].

thaumasiotes · on Aug 2, 2016

> There is the risk that insurance companies will somehow use my own genome against me in the future.

They do that now. That's the whole point of asking for your family medical history.

criddell · on Aug 2, 2016

My memory may be failing me, but I think when I last applied for life insurance, they asked if I had my genome sequenced and the possible answers were yes, no, or I don't want to say. I don't believe they asked for any information beyond that.

I could see the day where they offer a discount if you disclose your data just like car insurers offer a discount if you opt-in to tracking.

mrfusion · on Aug 2, 2016

23andme wouldn't count as sequencing anyway right? They're just looking for certain snp's.

criddell · on Aug 2, 2016

My memory may be wrong. It was a decade ago so maybe it wasn't DNA testing. I'm thinking it might have been full body scans. They were popular for a while, but I really haven't heard much about them lately.

wccrawford · on Aug 2, 2016

Last I checked, it was specifically illegal for them to get that information for insurance purposes.

criddell · on Aug 2, 2016

I suppose it would be okay for them to collect that information, as long as it wasn't used to set insurance rates or to deny coverage.

nommm-nommm · on Aug 2, 2016

Doesn't the ACA and GINA prevent them from doing this anymore?

nkurz · on Aug 2, 2016

Could you (and others) be more explicit about which privacy concerns you have? Rather than treating them as an amorphous whole, it's probably more accurate to consider each concern individually. I've used 23andMe, and think it's an acceptable low risk compared to the potential benefit. But maybe I'm missing something that's obvious to others.

nommm-nommm · on Aug 2, 2016

What's the potential benefit?

nkurz · on Aug 2, 2016

To me, the main benefit is potential knowledge about specific risks that I might be able to mitigate. But even if it just satisfies my own intellectual curiosity, I find it to be a benefit.

To the world, I think that having a large dataset helps to improve the quality of studies such as the one being discussed. There's great utility in having a "hold-out set" for verification.

The OP presumably perceives some similar benefits, hence the sense of being "conflicted". Whether the benefits are real is a valid, although separate question than whether the privacy concerns are justified.

naasking · on Aug 2, 2016

> The worst part is, I'm not sure which is a more rational desire. Is my my privacy concern undue paranoia?

Probably. Getting a DNA sample from you would be trivial if someone really wanted it. You throw off genetic material all the time.

At worst, having it already archived on a service lowers the barrier slightly.

mattmanser · on Aug 2, 2016

You're looking at it the wrong way round.

You're not protecting yourself against being specifically targeted, you're protecting yourself against getting erroneously caught in a dragnet.

marchenko · on Aug 2, 2016

Your privacy concern is not necessarily undue paranoia. It can be difficult to adequately anonymize genetic data, which can be personally identifying by its very nature, without much accompanying metadata. And you should also keep in mind that you can partially compromise the privacy of your genetic relatives as well. 23 and me may have reasonable privacy protections for the current state -of-the-art, but advances in genomics mean that what is 'anonymized' today might be personally identifiable in the near future. And what happens if 23 and Me, or its assets, are sold in the future? Unlike email addresses, login details, or bank accounts, genetic data remain personally identifying for you and your relatives and descendants and are inalienable (at least until the BioSingularity)

While it is true that anyone could collect your shed cells to analyze, collecting personal artefacts requires some degree of prior specific interest in you sufficient to justify the costs of the collection and analysis - this is a barrier. When your data is uploaded into an easily-searchable database, this barrier is lowered, allowing those with access to trawl the DB for any members matching certain criteria. You should also keep in mind that all of these private databases are open to subpoena by various governmental jurisdictions. If you are familiar with the Birthday Problem and the various common abuses of Bayesian reasoning, you can imagine how this could pose a problem even for law-abiding citizens (and their relatives, remember). Additionally, it should be noted that regulations against adverse use of publicly-stored genetic information (e.g. by insurance companies) are by no means universal.

Aggregated SNP data is very useful for researchers. But the utility of these commercial SNP products for the customer - beyond satisfying personal curiosity - is often overstated. If you are in the fortunate position of having access to a reasonably adequate family medical history and regular physical examinations, you will receive little marginal medical benefit from these services. Many of the phenotypes that customers spend the most time discussing on various fora can be reconstructed with reasonable accuracy from a pedigree, and the predictive power of these assays for many polygenic traits is fairly low at present. If you are interested in a few particular traits (like caffeine or drug metabolism) I would suggest ordering a kit for just those traits at first, and also playing around with data available on the 1000 Genomes website (www.1000genomes.org). This can be a bit more expensive, but it allows you to establish your comfort level with DNA data before exposing you to any privacy concerns. Of course, if you are interested in conducting personal research on your own SNP chip data and are already familiar with the data format and all of the associated statistical caveats, 23 and Me provides a good product at a reasonable price.

miguelrochefort · on Aug 2, 2016

> Is my my privacy concern undue paranoia?

Yes.

erdevs · on Aug 2, 2016

I believe this is the underlying paper: http://www.nature.com/ng/journal/vaop/ncurrent/full/ng.3623....

GWA studies general should be treated with great caution. The way they work generally is based on a simple p-value test of association among outcome (in this case, depression) and all genes based on SNPs. There is a high degree of mere chance association and false positives. Most GWA studies leave a lot to be desired.

This one looks more solid than most. The p-value appears to be 10^-5 and they ran a replication data set as well. Many GWASs report much less stringent p-value and many don't run sub-replications.

One interesting aside about GWA studies: this may have changed recently, but I believe it used to be that any GWA study involving funding from the NIH was requires to post its data to a public, freely available database. That always seemed a good practice to me and one that should be emulated in other sorts of studies involving government funding. I wonder if this study is subject to that requirement.

aab0 · on Aug 2, 2016

There are 3 datasets here, the 23andMe discovery dataset (n=300k), the previous Psychiatric Genetic Consortium ('PGC'/'MDD' in the paper), and then a second 23andMe heldout validation sample (n=150k).

The set of 5 hits discovered in the discovery dataset were at the usual 10^-8 threshold. They then meta-analytically pooled those results with the older PGC results and got a set of hits at 10^-5. Finally, they took those two datasets and pooled it with the held out replication dataset, yielding the final set of 17 hits at 10^-8. The final results are at the significance level you want, and almost all of the signs for the top SNPs are the same between datasets/cohorts (presumably why they reported broken-out sub-analyses rather than skipping straight to the final results, to demonstrate consistency). Those 17 are the ones used in the rest of the paper.

> Many GWASs report much less stringent p-value and many don't run sub-replications.

This isn't true. Most GWASes use the standard genome-wide significance level of 10-8. If they do not, it's because of well-motivated other considerations such as being replications of previous hits. (If you are testing replication of 5 earlier hits, rather than 500,000 SNPs, your p-value threshold ought to be looser.)

> I wonder if this study is subject to that requirement.

The supplementary gives the top 10k hits, which is the most critical part of the data, which you can use for polygenic scores, gene sets, heritability & genetic correlations via LD score regression etc. It's only top 10k SNPs because I believe 23andMe imposes that as a requirement on people using its data - something about possible reidentification if too many SNPs' values are released. (There are, of course, other cynical business-related reasons for why they might impose such a requirement.) I've seen that done in a few other GWAS studies like educational attainment, and they said it was because of 23andMe.

erdevs · on Aug 2, 2016

> There are 3 datasets here...

Yes, understood. Part of the reason I dug up the actual paper was so that others could see it, as the news source linked didn't go into much detail. I didn't read the entire paper, and appreciate the additional explanation here. I'm sure it's helpful for others as a condensed summary as well.

> > Many GWASs report much less stringent p-value and many don't run sub-replications.

> This isn't true.

What is not true? Do all GWAS studies do similar holdout replications and compare to other results? No. Have all GWAS studies over time always adjusted appropriately for multiple testing and held to a high significance standard? No. That is a more recent development and set of standards in the field over the past few years. I didn't make an outrageous claim; I simply noted that many GWASs have used much lower significance thresholds (especially in the past) and many don't do thorough holdout replications nor replicate in other ways.

> Most GWASes use the standard genome-wide significance level of 10-8.

Indeed. I see that you are aware of this push in the field for greater degrees of significance and that this is now fairly standardized... yet in another comment you appear to be debating me when I stated that this has been the case.

nonbel · on Aug 2, 2016

>"The p-value appears to be 10^-5 and they ran a replication data set as well. Many GWASs report much less stringent p-value and many don't run sub-replications."

The p-value really should be totally irrelevant to your assessment, unless you believe the null model they used is literally true. I didn't check the paper, so maybe this is one of those rare times. However, in general I've seen that everything is correlated with everything.

jessriedel · on Aug 2, 2016

> There is a high degree of mere chance association and false positives.

Are you really suggesting that most GWAS studies don't calculate genome-wide significance? This is wrong. If that's not what you're suggesting, I don't know what you're saying.

erdevs · on Aug 2, 2016

You should try reading up on the subject before being blithely dismissive.

False discovery rates, false positive rates, and family-wise error rates and how best to control for them are ongoing areas of interest and research in GWAS. There are calls for p-value requirements in the 10^-8 range to help avoid this. There are calls to address stratification (which can affect both type I and type II errors). A lot of research and debate on this topic over the past several years. Who are you exactly to dismiss all of this scientific inquiry? It's great if you have expertise in another field, but it seems odd to dismiss scientific questions and ongoing research in this particular field.

You'll find plenty of information if you actually seek it out rather than simply making a knee-jerk, snarky comment, but here is one example article which articulates some of the issues that have been under consideration in recent years: http://m.ije.oxfordjournals.org/content/41/1/273.full

Note, there, how the level of significance is discussed. A p-value of 10^-7 to 10^-8 is suggested (as compared to this study's 10^-5 level of significance... and, believe me, many GWA studies have been published with much less significant p-values).

This is an ongoing and active area of discussion in the field. I'm not an expert, but some of my colleagues are, and it's a topic they sometimes discuss and brainstorm over lunch, etc.

Actually, even the wikipedia page on GWAS mentions some of this inquiry and debate, as well as the erroneous publication that has plagued this nascent field. It'll all be worked out over time and great discussions are happening here. Vast improvement in processes and standards has been made over the past few years in particular. But we don't move the ball forward by dismissing questions or incorrectly assuming all must simply be right and well.

jessriedel · on Aug 2, 2016

My comment was not intended to have the barbs on it that appear to have been read into it. ("Who are you exactly to dismiss all of this scientific inquiry?" "...knee-jerk, snarky comment...") I'm sorry that it came off that way. In any case, I still don't know what your concrete claim is.

> here is one example article which articulates some of the issues that have been under consideration in recent years: http://m.ije.oxfordjournals.org/content/41/1/273.full

From the abstract of that article: "Currently, associations of common variants reaching P ≤ 5 × 10−8 are considered replicated. However, there is some ambiguity about the most suitable threshold for claiming genome-wide significance." So, people do calculate genome-wide significance, and there is some ambiguity over where exactly that line should be. This is in line with my understanding of the situation.

The exactly analogous statement can be made within particle physics ("there is some ambiguity about the most suitable threshold for claiming significance"), where it is often called the "look-elsewhere effect". But this prudent caution doesn't cause to people to say FUD like "particle physics studies should be treated with great caution" or "Most particle physics studies leave a lot to be desired". Such statements may absolutely in fact be true for GWAS studies, but you didn't give any good reasons for it.

Indeed the very article you link to ends this way: "Conclusion: A substantial proportion, but not all, of the associations with borderline genome-wide significance represent replicable, possibly genuine associations. Our empirical evaluation suggests a possible relaxation in the current GWS threshold."

How should one square that conclusion with your original comment?

erdevs · on Aug 2, 2016

Thanks for the reply, and my turn to apologize for misinterpreting your tone.

You asked initially what exactly I was saying, and here again how to square what I'm saying with my original comment. So, let me try to explain, and I would hope we're not on different sides of this as I think what I'm saying is reasonable, given the context.

I originally said: "GWA studies general should be treated with great caution. The way they work generally is based on a simple p-value test of association among outcome (in this case, depression) and all genes based on SNPs. There is a high degree of mere chance association and false positives. Most GWA studies leave a lot to be desired."

For context: genome-association studies have had a history of being blown out of proportion in the press. And often for outcomes which greatly affect people's lives. Depression is one such issue and it'd be a shame if people were led into thinking there is necessarily a great breakthrough here in understanding possible gene-linkages to depression outcomes. It'd also be a shame if the result was ignored. I tried to provide some praise to this paper for being fairly rigorous, but also note that GWAS studies should be treated with caution generally.

Why should GWAS studies be treated with caution generally? (Besides that results of any study should be treated with some degree of caution.)

Well, firstly, GWAS is a fairly nascent field. Unlike physics or even particle physics, it hasn't had that much time to mature. This is doubly so when applying GWAS to mental health. I'm sure the paper covers some of these risks (or at least it hopefully does)... but relying on self-reporting introduces potential selection bias in the population sample, as does using people seeking help vs a general population study. Quickly reading the paper, it looks like it relied on self-reports and analyzed only people who'd been diagnosed with major depression (meaning they'd sought help). We should be cautious in over-generalizing based on this.

Secondly, GWAS has had a rough history of overstating results and misapplying analyses. It's much better today than it was even, say, 5 years ago. Ioannidis and others made some heroic efforts to convince the field to clean up it's act, in effect starting ~7-8 years ago.

Thirdly, there is a historical pattern of results in this field being overhyped in the press.

Finally, there are active and sometimes heated debates in the field about how best to do GWAS. This is getting worse as a high-throughput, low-cost full-genome sequencing comes online to a greater and greater degree and SNP-based data sets fall by the wayside. Some question taking a frequentist approach at all in the face of such a huge degree of multiple testing. Others call for much, much higher requirements for holdout data sets, cross-validation, and replication before a study is published or considered final, especially when dealing with things like mental health (and their likely application to the field of pharmacology).

This is serious stuff that could end up affecting people's mental health treatments and lives, so caution is warranted. Especially given the field's relative nascence, self-admitted history of publishing low-quality results, the rapidly changing techniques, and the fact that there are ongoing debates within the field of how best to do GWAS analysis and how to effectively replicate results.

erdevs · on Aug 2, 2016

Here is an example of the sort of press reaction I'm referring to (this isn't a particularly egregious example, but it demonstrates the point): https://www.washingtonpost.com/news/to-your-health/wp/2016/0...

Look at the language there. Scientists "pinpointed 15 locations in our DNA that are associated with depression..." [emphases added]. There is no sense of nuance or any caution in conclusion here. Even reading the whole article, which perhaps most people won't even bother with, you don't get much sense that there's any degree of uncertainty here. There is no indication that the field at large is still wrestling with how best to analyze SNP studies at all, despite publishing studies "practically ever week". There is little qualification around the data set here (it only briefly mentions that 23andme's data is based on saliva, not blood or other cells... which may be problematic. But also it should be noted that disease outcomes in this study were self-reported and from a self-selected subset of the general population who sought professional help... and there are many, many other nuances to consider). Instead, we get the fairly flat-out impression that this is a definitive discovery and, not only that, but it is of huge significance in the field as well.

It may well be. But caution is warranted until time and replication hopefully do their work. Unfortunately, this is a pattern in GWAS studies and their public relations. And while the underlying methodology in the field has improved tremendously over the past 5 years, there is still a lot of debate within the field about how much more improvement needs to be made (on data sources, on analytic techniques, on review standards), and how to adapt to forthcoming changes in available study data.

I think this study appears to be more robust than many others. So, I'm not picking on it. In fact I gave it particular praise for being relatively robust compared to some other GWASs. But I do think generally people should be more cautious about interpreting GWA studies or over-generalizing them.

aab0 · on Aug 2, 2016

> You'll find plenty of information if you actually seek it out rather than simply making a knee-jerk, snarky comment, but here is one example article which articulates some of the issues that have been under consideration in recent years: http://m.ije.oxfordjournals.org/content/41/1/273.full

I assume you are referring to

" If the seven associations that did not reach P ≤ 5 × 10−8 when additional data were considered are assumed to have been false-positives, the false-discovery rate for borderline associations is estimated to be 27% [95% confidence interval (CI) 12–48%]. For five associations, the current P-value is > 10−6 [corresponding false-discovery rate 19% (95% CI 7–39%)]."

That doesn't show anything relevant. Failure to replicate at 10-8 is a ludicrous way to define non-replication; to paraphrase Cohen, surely God loves the 10-7.99 almost as much as the 10-8... This paper needs to adjust for power, and ask how many hits one would expect to not replicate at 10-8 given the power of the replicating studies. If you do remember power, GWASes replicate fantastically, for example https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3681663/

"Replicability rates are high within Europeans, with 155 successful out of 181 attempts (85.6%), when only 9 positive replications (∼5%) would be expected under the null hypothesis of no association (binomial test, P<10−16). This excess was robust to the significance threshold (e.g. 122 observed vs. 0.18 expected if only replication attempts achieving P<0.001 are considered successful and 56 observed vs. 1.8×10−5 expected for a threshold of P<10−7, Table S5). Moreover, replicability rates within Europeans approach 100% when accounting for statistical power. For the 168 attempts for which we could calculate the power to replicate the original finding (Table S5), we observed 147 positive replications, which is almost identical to the expectation of 149.1 positive replications given that average power is 89.1% (see Materials and Methods). This is expected, since most GWAS already contain an internal replication phase [1], [24]."

> and, believe me, many GWA studies have been published with much less significant p-values

I don't think they have. Ever since Ioannidis and others demonstrated what a total debacle the early candidate-gene studies were around 2009-2011, using the first GWASes to demonstrate that, GWASes have been pretty standardly done at 10-8.

erdevs · on Aug 2, 2016

> > and, believe me, many GWA studies have been published with much less significant p-values

> I don't think they have. Ever since Ioannidis and others demonstrated what a total debacle the early candidate-gene studies were around 2009-2011

Yes, pre-Ioannidis is the period I was referring to. Things cleaned up a lot in 2012+. What you said does not conflict with what I said... the field, especially early on, has published some spurious results. It seems like you're simultaneously saying "I don't think they have [published low-quality results]" and then immediately admitting what a "debacle" early studies sometimes were.

> That doesn't show anything relevant. Failure to replicate at 10-8 is a ludicrous way to define non-replication

This is a silly statement. Doesn't the significance level being "ludicrous" depend on things like the degree of multiple testing happening? Obviously, yes. The reason why the field (not just this one paper) has pushed for significance in the 10^-7 to 10^-8 range is partially for this reason. So... are you questioning the entire field's movement over the past few years? If so, on what basis?

As high-throughput, low-cost full genome sequencing begins to replace SNP-based techniques, GWAS will have to wrestle with this issue even more.

I'm not sure what exactly you're debating me on here. I'm not saying anything controversial in the field. Again, even the wikipedia article cites well-known studies and quotations from respected sources in the literature, including "Particularly the statistical issue of multiple testing wherein it has been noted that "the GWA approach can be problematic because the massive number of statistical tests performed presents an unprecedented potential for false-positive results"... which is what I originally pointed out. This is an issue the field has struggled with from the get go. It's matured and is much better now (as I've noted), but the field still struggles with the issue. And there are still low-quality papers being published. I also gave this particular paper praise for holding to a higher standard than some other GWA studies.

aab0 · on Aug 2, 2016

> Yes, pre-Ioannidis is the period I was referring to. Things cleaned up a lot in 2012+. What you said does not conflict with what I said..

A candidate-gene study != GWAS. It's particularly bizarre to criticize GWASes for the sins of candidate-gene study when GWASes were literally partially designed to avoid those problems. Don't equivocate. If you have criticisms of actual GWASes as they are run and good reason to doubt that the hits are noise and will not replicate in well-powered followups (contrary to what we actually see, in this GWAS and others...), give them; don't swap in criticisms of candidate-gene studies and pretend they're the same thing, because they're not.

> Doesn't the significance level being "ludicrous" depend on things like the degree of multiple testing happening?

It does. And that's exactly why holding replications to 10-8 is ludicrous. The error rate when testing 5 or 15 SNPs at 10-8 is much much much smaller than when testing 500,000 SNPs. Why would you hold a multiple-test of 5 tests to the same standard as 500,000+ tests?

> I'm not saying anything controversial in the field.

If you're insinuating that GWASes are as bad as candidate-gene studies were, or that a p-value threshold of 10-8 should be used for everything, or that we cannot have high confidence in any given hit at 10-8 (especially when replicated at 10-5) you certainly are saying controversial things.

> I also gave this particular paper praise for holding to a higher standard than some other GWA studies.

Such as?

erdevs · on Aug 2, 2016

> > Yes, pre-Ioannidis is the period I was referring to. Things cleaned up a lot in 2012+. What you said does not conflict with what I said..

> A candidate-gene study != GWAS. It's particularly bizarre to criticize GWASes for the sins of candidate-gene study when GWASes were literally partially designed to avoid those problems.

Obviously. I don't understand why you are referring to candidate-gene studies, as I haven't mentioned them at all. Are you saying that GWASs did not exist prior to 2012? Are you saying that GWASs somehow have not had to develop more rigorous analyses, change their standards for publication, or discussed and improved replication techniques in recent years? That somehow false positives, FDR, and FWER are of no concern in any published GWAS study today? That early GWA studies didn't struggle with effect sizes and power or that there aren't plenty of examples of published GWA studies that are underpowered or even which published spurious results? That there aren't fundamental questions about the assumptions underlying most GWAS analyses (eg potential epistasic effects vs common GWAS assumptions of SNP independence / additive effects... which is a big question in the field currently).

I'm talking about GWA studies. And I'm just pointing out well-known issues the field has wrestled with. For example, here are a couple of the early, fairly influential papers overviewing issues with GWAS (not candidate-gene studies):

2008.. http://jama.jamanetwork.com/article.aspx?articleid=181647 "GWA studies are an important advance in discovering genetic variants influencing disease but also have important limitations, including their potential for false-positive..." This and this paper, 2009: https://projecteuclid.org/download/pdfview_1/euclid.ss/12717... both helped move the field forward in more rigorously examining analysis and defining how replication could be achieved.

These are helpful papers which moved the field forward substantially and didn't just point out issues then-currently facing the field, but also pointed out solutions (many of which were subsequently adopted). It demonstrates that GWAS has struggled with such issues. I imagine you are familiar with some of this, as you mentioned Ioannidis. So... what are you even saying here? Power, effect sizes, false positives, FDR, FWER, etc have been major issues since GWAS' early days, and despite a great deal of progress over the past few years, it's still an ongoing area of discussion, research and debate in the field.

When I say caution is warranted and that we shouldn't over-generalize results, that's because the field has learned this the hard way.

> GWASes were literally partially designed to avoid those problems.

Yes, they were. And also, as I'm saying, they suffered from analytical and publishing issues for a number of years, despite being designed to address some of the early issues in single gene association studies. In the past few years, things have gotten much better. But there is still cause for concern and a lot of ongoing debate in the field.

> Don't equivocate. If you have criticisms of actual GWASes as they are run and good reason to doubt that the hits are noise and will not replicate in well-powered followups (contrary to what we actually see, in this GWAS and others...), give them; don't swap in criticisms of candidate-gene studies and pretend they're the same thing, because they're not.

I'm in no way equivocating, and you are the individual here discussing candidate-gene studies, not me. I have no idea why you're referencing them when I'm talking specifically about GWA studies.

GWA studies, currently, today, as of right now... still have issues with FDR and FWER at a base level, despite improvements in analytical rigor. With study power. With changing data collection techniques and quality control. With population sample bias. With concern for heterogeneity when applied to difficult-to-definitively-diagnose or subjectively defined traits (such as mental health diseases) in particular. With myriad issues.

It's still a relatively nascent, immature field and it is absolutely silly and unhelpful to dismiss a call for caution in interpretation, especially in the light of public press which is historically prone to jump to conclusions, under-report detail, under-qualify nuance, and overstate results.

If you want some recent examples of this debate in the field regarding GWAS, not candidate-gene studies, see a number of articles from even the past few years, including the following couple examples:

2013: http://www.nature.com/nrg/journal/v14/n7/pdf/nrg3457.pdf This is also in Nature.. and it was published because the editors in the field thought it held an important message for the field... despite what you seem to be saying here.

2016: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4756503/ "Identifying disease-associated SNPs facilitates the clinically relevant task of identifying higher-risk individuals. However, the large amount of reclassification that we demonstrated in individuals initially classified as Higher Risk but later as Average Risk or Lower Risk, suggests that caution is currently warranted in basing clinical decisions on common genetic variation for many complex diseases." [Emphasis added] This example is very on the nose in calling for exactly the same sort of caution that I'm calling for here, and which you for some reason seem to be disputing the merit of. This paper specifically addresses problems that arise when different numbers of SNPs are used in calculated genetic risk scores, as well as other issues common in the field today.

Now, I am not saying the GWAS are generally spurious or in any way bad science. I am saying we should treat them with caution, because it is a nascent field. Because there are a lot of underlying analytical assumptions and heavy analytical techniques involved in deriving results. Because the field has already seen a lot of flux over its short lifespan. Because there has been some incidence of spurious publication in the past. Because there are active and ongoing debates in the field about quality control, replication, analytical technique, etc. Because there are issues with population sampling. Because pathways aren't well-understood in many cases. And because, as I initially said, it's easy to get false positives here. It's hard to get appropriate power with a high degree of quality control. It's hard to get fully independent replication data of appropriate power. The field has challenges (against which brilliant and valiant efforts have and are being made), yet those challenges and qualifying statements rarely come to the fore in the public press regarding GWA studies.

Are you actually disagreeing that caution is duly warranted, or are you just picking at nits?

hyperbovine · on Aug 2, 2016

FDR was practically invented to handle the multiple testing issues that accompany analyzing array data. I think what OP was implying was that there is no way that you are going to get a GWAS paper into Nature Genetics in 2016 without performing the appropriate statistical corrections. Literally everyone in the field is aware of this issue. In other words, give the authors and referees a little credit.

erdevs · on Aug 2, 2016

I did not mean to imply that this paper was due extra skepticism in particular. Indeed, I tried to give the paper due praise for seeming to be more rigorous than many other GWASs.

And, yes indeed, false discovery rate has been a huge issue which the field has highlighted... that is my point. Caution has been warranted here historically, and it continues to be warranted even today.

dekhn · on Aug 2, 2016

The history of GWAS has shown that the community producing it took quite some time to reach barely acceptable levels of statistical analysis.

argonaut · on Aug 2, 2016

Most people in this thread are missing the fact that use of your genetic info for research purposes by 23andme is opt-in. And their privacy statements indicate the same would hold for other uses.

batbomb · on Aug 2, 2016

Just because something is opt-in, that doesn't mean it is necessarily ethical.

SEJeff · on Aug 2, 2016

And by ethical do you mean you personally don't want your DNA used in this regard, or by ethical do you mean "helps those suffering ailments such as depression"?

Having a friend who committed suicide and also being someone who voluntarily let 23andme use my DNA for this sort of thing, I'm very excited to see it happen. This type of research will literally save someone's life some day. Just wish it had been my friend :(

merpnderp · on Aug 2, 2016

They clearly state, in short easy to read words, exactly what is going to be done with your data. They also offer at any time to destroy the remains of your existing sample and disassociate the data with your account.

They may be unethical, but I have seen zero evidence of it.

posterboy · on Aug 2, 2016

That implies the usage of the data is easy to understand exactly.

dekhn · on Aug 2, 2016

That is strictly true, but 23andme's docs on what they do with your data is pretty good, and I don't see any ethical concerns with that they are doing (note: I've worked to make opt-in medical consent the default, I know a fair amount of what's going on in this area). By providing the necessary information for an individual to make an opt-in choice, the individual has the ability to decide the ethicality themselves.

lips · on Aug 2, 2016

This article has zero detail. It's like an empty box of cookies.

shaqbert · on Aug 2, 2016

23andMe biz model in a nutshell:

- you give us your DNA (spit) and consent to do whatever we goddamn like with your DNA - we give you some dozen of mostly uninteresting genetic marker worth of analysis (wet earwax anyone?)

With genetic filtered pharma studies in the vicinity, 23andMe sits on a goldmine.

jonlucc · on Aug 2, 2016

As someone who works in pre-clinical pharma research, thank god someone is doing this. Projects get killed because they would lead to a drug that would require very expensive clinical trials. Often, this is because the patient population is hard to identify before they get very sick.

Also, I think 23andMe should be very clear that this is what they do, but it isn't inherently problematic. People shouldn't ever be surprised by getting asked to be in a clinical trial based on their information stored at 23andMe.