More than a decade ago I submitted my DNA to the Genographic Project [1], National Geographic's attempt to trace ancient human migration using Y-chromosome and mitochondrial DNA from a sampling of people around the world.
My paternal Y-chromosome line showed an ancient mutation (>50K years ago) carried by humans who had likely spread through a coastal migration from Africa around the coasts of the Arabian Sea down to South India and eventually, thousands of years later, to Australia. My ancestors are from the west coast of India. About 5% of Indians carry this mutation (shared with Indigenous Australians) according to the data that was available at that time.
As we gather more data, our understanding of human migration inevitably clashes with identity politics/nationalism and previous simplistic pre-scientific classifications based on skin color and facial features: Aryan invasion, Dravidian etc. Human migration appears to be much richer. There have been many migrations of human groups both in and out of the Indian subcontinent going on for more than 50K years.
In terms of progress in DNA sequencing this is quite old news. Many studies have been conducted since then with better sequencing techniques. David Reich, a well-known researcher in the field and a collaborator of Svante Pääbo (who was awarded the Nobel Prize[1] for pioneering this field and technique), has authored a terrific book[2] synthesising all these findings.
The details are fascinating and nuanced; I highly recommend this book if you would like to go down this rabbit hole.
The book by Svante Pääbo [3] is equally fascinating; it goes into the details of how DNA sequencing works to determine ancient ancestry. In a way it picks up where the Y-chromosome study ends.
That there is some Indian introgression into Aboriginal genomes is not at all surprising based on phenotypic characteristics of their faces as well as other aspects of their cultures like language. It was remarked upon by "previous simplistic pre-scientific" British settlers, who it turns out were quite right. Many other older theories have turned out to get support from the genetic evidence.
The study above argues convincingly for later contact though. There were probably several waves; there's other (disputed) circumstantial evidence for this. When we sequence more (if allowed) we will find a richer picture, as the OP alludes to.
If you have the raw sequence, services like yfull.com can help provide more detailed information. At this point, new haplotypes are being called every month. When I sequenced my own haplotype about 20 years ago (I worked at the Indian partner lab of the Genographic Project at the time), the best we could do was call a few handfuls of mutations by hand (I used to run the gels myself), and my own haplotype was characterized out to maybe 3 mutations - the current haplotype is perhaps out to 20?
I don't have the raw sequence. The project handled DNA data in total privacy.
I did send my DNA swab a second time about 5-6 years later, this time to look at my maternal line (mitochondrial DNA). The report had changed quite a bit, possibly because the project had gathered more data in the intervening years.
I've stopped looking into this further - I really don't care about my ancestors or genetics. More interested in humans as a whole - and for that we need more data.
The field has advanced incredibly since then - uniparental markers like mitochondria or Y are a helpful clue, but we have so much genome-wide information now. Indian population genetics is still in its infancy, though, since Indian government rules around the export of biomaterial mean only Indian labs can work on them, and honestly most of them have strong biases they're looking to confirm. So much of the Indian population information in databases is simply extrapolated from tiny, non-representative samples collected from Indians abroad.
> As we gather more data our understanding of human migration inevitably clashes with identity politics/nationalism
It's funny you should mention this, as I saw the documentary and I seem to remember some North American indigenous tribes refusing to participate. Their reasoning was that Nat Geo was just trying to prove they were immigrants like everyone else.
You probably know this having watched the documentary, but I'm just adding context for others.
It's primarily because many American tribes try to maintain religious beliefs in conflict with an immigrant identity, although evidence points to most of the earliest American humans immigrating across the Bering Land Bridge ~20,000 years ago. The religious stories tribes tell say that they came from American caves, soil, water sources or other nature-related points of origin.
It's religious-based denialism, no different than what other major religions do when their beliefs conflict with reality.
> As we gather more data our understanding of human migration inevitably clashes with identity politics/nationalism and previous simplistic pre-scientific classifications based on skin color and facial features
Does it? Because most studies find that people living in the same country are indeed more closely related to each other than to people of different countries, roughly according to what you would expect from geography, and the simplistic categorization of humans into ~5 races corresponds well to clusters on a genetic map. See:
Yes, people in the same country have more similar genomes. But modern population genetics has found that: 1. There is more similarity than difference between any 2 humans on earth (there are Africans and Europeans who share more DNA with each other than with others of their own ancestry). And, 2. People move around, a lot! We knew this already, but there are a huge number of undocumented migrations that are being revealed by these studies. Turns out 20th century multiculturalism is just an acceleration of what we already did.
Man, it’s kinda scary hearing the “more similarity between any two humans” line here on HN. That one’s up there with the “we only use 10% of our brains” line, for me. My understanding is that’s a pretty well documented misconception about genetic diversity which doesn’t really apply here.
> But modern population genetics has found that 1. There is more similarity than difference between any 2 humans on earth
It's not even that modern. A human shares over 98% of the DNA of a pig, and about 50% of the DNA of a tree. Anything much closer to us than a tree has "more similarity than difference" to humans on a genetic level.
"The creatures outside looked from pig to man, and from man to pig, and from pig to man again; but already it was impossible to say which was which." - Orwell
> there are africans and europeans who share more dna than with others of their own ancestry
You're probably referring to the "greater variation within than between races" argument. But it's only true when looking at single genes, not collections of genes - that is how genetic tests can accurately determine race. And most traits are polygenic, so it does not even make sense to compare using only individual genes.
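A toy simulation can make the single-gene versus many-genes point concrete. This is only a sketch under made-up assumptions: two hypothetical populations whose allele frequencies differ slightly (0.45 vs 0.55) at every locus, loci treated as independent, and a plain likelihood comparison as the classifier. None of the numbers come from real data, and the function names are invented for illustration.

```python
import math
import random

def simulate_genotype(freqs, rng):
    """Draw a diploid genotype: 0, 1 or 2 copies of the allele at each locus."""
    return [(rng.random() < p) + (rng.random() < p) for p in freqs]

def log_likelihood(genotype, freqs):
    """Log-probability of a genotype given allele frequencies.

    The binomial coefficient is omitted because it is identical for both
    populations and cancels when the two likelihoods are compared.
    """
    total = 0.0
    for copies, p in zip(genotype, freqs):
        total += copies * math.log(p) + (2 - copies) * math.log(1 - p)
    return total

def classification_accuracy(n_loci, n_people=200, seed=0):
    """Fraction of simulated people assigned to the population they were drawn from."""
    rng = random.Random(seed)
    # Hypothetical frequencies: a small difference at every single locus.
    freqs_a = [0.45] * n_loci
    freqs_b = [0.55] * n_loci
    correct = 0
    for i in range(n_people):
        true_is_a = i % 2 == 0
        genotype = simulate_genotype(freqs_a if true_is_a else freqs_b, rng)
        guess_is_a = log_likelihood(genotype, freqs_a) > log_likelihood(genotype, freqs_b)
        correct += guess_is_a == true_is_a
    return correct / n_people

if __name__ == "__main__":
    for n in (1, 10, 100, 500):
        print(n, "loci ->", classification_accuracy(n))
```

Under these assumptions a single locus classifies barely better than a coin flip, while hundreds of loci together classify almost perfectly. Real genomes have linked loci and admixed individuals, so this says nothing about how well any real test performs.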
That's the reason Asian couples don't spontaneously give birth to European-looking children, and vice versa. And genes not expressed in superficial appearance were exposed to selection pressure just as much.
Whenever you see a theory so obviously contradict the evidence of your eyes, you should be at least skeptical of it (though not dismissive - e.g. quantum mechanics is true, however counterintuitive).
Yes, you are right about the greater variation being localised. But I thought it was localised at the level of linkage disequilibrium blocks, which can encompass many thousands of genes at their largest.
Regardless, the genetic distance between any two people remains tiny compared to the distance between even closely related species or sub-species elsewhere.
You're talking about averages. The idea that "genetic tests can accurately determine race" kind of falls apart when you apply it to an individual with mixed ancestry. You can probably detect and quantify the extent of the mix (which is why 23andMe or Ancestry can make money), but labelling a bunch of averages as "race" and arguing that it is an innate and immovable property of biology suggests there's something you're missing in drawing these correlations. You can start at a conclusion and show all sorts of correlations given enough data, but that doesn't validate this as the right way to look at it.
And that's not even getting into how you're giving off creepy eugenicist vibes.
Lewontin came to that conclusion by examining exclusively blood type alleles, which was kind of asinine because they have no apparent geographic distribution pattern at all. Other genetic markers tell a much clearer story, as the principal component analysis shows. It couldn't really be any more obvious - the PCA makes a map. You can literally see each of the continents represented in a projection of genetic data.
Something like 30-50% of all human genetic variants are shared across continents. It's hardly asinine to say that a significant proportion of genetic variation is shared with everyone. Depending where you are talking about, ~20% of variation is unique to a given continent.
I don't think PCA plots really tell us much beyond there being distinct genetic clusters? One could do a PCA only on people with european ancestry, or people living in a small town, and there would be plenty of interesting structure to look at.
> and the simplistic categorization of humans into ~5 races corresponds well to clusters on a genetic map
Are you familiar with principal component analysis? That’s what was used to generate the very clearly clustered charts that you linked to. It’s useful for analysis because it exaggerates the density and distinctness of clusters based on major features. But what that means is precisely that those clusters are not real! You’re mistaking a visual/analytical aid for raw data. A bit like looking at a false-color diagram of the human body and concluding based on it that blood is blue.
> it exaggerates the density and distinctness of clusters
It does the opposite - the PCA graph shown is 2D, meaning points in the N-dimensional space are projected onto a single plane. In this projection, information is lost - i.e. point-clouds that are distinct in 3D may overlap in 2D. While we are shown the axes along which the data is most distinct, the true clusters are even more distinct, as every added dimension would contribute additional 'distinctness'.
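The narrow geometric point here (an orthogonal projection can shrink the gap between two point clouds but never widen it) can be checked with a small sketch. The data is invented: two hypothetical 3D clusters that differ only along the third axis. The code drops that axis on purpose, which PCA itself would not do for this data, so it illustrates the information-loss claim only and not what PCA selects.

```python
import math
import random

def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def project_to_2d(points):
    """Orthogonal projection onto the plane of the first two coordinates."""
    return [p[:2] for p in points]

rng = random.Random(1)
# Hypothetical clusters: same spread in x and y, centers 10 units apart in z.
cluster_a = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(-5, 1)] for _ in range(100)]
cluster_b = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(5, 1)] for _ in range(100)]

# Gap between cluster centers in the full space and after dropping the z axis.
full_gap = distance(centroid(cluster_a), centroid(cluster_b))
flat_gap = distance(centroid(project_to_2d(cluster_a)), centroid(project_to_2d(cluster_b)))

if __name__ == "__main__":
    print("gap between centroids in 3D:", round(full_gap, 2))
    print("gap between centroids in 2D:", round(flat_gap, 2))
```

The two clouds are well separated in 3D and sit on top of each other in this particular 2D view.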
No, sorry, you’re just wrong here. You can run PCA with pretty much any set of genetic samples and get a very nicely clustered graph, because that’s what PCA is for. See this PCA graph [0] on population groups present in Brazil. Should Xavante people be the sixth race in your taxonomy? They sure look distinct from other Amerindians and very distinct from the rest of the world!
(No, of course they shouldn’t. It’s just that extracting features to artificially cluster points along the most characteristic axes is what PCA is for.)
I believe I described PCA accurately, so could you elaborate which part of my statement is wrong?
> extracting features to artificially cluster points along the most characteristic axes is what PCA is for
How is the clustering "artificial"? Because if you generate data without clusters (e.g. points evenly distributed within a sphere), applying PCA to it won't show clusters either.
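The experiment described above is easy to sketch. What follows is a minimal stdlib-only version, assuming a hand-rolled power iteration for the first principal axis (not a library PCA) and an ad hoc "central fraction" measure that I made up to detect a gap in the middle of the projected scores. The two data sets are synthetic: points uniform in a unit ball, and two hypothetical Gaussian clusters.

```python
import math
import random

def center(points):
    """Subtract the coordinate-wise mean from every point."""
    n, d = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(d)]
    return [[p[j] - means[j] for j in range(d)] for p in points]

def first_principal_axis(centered, iterations=200):
    """Power iteration on the covariance matrix; returns a unit vector."""
    n, d = len(centered), len(centered[0])
    cov = [[sum(p[i] * p[j] for p in centered) / n for j in range(d)] for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def pc1_scores(points):
    """Coordinates of the points along the first principal axis."""
    centered = center(points)
    axis = first_principal_axis(centered)
    return [sum(a * x for a, x in zip(axis, p)) for p in centered]

def central_fraction(scores, width=0.25):
    """Share of points lying in the middle of the score range (ad hoc gap measure)."""
    limit = width * max(abs(s) for s in scores)
    return sum(abs(s) < limit for s in scores) / len(scores)

def uniform_ball_point(rng, dim=3):
    """Rejection-sample a point uniformly from the unit ball."""
    while True:
        p = [rng.uniform(-1, 1) for _ in range(dim)]
        if sum(x * x for x in p) <= 1:
            return p

rng = random.Random(42)
ball = [uniform_ball_point(rng) for _ in range(500)]
clusters = [[rng.gauss(mu, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
            for mu in (-5, 5) for _ in range(250)]

if __name__ == "__main__":
    print("ball, central fraction:    ", central_fraction(pc1_scores(ball)))
    print("clusters, central fraction:", central_fraction(pc1_scores(clusters)))
```

With structureless input the middle of the first component is well populated; with two genuine clusters it is nearly empty. This only shows that PCA does not manufacture clusters out of uniform data. It does not address whether the samples fed into a given study were themselves chosen to be far apart.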
It might support the prescientific racial classifications, but it definitely undermines the idea of a nationally pure identity for any country. The "Aryan race" comes from the mixture of several different ancient cultures. The same can be said for north Indians, and other groups which claim to have some pure ancient genetic lineage.
> most studies find that people living in the same country are indeed more closely related to each other than to people of different countries
I'm surprised that anyone would really think otherwise. There is so much obvious similarity between people in different countries (well, the old countries, I mean, such as in Europe and Asia) that sometimes I wonder if each country has a relatively recent common ancestor.
> I'm surprised that anyone would really think otherwise
Usually they disprove a much stronger, extreme claim (e.g. some sort of racial purity, or the idea of entirely discontinuous, clearly-delimited races), then imply this means peoples are entirely interchangeable, and that the weaker, obviously-true claim of closer kinship within countries/regions is somehow false and/or meaningless.
Yep, it is a motte-and-bailey argument[1]. The motte is that the popular notion of race does not map completely accurately onto the spectrum of actual human genetic diversity, which of course is true. The bailey is that the idea of race is completely meaningless and that anyone who buys into it to any degree whatsoever is a racist. The bailey is often presented using the thought-terminating cliche of "race is an unscientific/pseudoscientific concept".
> that people living in the same country are indeed more closely related to each other than to people of different countries
These charts don't show what you apparently think they do. The samples in the larger datasets (like the HGDP) are chosen for their differentiation, often from indigenous groups, with admixture samples usually removed (the first adds in a few admixture types for the specifics of their study).
Your statement was "people living in the same country are indeed more closely related to each other than to people of different countries"
The first chart shows a direct counterexample with "African American", "Mexican American" and "North America" groups more separated from each other than from other groups and yet many of them live in the same country. These are also not the only groups of people you'd find in North America, they're just the ones used in the study of Northwestern American indigenous groups that this is screenshotted from. This is, again, poor evidence for your claim.
Countries that had recent significant immigration are an exception, and their population will reflect the genetics of their geographic origins. I.e. African-Americans are more closely related to Africans than to Italian-Americans. I thought this was obvious enough to not require a disclaimer.
I was speaking off the top of my head, hence the ~ preceding the 5. That said, more clusters may be visible on a 3D graph. The data itself has many more dimensions, and we see only a projection.
Sorry, I forget that I have to explain when I'm being sarcastic on Hacker News. The actual number of clear clusters I found was zero, maybe 2. The four groups I colored on the chart seem to follow the cleavages just as well as the ones shown originally, but they don't map to our culture's understanding of race at all. If we lived in a culture that believed that Europeans and Indigenous Alaskans were part of the same race, they could support that idea using that chart.
Perhaps more dimensions would reveal five or so clear clusters that correspond exactly to the races listed in the US census, but I suspect what it would actually reveal is that population genetics is really complicated and has lots of counter-intuitive cases that defy most people's naive expectations.
Your first graph is surprising. In general, people from across Africa have the greatest genetic diversity. Mostly thanks to founder effect https://en.wikipedia.org/wiki/Founder_effect from what I heard.
But your first graph shows all of their African samples clustered relatively tightly together.
That's because the x-axis is "pairwise allele-sharing distance", i.e. we munged the data every which way to find something we could run k-means clustering on to write a paper.
A better graph would be a "pairwise interracial-fucking distance" with the y axis being the percentage of healthy babies born. That would be just one cluster.
And it doesn't matter, because if you have mixed-race children then it's not true anymore. Past statistical distributions are not evidence of future statistical distributions.
As an only child raised far away from all family I find ancestry hugely interesting. Wanting to be part of a tribe is an inherently human trait. Finding Jewish ancestry and that those that didn't leave eastern Europe were wiped out GREATLY changed my motivation to protect all people. Finding out that my quite unassuming grandfather (who made a huge difference in America's 80s farm crisis at the cost of his entire career in the 70s and was blacklisted from employment forever) descended from basically every European noble and was more royal than my ex's side (who are very proud of their silver set handed down from their nobility) was so, so gratifying.
You should try talking to people with interesting ancestors (or views of their ancestors).
I won't go into too much detail here to avoid doxing myself, but I've had a few decent discussions about some of my and my friends' ancestors/likely ancestors as historical figures.
Depending on who you're (un)fortunate enough to be descended from, you might find yourself viewing certain historical events in a different light.
Personally as a child it helped me humanize these historical figures and view them as I would an uncle or grandfather, as opposed to the sometimes one-dimensional treatment notable figures would get from the educational system or culture you learn about them from.
Now as an adult it's perhaps less interesting, but I still know significantly more random trivia about historical figures I'm distantly related to compared to others.
[1] https://en.wikipedia.org/wiki/Genographic_Project