I would like to thank everyone for the incredible reaction to my first dendrogram, and a special thanks to the great James Thompson, who tweeted it out to his thousands and thousands of Twitter followers.
Although the vast majority of people enjoyed the dendrogram, a few people mocked it. Luckily, I was tipped off by a quick-thinking blogger. (Not to namedrop, but it was HBD Chick.)
One of the critics asked why I had not calculated the cophenetic correlation, because if I had, I would have known that human “races” don’t fit a tree-like structure.
Alan R. Templeton writes:
The cophenetic correlation measures how well the observed genetic distances fit the predicted genetic distances from an evolutionary tree model and provides a heuristic goodness of fit to treeness… The cophenetic correlations for various data sets that have been used to portray human population trees vary from 0.45 to 0.79 (Templeton, 1998a). A tree-like structure of genetic differentiation requires a cophenetic correlation greater than 0.9, and any value less than 0.8 is regarded as a poor fit (Rohlf, 1993)
Source: Biological races in Humans
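For anyone who wants to see what this statistic actually computes, here is a minimal sketch in Python. It assumes average-linkage (UPGMA) clustering, which is only one of several ways to build a tree and may not match the method behind any particular published dendrogram. The function names are illustrative, and the two distance matrices are made-up toy values, not real genetic data.

```python
from itertools import combinations
from math import sqrt


def upgma_cophenetic(dist):
    """Average-linkage (UPGMA) clustering on a symmetric distance matrix.

    Returns the cophenetic distance matrix: for each pair of leaves,
    the height at which they first end up in the same cluster.
    """
    n = len(dist)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member leaves
    coph = [[0.0] * n for _ in range(n)]

    def avg(a, b):
        # Mean of all leaf-to-leaf distances between clusters a and b
        total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
        return total / (len(clusters[a]) * len(clusters[b]))

    next_id = n
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance
        a, b = min(combinations(clusters, 2), key=lambda p: avg(*p))
        height = avg(a, b)
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return coph


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def cophenetic_correlation(dist):
    """Correlate observed pairwise distances with tree-implied ones."""
    coph = upgma_cophenetic(dist)
    pairs = list(combinations(range(len(dist)), 2))
    observed = [dist[i][j] for i, j in pairs]
    implied = [coph[i][j] for i, j in pairs]
    return pearson(observed, implied)


# Toy matrix that fits a tree perfectly (hypothetical values)
tree_like = [
    [0, 2, 10, 10],
    [2, 0, 10, 10],
    [10, 10, 0, 4],
    [10, 10, 4, 0],
]

# Toy matrix that a tree can only approximate (hypothetical values)
less_tree_like = [
    [0, 1, 4, 5],
    [1, 0, 5, 4],
    [4, 5, 0, 1],
    [5, 4, 1, 0],
]

print(cophenetic_correlation(tree_like))       # approximately 1.0
print(cophenetic_correlation(less_tree_like))  # below 1.0
```

The first matrix is reproduced exactly by its tree, so the correlation comes out at essentially 1. In the second, the tree has to average distances that are not really equal, so the tree-implied distances drift from the observed ones and the correlation drops below 1.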
So what’s the cophenetic correlation of my dendrogram?
Why did my dendrogram achieve such a cophenetic correlation when others failed to do so? There are several possible reasons.
- incompetence? perhaps I didn’t follow the correct procedures?
- luck. With only 5 modern human populations, a high correlation may have occurred by chance.
- pure samples: perhaps the genomes in the data-set were less hybridized than previous data-sets which contained mixed groups like Southeast Asians
- genome thoroughness: perhaps my data, being newer, sampled more of the genome than previous research and thus gave more accurate results.