Can the minds of machines teach us something new about what it means to be human? When it comes to the intricate story of our species’ complex origins and evolution, it appears that they can.
A recent study used machine learning technology to analyze eight leading models of human origins and evolution, and the program identified evidence in the human genome of a “ghost population” of human ancestors. The analysis suggests that a previously unknown and long-extinct group of hominins interbred with Homo sapiens in Asia and Oceania somewhere along the long, winding road of human evolutionary history, leaving behind only fragmented traces in modern human DNA.
The study, published in Nature Communications, is one of the first examples of how machine learning can help reveal clues to our own origins. By poring over vast amounts of genomic data left behind in fossilized bones and comparing it with DNA in modern humans, scientists can begin to fill in some of the gaps of our species’ evolutionary history.
In this case, the results seem to match paleoanthropology theories that were developed from studying human ancestor fossils found in the ground. The new data suggest that the mysterious hominin was likely descended from an admixture of Neanderthals and Denisovans (who were only identified as a unique species on the human family tree in 2010). Such a species in our evolutionary past would look a lot like the fossil of a 90,000-year-old teenage girl from Siberia’s Denisova cave. Her remains were described last summer as the only known example of a first-generation hybrid between the two species, with a Neanderthal mother and a Denisovan father.
“It’s exactly the kind of individual we expect to find at the origin of this population, however this should not be just a single individual but a whole population,” says study co-author Jaume Bertranpetit, an evolutionary biologist at Barcelona’s Pompeu Fabra University.
The ability of early humans to adjust to changing conditions ultimately enabled the earliest species of Homo to vary, survive and begin spreading from Africa to Eurasia 1.85 million years ago.
Previous human genome studies have revealed that after modern humans left Africa, perhaps 180,000 years ago, they subsequently interbred with species like Neanderthals and Denisovans, who coexisted with early modern humans before going extinct. But redrawing our family tree to include these divergent branches has been difficult. Evidence for “ghost” species can be sparse, and many competing theories exist to explain when, where, and how often Homo sapiens might have interbred with other species.
Traces of these ancient interspecies liaisons, called introgressions, can be identified as places of divergence in the human genome. At these places, two copies of a chromosome differ from each other more than you’d expect if both had come from the same human species. When scientists sequenced the Neanderthal genome in 2010, they realized that some of these divergences represented fractions of our genome that came from Neanderthals. Studies have also revealed that some living humans can trace as much as 5 percent of their ancestry to Denisovans.
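To make the idea of "places of divergence" concrete, here is a minimal Python sketch. It is not the method used in the study. It assumes two aligned sequences of equal length, splits them into fixed-size windows, and flags any window where the share of mismatched sites is well above an assumed background rate. The function names, the toy sequences, the window size and the threefold threshold are all illustrative assumptions.

```python
def make_toy_sequences(length=1000):
    """Build two made-up aligned sequences (hypothetical data, not real DNA)."""
    seq_a = ["A"] * length
    seq_b = list(seq_a)
    for i in range(0, length, 100):   # background: 1 difference per 100 sites
        seq_b[i] = "G"
    for i in range(300, 400, 10):     # pretend introgressed tract: 1 per 10 sites
        seq_b[i] = "G"
    return seq_a, seq_b


def window_divergence(seq_a, seq_b, window):
    """Fraction of mismatched sites in each non-overlapping window."""
    fractions = []
    for start in range(0, len(seq_a) - window + 1, window):
        mismatches = sum(
            1 for i in range(start, start + window) if seq_a[i] != seq_b[i]
        )
        fractions.append(mismatches / window)
    return fractions


def flag_divergent(fractions, expected, factor=3.0):
    """Indices of windows whose divergence exceeds factor * expected."""
    return [i for i, f in enumerate(fractions) if f > factor * expected]


if __name__ == "__main__":
    a, b = make_toy_sequences()
    fractions = window_divergence(a, b, window=100)
    print(flag_divergent(fractions, expected=0.01))  # -> [3]
```

In this toy case only the fourth window (index 3) stands out. Real analyses have to work with noisy data and with an expected divergence that itself depends on the assumed population history.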
“So, we thought we’d try to find these places of high divergence in the genome, see which are Neanderthal and which are Denisovan, and then see whether these explain the whole picture,” Bertranpetit says. “As it happens, if you subtract the Neanderthal and Denisovan parts, there is still something in the genome that is highly divergent.”
Identifying and analyzing the many divergent places throughout the genome, and computing the countless genetic combinations that could have produced them, is too big a job for humans to tackle on their own, but it’s a task that may be tailor-made for deep learning algorithms.
Deep learning is a type of artificial intelligence in which algorithms are designed to work as an artificial neural network, a program loosely modeled on the way a mammalian brain processes information. These machine learning systems can detect patterns and account for previous information to “learn,” allowing them to perform new tasks or look for new information after analyzing enormous amounts of data. (A common example is Google DeepMind’s AlphaZero, which can teach itself to master board games.)
“Deep learning is fitting a more complicated shaped thing to a set of points in a bigger space,” says Joshua Schraiber, an evolutionary genomics expert at Temple University. “Instead of fitting a line between Y and X, you’re fitting some squiggly thing to a set of points in much bigger, thousand-dimensional space. Deep learning says, ‘I don’t know what squiggly shape should fit to these points, but let’s see what happens.’”
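Schraiber's contrast between a line and a "squiggly thing" can be illustrated with a small Python sketch. It works in one dimension rather than thousands, and it uses a single hidden layer, far shallower than any real deep learning system. The code first fits a straight line to made-up points lying on a sine curve. It then trains a tiny neural network by gradient descent to bend that line toward the points. Every name, the toy data, and the settings (8 hidden units, 2,000 steps, learning rate 0.01) are assumptions chosen for illustration.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(preds, ys):
    """Mean squared error between predictions and targets."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def fit_squiggle(xs, residuals, hidden=8, steps=2000, lr=0.01, seed=0):
    """Train a one-hidden-layer tanh network on what the line got wrong.

    Output weights start at zero, so the combined model (line + network)
    begins as exactly the straight line and is then bent by training.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-2, 2) for _ in range(hidden)]
    b1 = [rng.uniform(-2, 2) for _ in range(hidden)]
    w2 = [0.0] * hidden
    b2 = 0.0
    n = len(xs)
    for _ in range(steps):
        g_w1 = [0.0] * hidden
        g_b1 = [0.0] * hidden
        g_w2 = [0.0] * hidden
        g_b2 = 0.0
        for x, r in zip(xs, residuals):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - r
            g_b2 += 2 * err / n
            for j in range(hidden):
                g_w2[j] += 2 * err * h[j] / n
                back = 2 * err * w2[j] * (1 - h[j] ** 2) / n
                g_w1[j] += back * x
                g_b1[j] += back
        for j in range(hidden):
            w1[j] -= lr * g_w1[j]
            b1[j] -= lr * g_b1[j]
            w2[j] -= lr * g_w2[j]
        b2 -= lr * g_b2

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(hidden)) + b2

    return predict


def compare_fits():
    """Return (line error, line-plus-network error) on toy sine-curve data."""
    xs = [-3 + 6 * i / 39 for i in range(40)]
    ys = [math.sin(x) for x in xs]
    slope, intercept = fit_line(xs, ys)
    line_preds = [slope * x + intercept for x in xs]
    squiggle = fit_squiggle(xs, [y - p for y, p in zip(ys, line_preds)])
    bent_preds = [p + squiggle(x) for x, p in zip(xs, line_preds)]
    return mse(line_preds, ys), mse(bent_preds, ys)
```

Calling `compare_fits()` returns the two errors side by side. The network is never told that the points follow a sine curve. It simply adjusts its weights until the bent curve sits closer to the points than the line does.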
In this case, machines were set to work analyzing the human genome and predicting human demographics by simulating how our DNA might have evolved over many thousands of possible scenarios of ancient evolution. The program accounted for the structure and evolution of DNA as well as models of human migration and interbreeding to try to fit some of the pieces together in an incredibly complex puzzle.
The researchers trained the computer to analyze eight different models of the most plausible theories of early human evolution across Eurasia. The models came from previous studies that attempted to come up with a scenario that would result in the current picture of the human genome, including its known Neanderthal and Denisovan components.
“There could be other models, of course, but these models are the ones that other people have been proposing in the scientific literature,” Bertranpetit says. Each model begins with the accepted out-of-Africa event, then features a different set of the most likely splits between human lineages, including various interbreedings with both known species and possible “ghost” species.
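The general logic of testing competing histories against the genome can be sketched in a few lines of Python. This is a toy under loudly stated assumptions, not the researchers' pipeline. Each candidate model is reduced to one invented number, the share of the genome it would leave looking highly divergent. Each model is simulated many times with random noise, and the models are ranked by how close their simulations land to an equally invented "observed" value. The study used deep learning to do the comparing. Here a plain average distance is swapped in for it. The model names, all numeric values and the noise level are hypothetical.

```python
import random

# Hypothetical candidate histories mapped to the share of the genome each
# would leave looking highly divergent. All values are invented.
CANDIDATE_MODELS = {
    "no_interbreeding": 0.00,
    "neanderthal_and_denisovan": 0.04,
    "neanderthal_denisovan_and_ghost": 0.07,
}


def simulate(expected_share, rng, noise=0.005):
    """One simulated genome summary: the expected share plus Gaussian noise."""
    return max(0.0, rng.gauss(expected_share, noise))


def rank_models(observed, models, n_sims=1000, seed=0):
    """Rank model names from best to worst by mean distance to the observation."""
    rng = random.Random(seed)
    scores = {}
    for name, expected_share in models.items():
        sims = [simulate(expected_share, rng) for _ in range(n_sims)]
        scores[name] = sum(abs(s - observed) for s in sims) / n_sims
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    # 0.068 is a made-up "observed" share, chosen only to exercise the code.
    for name in rank_models(0.068, CANDIDATE_MODELS):
        print(name)
```

With these invented inputs, the model that includes a ghost population ranks first simply because its simulations fall nearest the made-up observation. The real problem is far harder, since each model produces an entire simulated genome rather than a single number.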