Reader companion · the missing-heritability problem · what does the work above the genome

The disappointment of genes — missing heritability and what else does the work.

The Human Genome Project, completed in 2003, was supposed to settle the question of what we are. The result was unexpectedly modest. Humans turned out to have roughly 20,000 protein-coding genes: about the same as a roundworm or a chicken, not many more than a fruit fly, and roughly half the count of a rice plant. Two decades later, the largest genome-wide association studies for most complex human traits explain less than 25% of the heritability that twin studies estimate, with most psychiatric conditions closer to 5–15% and height the main exception at roughly 40–50%. The genome turned out to be a parts list, not a blueprint, and the parts list does not, on its own, account for the form of the life it builds. This primer walks through what the genes actually do, what the gap looks like in the numbers, and the layers above the genome (epigenetic, bioelectric, microbial, and, more contestedly, morphic-resonant) that are now doing most of the explanatory work the genes were supposed to do.

Companion to Morphic resonance & the inheritance of pattern, Levin & the bioelectric blueprint, The hard problem, re-stated, Information as the foundation, and the Synthesis.

1. What genes actually do

A gene is a stretch of DNA that codes for a protein (or, in some cases, for a regulatory RNA or a structural element). Proteins are the building materials and the molecular machines of the cell — enzymes that catalyse reactions, structural proteins that form tissue, signalling proteins that pass messages, receptors that listen for signals. The genome tells the cell which proteins to make. It does not, on its face, tell the embryo where to put the heart, why the left arm should match the right arm in length, how the brain should fold, when the immune system should escalate or stand down, what shape the body should regenerate to after injury, or which of several thousand possible developmental trajectories a fertilized egg should follow. The proteins do all of these things, working together; the question of how they coordinate is the question genes by themselves do not answer.

The standard mid-twentieth-century picture — "DNA is the blueprint of the organism" — turns out to be off by a category. DNA is not a blueprint. A blueprint specifies a finished structure; DNA specifies a set of components. The instructions for how the components organize into a structure are largely not in the DNA. They are in the dynamic context the components find themselves in: the chemical gradients the early embryo establishes, the bioelectric voltage patterns across tissues (the territory Levin's lab has been mapping), the mechanical forces cells exert on each other, the maternal cytoplasm a zygote inherits independently of its genome, the microbiome the body acquires from its mother and environment, and — controversially — in whatever non-local pattern-storing capacity the developmental field has, beyond all of these. The genes are necessary. They are not, by themselves, sufficient.

2. The numbers, honestly

The pre-Human-Genome-Project consensus expected somewhere between 100,000 and 150,000 protein-coding genes in the human genome — the rough number assumed necessary to specify an organism of our complexity. The actual number, as the project converged in the early 2000s and was refined through GENCODE and Ensembl annotations in the years since, settled at approximately 19,000 to 20,500 protein-coding genes. The downward revision was a surprise large enough to require explanation, and the explanations had to come from somewhere other than the gene catalogue.

Comparison across species made the disappointment worse, not better. The cleanest version of the comparison:

Organism                  | Approx. protein-coding genes   | Comment
Homo sapiens              | ~19,000 – 20,500               | GENCODE / Ensembl current annotations.
Mouse (Mus musculus)      | ~22,000                        | Slightly more genes than humans.
Chicken (Gallus gallus)   | ~18,000 – 20,000               | Comparable to humans.
Fruit fly (Drosophila)    | ~14,000                        | About three-quarters the human count.
Roundworm (C. elegans)    | ~20,000                        | A 1 mm organism with 959 cells has the same gene count as we do.
Rice (Oryza sativa)       | ~35,000 – 50,000               | Annotation varies by source. By any count, substantially more genes than humans; the often-cited "twice as many" comparison is approximately right.
Amoeba (Polychaos dubium) | Genome ~200× larger than ours  | A single-celled organism with vastly more DNA. The "C-value paradox" since 1971.

The lesson the literature has been forced to draw is that complexity of the organism is not predicted by gene count. A rice plant has more genes than a person. A single-celled amoeba has hundreds of times more DNA than the human genome. The genome is necessary, but the additional complexity that distinguishes us from rice is somewhere else. That "somewhere else" is what the next several sections are about.
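The comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: it copies the approximate ranges from the table above (the amoeba is left out because the table gives its genome size, not a gene count), and the names `GENE_COUNTS` and `ratio_to_human` are made up for this example.

```python
# Approximate protein-coding gene counts (low, high), copied from the table above.
GENE_COUNTS = {
    "Homo sapiens": (19_000, 20_500),
    "Mouse": (22_000, 22_000),
    "Chicken": (18_000, 20_000),
    "Fruit fly": (14_000, 14_000),
    "Roundworm": (20_000, 20_000),
    "Rice": (35_000, 50_000),
}


def ratio_to_human(organism: str) -> tuple[float, float]:
    """Return the (smallest, largest) ratio of an organism's gene count to the human count."""
    low, high = GENE_COUNTS[organism]
    human_low, human_high = GENE_COUNTS["Homo sapiens"]
    # Smallest ratio: organism's low estimate over the human high estimate, and vice versa.
    return low / human_high, high / human_low


for name in GENE_COUNTS:
    smallest, largest = ratio_to_human(name)
    print(f"{name:<13} {smallest:.2f}x to {largest:.2f}x the human count")
```

On these ranges rice comes out at roughly 1.7 to 2.6 times the human count, which brackets the "twice as many" figure, and the fruit fly at roughly 0.68 to 0.74, in line with "about three-quarters".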

3. Missing heritability — the gap between twin studies and GWAS

The cleanest empirical statement of the genes-do-not-do-as-much-as-expected problem is the missing heritability problem, first named in Brendan Maher's 2008 Nature commentary and formalised in Manolio et al.'s 2009 Nature paper. The setup:

Twin studies compare identical (monozygotic) twins, who share ~100% of their DNA, with fraternal (dizygotic) twins, who share on average ~50% of the DNA that varies between people. The difference in similarity for any given trait, scaled appropriately, gives an estimate of the heritability of that trait: the share of the trait's variance in a population that is attributable to genetic variance. For human height, twin studies estimate heritability around 80%. For intelligence (IQ in adults), around 60–80%. For most psychiatric conditions (schizophrenia, autism, depression), 50–80%. For body mass index, 40–70%.
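One common way to do the scaling is Falconer's formula, which doubles the gap between the monozygotic and dizygotic twin correlations. The sketch below assumes that formula; the studies behind the figures quoted above may well use fuller model-fitting instead, and the correlations in the example are hypothetical values chosen only so the result lands near the ~80% quoted for height.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate of heritability: h^2 = 2 * (r_MZ - r_DZ)."""
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("twin correlations must lie in [-1, 1]")
    return 2.0 * (r_mz - r_dz)


# Hypothetical trait correlations within identical and fraternal twin pairs.
example_r_mz = 0.90
example_r_dz = 0.50

h2 = falconer_heritability(example_r_mz, example_r_dz)
print(f"estimated heritability: {h2:.2f}")
```

The factor of two reflects the setup described above: identical twins share all of their segregating DNA and fraternal twins about half, so the difference in correlation captures roughly half of the additive genetic contribution.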

Genome-wide association studies (GWAS) take a different approach: scan the genomes of large populations and find the specific genetic variants (single-nucleotide polymorphisms, mostly) that correlate with the trait. Adding up the contributions of all the variants that reach statistical significance gives an estimate of the heritability the GWAS hits explain.
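The "adding up" step can be sketched as follows, under the standard additive model in which a variant with effect-allele frequency p and per-allele effect beta (in phenotypic standard deviations) contributes 2p(1 - p)beta² to the trait variance. This is an illustration, not any particular study's pipeline: the `Hit` type, the function names, and the three example variants are all hypothetical, and the simple sum assumes the hits are independent of one another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One genome-wide significant variant (all values used below are hypothetical)."""
    allele_freq: float  # frequency p of the effect allele
    beta: float         # per-allele effect, in phenotypic standard deviations


def variance_explained(hit: Hit) -> float:
    """Additive variance contributed by one variant: 2 * p * (1 - p) * beta**2."""
    p = hit.allele_freq
    return 2.0 * p * (1.0 - p) * hit.beta ** 2


def total_variance_explained(hits: list[Hit]) -> float:
    """Sum over hits, assuming they are independent (no linkage disequilibrium)."""
    return sum(variance_explained(h) for h in hits)


hits = [Hit(0.50, 0.05), Hit(0.20, 0.08), Hit(0.10, 0.10)]
print(f"share of trait variance explained: {total_variance_explained(hits):.4f}")
```

These three made-up hits together account for about half a percent of the trait's variance, which gives a feel for why thousands of small-effect variants are needed before the total approaches a twin-study estimate.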

The disappointment: for almost every complex polygenic trait, the GWAS hits explain only a fraction of the heritability twin studies estimate. For height, the largest GWAS to date, with millions of subjects, explain about 40–50% of the heritability, after enormous effort. For intelligence, the figure is around 10–25%; for most psychiatric conditions, 5–15%. Most of the heritability twin studies report is not in the variants GWAS has found, even at these sample sizes. The remainder is presumed to be there, as many variants of vanishingly small effect, as rare variants, as epistatic interactions, or as something the model has not captured, but it has stubbornly resisted being found.
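To see the size of the gap in terms of trait variance rather than shares of heritability, the figures quoted in this section can be combined directly. The sketch below is plain arithmetic on those quoted ranges; the names `TRAITS` and `unexplained_variance` are invented for the example, and the ranges are treated as rough bounds rather than precise estimates.

```python
# (twin h2 low, twin h2 high, share explained low, share explained high),
# all taken from the figures quoted in this section.
TRAITS = {
    "height": (0.80, 0.80, 0.40, 0.50),
    "intelligence": (0.60, 0.80, 0.10, 0.25),
    "psychiatric conditions": (0.50, 0.80, 0.05, 0.15),
}


def unexplained_variance(h2_low, h2_high, share_low, share_high):
    """Range of trait variance that twin studies call heritable but GWAS hits do not capture."""
    smallest = h2_low * (1.0 - share_high)   # low heritability, best-case GWAS
    largest = h2_high * (1.0 - share_low)    # high heritability, worst-case GWAS
    return smallest, largest


for trait, figures in TRAITS.items():
    smallest, largest = unexplained_variance(*figures)
    print(f"{trait:<23} {smallest:.0%} to {largest:.0%} of trait variance unaccounted for")
```

On these figures, roughly 40% to 48% of the variance in height is heritable by the twin estimate yet not accounted for by GWAS hits, and the range for intelligence and psychiatric conditions is wider still.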

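Possible explanations under active research include the ones just named (many variants of very small effect, rare variants, epistatic interactions) and one more: that part of what twin studies register as heritable is transmitted through layers other than the DNA sequence. That last possibility is the one this primer cares about. It is also the one the literature has been slowest to take seriously, for reasons that are sociological as much as scientific.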

4. The layers above the genome — what is actually doing the work

The picture that has emerged across the last two decades is that biological inheritance is multi-layered, with the DNA sequence as only one (necessary, central, but not sufficient) layer among several. The honest contemporary catalogue:

Epigenetic inheritance

Methylation of DNA, modification of histones, chromatin remodelling, and small-RNA regulation all alter which genes are expressed, when, and how strongly, without changing the DNA sequence itself. Some of these marks are heritable across cell division (the standard developmental epigenetics), and a smaller but reproducible set is heritable across generations (transgenerational epigenetic inheritance, demonstrated cleanly in rodents and plants, more contested in humans). The Dutch Hunger Winter studies are the classic human case: people conceived during the 1944–45 famine show metabolic and psychiatric differences from siblings conceived just before or after, along with epigenetic differences still detectable decades later, and some follow-up studies report effects reaching the grandchildren. Epigenetics is mainstream science with active research programmes and growing clinical applications.

Bioelectric morphogenesis

Michael Levin's lab at Tufts has shown across two decades that body-plan information is carried bioelectrically, above the level of the genome. Voltage gradients across cell membranes (the same kind of voltage difference that makes neurons fire, but in non-neural tissues) carry pattern information that shapes how regenerating tissue grows back, where organs form, and what overall body-plan emerges. The most striking demonstrations: two-headed planaria created by purely bioelectric editing with no genetic change; planaria trained to associate a stimulus with food, decapitated, regrown from the tail fragment, and retested, which still show evidence of the training; and Xenobots, programmable living machines whose anatomy is not specified in the frog-cell genome they were built from. See the Levin companion page for the detailed catalogue. This is peer-reviewed mainstream science, and it directly demonstrates that information about form lives above the genome.

Cytoplasmic and maternal inheritance

The fertilized egg inherits not only nuclear DNA but the entire maternal cytoplasm — mitochondria (with their own genome), cytoskeletal organisation, polarity gradients, mRNA, proteins, and lipid membranes. Mitochondrial DNA alone is matrilineally inherited and carries information independent of the nuclear genome. Cytoplasmic determinants set up early embryonic asymmetries before zygotic gene expression has begun.

Microbiome inheritance

The human body carries roughly as many microbial cells as human cells, and a microbial gene count perhaps 100× the human gene count. The microbiome influences immune development, metabolism, mood, and neurological function, and it is inherited primarily from the mother at birth (vaginal delivery versus C-section is itself a significant intervention in this transmission). Inheritance via the microbiome is now well-established in immunology and microbial ecology.

Prion-like protein inheritance

Some traits are heritable through self-propagating protein conformations (prions and prion-like elements). Cleanly demonstrated in yeast; emerging as a possible mechanism in some neurological diseases. Pattern is inherited not via DNA but via the conformational state of a protein that templates its own conformation in newly synthesized copies.

Morphic resonance (the contested layer)

Sheldrake's hypothesis (see the morphic-resonance companion page) proposes a further layer above all of these: a non-local field associated with each form, reinforced by every instance that realises the form, mediating inheritance of pattern across instances by morphic resonance. The proposal is contested; the phenomena it tries to explain are real. It belongs among the "live but contested" research programmes: Levin's work is supplying an empirical floor for at least some of what Sheldrake proposed, even though Levin himself does not adopt Sheldrake's vocabulary.

5. Why this matters — medical, ethical, political

The genes-as-blueprint picture, which dominated the popular and policy imagination of the late twentieth century, has had concrete consequences. The expectation that the Human Genome Project would unlock personalized medicine, predict adult disease from infant DNA, and explain the variation among human beings has not been met. The reasons are now better understood. Heritability is real, but it is not mostly carried by the variants that current methods can find; it is distributed across layers the genes-as-blueprint picture did not anticipate.

What this means in practice:

6. The receiver-model reading

On the trilogy's receiver model, the disappointment of genes is not a surprise but a prediction. If consciousness is a field property and bodies are receivers configured to localise it, then the genome is the local hardware specification — necessary, but specifying only the substrate the field couples to, not the field itself. The missing heritability is not missing. It is in the substrate-field coupling, in the bioelectric blueprint Levin is mapping, in the morphic-resonance carrier Sheldrake proposed, in the maternal cytoplasm and the microbial inheritance the body receives at birth. All of these are layers of the inheritance the receiver acquires; the genome is one layer among several.

The trilogy's framework also makes a further claim: improvements at the group/population level — what one might once have called "evolution" — can run through any of these layers, not only through DNA-sequence change. Sheldrake's morphic resonance hypothesis predicts that behaviours learned by one cohort become measurably easier to learn for later cohorts of the same species. Epigenetic inheritance demonstrably transmits acquired states across at least one generation in humans (more cleanly in rodents and plants). Microbial inheritance carries acquired immune and metabolic states forward. Bioelectric pattern, once established, propagates through regenerating tissue and through development. None of these requires waiting for the slow timescale of DNA-sequence evolution.

If this picture is right — or even if a fraction of it is right — then the unit of inheritance is the receiver-as-coupling, not the genome alone, and improvements happen at the field level as much as at the sequence level. The implications for medicine, education, ethics, and political philosophy are not small. They are exactly the implications the trilogy's voluntarist wager (see the free-will companion essay) takes for granted: that what we are is shaped at multiple layers, that we are responsible for what we add to each of them, and that the inheritance we leave is not only genetic.

7. Where this leaves the trilogy's project

Several specific touchpoints. Ciarai's neural augmentation in Anima (Chapter VI, "The Membrane") is the case-file dramatization of an intervention that operates above the genome — technology editing the substrate, not the sequence. Alma's substrate transition from pure-computational in San Francisco to biocomputational in Seattle, in Numen, is the same move in a hybrid case: change the substrate, change what couples. Luz Paz's nanoassembler in Fragile Light, working at molecular scale on patterns the genome does not specify, is the third instance. In each book the trilogy is dramatising the same architectural fact: the genes are necessary, the substrate above them is where the additional work happens, and consciousness lives in the coupling rather than in the sequence.

The disappointment of genes is, in this reading, not a defeat for biology but a clarification of what biology is actually about. The parts list is the parts list. The organism is the field-coupling assembled around the parts. That is the receiver-model picture, and the contemporary biological evidence keeps moving in its direction.

Reading list

The genome itself

International Human Genome Sequencing Consortium, Initial sequencing and analysis of the human genome, Nature 409 (2001), and Finishing the euchromatic sequence of the human genome, Nature 431 (2004). The HGP papers themselves.

GENCODE / Ensembl current annotations — for the up-to-date protein-coding gene count (~19,000–20,500).

Missing heritability

Brendan Maher, Personal genomes: The case of the missing heritability, Nature 456 (2008): 18–21. The piece that named the problem.

Teri Manolio et al., Finding the missing heritability of complex diseases, Nature 461 (2009): 747–753. The formalisation.

Peter Visscher, Naomi Wray et al., 10 Years of GWAS Discovery: Biology, Function, and Translation, The American Journal of Human Genetics 101 (2017). The decade-after review.

Layers above the genome

Michael Levin and colleagues — see the bioelectricity companion page for the curated list of peer-reviewed sources.

Rupert Sheldrake — see the morphic-resonance companion page.

Bas Heijmans et al., Persistent epigenetic differences associated with prenatal exposure to famine in humans, PNAS 105 (2008): 17046–17049. The Dutch Hunger Winter epigenetics paper.

Genome size paradox

The "C-value paradox" was named by C. A. Thomas Jr. in 1971. For an accessible contemporary survey see T. Ryan Gregory (ed.), The Evolution of the Genome (Elsevier Academic Press, 2005).

This page is part of the Reading companion essays. For the bioelectric empirical floor, see Michael Levin & the bioelectric blueprint; for the morphic-resonance framing, Morphic resonance; for the receiver-model architecture, Information as the foundation and The hard problem, re-stated; for the synthesis, The Evidence.
