Professor Steve Hsu, a well-known quantum physicist at Michigan State, ran a lot of studies on the genetic influences on intelligence, both in the US and in China. He concluded that thousands of genes affect intelligence, and also that intelligence is roughly 80% heritable. That seems likely because good health and IQ are definitely correlated, and a great many genes affect health.
Excellent article.
"Here I add to that by arguing that current GWAS studies must be overlooking much of the genetic influence on intelligence. In short, intelligence must be affected by vast numbers of genes, which means that most of them must have very small effects, and current GWAS studies do not have the statistical power to detect these tiny effects."
I believe that is the correct analysis of the situation. The fact that human intelligence manifests as a spectrum shows that there is a large number of genes influencing intelligence.
To add a further confounding factor, intelligence presents itself in different abilities, such as abstraction, logic, reasoning, planning, creativity, critical thinking, and problem-solving.
It seems to me that 80% or more of intelligence is genetic.
Great article! I noticed this comment: "Indeed a more recent and bigger study analysed 3 million genomes and found 3,952 SNPs associated with educational attainment, which together account for 12 to 16% of the variance."
This is slightly misstated as these estimates refer to the predictive performance of a polygenic score (PGS) including weights from the standard panel of ~1.2M HapMap3 common variants (see Methods: Polygenic Prediction and Supplementary Table 3 of the study). Their method of training this PGS attempts to glean signal from the many truly associated SNPs for which low power prevented their reaching strict, genome-wide significance (p < 5e-8) unlike the smaller number of 3,952 lead SNPs. The same applies to the intelligence GWAS mentioned in the preceding paragraph.
If anything, this supports your general point that GWAS/PGS methods are too underpowered and data-limited to precisely estimate heritability to the same extent as do twin studies.
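For anyone who wants to see the arithmetic behind "low power", here is a minimal sketch in Python. It assumes a single SNP tested with a two-sided z-test under the usual large-sample approximation, where the expected z-score is roughly the square root of (sample size × share of variance explained). The function name `gwas_power` and the example effect size (0.001% of variance) are illustrative assumptions, not figures taken from the study discussed above.

```python
from statistics import NormalDist


def gwas_power(n, var_explained, alpha=5e-8):
    """Approximate power of a two-sided z-test for a single SNP.

    n: sample size; var_explained: share of trait variance due to the SNP.
    Uses the large-sample approximation z ~ Normal(sqrt(n * var_explained), 1).
    """
    std_normal = NormalDist()
    z_crit = -std_normal.inv_cdf(alpha / 2)   # about 5.45 for p < 5e-8
    shift = (n * var_explained) ** 0.5        # expected z-score of the SNP
    upper = 1.0 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


if __name__ == "__main__":
    # Hypothetical SNP explaining 0.001% of variance, at three sample sizes.
    for n in (100_000, 1_000_000, 3_000_000):
        print(f"N = {n:>9,}: power = {gwas_power(n, 1e-5):.4f}")
```

Under these assumptions a SNP of that size is essentially undetectable at 100,000 samples and only reaches roughly even odds of passing p < 5e-8 at around 3 million, which is why a PGS that pools sub-threshold SNPs can carry more signal than the lead SNPs alone.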
I think Gusev is being given too much credit. His approach essentially boils down to the cheating spouse caught in flagrante by their partner: 'Who are you going to believe, me or your lying eyes?'
It is based on two prejudices: 1. That intelligence heritability is low or nonexistent. 2. A rationalist faith which supposes reality must be transparent to rational-technical methods. If the molecular models don’t find heritability, then the heritability must not exist. But that reflects the limits of the model, not reality.
Heritability is a statistical property of populations, not a catalogue of specific genes. We will probably never know exactly what drives it. That doesn't mean it isn't as real or as powerful as people observe, or as twin studies suggest.
'We have no good reason to think that twin studies are severely underestimating the heritability of IQ'.
Shouldn't that read 'overestimating'?
Good point
—NC
The article argues that Genome-Wide Association Studies do not currently have sufficient statistical power to estimate the contribution of all genotyped common variants. Had the author read only a few sentences further in my article (https://theinfinitesimal.substack.com/p/no-intelligence-is-not-like-height) they would have found a resolution to this dilemma:
"But prediction accuracy depends on sample size, could the findings drastically change with more samples in the future? In fact, through the magic of statistics, we actually know that this claim will always to be true. We know this because we have estimated a parameter called molecular heritability, which tells us the upper bound on what a genetic predictor could ever achieve ... But for IQ, the direct heritability dropped to 15% (with a wide error bar) and for educational attainment all the way down to 4% (with a narrow error bar). These substantial decreases are the result of some mix of cultural influences, assortative mating, and population structure."
So we already had our answer: molecular methods can estimate the total GWAS heritability without being constrained by statistical power to identify individual effects, and these estimates are *also* much lower than those obtained by twin studies.
Though I wish the author had read the entirety of my article and saved themselves the trouble of writing a response to a point that was already addressed, I do applaud the effort. Noah Carl, whose piece is also cited here, appears to have read none of my article at all! His response immediately shifts to pedigree and adoption studies that were not even mentioned. Thankfully, Vinay Tummarakota at Unboxing Politics has recently done the heavy lifting of sifting through pedigree and adoption studies and demonstrated that Carl's non-response also happens to be incorrect on the merits: adoption and pedigree studies do not provide strong support for twin studies either (https://unboxingpolitics.substack.com/p/contra-scott-alexander-on-missing). I again encourage all interested parties to read carefully, as Tummarakota's piece also addresses concerns about rare and SNP-level variation in the section titled "Relatedness Disequilibrium Regression". Perhaps in the distant future someone around here will have actually read to the end and offered a counterpoint.
We know that children resemble their parents. Adoption studies find that adopted children resemble their biological parents much more than their adoptive parents, and often that they don't resemble their adoptive parents at all. This seems like strong evidence that the resemblance between children and their parents is mostly due to genes.
—NC
Hi Sasha, can you talk us through the method by which you arrive at an upper bound (regardless of statistical power) using GWAS studies? Or point me at somewhere where this is explained?
Sure, some of the methods are described here:
http://gusevlab.org/projects/hsq/#h.gg1hj8vdv5em
There are individual-level based estimators (typically referred to as GREML or GCTA) and there are estimators based on GWAS summary statistics (typically referred to as LDSC). Both of these methods can either be applied to "population level" data to estimate the total proportion of the trait that can be explained by the genotyped SNPs or "family level" data to estimate the *direct* effects that can be explained by the genotyped SNPs (the distinction between population and direct effects is explained in the above link; direct effects are what people intuitively think of as "heritability"). Importantly, all of these methods provide unbiased estimates of the total contribution of all GWAS SNPs (and all other genetic variation they are correlated with) and neither require nor rely on individually-significant associations.
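To make that last point concrete, here is a toy simulation, offered as a sketch under stated assumptions rather than as any of the methods named above. It swaps in Haseman-Elston regression, a simpler method-of-moments relative of GREML, because it fits in a few lines of standard-library Python. Everything in it is hypothetical: 250 unrelated individuals, 80 independent SNPs with equal-sized effects, a true heritability of 0.5, and the function names themselves.

```python
import random
from statistics import NormalDist, fmean, pstdev


def standardize(values):
    """Rescale a list to mean 0 and (population) standard deviation 1."""
    mu, sd = fmean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def simulate(n, m, h2, rng):
    """Toy cohort: m independent SNPs, each with the same tiny effect size."""
    cols = []
    for _ in range(m):
        freq = rng.uniform(0.2, 0.5)  # hypothetical allele frequency
        counts = [(rng.random() < freq) + (rng.random() < freq) for _ in range(n)]
        cols.append(standardize(counts))
    signs = [rng.choice((-1.0, 1.0)) for _ in range(m)]
    genetic = [sum(s * col[i] for s, col in zip(signs, cols)) for i in range(n)]
    genetic = [g * h2 ** 0.5 for g in standardize(genetic)]
    noise = [rng.gauss(0.0, (1.0 - h2) ** 0.5) for _ in range(n)]
    pheno = standardize([g + e for g, e in zip(genetic, noise)])
    return cols, pheno


def he_regression(cols, pheno):
    """Regress phenotype cross-products on genomic relatedness (no intercept)."""
    n, m = len(pheno), len(cols)
    rows = [[col[i] for col in cols] for i in range(n)]  # one row per person
    num = den = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            a_ij = sum(p * q for p, q in zip(rows[i], rows[j])) / m
            num += a_ij * pheno[i] * pheno[j]
            den += a_ij * a_ij
    return num / den


def count_significant(cols, pheno, alpha=5e-8):
    """Count SNPs whose marginal z-score passes genome-wide significance."""
    n = len(pheno)
    z_crit = -NormalDist().inv_cdf(alpha / 2)
    hits = 0
    for col in cols:
        z = sum(x * y for x, y in zip(col, pheno)) / n ** 0.5
        hits += abs(z) > z_crit
    return hits


def run_demo(seed=1, n=250, m=80, h2=0.5):
    """Simulate one cohort; return (heritability estimate, number of hits)."""
    rng = random.Random(seed)
    cols, pheno = simulate(n, m, h2, rng)
    return he_regression(cols, pheno), count_significant(cols, pheno)


if __name__ == "__main__":
    h2_hat, n_hits = run_demo()
    print(f"HE estimate of h2: {h2_hat:.2f}; genome-wide significant SNPs: {n_hits}")
```

In a setup like this each SNP explains well under 1% of the variance, so individual tests should rarely if ever pass p < 5e-8, yet the aggregate estimate should still land in the neighbourhood of the simulated 0.5. The toy leaves out linkage disequilibrium, population structure, and indirect effects, which are exactly the things the real estimators have to handle.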
Hi Sasha, regarding GREML, what do you think of Hill et al. (2018, Molecular Psychiatry, 23, 2347)? Using GREML, it reports a heritability for IQ of 50%. Would you accept this value? This estimate seems to be for only the additive effects of SNPs (am I interpreting that correctly?), so it doesn’t include other forms of genetic variation, and the true heritability would be larger (agreed?). Given that, it seems that the distance from the estimates from twin studies (~ 70%) might not be that big.
This isn't a serious article. It ignores all of the discussion on the problems with twin studies that we've been having on Substack, including the influence of indirect genetic effects, assortative mating, and rGE. It is also teeming with factual errors. As one example, Collier writes: "GWAS studies examine one type of genetic variability, Single Nucleotide Polymorphisms (or SNPs), and they typically record SNPs at 20,000 locations." Collier cites no support for this claim. Contemporary biobanks map out tens of millions of gene variants for use in GWAS. For example, the UK Biobank has mapped out 96 million variants. It’s a ludicrous misrepresentation to say that GWAS are based on 20,000 SNP locations. This is off by more than three orders of magnitude. It’s a terrible indictment of the quality of the editors at Aporia that they would publish an article that makes such an elementary mistake.
https://www.nature.com/articles/s41586-018-0579-z
Collier and seemingly the rest of the hereditarians don’t understand twin studies either. The most basic mistake they make is to think that twin studies give us an estimate of heritability that is independent of the particular population being studied. If you do a study of the heritability of IQ among Chinese twins reared apart, and you find a strong correlation, there's no way of knowing whether the correlation is due to shared genes or shared environment. After all, those twin pairs grew up in massively similar environments. They all grew up in China at exactly the same time and encountered similar health and education systems, similar entertainment options, similar political ideologies. Since all of those environmental features are shared by both twins, twin studies give you no way of measuring their effect on IQ. Twin studies can separate out the effects of environmental features that differ between twins reared apart. You can deduce something about the effect of parental income if those incomes differ or the effect of geography if it differs. But you can't do anything like estimate the full effect of environmental factors because most of them are shared by both twins. So any heritability estimate you get is valid only for a particular population, like Chinese people born in 1954, and has no validity outside of that population. If you don’t believe me, try reading an actual geneticist like K. Paige Harden in *The Genetic Lottery*.
As Lyman Stone argues, there are always environmental influences that get lumped in with genetic factors in twin studies: 'This kind of GxE will almost never be captured by twin studies, because twins always share a birth cohort by definition. They are massively range-restricted in ways it is physically impossible to control for; the nature of being twins means you can’t have variance in things like cohort of birth or nation of origin, which are super important elements of “environment.”' That’s why when you test heritability estimates derived from twin studies, you find they are wrong. Stone provides multiple examples.
https://lymanstone.substack.com/p/more-evidence-twin-studies-are-bad?utm_source=publication-search
Hi Ian,
First, 20,000 SNPs is indeed "typical" for published GWAS studies (though this number is increasing rapidly as technology advances, so at worst the number is out of date). And the number of SNPs used in a given study of a trait is not the same thing as the total number of SNPs mapped by UK Biobank. There are (I think) no estimates of the heritability of IQ that employ 96 million SNPs.
Second, every decent account of twin studies emphasizes that the heritability estimates pertain to the range of environments sampled in the study! This is known, understood and emphasized.
Obviously, if one changes the environment in ways that are not sampled in the study, then the heritability estimate could be changed. This is not in any way a refutation of the values given by twin studies (though it is a feature that does need to be understood and borne in mind).