The paper “Genome-wide prediction and functional characterization of the genetic basis of autism spectrum disorder” is published in Nature Neuroscience.
ABSTRACT: Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder with a strong genetic basis. Yet, only a small fraction of potentially causal genes (about 65 genes out of an estimated several hundred) are known with strong genetic evidence from sequencing studies. We developed a complementary machine-learning approach based on a human brain-specific gene network to present a genome-wide prediction of autism risk genes, including hundreds of candidates for which there is minimal or no prior genetic evidence. Our approach was validated in a large independent case–control sequencing study. Leveraging these genome-wide predictions and the brain-specific network, we demonstrated that the large set of ASD genes converges on a smaller number of key pathways and developmental stages of the brain. Finally, we identified likely pathogenic genes within frequent autism-associated copy-number variants and proposed genes and pathways that are likely mediators of ASD across multiple copy-number variants. All predictions and functional insights are available at http://asd.princeton.edu.
A preprint describing this work is currently available on bioRxiv.
Supplemental tables from the manuscript are available below for download. Supplemental Figures and Notes are available here (from bioRxiv).
Table S1: Training gold standard. [Download]
Our training gold standard consisted of known ASD-associated genes (with varying levels of evidence E1-4) as positives and non-mental-health-related genes as negatives. The positives are listed along with their evidence level and source database.
Table S2: Top 20 biological processes enriched in our SVM model for predicting ASD-genes. [Download]
We analyzed our ASD-gene prediction model to identify which biological processes and pathways contribute most to associating a gene with ASD in the brain-specific network. The table contains the top 20 statistically enriched Gene Ontology biological processes among the genes that are most highly “weighted” by the model, i.e., those associated with the highest feature weights in our SVM model. The most informative genes in our ASD network-based model are strongly enriched for neurological processes, providing insight into the general underlying processes that may be driving our predictions.
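As a rough illustration of what "statistically enriched" can mean for a gene set, the sketch below computes a one-sided hypergeometric P value for the overlap between a selected gene list (for example, the most highly weighted genes) and the genes annotated to one biological process. This is an assumption for illustration only: this page does not state which enrichment test was used in the paper, and the function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_annotated, n_selected, n_overlap):
    """One-sided P(X >= n_overlap) under the hypergeometric distribution.

    n_universe  : total number of genes considered
    n_annotated : genes in the universe annotated to the process
    n_selected  : size of the selected list (e.g. top-weighted genes)
    n_overlap   : selected genes that are also annotated
    """
    max_overlap = min(n_annotated, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers (not from the paper): 20 genes, 5 annotated, 4 selected, 3 shared.
p_toy = hypergeom_enrichment_p(20, 5, 4, 3)
print(f"enrichment P value: {p_toy:.4f}")
```

In practice one such P value would be computed per biological process and then corrected for multiple testing before ranking the processes.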
Table S3: Genome-wide prediction of ASD-associated genes. [Download]
The predicted ASD-association ranking of all genes in the genome is listed along with detailed information on their gold standard status, prediction score, prediction probability, prediction P and Q values, and membership in ASD-related gene sets.
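For readers unfamiliar with Q values, the sketch below shows one common way to derive them from a list of P values, the Benjamini-Hochberg procedure. This is an assumption for illustration: this page does not say which method produced the Q values in Table S3, and the function name and example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (Q values), in input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so each Q value is no larger
    # than the Q value of any less significant gene.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Toy P values (not from the paper).
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(q, 4) for q in example_q])
```

A Q value can be read as the smallest false discovery rate at which the corresponding gene would still be called significant.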
Table S4: Targets of de novo mutations identified by exome sequencing of the Simons Simplex Collection. [Download]
Genes harboring de novo likely-gene-disrupting (LGD; also known as loss-of-function) or synonymous (SYN) mutations identified in autistic children (probands; prb) and their unaffected siblings (sib) are listed separately.
Table S5: ASD-association of brain developmental gene-expression signatures. [Download]
All signatures that are significantly enriched among the top-ranked ASD genes are listed here along with the number of genes in each signature and their enrichment scores.
Table S6: ASD-associated functional modules in the brain-specific network. [Download]
The nine modules of top-ranked ASD genes, each tightly connected in the brain-specific network, are presented here with information about their module/cluster membership, connectivity within each cluster, and enriched biological processes in each cluster.
Table S7: Prioritization of genes within ASD-associated CNVs. [Download]
The table contains the complete ASD ranking of genes within each of eight autism-associated CNVs along with details on previous genetic or functional evidence for the connection of individual CNV-genes to ASD.
Table S8: Functional analysis of ASD-associated CNVs. [Download]
Results from the functional analysis of top-ranked genes in the eight ASD-associated CNVs are presented here, with details on the specific ‘intermediate’ genes and processes that connect the CNV genes to the molecular phenotype of autism.
Table S9: Detailed functional, developmental, and CNV information for our top-decile genes. [Download]
The top 2,500 ASD candidate genes are listed along with their functional module memberships, spatiotemporal developmental gene-expression patterns, and CNV membership.
Download code for genome-wide autism gene prediction with user-defined gold standards.