Graph Neural Networks for genome enabled prediction of complex traits.

Hounie, Ignacio - Elenter, Juan - Etchebarne, Guillermo - Fariello, María Inés - Lecumberry, Federico

Abstract:

The advent of Graph Neural Network architectures has enabled Deep Learning on non-Euclidean data, with numerous applications both inside and outside genomics. Here we introduce these models in the context of genome-enabled prediction of complex traits. Graph representations of genome-wide marker information can be derived by treating individuals as nodes, giving rise to population graphs in which each genotype is supported on a node. We address graph structures estimated solely from SNP marker data by means of the Genomic Relationship Matrix. That is, we build an association network between individuals using correlations between genotypes. In this scenario we propose a novel neural network architecture supported on these graphs. It leverages both 1D convolutions, which aim to exploit local structures along the genome arising from linkage disequilibrium, and graph neighbourhood aggregation operations, which incorporate population structure. First, low-dimensional embeddings are computed from locally aggregated genotypes; these are then concatenated with embeddings from the target node and fed to a linear predictor. The embeddings are extracted using convolutional and fully-connected layers, and the model is trained end-to-end. To circumvent scalability issues, node neighbourhoods are sampled, which allows training on large graphs. The model was evaluated on Holstein cattle milk yield prediction, where it outperformed state-of-the-art methods. We show that neighbourhood aggregation improves performance, which illustrates the potential of graph-based techniques. To the best of our knowledge, this is the first Geometric Deep Learning approach to this problem.
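The graph construction and sampled neighbourhood aggregation described above can be illustrated with a minimal NumPy sketch. It rests on several assumptions that the abstract does not confirm: genotypes are coded 0/1/2, the Genomic Relationship Matrix follows the common VanRaden form, each node is linked to its k most related individuals, and aggregation is a plain mean over raw genotypes (the learned convolutional embeddings are omitted). All function names and parameter values are hypothetical.

```python
import numpy as np

def genomic_relationship_matrix(genotypes):
    """VanRaden-style GRM from an (individuals x markers) array coded 0/1/2.
    Assumption: the abstract does not state which GRM estimator is used."""
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0              # allele frequency per marker
    z = m - 2.0 * p                       # centre each marker column
    denom = 2.0 * np.sum(p * (1.0 - p))   # scaling by expected variance
    return z @ z.T / denom

def top_k_neighbours(grm, k):
    """Indices of the k most related individuals per node, self excluded.
    Assumption: a k-nearest-neighbour rule defines the graph edges."""
    g = grm.copy()
    np.fill_diagonal(g, -np.inf)          # never pick a node as its own neighbour
    return np.argsort(-g, axis=1)[:, :k]

def sampled_mean_aggregate(genotypes, neighbours, node, n_samples, rng):
    """Mean genotype over a random sample of one node's neighbours."""
    chosen = rng.choice(neighbours[node], size=n_samples, replace=False)
    return np.asarray(genotypes, dtype=float)[chosen].mean(axis=0)

# Toy data: 6 individuals, 20 SNP markers (random, for illustration only).
rng = np.random.default_rng(0)
geno = rng.integers(0, 3, size=(6, 20))

grm = genomic_relationship_matrix(geno)
nbrs = top_k_neighbours(grm, k=3)
agg = sampled_mean_aggregate(geno, nbrs, node=0, n_samples=2, rng=rng)

# Concatenate the target node's genotype with the aggregated neighbourhood,
# standing in for the concatenation of embeddings fed to the linear predictor.
features = np.concatenate([geno[0], agg])
```

Sampling a fixed number of neighbours per node keeps the cost of each aggregation step constant regardless of graph size, which is the scalability argument made in the abstract.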


Bibliographic Details
2021
This work was partially funded by the ANII project FSDA 1-2018-1-154364.
Graphical models
GNN
Genomics
English
Universidad de la República
COLIBRI
https://meetings.cshl.edu/meetings.aspx?meet=PROBGEN&year=21
https://hdl.handle.net/20.500.12008/36832
Open access
Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0)
Summary: The experiments presented in this work were carried out using ClusterUY (site: https://cluster.uy)