Leveraging Automated Machine Learning for Environmental Data-Driven Genetic Analysis and Genomic Prediction in Maize Hybrid
Kunhui He, Tingxi Yu, Shang Gao, Shoukun Chen, Liang Li, Xuecai Zhang, Changling Huang, Yunbi Xu, Jiankang Wang, Boddupalli M. Prasanna, Sarah Hearne, Xinhai Li, Huihui Li
Advanced Science; 2025; IF:14.3
DOI:10.1002/advs.202412423
Abstract
Genotype, environment, and genotype-by-environment (G×E) interactions play a critical role in shaping crop phenotypes. Here, a large-scale, multi-environment hybrid maize dataset is used to construct and validate an automated machine learning framework that integrates environmental and genomic data for improved accuracy and efficiency in genetic analyses and genomic predictions. Dimensionality-reduced environmental parameters (RD_EPs) aligned with developmental stages are applied to establish linear relationships between RD_EPs and traits to assess the influence of environment on phenotype. A genome-wide association study identifies 539 phenotypic plasticity trait-associated markers (PP-TAMs), 223 environmental stability TAMs (Main-TAMs), and 92 G×E-TAMs, revealing distinct genetic bases for PP and G×E interactions. Training genomic prediction models with both TAMs and RD_EPs increases prediction accuracy by 14.02% to 28.42% over that of genome-wide marker approaches. These results demonstrate the potential of utilizing environmental data for improving genetic analysis and genomic selection, offering a scalable approach for developing climate-adaptive maize varieties.
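The idea of relating dimensionality-reduced environmental parameters to a trait through a linear model can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state which reduction method or automated machine learning tooling was used, so principal component analysis is assumed here as a stand-in, the data are synthetic, and the function names `reduce_environment` and `fit_linear` are hypothetical.

```python
import numpy as np


def reduce_environment(ep, n_components=2):
    """Reduce an (environments x parameters) matrix with PCA via SVD.

    PCA is an assumption made for illustration; the abstract does not
    name the reduction method behind the RD_EPs.
    """
    centered = ep - ep.mean(axis=0)
    scale = centered.std(axis=0)
    scale[scale == 0] = 1.0  # avoid dividing by zero for constant columns
    z = centered / scale
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


def fit_linear(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    pred = design @ coef
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef[0], coef[1], 1.0 - ss_res / ss_tot


# Synthetic stand-in data: 12 environments, 6 correlated environmental parameters.
rng = np.random.default_rng(0)
n_env, n_ep = 12, 6
latent = rng.normal(size=n_env)  # hidden environmental gradient
loadings = rng.uniform(0.5, 1.5, size=n_ep)
ep = np.outer(latent, loadings) + 0.1 * rng.normal(size=(n_env, n_ep))
trait_mean = 10.0 + 2.0 * latent + 0.2 * rng.normal(size=n_env)

rd_eps, explained = reduce_environment(ep, n_components=2)
intercept, slope, r2 = fit_linear(rd_eps[:, 0], trait_mean)
print(f"variance explained by first component: {explained[0]:.2f}")
print(f"trait ~ RD_EP1: slope={slope:.3f}, R^2={r2:.3f}")
```

In this toy setting the slope of the trait on the first reduced component plays the role of an environmental response; fitting the same regression per hybrid would give one slope per genotype, which is one common way to quantify phenotypic plasticity, though the abstract does not confirm that this is the definition used in the study.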