As part of the NIH Common Fund’s Bridge2AI program, the CM4AI data generation project seeks to map the spatiotemporal architecture of human cells and use these maps toward the grand challenge of interpretable genotype-phenotype learning. In genomics and precision medicine, machine learning models are often "black boxes," predicting phenotypes from genotypes without understanding the mechanisms by which such translation occurs.
To address this deficiency, the project will launch a coordinated effort involving three complementary mapping approaches – proteomic mass spectrometry, cellular imaging, and genetic perturbation via CRISPR/Cas9 – creating a library of large-scale maps of cellular structure and function across demographic and disease contexts. These data will broadly stimulate research and development in "visible" machine learning systems informed by multi-scale cell and tissue architecture. In addition to data and tools, this project will implement a standards-based data management approach based on FAIR access and software principles, with deep provenance and replication packages for representation of cell maps and their underlying datasets; initiate a research program in ethical AI, especially as it relates to how maps will be used in genomic medicine and model interpretation; and stimulate a diverse portfolio of training opportunities in the emerging field of biomachine learning.
The collaborative award is $4,894,457 over the project period. The School of Information portion of the award is $333,944.
1OT2OD032742-01