Yin Shen, PhD
My lab studies the fundamental mechanisms of transcriptional control. We use human pluripotent stem cells to model development and disease, and innovative genomic and genetic tools to investigate how cis-regulatory elements (CREs) affect gene expression. By investigating how genetic and epigenetic variation contributes to CRE function, we aim to develop a detailed understanding of CREs and how they contribute to cell fate determination. These studies will lead to a better understanding of development and will help advance methods for preventing, diagnosing, and treating disease.
- Unlocking cell type-specific mechanisms of gene regulation.
A remarkable feature of multicellular organisms is that they develop a distinct set of highly specialized cells from the same genetic blueprint. This is achieved by precise transcriptional regulation involving the interplay of multiple components, including transcription factors and CREs in the genome. The NIH Roadmap Epigenome and ENCODE projects have mapped millions of putative CREs across more than one hundred different cell types and tissues. While these maps have significantly expanded our knowledge of non-coding sequences, there are still large gaps between having maps of CREs and understanding how these CREs function in gene regulation. These gaps include: (a) few CREs predicted by ChIP-Seq and DNase-Seq experiments have been functionally validated; (b) epigenomic studies do not provide information on the target genes of these putative CREs; (c) no single epigenomic mark can comprehensively capture all CREs in a given cell type. As a result, we do not completely understand the mechanisms of gene regulation mediated by CREs. To address these issues, we are using integrative, unbiased, and high-throughput genomic and genetic tools to pursue the following three specific aims: (1) We are using a high-throughput enhancer reporter assay to functionally characterize putative enhancer elements. (2) Enhancers are only one kind of CRE. By performing a high-throughput genetic screen mediated by CRISPR/Cas9, we aim to comprehensively identify all CREs and concurrently provide additional functional validation of the enhancers identified in Aim 1. (3) Finally, we will capture promoter-CRE interactions in various cell types using a targeted high-throughput chromosome conformation capture (capture Hi-C) method. Collectively, through integrative analysis of data from all three aims and other publicly available epigenomic datasets, we aim to derive general rules that govern the fine-tuning of gene expression.
- Determining functional consequences of neurological disease-associated genetic variation at CREs.
Putative regulatory regions harbor a disproportionately large number of sequence variants associated with human traits and diseases, leading to the notion that genetic lesions in CREs contribute substantially to common human diseases. However, a major bottleneck in testing this hypothesis has been the limited availability of high-throughput means for functionally and quantitatively characterizing the large number of predicted CREs, in particular with regard to their contributions to target gene expression. Consequently, we know little about the consequences of disease-associated genetic variation in CREs, especially for their function in the central nervous system. To begin to address these issues, we will focus on Parkinson’s disease (PD), for which the exact cause is unknown in over 90% of patients. First, we will identify a comprehensive list of candidate PD causal SNPs in regulatory regions (rSNPs) by integrative analysis of genetic data (GWAS SNPs and others in the same linkage disequilibrium block) and epigenomic data (putative CREs). Second, we will perform molecular characterization of PD rSNPs by (a) employing high-throughput reporter assays, and (b) linking the PD rSNPs to their target genes by high-resolution chromosome conformation capture experiments. Third, we will investigate the biological consequences of PD-associated rSNPs in induced pluripotent stem cell (iPSC)-derived PD models. This includes physiological studies of dopaminergic (DA) neurons derived from isogenic iPSC lines with or without rSNPs. Overall, the study will address the important challenge of moving from the identification of disease-associated SNPs by GWAS to a more thorough understanding of the functional consequences of specific variants that cause or contribute to human disease.
Ultimately, the goal of this work is to further integrate genomic approaches into the diagnosis and treatment of PD and other neurological diseases, and to facilitate truly personalized medical treatment for patients.
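As an illustration of the first step of the PD study described above (intersecting candidate SNPs with putative CREs), the following is a minimal sketch, not the lab's actual pipeline. It assumes that the candidate SNPs (GWAS lead SNPs plus variants in the same linkage disequilibrium block) have already been collected upstream, and that CREs are given as BED-style, 0-based, half-open intervals. The function names `build_cre_index` and `snps_in_cres`, and all coordinates and SNP identifiers, are hypothetical and chosen only for the example.

```python
from bisect import bisect_right
from collections import defaultdict


def build_cre_index(cres):
    """Group CRE intervals by chromosome and merge overlapping ones.

    `cres` is an iterable of (chrom, start, end) tuples using 0-based,
    half-open coordinates (an assumption, following the BED convention).
    Returns {chrom: (sorted_starts, matching_ends)}.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in cres:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:
                # Overlapping or touching interval: extend the current one.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def snps_in_cres(snps, index):
    """Return IDs of SNPs whose position falls inside any CRE.

    `snps` is an iterable of (snp_id, chrom, pos) with 0-based positions.
    """
    hits = []
    for snp_id, chrom, pos in snps:
        if chrom not in index:
            continue
        starts, ends = index[chrom]
        # Rightmost merged interval whose start is <= pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < ends[i]:
            hits.append(snp_id)
    return hits


# Toy, made-up data for illustration only.
toy_cres = [("chr4", 100, 200), ("chr4", 150, 300), ("chr12", 500, 600)]
toy_snps = [
    ("snp_a", "chr4", 120),   # inside the merged chr4 interval [100, 300)
    ("snp_b", "chr4", 300),   # just outside (the end is exclusive)
    ("snp_c", "chr12", 599),  # last base of the chr12 interval
    ("snp_d", "chr1", 10),    # chromosome with no CREs
]
candidate_rsnps = snps_in_cres(toy_snps, build_cre_index(toy_cres))
print(candidate_rsnps)
```

In practice, a dedicated interval tool would likely replace this hand-rolled lookup; the sketch only shows the logic of the overlap, and SNPs that pass it would still be candidates requiring the functional characterization described above.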