Description & Requirements
Recently, the Somatic Mosaicism across Human Tissues (SMaHT) Network was established by the NIH to catalog somatic genetic variation across human tissues and discover its patterns, causes and consequences. This effort includes all classes of somatic mutation: single nucleotide variants, short insertions and deletions, structural variants and other large chromosomal aberrations. The somatic mutation catalog will enable downstream analyses, such as rates and burdens of mutations across tissues, mutational signature analysis, driver mutations and clonal expansions, and lineage tracing.
The Broad Institute is one of five Genome Characterization Centers (GCCs) within the Network and is tasked with delivering the sequencing data underpinning this somatic mutation catalog. In conjunction with this effort, we are looking for a highly motivated and talented individual with a computational background to join our team and lead data curation and analyses for this ambitious 5-year project. Some of our recent work includes Coorens et al., Nature 2021 and GTEx Consortium, Science 2020.
The successful candidate will join an interdisciplinary team working with an unprecedented set of data from a wide range of human tissues and donors, including extensive deep short- and long-read whole-genome sequencing data as well as RNA and duplex sequencing data. The scope of this project provides unique opportunities for developing novel analytical methods for data QC, integration, detection of somatic mutations, multi-tissue analyses, and integration with transcriptomic data.
Overall Responsibility
The computational scientist will create and oversee the implementation of experimental work plans and of pipelines for data processing, organization, and analysis, and will contribute to budgetary and logistical considerations. In addition, this individual is expected to clearly communicate scientific details, results, and strategic considerations to others within the team and the SMaHT Network at large. This role will require strategic coordination of multiple groups at the Broad Institute (others in the GRO and Cancer Genome Analysis groups) and within the SMaHT Network. This individual will serve as a key contact point for project leaders, international collaborators of the project, funders, and junior staff.
PRINCIPAL DUTIES AND RESPONSIBILITIES
- Design and execute data analysis strategies involving multimodal human tissue datasets, and specifically lead whole-genome sequencing data and somatic mutation analyses.
- Together with others, develop new methodologies and/or evaluate new methods for integrative analysis of genomic data.
- Explore novel data representation with emphasis on integrating diverse data types.
- Apply or develop state-of-the-art computational tools and pipelines to a) assess data quality, b) integrate diverse data types and metadata, and c) detect somatic mutations and perform subsequent downstream analyses.
- Work closely with project leadership and research scientists to design, execute, and analyze experiments critical to the development of single-cell pipelines from primary human tissues.
- Collaborate with internal technology development efforts and scientists towards the application of high-throughput single-cell multiomic mapping strategies to diverse biological models.
- Present ideas and results to the multi-disciplinary members of the SMaHT Network. Prepare written reports and presentations for internal use as well as presentations at conferences.
- Mentor junior staff.
QUALIFICATIONS
- A PhD in Genomics, Computer Science, Physics, Statistics, Math, Engineering, or a related quantitative discipline is required.
- Experience with computational analysis, algorithm development, and statistics.
- Substantial experience analyzing genomic data, preferably (whole-)genome sequencing, is required.
- Proficiency in at least one modern programming language. Experience with a scientific programming environment, such as Python, Julia, R, or MATLAB, is preferred.
- Proficiency in the design and application of analysis pipelines, and the ability to quickly learn and adapt computational tools for novel analyses. Experience with cloud computing is a plus.
- Strong communication skills, with ability to effectively communicate with specialists and non-specialists.
- Background in genetics or biology is a plus.
- All computational scientists at Broad are encouraged to continue developing their expertise by engaging with the wider computational community through Broad's vibrant Models, Inference & Algorithms Initiative (broadinstitute.org/mia).