(This article forms a part of the Science for All newsletter that takes the jargon out of science and puts the fun in!)
Deciphering which of an individual’s genes switch on and off, and how, involves mapping their RNA landscape (the messengers necessary to translate gene signals to proteins) to a standard reference. However, such reference templates are themselves frequently short on information, which impedes understanding of gene function.
In a new paper published in the journal Nature Methods, researchers at the University of California, Santa Cruz, have proposed a “pantranscriptome,” which combines a transcriptome and a pangenome — a reference that contains genetic material from a cohort of diverse individuals, rather than just a single linear strand.
RNA’s most commonly recognized function is to translate DNA into proteins. Scientists now understand, however, that the vast majority of RNA does not make proteins and instead plays roles such as influencing cell structure or regulating genes. The entire RNA landscape is known collectively as the transcriptome, and mapping it allows researchers to better understand an individual’s gene expression.
The pantranscriptome concept builds on the emerging field of “pangenomics”. Typically, when evaluating an individual’s genomic data for variation, scientists compare the individual’s genome to a reference made up of a single, linear strand of DNA bases. Using a pangenome allows researchers to compare an individual’s genome to a genetically diverse cohort of reference sequences all at once, sourced from individuals representing a diversity of biogeographic ancestry. This gives scientists more points of comparison with which to understand an individual’s genomic variation.
Mapping RNA sequencing data to understand gene expression can be difficult because RNA sequences are spliced by cellular mechanisms. A single piece of RNA data can therefore come from non-connected areas of the genome, which makes it challenging to align correctly to a reference. These splice sites are not uniform across the human population but vary between individuals. It is also difficult to know which haplotype the RNA comes from, that is, whether the group of genes comes specifically from the set of chromosomes inherited from the individual’s mother or the set inherited from the father.
But with the new pipeline of open-source tools, the researchers can take the spliced segments of an individual’s RNA, map where they align on a pangenome, identify which haplotype the data belong to, and analyse gene expression.
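To see why spliced RNA is hard to place on a reference, consider the toy Python sketch below. It is only an illustration and not the researchers' pipeline: the sequences are invented, the function name `find_spliced_alignment` is hypothetical, and real aligners tolerate mismatches and score many candidate placements. Here a short RNA read spans the junction between two exons, so it never appears as one unbroken stretch in the genome. It can only be placed once it is split into two pieces with a gap (the intron) between them.

```python
# Toy genome: two exons separated by an intron (all sequences are made up).
EXON1 = "ATGGCCTTAC"
INTRON = "GTAAGTCCCCTCACAG"
EXON2 = "CATTGGACGA"
genome = EXON1 + INTRON + EXON2

# After splicing, the intron is removed and the exons are joined.
transcript = EXON1 + EXON2

# A short RNA read that straddles the exon-exon junction.
read = transcript[6:14]  # "TTAC" from exon 1 + "CATT" from exon 2


def find_spliced_alignment(read, genome, min_anchor=3):
    """Try to place the read as two exact pieces separated by a gap.

    Returns (left_pos, right_pos, split) for the first split that works,
    or None. Simplification: only the first occurrence of the left piece
    is considered, and no mismatches are allowed.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        left_pos = genome.find(left)
        if left_pos == -1:
            continue
        # The right piece must start at least one base after the left piece ends.
        right_pos = genome.find(right, left_pos + len(left) + 1)
        if right_pos != -1:
            return left_pos, right_pos, split
    return None


# The unbroken read is nowhere in the genome...
print(genome.find(read))
# ...but it can be placed once it is allowed to skip over the intron.
print(find_spliced_alignment(read, genome))
```

Because splice sites vary between individuals, a single linear reference may lack the sequence around a junction that a particular person carries, which is the gap a pantranscriptome built from many individuals is meant to close.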