Published: September 20, 2020, Vol. 10, Iss. 18 DOI: 10.21769/BioProtoc.3757 Views: 5868
Reviewed by: Imre Gáspár, Shyam Solanki, Adam Idoine
Abstract
Gene transcription in bacteria often starts some nucleotides upstream of the start codon. Identifying the specific Transcriptional Start Site (TSS) is essential for genetic manipulation, because the region upstream of the start codon often contains sequence elements involved in the regulation of gene expression. Taking into account the classical gene structure, we can distinguish two kinds of transcriptional start site: primary and secondary. A primary transcriptional start site is located some nucleotides upstream of the translational start site, while a secondary transcriptional start site is located within the coding sequence of the gene.
Here, we present a step-by-step protocol for genome-wide determination of transcriptional start sites by differential RNA-sequencing (dRNA-seq), using the enteric pathogen Shigella flexneri serotype 5a strain M90T as a model. The method can, however, be employed in any other bacterial species of choice. In the first steps, total RNA is purified from bacterial cultures using the hot phenol method. Ribosomal RNA (rRNA) is specifically depleted via hybridization probes using a commercial kit. An RNA library treated with a 5′-monophosphate-dependent exonuclease (TEX), and therefore enriched in primary transcripts, is then prepared for comparison with a library that has not undergone TEX treatment. This is followed by ligation of an RNA linker adaptor of known sequence, which allows the determination of TSS with single-nucleotide precision. Finally, the RNA is processed for Illumina sequencing library preparation and sequenced as a purchased service. TSS are identified by in-house bioinformatic analysis.
Our protocol is cost-effective as it minimizes the use of commercial kits and employs freely available software.
Background
Transcription in bacteria is initiated by the RNA polymerase holoenzyme, which contains a bound sigma factor and recognizes specific sequence elements on the DNA within the promoter region (Feklistov et al., 2014). This RNA polymerase holoenzyme binding site defines the Transcriptional Start Site and the direction of transcription. For example, the most common housekeeping sigma factor, named σ70 in Escherichia coli, recognizes two elements centered approximately 10 and 35 bp upstream of the TSS (Feklistov et al., 2014). The RNA polymerase holoenzyme melts the double-stranded DNA from 11 nt upstream (position -11) to 3 nt downstream (+3) of the TSS (+1), and the single-stranded DNA can then be used as a template for the addition of tri-phosphorylated ribonucleotides. Initiation starts mainly at a specific position, but sometimes “wobbles” of one or more bases up- or downstream are encountered (Murakami and Darst, 2003; Robb et al., 2013; Vvedenskaya et al., 2015). The DNA sequences around TSS have long been recognized as crucial for gene regulation in bacteria (Jacob and Monod, 1961). Depending on the position within the gene structure, which begins with a start codon (usually ATG) and finishes with one of the three stop codons, we can identify two types of transcriptional start sites: primary and secondary. Primary transcriptional start sites (pTSS) are located some nucleotides upstream of the translational start site, while secondary transcriptional start sites (sTSS) are located within the coding sequence of the gene (Figure 1).
Figure 1. Schematic representation of the Primary and Secondary Transcriptional Start Site definition
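The primary/secondary distinction can be expressed as a simple coordinate check. The following is a minimal sketch, not the classification code used in this protocol: the `Gene` record, the `classify_tss` function, and the 300-nt `max_upstream` cutoff are all hypothetical names and values chosen for illustration, and coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; start/end are 1-based, inclusive CDS bounds."""
    name: str
    start: int   # leftmost coordinate of the coding sequence
    end: int     # rightmost coordinate of the coding sequence
    strand: str  # "+" or "-"


def classify_tss(tss_pos: int, tss_strand: str, gene: Gene,
                 max_upstream: int = 300) -> Optional[str]:
    """Classify one TSS relative to one gene.

    Returns "primary" if the TSS lies upstream of the translational start
    (within the assumed max_upstream window), "secondary" if it lies inside
    the coding sequence, and None otherwise.
    """
    if tss_strand != gene.strand:
        return None  # this sketch ignores antisense TSS

    # A TSS anywhere inside the coding sequence counts as secondary here,
    # including one on the first nucleotide of the start codon.
    if gene.start <= tss_pos <= gene.end:
        return "secondary"

    # Distance upstream of the translational start, measured along the strand.
    if gene.strand == "+":
        distance = gene.start - tss_pos
    else:
        distance = tss_pos - gene.end

    if 0 < distance <= max_upstream:
        return "primary"
    return None


gene = Gene("geneA", start=1000, end=1900, strand="+")
print(classify_tss(950, "+", gene))   # upstream of the start codon
print(classify_tss(1200, "+", gene))  # inside the coding sequence
```

The upstream window is a design choice: without it, any same-strand TSS upstream of a gene would be called primary regardless of distance, so the cutoff should be adapted to the organism and annotation at hand.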
Until the advent of next-generation sequencing, locating the TSS of a specific RNA required examining each transcript individually, using either the S1 protection assay, primer extension or a 5’ RACE method (Sharma and Vogel, 2014). Owing to the increasing popularity and decreasing cost of high-throughput sequencing, differential RNA-seq (dRNA-seq) was developed in 2010 to simultaneously map all TSS of a genome, using Helicobacter pylori as the first model organism (Sharma et al., 2010). Since then, this method has been widely employed to determine the TSS of several bacterial species (Berghoff et al., 2009; Jager et al., 2009; Albrecht et al., 2010 and 2011; Bohn et al., 2010; Irnov et al., 2010; Schluter et al., 2010; Sharma et al., 2010; Beckmann et al., 2011; Deltcheva et al., 2011; Filiatrault et al., 2011; Mitschke et al., 2011a and 2011b; Kroger et al., 2012 and 2013; Madhugiri et al., 2012; Ramachandran et al., 2012 and 2014; Sahr et al., 2012; Schmidtke et al., 2012; Wilms et al., 2012; Cortes et al., 2013; Dugar et al., 2013; Mentz et al., 2013; Nickel et al., 2013; Pfeifer-Sancar et al., 2013; Porcelli et al., 2013; Schluter et al., 2013; Voss et al., 2013; Wiegand et al., 2013; Zhang et al., 2013; Voigt et al., 2014; Cervantes-Rivera et al., 2020).
Primary transcripts of prokaryotes carry a triphosphate at their 5’-ends. In contrast, processed or degraded RNAs carry only a monophosphate at their 5’-ends. This is also the case for ribosomal RNA (rRNA) (Schoenberg, 2007). The dRNA-seq approach used here exploits the properties of a 5’-monophosphate-dependent exonuclease (TEX) to selectively degrade processed transcripts, thereby enriching for unprocessed RNA species carrying a native 5’-triphosphate (Schoenberg, 2007). TSS can then be identified by comparing TEX-treated and untreated RNA-seq libraries, where TSS appear as localized coverage maxima that are enriched upon TEX treatment (Sharma et al., 2010).
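The comparison logic can be illustrated with a short sketch. It is not the in-house analysis of this protocol nor the algorithm of any published tool: the function name `call_tss_candidates`, the thresholds, and the pseudocount are hypothetical, and the two input lists are assumed to hold per-position counts of read 5’ ends on one strand, already normalized to comparable sequencing depth.

```python
from typing import List, Sequence


def call_tss_candidates(tex_plus: Sequence[float],
                        tex_minus: Sequence[float],
                        min_starts: float = 10.0,
                        min_enrichment: float = 2.0,
                        pseudocount: float = 1.0) -> List[int]:
    """Return 0-based positions that look like TSS candidates.

    A position is reported when its TEX-treated signal (a) reaches
    min_starts, (b) is a local maximum relative to its two neighbours,
    and (c) is enriched over the untreated library by min_enrichment.
    All thresholds are illustrative assumptions.
    """
    if len(tex_plus) != len(tex_minus):
        raise ValueError("both libraries must cover the same positions")

    n = len(tex_plus)
    candidates: List[int] = []
    for i in range(n):
        signal = tex_plus[i]
        if signal < min_starts:
            continue

        # Local maximum: strictly above the left neighbour, at least as high
        # as the right one, so a plateau is reported once at its first position.
        left = tex_plus[i - 1] if i > 0 else 0.0
        right = tex_plus[i + 1] if i < n - 1 else 0.0
        if signal <= left or signal < right:
            continue

        # The pseudocount avoids division by zero where the untreated
        # library has no reads.
        enrichment = (signal + pseudocount) / (tex_minus[i] + pseudocount)
        if enrichment >= min_enrichment:
            candidates.append(i)
    return candidates


tex_treated = [0, 2, 50, 3, 0, 30, 1]
untreated = [0, 1, 10, 2, 0, 40, 1]
print(call_tss_candidates(tex_treated, untreated))  # prints [2]
```

In this toy example, position 2 is a peak that is enriched after TEX treatment and is reported, whereas position 5 is a peak that is depleted after TEX treatment, as expected for a processing site, and is not.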
Until 2013, TSS annotation was performed manually, which is arduous and time-consuming. Nowadays, many computational tools are available for automatic TSS annotation using dRNA-seq data. These include TSSPredator (Dugar et al., 2013), TSSAR (Amman et al., 2014), TruHMM (Li et al., 2013), TSSer (Jorjani and Zavolan, 2014) and ReadXplorer2 (Hilker et al., 2016).
Here, we present a step-by-step protocol for TSS determination through comparison of TEX-treated and untreated RNA libraries in Shigella flexneri serotype 5a strain M90T, as originally performed by Cervantes-Rivera et al. (2020). The overall workflow is illustrated in Figure 2.
Figure 2. Workflow of dRNA-seq for whole-genome Transcriptional Start Sites identification
Materials and Reagents
Equipment
Software
Programs
All programs used in this protocol are freely available
Databases
Procedure
Article Information
Copyright
© 2020 The Authors; exclusive licensee Bio-protocol LLC.
How to cite
Cervantes-Rivera, R. and Puhar, A. (2020). Whole-genome Identification of Transcriptional Start Sites by Differential RNA-seq in Bacteria. Bio-protocol 10(18): e3757. DOI: 10.21769/BioProtoc.3757.
Category
Microbiology > Microbial genetics > Gene expression
Microbiology > Microbial genetics > RNA
Systems Biology > Genomics > Sequencing