Computational Identification of MicroRNA-targeted Nucleotide-binding Site-leucine-rich Repeat Genes in Plants

Abstract

Plant genomes harbor dozens to hundreds of nucleotide-binding site-leucine-rich repeat (NBS-LRR, NBS for short) type disease resistance genes (Shao et al., 2014; Zhang et al., 2015). Proper regulation of these genes is important for normal growth of plants by reducing unnecessary fitness costs in the absence of pathogen infection. Recent studies have revealed that microRNAs are involved in regulation of NBS genes in plants (Zhai et al., 2011; Shivaprasad et al., 2012). This protocol describes computational methods for the genome-wide identification of plant NBS genes potentially regulated by microRNAs.

Equipment

  1. Personal computer (an internet connection is needed) (Intel Core i5-2300 CPU, 8 GB RAM)

Sequence data and software

  1. Sequence data compilation
    The coding sequences (CDS) and protein sequences of the genomes of interest should be downloaded from relevant databases. A recommended database containing a relatively large number of sequenced plant genomes is Phytozome (http://www.phytozome.org/) (Goodstein et al., 2012). MicroRNAs of the genomes of interest can be retrieved from miRBase (http://www.mirbase.org/) (Kozomara and Griffiths-Jones, 2014) or from in-house sequencing data.
  2. Required software and online tools
    The following software should be installed locally on your computer:
    1. Hmmer 3.0 (http://hmmer.janelia.org/) (Johnson et al., 2010), for performing local hidden Markov model (HMM) searches for NBS homologous proteins.
    2. NCBI BLAST+ or NCBI BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi), for performing local blastp searches for NBS homologous proteins.
    3. ActivePerl 5.14.2 (http://www.activestate.com/activeperl/downloads), for running scripts written in Perl.
    Installation and operation manuals for these programs can be downloaded from the websites listed above.
    The online tools to be used are:
    1. COILS (http://www.ch.embnet.org/software/COILS_form.html) (Lupas et al., 1991), a program for identifying coiled-coil (CC) domains in protein sequences.
    2. Pfam (http://pfam.sanger.ac.uk/) (Finn et al., 2014), a database for identifying protein domains.
    3. psRNATarget (http://plantgrn.noble.org/psRNATarget/) (Dai and Zhao, 2011), a web server for predicting microRNA targets in plants.

Procedure

  1. Preparation of the query file and local database
    For a given plant genome of interest, do the following:
    1. Download the CDS and protein sequences of all protein-coding genes from a relevant database such as Phytozome (http://www.phytozome.org/) (Goodstein et al., 2012). Make sure that each gene has the same name in both the CDS and the protein sequence files.
    2. Download the HMM profile and the extended amino acid sequence of the NB-ARC domain (Pfam no. PF00931) from the Pfam database (http://pfam.sanger.ac.uk/) (Finn et al., 2014).
    3. Download all microRNA sequences from a relevant database such as miRBase (http://www.mirbase.org/) (Kozomara and Griffiths-Jones, 2014), or prepare a fasta file containing all the microRNA sequences obtained from in-house experiments.
    4. Create a local database of the downloaded protein sequences for BLAST searches using the formatdb program of the BLAST software (or makeblastdb if BLAST+ is installed) (http://blast.ncbi.nlm.nih.gov/Blast.cgi).
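
    The requirement in step A1 that each gene carries the same name in the CDS and protein files can be checked before any search is run. The following is a minimal Python sketch, not part of the original protocol; it assumes that the gene name is the first whitespace-delimited word of each fasta header, and the function names are hypothetical.

```python
def read_fasta_ids(lines):
    """Return the set of identifiers (first word after '>') found in fasta lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                ids.add(header.split()[0])
    return ids


def compare_gene_names(cds_lines, protein_lines):
    """Return (names only in the CDS input, names only in the protein input)."""
    cds_ids = read_fasta_ids(cds_lines)
    protein_ids = read_fasta_ids(protein_lines)
    return sorted(cds_ids - protein_ids), sorted(protein_ids - cds_ids)


if __name__ == "__main__":
    # Tiny in-memory example; in practice, pass open file handles instead of lists.
    cds = [">geneA", "ATGGCT", ">geneB", "ATGAAA"]
    proteins = [">geneA", "MA", ">geneC", "MK"]
    only_cds, only_protein = compare_gene_names(cds, proteins)
    print("Only in CDS file:", only_cds)
    print("Only in protein file:", only_protein)
```

    Two empty lists indicate that the two files use the same set of names; any name reported here should be reconciled before proceeding to step B.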


      Figure 1. A flow chart of the steps described in our procedure

  2. Computational identification of NBS genes
    1. Perform the HMM search against the fasta file containing the downloaded protein sequences using the hmmsearch.exe program in the hmmer 3.0 package (Johnson et al., 2010), with the HMM profile of the NB-ARC domain as the query and default settings.
    2. Run a local BLASTp search against the protein database created in procedure step A4, using the amino acid sequence of the NB-ARC domain as the query. Set the threshold expectation value to 1.0, as in previous studies (Li et al., 2010; Shao et al., 2014).
    3. Retrieve the gene names of potential NBS genes from the results of the HMM search and the BLAST search, and combine them to obtain the maximal number of hits.
    4. Use these gene names to retrieve the corresponding protein sequences from the protein dataset downloaded in procedure step A1. This step can be done manually if only a few NBS genes are found in the genome. For large datasets, we recommend writing a Perl script for this step (a script is also available from the authors upon request).
    5. Subject the retrieved sequences to the online Pfam analysis to verify that they indeed possess the NBS domain, with the E-value cutoff set to 10⁻⁴ (Figure 2A). Discard sequences that do not have a detectable NB-ARC domain. The remaining sequences represent all NBS proteins of the dataset.
    6. The Pfam analysis is also used to detect whether these proteins have an N-terminal Toll/interleukin-1 receptor (TIR) domain or RESISTANCE TO POWDERY MILDEW8 (RPW8) domain, or a C-terminal LRR domain (Meyers et al., 2003). Protein sequences without an N-terminal domain detectable by Pfam are further analyzed using the COILS Server (http://www.ch.embnet.org/software/COILS_form.html) (Lupas et al., 1991) with default settings to detect whether they have a coiled-coil (CC) domain at the N-terminus.
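
    Steps B3 and B4 (merging the two hit lists and pulling out the matching protein sequences) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' Perl script: it assumes that hmmsearch was run with the `--tblout` option, so that the target name is the first column and comment lines start with `#`, and that the BLAST results were saved in the 12-column tabular format, where column 2 is the subject name and column 11 the E-value. All function names are hypothetical.

```python
def hmmsearch_tblout_targets(lines):
    """Target names from an hmmsearch --tblout table (first column)."""
    targets = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        targets.add(line.split()[0])
    return targets


def blast_tabular_subjects(lines, max_evalue=1.0):
    """Subject names from 12-column tabular BLAST output, kept if E-value <= max_evalue."""
    subjects = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if float(fields[10]) <= max_evalue:
            subjects.add(fields[1])
    return subjects


def extract_sequences(fasta_lines, wanted_ids):
    """Return {name: sequence} for fasta records whose first header word is wanted."""
    records = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            current = name if name in wanted_ids else None
            if current is not None:
                records[current] = ""
        elif current is not None:
            records[current] += line
    return records


if __name__ == "__main__":
    # Made-up rows for illustration only; real tables come from the two searches.
    hmm_rows = ["# target name ...", "geneA - NB-ARC PF00931 1e-50 170.2 0.1"]
    blast_rows = ["NB-ARC\tgeneB\t35.0\t200\t120\t5\t1\t200\t10\t205\t0.5\t40.1"]
    candidates = hmmsearch_tblout_targets(hmm_rows) | blast_tabular_subjects(blast_rows)
    proteins = [">geneA", "MAKL", ">geneB", "MKT", ">geneC", "MSS"]
    for name, seq in sorted(extract_sequences(proteins, candidates).items()):
        print(">" + name)
        print(seq)
```

    Taking the union of the two sets mirrors the goal of step B3, which is to maximize the number of candidate hits before the Pfam verification in step B5 removes false positives.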


      Figure 2. Screenshots for (A) step B5 and (B) step C2

  3. Identification of NBS genes potentially targeted by microRNAs
    1. To predict NBS genes targeted by microRNAs, retrieve the CDS sequences of identified NBS genes by searching gene names from the downloaded CDS dataset.
    2. Submit the sequences corresponding to CDS of identified NBS genes and to microRNAs to the psRNATarget Server (http://plantgrn.noble.org/psRNATarget/) (Dai and Zhao, 2011) in fasta format for microRNA target prediction (Figure 2B).
      Note: At this step, researchers can change the parameter settings to restrict or expand the number of predicted targets. For example, setting the Maximum expectation (transformed from mismatch penalty) to 3 yields fewer hits with a lower false-positive rate, whereas setting it to 5 maximizes the number of potential targets at the cost of a higher false-positive rate.
    3. Download the prediction results from the psRNATarget Server and retrieve those NBS genes predicted to be targeted by microRNAs.
    4. The predicted regulation of NBS genes by microRNAs can be validated experimentally by co-expressing each microRNA and its predicted target in tobacco leaves, as described in several studies (Liu et al., 2014; Yu and Pilot, 2014).
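
    Step C3 and the effect of the Maximum expectation setting described in the note to step C2 can be illustrated with a short Python sketch. It assumes that the downloaded psRNATarget result is a tab-delimited table with a header row; the column names used as defaults below (`miRNA_Acc.`, `Target_Acc.`, `Expectation`) are assumptions and should be adjusted to the header of the actual file. The function name and the example rows are hypothetical.

```python
import csv


def targeted_genes(lines, max_expectation=3.0,
                   mirna_col="miRNA_Acc.", target_col="Target_Acc.",
                   expectation_col="Expectation"):
    """Map each target gene to the sorted microRNAs predicted to target it.

    Only predictions with an expectation value <= max_expectation are kept.
    Lines starting with '#' are treated as comments and skipped.
    """
    data_lines = (line for line in lines if not line.startswith("#"))
    hits = {}
    for row in csv.DictReader(data_lines, delimiter="\t"):
        if float(row[expectation_col]) <= max_expectation:
            hits.setdefault(row[target_col], set()).add(row[mirna_col])
    return {gene: sorted(mirnas) for gene, mirnas in hits.items()}


if __name__ == "__main__":
    # Made-up rows for illustration only.
    table = [
        "miRNA_Acc.\tTarget_Acc.\tExpectation",
        "mirX\tgeneA\t2.5",
        "mirY\tgeneA\t4.0",
        "mirX\tgeneB\t4.5",
    ]
    print("Cutoff 3:", targeted_genes(table, max_expectation=3.0))
    print("Cutoff 5:", targeted_genes(table, max_expectation=5.0))
```

    Filtering the same result file at both cutoffs gives a quick view of which NBS genes are predicted only under the relaxed setting and therefore carry a higher risk of being false positives.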

Acknowledgments

This protocol is adapted from Shao et al. (2014). This work was supported by the National Natural Science Foundation of China (30930008, 31170210, 91231102 and 31400201) and the China Postdoctoral Science Foundation (2013M540435 and 2014T70503).

References

  1. Dai, X. and Zhao, P. X. (2011). psRNATarget: a plant small RNA target analysis server. Nucleic Acids Res 39(Web Server issue): W155-159.
  2. Finn, R. D., Bateman, A., Clements, J., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Heger, A., Hetherington, K., Holm, L., Mistry, J., Sonnhammer, E. L., Tate, J. and Punta, M. (2014). Pfam: the protein families database. Nucleic Acids Res 42(Database issue): D222-230.
  3. Goodstein, D. M., Shu, S., Howson, R., Neupane, R., Hayes, R. D., Fazo, J., Mitros, T., Dirks, W., Hellsten, U., Putnam, N. and Rokhsar, D. S. (2012). Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res 40(Database issue): D1178-1186.
  4. Johnson, L. S., Eddy, S. R. and Portugaly, E. (2010). Hidden Markov model speed heuristic and iterative HMM search procedure. BMC Bioinformatics 11: 431.
  5. Kozomara, A. and Griffiths-Jones, S. (2014). miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res 42(Database issue): D68-73.
  6. Li, J., Ding, J., Zhang, W., Zhang, Y., Tang, P., Chen, J. Q., Tian, D. and Yang, S. (2010). Unique evolutionary pattern of numbers of gramineous NBS-LRR genes. Mol Genet Genomics 283(5): 427-438.
  7. Liu, Q., Wang, F. and Axtell, M. J. (2014). Analysis of complementarity requirements for plant microRNA targeting using a Nicotiana benthamiana quantitative transient assay. Plant Cell 26(2): 741-753.
  8. Lupas, A., Van Dyke, M. and Stock, J. (1991). Predicting coiled coils from protein sequences. Science 252(5009): 1162-1164.
  9. Meyers, B. C., Kozik, A., Griego, A., Kuang, H. and Michelmore, R. W. (2003). Genome-wide analysis of NBS-LRR-encoding genes in Arabidopsis. Plant Cell 15(4): 809-834.
  10. Shao, Z. Q., Zhang, Y. M., Hang, Y. Y., Xue, J. Y., Zhou, G. C., Wu, P., Wu, X. Y., Wu, X. Z., Wang, Q., Wang, B. and Chen, J. Q. (2014). Long-term evolution of nucleotide-binding site-leucine-rich repeat genes: understanding gained from and beyond the legume family. Plant Physiol 166(1): 217-234.
  11. Shivaprasad, P. V., Chen, H. M., Patel, K., Bond, D. M., Santos, B. A. and Baulcombe, D. C. (2012). A microRNA superfamily regulates nucleotide binding site-leucine-rich repeats and other mRNAs. Plant Cell 24(3): 859-874.
  12. Yu, S. and Pilot, G. (2014). Testing the efficiency of plant artificial microRNAs by transient expression in Nicotiana benthamiana reveals additional action at the translational level. Front Plant Sci 5: 622.
  13. Zhai, J., Jeong, D. H., De Paoli, E., Park, S., Rosen, B. D., Li, Y., Gonzalez, A. J., Yan, Z., Kitto, S. L., Grusak, M. A., Jackson, S. A., Stacey, G., Cook, D. R., Green, P. J., Sherrier, D. J. and Meyers, B. C. (2011). MicroRNAs as master regulators of the plant NB-LRR defense gene family via the production of phased, trans-acting siRNAs. Genes Dev 25(23): 2540-2553.
  14. Zhang, Y. M., Shao, Z. Q., Wang, Q., Hang, Y. Y., Xue, J. Y., Wang, B. and Chen, J. Q. (2015). Uncovering the dynamic evolution of nucleotide-binding site-leucine-rich repeat (NBS-LRR) genes in Brassicaceae. J Integr Plant Biol. (Epub ahead of print)

Copyright: © 2015 The Authors; exclusive licensee Bio-protocol LLC.
Citation: Readers should cite both the Bio-protocol article and the original research article where this protocol was used:
  1. Shao, Z., Zhang, Y., Wang, B. and Chen, J. (2015). Computational Identification of MicroRNA-targeted Nucleotide-binding Site-leucine-rich Repeat Genes in Plants. Bio-protocol 5(21): e1637. DOI: 10.21769/BioProtoc.1637.
  2. Shao, Z. Q., Zhang, Y. M., Hang, Y. Y., Xue, J. Y., Zhou, G. C., Wu, P., Wu, X. Y., Wu, X. Z., Wang, Q., Wang, B. and Chen, J. Q. (2014). Long-term evolution of nucleotide-binding site-leucine-rich repeat genes: understanding gained from and beyond the legume family. Plant Physiol 166(1): 217-234.