Module:Bio::Tools::RNAMotif

From BioPerl

Introduction

This is a module for parsing RNAMotif output, returning Bio::SeqFeature::Generic objects. RNAMotif[1][2][3] is a program that searches DNA sequences for a pattern matching the description of a particular RNA structure (including simple stem-loops, complex multiloop structures, pseudoknots, triple helices, and G-quartets). This pattern, called the descriptor, is stored in a separate text file that is passed to RNAMotif when a search is started.

The Descriptor

The descriptor file normally has three sections: a parameters section (parms), a descriptor section (descr), and a scoring section (score). The descriptor syntax allows for enormous flexibility when searching for motifs. Simple motifs, which could potentially return thousands of hits, can be further screened based on the score. The score itself can be based on several criteria, including free-energy rules[4][5], the presence or absence of conserved sequences, and user-defined values; any value returned from this section is output to STDOUT. The included sprintf() function also allows the output of more complex data. This extreme flexibility in scoring can be something of a disadvantage, as approaches that use different scoring systems are not directly comparable.

More information can be found on the Case lab website and in the manual that is included with the source code.

Below is an example descriptor file using all three sections. Hits whose free energy is greater than emax (-12 here) are rejected; for the remaining hits, the sprintf() function returns the free energy, the length of the helix, and the nucleotide sequence of the internal tetraloop, separated by commas:

parms
	emax = -12;

descr
	h5( minlen=4, maxlen=7 )
		ss( len=4 )
	h3

score
	{ sc1 = efn( h5[1], h3[NSE] );
	  if( sc1 > emax )
		REJECT;
	  sc2 = length( h5[1] );
	  SCORE = sprintf( '%.3f,%d,%s', sc1, sc2, ss[2] );
	}

Here's some raw output:

#RM scored
#RM descr h5 ss h3
#RM dfile sprintf.descr
>gi|173609|gb|M28984|ACARRDX A.castellani 5S ribosomal RNA
gi|173609|gb|M28984|ACARRDX -12.500,6,gcga 0      81   16 gggtgg gcga ccaccc
>gi|1236163|gb|L41047|ANNRRO Actinoplanes sp. ribosomal RNA (rRNA)
gi|1236163|gb|L41047|ANNRRO -15.400,6,gaaa 0     110   16 ccccgg gaaa ccgggg
>gi|1236163|gb|L41047|ANNRRO Actinoplanes sp. ribosomal RNA (rRNA)
gi|1236163|gb|L41047|ANNRRO -12.100,5,gaaa 0     111   14 cccgg gaaa ccggg
>gi|173741|gb|M83548|AQF16SRRN Aquifex pyrophilus 16S ribosomal RNA (16S rRNA)
gi|173741|gb|M83548|AQF16SRRN -15.400,6,gaaa 0     154   16 ccccgg gaaa ccgggg
...
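
The layout of the hit lines above can be illustrated with a short, standalone Python sketch. This is not how Bio::Tools::RNAMotif is implemented; it only shows how one hit line breaks down into fields. The names Hit and parse_hits are hypothetical, and the meaning of the three numeric columns (strand, start position, match length) is an assumption based on the sample output, where the length column equals the combined length of the matched pieces.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Hit:
    """One RNAMotif hit line (hypothetical container, for illustration only)."""
    seq_id: str
    score: str          # raw score field, e.g. "-12.500,6,gcga"
    strand: int         # assumed: 0 = sense strand, 1 = antisense strand
    start: int          # assumed: position of the match on the sequence
    length: int         # assumed: total length of the match
    pieces: List[str]   # one string per descriptor element (h5, ss, h3)


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield a Hit for every hit line, skipping headers and definition lines."""
    for line in lines:
        line = line.strip()
        # Skip blank lines, "#RM" header lines, and ">" definition lines.
        if not line or line.startswith("#") or line.startswith(">"):
            continue
        fields = line.split()
        # A hit needs id, score, three numeric columns, and at least one piece.
        if len(fields) < 6:
            continue
        seq_id, score, strand, start, length = fields[:5]
        yield Hit(seq_id, score, int(strand), int(start), int(length), fields[5:])


SAMPLE = """\
#RM scored
#RM descr h5 ss h3
#RM dfile sprintf.descr
>gi|173609|gb|M28984|ACARRDX A.castellani 5S ribosomal RNA
gi|173609|gb|M28984|ACARRDX -12.500,6,gcga 0      81   16 gggtgg gcga ccaccc
>gi|1236163|gb|L41047|ANNRRO Actinoplanes sp. ribosomal RNA (rRNA)
gi|1236163|gb|L41047|ANNRRO -15.400,6,gaaa 0     110   16 ccccgg gaaa ccgggg
"""

hits = list(parse_hits(SAMPLE.splitlines()))

for hit in hits:
    # The example descriptor packs three values into the score with sprintf().
    energy, helix_len, loop = hit.score.split(",")
    print(hit.seq_id, float(energy), int(helix_len), loop, hit.pieces)
```

Because the score field is whatever the descriptor's score section returns, splitting it on commas only works for this particular descriptor; a different sprintf() format would need different handling.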

References

  1. Macke TJ, Ecker DJ, Gutell RR, Gautheret D, Case DA, and Sampath R. RNAMotif, an RNA secondary structure definition and search algorithm. Nucleic Acids Res. 2001 Nov 15;29(22):4724-35. PubMed ID:11713323
  2. Lesnik EA, Sampath R, Levene HB, Henderson TJ, McNeil JA, and Ecker DJ. Prediction of rho-independent transcriptional terminators in Escherichia coli. Nucleic Acids Res. 2001 Sep 1;29(17):3583-94. PubMed ID:11522828
  3. Tsui V, Macke T, and Case DA. A novel method for finding tRNA genes. RNA. 2003 May;9(5):507-17. PubMed ID:12702810
  4. Mathews DH, Sabina J, Zuker M, and Turner DH. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J Mol Biol. 1999 May 21;288(5):911-40. DOI:10.1006/jmbi.1999.2700 | PubMed ID:10329189
  5. Serra MJ and Turner DH. Predicting thermodynamic properties of RNA. Methods Enzymol. 1995;259:242-61. PubMed ID:8538457