Module:Bio::Tools::RNAMotif

From BioPerl

Pdoc documentation: Bio::Tools::RNAMotif CPAN documentation: Bio::Tools::RNAMotif

Introduction

This module parses RNAMotif output and returns Bio::SeqFeature::Generic objects. RNAMotif [rnamotif1, rnamotif2, rnamotif3] is a program that searches DNA sequences for matches to a description of a particular RNA structure (including simple stem-loops, complex multiloop structures, pseudoknots, triple helices, and G-quartets). This description, called the descriptor, is stored in a separate text file and passed to the program when a search is started.

The Descriptor

The descriptor file normally has three sections: a parameters section (parms), a descriptor section (descr), and a scoring section (score). The descriptor syntax allows great flexibility when searching for motifs. Simple motifs, which could return thousands of hits, can be screened further based on the score. The score itself can be based on several criteria, including free-energy rules [energy1, energy2], the presence or absence of conserved sequences, and user-defined values; any value returned from this section is output to STDOUT. The built-in sprintf() function also allows more complex data to be output. This flexibility in scoring has a drawback: results obtained with different scoring systems are not directly comparable.

More information can be found on the Case lab website and in the manual included with the source code.

Below is an example descriptor file using all three sections. Hits with a free energy greater than emax (-12) are rejected. For the remaining hits, the sprintf() function returns the free energy, the length of the helix, and the nucleotide sequence of the internal tetraloop, separated by commas:

parms
	emax = -12;

descr
	h5( minlen=4, maxlen=7 )
		ss( len=4 )
	h3

score
	{ sc1 = efn( h5[1], h3[NSE] );
	  if( sc1 > emax )
		REJECT;
	  sc2 = length( h5[1] );
	  SCORE = sprintf( '%.3f,%d,%s', sc1, sc2, ss[2] );
	}
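
The logic of the score section can be mirrored outside RNAMotif for illustration. The following is a minimal Python sketch, not part of RNAMotif or BioPerl: the function name score_hit is hypothetical, and the free energy is passed in as a number because the sketch does not reimplement efn().

```python
EMAX = -12.0  # mirrors "emax = -12" in the parms section


def score_hit(energy, h5_seq, loop_seq, emax=EMAX):
    """Return the SCORE string for a hit, or None if the hit is rejected.

    energy   -- free energy of the helix (what efn() would return)
    h5_seq   -- sequence of the 5' strand of the helix, h5[1]
    loop_seq -- sequence of the single-stranded loop, ss[2]
    """
    # Same test as "if( sc1 > emax ) REJECT;" in the descriptor
    if energy > emax:
        return None
    # Same format string as the sprintf() call in the descriptor
    return "%.3f,%d,%s" % (energy, len(h5_seq), loop_seq)


print(score_hit(-12.5, "gggtgg", "gcga"))  # kept: -12.5 is not greater than -12
print(score_hit(-10.0, "cccgg", "gaaa"))   # rejected: -10.0 is greater than -12
```

The first call produces the same score string as the first hit in the raw output shown on this page; the second call returns None because the energy exceeds emax.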

Raw output from a search with this descriptor looks like this:

#RM scored
#RM descr h5 ss h3
#RM dfile sprintf.descr
>gi|173609|gb|M28984|ACARRDX A.castellani 5S ribosomal RNA
gi|173609|gb|M28984|ACARRDX -12.500,6,gcga 0      81   16 gggtgg gcga ccaccc
>gi|1236163|gb|L41047|ANNRRO Actinoplanes sp. ribosomal RNA (rRNA)
gi|1236163|gb|L41047|ANNRRO -15.400,6,gaaa 0     110   16 ccccgg gaaa ccgggg
>gi|1236163|gb|L41047|ANNRRO Actinoplanes sp. ribosomal RNA (rRNA)
gi|1236163|gb|L41047|ANNRRO -12.100,5,gaaa 0     111   14 cccgg gaaa ccggg
>gi|173741|gb|M83548|AQF16SRRN Aquifex pyrophilus 16S ribosomal RNA (16S rRNA)
gi|173741|gb|M83548|AQF16SRRN -15.400,6,gaaa 0     154   16 ccccgg gaaa ccgggg
...
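
To make the layout of the hit lines concrete, here is a minimal Python sketch that reads output like the sample above. It is an illustration only, not the parser in Bio::Tools::RNAMotif. The column interpretation (sequence ID, score, strand flag, start position, match length, then one sequence per descriptor element) is an assumption based on the sample; the names Hit and parse_rnamotif are hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    seq_id: str
    score: str          # raw SCORE string, e.g. "-12.500,6,gcga"
    strand: int         # assumed: 0 = sense strand
    start: int          # assumed: position of the match in the sequence
    length: int         # assumed: total length of the match
    elements: List[str] # one sequence per descriptor element (h5, ss, h3)


def parse_rnamotif(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one Hit per hit line, skipping #RM headers and '>' definition lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#RM") or line.startswith(">"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # not a hit line (for example a truncation marker)
        seq_id, score, strand, start, length = fields[:5]
        yield Hit(seq_id, score, int(strand), int(start), int(length), fields[5:])


SAMPLE = """\
#RM scored
#RM descr h5 ss h3
#RM dfile sprintf.descr
>gi|173609|gb|M28984|ACARRDX A.castellani 5S ribosomal RNA
gi|173609|gb|M28984|ACARRDX -12.500,6,gcga 0      81   16 gggtgg gcga ccaccc
>gi|1236163|gb|L41047|ANNRRO Actinoplanes sp. ribosomal RNA (rRNA)
gi|1236163|gb|L41047|ANNRRO -15.400,6,gaaa 0     110   16 ccccgg gaaa ccgggg
"""

for hit in parse_rnamotif(SAMPLE.splitlines()):
    # The score field was built with sprintf('%.3f,%d,%s', ...), so split on commas
    energy, helix_len, loop = hit.score.split(",")
    print(hit.seq_id, float(energy), int(helix_len), loop, hit.start, hit.length)
```

Splitting on whitespace works here because the sprintf() format in the descriptor contains no spaces; a score string with embedded spaces would need a different approach.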

References

<biblio>

  1. rnamotif1 pmid=11713323
  2. rnamotif2 pmid=11522828
  3. rnamotif3 pmid=12702810
  4. energy1 pmid=10329189
  5. energy2 pmid=8538457

</biblio>
