Module:Bio::Tools::RNAMotif
Introduction
This is a module for parsing RNAMotif output, returning Bio::SeqFeature::Generic objects. RNAMotif [rnamotif1, rnamotif2, rnamotif3] is a program that searches DNA sequences for patterns matching a description of a particular RNA structure (including simple stem-loops, complex multiloop structures, pseudoknots, triple helices, and G-quartets). This description, called the descriptor, is stored in a separate text file and is passed to the program when initiating a search.
The Descriptor
The descriptor file normally has three sections: a parameters section (parms), a descriptor section (descr), and a scoring section (score). The descriptor syntax allows for enormous flexibility when searching for motifs. Simple motifs, which could potentially return thousands of hits, can be further screened based upon the score. The score itself can be based on several criteria, including free-energy rules [energy1, energy2], presence or absence of conserved sequences, and user-defined values; any value returned from this section is output to STDOUT. The included sprintf() function also allows the output of more complex data. This flexibility in scoring can be a disadvantage, since results obtained with different scoring systems are not directly comparable.
More information can be found on the Case lab website and in the manual, included with the source code.
Below is an example descriptor file using all three sections. Hits whose free energy is greater than emax (-12) are rejected; for the remaining hits, the sprintf() function returns the free energy, the length of the helix, and the nucleotide sequence of the internal tetraloop, separated by commas:
    parms
        emax = -12;
    descr
        h5( minlen=4, maxlen=7 )
        ss( len=4 )
        h3
    score
        { sc1 = efn( h5[1], h3[NSE] );
          if( sc1 > emax )
              REJECT;
          sc2 = length( h5[1] );
          SCORE = sprintf( '%.3f,%d,%s', sc1, sc2, ss[2] );
        }
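To make the logic of the score section easier to follow, here is a minimal Python sketch that mirrors it. This is an illustration only, not part of RNAMotif or of this module: the function name `score_hit` and its arguments are hypothetical, and the free energy is supplied as an input because efn() itself is not reimplemented here.

```python
from typing import Optional

EMAX = -12.0  # mirrors `emax = -12;` in the parms section


def score_hit(energy: float, helix_5prime: str, loop: str,
              emax: float = EMAX) -> Optional[str]:
    """Return the SCORE string, or None where the descriptor would REJECT.

    `energy` stands in for the value that efn() would compute;
    it is an input here, not calculated.
    """
    if energy > emax:  # same test as `if( sc1 > emax ) REJECT;`
        return None
    helix_len = len(helix_5prime)  # same as `sc2 = length( h5[1] );`
    # Same format string as the sprintf() call in the descriptor.
    return "%.3f,%d,%s" % (energy, helix_len, loop)


print(score_hit(-12.5, "gggtgg", "gcga"))  # passes the energy cutoff
print(score_hit(-9.8, "cccgg", "gaaa"))    # rejected: energy above emax
```

The first call yields the same comma-separated string that appears in the score column of a hit line; the second returns None because the energy is above the cutoff.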
Here's some raw output:
    #RM scored
    #RM descr h5 ss h3
    #RM dfile sprintf.descr
    >gi|173609|gb|M28984|ACARRDX A.castellani 5S ribosomal RNA
    gi|173609|gb|M28984|ACARRDX -12.500,6,gcga 0 81 16 gggtgg gcga ccaccc
    >gi|1236163|gb|L41047|ANNRRO Actinoplanes sp. ribosomal RNA (rRNA)
    gi|1236163|gb|L41047|ANNRRO -15.400,6,gaaa 0 110 16 ccccgg gaaa ccgggg
    >gi|1236163|gb|L41047|ANNRRO Actinoplanes sp. ribosomal RNA (rRNA)
    gi|1236163|gb|L41047|ANNRRO -12.100,5,gaaa 0 111 14 cccgg gaaa ccggg
    >gi|173741|gb|M83548|AQF16SRRN Aquifex pyrophilus 16S ribosomal RNA (16S rRNA)
    gi|173741|gb|M83548|AQF16SRRN -15.400,6,gaaa 0 154 16 ccccgg gaaa ccgggg
    ...
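The layout of this output can be illustrated with a short standalone Python sketch that splits each hit line into fields. It is not the Bio::Tools::RNAMotif implementation; the names `Hit` and `parse_rnamotif` are hypothetical. Two assumptions are made: the three integers after the score are read as strand, start position, and match length (the length is consistent with the summed element lengths in the sample, but the strand reading is not stated on this page), and the score field contains no whitespace, as is the case for the sprintf() format above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Hit:
    seq_id: str
    description: str
    score: str           # raw SCORE string, e.g. "-12.500,6,gcga"
    strand: int          # assumption: 0 is taken to mean the sense strand
    start: int
    length: int
    elements: List[str]  # one sequence per descriptor element (h5, ss, h3)


def parse_rnamotif(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one Hit per hit line, skipping '#RM' headers and stray lines."""
    description = ""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#RM"):
            continue
        if line.startswith(">"):
            # Definition line: ">id free-text description"
            parts = line[1:].split(None, 1)
            description = parts[1] if len(parts) > 1 else ""
            continue
        fields = line.split()
        if len(fields) < 5:
            continue  # e.g. a trailing "..." in a truncated listing
        yield Hit(fields[0], description, fields[1], int(fields[2]),
                  int(fields[3]), int(fields[4]), fields[5:])


sample = """\
#RM scored
#RM descr h5 ss h3
#RM dfile sprintf.descr
>gi|173609|gb|M28984|ACARRDX A.castellani 5S ribosomal RNA
gi|173609|gb|M28984|ACARRDX -12.500,6,gcga 0 81 16 gggtgg gcga ccaccc
>gi|1236163|gb|L41047|ANNRRO Actinoplanes sp. ribosomal RNA (rRNA)
gi|1236163|gb|L41047|ANNRRO -15.400,6,gaaa 0 110 16 ccccgg gaaa ccgggg
"""

hits = list(parse_rnamotif(sample.splitlines()))
for hit in hits:
    # The score is whatever the descriptor's sprintf() produced; here it is
    # "energy,helix length,loop sequence", so it splits on commas.
    energy, helix_len, loop = hit.score.split(",")
    print(hit.seq_id, hit.start, float(energy), int(helix_len), loop)
```

Because the score column is defined entirely by the descriptor, the comma split in the final loop is specific to this example and would need to change for a different score section.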
References
<biblio>
- rnamotif1 pmid=11713323
- rnamotif2 pmid=11522828
- rnamotif3 pmid=12702810
- energy1 pmid=10329189
- energy2 pmid=8538457
</biblio>