ERPIN
ERPIN is Daniel Gautheret's and André Lambert's suite of programs (erpin) for searching for RNA structural motifs using secondary structure profiles (SSPs). Its main advantage over descriptor-based programs such as RNAMotif is that it computes E-values for matches.
It is considered to be under active development, so any future BioPerl support will be experimental.
It can be found here.
Raw Output Format
Training set: "tbox.epn":
140 sequences of length 97
Cutoff: 0.00 15.00
Database: "B_sub.fas"
4214574 nucleotides to be processed in 1 sequence
ATGC ratios: 0.282 0.283 0.217 0.218
E-value at cutoff 15.0 for 4.2Mb double strand data: 4.20e+00
>gi|50812173|ref|NC_000964.2| Bacillus subtilis subsp. subtilis str. 168, complete genome
FW 1 20780..20845 44.01 9.62e-09
GTTTTCAA.TCAGGG.TGGCAAC.GCGAGA.gc------------.TCTCGT.CCCTTT.atggggatgagggctc------------------.TTTTTATTT
FW 2 112686..112757 36.80 4.25e-05
CTTTTCAA.ACAGAG.TGGAACC.GCGCGG.ttaaa---------.GCGTCT.CTGTCA.tgtttacatgcagagacgc---------------.TTTTTTTAT
FW 3 277013..277073 36.72 4.56e-05
GCTGTTAA.TAAAGG.TGGTACC.GCGAGA.ccc-----------.TCGTCC.TTTGCA.taggacggggg-----------------------.TTTTTTGT-
FW 4 1377705..1377775 45.16 1.54e-09
GCGTTCAA.TCAAGG.TGGTACC.ACGGAA.accca---------.TTTCGT.CCTTAT.gaatcaggatgaaatggg----------------.TTTTTTTAT
FW 5 1612561..1612629 46.76 8.26e-11
TTTTTCTA.AAAGGG.TGGTACC.GCGAGA.taagctt-------.TCTCGT.CCCTTA.tgggatgagagggc--------------------.TTTTTTTAT
FW 6 2472254..2472316 44.19 7.31e-09
TGTGCTAA.TGAGGG.TGGTACC.GCGAAC.ct------------.TTTCGT.CCTTTA.cgtgatgaaaagg---------------------.TTTTTTGTT
FW 7 3946092..3946165 31.51 2.37e-03
GTCTTCAA.CCAGGG.TGGTACC.GCGTGC.attgagccacg---.TCCCTT.ATTGGG.atgggctcttttttgtg-----------------.TTTGTA-A-
>gi|50812173|ref|NC_000964.2| Bacillus subtilis subsp. subtilis str. 168, complete genome
RC 1 1218449..1218517 40.41 1.11e-06
TGGAATAA.TCAGGG.TGGTACC.ACGGTT.catt----------.CGTCCC.TTTTTT.acaggggaagaatgagcc----------------.TTTTTT-AT
RC 2 2607929..2608008 43.70 1.54e-08
CTCAGCAA.CTAGGG.TGGAACC.GCGGGA.gaac----------.TCTCGT.CCCTAT.gtttgcggctggcaagcatagagacgggag----.TTTTTTG--
RC 3 2800080..2800160 44.88 2.43e-09
GCCCGTAA.TCAGGG.TGGTACC.GCGAGA.cagc----------.TCTCGT.CCCTGT.gtaaacgttggtttgcatagggggagggc-----.TTTTTTGCT
RC 4 2817094..2817169 42.69 6.46e-08
TATTCCAA.CTAGGG.TGGCACC.ACGGGT.ataac---------.TCTCGT.CCCTAC.tatcatgtatagtaggggcgggag----------.TTTTTTTC-
RC 5 2868569..2868636 42.06 1.49e-07
TTCATGAA.AAAAGG.TGGTACC.GCGAAA.gagct---------.TTTCGT.CCTTTT.acagggatgaagagctc-----------------.TTTTTT-C-
RC 6 2896140..2896207 34.68 2.48e-04
GCCGTAAA.CAAGGG.TGGTACC.GCGGAA.agaaaagcct----.TTTCGC.CCCTTT.tagctatcgcag----------------------.TTACT-GC-
RC 7 2929584..2929652 31.83 1.91e-03
GTCTGAAA.TAAGGG.TGGTACC.GCGGCC.acaactcgtc----.CCTTGT.ACAAGG.gacgggtttttt----------------------.TTATTTTC-
RC 8 2960251..2960324 42.11 1.40e-07
TGCGGAAA.AAAGGG.TGGAACC.ACGATT.ccgtttattcaa--.CCTCGT.CCCTTT.catagggggcgggg--------------------.TTTTTATAT
RC 9 3036940..3037006 37.30 2.70e-05
TCTTATTA.GTAGGG.TGGTACC.GCGATA.atcaat--------.CGTCCC.TTCGTG.taaacgaaggggcg--------------------.TTTTTT-AT
RC 10 3104180..3104252 46.93 5.85e-11
CGCGCTAA.CGAGGG.TGGTACC.GCGGGA.aaacgaaagtc---.TCTCGT.CCCTTT.ttgggatgagggagt-------------------.TTTTTTTA-
RC 11 3490315..3490403 43.60 1.79e-08
AGAATCAA.CAAGAG.TGGTACC.GCGGTC.agccgaaggct---.CGTCGT.CTCTTT.atctattagattaggtaggagacggcgggc----.TTTTTTGTT
RC 12 3855224..3855296 44.61 3.79e-09
GTCGGGAA.CTTGGG.TGGAACC.ACGGGT.taatcacac-----.ACTCGT.CCCTAT.ctgcgggacgggtgtg------------------.TTTTTTTAT
RC 13 3855477..3855549 44.61 3.79e-09
GTCGGGAA.CTTGGG.TGGAACC.ACGGGT.taatcacac-----.ACTCGT.CCCTAT.ctgtgggacgggtgtg------------------.TTTTTTTAT
RC 14 3855732..3855804 43.74 1.44e-08
ATCGGGAA.TTTGGG.TGGAACC.ACGGAT.gatcaacac-----.ATTCGT.CCCTTT.tagagggatgggtgtg------------------.TTTTTTTAT
-------- at level 1 --------
8429234 bases processed
cutoff: 0.00
13 config. per site
84448 hits
-------- at level 2 --------
cutoff: 15.00
120 config. per site
21 hits
21 independent hits
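The hit records in the raw output above follow a regular two-line pattern: a line giving the strand (FW or RC), hit number, coordinates, score and E-value, followed by the aligned sequence. The following is a minimal Python sketch of how such records could be read. It is not part of ERPIN or BioPerl; the names `ErpinHit`, `HIT_RE` and `parse_erpin` are hypothetical, and the field meanings are inferred from the sample output rather than from an official format specification.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class ErpinHit(NamedTuple):
    """One hit record; field meanings are inferred from the sample output."""
    seq_id: Optional[str]  # first word of the preceding ">" header line
    strand: str            # "FW" or "RC", as printed by erpin
    index: int             # hit number within the strand block
    start: int
    end: int
    score: float
    evalue: float
    alignment: str         # aligned sequence line that follows the hit line


# Assumed layout of a hit line: strand, number, start..end, score, E-value.
HIT_RE = re.compile(r"^(FW|RC)\s+(\d+)\s+(\d+)\.\.(\d+)\s+(\S+)\s+(\S+)$")


def parse_erpin(lines: Iterable[str]) -> List[ErpinHit]:
    """Collect hit records; header and summary lines are otherwise skipped."""
    hits: List[ErpinHit] = []
    seq_id: Optional[str] = None
    pending = None  # regex match for a hit line still waiting for its alignment
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            seq_id = line[1:].split()[0]
            continue
        match = HIT_RE.match(line)
        if match:
            pending = match
            continue
        if pending is not None and line:
            strand, idx, start, end, score, evalue = pending.groups()
            hits.append(ErpinHit(seq_id, strand, int(idx), int(start),
                                 int(end), float(score), float(evalue), line))
            pending = None
    return hits


# Two records copied from the sample output above.
SAMPLE = """\
>gi|50812173|ref|NC_000964.2| Bacillus subtilis subsp. subtilis str. 168, complete genome
FW 1 20780..20845 44.01 9.62e-09
GTTTTCAA.TCAGGG.TGGCAAC.GCGAGA.gc------------.TCTCGT.CCCTTT.atggggatgagggctc------------------.TTTTTATTT
RC 1 1218449..1218517 40.41 1.11e-06
TGGAATAA.TCAGGG.TGGTACC.ACGGTT.catt----------.CGTCCC.TTTTTT.acaggggaagaatgagcc----------------.TTTTTT-AT
-------- at level 1 --------
"""

hits = parse_erpin(SAMPLE.splitlines())
for hit in hits:
    print(hit.strand, hit.start, hit.end, hit.score, hit.evalue)
```

Keeping each hit as a named tuple makes it simple to filter by E-value afterwards, for example `[h for h in hits if h.evalue < 1e-6]`.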
References
<biblio>
- erpin pmid=11700055
</biblio>