=head1 NAME Bioscripts 1.4 - Bioperl scripts =head1 SYNOPSIS A list of the scripts in the Bioperl package =head1 DESCRIPTION These scripts have been contributed by the developers and users of Bioperl. They are organized into directories roughly mirroring those in the Bioperl Bio/ directory. There are 2 directories for these scripts, scripts/ and examples/. The scripts in scripts/ are production quality scripts that have POD documentation and accept command-line arguments, the scripts in examples/ are useful examples of Bioperl code. You can install the scripts in the scripts/ directory if you'd like, simply follow the instructions on 'make install'. The installation directory is specified by the INSTALLSCRIPT variable in the Makefile, the default directory is /usr/bin. Installation will copy the scripts to the specified directory, change the 'PLS' suffix to 'pl' and prepend 'bp_' to the script name if it isn't so named already. Please contact bioperl-l at bioperl.org if you are interested in contributing your own script. =head1 PRODUCTION SCRIPTS =head2 scripts/install_bioperl_scripts.PLS This script installs scripts from the scripts/ directory on 'make install'. A fully-featured script that uses Bio::Biblio, a module for accessing and querying bibliographic repositories like Medline. =head2 scripts/biblio/biblio.PLS A fully-featured script that uses Bio::Biblio, a module for accessing and querying bibliographic repositories like Medline. =head2 scripts/Bio-DB-GFF/bulk_load_gff.PLS This script loads a mySQL Bio::DB::GFF database with the features contained in a list of GFF files, it cannot do incremental loads. =head2 scripts/Bio-DB-GFF/bp_genbank2gff.PLS This script loads a Bio::DB::GFF database with the features contained in a either a local Genbank file or an accession that is fetched from Genbank. =head2 scripts/Bio-DB-GFF/fast_load_gff.PLS This script does a rapid load of a mySQL Bio::DB::GFF database using files as source. Probably only works in Unix as it relies on pipes. =head2 scripts/Bio-DB-GFF/generate_histogram.PLS Create a GFF-formatted histogram of the density of the indicated set of feature types. =head2 scripts/Bio-DB-GFF/load_gff.PLS This script loads a mySQL Bio::DB::GFF database with the features contained in a list of GFF files. This script will work with all database adaptors supported by Bio::DB::GFF (mySQL, Oracle, Postgres). =head2 scripts/Bio-DB-GFF/pg_bulk_load_gff.PLS Bulk-load a PostgreSQL Bio::DB::GFF database from GFF files. =head2 scripts/Bio-DB-GFF/process_gadfly.PLS Transforms Gadfly GFF files into correct format. =head2 scripts/Bio-DB-GFF/process_ncbi_human.PLS Trnasforms NCBI's chromosome annotations into correct format. =head2 scripts/Bio-DB-GFF/process_sgd.PLS Transform SGD format annotations into GFF format. =head2 scripts/Bio-DB-GFF/process_wormbase.PLS Transforms Wormbase's GFF files into correct format. Requires Ace.pm. =head2 scripts/DB/bioflat_index.pl Create or update a biological sequence database indexed with the Bio::DB::Flat indexing scheme. =head2 scripts/DB/flanks.PLS Fetch a sequence, find the sequences flanking a variant or SNP in the sequence given its position. =head2 scripts/DB/biofetch_genbank_proxy.PLS A CGI scripts that queries NCBI's eutils to provide database access according to the BioFetch protocol. Requires Cache::FileCache. =head2 scripts/DB/biogetseq.PLS Sequence retrieval using the OBDA registry. =head2 scripts/graphics/feature_draw.PLS Script that accepts files in GFF or tab-delimited format and creates corresponding PNG image files. See L and L for more information. =head2 scripts/graphics/frend.PLS Create a PNG file on the Web using Bio::Graphics - accepts a file containing sequence and feature coordinates. =head2 scripts/graphics/search_overview.PLS Create a simple overview graphic of the hits, color is based on the hit score much like the NCBI overview graphic in a BLAST report. =head2 scripts/index/bp_fetch.PLS Fetch sequences from local indexed database or over the network and reformat using Bio::Index* and Bio::DB*. =head2 scripts/index/bp_index.PLS Indexes local databases, partners with bp_fetch.pl. =head2 scripts/seq/extract_feature_seq.PLS Extract the sequence for a specified feature type. =head2 scripts/popgen/composite_LD.PLS An easy way to calculate composite linkage disequilibrium (LD). =head2 scripts/popgen/heterogeneity.PLS A test for distinguishing between selection and population expansion. =head2 scripts/searchio/filter_search.PLS Simple script to filter by SearchIO criteria and print. =head2 scripts/seq/seqconvert.PLS Bioperl sequence format converter. =head2 scripts/seq/split_seq.PLS Split a sequence in a file into chunks of equal size. =head2 scripts/seq/translate_seq.PLS A simple Bioperl translator. =head2 scripts/seqstats/aacomp.PLS Prints out the count of amino acids over all protein sequences in the input file. =head2 scripts/seqstats/chaos_plot.PLS Produce a PNG or JPEG chaos plot given a DNA sequence using GD.pm. =head2 scripts/seqstats/gccalc.PLS Prints out the GC content for every nucleotide sequence in the input file. =head2 scripts/seqstats/oligo_count.PLS Use this script to determine what primers would be useful for frequent priming of nucleic acid for random labeling. =head2 scripts/taxa/local_taxonomydb_query.PLS Script that accesses a local Taxonomy database and retrieves species or taxon ids. =head2 scripts/taxa/taxid4species.PLS Retrieve the NCBI Tada ID for a given species. =head2 scripts/tree/blast2tree.PLS Builds a phylogenetic tree based on a sequence search (Fasta, BLAST, HMMER). =head2 scripts/utilities/bp_mrtrans.PLS Perl implementation of Bill Pearson's mrtrans to project protein alignment back into cDNA coordinates. =head2 scripts/utilities/bp_nrdb.PLS Make a non-redundant database based on sequence, not id. Requires Digest::MD5. =head2 scripts/utilities/bp_sreformat.PLS Perl implementation of Sean Eddy's sreformat, a sequence and alignment converter. =head2 scripts/utilities/dbsplit.PLS Splits one or more sequence files into subfiles with specified numbers of sequences, any sequence format. =head2 scripts/utilities/mask_by_search.PLS Masks parts of a sequence based on a significant matches to that sequence as contained in a SearchIO-compatible report file. =head2 scripts/utilities/mutate.PLS Randomly mutagenize a single protein or DNA sequence. Specify percentage mutated and number of resulting mutagenized sequences. =head2 scripts/utilities/pairwise_kaks.PLS Takes DNA sequences as input, aligns them as proteins, projects the alignment back into DNA and estimates the Ka (non-synonymous) and Ks (synonymous) substitutions. =head2 scripts/utilities/remote_blast.PLS This script executes a remote Blast search using RemoteBlast. See L for more information. =head2 scripts/utilities/search2BSML.PLS Turns SearchIO-compatible reports into a BSML report. =head2 scripts/utilities/search2alnblocks.PLS Turns SearchIO-compatible reports into alignments in formats supported by AlignIO. =head2 scripts/utilities/search2tribe.PLS This script will turn a protein SearchIO-compatible report (BLASTP, FASTP, SSEARCH) into a Markov Matrix for TribeMCL clustering. =head2 scripts/utilities/search2gff.PLS Turn SearchIO parseable reports(s) into a GFF report. =head2 scripts/utilities/seq_length.PLS Reports the total number of residues and total number of individual sequences in a specified sequence database file. =head1 EXAMPLE SCRIPTS =head2 examples/align/align_on_codons.pl Aligns nucleotide sequences based on codons in a specified reading frame. =head2 examples/align/aligntutorial.pl Examples using EMBOSS, pSW, Clustalw, TCoffee, and Blast to align sequences. =head2 examples/align/clustalw.pl A demonstration of the various uses of Alignment::Clustalw. See L for more. =head2 examples/align/simplealign.pl A script that demonstrates some uses of AlignIO. Please see L for more information. =head2 examples/biblio/biblio.pl A script that shows how to query bibliographic databases, such as Medline, using ids, keywords, and other fields. See L for details. =head2 examples/biblio/biblio_soap.pl Connect to and test a SOAP server using a Bio::Biblio object. =head2 examples/biographics/all_glyphs.pl Creates an image showing all possible glyphs. =head2 examples/biographics/dynamic_glyphs.pl Creates a complex image of a gene with confirmed and predicted exons, transcripts, and text labels. =head2 examples/biographics/lots_of_glyphs.pl Creates a complex image of a gene with confirmed and predicted exons, transcripts, and text labels. =head2 examples/bioperl.pl A Bioperl shell! =head2 examples/cluster/dbsnp.pl How to parse a dbsnp XML file. See L for details. =head2 examples/contributed/nmrpdb_parse.pl Extracts individual conformers from an NMR-derived PDB file. =head2 examples/contributed/prosite2perl.pl Convert Prosite motifs to Perl regular expressions. =head2 examples/contributed/rebase2list.pl Script to convert rebase file to format compatible with Bio::Tools::RestrictionEnzyme. =head2 examples/contributed/expression_analysis* A set of scripts that accept microarray data as input and perform statistical analyses, including t test, U test, Mann-Whitney, and Pearson correlation coefficent. =head2 examples/db/dbfetch Creates a Web page to query a local SRS server and fetch sequences. =head2 examples/db/est_tissue_query.pl Fetch EST sequences from local files or Genbank filtered by tissue using Bio::DB* or Bio::Index*. =head2 examples/db/gb2features.pl Shows how to extract all the features from a Genbank file. See L for more information on features. =head2 examples/db/getGenBank.pl Retrieving Genbank entries over the Web using DB::GenBank. See L for more information. =head2 examples/db/get_seqs.pl Fetches and formats sequences from GenBank, EMBL, or SwissProt over the network using Bio::DB*. =head2 examples/db/gff/* Scripts that reformat sequence to GFF and load GFF format files into an indexed database - see L for more information. =head2 examples/db/rfetch.pl A script that uses Bio::DB::Registry to retrieve sequences from EMBL, reformat them, and print them. See L. =head2 examples/db/use_registry.pl Script that shows how to use Bio::DB::Registry, part of Bioperl's integration with OBDA, the Open Bio Database Access registry scheme. See L for more information. =head2 examples/exceptions/* Scripts that demonstrate how to throw and catch Error.pm objects. =head2 examples/generate_random_seq.pl Writes random RNA, DNA, or protein sequence of given length. =head2 examples/biographics/render_sequence.pl This scripts fetchs a sequence from a remote database, extracts its features (CDS, gene, tRNA), and creates a graphic representation of the sequence in PNG or GIF format. See L and L. =head2 examples/liveseq/change_gene.pl A script showing how to use LiveSeq::Mutator and LiveSeq::Mutation. Please see L and L for more information. =head2 examples/longorf.pl A script that finds the longest ORF in one or more nucleotide sequences. =head2 examples/make_mrna_protein.pl Translate a cDNA or ORF to protein using Bio::Seq's translate() method. =head2 examples/make_primers.pl Design PCR primers given a sequence and the positions of the start and stop codons in the sequence's ORF. =head2 examples/popgen/parse_calc_stats.pl Shows how to read data from a Bio::PopGen::IO object. =head2 examples/rev_and_trans.pl Examples using Bio::Seq.pm for reversing and translating sequences. See L for more information. =head2 examples/revcom_dir.pl Eeturn reverse complement sequences of all sequences in the current directory and save them in the same directory. =head2 examples/sirna/rnai_finder.cgi CGI script for RNAi reagent design. See L for more information. =head2 examples/root_object/* Scripts that demonstrate uses of the Bio::Root modules. =head2 examples/searchio/blast_example.pl Print out all parsed values from a BLAST report. =head2 examples/searchio/custom_writer.pl Demonstrates how to extract data from BLAST reports and output as tab-delimited data. =head2 examples/searchio/hitwriter.pl Demonstrates how to extract data from BLAST reports and output as tab-delimited data. =head2 examples/searchio/hspwriter.pl Demonstrates how to extract data from BLAST reports and output as tab-delimited data. =head2 examples/searchio/htmlwriter.pl Demonstrates how to extract data from BLAST reports and output as HTML. =head2 examples/searchio/psiblast_features.pl Illustrates how to grab a set of SeqFeatures from a Psiblast report. =head2 examples/searchio/psiblast_iterations.pl Demonstrates the use of a SearchIO parser for processing the iterations within a PSI-BLAST report. =head2 examples/searchio/rawwriter.pl Shows how to print out raw BLAST alignment data for each HSP. =head2 examples/searchio/resultwriter.pl Demonstrates how to extract data from BLAST reports and output as tab-delimited data. =head2 examples/searchio/waba2gff.pl Convert raw WABA output to one type of GFF. =head2 examples/seq/* Example code for working with multiple sequence files, including formatting, printing, and filtering based on length or description or ID. =head2 examples/seq/extract_cds.pl Extract the CDS features from a Genbank file. =head2 examples/seqstats/aacomp.pl Calculate amino acid composition of a protein using Tools::CodonTable and Tools::IUPAC. See L and L for more information. =head2 examples/structure/struct_example* Scripts that show how to examine details of the 3D structure of a protein by parsing a PDB file. See L for more information. =head2 examples/subsequence.cgi CGI script to fetch a sequence from Genbank and extract a subsequence using DB::GenBank. See L. =head2 examples/tk/gsequence.pl Create a Protein Sequence Control Panel GUI with Gtk. =head2 examples/tk/hitdisplay.pl Create a GUI for displaying Blast results using Tk::HitDisplay. Please see L for more information. =head2 examples/tools/gb_to_gff.pl Extracts top-level sequence features from Genbank-formatted sequence files using Tools::GFF. See L. =head2 examples/tools/gff2ps.pl Takes an input file in GFF format and draws its genes and features as Postscript using Tools::GFF. See L. =head2 examples/tools/parse_codeml.pl Script that parses output from codeml, one of the PAML programs. See L. =head2 examples/tools/psw.pl Example code for using the XS extensions for comparing proteins using Smith-Waterman. =head2 examples/tools/restriction.pl Example code for using the RestrictionEnzyme module. See L for more information (note that Bio::Tools::RestrictionEnzyme has been superceded by Bio::Restriction::*). =head2 examples/tools/run_genscan.pl Run GENSCAN on multiple sequences and create output sequence files using Tools::Genscan. Please see L for more information. =head2 examples/tools/seq_pattern.pl A script that shows how to use sequences as regular expressions using Tools::SeqPattern. Please see L for more information. =head2 examples/tools/standaloneblast.pl The many uses of StandAloneBlast, including BLAST and PSIBLAST. =head2 examples/tools/state-machine.pl A demonstration of how to create a state machine using StateMachine::AbstractStateMachine. Please see L for more information. =head2 examples/tools/test-genscan.pl Script for testing and demonstrating Bio::Tools::Genscan. =head2 examples/tree/paup2phylip.pl Convert a PAUP tree block to Phylip format. =cut