MrBayes
MrBayes is software for the Bayesian estimation of phylogeny. The program can be obtained from the MrBayes web site.
Running MrBayes
The program takes as input a multiple alignment in NEXUS format, to which a block of commands intended for MrBayes is appended. The following example is a slice of data from Jason Stajich's research: an alignment of proteins from 47 taxa. The outgroup is set to atha_gbk, the prior on the protein model is set to mixed so that multiple rate matrices are considered, and the MCMC search is run for 1,000,000 generations but will stop early if the chains converge before all generations have been run. It uses two runs of 2 chains each, 4 chains in total, so it will need 4 CPUs.
#NEXUS
begin data;
dimensions ntax=47 nchar=50;
format interleave datatype=protein gap=- ;
matrix
sklu_AUG    RLKYALNGRE VKAIMMQRHV KVDGKVRTDT TYPAGFMDVI TLEATNENFR
sscl_GLEAN  RLKYALNSRE TKAILMQRLI KVDGKVRTDA TYPAGFMDVI GIEKTSENFR
rory_SNAP   RLKYALNGRE VQSILMQRLV KVDGKVRTDS TFPAGFMDVI SVEKTGENFR
ddis_gbk    RLKYALTKKE VTLILMQRLV KVDGKVRTDP NYPAGFMDVI SIEKTKENFR
calb_AUG    RLKYALNGRE VKAIMMQQHV QVDGKVRTDT TYPAGFMDVI TLEATNEHFR
sbay_AUG    RLKYALNGRE VKAILMQRHV KVDGKVRTDT TYPAGFMDVI TLDATNENFR
scer_yjm78  RLKYALNGRE VKAILMQRHV KVDGKVRTDT TYPAGFMDVI TLDATNENFR
scas_AUG    RLKYALNGRE VKAILMQRHV KVDGKVRTDT TYPTGFMDVI TLDATNENFR
cimm_AUG    RLKYALNGRE TNAILMQRLV KVDGKVRTDA TYPAGFMDVI SIEKTGENFR
cneo_R265   RLKYALTGRE VTAIVKQRLI KVDGKVRTDE TFPAGFMDVI SIERSGEHFR
fver_GLEAN  RLKYALNYRE TKAILMQRLV KVDGKVRTDS TYPSGFMDVI TIEKTGENFR
ylip_GENO   RLKYALNGRE VNAILMQRLV KVDGKVRTDS TFPAGFMDVI QLEKTGENFR
skud_AUG    RLKYALNGRE VKAILMQRHV KVDGKVRTDT TYPAGFMDVI TLDATNENFR
scer_rm11   RLKYALNGRE VKAILMQRHV KVDGKVRTDT TYPAGFMDVI TLDATNENFR
pchr_GLEAN  RLKYALTGKE VLSIVMQRLI KVDNKVRTDP TYPAGFMDVI TIEKSGEHFR
umay_BRD    RLKYALTGRE VNAITAQRLI KIDGKVRTDP TYPTGFQDVV SIEKSGEHFR
ater_GLEAN  RLKYALNGRE TKAIMMQRLI KVDGKVRTDP TYPAGFMDVI GIEKTGENFR
cneo_WM276  RLKYALTGRE VTAIVKQRLI KVDGKVRTDE TFPAGFMDVI SIERSGEHFR
cneo_H99    RLKYALTGRE VTAIVKQRLI KVDGKVRTDE TFPAGFMDVI SIERSGEHFR
ctro_AUG    RLKYALNGRE VKAIMMQQHV QVDGKVRTDS TYPAGFMDVI TLEATNEHFR
spom_SANG   RLKYALNGRE VKAILMQRLI KVDGKVRTDS TFPTGFMDVI SVEKTGEHFR
cdub_AUG    RLKYALNGRE VKAIMMQQHV QVDGKVRTDT TYPAGFMDVI TLEATNEHFR
hsap_ens    RLKYALTGDE VKKICMQRFI KIDGKVRTDI TYPAGFMDVI SIDKTGENFR
snod_BRD    RLKYALNARE VNAILMQRLV KVDGKVRTDS TFPSGLMDVI SIEKTGENFR
klac_GENO   RLKYALNGRE VKAILMQRHV KVDGKVRTDT TFPAGFMDVI TLEATNENFR
scer_s288c  RLKYALNGRE VKAILMQRHV KVDGKVRTDT TYPAGFMDVI TLDATNENFR
cgla_GENO   RLKYALNGRE VKAIMMQRHV KVDGKVRTDA TYPAGFMDVI TLEATNENFR
uree_GLEAN  RLKYALNGRE TNAILMQRLV KVDGKVRTDS TFPTGFMDVI SIEKTGENFR
crei_jgi    RLKYALTGKE VQSILMQRLV KVDGKVRTDH TYPTGFMDVI SMEKTDENFR
kwal_AUG    RLKYALNGRE VRAIMMQRHV KVDGKVRTDI TYPAGFMDVI TLEATNENFR
ncra_BRD    RLKYALNYRE TKAIMMQRLI KVDGKVRTDI TYPAGFMDVI TIEKTGENFR
smik_AUG    RLKYALNGRE VKAILMQRHV KVDGKVRTDT TYPAGFMDVI TLDATNENFR
spar_AUG    RLKYALNGRE VKAILMQRHV KVDGKVRTDT TYPAGFMDVI TLDATNENFR
hcap_186R   RLKYALNARE TNAILMQRLV KVDGKVRTDS TYPTGFMDVI TIDKTGENFR
anid_BRD    RLKYALNGRE TKAIMMQRLI QVDGKVRTDP TYPAGFMDVI TIEKTGENFR
fgra_BRD    RLKYALNYRE VKAILMQRLV KVDGKVRTDS TFPSGFMDVI TIEKTGENFR
dhan_GENO   RLKYALNGRE VKAILMQEHV KVDGKVRTDA TFPAGFMDVI TLEATNEHFR
cgui_AUG    RLKYALNGRE VKAILMQEHV KVDGKVRTDS TFPAGFMDVI TLEATNEHFR
pans_AUG    RLKYALNFRE TRAILMQRLV KVDGKVRTDM TYPAGFMDVI SIEKTGENFR
tree_jgi    RLKYALNYRE TKAIMMQRLV KVDAKVRTDI TYPAGFMDVI TIEKTGENFR
cneo_JEC21  RLKYALTGRE VTAIVKQRLI KVDGKVRTDE TFPAGFMDVI SIERSGEHFR
cglo_BRD    RLKYALNYRE TKAIMMQRLV KVDGKVRTDV TYPAGFMDVI TIEKTGENFR
mgri_BRD    RLKYALNGRE TKAILMQRLV KVDGKVRTDS TYPAGFMDVV SIEKTGENFR
agos_GBK    RLKYALNGRE VKAILMQRHV KVDGKVRTDT TYPAGFMDVI TLEATNENFR
clus_AUG    RLKYALNGRE VKAILMQEHV KVDGKVRTDS TYPAGFMDVI TLEATNENFR
atha_gbk    RLKYALTYRE VISILMQRHI QVDGKVRTDK TYPAGFMDVV SIPKTNENFR
afum_GLEAN  RLKYALNGRE TKAIMMQRLI KVDGKVRTDP TYPAGFMDVI SIEKTGENFR
;
end;

begin mrbayes;
outgroup atha_gbk;
prset aamodelpr=mixed;
mcmc ngen=1000000 stoprule=yes nchain=2 nrun=2;
end;
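Because the MrBayes block is plain text appended after the data block, it can be generated rather than typed by hand. The following is a minimal Python sketch of that idea, not part of BioPerl or MrBayes. The function names `mrbayes_block` and `append_block` are hypothetical, and the only commands emitted are the ones used in the example above (outgroup, prset aamodelpr, and mcmc with ngen, stoprule, nchain and nrun).

```python
def mrbayes_block(outgroup, ngen=1000000, nchains=2, nruns=2,
                  aamodelpr="mixed", stoprule=True):
    """Return a MrBayes command block as NEXUS text.

    Hypothetical helper: it only emits the commands shown in the
    example on this page, with the same defaults.
    """
    stop = "yes" if stoprule else "no"
    lines = [
        "begin mrbayes;",
        f"outgroup {outgroup};",
        f"prset aamodelpr={aamodelpr};",
        f"mcmc ngen={ngen} stoprule={stop} nchain={nchains} nrun={nruns};",
        "end;",
    ]
    return "\n".join(lines) + "\n"


def append_block(nexus_text, block):
    """Append a command block after existing NEXUS alignment text."""
    return nexus_text.rstrip() + "\n\n" + block


if __name__ == "__main__":
    # Prints the same command block as the example above.
    print(mrbayes_block("atha_gbk"), end="")
```

Keeping the block generation separate from the alignment text means the same alignment can be reused with different run settings.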
Supporting this in a Run wrapper
MrBayes is not currently supported directly by BioPerl. Suitable NEXUS format files can be written out as input, but the program block (just as with PAUP) still needs to be added by the user. The Run package does not currently support this application either. In practice this is a fairly easy program to pipeline on a cluster: all it needs is the input NEXUS file containing the alignment and the program block, so a run wrapper would just generate this file and start the program with
mb -i FILENAME.nex
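A wrapper of this kind could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not an existing BioPerl or Run package API: the function names are hypothetical, the `mb` executable is assumed to be on the PATH, and running the process in the input file's directory is a design choice made here so that any output lands next to the input.

```python
import subprocess
from pathlib import Path


def write_input(alignment_nexus, command_block, path):
    """Write alignment text plus MrBayes block to `path`; return it as a Path."""
    path = Path(path)
    path.write_text(alignment_nexus.rstrip() + "\n\n" + command_block.rstrip() + "\n")
    return path


def build_command(nexus_path, executable="mb"):
    """Build the argument list for `mb -i FILENAME.nex`."""
    return [executable, "-i", str(nexus_path)]


def run_mrbayes(nexus_path, executable="mb"):
    """Start MrBayes on the file; raises CalledProcessError on a non-zero exit.

    Assumption: the process is run from the input file's directory.
    """
    nexus_path = Path(nexus_path).resolve()
    return subprocess.run(
        build_command(nexus_path.name, executable),
        cwd=nexus_path.parent,
        check=True,
    )
```

Splitting command construction from execution keeps the command line testable without MrBayes installed.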
It gets more complicated when you want to submit MPI jobs on a cluster, so it sometimes makes more sense to write a script which generates the job files for a set of input alignments. It may therefore be hard to write a perfectly generic solution that handles cluster (LSF, PBS, SGE) jobs as well as single-CPU jobs.
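As a rough illustration of such a job-file generator, the Python sketch below produces one shell script per input alignment. Everything in it is an assumption to be checked against local cluster documentation: the function names are hypothetical, the header lines are only the common job-name and slot-count directives for each scheduler, the SGE parallel environment name `mpi` is a placeholder, and the default launcher is simply the `mb -i` command shown above. An MPI launch line is site-specific and would be passed in through `launcher`.

```python
from pathlib import Path

# Job-name and slot-count directives per scheduler.
# Assumed defaults; real clusters usually need more (queue, walltime, ...).
HEADERS = {
    "local": [],
    "pbs": ["#PBS -N {name}", "#PBS -l nodes=1:ppn={cpus}"],
    "sge": ["#$ -N {name}", "#$ -pe {pe} {cpus}"],
    "lsf": ["#BSUB -J {name}", "#BSUB -n {cpus}"],
}


def job_script(nexus_file, scheduler, cpus=4, launcher="mb -i", pe="mpi"):
    """Return the text of a job script for one NEXUS input file."""
    name = Path(nexus_file).stem
    header = [h.format(name=name, cpus=cpus, pe=pe) for h in HEADERS[scheduler]]
    return "\n".join(["#!/bin/sh", *header, f"{launcher} {nexus_file}"]) + "\n"


def job_scripts(nexus_files, scheduler, **kwargs):
    """Map a script file name to script text for each input alignment."""
    return {
        f"{Path(f).stem}.{scheduler}.sh": job_script(f, scheduler, **kwargs)
        for f in nexus_files
    }


if __name__ == "__main__":
    for filename, text in job_scripts(["aln1.nex", "aln2.nex"], "lsf").items():
        print(f"--- {filename}")
        print(text, end="")
```

The scheduler differences are confined to the `HEADERS` table, which is where a generic solution tends to break down in practice.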