Create a Bioperl PPM Package
ActivePerl PPM has documentation about creating and installing your own PPM packages. We reproduce the relevant parts here for Bioperl developers who wish to create their own PPM package to make installation a little easier. As of Bioperl 1.5.2 we only support Windows running ActivePerl 188.8.131.529 or later, as detailed here. A PPM FAQ is available here.
Creating the PPM4 package
Essentially, any module can be made into a PPM distribution fairly easily, though more complex distributions will require more work. For instance, a complex PPM distribution, the Generic Genome Browser, requires an attached perl script to download and install extra CGI scripts into the proper directories. The Bioperl core packages and most of the other Bioperl-associated packages, like bioperl-db and bioperl-run, are pure perl modules and so do not require much additional work.
For all PPM creation on Windows, nmake is required. This is the Windows version of make and should be placed in a directory listed in the Windows PATH environment variable (C:\Windows usually works). There are reports that GNU make also works, but it isn't generally recommended: ActivePerl is traditionally built using MS Visual C++ v6, so binary incompatibilities may occur.
Unlike a CPAN or manual installation of Bioperl, the PPM installation does not run any tests (these are not even included in the PPM package). Therefore, it is important for developers to download and run the Bioperl test suite on Windows prior to building the PPM package so as not to distribute a faulty package.
PPM4 is the only version of PPM distributed with ActivePerl as of version 184.108.40.2069. PPM4 is superior to PPM3 and more accurately models dependencies using two additional tags <PROVIDE> and <REQUIRE>. For simplicity, the ppd XML code for all PPM4 packages should go in the package.xml file in the Bioperl repository. The top level XML tag in this file should be <REPOSITORY>.
PPD XML Structure
At the time of this writing (03:55, 4 December 2006 (EST)) PPM4 is still very new and there isn't an awful lot of documentation about the new PPD XML structure. This section aims to highlight and describe some of the main features.
The XML describing a package starts with the SOFTPKG tag, which must contain at least the NAME and VERSION attributes and may also contain the DATE attribute:
<SOFTPKG NAME="bioperl" VERSION="1.5.2-RC2" DATE="2006-10-02">
The NAME attribute is used by PPM4 to identify which packages are in fact the same. Therefore it is important to leave the value of the NAME attribute as "bioperl". The VERSION attribute is simply a human-readable form of the package version, used for displaying the version to the user, so it may contain any string fit for this purpose. The DATE attribute indicates the package release date and is optional. However, it gives some idea of how old a package is, so I recommend its use.
The tags nested within the SOFTPKG tag can come in any order, but it makes more sense to have those tags which describe the contents of the package first. The ABSTRACT and AUTHOR tags should come next and should be maintained as:
<ABSTRACT>Bioinformatics Toolkit</ABSTRACT>
<AUTHOR>Bioperl Team (email@example.com)</AUTHOR>
The PROVIDE tags should appear next. There should be a PROVIDE tag for EVERY module in that release of Bioperl.
<PROVIDE NAME="Bio::Root::Version" VERSION="1.005002002"/>
PPM4 uses the PROVIDE tag to handle dependencies better than its predecessor PPM3. More about this when I introduce the REQUIRE tag below. The NAME attribute is the fully qualified module name using the double colon '::' notation. It must contain at least one double colon! If the module name doesn't inherently contain a double colon, then one must be appended to the end. The VERSION attribute should be a float so that PPM4 can compare versions internally for the package upgrading feature. NOTE: it may also be a good idea to have a PROVIDE tag for bioperl, even though it is not strictly a module in itself. This would allow other modules to "REQUIRE" bioperl as opposed to a specific module. E.g.:
<PROVIDE NAME="bioperl::" VERSION="1.005002005"/>
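The naming and version rules above can be sketched in a few lines of Python. This is only an illustration, not part of any Bioperl or PPM tooling: the function names are hypothetical, and the version conversion assumes that each component after the first is zero-padded to three digits, which is what the 1.005002002 example suggests. It expects purely numeric dotted versions, so a string such as "1.5.2-RC2" is rejected.

```python
def ppm_module_name(module):
    """Return the module name with at least one '::', as PPM4 expects."""
    return module if "::" in module else module + "::"


def ppm_float_version(dotted):
    """Convert a dotted version such as '1.5.2.2' to a float-style string.

    Assumption: every component after the first is zero-padded to three
    digits, following the 1.005002002 example in the text.
    """
    major, *rest = dotted.split(".")
    if not rest:
        return major
    # int() raises ValueError for non-numeric parts such as '2-RC2'.
    return major + "." + "".join(f"{int(part):03d}" for part in rest)


def provide_tag(module, dotted_version):
    """Build one PROVIDE tag for the given module and dotted version."""
    name = ppm_module_name(module)
    version = ppm_float_version(dotted_version)
    return f'<PROVIDE NAME="{name}" VERSION="{version}"/>'
```

Calling provide_tag once per module in the release would produce the long PROVIDE list that the package needs.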
The next tag is the IMPLEMENTATION tag which contains several tags for describing an implementation on a specific OS:
<IMPLEMENTATION>
  <ARCHITECTURE NAME="MSWin32-x86-multi-thread-5.8"/>
  <CODEBASE HREF="bioperl-1.5.2.rc5-ppm.tar.gz"/>
  <REQUIRE NAME="Graph::Directed" VERSION=""/>
  <REQUIRE NAME="GD::" VERSION="1.3"/>
</IMPLEMENTATION>
The NAME attribute of the ARCHITECTURE tag states which architecture the current IMPLEMENTATION tag describes. For Windows running Perl 5.8, this should be as shown above. The HREF attribute of the CODEBASE tag gives the location of the file containing the compressed blib directory of the package. It may be either the fully qualified URL of that file or a path relative to the file containing this XML.
Following this is the full list of module requirements for this implementation, given as REQUIRE tags. As with PROVIDE, the NAME attribute is the fully qualified module name using the double colon '::' notation. It must contain at least one double colon! If the module name doesn't inherently contain a double colon, then one must be appended to the end. The VERSION attribute should be a float indicating the minimum version required, or empty to denote that no minimum version of that module is required. PPM4 uses the information from the PROVIDE and REQUIRE tags of packages to find and resolve dependencies during the installation process. This is why the VERSION attributes of the PROVIDE and REQUIRE tags need to be floats.
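To illustrate why float versions matter, here is a minimal Python sketch of matching REQUIRE entries against PROVIDE entries. It is not PPM4's actual resolution algorithm, which is not documented here. The function name and the data layout (a list of name/minimum pairs and a dictionary of provided versions) are assumptions made for the example.

```python
def unmet_requirements(requires, provides):
    """Return the names of requirements that the provided modules do not satisfy.

    requires: list of (name, minimum_version) pairs; an empty minimum
              means any version is acceptable.
    provides: dict mapping a module name to its provided version string.
    """
    unmet = []
    for name, minimum in requires:
        if name not in provides:
            unmet.append(name)
            continue
        have = provides[name]
        # Comparing as floats is the point: as strings, "1.10" would sort
        # before "1.9" for the wrong reason, and "10.0" before "9.0".
        if minimum and (not have or float(have) < float(minimum)):
            unmet.append(name)
    return unmet
```

With the REQUIRE tags from the example above, a repository providing GD:: at version 1.2 would leave GD:: unmet, while any version of Graph::Directed would be accepted.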
The following is the start of a PPM4 compatible PPD XML file:
<SOFTPKG NAME="bioperl" VERSION="1.5.2-RC5" DATE="2006-11-23">
  <ABSTRACT>Bioinformatics Toolkit</ABSTRACT>
  <AUTHOR>Bioperl Team (firstname.lastname@example.org)</AUTHOR>
  <PROVIDE NAME="Bio::Root::Version" VERSION="1.005002002"/>
  <PROVIDE NAME="Bio::SeqIO" VERSION="1.005002002"/>
  <!-- Add all modules provided by this package here - this will be a long list! -->
  <IMPLEMENTATION>
    <ARCHITECTURE NAME="MSWin32-x86-multi-thread-5.8"/>
    <CODEBASE HREF="bioperl-1.5.2.rc5-ppm.tar.gz"/>
    <REQUIRE NAME="IO::String" VERSION=""/>
    <REQUIRE NAME="HTML::Entities" VERSION=""/>
    <REQUIRE NAME="DB_File::" VERSION=""/>
    <!-- Add all additional module dependencies for this implementation here -->
  </IMPLEMENTATION>
</SOFTPKG>
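Because the PROVIDE and REQUIRE lists get long, it can help to check a PPD entry mechanically before publishing it. The following Python sketch checks only the rules described in this section: SOFTPKG needs NAME and VERSION, every PROVIDE and REQUIRE NAME needs a double colon, and a non-empty VERSION must parse as a float. The function name is hypothetical, and allowing an empty VERSION on PROVIDE as well as REQUIRE is an assumption.

```python
import xml.etree.ElementTree as ET


def check_softpkg(ppd_xml):
    """Return a list of problems found in one SOFTPKG element (empty if none)."""
    pkg = ET.fromstring(ppd_xml)
    if pkg.tag != "SOFTPKG":
        return [f"expected SOFTPKG as the root tag, found {pkg.tag}"]
    problems = []
    for attr in ("NAME", "VERSION"):
        if not pkg.get(attr):
            problems.append(f"SOFTPKG has no {attr} attribute")
    # PROVIDE sits directly under SOFTPKG, REQUIRE under IMPLEMENTATION,
    # so walk the whole tree rather than only the direct children.
    for tag in pkg.iter():
        if tag.tag not in ("PROVIDE", "REQUIRE"):
            continue
        name = tag.get("NAME", "")
        version = tag.get("VERSION", "")
        if "::" not in name:
            problems.append(f"{tag.tag} NAME '{name}' lacks a double colon")
        if version:  # assumption: an empty VERSION is allowed on both tags
            try:
                float(version)
            except ValueError:
                problems.append(f"{tag.tag} VERSION '{version}' is not a float")
    return problems
```

Running it on the example above should return an empty list; a DB_File entry without the trailing double colon would be reported.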
The following are known bugs/issues with the current release of PPM4 which is packaged with ActivePerl 220.127.116.119.
- A known bug requires a workaround to redirect incorrect calls to a PPM repository for packages that actually reside in the ActiveState PPM repository. The following lines need to be added to the httpd.conf file on the server hosting the PPM repository:
RedirectMatch /MSWin32-x86-multi-thread-5.8/(.*) http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1
This bug has been addressed and should be fixed for the ActivePerl 18.104.22.1680 release. However, this may take several months from Sept 2006.
- Currently, if a package is available from the ActiveState repository, PPM4 will always give it install/upgrade priority over the same package available in any other repository, regardless of version. This has the annoying effect of installing an older version of the package even when a newer version is available from a non-ActiveState repository. Currently the ActiveState repository holds Bioperl 1.2.3. Therefore, if you run install bioperl in the ppm-shell, it will install Bioperl from ActiveState and not the latest version from the Bioperl repository. You must manually run search bioperl and then install <num> in the ppm-shell to get the latest version. The same problem doesn't exist when installing via the PPM GUI. However, both ways of upgrading Bioperl are affected: they will both try to "upgrade" to the version in the ActiveState repository even if that would actually be a "downgrade". To get around this problem, remove the currently installed version of Bioperl, and then manually install the latest version.
While making the PPM4 packages for the Bioperl 1.5.2 release candidates I have come across several inconsistencies in some of the PPM repositories. I think most of these are due to repositories and ppd files still being in flux while they are made PPM4 ready. Some of the problems I've encountered are:
- 1) Some ppd files in a repository might not contain a double colon '::' in the NAME attribute of the PROVIDE tag. For example, the DB_File ppd file had <PROVIDE NAME="DB_File" VERSION="" /> instead of <PROVIDE NAME="DB_File::" VERSION="" />.
- 2) Some Bioperl dependencies fail to install with a 404 error. However, installing these separately prior to Bioperl seems to work. This is caused by the bug in PPM4 shipped with ActivePerl 22.214.171.1249 described above: either the appropriate workaround isn't implemented on the repository, or it isn't working.
Creating the PPM the easy way
A barebones Bioperl build will not install additional scripts in the perl /bin directory. This is the easiest PPM package to make as it only requires nmake and the bioperl distribution of choice.
First, initialize the Build script:
Then run the following command to build the archive and PPD file:
Now just add the contents of the generated PPD file to the package.xml file, between the opening and closing <REPOSITORY> tags.
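If you prefer not to edit package.xml by hand, the merge step can be sketched in Python with the standard library's XML parser. The function name is hypothetical, and replacing an existing entry that has the same NAME is a design choice made for this example, not something PPM4 requires. The sketch also drops XML comments, because ElementTree does not keep them by default.

```python
import xml.etree.ElementTree as ET


def add_to_repository(repository_xml, ppd_xml):
    """Return package.xml content with the SOFTPKG from ppd_xml merged in."""
    repo = ET.fromstring(repository_xml)
    if repo.tag != "REPOSITORY":
        raise ValueError("top-level tag of package.xml must be REPOSITORY")
    new_pkg = ET.fromstring(ppd_xml)
    if new_pkg.tag != "SOFTPKG":
        raise ValueError("the PPD file must contain a SOFTPKG element")
    # Design choice for this sketch: keep one entry per NAME, so an older
    # entry for the same package is removed before the new one is added.
    for old in repo.findall("SOFTPKG"):
        if old.get("NAME") == new_pkg.get("NAME"):
            repo.remove(old)
    repo.append(new_pkg)
    return ET.tostring(repo, encoding="unicode")
```

The returned string can then be written back to package.xml in the repository.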
There are other ways to construct PPM packages. For example, CPAN modules like PPM::Make should make the creation of PPM packages a little easier. PPM::Make has also recently been updated to produce PPM4-compatible XML output; it is not known whether it can still produce PPM3 output as well. This is currently being investigated.
Creating a Local Repository
A local repository can be created simply by placing the package.xml file in a directory on the file system. In addition, the package files (the .tar.gz files of the blib directory) can also be stored on the local file system. However, you will need to update the <CODEBASE HREF="" /> tag to reflect the location of the package file on your file system.
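One simple way to update the CODEBASE tags is to store the package files in the same directory as package.xml and reduce every HREF to a bare file name, since a path relative to the XML file is allowed. The Python sketch below does this under that assumption. The function name is hypothetical, and it only handles URLs and forward-slash paths, not Windows paths with drive letters or backslashes.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def localize_codebase(package_xml):
    """Rewrite every CODEBASE HREF to the bare archive file name.

    Assumption: the .tar.gz archives sit next to package.xml, so a file
    name on its own is a valid relative path.
    """
    repo = ET.fromstring(package_xml)
    for codebase in repo.iter("CODEBASE"):
        href = codebase.get("HREF", "")
        # urlsplit separates any scheme and host; basename keeps only the
        # last path component.
        codebase.set("HREF", posixpath.basename(urlsplit(href).path))
    return ET.tostring(repo, encoding="unicode")
```

An HREF that is already a bare file name, such as bioperl-1.5.2.rc5-ppm.tar.gz, is left unchanged.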
Simply start PPM4 and add the local directory as a new repository. PPM4 should now be able to read and install modules specified in your local repository.