In Bioperl there are three main objects that people are going to use frequently:
- Bio::PrimarySeq: just the sequence and its names, nothing else, e.g. a FASTA file of a sequence.
- Bio::SeqFeatureI: a feature on a sequence, potentially with a sequence, a location and annotation, e.g. a single entry in an EMBL/GenBank/DDBJ feature table.
- Bio::Seq: a sequence and a collection of sequence features (an aggregate) with its own annotation, e.g. a single EMBL/GenBank/DDBJ entry.
Note: Bioperl is not tied heavily to file formats. However, the objects above do map to file formats sensibly, and the examples are given to help bioinformaticians place each object.
By having this split we avoid a lot of nasty circular references (sequence features can hold a reference to a sequence without the sequence holding a reference to the sequence feature). See Bio::PrimarySeq and Bio::SeqFeatureI for more information.
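The split described above can be sketched in a few lines. This is a minimal illustration, not a definitive recipe: it assumes BioPerl is installed, uses Bio::SeqFeature::Generic as one concrete implementation of Bio::SeqFeatureI, and the sequence string, identifier and feature coordinates are made up for the example.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;

# Bio::Seq: the aggregate. It holds the sequence plus its features.
# The sequence and display id are hypothetical example values.
my $seq = Bio::Seq->new(
    -seq        => 'ATGGCGTTAGCCTGA',
    -display_id => 'example_entry',
    -alphabet   => 'dna',
);

# A feature with a location. Bio::SeqFeature::Generic is one
# implementation of the Bio::SeqFeatureI interface.
my $feat = Bio::SeqFeature::Generic->new(
    -start       => 1,
    -end         => 6,
    -strand      => 1,
    -primary_tag => 'misc_feature',
);

# The aggregate keeps a list of its features ...
$seq->add_SeqFeature($feat);

# ... and each feature can reach the underlying sequence data,
# here the bases covered by its location.
for my $f ( $seq->get_SeqFeatures ) {
    printf "%s %d..%d %s\n",
        $f->primary_tag, $f->start, $f->end, $f->seq->seq;
}

# The plain sequence-and-names object inside the aggregate.
my $pseq = $seq->primary_seq;
print ref($pseq), " ", $pseq->display_id, "\n";
```

The point of the design is the direction of the references: the feature refers to sequence data, while the list of features lives on the Bio::Seq aggregate, so the plain sequence object never needs to point back at its features.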