Bio::Index
AbstractSeq
Summary
Bio::Index::AbstractSeq - Base class for sequence file indexes
Package variables
No package variables defined.
Included modules
Inherit
Synopsis
# Make a new sequence file indexing package
package MyShinyNewIndexer;
use strict;
use warnings;
use Bio::Index::AbstractSeq;
our @ISA = ('Bio::Index::AbstractSeq');
# Now provide the necessary methods (at least _file_format)...
Description
Provides a common base class for indexes over multiple
sequence files built using the
Bio::Index::Abstract system, and implements the
Bio::DB::SeqI interface.
Methods
Methods description
Title : _file_format
Usage : $self->_file_format
Function: Gives the file format of the indexed files.
The version in this class only throws an
exception, so derived classes must override it.
Example :
Returns : File format name as a string (in derived classes)
Args : none
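The override pattern described for _file_format can be sketched without BioPerl installed. This is a minimal illustration, not the module's code: My::BaseIndexer and My::FastaIndexer are hypothetical stand-ins, plain die replaces $self->throw, and the format name 'Fasta' is only an example value.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real base class, so the sketch runs
# without BioPerl installed.
package My::BaseIndexer;

sub new { my ($class) = @_; return bless {}, $class; }

# Mirrors the documented behaviour: the base version only throws.
sub _file_format {
    my ($self) = @_;
    my $pkg = ref($self);
    die "Class '$pkg' must provide a file format method correctly\n";
}

# Hypothetical derived class.
package My::FastaIndexer;
our @ISA = ('My::BaseIndexer');

# The override simply names the format (example value).
sub _file_format { return 'Fasta'; }

package main;

# The derived class answers; the base class refuses.
my $format = My::FastaIndexer->new->_file_format;
my $ok     = eval { My::BaseIndexer->new->_file_format; 1 };
my $error  = $@;
```

With the real module, the derived package would inherit from Bio::Index::AbstractSeq as in the Synopsis, and the returned string is what gets passed to Bio::SeqIO as the format.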
Title : fetch
Usage : $index->fetch( $id )
Function: Returns a Bio::Seq object from the index
Example : $seq = $index->fetch( 'dJ67B12' )
Returns : Bio::Seq object
Args : ID
Title : _get_SeqIO_object
Usage : $index->_get_SeqIO_object( $file )
Function: Returns a Bio::SeqIO object for the file
Example : $seq = $index->_get_SeqIO_object( 0 )
Returns : Bio::SeqIO object
Args : File number (an integer)
Title : get_Seq_by_id
Usage : $seq = $db->get_Seq_by_id($id)
Function: Retrieves a sequence object, identically to
->fetch, but here behaving as a Bio::DB::SeqI
Returns : new Bio::Seq object
Args : string representing the ID
Title : get_Seq_by_acc
Usage : $seq = $db->get_Seq_by_acc($acc)
Function: Retrieves a sequence object, identically to
->fetch, but here behaving as a Bio::DB::SeqI
Returns : new Bio::Seq object
Args : string representing the accession number
Title : get_PrimarySeq_stream
Usage : $stream = $db->get_PrimarySeq_stream()
Function: Makes a Bio::DB::SeqStreamI compliant object
which provides a single method, next_primary_seq
Returns : Bio::DB::SeqStreamI
Args : none
Title : get_all_primary_ids
Usage : @ids = $seqdb->get_all_primary_ids()
Function: Gives an array of all the primary_ids of the
sequence objects in the database. These
may be IDs (display style), accession numbers,
or something else entirely; they
*are not* meaningful outside of this database
implementation.
Example :
Returns : an array of strings
Args : none
Title : get_Seq_by_primary_id
Usage : $seq = $db->get_Seq_by_primary_id($primary_id_string);
Function: Gets a Bio::Seq object by its primary ID. The primary
ID has to come from $db->get_all_primary_ids.
There is no other way to get (or guess) the primary_ids
in a database.
The other possibility is to get Bio::PrimarySeqI objects
via get_PrimarySeq_stream; the primary_id field
on these objects gives the IDs to use here.
Returns : A Bio::Seq object
Args : primary id (as a string)
Throws : "acc does not exist" exception
Methods code
sub new {
    my ($class, @args) = @_;
    my $self = $class->SUPER::new(@args);
    # One cached Bio::SeqIO object per indexed file
    $self->{'_seqio_cache'} = [];
    return $self;
}
sub _file_format {
    my ($self, @args) = @_;
    my $pkg = ref($self);
    $self->throw("Class '$pkg' must provide a file format method correctly");
}
sub fetch {
    my ($self, $id) = @_;
    my $db = $self->db();
    my $seq;
    if (my $rec = $db->{$id}) {
        # The record stores the file number and the byte offset of the entry
        my ($file, $begin) = $self->unpack_record($rec);
        my $seqio = $self->_get_SeqIO_object($file);
        my $fh = $seqio->_fh();
        $begin-- if ($^O =~ /mswin/i);
        seek($fh, $begin, 0);
        $seq = $seqio->next_seq();
    }
    # The display_id is used as the primary_id for this database
    $seq->primary_id($seq->display_id())
        if (defined $seq && ref($seq) && $seq->isa('Bio::PrimarySeqI'));
    return $seq;
}
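The core of fetch is a lookup of a stored byte offset followed by a seek to that position. The sketch below shows the idea with core Perl only, under stated assumptions: the in-memory FASTA data, the %offset_for table (standing in for the dbm index), and the fetch_record helper are all hypothetical and are not part of the module.

```perl
use strict;
use warnings;

# Two FASTA records held in memory; a real index points at files on disk.
my $data = ">seq1 first\nACGT\n>seq2 second\nGGCC\nTTAA\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

# Build a hypothetical ID => byte offset table (the role of the dbm file).
my %offset_for;
my $pos = tell($fh);
while (my $line = <$fh>) {
    $offset_for{$1} = $pos if $line =~ /^>(\S+)/;
    $pos = tell($fh);
}

# Jump straight to a record and read it up to the next header line.
sub fetch_record {
    my ($id) = @_;
    my $begin = $offset_for{$id};
    return unless defined $begin;    # unknown ID: nothing to return
    seek($fh, $begin, 0);
    my $header = <$fh>;              # skip the ">id description" line
    my $residues = '';
    while (my $line = <$fh>) {
        last if $line =~ /^>/;
        chomp $line;
        $residues .= $line;
    }
    return $residues;
}

my $seq2    = fetch_record('seq2');
my $seq1    = fetch_record('seq1');
my $missing = fetch_record('nope');
```

In the module itself the parsing step is delegated to Bio::SeqIO's next_seq, so the same offset lookup works for any format that the derived class names in _file_format.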
sub _get_SeqIO_object {
    my ($self, $i) = @_;
    # Create the Bio::SeqIO object for file $i on first use, then reuse it
    unless ($self->{'_seqio_cache'}[$i]) {
        my $fh = $self->_file_handle($i);
        my $seqio = Bio::SeqIO->new( -Format => $self->_file_format,
                                     -fh     => $fh );
        $self->{'_seqio_cache'}[$i] = $seqio;
    }
    return $self->{'_seqio_cache'}[$i];
}
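_get_SeqIO_object builds each reader lazily and keeps it in an array slot indexed by file number. A minimal sketch of that caching pattern follows; open_reader and reader_for are hypothetical names, and a plain hash reference stands in for the Bio::SeqIO object.

```perl
use strict;
use warnings;

my @cache;          # one slot per file number, filled on first use
my $opened = 0;     # counts how often the expensive step runs

# Hypothetical stand-in for opening a file and wrapping it in a reader.
sub open_reader {
    my ($i) = @_;
    $opened++;
    return { file_number => $i };
}

# Return the cached reader for file $i, creating it only once.
sub reader_for {
    my ($i) = @_;
    $cache[$i] = open_reader($i) unless $cache[$i];
    return $cache[$i];
}

my $first  = reader_for(0);
my $second = reader_for(0);   # served from the cache
my $other  = reader_for(1);
```

Because the same object is handed back on every call, its filehandle position is shared between callers, which is why fetch seeks before each read.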
sub get_Seq_by_id {
    my ($self, $id) = @_;
    return $self->fetch($id);
}
sub get_Seq_by_acc {
    my ($self, $id) = @_;
    return $self->fetch($id);
}
sub get_PrimarySeq_stream {
    my $self = shift;
    my $num = $self->_file_count() || 0;
    my @file;
    # Collect the names of all indexed files from the __FILE_n records
    for (my $i = 0; $i < $num; $i++) {
        my ($file, $stored_size) = $self->unpack_record( $self->db->{"__FILE_$i"} );
        push(@file, $file);
    }
    my $out = Bio::SeqIO::MultiFile->new( '-format' => $self->_file_format,
                                          -files    => \@file );
    return $out;
}
sub get_all_primary_ids {
    my ($self, @args) = @_;
    my $db = $self->db;
    my (%bytepos);
    while (my ($id, $rec) = each %$db) {
        # Skip internal bookkeeping keys such as __FILE_0
        if ($id =~ /^__/) {
            next;
        }
        my ($file, $begin) = $self->unpack_record($rec);
        # Keying on file and offset keeps one ID per distinct record
        $bytepos{"$file:$begin"} = $id;
    }
    return values %bytepos;
}
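get_all_primary_ids collapses several keys that point at the same record by keying a hash on "file:offset". The sketch below shows that step on made-up data: the %db contents are hypothetical, array references stand in for packed records, and the keys are sorted only to make the result deterministic (the method itself iterates with each, so which ID survives is not fixed).

```perl
use strict;
use warnings;

# Hypothetical index contents: ID => [file number, byte offset].
# 'AC0001' and 'dJ67B12' point at the same record.
my %db = (
    '__FILE_0' => [ 0, 0 ],      # bookkeeping key, skipped below
    'dJ67B12'  => [ 0, 120 ],
    'AC0001'   => [ 0, 120 ],
    'dJ99X1'   => [ 0, 480 ],
);

my %bytepos;
for my $id (sort keys %db) {
    next if $id =~ /^__/;
    my ($file, $begin) = @{ $db{$id} };
    $bytepos{"$file:$begin"} = $id;   # a later ID overwrites an earlier one
}

# One ID per distinct record position.
my @primary_ids = sort values %bytepos;
```

Three IDs go in and two come out, one for each distinct record position.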
sub get_Seq_by_primary_id {
    my ($self, $id) = @_;
    return $self->fetch($id);
}
General documentation
User feedback is an integral part of the evolution of this
and other Bioperl modules. Send your comments and suggestions,
preferably to one of the Bioperl mailing lists.
Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/MailList.shtml - About the mailing lists
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution.
Bug reports can be submitted via email or the web:
bioperl-bugs@bio.perl.org
http://bugzilla.bioperl.org/
The rest of the documentation details each of the object methods. Internal methods are usually prefixed with an underscore (_).
Bio::Index::Abstract - The module from which
Bio::Index::AbstractSeq inherits; it
provides dbm indexing for flat files (which are
not necessarily sequence files).