Bio::SeqIO
agave
Summary
Bio::SeqIO::agave - AGAVE sequence output stream.
Package variables
No package variables defined.
Included modules
Data::Dumper
IO::File
XML::Writer
Inherit
Synopsis
It is probably best not to use this object directly, but
rather go through the SeqIO handler system. Go:
$in = Bio::SeqIO->new('-file' => "$file_in",
'-format' => 'EMBL');
$out = Bio::SeqIO->new('-file' => ">$file_out",
'-format' => 'AGAVE');
while (my $seq = $in->next_seq){
$out->write_seq($seq);
}
Description
This object can transform Bio::Seq objects to agave xml file and
vice-versa. I (Simon) coded up this module because I needed a parser
to extract data from AGAVE xml to be utitlized by the GenQuire genome
annotation system (See
http://www.bioinformatics.org/Genquire).
***NOTE*** At the moment, not all of the tags are implemented. In
general, I followed the output format for the XEMBL project
http://www.ebi.ac.uk/xembl/
Methods
Methods description
Title : _process Usage : $self->_process Function : Parses the agave xml file. Args : None. Returns : Nothing. Note : Method(s) that call(s) this method : _initialize Method(s) that this method calls : _process_sciobj FIRST/START sub. |
Title : _process_sciobj Usage : $self->_process_sciobj Function : Parses the data between the <sciobj></sciobj> tags. Args : The string that holds the attributes for <sciobj>. Returns : Data structure holding the values parsed between the <sciobj></sciobj> tags. Note : Method(s) that call(s) this method : _process Method(s) that this method calls : _helper_store_attribute_list , _process_contig |
Title : _process_contig Usage : $self->_process_contig Function : Parses the data between the <contig></contig> tags. Args : 2 scalars: - reference to a scalar holding the line to be parsed. - scalar holding the attributes for the <contig> tag to be parsed. Returns : Data structure holding the values parsed between the <contig></contig> tags. Note : Method(s) that call(s) this method : _process_sciobj Method(s) that this method calls : _helper_store_attribute_list, _one_tag , _process_fragment_order |
Title : _process_fragment_order Usage : $self->_process_fragment_order Function : Parses the data between the <fragment_order></fragment_order> tags. Args : 2 scalars: - reference to a scalar holding the value of the line to be parsed. - reference to a data structure to store the <fragment_order> data. Returns : Nothing. Note : Method(s) that call(s) this method : _process_contig Method(s) that this method calls : _helper_store_attribute_list , _process_fragment_orientation |
Title : _process_fragment_orientation Usage : $self->_process_fragment_orientation Function : Parses the data between the <fragment_orientation> and </fragment_orientation> tags. Args : 2 scalars: - reference to a scalar holding the value of the line to be parsed. - reference to a data structure to store the <fragment_orientation> data. Returns : Nothing. Note : Method(s) that call(s) this method : _process_fragment_order
Method(s) that this method calls : _helper_store_attribute_list , _process_bio_sequence |
Title : _process_bio_sequence Usage : $self->_process_bio_sequence Function : Parses the data between the <bio_sequence></bio_sequence> tags. Args : 2 scalars: - reference to a scalar holding the value of the line to be parsed. - scalar holding the value of the attributes for <bio_sequence> Returns : data structure holding the values between <bio_sequence></bio_sequence> Note : Method(s) that call(s) this method : _process_fragment_orientation
Method(s) that this method calls : _helper_store_attribute_list , _one_tag , _question_mark_tag , _star_tag , _process_alt_ids , _process_xrefs , _process_sequence_map |
Title : _process_xrefs Usage : $self->_process_xrefs Function : Parse the data between the <xrefs></xrefs> tags. Args : reference to a scalar holding the value of the line to be parsed. Return : Nothing. Note : Method(s) that call(s) this method: _process_bio_sequence Method(s) that this method calls: _one_tag , _process_xref |
Title : _process_xref Usage : $self->_process_xref Function : Parses the data between the <xref></xref> tags. Args : 2 scalars: - reference to a scalar holding the value of the line to be parsed. - reference to a data structure to store the <xref> data. Returns : Nothing. Note : Method(s) that call(s) this method : _process_xrefs (note the 's' in 'xrefs') Method(s) that this method calls : _helper_store_attribute_list , _star_tag |
Title : _process_sequence_map Usage : $self->_process_sequence_map Function : Parses the data between the <sequence_map></sequence_map> tags. Args : Reference to scalar holding the line to be parsed. Returns : Data structure that holds the values that were parsed. Note : Method(s) that call(s) this method : _process_bio_sequence Method(s) that this method calls : _helper_store_attribute_list , _question_mark_tag , _process_annotations |
Title : _process_annotations Usage : $self->_process_annotations Function : Parse the data between the <annotations></annotations> tags. Args : Reference to scalar holding the line to be parsed. Returns : Data structure that holds the values that were parsed. Note : Method(s) that call(s) this method : _process_sequence_map Method(s) that this method calls : _process_seq_feature |
Title : _process_seq_feature Usage : $self->_process_seq_feature Function : Parses the data between the <seq_feature></seq_feature> tag. Args : 2 scalars: - Reference to scalar holding the line to be parsed. - Scalar holding the attributes for <seq_feature>. Returns : Data structure holding the values parsed. Note : Method(s) that call(s) this method: _process_annotations
Method(s) that this method calls: _helper_store_attribute_list , _process_classification , _question_mark_tag , _one_tag , _process_evidence , _process_qualifier , _process_seq_feature , _process_related_annot |
Title : _process_qualifier Usage : $self->_process_qualifier Function : Parse the data between the <qualifier></qualifier> tags. Args : 2 scalars: - reference to a scalar holding the value of the line to be parsed. - reference to a data structure to store the <qualifer> data. Returns : Nothing. Note : Method(s) that call(s) this method : _process_seq_feature Method(s) that this method calls : _star_tag |
Title : _process_classification Usage : $self->_process_classification Function: Parse the data between the <classification></classification> tags. Args : 2 scalars: - reference to a scalar holding the value of the line to be parsed. - reference to a data structure to store the <qualifer> data. Returns : Nothing. Note : Method(s) that call(s) this method: _process_seq_feature
Method(s) that this method calls: _helper_store_attribute_list ,
_question_mark_tag , _star_tag, _process_evidence |
Title : _tag_processing_helper Usage : $self->_tag_processing_helper Function : Stores the tag value within the data structure. Also calls _helper_store_attribute_list to store the attributes and their values in the data structure. Args : 5 scalars: - Scalar holding the value of the attributes - Reference to a data structure to store the data for <$tag_name> - Scalar holding the tag name. - Scalar holding the value of the tag. - Scalar holding the value of either 'star', 'plus', or 'question mark' which specifies what type of method called this method. Returns : Nothing. Note : Method(s) that call(s) this method: Method(s) that this method calls: _helper_store_attribute_list |
Title : _one_tag Usage : $self->_one_tag Function : A method to store data from tags that occurs just once. Args : 2 scalars: - reference to a scalar holding the value of the line to be parsed. - reference to a data structure to store the data for <$tag_name> Returns : Nothing. Note : Method(s) that call(s) this method : many Method(s) that this method calls : _tag_processing_helper |
Title : _question_mark_tag Usage : $self->_question_mark_tag Function : Parses values from tags that occurs zero or one time. ie: tag_name? Args : 3 scalars: - reference to a scalar holding the value of the line to be parsed. - reference to a data structure to store the data for <$tag_name> - scalar holding the name of the tag. Returns : Nothing. Note : Method(s) that call(s) this method : many. Method(s) that this method calls : _tag_processing_helper |
Title : _star_tag Usage : $self->_star_tag Function : Parses values from tags that occur zero or more times. ie: tag_name* Args : 3 scalars: - reference to a scalar holding the value of the line to be parsed. - reference to a data structure to store the data for <$tag_name> - scalar holding the name of the tag. Returns : Nothing. Note : Method(s) that call(s) this method : many. Method(s) that this method calls : _tag_processing_helper |
Title : _plus_tag Usage : $self->_plus_tag Function : Handles 'plus' tags (tags that occur one or more times). tag_name+ Args : 3 scalars: - reference to a scalar holding the value of the line to be parsed. - reference to a data structure to store the data for <$tag_name> - scalar holding the name of the tag. Returns : Nothing. Note : Method(s) that call(s) this method : many. Method(s) that this method calls : _star_tag |
Title : _helper_store_attribute_list Usage : $self->_helper_store_attribute_list Function : A helper method used to store the attributes from the tags into the data structure. Args : 2 scalars: - scalar holding the attribute values to be parsed. - reference to a data structure to store the data between the 2 tags. Returns : Nothing. Note : Method(s) that call(s) this method : Many. Method(s) that this method call(s) : None. |
Title : _store_seqs Usage : $self->_store_seqs Function : This method is called once in the life time of the script. It stores the data parsed from the agave xml file into the Bio::Seq object. Args : None. Returns : Nothing. Note : Method(s) that call(s) this method : next_seq Method(s) that this method calls : None. |
Title : next_seq Usage : $seq = $stream->next_seq() Function : Returns the next sequence in the stream. Args : None. Returns : Bio::Seq object
Method is called from the script. Method(s) that this method calls: _store_seqs (only once throughout the life time of script execution). |
Title : next_primary_seq Usage : $seq = $stream->next_primary_seq() Function: returns the next primary sequence (ie no seq_features) in the stream Returns : Bio::PrimarySeq object Args : NONE |
Title : write_seq Usage : Not Yet Implemented! $stream->write_seq(@seq) Function: writes the $seq object into the stream Returns : 1 for success and 0 for error Args : Bio::Seq object |
Title : _write_each_record Usage : $agave->_write_each_record( $seqI ) Function: change data into agave format Returns : NONE Args : Bio::SeqI object |
Usage : $agave->_write_each_record( $seqfeature, $write ) Function: change seeqfeature data into agave format Returns : NONE Args : Bio::SeqFeature object and XML::writer object |
Title : _filehandle Usage : $obj->_filehandle($newval) Function: Example : Returns : value of _filehandle Args : newvalue (optional) |
Title : throw Usage : $self->throw; Function : Throw's error message. Calls SeqIO's throw method. Args : Array of string(s), holding error message(s). Returns : Nothing. Note : Method(s) that call(s) this method: many. Method(s) that this method calls: Bio::SeqIO's throw method. |
Methods code
sub _initialize
{
my ($self,@args) = @_;
$self->SUPER::_initialize(@args);
my %tmp = @args ;
$self->{'file'} = $tmp{'-file'};
if ($self->{'file'} !~ /^>/) {
$self->_process;
$self->{'parsed'} = 1;
}
$self->{'seqs_stored'} = 0;} |
sub _process
{ my ($self) = @_;
while (1) {
my $line = $self->_readline;
next unless $line;
next if $line =~ /^\s*$/;
if ($line =~ /<\?xml version/o) {
} elsif ($line =~ /\<!DOCTYPE (\w+) SYSTEM "([\w\.]+)"\>/) {
$self->throw("Error: This xml file is not in AGAVE format! DOCTYPE: $1 , SYSTEM: $2\n\n")
if $1 ne 'sciobj' || $2 ne 'sciobj.dtd';
} elsif ($line =~ /<sciobj (.*)>/) {
push @{$self->{'sciobj'}}, $self->_process_sciobj($1);
} elsif ($line =~ /<\/sciobj>/) {
last;
} else {
}
}
return;} |
sub _process_sciobj
{
my ($self, $attribute_line) = @_;
my $sciobj;
$self->_helper_store_attribute_list($attribute_line,\$ sciobj);
my $line = $self->_readline;
while ($line =~ /<contig\s?(.*?)\s?>/) {
my $contig = $self->_process_contig(\$line, $1);
push @{$sciobj->{'contig'}}, $contig;
}
return $sciobj;} |
sub _process_contig
{
my ($self, $line, $attribute_line) = @_;
my $contig;
$self->_helper_store_attribute_list($attribute_line,\$ contig);
$$line = $self->_readline;
$self->_one_tag($line,\$ contig, 'db_id');
$self->_process_fragment_order($line,\$ contig);
return $contig;} |
sub _process_fragment_order
{
my ($self, $line, $data_structure) = @_;
while ($$line =~ /<fragment_order\s?(.*?)\s?>/) {
my $fragment_order;
$self->_helper_store_attribute_list($1,\$ fragment_order);
$$line = $self->_readline;
$self->_process_fragment_orientation($line,\$ fragment_order);
push @{$$data_structure->{'fragment_order'}}, $fragment_order;
}
return;} |
sub _process_fragment_orientation
{
my ($self, $line, $data_structure) = @_;
my $count = 0;
while ($$line =~ /<fragment_orientation\s?(.*?)\s?>/) {
my $fragment_orientation;
$self->_helper_store_attribute_list($1,\$ fragment_orientation);
$$line = $self->_readline;
$$line =~ /<bio_sequence\s?(.*?)\s?>/;
my $bio_sequence = $self->_process_bio_sequence($line, $1);
$fragment_orientation->{'bio_sequence'} = $bio_sequence;
push @{$$data_structure->{'fragment_orientation'}}, $fragment_orientation;
++$count;
}
$self->throw("Error: Missing <fragment_orientation> tag. Got this: $$line\n\n")
if $count == 0;
return;} |
sub _process_bio_sequence
{
my ($self, $line, $attribute_line) = @_;
my $bio_sequence;
$self->_helper_store_attribute_list($attribute_line,\$ bio_sequence);
$$line = $self->_readline;
$self->_one_tag($line,\$ bio_sequence, 'db_id');
$self->_question_mark_tag($line,\$ bio_sequence, 'note');
$self->_question_mark_tag($line,\$ bio_sequence, 'description');
$self->_star_tag($line,\$ bio_sequence, 'keyword');
$self->_question_mark_tag($line,\$ bio_sequence, 'sequence');
if ($$line =~ /<xrefs\s?(.*?)\s?>/) {
my $xrefs = $self->_process_xrefs($line,\$ bio_sequence);
$bio_sequence->{'xrefs'} = $xrefs || 'null';
}
if ($$line =~ /<sequence_map\s?(.*?)\s?>/) {
my $sequence_map = $self->_process_sequence_map($line);
push @{$bio_sequence->{'sequence_map'}}, $sequence_map;
}
return $bio_sequence;} |
sub _process_xrefs
{
my ($self, $line) = @_;
my $xrefs;
$$line = $self->_readline;
if ($$line =~ /<db_id|xref\s?(.*?)\s?>/) {
while ($$line =~ /<(db_id|xref)\s?(.*?)\s?>/) {
if ($1 eq "db_id") {
my $db_id;
$self->_one_tag($line,\$ db_id, 'db_id');
push @{$xrefs->{'db_id'}}, $db_id;
} elsif ($1 eq "xref") {
my $xref;
$self->_process_xref($line,\$ xref);
push @{$xrefs->{'xref'}}, $xref;
} else {
$self->throw("Error: Tag type should be one of db_id or xref! Got this: $$line\n\n");
}
}
if ($$line =~ /<\/xrefs>/) {
$$line = $self->_readline; return $xrefs;
} else {
$self->throw("Error: Missing </xrefs> tag. Got this: $$line\n\n");
}
} else {
$self->throw("Error: Missing <db_id> or <xref> tag. Got this: $$line\n\n");
}
return;} |
sub _process_xref
{
my ($self, $line, $xref) = @_;
$$line = $self->_readline;
if ($$line =~ /<db_id\s?(.*?)\s?>/) {
$self->_helper_store_attribute_list($1, $xref);
} else {
$self->throw("Error: Missing <db_id> tag. Got this: $$line\n\n");
}
$self->_star_tag($line, $xref, 'xref_propery');
return;} |
sub _process_sequence_map
{
my ($self, $line) = @_;
my $sequence_map;
while ($$line =~ /<sequence_map\s?(.*?)\s?>/) {
$self->_helper_store_attribute_list($1,\$ sequence_map) if defined $1;
$$line = $self->_readline;
$self->_question_mark_tag($line,\$ sequence_map, 'note');
if ($$line =~ /<annotations\s?(.*?)\s?>/) {
my $annotations = $self->_process_annotations($line);
$sequence_map->{'annotations'} = $annotations;
}
}
if ($$line =~ /<\/sequence_map>/) {
return $sequence_map;
} else {
$self->throw("Error: Missing </sequence_map> tag. Got this: $$line\n\n");
}} |
sub _process_annotations
{
my ($self, $line) = @_;
my $annotations;
$$line = $self->_readline;
my $count = 0;
while ($$line =~ /<(seq_feature|gene|comp_result)\s?(.*?)\s?>/) {
if ($$line =~ /<seq_feature\s?(.*?)\s?>/) {
my $seq_feature = $self->_process_seq_feature($line, $1);
push @{$annotations->{'seq_feature'}}, $seq_feature;
} elsif ($$line =~ /<gene\s?(.*?)\s?>/) {
} elsif ($$line =~ /<comp_result\s?(.*?)\s?>/) {
}
++$count;
}
$self->throw("Error: Missing <seq_feature> tag. Got: $$line\n\n") if $count == 0;
if ($$line =~ /<\/annotations/) {
$$line = $self->_readline; return $annotations;
} else {
$self->throw("Error: Missing </annotations> tag. Got this: $$line\n\n");
}} |
sub _process_seq_feature
{
my ($self, $line, $attribute_line) = @_;
my $seq_feature;
$self->_helper_store_attribute_list($attribute_line,\$ seq_feature);
$$line = $self->_readline;
$self->_process_classification($line,\$ seq_feature);
$self->_question_mark_tag($line,\$ seq_feature, 'note');
$self->_one_tag($line,\$ seq_feature, 'seq_location');
$self->_question_mark_tag($line,\$ seq_feature, 'xrefs');
$self->_process_evidence($line,\$ seq_feature);
$self->_process_qualifier($line,\$ seq_feature);
while ($$line =~ /<seq_feature\s?(.*?)\s?>/) {
$self->_process_seq_feature($line, $1);
$$line = $self->_readline;
}
while ($$line =~ /<related_annot\s?(.*?)\s?>/) {
$self->_process_related_annot($line, $1);
$$line = $self->_readline;
}
if ($$line =~ /<\/seq_feature>/) {
$$line = $self->_readline; return $seq_feature;
} else {
$self->throw("Error. Missing </seq_feature> tag. Got this: $$line\n");
}} |
sub _process_qualifier
{
my ($self, $line, $data_structure) = @_;
my $qualifier;
$self->_star_tag($line,\$ qualifier, 'qualifier');
push @{$$data_structure->{'qualifier'}},$qualifier;
return;
} |
sub _process_classification
{ my ($self, $line, $data_structure) = @_;
my $classification = $$data_structure->{'classification'};
while ($$line =~ /<classification\s?(.*?)\s?>/) {
$self->_helper_store_attribute_list($1,\$ classification);
$self->_question_mark_tag($line,\$ classification, 'description');
$self->_star_tag($line,\$ classification, 'id_alias');
$self->_process_evidence($line,\$ classification);
} } |
sub _process_evidence
{ my ($self, $line, $data_structure) = @_;
if ($$line =~ /<evidence>/) {
$$line = $self->_readline;
while ($$line =~ /<(element_id|comp_result)\s?(.*?)\s?>/) {
if ($$line =~ /<element_id\s?(.*?)\s?>/) {
my $element_id;
$self->_plus_tag($line,\$ element_id, 'element_id');
push @{$$data_structure->{'element_id'}}, $element_id;
} elsif ($$line =~ /<comp_result\s?(.*?)\s?>/) {
my $comp_result;
$self->_process_comp_result($line,\$ comp_result, $1);
push @{$$data_structure->{'comp_result'}}, $comp_result;
}
$$line = $self->_readline;
}
} } |
sub _process_comp_result
{
my ($self, $line, $comp_result, $attribute_line) = @_;
$self->_helper_store_attribute_list($attribute_line, $comp_result);
$$line = $self->_readline;
$self->_question_mark_tag($line, $comp_result, 'note');
$self->_question_mark_tag($line, $comp_result, 'match_desc');
$self->_question_mark_tag($line, $comp_result, 'match_align');
$self->_process_query_region($line, $comp_result);
$self->_process_match_region($line, $comp_result);
$self->_star_tag($line, $comp_result, 'result_property');
$self->_process_result_group($line, $comp_result);
$self->_process_related_annot($line, $comp_result); } |
sub _process_related_annot
{ my ($self, $line, $data_structure) = @_;
while ($$line =~ /<related_annot\s?(.*?)\s?>/) {
my $related_annot;
$self->_helper_store_attribute_list($1,\$ related_annot);
$$line = $self->_readline;
my $element_id_count = 0;
while ($$line =~ /<element_id\s?(.*?)\s?>/) {
my $element_id;
$self->_helper_store_attribute_list($1,\$ element_id);
push @{$related_annot->{'element_id'}}, $element_id;
$$line = $self->_readline;
++$element_id_count;
}
if ($element_id_count == 0) {
$self->throw("Error. Missing <element_id> tag. Got: $$line");
}
$self->_star_tag($line,\$ related_annot, 'sci_property');
push @{$data_structure->{'related_annot'}}, $related_annot;
unless ($$line =~ /<\/related_annot>/){
$self->throw("Error. Missing </related_tag>. Got: $$line\n");
}
} } |
sub _process_result_group
{ my ($self, $line, $data_structure) = @_;
while ($$line =~ /<result_group\s?(.*?)\s?>/) {
my $result_group = $$data_structure->{'result_group'};
$self->_helper_store_attribute_list($1,\$ result_group);
my $count = 0;
$$line = $self->_readline;
while ($$line =~ /<comp_result\s?(.*?)\s?>/) {
$self->_process_comp_result(\$line,\$ result_group, $1);
$$line = $self->_readline;
++$count;
}
$self->throw("Error. No <comp_result></comp_result> tag! Got this: $$line")
if $count == 0;
if ($line =~ /<\/result_group>/) {
$$line = $self->_readline;
} else {
$self->throw("Error. No </result_tag>! Got this: $$line");
}
} } |
sub _process_match_region
{ my ($self, $line, $data_structure) = @_;
my $match_region = $data_structure->{'match_region'};
if ($$line =~ /<match_region\s?(.*?)\s?>(.*?)>/) {
$self->_helper_store_attribute_line($1,\$ match_region);
$$line = $self->_readline;
if ($$line =~ /<db_id\s?(.*?)\s?>(.*?)<\/db_id>/) {
$self->_question_mark_tag($line,\$ match_region, 'db_id');
} elsif ($$line =~ /<element_id\s?(.*?)\s?>/) { $self->_question_mark_tag($line,\$ match_region, 'element_id');
} elsif ($$line =~ /<bio_sequence\s?(.*?)\s?>/) {
$match_region->{'bio_sequence'} = $self->_process_bio_sequence($line, $1);
}
$$line = $self->_readline;
if ($$line =~ /<\/match_region>/o) {
$$line = $self->_readline; return;
} else {
$self->throw("No closing tag </match_region>! Got this: $$line\n");
}
} } |
sub _process_query_region
{ my ($self, $line, $data_structure) = @_;
my $query_region = $data_structure->{'query_region'};
if ($$line =~ /<query_region\s?(.*?)\s?>/) {
$self->_helper_store_attribute_list($1,\$ query_region);
$$line = $self->_readline;
$self->_question_mark_tag($line,\$ query_region, 'db_id');
if ($$line =~ /<\/query_region>/) {
$$line = $self->_readline; return;
} else {
$self->throw("No closing tag </query_region>. Got this: $$line\n");
}
} } |
sub _tag_processing_helper
{
my ($self, $attribute_list, $data_structure, $tag_name, $tag_value, $caller) = @_;
if (defined $attribute_list) {
$self->_helper_store_attribute_list($attribute_list, $data_structure);
}
if ($caller eq 'star' || $caller eq 'plus') {
push @{$$data_structure->{$tag_name}}, $tag_value;
} else {
$$data_structure->{$tag_name} = $tag_value || 'null';
}
return;} |
sub _one_tag
{
my ($self, $line, $data_structure, $tag_name) = @_;
$self->throw("Error: Missing <$tag_name></$tag_name>. Got: $$line\n\n")
if $$line !~ /\<$tag_name/;
if ($$line =~ /<$tag_name\s?(.*?)\s?\/?>(.*?)<\/$tag_name>/) {
$self->_tag_processing_helper($1, $data_structure, $tag_name, $2, 'one');
} elsif ($$line =~ /<$tag_name\s?(.*?)\s?\/?>/) {
$self->_tag_processing_helper($1, $data_structure, $tag_name, '', 'one');
} else {
$self->throw("Error: Cannot parse this line: $$line\n\n");
}
$$line = $self->_readline;
return;} |
sub _question_mark_tag
{
my ($self, $line, $data_structure, $tag_name) = @_;
if ($$line =~ /<$tag_name\s?(.*?)\s?>(.*?)<\/$tag_name>/) {
$self->_tag_processing_helper($1, $data_structure, $tag_name, $2, 'question mark');
$$line = $self->_readline;
}
return;} |
sub _star_tag
{
my ($self, $line, $data_structure, $tag_name) = @_;
while ($$line =~ /<$tag_name\s?(.*?)\s?>(.*?)<\/$tag_name>/) {
$self->_tag_processing_helper
($1, $data_structure, $tag_name, $2, 'star');
$$line = $self->_readline;
}
return;} |
sub _plus_tag
{
my ($self, $line, $data_structure, $tag_name) = @_;
if ($$line =~ /<$tag_name\s?(.*?)\s?>(.*?)<\/$tag_name>/) {
$self->_tag_processing_helper($1, $data_structure, $tag_name, $2, 'plus');
$$line = $self->_readline;
$self->_star_tag($line, $data_structure, $tag_name);
} else {
$self->throw("Error: Missing <$tag_name></$tag_name>. Got: $$line\n\n");
}
return;} |
sub _helper_store_attribute_list
{
my ($self, $attribute_line, $data_structure) = @_;
my %attribs = ($attribute_line =~ /(\w+)\s*=\s*"([^"]*)"/g);
my $attribute_list;
for my $key (keys %attribs) {
push @{$$data_structure->{$key}}, $attribs{$key};
}
return;} |
sub _store_seqs
{
my ($self) = @_;
for my $sciobj (@{$self->{'sciobj'}}) {
for my $contig (@{$sciobj->{'contig'}}) {
for my $fragment_order (@{$contig->{'fragment_order'}}) {
for my $fragment_orientation (@{$fragment_order->{'fragment_orientation'}}) {
my $bio_sequence = $fragment_orientation->{'bio_sequence'};
my $sequence = $bio_sequence->{'sequence'};
my $accession_number = $bio_sequence->{'sequence_id'}->[0]; my $organism = $bio_sequence->{'organism'};
my $description = $bio_sequence->{'description'};
my $molecule_type = $bio_sequence->{'molecule_type'}->[0];
my $primary_seq = Bio::PrimarySeq->new(
-id => $accession_number,
-alphabet => $molecule_type,
-seq => $sequence,
-desc => $description,
);
my $seq = Bio::Seq->new (
-display_id => $accession_number,
-accession_number => $accession_number,
-primary_seq => $primary_seq,
-seq => $sequence,
-description => $description,
);
my $organism_name = $bio_sequence->{organism_name}->[0];
if (defined $organism_name) {
my @classification = split(' ', $organism_name);
my $species = Bio::Species->new();
$species->classification(@classification);
$seq->species($species);
}
my $keywords = $bio_sequence->{keyword};
my %key_to_value;
for my $keywords (@$keywords) {
my @words = split(':', $keywords);
for (my $i = 0; $i < scalar @words - 1; $i++) {
if ($i % 2 == 0) {
my $j = $i; $j++;
$key_to_value{$words[$i]} = $words[$j];
}
}
my $reference = Bio::Annotation::Reference->
new(-authors => $key_to_value{authors},
-title => $key_to_value{title},
-database => $key_to_value{database},
-pubmed => $key_to_value{pubmed},
);
$seq->annotation->add_Annotation('reference', $reference);
}
if (defined $bio_sequence->{'sequence_map'}) {
for my $sequence_map (@{$bio_sequence->{'sequence_map'}}) {
my $label = $sequence_map->{label};
if (defined $sequence_map->{annotations} &&
ref($sequence_map->{annotations}) eq 'HASH') {
for my $seq_feature (@{$sequence_map->{'annotations'}->{'seq_feature'}}) {
my $seq_location = $seq_feature->{'seq_location'};
my $start_coord = $seq_feature->{'least_start'}->[0];
my $feature_type = $seq_feature->{'feature_type'}->[0];
my $end_coord = $seq_feature->{'greatest_end'}->[0];
my $is_on_complement = $seq_feature->{'is_on_complement'}->[0];
my $feat = Bio::SeqFeature::Generic->
new(
-start => $start_coord,
-end => $end_coord,
-primary_tag => $feature_type,
);
if (defined $seq_feature->{'qualifier'} &&
ref($seq_feature->{'qualifier'}) eq 'ARRAY') {
for my $feature (@{$seq_feature->{'qualifier'}}) {
my $value = $feature->{'qualifier'};
my $feature_type = $feature->{'qualifier_type'};
for (my $i = 0;
$i < scalar @{$value};
$i++) {
$feat->add_tag_value(
$feature_type->[$i] => $value->[$i]
);
}
}
}
$seq->add_SeqFeature($feat);
}
}
}
}
push @{$self->{'sequence_objects'}}, $seq;
}
}
}
}
$self->{'seqs_stored'} = 1;
return;} |
sub next_seq
{
my ($self) = @_;
$self->_store_seqs if $self->{'seqs_stored'} == 0;
$self->throw("Error: No Bio::Seq objects stored yet!\n\n")
if !defined $self->{'sequence_objects'};
if (scalar @{$self->{'sequence_objects'}} > 0) {
return shift @{$self->{'sequence_objects'}};
} else {
return 0;
}} |
sub next_primary_seq
{ my $self=shift;
return 0;} |
sub write_seq
{
my ($self,@seqs) = @_;
foreach my $seq ( @seqs ) {
$self->_write_each_record( $seq ); }
return;} |
sub _write_each_record
{ my ($self,$seq) = @_;
my $output = new IO::File(">" . $self->{'file'});
my $writer = new XML::Writer(OUTPUT => $output,
NAMESPACES => 0,
DATA_MODE => 1,
DATA_INDENT => 2 ) ;
$writer->xmlDecl("UTF-8");
$writer->doctype("sciobj", '', "sciobj.dtd");
$writer ->startTag('sciobj',
'version', '2',
'release', '2');
$writer->startTag('contig', 'length', $seq->length);
my $annotation = $seq ->annotation;
if ( $annotation->get_Annotations('dblink') ) {
my $dblink = $annotation->get_Annotations('dblink')->[0] ;
$writer ->startTag('db_id',
'id', $dblink->primary_id ,
'db_code', $dblink->database );
} else {
$writer ->startTag('db_id',
'id', $seq->display_id ,
'db_code', 'default' );
}
$writer ->endTag('db_id') ;
$writer->startTag('fragment_order');
$writer->startTag('fragment_orientation');
$writer ->startTag('bio_sequence',
'sequence_id', $seq->display_id,
'seq_length', $seq->length,
'molecule_type', $self->alphabet,
);
$annotation = $seq ->annotation;
if ( $annotation->get_Annotations('dblink') ) {
my $dblink = $annotation->get_Annotations('dblink')->[0] ;
$writer ->startTag('db_id',
'id', $dblink->primary_id ,
'db_code', $dblink->database );
} else {
$writer ->startTag('db_id',
'id', $seq->display_id ,
'db_code', 'default' );
}
$writer ->endTag('db_id') ;
my $note = "" ;
foreach my $comment ( $annotation->get_Annotations('comment') ) {
$note .= $comment->text() . "\n";
}
$writer ->startTag('note');
$writer ->characters( $note ) ;
$writer ->endTag('note');
$writer ->startTag('description');
$writer->characters($seq->{primary_seq}->{desc});
$writer ->endTag('description');
foreach my $genename ( $annotation->get_Annotations('gene_name') ) {
$writer ->startTag('keyword');
$writer ->characters( $genename ) ;
$writer ->endTag('keyword');
}
foreach my $ref ( $annotation->get_Annotations('reference') ) {
$writer ->startTag('keyword');
my $medline = $ref->medline || 'null';
my $pubmed = $ref->pubmed || 'null';
my $database = $ref->database || 'null';
my $authors = $ref->authors || 'null';
my $title = $ref->title || 'null';
$writer ->characters( 'medline:' . "$medline" . ':' . 'pubmed:' .
"$pubmed" . ':' . 'database:' . "$database" .
':' .'authors:' . "$authors" . ':' . 'title:' . "$title" ) ;
$writer ->endTag('keyword');
}
$writer ->startTag('sequence');
$writer ->characters( $seq->seq ) ;
$writer ->endTag('sequence');
$writer ->startTag('xrefs');
foreach my $link ( $annotation->get_Annotations('dblink') ) {
$writer ->startTag('db_id',
'db_code', $link->database,
'id', $link->primary_id);
$writer ->characters( $link->comment ) ;
$writer ->endTag('db_id');
}
$writer ->endTag('xrefs') ;
my @feats = $seq->top_SeqFeatures ;
my $features;
my $maps;
foreach my $feature (@feats) {
my $map_type = $feature ->source_tag;
push (@{$maps->{ $map_type }}, $feature);
}
foreach my $map_type (keys %$maps ) {
$writer->startTag('sequence_map',
'label', $map_type );
$writer->startTag('annotations');
foreach my $feature ( @{$maps->{ $map_type }} ) {
$self->_write_seqfeature( $feature, $writer ) ;
}
$writer->endTag('annotations');
$writer->endTag('sequence_map');
}
$writer->endTag('bio_sequence');
$writer->endTag('fragment_orientation');
$writer->endTag('fragment_order');
$writer->endTag('contig');
$writer->endTag('sciobj');} |
sub _write_seqfeature
{
my ($self,$seqf, $writer) = @_;
$writer ->startTag('seq_feature',
'feature_type', $seqf->primary_tag() );
my $strand = $seqf->strand();
$strand = 0 if !defined $strand;
my $is_on_complement;
if ($strand == 1) {
$is_on_complement = 'true';
} else {
$is_on_complement = 'false';
}
$writer ->startTag('seq_location',
'lease_start', $seqf->start(),
'greatest_end', $seqf->end(),
'is_on_complement' , $is_on_complement);
$writer ->endTag('seq_location');
foreach my $tag ( $seqf->all_tags() ) {
$writer ->startTag('qualifier',
'qualifier_type', $tag);
$writer ->characters( $seqf->each_tag_value($tag) ) ;
$writer ->endTag('qualifier');
}
foreach my $subfeat ( $seqf->sub_SeqFeature ) {
$self->_write_seqfeature( $subfeat, $writer ) ;
}
$writer->endTag('seq_feature');
return;} |
sub _filehandle
{
my ($obj,$value) = @_;
if ( defined $value) {
$obj->{'_filehandle'} = $value;
}
return $obj->{'_filehandle'};} |
sub throw
{
my ($self, @s) = @_;
my $string = "[$.]" . join('', @s);
$self->SUPER::throw($string);
return;} |
General documentation
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
Report bugs to the Bioperl bug tracking system to help us keep track
the bugs and their resolution.
Bug reports can be submitted via the web:
http://bugzilla.open-bio.org/
| AUTHOR - Simon K. Chan | Top |
Email:
The rest of the documentation details each of the object
methods. Internal methods are usually preceded with a _