Bio::SeqIO
scf
Summary
Bio::SeqIO::scf - .scf file input/output stream
Package variables
Privates (from "my" definitions)
$dumper = new Dumpvalue()
Included modules
Inherit
Synopsis
Do not use this module directly. Use it via the Bio::SeqIO class, see
Bio::SeqIO for more information.
Description
This object can transform .scf files to and from
Bio::Seq::SeqWithQuality objects. Mechanisms are present to retrieve
trace data from scf files.
Methods
Methods description
Title : next_seq()
Usage : $scf = $stream->next_seq()
Function: returns the next scf sequence in the stream
Returns : a Bio::Seq::SequenceTrace object
Args : NONE
Notes : Fills the interface specification for SeqIO.
The SCF specification does not provide for having more then
one sequence in a given scf. So once the filehandle has been open
and passed to SeqIO don't expect to run this function more then
once on a given scf unless you embraced and extended the SCF
standard. (But that's just C R A Z Y talk, isn't it.) |
Title : _get_v3_quality()
Usage : $self->_get_v3_quality()
Function: Set the base qualities from version3 scf's
Returns : Nothing. Alters $self.
Args : None.
Notes : |
Title : _get_v3_peak_indices($buffer)
Usage : $self->_get_v3_peak_indices($buffer);
Function: Unpacks the base accuracies for version3 scf
Returns : Nothing. Alters $self
Args : A scalar containing binary data.
Notes : |
Title : _get_v3_base_accuracies($buffer)
Usage : $self->_get_v3_base_accuracies($buffer)
Function: Set the base accuracies for version 3 scf's
Returns : Nothing. Alters $self.
Args : A scalar containing binary data.
Notes : |
Title : _get_comments($buffer)
Usage : $self->_get_comments($buffer);
Function: Gather the comments section from the scf and parse it into its
components.
Returns : Nothing. Modifies $self.
Args : The buffer. It is expected that the buffer contains a binary
string for the comments section of an scf file according to
the scf file specifications.
Notes : None. Works like Jello. |
Title : _get_header($buffer)
Usage : $self->_get_header($buffer);
Function: Gather the header section from the scf and parse it into its
components.
Returns : Nothing. Modifies $self.
Args : The buffer. It is expected that the buffer contains a binary
string for the header section of an scf file according to the
scf file specifications.
Notes : None. |
Title : _parse_v2_bases($buffer)
Usage : $self->_parse_v2_bases($buffer);
Function: Gather the bases section from the scf and parse it into its
components.
Returns :
Args : The buffer. It is expected that the buffer contains a binary
string for the bases section of an scf file according to the
scf file specifications.
Notes : None. |
Title : _pares_v2_traces(\@traces_array)
Usage : $self->_parse_v2_traces(\@traces_array);
Function: Parses an scf Version2 trace array into its base components.
Returns : Nothing. Modifies $self.
Args : A reference to an array of the unpacked traces section of an
scf version2 file. |
Title : get_header()
Usage : %header = %{$obj->get_header()};
Function: Return the header for this scf.
Returns : A reference to a hash containing the header for this scf.
Args : None.
Notes : |
Title : write_seq(-SeqWithQuality => $swq, )
Usage : $obj->write_seq(
-target => $swq,
-version => 2,
-CONV => "Bioperl-Chads Mighty SCF writer.");
Function: Write out an scf.
Returns : Nothing.
Args : Requires: a reference to a SeqWithQuality object to form the
basis for the scf.
if -version is provided, it should be "2" or "3". A SCF of that
version will be written.
Any other arguments are assumed to be comments and are put into
the comments section of the scf. Read the specifications for scf
to decide what might be good to put in here.
Notes :
For best results, use a SequenceTrace object.
The things that you need to write an scf:
a) sequence
b) quality
c) peak indices
d) traces
- You _can_ write an scf with just a and b by passing in a
SequenceWithQuality object- false traces will be synthesized
for you. |
Title : _get_binary_header();
Usage : $self->_get_binary_header();
Function: Provide the binary string that will be used as the header for
a scfv2 document.
Returns : A binary string.
Args : None. Uses the entries in the $self->{'header'} hash. These
are set on construction of the object (hopefully correctly!).
Notes : |
Title : _set_binary_tracesbases($version,$ref)
Usage : $self->_set_binary_tracesbases($version,$ref);
Function: Constructs the trace and base strings for all scfs
Returns : Nothing. Alters self.
Args : $version - "2" or "3"
$sequence - a scalar containing arbitrary sequence data
$ref - a reference to either a SequenceTraces or a
SequenceWithQuality object.
Notes : This is a really complicated thing. |
Title : _make_trace_string($version)
Usage : $self->_make_trace_string($version)
Function: Merges trace data for the four bases to produce an scf
trace string. _requires_ $version
Returns : Nothing. Alters $self.
Args : $version - a version number. "2" or "3"
Notes : |
Title : _get_binary_comments(\@comments)
Usage : $self->_get_binary_comments(\@comments);
Function: Provide a binary string that will be the comments section of
the scf file. See the scf specifications for detailed
specifications for the comments section of an scf file. Hint:
CODE=something\nBODE=something\n\0
Returns :
Args : A reference to an array containing comments.
Notes : None. |
Title : _fill_missing_data($swq)
Usage : $self->_fill_missing_data($swq);
Function: If the $swq with quality has no qualities, set all qualities
to 0.
If the $swq has no sequence, set the sequence to N's.
Returns : Nothing. Modifies the SeqWithQuality that was passed as an
argument.
Args : A reference to a Bio::Seq::SeqWithQuality
Notes : None. |
Title : _delta(\@trace_data,$direction)
Usage : $self->_delta(\@trace_data,$direction);
Function:
Returns : A reference to an array containing modified trace values.
Args : A reference to an array containing trace data and a string
indicating the direction of conversion. ("forward" or
"backward").
Notes : This code is taken from the specification for SCF3.2.
http://www.mrc-lmb.cam.ac.uk/pubseq/manual/formats_unix_4.html |
Title : _unpack_magik($buffer)
Usage : $self->_unpack_magik($buffer)
Function: What unpack specification should be used? Try them all.
Returns : Nothing.
Args : A buffer containing arbitrary binary data.
Notes : Eliminate the ambiguity and the guesswork. Used in the
adaptation of _delta(), mostly. |
Title : read_from_buffer($filehandle,$buffer,$length)
Usage : $self->read_from_buffer($filehandle,$buffer,$length);
Function: Read from the buffer.
Returns : $buffer, containing a read of $length
Args : a filehandle, a buffer, and a read length
Notes : I just got tired of typing
"unless (length($buffer) == $length)" so I put it here. |
Title : _dump_keys()
Usage : &_dump_keys($a_reference_to_some_hash)
Function: Dump out the keys in a hash.
Returns : Nothing.
Args : A reference to a hash.
Notes : A debugging method. |
Title : _dump_base_accuracies()
Usage : $self->_dump_base_accuracies();
Function: Dump out the v3 base accuracies in an easy to read format.
Returns : Nothing.
Args : None.
Notes : A debugging method. |
Title : _dump_peak_indices_incoming()
Usage : $self->_dump_peak_indices_incoming();
Function: Dump out the v3 peak indices in an easy to read format.
Returns : Nothing.
Args : None.
Notes : A debugging method. |
Title : _dump_base_accuracies_incoming()
Usage : $self->_dump_base_accuracies_incoming();
Function: Dump out the v3 base accuracies in an easy to read format.
Returns : Nothing.
Args : None.
Notes : A debugging method. |
Title : _dump_comments()
Usage : $self->_dump_comments();
Function: Debug dump the comments section from the scf.
Returns : Nothing.
Args : Nothing.
Notes : None. |
Methods code
BEGIN { $DEFAULT_QUALITY= 10; } |
sub _initialize
{ my($self,@args) = @_;
$self->SUPER::_initialize(@args);
if( ! defined $self->sequence_factory ) {
$self->sequence_factory(new Bio::Seq::SeqFactory
(-verbose => $self->verbose(),
-type => 'Bio::Seq::SeqWithQuality'));
}} |
sub next_seq
{ my ($self) = @_;
my ($seq, $seqc, $fh, $buffer, $offset, $length, $read_bytes, @read,
%names);
$fh = $self->_filehandle();
unless ($fh) { if ( !fileno(ARGV) or eof(ARGV) ) {
return unless my $ARGV = shift;
open(ARGV,$ARGV) or
$self->throw("Could not open $ARGV for SCF stream reading $!");
}
$fh =\* ARGV;
}
binmode $fh; return unless read $fh, $buffer, 128;
my $creator;
$creator->{header} = $self->_get_header($buffer);
if ($creator->{header}->{'version'} lt "3.00") {
$length = $creator->{header}->{'samples'}*
$creator->{header}->{sample_size}*4;
$buffer = $self->read_from_buffer($fh,$buffer,$length);
$creator->{traces} = $self->_parse_v2_traces(
$buffer,$creator->{header}->{sample_size});
$offset = $creator->{header}->{bases_offset};
$length = ($creator->{header}->{bases} * 12);
seek $fh,$offset,0;
$buffer = $self->read_from_buffer($fh,$buffer,$length);
($creator->{peak_indices},
$creator->{qualities},
$creator->{sequence},
$creator->{accuracies}) = $self->_parse_v2_bases($buffer);
} else {
my $transformed_read;
foreach (qw(a c g t)) {
$length = $creator->{header}->{'samples'}*
$creator->{header}->{sample_size};
$buffer = $self->read_from_buffer($fh,$buffer,$length);
my $byte = "n";
if ($creator->{header}->{sample_size} == 1){
$byte = "c";
}
@read = unpack "${byte}${length}",$buffer;
foreach (@read) {
if ($_ > 30000) {
$_ -= 65536;
}
}
$transformed_read = $self->_delta(\@read,"backward");
if($creator->{header}->{sample_size} == 1){
foreach (@{$transformed_read}) {
$_ += 256 if ($_ < 0);
}
}
$creator->{'traces'}->{$_} = join(' ',@{$transformed_read});
}
$offset = $creator->{header}->{bases_offset};
$length = ($creator->{header}->{bases} * 4);
seek $fh,$offset,0;
$buffer = $self->read_from_buffer($fh,$buffer,$length);
$creator->{peak_indices} = $self->_get_v3_peak_indices($buffer);
$buffer = $self->read_from_buffer($fh,$buffer,$length);
$creator->{accuracies} = $self->_get_v3_base_accuracies($buffer);
$length = $creator->{header}->{bases};
$buffer = $self->read_from_buffer($fh,$buffer,$length);
$creator->{'sequence'} = unpack("a$length",$buffer);
$creator->{qualities} = $self->_get_v3_quality(
$creator->{'sequence'},$creator->{accuracies});
}
$offset = $creator->{header}->{comments_offset};
seek $fh,$offset,0;
$length = $creator->{header}->{comment_size};
$buffer = $self->read_from_buffer($fh,$buffer,$length);
$creator->{comments} = $self->_get_comments($buffer);
my $swq = Bio::Seq::SeqWithQuality->new(
-seq => $creator->{'sequence'},
-qual => $creator->{'qualities'},
-id => $creator->{'comments'}->{'NAME'}
);
my $returner = Bio::Seq::SequenceTrace->new(
-swq => $swq,
-trace_a => $creator->{'traces'}->{'a'},
-trace_t => $creator->{'traces'}->{'t'},
-trace_g => $creator->{'traces'}->{'g'},
-trace_c => $creator->{'traces'}->{'c'},
-accuracy_a => $creator->{'accuracies'}->{'a'},
-accuracy_t => $creator->{'accuracies'}->{'t'},
-accuracy_g => $creator->{'accuracies'}->{'g'},
-accuracy_c => $creator->{'accuracies'}->{'c'},
-peak_indices => $creator->{'peak_indices'}
);
return $returner;} |
sub _get_v3_quality
{ my ($self,$sequence,$accuracies) = @_;
my @bases = split//,$sequence;
my (@qualities,$currbase,$currqual,$counter);
for ($counter=0; $counter <= $#bases ; $counter++) {
$currbase = lc($bases[$counter]);
if ($currbase eq "a") { $currqual = $accuracies->{'a'}->[$counter]; }
elsif ($currbase eq "c") { $currqual = $accuracies->{'c'}->[$counter]; }
elsif ($currbase eq "g") { $currqual = $accuracies->{'g'}->[$counter]; }
elsif ($currbase eq "t") { $currqual = $accuracies->{'t'}->[$counter]; }
else { $currqual = "unknown"; }
push @qualities,$currqual;
}
return\@ qualities;} |
sub _get_v3_peak_indices
{ my ($self,$buffer) = @_;
my $length = length($buffer);
my @read = unpack "N$length",$buffer;
return join(' ',@read);} |
sub _get_v3_base_accuracies
{ my ($self,$buffer) = @_;
my $length = length($buffer);
my $qlength = $length/4; my $offset = 0;
my (@qualities,@sorter,$counter,$round,$last_base,$accuracies,$currbase);
foreach $currbase (qw(a c g t)) {
my @read;
$last_base = $offset + $qlength;
for (;$offset < $last_base; $offset += $qlength) {
@read = unpack "c$qlength", substr($buffer,$offset,$qlength);
$accuracies->{$currbase} =\@ read;
}
}
return $accuracies;} |
sub _get_comments
{ my ($self,$buffer) = @_;
my $comments;
my $size = length($buffer);
my $comments_retrieved = unpack "a$size",$buffer;
$comments_retrieved =~ s/\0//;
my @comments_split = split/\n/,$comments_retrieved;
if (@comments_split) {
foreach (@comments_split) {
/(\w+)=(.*)/;
if ($1 && $2) {
$comments->{$1} = $2;
}
}
}
return $comments;} |
sub _get_header
{ my ($self,$buffer) = @_;
my $header;
($header->{'scf'},
$header->{'samples'},
$header->{'sample_offset'},
$header->{'bases'},
$header->{'bases_left_clip'},
$header->{'bases_right_clip'},
$header->{'bases_offset'},
$header->{'comment_size'},
$header->{'comments_offset'},
$header->{'version'},
$header->{'sample_size'},
$header->{'code_set'},
@{$header->{'header_spare'}} ) = unpack "a4 NNNNNNNN a4 NN N20", $buffer;
return $header;} |
sub _parse_v2_bases
{ my ($self,$buffer) = @_;
my $length = length($buffer);
my ($offset2,$currbuff,$currbase,$currqual,$sequence,@qualities,@indices);
my (@read,$harvester,$accuracies);
for ($offset2=0;$offset2<$length;$offset2+=12) {
@read = unpack "N C C C C a C3", substr($buffer,$offset2,$length);
push @indices,$read[0];
$currbase = lc($read[5]);
if ($currbase eq "a") { $currqual = $read[1]; }
elsif ($currbase eq "c") { $currqual = $read[2]; }
elsif ($currbase eq "g") { $currqual = $read[3]; }
elsif ($currbase eq "t") { $currqual = $read[4]; }
else { $currqual = "UNKNOWN"; }
push @{$accuracies->{"a"}},$read[1];
push @{$accuracies->{"c"}},$read[2];
push @{$accuracies->{"g"}},$read[3];
push @{$accuracies->{"t"}},$read[4];
$sequence .= $currbase;
push @qualities,$currqual;
}
return (\@indices,\@qualities,$sequence,$accuracies)} |
sub _parse_v2_traces
{ my ($self,$buffer,$sample_size) = @_;
my $byte;
if ($sample_size == 1) { $byte = "c"; }
else { $byte = "n"; }
my $length = CORE::length($buffer);
my @read = unpack "${byte}${length}",$buffer;
my $traces;
my $array = 0;
for (my $offset2 = 0; $offset2< scalar(@read); $offset2+=4) {
push @{$traces->{'a'}},$read[$offset2];
push @{$traces->{'t'}},$read[$offset2+1];
push @{$traces->{'g'}},$read[$offset2+3];
push @{$traces->{'c'}},$read[$offset2+2];
}
return $traces;} |
| get_trace_deprecated_use_the_sequencetrace_object_instead | description | prev | next | Top |
sub get_trace_deprecated_use_the_sequencetrace_object_instead
{
} |
| _deprecated_get_peak_indices_deprecated_use_the_sequencetrace_object_instead | description | prev | next | Top |
sub _deprecated_get_peak_indices_deprecated_use_the_sequencetrace_object_instead
{ my ($self) = shift;
my @temp = split(' ',$self->{'parsed'}->{'peak_indices'});
return\@ temp;} |
sub get_header
{ my ($self) = shift;
my %header;
foreach (qw(scf samples sample_offset bases bases_left_clip
bases_right_clip bases_offset comment_size comments_offset
version sample_size code_set peak_indices)) {
$header{"$_"} = $self->{"$_"};
}
return\% header;} |
| _dump_traces_incoming_deprecated_use_the_sequencetrace_object | description | prev | next | Top |
sub _dump_traces_incoming_deprecated_use_the_sequencetrace_object
{
} |
| _dump_traces_outgoing_deprecated_use_the_sequencetrace_object | description | prev | next | Top |
sub _dump_traces_outgoing_deprecated_use_the_sequencetrace_object
{ my ($self,$transformed) = @_;
my (@sA,@sT,@sG,@sC);
if ($transformed) {
@sA = @{$self->{'text'}->{'t_samples_a'}};
@sC = @{$self->{'text'}->{'t_samples_c'}};
@sG = @{$self->{'text'}->{'t_samples_g'}};
@sT = @{$self->{'text'}->{'t_samples_t'}};
}
else {
@sA = @{$self->{'text'}->{'samples_a'}};
@sC = @{$self->{'text'}->{'samples_c'}};
@sG = @{$self->{'text'}->{'samples_g'}};
@sT = @{$self->{'text'}->{'samples_t'}};
}
print ("Count\ta\tc\tg\tt\n");
for (my $curr=0; $curr < scalar(@sG); $curr++) {
print("$curr\t$sA[$curr]\t$sC[$curr]\t$sG[$curr]\t$sT[$curr]\n");
}
return;} |
sub write_seq
{ my ($self,%args) = @_;
my %comments;
my ($label,$arg);
my ($swq) = $self->_rearrange([qw(TARGET)], %args);
my $writer_fodder;
if (ref($swq) =~ /Bio::Seq::SequenceTrace|Bio::Seq::SeqWithQuality/) {
if (ref($swq) eq "Bio::Seq::SeqWithQuality") {
my $swq2 = new Bio::Seq::SequenceTrace(
-swq => $swq
);
$swq2->_synthesize_traces();
$swq2->set_accuracies();
$swq = $swq2;
}
}
else {
$self->throw("You must pass a Bio::Seq::SeqWithQuality or a Bio::Seq::SequenceTrace object to write_seq as a parameter named\" target\"");
}
foreach $arg (sort keys %args) {
next if ($arg =~ /target/i);
($label = $arg) =~ s/^\-//;
$writer_fodder->{comments}->{$label} = $args{$arg};
}
if (!$comments{'NAME'}) { $comments{'NAME'} = $swq->id(); }
$writer_fodder->{comments}->{'CONV'} = "Bioperl-Chads Mighty SCF writer." unless defined $comments{'CONV'};
if ($writer_fodder->{comments}->{version}) {
if ($writer_fodder->{comments}->{version} != 2 && $comments{version} != 3) {
$self->warn("This module can only write version 2.0 or 3.0 scf's. Writing a version 2.0 scf by default.");
$writer_fodder->{header}->{version} = "2.00";
}
elsif ($writer_fodder->{comments}->{'version'} > 2) {
$writer_fodder->{header}->{'version'} = "3.00";
}
else {
$writer_fodder->{header}->{version} = "2";
}
}
else {
$writer_fodder->{header}->{'version'} = "3.00";
}
$writer_fodder->{'header'}->{'magic'} = ".scf";
$writer_fodder->{'header'}->{'sample_size'} = "2";
$writer_fodder->{'header'}->{'bases'} = length($swq->seq());
$writer_fodder->{'header'}->{'bases_left_clip'} = "0";
$writer_fodder->{'header'}->{'bases_right_clip'} = "0";
$writer_fodder->{'header'}->{'sample_size'} = "2";
$writer_fodder->{'header'}->{'code_set'} = "9";
@{$writer_fodder->{'header'}->{'spare'}} = qw(0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0);
$writer_fodder->{'header'}->{'samples_offset'} = "128";
$writer_fodder->{'header'}->{'samples'} = $swq->trace_length();
$writer_fodder->{comments} = $self->_get_binary_comments(
$writer_fodder->{comments});
$writer_fodder->{traces} = $self->_get_binary_traces(
$writer_fodder->{'header'}->{'version'},
$swq,$writer_fodder->{'header'}->{'sample_size'});
my ($b_base_offsets,$b_base_accuracies,$samples_size,$bases_size);
if ($writer_fodder->{'header'}->{'version'} == 2) {
$writer_fodder->{bases} = $self->_get_binary_bases(
2,
$swq,
$writer_fodder->{'header'}->{'sample_size'});
$samples_size = CORE::length($writer_fodder->{traces}->{'binary'});
$bases_size = CORE::length($writer_fodder->{bases}->{binary});
$writer_fodder->{'header'}->{'bases_offset'} = 128 + $samples_size;
$writer_fodder->{'header'}->{'comments_offset'} = 128 +
$samples_size + $bases_size;
$writer_fodder->{'header'}->{'comments_size'} =
length($writer_fodder->{'comments'}->{binary});
$writer_fodder->{'header'}->{'private_size'} = "0";
$writer_fodder->{'header'}->{'private_offset'} = 128 +
$samples_size + $bases_size +
$writer_fodder->{'header'}->{'comments_size'};
$writer_fodder->{'header'}->{'binary'} =
$self->_get_binary_header($writer_fodder->{header});
$dumper->dumpValue($writer_fodder);
$self->_print ($writer_fodder->{'header'}->{'binary'})
or print("Could not write binary header...\n");
$self->_print ($writer_fodder->{'traces'}->{'binary'})
or print("Could not write binary traces...\n");
$self->_print ($writer_fodder->{'bases'}->{'binary'})
or print("Could not write binary base structures...\n");
$self->_print ($writer_fodder->{'comments'}->{'binary'})
or print("Could not write binary comments...\n");
}
else {
($writer_fodder->{peak_indices},
$writer_fodder->{accuracies},
$writer_fodder->{bases},
$writer_fodder->{reserved} ) =
$self->_get_binary_bases(
3,
$swq,
$writer_fodder->{'header'}->{'sample_size'}
);
$writer_fodder->{'header'}->{'bases_offset'} = 128 +
length($writer_fodder->{'traces'}->{'binary'});
$writer_fodder->{'header'}->{'comments_size'} =
length($writer_fodder->{'comments'}->{'binary'});
$writer_fodder->{'header'}->{'private_size'} = "0";
$writer_fodder->{'header'}->{'comments_offset'} =
128+length($writer_fodder->{'traces'}->{'binary'})+
length($writer_fodder->{'peak_indices'}->{'binary'})+
length($writer_fodder->{'accuracies'}->{'binary'})+
length($writer_fodder->{'bases'}->{'binary'})+
length($writer_fodder->{'reserved'}->{'binary'});
$writer_fodder->{'header'}->{'private_offset'} =
$writer_fodder->{'header'}->{'comments_offset'} +
$writer_fodder->{'header'}->{'comments_size'};
$writer_fodder->{'header'}->{'spare'}->[1] =
$writer_fodder->{'header'}->{'comments_offset'} +
length($writer_fodder->{'comments'}->{'binary'});
$writer_fodder->{header}->{binary} =
$self->_get_binary_header($writer_fodder->{header});
$self->_print ($writer_fodder->{'header'}->{'binary'})
or print("Couldn't write header\n");
$self->_print ($writer_fodder->{'traces'}->{'binary'})
or print("Couldn't write samples\n");
$self->_print ($writer_fodder->{'peak_indices'}->{'binary'})
or print("Couldn't write peak offsets\n");
$self->_print ($writer_fodder->{'accuracies'}->{'binary'})
or print("Couldn't write accuracies\n");
$self->_print ($writer_fodder->{'bases'}->{'binary'})
or print("Couldn't write called_bases\n");
$self->_print ($writer_fodder->{'reserved'}->{'binary'})
or print("Couldn't write reserved\n");
$self->_print ($writer_fodder->{'comments'}->{'binary'})
or print ("Couldn't write comments\n");
}
$self->flush if $self->_flush_on_write && defined $self->_fh;
$self->close();} |
sub _get_binary_header
{ my ($self,$header) = @_;
my $binary = pack "a4 NNNNNNNN a4 NN N20",
(
$header->{'magic'},
$header->{'samples'},
$header->{'samples_offset'},
$header->{'bases'},
$header->{'bases_left_clip'},
$header->{'bases_right_clip'},
$header->{'bases_offset'},
$header->{'comments_size'},
$header->{'comments_offset'},
$header->{'version'},
$header->{'sample_size'},
$header->{'code_set'},
@{$header->{'spare'}}
);
return $binary;} |
sub _get_binary_traces
{ my ($self,$version,$ref,$sample_size) = @_;
my $returner;
my $sequence = $ref->seq();
my $sequence_length = length($sequence);
my ($traceobj,@traces,$current);
if ( ref($ref) eq "Bio::Seq::SeqWithQuality" ) {
$traceobj = new Bio::Seq::SeqWithQuality(
-target => $ref
);
$traceobj->_synthesize_traces();
}
else {
$traceobj = $ref;
if ($version eq "2") {
my $trace_length = $traceobj->trace_length();
for ($current = 1; $current <= $trace_length; $current++) {
foreach (qw(a c g t)) {
push @traces,$traceobj->trace_value_at($_,$current);
}
}
}
elsif ($version == 3) {
foreach my $current_trace (qw(a c g t)) {
my @trace = @{$traceobj->trace($current_trace)};
foreach (@trace) {
if ($_ > 30000) {
$_ -= 65536;
}
}
my $transformed = $self->_delta(\@trace,"forward");
if($sample_size == 1){
foreach (@{$transformed}) {
$_ += 256 if ($_ < 0);
}
}
push @traces,@{$transformed};
}
}
}
$returner->{version} = $version;
$returner->{string} =\@ traces;
my $length_of_traces = scalar(@traces);
my $byte;
if ($sample_size == 1) { $byte = "c"; } else { $byte = "n"; }
$returner->{binary} = pack "n${length_of_traces}",@traces;
$returner->{length} = CORE::length($returner->{binary});
return $returner;} |
sub _get_binary_bases
{ my ($self,$version,$trace,$sample_size) = @_;
my $byte;
if ($sample_size == 1) { $byte = "c"; } else { $byte = "n"; }
my ($returner,@current_row,$current_base,$string,$binary);
my $length = $trace->length();
if ($version == 2) {
$returner->{'version'} = "2";
for (my $current_base =1; $current_base < $length; $current_base++) {
my @current_row;
push @current_row,$trace->peak_index_at($current_base);
push @current_row,$trace->accuracy_at("a",$current_base);
push @current_row,$trace->accuracy_at("c",$current_base);
push @current_row,$trace->accuracy_at("g",$current_base);
push @current_row,$trace->accuracy_at("t",$current_base);
push @current_row,$trace->baseat($current_base);
push @current_row,0,0,0;
push @{$returner->{string}},@current_row;
$returner->{binary} .= pack "N C C C C a C3",@current_row;
}
return $returner;
}
else {
$returner->{'version'} = "3.00";
$returner->{peak_indices}->{string} = $trace->peak_indices();
my $length = scalar(@{$returner->{peak_indices}->{string}});
$returner->{peak_indices}->{binary} =
pack "N$length",@{$returner->{peak_indices}->{string}};
$returner->{peak_indices}->{length} =
CORE::length($returner->{peak_indices}->{binary});
my @accuracies;
foreach my $base (qw(a c g t)) {
$returner->{accuracies}->{$base} = $trace->accuracies($base);
push @accuracies,@{$trace->accuracies($base)};
}
$returner->{sequence} = $trace->seq();
$length = scalar(@accuracies);
$returner->{accuracies}->{binary} = pack "c${length}",@accuracies;
$returner->{accuracies}->{length} =
CORE::length($returner->{accuracies}->{binary});
$length = $trace->seq_obj()->length();
for (my $count=0; $count< $length; $count++) {
push @{$returner->{reserved}->{string}},0,0,0;
}
}
$length = scalar(@{$returner->{reserved}->{string}});
$returner->{'reserved'}->{'binary'} =
pack "c$length",@{$returner->{reserved}->{string}};
$returner->{'reserved'}->{'length'} =
CORE::length($returner->{'reserved'}->{'binary'});
my @bases = split('',$trace->seq());
$length = $trace->length();
$returner->{'bases'}->{'binary'} = $trace->seq();
return ($returner->{peak_indices},
$returner->{accuracies},
$returner->{bases},
$returner->{reserved});} |
sub _make_trace_string
{ my ($self,$version) = @_;
my @traces;
my @traces_view;
my @as = @{$self->{'text'}->{'samples_a'}};
my @cs = @{$self->{'text'}->{'samples_c'}};
my @gs = @{$self->{'text'}->{'samples_g'}};
my @ts = @{$self->{'text'}->{'samples_t'}};
if ($version == 2) {
for (my $curr=0; $curr < scalar(@as); $curr++) {
$as[$curr] = $DEFAULT_QUALITY unless defined $as[$curr];
$cs[$curr] = $DEFAULT_QUALITY unless defined $cs[$curr];
$gs[$curr] = $DEFAULT_QUALITY unless defined $gs[$curr];
$ts[$curr] = $DEFAULT_QUALITY unless defined $ts[$curr];
push @traces,($as[$curr],$cs[$curr],$gs[$curr],$ts[$curr]);
}
}
elsif ($version == 3) {
@traces = (@as,@cs,@gs,@ts);
}
else {
$self->throw("No idea what version required to make traces here. You gave #$version# Bailing.");
}
my $length = scalar(@traces);
$self->{'text'}->{'samples_all'} =\@ traces;} |
sub _get_binary_comments
{ my ($self,$rcomments) = @_;
my $returner;
my $comments_string = '';
my %comments = %$rcomments;
foreach my $key (sort keys %comments) {
$comments{$key} ||= '';
$comments_string .= "$key=$comments{$key}\n";
}
$comments_string .= "\n\0";
my $length = CORE::length($comments_string);
$returner->{length} = $length;
$returner->{string} = $comments_string;
$returner->{binary} = pack "A$length",$comments_string;
return $returner;} |
sub _fill_missing_data
{ my ($self,$swq) = @_;
my $qual_obj = $swq->qual_obj();
my $seq_obj = $swq->seq_obj();
if ($qual_obj->length() == 0 && $seq_obj->length() != 0) {
my $fake_qualities = ("$DEFAULT_QUALITY ")x$seq_obj->length();
$swq->qual($fake_qualities);
}
if ($seq_obj->length() == 0 && $qual_obj->length != 0) {
my $sequence = ("N")x$qual_obj->length();
$swq->seq($sequence);
}} |
sub _delta
{ my ($self,$rsamples,$direction) = @_;
my @samples = @$rsamples;
my ($i,$num_samples,$p_delta,$p_sample,@samples_converted,$p_sample1,$p_sample2);
my $SLOW_BUT_CLEAR = 0;
$num_samples = scalar(@samples);
if ( $direction eq "forward" ) {
if($SLOW_BUT_CLEAR){
$p_delta = 0;
for ($i=0; $i < $num_samples; $i++) {
$p_sample = $samples[$i];
$samples[$i] = $samples[$i] - $p_delta;
$p_delta = $p_sample;
}
$p_delta = 0;
for ($i=0; $i < $num_samples; $i++) {
$p_sample = $samples[$i];
$samples[$i] = $samples[$i] - $p_delta;
$p_delta = $p_sample;
}
} else {
for ($i = $num_samples-1; $i > 1; $i--){
$samples[$i] = $samples[$i] - 2*$samples[$i-1] + $samples[$i-2];
}
$samples[1] = $samples[1] - 2*$samples[0];
}
}
elsif ($direction eq "backward") {
if($SLOW_BUT_CLEAR){
$p_sample = 0;
for ($i=0; $i < $num_samples; $i++) {
$samples[$i] = $samples[$i] + $p_sample;
$p_sample = $samples[$i];
}
$p_sample = 0;
for ($i=0; $i < $num_samples; $i++) {
$samples[$i] = $samples[$i] + $p_sample;
$p_sample = $samples[$i];
}
} else {
$p_sample1 = $p_sample2 = 0;
for ($i = 0; $i < $num_samples; $i++){
$p_sample1 = $p_sample1 + $samples[$i];
$samples[$i] = $p_sample1 + $p_sample2;
$p_sample2 = $samples[$i];
}
}
}
else {
$self->warn("Bad direction. Use\" forward\" or\" backward\".");
}
return\@ samples; } |
sub _unpack_magik
{ my ($self,$buffer) = @_;
my $length = length($buffer);
my (@read,$counter);
foreach (qw(c C s S i I l L n N v V)) {
@read = unpack "$_$length", $buffer;
for ($counter=0; $counter < 20; $counter++) {
print("$read[$counter]\n");
}
}} |
sub read_from_buffer
{ my ($self,$fh,$buffer,$length) = @_;
read $fh, $buffer, $length;
unless (length($buffer) == $length) {
$self->warn("The read was incomplete! Trying harder.");
my $missing_length = $length - length($buffer);
my $buffer2;
read $fh,$buffer2,$missing_length;
$buffer .= $buffer2;
if (length($buffer) != $length) {
$self->throw("Unexpected end of file while reading from SCF file. I should have read $length but instead got ".length($buffer)."! Current file position is ".tell($fh).".");
}
}
return $buffer;} |
sub _dump_keys
{ my $rhash = shift;
if ($rhash !~ /HASH/) {
print("_dump_keys: that was not a hash.\nIt was #$rhash# which was this reference:".ref($rhash)."\n");
return;
}
print("_dump_keys: The keys for $rhash are:\n");
foreach (sort keys %$rhash) {
print("$_\n");
}} |
sub _dump_base_accuracies
{ my $self = shift;
print("Dumping base accuracies! for v3\n");
print("There are this many elements in a,c,g,t:\n");
print(scalar(@{$self->{'text'}->{'v3_base_accuracy_a'}}).",".scalar(@{$self->{'text'}->{'v3_base_accuracy_c'}}).",".scalar(@{$self->{'text'}->{'v3_base_accuracy_g'}}).",".scalar(@{$self->{'text'}->{'v3_base_accuracy_t'}})."\n");
my $number_traces = scalar(@{$self->{'text'}->{'v3_base_accuracy_a'}});
for (my $counter=0; $counter < $number_traces; $counter++ ) {
print("$counter\t");
print $self->{'text'}->{'v3_base_accuracy_a'}->[$counter]."\t";
print $self->{'text'}->{'v3_base_accuracy_c'}->[$counter]."\t";
print $self->{'text'}->{'v3_base_accuracy_g'}->[$counter]."\t";
print $self->{'text'}->{'v3_base_accuracy_t'}->[$counter]."\t";
print("\n");
}} |
sub _dump_peak_indices_incoming
{ my $self = shift;
print("Dump peak indices incoming!\n");
my $length = $self->{'bases'};
print("The length is $length\n");
for (my $count=0; $count < $length; $count++) {
print("$count\t$self->{parsed}->{peak_indices}->[$count]\n");
}} |
sub _dump_base_accuracies_incoming
{ my $self = shift;
print("Dumping base accuracies! for v3\n");
my $number_traces = $self->{'bases'};
for (my $counter=0; $counter < $number_traces; $counter++ ) {
print("$counter\t");
foreach (qw(A T G C)) {
print $self->{'parsed'}->{'base_accuracies'}->{$_}->[$counter]."\t";
}
print("\n");
}} |
sub _dump_comments
{ my ($self) = @_;
warn ("SCF comments:\n");
foreach my $k (keys %{$self->{'comments'}}) {
warn ("\t {$k} ==> ", $self->{'comments'}->{$k}, "\n");
}} |
General documentation
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://www.bioperl.org/MailList.shtml - About the mailing lists
Report bugs to the Bioperl bug tracking system to help us keep track
the bugs and their resolution. Bug reports can be submitted via email
or the web:
bioperl-bugs@bio.perl.org
http://bugzilla.bioperl.org/
The rest of the documentation details each of the object
methods. Internal methods are usually preceded with a _