| new | Description | Code |
| debug | No description | Code |
| features_db | Description | Code |
| dbh | No description | Code |
| get_dna | Description | Code |
| get_abscoords | Description | Code |
| get_features | Description | Code |
| classes | Description | Code |
| make_classes_query | Description | Code |
| _feature_by_name | Description | Code |
| _feature_by_id | Description | Code |
| _feature_by_attribute | No description | Code |
| get_types | Description | Code |
| range_query | Description | Code |
| make_features_by_range_where_part | Description | Code |
| do_straight_join | Description | Code |
| string_match | Description | Code |
| exact_match | Description | Code |
| search_notes | Description | Code |
| meta | Description | Code |
| make_meta_get_query | Description | Code |
| dna_chunk_size | No description | Code |
| make_meta_set_query | Description | Code |
| default_meta_values | Description | Code |
| min_bin | No description | Code |
| max_bin | No description | Code |
| straight_join_limit | No description | Code |
| get_features_iterator | Description | Code |
| do_initialize | Description | Code |
| finish_load | Description | Code |
| create_other_schema_objects | Description | Code |
| drop_all | Description | Code |
| clone | Description | Code |
| drop_other_schema_objects | Description | Code |
| make_features_select_part | Description | Code |
| tables | Description | Code |
| schema | Description | Code |
| DESTROY | Description | Code |
| make_features_by_name_where_part | Description | Code |
| make_features_by_alias_where_part | No description | Code |
| make_features_by_attribute_where_part | No description | Code |
| make_features_by_id_where_part | Description | Code |
| make_features_by_gid_where_part | Description | Code |
| make_features_from_part | Description | Code |
| make_features_join_part | Description | Code |
| make_features_order_by_part | Description | Code |
| make_features_group_by_part | Description | Code |
| refseq_query | Description | Code |
| do_attributes | No description | Code |
| overlap_query_nobin | Description | Code |
| contains_query_nobin | Description | Code |
| contained_in_query_nobin | Description | Code |
| types_query | Description | Code |
| make_types_select_part | Description | Code |
| make_types_from_part | Description | Code |
| make_types_join_part | Description | Code |
| make_types_where_part | Description | Code |
| make_types_group_part | Description | Code |
| get_feature_id | Description | Code |
| make_abscoord_query | Description | Code |
| make_aliasabscoord_query | No description | Code |
| getseqcoords_query | No description | Code |
| getaliascoords_query | No description | Code |
| bin_query | No description | Code |
| _bin_query | No description | Code |
| overlap_query | No description | Code |
| contains_query | No description | Code |
| contained_in_query | No description | Code |
| _delete_fattribute_to_feature | No description | Code |
| _delete_features | No description | Code |
| _delete_groups | No description | Code |
| _delete | No description | Code |
| feature_summary | Description | Code |
| coverage_array | Description | Code |
| build_summary_statistics | Description | Code |
| _load_bins | No description | Code |
| _add_interval_stats_table | No description | Code |
| enable_keys | No description | Code |
| new | code | next | Top |
Title : new
This is the constructor for the adaptor. It is called automatically by Bio::DB::GFF->new. In addition to arguments that are common among all adaptors, the following class-specific arguments are recognized:
Argument Description |
| features_db | code | prev | next | Top |
Title : features_db |
| get_dna | code | prev | next | Top |
Title : get_dna
This method performs the low-level fetch of a DNA substring given its name, class and the desired range. It is actually a front end to the abstract method make_dna_query(), which it calls after some argument consistency checking. |
| get_abscoords | code | prev | next | Top |
Title : get_abscoords
This method performs the low-level resolution of a landmark into a reference sequence and position. The result is an array ref, each element of which is a five-element list containing reference sequence name, class, start, stop and strand. |
| get_features | code | prev | next | Top |
Title : get_features
This is the low-level method that is called to retrieve GFF lines from the database. It is responsible for retrieving features that satisfy range and feature type criteria, and passing the GFF fields to a callback subroutine. See the manual page for Bio::DB::GFF for the interpretation of the arguments and how the information retrieved by get_features is passed to the callback for processing.
Internally, get_features() is a front end for range_query(). The latter method constructs the query and executes it. get_features() calls fetchrow_array() to recover the fields and passes them to the callback. |
| classes | code | prev | next | Top |
Title : classes
This routine returns the list of reference classes known to the database, or an empty list if classes are not used by the database. Classes are distinct from types, being essentially qualifiers on the reference namespaces.
NOTE: In the current mysql-based schema, this query takes a while to run because the classes are not normalized. |
| make_classes_query | code | prev | next | Top |
Title : make_classes_query |
| _feature_by_name | code | prev | next | Top |
Title : _feature_by_name
This method is used internally. The callback arguments are those used by make_feature(). Internally, it invokes the following abstract procedures:
  make_features_select_part |
| _feature_by_id | code | prev | next | Top |
Title : _feature_by_id
This method is used internally. The $type selector is one of "feature" or "group". The callback arguments are those used by make_feature(). Internally, it invokes the following abstract procedures:
  make_features_select_part |
| get_types | code | prev | next | Top |
Title : get_types
This method is responsible for fetching the list of feature type names from the database. The query may be limited to a particular range, in which case the range is indicated by a landmark sequence name and class and its subrange, if any. These arguments may be undef if it is desired to retrieve all feature types in the database (which may be a slow operation in some implementations).
If the $count flag is false, the method returns a simple list of Bio::DB::GFF::Typename objects. If $count is true, the method returns a list of $name=>$count pairs, where $count indicates the number of times this feature occurs in the range.
Internally, this method calls upon the following functions to generate the SQL and its bind variables:
  ($q1,@args) = make_types_select_part(@args);
The components are then combined as follows:
  $query = "SELECT $q1 FROM $q2 WHERE $q3 AND $q4 GROUP BY $q5";
If any of the query fragments contain the ? bind variable, then the same number of bind arguments must be provided in @args. The fragment-generating functions are described below. |
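The fragment-and-bind-value convention described for get_types() can be sketched in isolation. The following is a minimal illustration only, not the adaptor's code: the five *_part subroutines are hypothetical stand-ins for the make_types_*_part methods, and the table and column names (fdata, ftype, fref, fstart, fstop) are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the fragment generators. Each returns an SQL
# fragment followed by the bind values for any "?" placeholders it contains.
sub select_part { return ('fmethod,fsource,count(*)') }
sub from_part   { return ('fdata,ftype') }
sub join_part   { return ('fdata.ftypeid=ftype.ftypeid') }
sub where_part  {
    my ($ref,$start,$stop) = @_;
    # assumed column names; three placeholders, three bind values
    return ('fref=? AND fstop>=? AND fstart<=?', $ref,$start,$stop);
}
sub group_part  { return ('fmethod,fsource') }

# Combine the fragments into one query and concatenate the bind values in
# the same order in which the fragments appear in the SQL text.
sub build_types_query {
    my @range = @_;
    my ($q1,@a1) = select_part(@range);
    my ($q2,@a2) = from_part(@range);
    my ($q3,@a3) = join_part(@range);
    my ($q4,@a4) = where_part(@range);
    my ($q5,@a5) = group_part(@range);
    my $query = "SELECT $q1 FROM $q2 WHERE $q3 AND $q4";
    $query .= " GROUP BY $q5" if $q5;
    return ($query, @a1,@a2,@a3,@a4,@a5);
}

my ($query,@bind) = build_types_query('chr1', 1000, 2000);
my $placeholders = () = $query =~ /\?/g;   # count the "?" markers
print "$query\n";
print "placeholders: $placeholders, bind values: @bind\n";
```

Keeping each fragment's bind values next to the fragment that needs them is what lets the placeholder count and the argument count stay in step when a subclass overrides only one of the generators.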
| range_query | code | prev | next | Top |
Title : range_query
This method constructs the statement handle for this module's central query: given a range and/or a list of feature types, fetch their GFF records. The positional arguments are as follows:
Argument Description
If successful, this method returns a statement handle. The handle is expected to return the fields described for get_features().
Internally, range_query() makes calls to the following methods, each of which is expected to be overridden in subclasses:
  $select = $self->make_features_select_part;
The query that is constructed looks like this:
  SELECT $select FROM $from WHERE $join AND $where
The arguments that are returned from make_features_by_range_where_part() are passed to the statement handler's execute() method.
range_query() also calls a do_straight_join() method, described below. If this method returns true, then the keyword "straight_join" is inserted right after SELECT. |
| make_features_by_range_where_part | code | prev | next | Top |
Title : make_features_by_range_where_part
This method creates the part of the features query that immediately follows the WHERE keyword and is ANDed with the string returned by make_features_join_part().
The six positional arguments are a flag indicating whether to perform a range search or an overlap search, the reference sequence, class, start and stop, all of which define an optional range to search in, and an array reference containing a list of [$method,$source] pairs.
The method result is a multi-element list containing the query string and the list of runtime arguments to bind to it with the execute() method.
This method's job is to clean up arguments and perform consistency checking. The real work is done by the following abstract methods:
Method Description
See Bio::DB::GFF::Adaptor::dbi::mysql for an example of how this works. |
| do_straight_join | code | prev | next | Top |
Title : do_straight_join
This subroutine, called by range_query(), returns a boolean flag. If true, range_query() will perform a straight join, which can be used to optimize certain SQL queries. The four arguments correspond to similarly-named arguments passed to range_query(). |
| string_match | code | prev | next | Top |
Title : string_match
This method examines the passed value for metacharacters. If any are found, it produces a SQL fragment that performs a regular expression match. Otherwise, it produces a fragment that performs an exact string match.
This method is not used in the module, but is available for use by subclasses. |
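As a rough illustration of the rule described for string_match(), the sketch below applies the same character test that appears in the method's code listing to a few sample values. The subroutine name string_match_fragment and the field name fmethod are chosen for the example only.

```perl
use strict;
use warnings;

# Values made up only of "plain" characters get an equality comparison;
# anything else is treated as a regular expression.
sub string_match_fragment {
    my ($field,$value) = @_;
    return "$field = ?" if $value =~ /^[!@%&a-zA-Z0-9_\'\" ~-]+$/;
    return "$field REGEXP ?";
}

for my $value ('exon', 'similarity', 'exon|intron', '^EST.*') {
    printf "%-12s => %s\n", $value, string_match_fragment('fmethod', $value);
}
```

In both cases the value itself is still passed as a bind argument; only the comparison operator in the fragment changes.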
| exact_match | code | prev | next | Top |
Title : exact_match
This method produces the SQL fragment for matching a field name to a constant string value. |
| search_notes | code | prev | next | Top |
Title : search_notes
This is a mysql-specific method. Given a search string, it performs a full-text search of the notes table and returns an array of results. Each row of the returned array is an arrayref containing the following fields:
  column 1 A Bio::DB::GFF::Featname object, suitable for passing to segment() |
| meta | code | prev | next | Top |
Title : meta
Get or set a named metavariable for the database. Metavariables can be used for database-specific settings.
This method calls two class-specific methods which must be implemented:
  make_meta_get_query() Returns a sql fragment which given a meta
Don't make changes unless you know what you're doing! It will affect the persistent database. |
| make_meta_get_query | code | prev | next | Top |
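Title : make_meta_get_query
The default behavior is to do nothing, in which case meta parameters are not stored or retrieved. The code listed for this adaptor, however, returns the query SELECT fvalue FROM fmeta WHERE fname=?, which takes the parameter name as its single bind value. |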
| make_meta_set_query | code | prev | next | Top |
Title : make_meta_set_query
By default this does nothing; meta parameters are not stored or retrieved. |
| default_meta_values | code | prev | next | Top |
Title : default_meta_values
This method returns a list of tag=>value pairs that contain default meta information about the database. It is invoked by initialize() to write out the default meta values. The base class version returns an empty list.
For things to work properly, meta value names must be UPPERCASE. |
| get_features_iterator | code | prev | next | Top |
Title : get_features_iterator
This method is similar to get_features(), except that it returns an iterator across the query. See Bio::DB::GFF::Adaptor::dbi::iterator. |
| do_initialize | code | prev | next | Top |
Title : do_initialize
This method will load the schema into the database. If $drop_all is true, then any existing data in the tables known to the schema will be deleted.
Internally, this method calls schema() to get the schema data. |
| finish_load | code | prev | next | Top |
Title : finish_load
This method performs schema-specific cleanup after loading a set of GFF records. It finishes each of the statement handlers prepared by setup_load(). |
| create_other_schema_objects | code | prev | next | Top |
Title : create_other_schema_objects |
| drop_all | code | prev | next | Top |
Title : drop_all
This method drops the tables known to this module. Internally it calls the abstract tables() method. |
| clone | code | prev | next | Top |
| The clone() method should be used when you want to pass the Bio::DB::GFF object to a child process across a fork(). The child must call clone() before making any queries. This method does two things: (1) it sets the underlying database handle's InactiveDestroy parameter to 1, thereby preventing the database connection from being destroyed in the parent when the dbh's destructor is called; (2) it replaces the dbh with the result of dbh->clone(), so that we now have an independent handle. |
| drop_other_schema_objects | code | prev | next | Top |
Title : drop_other_schema_objects |
| make_features_select_part | code | prev | next | Top |
Title : make_features_select_part
This abstract method creates the part of the features query that immediately follows the SELECT keyword. |
| tables | code | prev | next | Top |
Title : tables
This method lists the tables known to the module. |
| schema | code | prev | next | Top |
Title : schema
This method returns an array ref containing the various CREATE statements needed to initialize the database tables. The keys are the table names, and the values are strings containing the appropriate CREATE statement. |
| DESTROY | code | prev | next | Top |
Title : DESTROY
This is the destructor for the class. |
| make_features_by_name_where_part | code | prev | next | Top |
Title : make_features_by_name_where_part |
| make_features_by_id_where_part | code | prev | next | Top |
Title : make_features_by_id_where_part |
| make_features_by_gid_where_part | code | prev | next | Top |
Title : make_features_by_gid_where_part |
| make_features_from_part | code | prev | next | Top |
Title : make_features_from_part
This method creates the part of the features query that immediately follows the FROM keyword. |
| make_features_join_part | code | prev | next | Top |
Title : make_features_join_part
This method creates the part of the features query that immediately follows the WHERE keyword. |
| make_features_order_by_part | code | prev | next | Top |
Title : make_features_order_by_part
This method creates the part of the features query that immediately follows the ORDER BY part of the query issued by features() and related methods. |
| make_features_group_by_part | code | prev | next | Top |
Title : make_features_group_by_part
This method creates the part of the features query that immediately follows the GROUP BY part of the query issued by features() and related methods. |
| refseq_query | code | prev | next | Top |
Title : refseq_query
This method is called by make_features_by_range_where_part() to construct the part of the select WHERE section that selects a particular reference sequence. It returns a multi-element list in which the first element is the SQL fragment and subsequent elements are bind values.
For example:
  sub refseq_query {
The current schema does not distinguish among different classes of reference sequence. |
| overlap_query_nobin | code | prev | next | Top |
Title : overlap_query_nobin
This method is called by make_features_by_range_where_part() to construct the part of the select WHERE section that selects a set of features that overlap a range. It returns a multi-element list in which the first element is the SQL fragment and subsequent elements are bind values.
  sub overlap_query_nobin {
    my ($start,$stop) = @_;
    return ('gff.stop>=? AND gff.start<=?', $start,$stop);
  } |
| contains_query_nobin | code | prev | next | Top |
Title : contains_query_nobin
This method is called by make_features_by_range_where_part() to construct the part of the select WHERE section that selects a set of features entirely enclosed by a range. It returns a multi-element list in which the first element is the SQL fragment and subsequent elements are bind values.
For example:
  sub contains_query_nobin { |
| contained_in_query_nobin | code | prev | next | Top |
Title : contained_in_query_nobin
This method is called by make_features_by_range_where_part() to construct the part of the select WHERE section that selects a set of features that entirely enclose a range. It returns a multi-element list in which the first element is the SQL fragment and subsequent elements are bind values.
For example:
  sub contained_in_query_nobin { |
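The three range tests can be compared side by side in plain Perl. This is a sketch under stated assumptions rather than the adaptor's SQL: only the overlap condition (stop >= range start AND start <= range stop) is taken from the listing above; the "contains" and "contained_in" conditions assume the usual reading in which the former keeps features lying inside the range and the latter keeps features that span the whole range. The subroutine names are illustrative.

```perl
use strict;
use warnings;

# Features and ranges are inclusive (start, stop) pairs.
# Arguments: feature start, feature stop, range start, range stop.
sub overlaps     { my ($fs,$fe,$s,$e) = @_; return $fe >= $s && $fs <= $e }
sub contains     { my ($fs,$fe,$s,$e) = @_; return $fs >= $s && $fe <= $e }  # feature inside range
sub contained_in { my ($fs,$fe,$s,$e) = @_; return $fs <= $s && $fe >= $e }  # feature spans range

my %test = (
    overlaps     => \&overlaps,
    contains     => \&contains,
    contained_in => \&contained_in,
);

my @range = (100, 200);
for my $f ([50,120], [120,180], [50,300], [250,300]) {
    # collect the names of every test this feature passes
    my @hits = grep { $test{$_}->(@$f, @range) } sort keys %test;
    printf "[%d,%d]: %s\n", @$f, (@hits ? join(',', @hits) : 'none');
}
```

Note that any feature passing "contains" or "contained_in" also passes "overlaps", so the overlap test is the most permissive of the three.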
| types_query | code | prev | next | Top |
Title : types_query
This method is called by make_features_by_range_where_part() to construct the part of the select WHERE section that selects a set of features based on their type. It returns a multi-element list in which the first element is the SQL fragment and subsequent elements are bind values. The argument is an array reference containing zero or more [$method,$source] pairs. |
| make_types_select_part | code | prev | next | Top |
Title : make_types_select_part
This method is called by get_types() to generate the query fragment and bind arguments for the SELECT part of the query that retrieves lists of feature types. The four positional arguments are as follows:
  $refseq reference sequence name
If $want_count is false, the SQL fragment returned must produce a list of feature types in the format (method, source).
If $want_count is true, the returned fragment must produce a list of feature types in the format (method, source, count). |
| make_types_from_part | code | prev | next | Top |
Title : make_types_from_part
This method is called by get_types() to generate the query fragment and bind arguments for the FROM part of the query that retrieves lists of feature types. The four positional arguments are as follows:
  $refseq reference sequence name
If $want_count is false, the SQL fragment returned must produce a list of feature types in the format (method, source).
If $want_count is true, the returned fragment must produce a list of feature types in the format (method, source, count). |
| make_types_join_part | code | prev | next | Top |
Title : make_types_join_part
This method is called by get_types() to generate the query fragment and bind arguments for the JOIN part of the query that retrieves lists of feature types. The four positional arguments are as follows:
  $refseq reference sequence name |
| make_types_where_part | code | prev | next | Top |
Title : make_types_where_part
This method is called by get_types() to generate the query fragment and bind arguments for the WHERE part of the query that retrieves lists of feature types. The four positional arguments are as follows:
  $refseq reference sequence name |
| make_types_group_part | code | prev | next | Top |
Title : make_types_group_part
This method is called by get_types() to generate the query fragment and bind arguments for the GROUP BY part of the query that retrieves lists of feature types. The four positional arguments are as follows:
  $refseq reference sequence name |
| get_feature_id | code | prev | next | Top |
Title : get_feature_id
This internal method is called by load_gff_line to look up the integer ID of an existing feature. It is only needed when replacing a feature with new information. |
| make_abscoord_query | code | prev | next | Top |
Title : make_abscoord_query
The statement handler should return rows containing five fields:
  1. reference sequence name
This query always returns "Sequence" as the class of the reference sequence. |
| feature_summary | code | prev | next | Top |
Title : feature_summary
This method is used to get coverage density information across a region of interest. You provide it with a region of interest, optionally a list of feature types, and a count of the number of bins over which you want to calculate the coverage density. An object is returned corresponding to the requested region. It contains a tag called "coverage" that will return an array ref of "bins" length. Each element of the array describes the number of features that overlap the bin at this position.
Arguments:
Argument Description
Note that this method uses an approximate algorithm that is only accurate to 500 bp, so when dealing with bins that are smaller than 1000 bp, you may see some shifting of counts between adjacent bins.
Although an -iterator option is provided, the method only ever returns a single feature, so the option is of little practical use. |
| coverage_array | code | prev | next | Top |
Title : coverage_array
This method is used to get coverage density information across a region of interest. The arguments are identical to feature_summary, except that instead of returning a Bio::SeqFeatureI object, it returns an array reference of the desired number of bins. The value of each element corresponds to the number of features in the bin.
Arguments:
Argument Description
Note that this method uses an approximate algorithm that is only accurate to 500 bp, so when dealing with bins that are smaller than 1000 bp, you may see some shifting of counts between adjacent bins. |
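To make the shape of the returned coverage array concrete, here is a brute-force sketch that counts, for each bin, the features overlapping it. It is explicitly not the approximate summary-statistics algorithm the module uses; the subroutine name simple_coverage and the sample coordinates are made up for the illustration.

```perl
use strict;
use warnings;

# Exact coverage count over [$start,$stop] split into $bins equal-width bins.
# Each feature is an array ref of inclusive (start, stop) coordinates.
sub simple_coverage {
    my ($start,$stop,$bins,@features) = @_;
    my $width    = ($stop - $start + 1) / $bins;
    my @coverage = (0) x $bins;
    for my $f (@features) {
        my ($fs,$fe) = @$f;
        next if $fe < $start || $fs > $stop;          # outside the region
        $fs = $start if $fs < $start;                 # clip to the region
        $fe = $stop  if $fe > $stop;
        my $first = int(($fs - $start) / $width);     # first bin touched
        my $last  = int(($fe - $start) / $width);     # last bin touched
        $last = $bins - 1 if $last > $bins - 1;
        $coverage[$_]++ for $first .. $last;
    }
    return \@coverage;
}

my $cov = simple_coverage(1, 1000, 4, [1,300], [200,600], [900,1000]);
print join(',', @$cov), "\n";
```

A feature that straddles a bin boundary is counted in every bin it touches, which is why the per-bin counts can sum to more than the number of features.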
| build_summary_statistics | code | prev | next | Top |
Title : build_summary_statistics
This method is used to build the summary statistics table that is used by the feature_summary() and coverage_array() methods. It needs to be called whenever the database is updated. |
| new | description | prev | next | Top |
sub new {
  my $class = shift;
  my ($features_db,$username,$auth,$other) = rearrange([
      [qw(FEATUREDB DB DSN)],
      [qw(USERNAME USER)],
      [qw(PASSWORD PASSWD PASS)],
    ],@_);

  $features_db || $class->throw("new(): Provide a data source or DBI database");

  if (!ref($features_db)) {
    my $dsn = $features_db;
    my @args;
    push @args,$username if defined $username;
    push @args,$auth if defined $auth;
    $features_db = Bio::DB::GFF::Adaptor::dbi::caching_handle->new($dsn,@args)
      || $class->throw("new(): Failed to connect to $dsn: "
                       . Bio::DB::GFF::Adaptor::dbi::caching_handle->errstr);
  } else {
    $features_db->isa('DBI::db')
      || $class->throw("new(): $features_db is not a DBI handle");
  }

  # fill in object
  return bless {
    features_db => $features_db
  },$class;
}
| debug | description | prev | next | Top |
sub debug {
  my $self = shift;
  $self->features_db->debug(@_);
  $self->SUPER::debug(@_);
}
| features_db | description | prev | next | Top |
sub features_db { shift->{features_db} }
| dbh | description | prev | next | Top |
sub dbh { shift->{features_db} }
| get_dna | description | prev | next | Top |
sub get_dna {
  my $self = shift;
  my ($ref,$start,$stop,$class) = @_;

  my ($offset_start,$offset_stop);

  my $has_start = defined $start;
  my $has_stop  = defined $stop;

  my $reversed;
  if ($has_start && $has_stop && $start > $stop) {
    $reversed++;
    ($start,$stop) = ($stop,$start);
  }

  # turn start and stop into 0-based offsets
  my $cs = $self->dna_chunk_size;
  $start -= 1;
  $stop  -= 1;
  $offset_start = int($start/$cs)*$cs;
  $offset_stop  = int($stop/$cs)*$cs;

  my $sth;
  # special case, get it all
  if (!($has_start || $has_stop)) {
    $sth = $self->dbh->do_query('select fdna,foffset from fdna where fref=? order by foffset',$ref);
  }
  elsif (!$has_stop) {
    $sth = $self->dbh->do_query('select fdna,foffset from fdna where fref=? and foffset>=? order by foffset',
                                $ref,$offset_start);
  }
  else {  # both start and stop defined
    $sth = $self->dbh->do_query('select fdna,foffset from fdna where fref=? and foffset>=? and foffset<=? order by foffset',
                                $ref,$offset_start,$offset_stop);
  }

  my $dna = '';
  while (my($frag,$offset) = $sth->fetchrow_array) {
    # trim the part of the first chunk that precedes the requested start
    substr($frag,0,$start-$offset) = '' if $has_start && $start > $offset;
    $dna .= $frag;
  }
  # trim anything past the requested stop
  substr($dna,$stop-$start+1) = '' if $has_stop && $stop-$start+1 < length($dna);
  if ($reversed) {
    $dna = reverse $dna;
    $dna =~ tr/gatcGATC/ctagCTAG/;
  }

  $sth->finish;
  $dna;
}
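The offset arithmetic in get_dna() can be tried on its own. The sketch below is a minimal illustration, assuming the DNA is stored in fixed-size chunks keyed by their 0-based starting offset (foffset); the subroutine name chunk_offsets and the chunk size of 2000 are assumptions for the example, not values taken from the module.

```perl
use strict;
use warnings;

# Given 1-based inclusive coordinates and a chunk size, compute the 0-based
# offsets of the first and last stored chunks that must be fetched.
sub chunk_offsets {
    my ($start,$stop,$chunk_size) = @_;
    my ($s,$e) = ($start - 1, $stop - 1);                     # convert to 0-based
    my $offset_start = int($s / $chunk_size) * $chunk_size;   # round down to a chunk boundary
    my $offset_stop  = int($e / $chunk_size) * $chunk_size;
    return ($offset_start,$offset_stop);
}

# 2000 is an assumed chunk size for this example
my ($first,$last) = chunk_offsets(2500, 6100, 2000);
print "fetch chunks with foffset between $first and $last\n";
```

Rounding both ends down to a chunk boundary means the query may return more sequence than was asked for, which is why the fetched fragments are trimmed at both ends afterwards.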
| get_abscoords | description | prev | next | Top |
sub get_abscoords {
  my $self = shift;
  my ($name,$class,$refseq) = @_;

  my $sth = $self->make_abscoord_query($name,$class,$refseq);

  my @result;
  while (my @row = $sth->fetchrow_array) {
    push @result,\@row;
  }
  $sth->finish;

  if (@result == 0) {
    # nothing found by name; fall back to an alias lookup
    my $sth2 = $self->make_aliasabscoord_query($name,$class);
    while (my @row2 = $sth2->fetchrow_array) {
      push @result,\@row2;
    }
    $sth2->finish;   # finish the alias handle (the original finished $sth a second time)

    if (@result == 0) {
      $self->error("$name not found in database");
      return;
    }
  }
  return \@result;
}
| get_features | description | prev | next | Top |
sub get_features {
  my $self = shift;
  my ($search,$options,$callback) = @_;
  $callback || $self->throw('must provide a callback argument');

  my $sth = $self->range_query(@{$search}{qw(rangetype refseq refclass start stop types)},
                               @{$options}{qw(sparse sort_by_group ATTRIBUTES BINSIZE)}) or return;

  my $count = 0;
  while (my @row = $sth->fetchrow_array) {
    $callback->(@row);
    $count++;
  }
  $sth->finish;
  return $count;
}
| classes | description | prev | next | Top |
sub classes {
  my $self = shift;
  my ($query,@args) = $self->make_classes_query or return;
  my $sth = $self->dbh->do_query($query,@args);
  my @classes;
  while (my ($c) = $sth->fetchrow_array) {
    push @classes,$c;
  }
  @classes;
}
| make_classes_query | description | prev | next | Top |
sub make_classes_query {
  my $self = shift;
  return;
}
| _feature_by_name | description | prev | next | Top |
sub _feature_by_name {
  my $self = shift;
  my ($class,$name,$location,$callback) = @_;
  $callback || $self->throw('must provide a callback argument');

  my $select = $self->make_features_select_part;
  my $from   = $self->make_features_from_part(undef,{sparse_groups=>1});
  my ($where,@args1) = $self->make_features_by_name_where_part($class,$name);
  my $join   = $self->make_features_join_part;
  my $range;
  $range = $self->make_features_by_range_where_part('overlaps',
                                                    {refseq=>$location->[0],
                                                     class =>'',
                                                     start=>$location->[1],
                                                     stop =>$location->[2]}) if $location;

  # group query
  my $query1 = "SELECT $select FROM $from WHERE $where AND $join";
  $query1 .= " AND $range" if $range;

  # alias query
  $from = $self->make_features_from_part(undef,{attributes=>1});
  my ($where2,@args2) = $self->make_features_by_alias_where_part($class,$name);
  my $query2 = "SELECT $select FROM $from WHERE $where2 AND $join";
  $query2 .= " AND $range" if $range;

  # each query is executed with its own bind values; the original reused a
  # single @args for both and flagged this as a potential bug
  my $count = 0;
  for my $q ([$query1,@args1],[$query2,@args2]) {
    my ($query,@args) = @$q;
    my $sth = $self->dbh->do_query($query,@args);
    while (my @row = $sth->fetchrow_array) {
      $callback->(@row);
      $count++;
    }
    $sth->finish;
  }

  return $count;
}
| _feature_by_id | description | prev | next | Top |
sub _feature_by_id {
  my $self = shift;
  my ($ids,$type,$callback) = @_;
  $callback || $self->throw('must provide a callback argument');

  my $select = $self->make_features_select_part;
  my $from   = $self->make_features_from_part;
  my ($where,@args) = $type eq 'feature' ? $self->make_features_by_id_where_part($ids)
                                         : $self->make_features_by_gid_where_part($ids);
  my $join  = $self->make_features_join_part;
  my $query = "SELECT $select FROM $from WHERE $where AND $join";
  my $sth   = $self->dbh->do_query($query,@args);

  my $count = 0;
  while (my @row = $sth->fetchrow_array) {
    $callback->(@row);
    $count++;
  }
  $sth->finish;
  return $count;
}
| _feature_by_attribute | description | prev | next | Top |
sub _feature_by_attribute {
  my $self = shift;
  my ($attributes,$callback) = @_;
  $callback || $self->throw('must provide a callback argument');

  my $select = $self->make_features_select_part;
  my $from   = $self->make_features_from_part(undef,{attributes=>$attributes});
  my ($where,@args) = $self->make_features_by_range_where_part('',{attributes=>$attributes});
  my $join   = $self->make_features_join_part({attributes=>$attributes});
  my $query  = "SELECT $select FROM $from WHERE $where AND $join";
  my $sth    = $self->dbh->do_query($query,@args);

  my $count = 0;
  while (my @row = $sth->fetchrow_array) {
    $callback->(@row);
    $count++;
  }
  $sth->finish;
  return $count;
}
| get_types | description | prev | next | Top |
sub get_types {
  my $self = shift;
  my ($srcseq,$class,$start,$stop,$want_count,$typelist) = @_;
  my $straight = $self->do_straight_join($srcseq,$start,$stop,[]) ? 'straight_join' : '';
  my ($select,@args1) = $self->make_types_select_part($srcseq,$start,$stop,$want_count,$typelist);
  my ($from,@args2)   = $self->make_types_from_part($srcseq,$start,$stop,$want_count,$typelist);
  my ($join,@args3)   = $self->make_types_join_part($srcseq,$start,$stop,$want_count,$typelist);
  my ($where,@args4)  = $self->make_types_where_part($srcseq,$start,$stop,$want_count,$typelist);
  my ($group,@args5)  = $self->make_types_group_part($srcseq,$start,$stop,$want_count,$typelist);

  my $query = "SELECT $straight $select FROM $from WHERE $join AND $where";
  $query .= " GROUP BY $group" if $group;
  my @args = (@args1,@args2,@args3,@args4,@args5);
  my $sth = $self->dbh->do_query($query,@args) or return;

  my (%result,%obj);
  while (my ($method,$source,$count) = $sth->fetchrow_array) {
    my $type = Bio::DB::GFF::Typename->new($method,$source);
    $result{$type} = $count;
    $obj{$type} = $type;
  }
  return $want_count ? %result : values %obj;
}
| range_query | description | prev | next | Top |
sub range_query {
  my $self = shift;
  my($rangetype,$refseq,$class,$start,$stop,$types,$sparse,$order_by_group,$attributes,$bin) = @_;

  my $dbh = $self->features_db;

  # NOTE: straight_join is necessary in some database to force the right index to be used.
  my %a = (refseq=>$refseq,class=>$class,start=>$start,stop=>$stop,types=>$types,attributes=>$attributes,bin_width=>$bin);
  my $straight = $self->do_straight_join(\%a) ? 'straight_join' : '';
  my $select   = $self->make_features_select_part(\%a);
  my $from     = $self->make_features_from_part($sparse,\%a);
  my $join     = $self->make_features_join_part(\%a);
  my ($where,@args) = $self->make_features_by_range_where_part($rangetype,\%a);
  my ($group_by,@more_args) = $self->make_features_group_by_part(\%a);
  my $order_by = $self->make_features_order_by_part(\%a) if $order_by_group;

  my $query = "SELECT $straight $select FROM $from WHERE $join";
  $query .= " AND $where" if $where;
  if ($group_by) {
    $query .= " GROUP BY $group_by";
    push @args,@more_args;
  }
  $query .= " ORDER BY $order_by" if $order_by;

  my $sth = $self->dbh->do_query($query,@args);
  $sth;
}
| make_features_by_range_where_part | description | prev | next | Top |
sub make_features_by_range_where_part {
  my $self = shift;
  my ($rangetype,$options) = @_;
  $options ||= {};
  my ($refseq,$class,$start,$stop,$types,$attributes) =
    @{$options}{qw(refseq class start stop types attributes)};

  my (@query,@args);

  if ($refseq) {
    my ($q,@a) = $self->refseq_query($refseq,$class);
    push @query,$q;
    push @args,@a;
  }

  if (defined $start or defined $stop) {
    $start = 0           unless defined($start);
    $stop  = MAX_SEGMENT unless defined($stop);

    my ($range_query,@range_args) =
        $rangetype eq 'overlaps'     ? $self->overlap_query($start,$stop)
      : $rangetype eq 'contains'     ? $self->contains_query($start,$stop)
      : $rangetype eq 'contained_in' ? $self->contained_in_query($start,$stop)
      : ();

    push @query,$range_query;
    push @args,@range_args;
  }

  if (defined $types && @$types) {
    my ($type_query,@type_args) = $self->types_query($types);
    push @query,$type_query;
    push @args,@type_args;
  }

  if ($attributes) {
    my ($attribute_query,@attribute_args) = $self->make_features_by_attribute_where_part($attributes);
    push @query,"($attribute_query)";
    push @args,@attribute_args;
  }

  my $query = join "\n\tAND ",@query;
  return wantarray ? ($query,@args) : $self->dbh->dbi_quote($query,@args);
}
| do_straight_join | description | prev | next | Top |
sub do_straight_join { 0 } # false by default
| string_match | description | prev | next | Top |
sub string_match {
  my $self = shift;
  my ($field,$value) = @_;
  return qq($field = ?) if $value =~ /^[!@%&a-zA-Z0-9_\'\" ~-]+$/;
  return qq($field REGEXP ?);
}
| exact_match | description | prev | next | Top |
sub exact_match {
  my $self = shift;
  my ($field,$value) = @_;
  return qq($field = ?);
}
| search_notes | description | prev | next | Top |
sub search_notes {
  my $self = shift;
  my ($search_string,$limit) = @_;

  $search_string =~ tr/*?//d;

  my @words = $search_string =~ /(\w+)/g;
  my $regex = join '|',@words;
  my @searches = map {"fattribute_value LIKE '%${_}%'"} @words;
  my $search = join(' OR ',@searches);

  my $query = <<END;
SELECT distinct gclass,gname,fattribute_value,fmethod,fsource
  FROM fgroup,fattribute_to_feature,fdata,ftype
  WHERE fgroup.gid=fdata.gid
    AND fdata.fid=fattribute_to_feature.fid
    AND fdata.ftypeid=ftype.ftypeid
    AND ($search)
END
;

  my $sth = $self->dbh->do_query($query);
  my @results;
  while (my ($class,$name,$note,$method,$source) = $sth->fetchrow_array) {
    next unless $class && $name;    # sorry, ignore NULL objects
    my @matches = $note =~ /($regex)/g;
    my $relevance = 10*@matches;
    my $featname = Bio::DB::GFF::Featname->new($class=>$name);
    my $type     = Bio::DB::GFF::Typename->new($method,$source);
    push @results,[$featname,$note,$relevance,$type];
    last if $limit && @results >= $limit;
  }
  @results;
}
| meta | description | prev | next | Top |
sub meta {
  my $self = shift;
  my $param_name = uc shift;

  if (@_) {
    # setting
    my $value = shift;
    my $sql = $self->make_meta_set_query() or return;
    my $sth = $self->dbh->prepare_delayed($sql)
      or $self->error("Can't prepare $sql: ",$self->dbh->errstr), return;
    $sth->execute($param_name,$value)
      or $self->error("Can't execute $sql: ",$self->dbh->errstr), return;
    $sth->finish;
    return $self->{meta}{$param_name} = $value;
  }

  elsif (exists $self->{meta}{$param_name}) {
    # getting (cached value)
    return $self->{meta}{$param_name};
  }

  else {
    # getting (from the database)
    undef $self->{meta}{$param_name}; # so that we don't check again
    my $sql = $self->make_meta_get_query() or return;
    my $sth = $self->dbh->prepare_delayed($sql)
      or $self->error("Can't prepare $sql: ",$self->dbh->errstr), return;
    $sth->execute($param_name)
      or $self->error("Can't execute $sql: ",$sth->errstr),return;
    my ($value) = $sth->fetchrow_array;
    $sth->finish;
    return $self->{meta}{$param_name} = $value;
  }
}
| make_meta_get_query | description | prev | next | Top |
sub make_meta_get_query {
  return 'SELECT fvalue FROM fmeta WHERE fname=?';
}
| dna_chunk_size | description | prev | next | Top |
sub dna_chunk_size {
  my $self = shift;
  $self->meta('chunk_size') || DNA_CHUNK_SIZE;
}
| make_meta_set_query | description | prev | next | Top |
sub make_meta_set_query {
  return;
}
| default_meta_values | description | prev | next | Top |
sub default_meta_values {
  my $self = shift;
  my @values = $self->SUPER::default_meta_values;
  return (
          @values,
          max_bin             => MAX_BIN,
          min_bin             => MIN_BIN,
          straight_join_limit => STRAIGHT_JOIN_LIMIT,
          chunk_size          => DNA_CHUNK_SIZE,
         );
}
| min_bin | description | prev | next | Top |
sub min_bin {
  my $self = shift;
  return $self->meta('min_bin') || MIN_BIN;
}
| max_bin | description | prev | next | Top |
sub max_bin {
  my $self = shift;
  return $self->meta('max_bin') || MAX_BIN;
}
| straight_join_limit | description | prev | next | Top |
sub straight_join_limit {
  my $self = shift;
  return $self->meta('straight_join_limit') || STRAIGHT_JOIN_LIMIT;
}
| get_features_iterator | description | prev | next | Top |
sub get_features_iterator {
  my $self = shift;
  my ($search,$options,$callback) = @_;
  $callback || $self->throw('must provide a callback argument');
  my $sth = $self->range_query(@{$search}{qw(rangetype refseq refclass start stop types)},
                               @{$options}{qw( sparse sort_by_group ATTRIBUTES BINSIZE)}) or return;
  return Bio::DB::GFF::Adaptor::dbi::iterator->new($sth,$callback);
}

########################## loading and initialization #####################
| do_initialize | description | prev | next | Top |
sub do_initialize {
  #shift->throw("do_initialize(): must be implemented by subclass");
  my $self  = shift;
  my $erase = shift;
  $self->drop_all if $erase;

  my $dbh    = $self->features_db;
  my $schema = $self->schema;
  foreach my $table_name ($self->tables) {
    my $create_table_stmt = $schema->{$table_name}{table} ;
    $dbh->do($create_table_stmt) || warn $dbh->errstr;
    $self->create_other_schema_objects(\%{$schema->{$table_name}});
  }
  1;
}
| finish_load | description | prev | next | Top |
sub finish_load {
  my $self = shift;

  my $dbh = $self->features_db or return;
  $dbh->do('UNLOCK TABLES') if $self->lock_on_load;

  foreach (keys %{$self->{load_stuff}{sth}}) {
    $self->{load_stuff}{sth}{$_}->finish;
  }

  my $counter = $self->{load_stuff}{counter};
  delete $self->{load_stuff};
  return $counter;
}
| create_other_schema_objects | description | prev | next | Top |
sub create_other_schema_objects {
  #shift->throw("create_other_schema_objects(): must be implemented by subclass");
  my $self = shift ;
  my $table_schema = shift ;
  my $dbh = $self->features_db;
  foreach my $object_type(keys %$table_schema){
    if ($object_type !~ /table/) {
      foreach my $object_name(keys %{$table_schema->{$object_type}}){
        my $create_object_stmt = $table_schema->{$object_type}{$object_name};
        $dbh->do($create_object_stmt) || warn $dbh->errstr;
      }
    }
  }
  1;
}
| drop_all | description | prev | next | Top |
sub drop_all {
  #shift->throw("drop_all(): must be implemented by subclass");
  my $self = shift;
  my $dbh = $self->features_db;
  my $schema = $self->schema;

  local $dbh->{PrintError} = 0;
  foreach ($self->tables) {
    $dbh->do("drop table $_") || warn $dbh->errstr;

    #when dropping a table - the indexes and triggers are being dropped automatically
    # sequences needs to be dropped - if there are any (Oracle, PostgreSQL)
    if ($schema->{$_}{sequence}){
      foreach my $sequence_name(keys %{$schema->{$_}{sequence}}) {
        $dbh->do("drop sequence $sequence_name");
      }
    }

    #$self->drop_other_schema_objects($_);
  }
}
| clone | description | prev | next | Top |
sub clone {
  my $self = shift;
  $self->features_db->clone;
}
| drop_other_schema_objects | description | prev | next | Top |
sub drop_other_schema_objects {
  #shift->throw("drop_other_schema_objects(): must be implemented by subclass");
}
| make_features_select_part | description | prev | next | Top |
sub make_features_select_part {
  shift->throw("make_features_select_part(): must be implemented by subclass");
}
| tables | description | prev | next | Top |
sub tables {
  my $schema = shift->schema;
  return keys %$schema;
}
| schema | description | prev | next | Top |
sub schema {
  shift->throw("The schema() method must be implemented by subclass");
}
| DESTROY | description | prev | next | Top |
sub DESTROY {
  my $self = shift;
  $self->features_db->disconnect if defined $self->features_db;
}

################## query cache ##################
#########################################
## Moved from mysql.pm and mysqlopt.pm ##
#########################################
| make_features_by_name_where_part | description | prev | next | Top |
sub make_features_by_name_where_part {
  my $self = shift;
  my ($class,$name) = @_;
  if ($name =~ /\*/) {
    $name =~ s/%/\\%/g;
    $name =~ s/_/\\_/g;
    $name =~ tr/*/%/;
    return ("fgroup.gclass=? AND fgroup.gname LIKE ?",$class,$name);
  } else {
    return ("fgroup.gclass=? AND fgroup.gname=?",$class,$name);
  }
}
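The wildcard handling in make_features_by_name_where_part can be tried in isolation. The sketch below is a standalone approximation: name_to_sql is a hypothetical helper, not a method of the adaptor, and it returns the operator and bind value rather than a full WHERE fragment.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the wildcard handling of
# make_features_by_name_where_part().
sub name_to_sql {
  my ($name) = @_;
  if ($name =~ /\*/) {
    $name =~ s/%/\\%/g;   # a literal % must not act as a LIKE wildcard
    $name =~ s/_/\\_/g;   # a literal _ must not act as a LIKE wildcard
    $name =~ tr/*/%/;     # the user-facing * becomes the SQL wildcard %
    return ('LIKE',$name);
  }
  return ('=',$name);     # no wildcard: exact comparison, name untouched
}

for my $name ('unc-4','unc-*','B0019_*') {
  my ($op,$pattern) = name_to_sql($name);
  print "$name => gname $op '$pattern'\n";
}
```

The order matters: the literal % and _ characters are escaped before * is translated, so the % introduced by the translation is the only active wildcard.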
| make_features_by_alias_where_part | description | prev | next | Top |
sub make_features_by_alias_where_part {
  my $self = shift;
  my ($class,$name) = @_;
  if ($name =~ /\*/) {
    $name =~ tr/*/%/;
    $name =~ s/_/\\_/g;
    return ("fgroup.gclass=? AND fattribute_to_feature.fattribute_value LIKE ? AND fgroup.gid=fdata.gid AND fattribute.fattribute_name in ('Alias','Name') AND fattribute_to_feature.fattribute_id=fattribute.fattribute_id AND fattribute_to_feature.fid=fdata.fid AND ftype.ftypeid=fdata.ftypeid",$class,$name)
  } else {
    return ("fgroup.gclass=? AND fattribute_to_feature.fattribute_value=? AND fgroup.gid=fdata.gid AND fattribute.fattribute_name in ('Alias','Name') AND fattribute_to_feature.fattribute_id=fattribute.fattribute_id AND fattribute_to_feature.fid=fdata.fid AND ftype.ftypeid=fdata.ftypeid",$class,$name);
  }
}
| make_features_by_attribute_where_part | description | prev | next | Top |
sub make_features_by_attribute_where_part {
  my $self = shift;
  my $attributes = shift;
  my @args;
  my @sql;
  foreach (keys %$attributes) {
    push @sql,"(fattribute.fattribute_name=? AND fattribute_to_feature.fattribute_value=?)";
    push @args,($_,$attributes->{$_});
  }
  return (join(' OR ',@sql),@args);
}
| make_features_by_id_where_part | description | prev | next | Top |
sub make_features_by_id_where_part {
  my $self = shift;
  my $ids = shift;
  my $set = join ",",@$ids;
  return ("fdata.fid IN ($set)");
}
| make_features_by_gid_where_part | description | prev | next | Top |
sub make_features_by_gid_where_part {
  my $self = shift;
  my $ids = shift;
  my $set = join ",",@$ids;
  return ("fgroup.gid IN ($set)");
}
| make_features_from_part | description | prev | next | Top |
sub make_features_from_part {
  my $self = shift;
  my $sparse = shift;
  my $options = shift || {};
  return $options->{attributes} ? "fdata,ftype,fgroup,fattribute,fattribute_to_feature\n"
                                : "fdata,ftype,fgroup\n";
}
| make_features_join_part | description | prev | next | Top |
sub make_features_join_part {
  my $self = shift;
  my $options = shift || {};
  return !$options->{attributes} ? <<END1 : <<END2;
  fgroup.gid = fdata.gid
  AND ftype.ftypeid = fdata.ftypeid
END1
  fgroup.gid = fdata.gid
  AND ftype.ftypeid = fdata.ftypeid
  AND fattribute.fattribute_id=fattribute_to_feature.fattribute_id
  AND fdata.fid=fattribute_to_feature.fid
END2
}
| make_features_order_by_part | description | prev | next | Top |
sub make_features_order_by_part {
  my $self = shift;
  my $options = shift || {};
  return "fgroup.gname";
}
| make_features_group_by_part | description | prev | next | Top |
sub make_features_group_by_part {
  my $self = shift;
  my $options = shift || {};
  if (my $att = $options->{attributes}) {
    my $key_count = keys %$att;
    return unless $key_count > 1;
    return ("fdata.fid,fref,fstart,fstop,fsource, fmethod,fscore,fstrand,fphase,gclass,gname,ftarget_start, ftarget_stop,fdata.gid HAVING count(fdata.fid) > ?",$key_count-1);
  }
  elsif (my $b = $options->{bin_width}) {
    return "fref,fstart,fdata.ftypeid";
  }
}
| refseq_query | description | prev | next | Top |
sub refseq_query {
  my $self = shift;
  my ($refseq,$refclass) = @_;
  my $query = "fdata.fref=?";
  return wantarray ? ($query,$refseq) : $self->dbh->dbi_quote($query,$refseq);
}
| do_attributes | description | prev | next | Top |
sub do_attributes {
  my $self = shift;
  my ($id,$tag) = @_;
  my $sth;
  if ($id) {
    my $from   = 'fattribute_to_feature,fattribute';
    my $join   = 'fattribute.fattribute_id=fattribute_to_feature.fattribute_id';
    my $where1 = 'fid=? AND fattribute_name=?';
    my $where2 = 'fid=?';
    $sth = defined($tag)
      ? $self->dbh->do_query("SELECT fattribute_value FROM $from WHERE $where1 AND $join",$id,$tag)
      : $self->dbh->do_query("SELECT fattribute_name,fattribute_value FROM $from WHERE $where2 AND $join",$id);
  }
  else {
    $sth = $self->dbh->do_query("SELECT fattribute_name FROM fattribute");
  }
  my @result;
  while (my @stuff = $sth->fetchrow_array) {
    push @result,@stuff;
  }
  $sth->finish;
  return @result;
}
| overlap_query_nobin | description | prev | next | Top |
sub overlap_query_nobin {
  my $self = shift;
  my ($start,$stop) = @_;
  my $query = qq(fdata.fstop>=? AND fdata.fstart<=?);
  return wantarray ? ($query,$start,$stop) : $self->dbh->dbi_quote($query,$start,$stop);
}
| contains_query_nobin | description | prev | next | Top |
sub contains_query_nobin {
  my $self = shift;
  my ($start,$stop) = @_;
  my $query = qq(fdata.fstart>=? AND fdata.fstop<=?);
  return wantarray ? ($query,$start,$stop) : $self->dbh->dbi_quote($query,$start,$stop);
}
| contained_in_query_nobin | description | prev | next | Top |
sub contained_in_query_nobin {
  my $self = shift;
  my ($start,$stop) = @_;
  my $query = qq(fdata.fstart<=? AND fdata.fstop>=?);
  return wantarray ? ($query,$start,$stop) : $self->dbh->dbi_quote($query,$start,$stop);
}
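The three range predicates above differ only in the direction of their comparisons. As an illustration, the sketch below evaluates the same conditions in plain Perl against a feature's (fstart, fstop) pair. The %range_test table and the matches function are hypothetical names introduced here; they are not part of the adaptor.

```perl
use strict;
use warnings;

# Each test receives the feature's (fstart,fstop) followed by the query
# range (start,stop), and applies the same comparison as the SQL fragment.
my %range_test = (
  # fstop>=start AND fstart<=stop: feature and range share at least one base
  overlaps     => sub { my ($fs,$fe,$s,$e) = @_; $fe >= $s && $fs <= $e },
  # fstart>=start AND fstop<=stop: feature lies entirely inside the range
  contains     => sub { my ($fs,$fe,$s,$e) = @_; $fs >= $s && $fe <= $e },
  # fstart<=start AND fstop>=stop: feature spans the entire range
  contained_in => sub { my ($fs,$fe,$s,$e) = @_; $fs <= $s && $fe >= $e },
);

# Hypothetical helper: returns 1 if the feature satisfies the range type.
sub matches {
  my ($type,@coords) = @_;
  return $range_test{$type}->(@coords) ? 1 : 0;
}

for my $type (qw(overlaps contains contained_in)) {
  printf "%-12s feature 100-200 vs range 150-300: %d\n",
    $type, matches($type,100,200,150,300);
}
```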
| types_query | description | prev | next | Top |
sub types_query {
  my $self  = shift;
  my $types = shift;

  my @method_queries;
  my @args;
  for my $type (@$types) {
    my ($method,$source) = @$type;
    my ($mlike, $slike) = (0, 0);
    if ($method && $method =~ m/\.\*/) {
      $method =~ s/%/\\%/g;
      $method =~ s/_/\\_/g;
      $method =~ s/\.\*\??/%/g;
      $mlike++;
    }
    if ($source && $source =~ m/\.\*/) {
      $source =~ s/%/\\%/g;
      $source =~ s/_/\\_/g;
      $source =~ s/\.\*\??/%/g;
      $slike++;
    }
    my @pair;
    if (defined $method && length $method) {
      push @pair, $mlike ? qq(fmethod LIKE ?) : qq(fmethod = ?);
      push @args, $method;
    }
    if (defined $source && length $source) {
      push @pair, $slike ? qq(fsource LIKE ?) : qq(fsource = ?);
      push @args, $source;
    }
    push @method_queries,"(" . join(' AND ',@pair) .")" if @pair;
  }
  # declare first, then assign conditionally: "my ... if ..." has undefined behavior
  my $query;
  $query = " (".join(' OR ',@method_queries).")\n" if @method_queries;
  return wantarray ? ($query,@args) : $self->dbh->dbi_quote($query,@args);
}
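To see what kind of clause types_query assembles, the following standalone sketch builds a comparable fragment from a list of [method, source] pairs. It is an approximation under stated assumptions: build_types_clause is a hypothetical name, it returns the fragment and bind values directly rather than going through a database handle, and it skips empty values up front rather than testing truthiness.

```perl
use strict;
use warnings;

# Hypothetical standalone counterpart of types_query(); returns the SQL
# fragment followed by its bind values, or an empty list if nothing applies.
sub build_types_clause {
  my ($types) = @_;
  my (@clauses,@args);
  for my $type (@$types) {
    my ($method,$source) = @$type;
    my @pair;
    for my $col (['fmethod',$method],['fsource',$source]) {
      my ($field,$value) = @$col;
      next unless defined $value && length $value;
      my $like = 0;
      if ($value =~ m/\.\*/) {
        $value =~ s/%/\\%/g;       # escape literal SQL wildcards first
        $value =~ s/_/\\_/g;
        $value =~ s/\.\*\??/%/g;   # then turn .* (or .*?) into %
        $like = 1;
      }
      push @pair, $like ? "$field LIKE ?" : "$field = ?";
      push @args, $value;
    }
    push @clauses,'(' . join(' AND ',@pair) . ')' if @pair;
  }
  return unless @clauses;
  return ('(' . join(' OR ',@clauses) . ')', @args);
}

my ($sql,@bind) = build_types_clause([ ['exon','curated'], ['gene.*',undef] ]);
print "$sql\n";
print join(',',@bind), "\n";
```

Each type contributes one parenthesized AND group, and the groups are combined with OR, so a feature matches if it satisfies any one of the requested types.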
| make_types_select_part | description | prev | next | Top |
sub make_types_select_part {
  my $self = shift;
  my ($srcseq,$start,$stop,$want_count) = @_;
  my $query = $want_count ? 'ftype.fmethod,ftype.fsource,count(fdata.ftypeid)'
                          : 'fmethod,fsource';
  return $query;
}
| make_types_from_part | description | prev | next | Top |
sub make_types_from_part {
  my $self = shift;
  my ($srcseq,$start,$stop,$want_count) = @_;
  my $query = defined($srcseq) || $want_count ? 'fdata,ftype' : 'ftype';
  return $query;
}
| make_types_join_part | description | prev | next | Top |
sub make_types_join_part {
  my $self = shift;
  my ($srcseq,$start,$stop,$want_count) = @_;
  my $query = defined($srcseq) || $want_count ? 'fdata.ftypeid=ftype.ftypeid' : '';
  return $query || '1=1';
}
| make_types_where_part | description | prev | next | Top |
sub make_types_where_part {
  my $self = shift;
  my ($srcseq,$start,$stop,$want_count,$typelist) = @_;
  my (@query,@args);
  if (defined($srcseq)) {
    push @query,'fdata.fref=?';
    push @args,$srcseq;
    if (defined $start or defined $stop) {
      $start = 1           unless defined $start;
      $stop  = MAX_SEGMENT unless defined $stop;
      my ($q,@a) = $self->overlap_query($start,$stop);
      push @query,"($q)";
      push @args,@a;
    }
  }
  if (defined $typelist && @$typelist) {
    my ($q,@a) = $self->types_query($typelist);
    push @query,($q);
    push @args,@a;
  }
  my $query = @query ? join(' AND ',@query) : '1=1';
  return wantarray ? ($query,@args) : $self->dbh->dbi_quote($query,@args);
}
| make_types_group_part | description | prev | next | Top |
sub make_types_group_part {
  my $self = shift;
  my ($srcseq,$start,$stop,$want_count) = @_;
  return unless $srcseq or $want_count;
  return 'ftype.ftypeid,ftype.fmethod,ftype.fsource';
}
| get_feature_id | description | prev | next | Top |
sub get_feature_id {
  my $self = shift;
  my ($ref,$start,$stop,$typeid,$groupid) = @_;
  my $s = $self->{load_stuff};
  unless ($s->{get_feature_id}) {
    my $dbh = $self->features_db;
    $s->{get_feature_id} =
      $dbh->prepare_delayed('SELECT fid FROM fdata WHERE fref=? AND fstart=? AND fstop=? AND ftypeid=? AND gid=?');
  }
  my $sth = $s->{get_feature_id} or return;
  $sth->execute($ref,$start,$stop,$typeid,$groupid) or return;
  my ($fid) = $sth->fetchrow_array;
  return $fid;
}
| make_abscoord_query | description | prev | next | Top |
sub make_abscoord_query {
  my $self = shift;
  my ($name,$class,$refseq) = @_;
  #my $query = GETSEQCOORDS;
  my $query = $self->getseqcoords_query();
  my $getforcedseqcoords = $self->getforcedseqcoords_query() ;
  if ($name =~ /\*/) {
    $name =~ s/%/\\%/g;
    $name =~ s/_/\\_/g;
    $name =~ tr/*/%/;
    $query =~ s/gname=\?/gname LIKE ?/;
  }
  defined $refseq ? $self->dbh->do_query($getforcedseqcoords,$name,$class,$refseq)
                  : $self->dbh->do_query($query,$name,$class);
}
| make_aliasabscoord_query | description | prev | next | Top |
sub make_aliasabscoord_query {
  my $self = shift;
  my ($name,$class) = @_;
  #my $query = GETALIASCOORDS;
  my $query = $self->getaliascoords_query();
  if ($name =~ /\*/) {
    $name =~ s/%/\\%/g;
    $name =~ s/_/\\_/g;
    $name =~ tr/*/%/;
    $query =~ s/gname=\?/gname LIKE ?/;
  }
  $self->dbh->do_query($query,$name,$class);
}
| getseqcoords_query | description | prev | next | Top |
sub getseqcoords_query {
  shift->throw("getseqcoords_query(): must be implemented by a subclass");
}
| getaliascoords_query | description | prev | next | Top |
sub getaliascoords_query {
  shift->throw("getaliascoords_query(): must be implemented by a subclass");
}
| bin_query | description | prev | next | Top |
sub bin_query {
  my $self = shift;
  my ($start,$stop,$minbin,$maxbin) = @_;
  if ($start && $start < 0 && $stop > 0) {  # split the queries
    my ($lower_query,@lower_args) = $self->_bin_query($start,0,$minbin,$maxbin);
    my ($upper_query,@upper_args) = $self->_bin_query(0,$stop,$minbin,$maxbin);
    my $query = "$lower_query\n\t OR $upper_query";
    my @args  = (@lower_args,@upper_args);
    return wantarray ? ($query,@args) : $self->dbh->dbi_quote($query,@args);
  } else {
    return $self->_bin_query($start,$stop,$minbin,$maxbin);
  }
}
| _bin_query | description | prev | next | Top |
sub _bin_query {
  my $self = shift;
  my ($start,$stop,$minbin,$maxbin) = @_;
  my ($query,@args);

  $start = 0 unless defined($start);
  $stop  = $self->meta('max_bin') unless defined($stop);

  my @bins;
  $minbin = defined $minbin ? $minbin : $self->min_bin;
  $maxbin = defined $maxbin ? $maxbin : $self->max_bin;
  my $tier = $maxbin;
  while ($tier >= $minbin) {
    my ($tier_start,$tier_stop) = (bin_bot($tier,$start)-EPSILON(),bin_top($tier,$stop)+EPSILON());
    # can happen when working with negative coordinates
    ($tier_start,$tier_stop) = ($tier_stop,$tier_start) if $tier_start > $tier_stop;
    if ($tier_start == $tier_stop) {
      push @bins,'fbin=?';
      push @args,$tier_start;
    } else {
      push @bins,'fbin between ? and ?';
      push @args,($tier_start,$tier_stop);
    }
    $tier /= 10;
  }
  $query = join("\n\t OR ",@bins);
  return wantarray ? ($query,@args) : $self->dbh->dbi_quote($query,@args);
}

# find features that overlap a given range
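The tier loop in _bin_query walks from the largest bin size down to the smallest, dividing by ten at each step and emitting one condition per tier. The sketch below reproduces only that control flow. It rests on several assumptions: bin_label is a hypothetical stand-in for bin_bot and bin_top (whose definitions are not shown in this listing), the EPSILON padding and the swap for negative coordinates are left out, and labels are compared as strings.

```perl
use strict;
use warnings;

# Hypothetical stand-in for bin_bot()/bin_top(): labels a position by its
# tier and its index within that tier. The real helpers may differ.
sub bin_label {
  my ($tier,$pos) = @_;
  return sprintf("%d.%06d", $tier, int($pos/$tier));
}

# Builds one condition per tier, from $maxbin down to $minbin.
sub tiered_bin_clause {
  my ($start,$stop,$minbin,$maxbin) = @_;
  my (@bins,@args);
  for (my $tier = $maxbin; $tier >= $minbin; $tier /= 10) {
    my ($lo,$hi) = (bin_label($tier,$start), bin_label($tier,$stop));
    if ($lo eq $hi) {
      push @bins,'fbin=?';                 # range sits in a single bin
      push @args,$lo;
    } else {
      push @bins,'fbin between ? and ?';   # range spans several bins
      push @args,$lo,$hi;
    }
  }
  return (join("\n\t OR ",@bins), @args);
}

my ($clause,@args) = tiered_bin_clause(1500, 2500, 1000, 100_000);
print "$clause\n";
print join(', ',@args), "\n";
```

With these inputs the loop visits three tiers (100000, 10000 and 1000); the two coarser tiers collapse to a single bin each, while the finest tier needs a BETWEEN.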
| overlap_query | description | prev | next | Top |
sub overlap_query {
  my $self = shift;
  my ($start,$stop) = @_;

  my ($query,@args);
  my ($iq,@iargs) = $self->overlap_query_nobin($start,$stop);
  if (OPTIMIZE) {
    my ($bq,@bargs) = $self->bin_query($start,$stop);
    $query = "($bq)\n\tAND $iq";
    @args  = (@bargs,@iargs);
  }
  else {
    $query = $iq;
    @args  = @iargs;
  }

  return wantarray ? ($query,@args) : $self->dbh->dbi_quote($query,@args);
}
| contains_query | description | prev | next | Top |
# find features that are completely contained within a range
sub contains_query {
  my $self = shift;
  my ($start,$stop) = @_;
  my ($bq,@bargs) = $self->bin_query($start,$stop,undef,bin($start,$stop,$self->min_bin));
  my ($iq,@iargs) = $self->contains_query_nobin($start,$stop);
  my $query = "($bq)\n\tAND $iq";
  my @args  = (@bargs,@iargs);
  return wantarray ? ($query,@args) : $self->dbh->dbi_quote($query,@args);
}
| contained_in_query | description | prev | next | Top |
# find features that completely contain a range
sub contained_in_query {
  my $self = shift;
  my ($start,$stop) = @_;
  my ($bq,@bargs) = $self->bin_query($start,$stop,abs($stop-$start)+1,undef);
  my ($iq,@iargs) = $self->contained_in_query_nobin($start,$stop);
  my $query = "($bq)\n\tAND $iq";
  my @args  = (@bargs,@iargs);
  return wantarray ? ($query,@args) : $self->dbh->dbi_quote($query,@args);
}

# implement the _delete_fattribute_to_feature() method
| _delete_fattribute_to_feature | description | prev | next | Top |
sub _delete_fattribute_to_feature {
  my $self = shift;
  my @feature_ids = @_;
  my $dbh = $self->features_db;
  my $fields = join ',',map{$dbh->quote($_)} @feature_ids;

  my $query = "delete from fattribute_to_feature where fid in ($fields)";
  warn "$query\n" if $self->debug;
  my $result = $dbh->do($query);
  defined $result or $self->throw($dbh->errstr);
  $result;
}
| _delete_features | description | prev | next | Top |
# implement the _delete_features() method
sub _delete_features {
  my $self = shift;
  my @feature_ids = @_;
  my $dbh = $self->features_db;
  my $fields = join ',',map{$dbh->quote($_)} @feature_ids;

  # delete from fattribute_to_feature
  $self->_delete_fattribute_to_feature(@feature_ids);

  my $query = "delete from fdata where fid in ($fields)";
  warn "$query\n" if $self->debug;
  my $result = $dbh->do($query);
  defined $result or $self->throw($dbh->errstr);
  $result;
}
| _delete_groups | description | prev | next | Top |
# implement the _delete_groups() method
sub _delete_groups {
  my $self = shift;
  my @group_ids = @_;
  my $dbh = $self->features_db;
  my $fields = join ',',map{$dbh->quote($_)} @group_ids;

  foreach my $gid (@group_ids){
    my @features = $self->get_feature_by_gid($gid);
    $self->delete_features(@features);
  }

  my $query = "delete from fgroup where gid in ($fields)";
  warn "$query\n" if $self->debug;
  my $result = $dbh->do($query);
  defined $result or $self->throw($dbh->errstr);
  $result;
}
| _delete | description | prev | next | Top |
# implement the _delete() method
sub _delete {
  my $self = shift;
  my $delete_spec = shift;
  my $ranges     = $delete_spec->{segments} || [];
  my $types      = $delete_spec->{types}    || [];
  my $force      = $delete_spec->{force};
  my $range_type = $delete_spec->{range_type};
  my $dbh        = $self->features_db;

  my $query = 'delete from fdata';
  my @where;

  my @range_part;
  for my $segment (@$ranges) {
    my $ref   = $dbh->quote($segment->abs_ref);
    my $start = $segment->abs_start;
    my $stop  = $segment->abs_stop;
    my $range =   $range_type eq 'overlaps'     ? $self->overlap_query($start,$stop)
                : $range_type eq 'contains'     ? $self->contains_query($start,$stop)
                : $range_type eq 'contained_in' ? $self->contained_in_query($start,$stop)
                : $self->throw("Invalid range type '$range_type'");
    push @range_part,"(fref=$ref AND $range)";
  }
  push @where,'('. join(' OR ',@range_part).')' if @range_part;

  # get all the types
  if (@$types) {
    my $types_where = $self->types_query($types);
    my $types_query = "select ftypeid from ftype where $types_where";
    my $result      = $dbh->selectall_arrayref($types_query);
    my @typeids     = map {$_->[0]} @$result;
    my $typelist    = join ',',map{$dbh->quote($_)} @typeids;
    $typelist ||= "0"; # don't cause DBI to die with invalid SQL when
                       # unknown feature types were requested.
    push @where,"(ftypeid in ($typelist))";
  }
  $self->throw("This operation would delete all feature data and -force not specified")
    unless @where || $force;
  $query .= " where ".join(' and ',@where) if @where;
  warn "$query\n" if $self->debug;
  my $result = $dbh->do($query);
  defined $result or $self->throw($dbh->errstr);
  $result;
}
| feature_summary | description | prev | next | Top |
sub feature_summary {
  my $self = shift;
  my ($seq_name,$start,$end,$types,$bins,$iterator) =
    rearrange([['SEQID','SEQ_ID','REF'],'START',['STOP','END'],
               ['TYPES','TYPE','PRIMARY_TAG'],
               'BINS',
               'ITERATOR',
              ],@_);
  my ($coverage,$tag) = $self->coverage_array(-seqid=> $seq_name,
                                              -start=> $start,
                                              -end  => $end,
                                              -type => $types,
                                              -bins => $bins) or return;
  my $score = 0;
  for (@$coverage) { $score += $_ }
  $score /= @$coverage;

  my $feature = Bio::SeqFeature::Lite->new(-seq_id     => $seq_name,
                                           -start      => $start,
                                           -end        => $end,
                                           -type       => $tag,
                                           -score      => $score,
                                           -attributes => { coverage => [$coverage] });
  return $iterator ? Bio::DB::GFF::FeatureIterator->new($feature)
                   : $feature;
}
| coverage_array | description | prev | next | Top |
sub coverage_array {
  my $self = shift;
  my ($seq_name,$start,$end,$types,$bins) =
    rearrange([['SEQID','SEQ_ID','REF'],'START',['STOP','END'],
               ['TYPES','TYPE','PRIMARY_TAG'],'BINS'],@_);

  $types  = $self->parse_types($types);
  my $dbh = $self->features_db;

  $bins  ||= 1000;
  $start ||= 1;
  unless ($end) {
    my $segment = $self->segment($seq_name) or $self->throw("unknown seq_id $seq_name");
    $end = $segment->end;
  }

  my $binsize = ($end-$start+1)/$bins;
  my $seqid   = $seq_name;

  return [] unless $seqid;

  # where each bin starts
  my @his_bin_array = map {$start + $binsize * $_}       (0..$bins);
  my @sum_bin_array = map {int(($_-1)/SUMMARY_BIN_SIZE)} @his_bin_array;

  my $interval_stats_table = 'finterval_stats';

  # pick up the type ids
  my ($type_from,@a) = $self->types_query($types);
  my $query = "select ftypeid,fmethod,fsource from ftype where $type_from";
  my $sth   = $dbh->prepare_delayed($query);
  my (@t,$report_tag);
  $sth->execute(@a);
  while (my ($t,$method,$source) = $sth->fetchrow_array) {
    $report_tag ||= "$method:$source";
    push @t,$t;
  }

  my %bins;
  my $sql = <<END;
SELECT fbin,fcum_count
  FROM $interval_stats_table
  WHERE ftypeid=?
    AND fref=? AND fbin >= ?
  LIMIT 1
END
;
  $sth = $dbh->prepare_delayed($sql) or warn $dbh->errstr;
  eval {
    for my $typeid (@t) {
      for (my $i=0;$i<@sum_bin_array;$i++) {
        my @args = ($typeid,$seqid,$sum_bin_array[$i]);
        $self->_print_query($sql,@args) if $self->debug;
        $sth->execute(@args) or $self->throw($sth->errstr);
        my ($bin,$cum_count) = $sth->fetchrow_array;
        push @{$bins{$typeid}},[$bin,$cum_count];
      }
    }
  };
  return unless %bins;

  my @merged_bins;
  my $firstbin = int(($start-1)/$binsize);
  for my $type (keys %bins) {
    my $arry       = $bins{$type};
    my $last_count = $arry->[0][1];
    my $last_bin   = -1;
    my $i          = 0;
    my $delta;
    for my $b (@$arry) {
      my ($bin,$count) = @$b;
      $delta = $count - $last_count if $bin > $last_bin;
      $merged_bins[$i++] = $delta;
      $last_count = $count;
      $last_bin   = $bin;
    }
  }

  return wantarray ? (\@merged_bins,$report_tag) : \@merged_bins;
}
| build_summary_statistics | description | prev | next | Top |
sub build_summary_statistics {
  my $self = shift;
  my $interval_stats_table = 'finterval_stats';
  my $dbh = $self->dbh;
  $dbh->begin_work;

  my $sbs = SUMMARY_BIN_SIZE;

  my $result = eval {
    $self->_add_interval_stats_table;
    $self->disable_keys($interval_stats_table);
    $dbh->do("DELETE FROM $interval_stats_table");

    my $insert = $dbh->prepare(<<END) or $self->throw($dbh->errstr);
INSERT INTO $interval_stats_table
           (ftypeid,fref,fbin,fcum_count)
    VALUES (?,?,?,?)
END
;

    my $sql    = 'select ftypeid,fref,fstart,fstop from fdata order by ftypeid,fref,fstart';
    my $select = $dbh->prepare($sql) or $self->throw($dbh->errstr);

    my $current_bin = -1;
    my ($current_type,$current_seqid,$count);
    my $cum_count = 0;
    my (%residuals,$last_bin);

    my $le = -t \*STDERR ? "\r" : "\n";

    $select->execute;

    while (my($typeid,$seqid,$start,$end) = $select->fetchrow_array) {
      print STDERR $count," features processed$le" if ++$count % 1000 == 0;

      my $bin = int($start/$sbs);
      $current_type  ||= $typeid;
      $current_seqid ||= $seqid;

      # because the input is sorted by start, no more features will contribute to the
      # current bin so we can dispose of it
      if ($bin != $current_bin) {
        # fref is a string, so compare sequence IDs with ne rather than !=
        if ($seqid ne $current_seqid or $typeid != $current_type) {
          # load all bins left over
          $self->_load_bins($insert,\%residuals,\$cum_count,$current_type,$current_seqid);
          %residuals = () ;
          $cum_count = 0;
        } else {
          # load all up to current one
          $self->_load_bins($insert,\%residuals,\$cum_count,$current_type,$current_seqid,$current_bin);
        }
      }

      $last_bin = $current_bin;
      ($current_seqid,$current_type,$current_bin) = ($seqid,$typeid,$bin);

      # summarize across entire spanned region
      my $last_bin = int(($end-1)/$sbs);
      for (my $b=$bin;$b<=$last_bin;$b++) {
        $residuals{$b}++;
      }
    }

    # handle tail case
    # load all bins left over
    $self->_load_bins($insert,\%residuals,\$cum_count,$current_type,$current_seqid);
    $self->enable_keys($interval_stats_table);
    1;
  };

  if ($result) { $dbh->commit } else { warn "Can't build summary statistics: $@"; $dbh->rollback };
  print STDERR "\n";
}
| _load_bins | description | prev | next | Top |
sub _load_bins {
  my $self = shift;
  my ($insert,$residuals,$cum_count,$type,$seqid,$stop_after) = @_;
  for my $b (sort {$a<=>$b} keys %$residuals) {
    last if defined $stop_after and $b > $stop_after;
    $$cum_count += $residuals->{$b};
    my @args = ($type,$seqid,$b,$$cum_count);
    $insert->execute(@args) or warn $insert->errstr;
    delete $residuals->{$b};  # no longer needed
  }
}
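The bookkeeping in _load_bins can be followed without a database: per-bin counts are consumed in ascending bin order, a running total is updated through a scalar reference, and flushed bins are removed so they are not counted twice. The sketch below collects the rows in an array instead of executing an INSERT. The name flush_bins is hypothetical and the example data is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical, database-free variant of _load_bins(): returns the
# [bin, cumulative_count] rows that would have been inserted.
sub flush_bins {
  my ($residuals,$cum_count,$stop_after) = @_;
  my @rows;
  for my $bin (sort { $a <=> $b } keys %$residuals) {
    last if defined $stop_after and $bin > $stop_after;
    $$cum_count += $residuals->{$bin};   # running total across bins
    push @rows,[$bin,$$cum_count];
    delete $residuals->{$bin};           # flushed, no longer needed
  }
  return @rows;
}

my %residuals = (0 => 2, 1 => 3, 5 => 1);   # illustrative per-bin counts
my $cum_count = 0;

# Flush everything up to and including bin 1; bin 5 stays pending.
my @rows = flush_bins(\%residuals,\$cum_count,1);
print join(' ', map { "[$_->[0],$_->[1]]" } @rows), "\n";
print "pending bins: ", join(',', sort { $a <=> $b } keys %residuals), "\n";
```

Passing the running total by reference lets successive calls continue from where the previous flush stopped, which is how build_summary_statistics accumulates counts across one reference sequence and type.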
| _add_interval_stats_table | description | prev | next | Top |
sub _add_interval_stats_table {
  my $self = shift;
  my $schema = $self->schema;
  my $create_table_stmt = $schema->{'finterval_stats'}{'table'};
  my $dbh = $self->features_db;
  $dbh->do("drop table finterval_stats");
  $dbh->do($create_table_stmt) || warn $dbh->errstr;
}
| enable_keys | description | prev | next | Top |
sub enable_keys { }  # noop

1;

__END__
| QUERIES TO IMPLEMENT | Top |
| attributes | Top |
Title   : attributes
Usage   : @attributes = $db->attributes($id,$name)
Function: get the attributes on a particular feature
Returns : an array of strings
Args    : feature ID, optional attribute name
Status  : public

Some GFF version 2 files use the groups column to store a series of attribute/value pairs.

  %attributes = $db->attributes($id);

Normally, attributes() will be called by the feature:

  @notes = $feature->attributes('Note');
| BUGS | Top |
| SEE ALSO | Top |
| AUTHOR | Top |