EFetch retrieves data records from a list of
IDs. This can be accomplished directly (using id) or indirectly
(by using a
Cookie).
The following is a general list of parameters that can be used to take
advantage of EFetch. Up-to-date help for EFetch is available at this URL
(the information below is a summary of the options found there):
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/efetch_help.html
db
One or more databases available through EUtilities. EFetch currently only
supports retrieval from the following databases:
pubmed,
pmc (PubMed Central),
journals,
omim,
nucleotide,
protein,
genome,
gene,
snp (dbSNP),
popset, and
taxonomy.
Also supported are
sequences (nucleotide, protein, popset and genome), and
the three subsets of nucleotide:
nuccore,
nucest,
nucgss.
id
A list of primary IDs.
Below is a list of ID types that can be used with EFetch:
For sequence databases:
NCBI sequence number (GI),
accession,
accession.version,
fasta,
GeneID,
genome ID,
seqid
All other databases:
PMID (pubmed),
MIM number (omim),
GI number (nucleotide, protein),
Genome ID (genome),
Popset ID (popset),
SNP cluster ID (snp),
UniSTS ID (unists),
UniGene cluster ID (unigene),
MMDB-ID (structure),
PSSM-ID (cdd),
3D SDI (domains),
TAXID (taxonomy),
GEO ID (geo)
mindate, maxdate
Limits results by date (yyyy/mm/dd format, or by year)
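The two date forms named above can be checked before a query is built. The following is a minimal sketch only: `is_valid_date_limit` is a hypothetical helper, not part of Bio::DB::EUtilities, and it deliberately accepts just the two forms this text mentions (a full yyyy/mm/dd date or a bare year).

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): true if $date looks like
# yyyy/mm/dd or a bare four-digit year.
sub is_valid_date_limit {
    my ($date) = @_;
    return 0 unless defined $date;
    return 1 if $date =~ /\A\d{4}\z/;                  # year only
    if ($date =~ m{\A(\d{4})/(\d{2})/(\d{2})\z}) {     # yyyy/mm/dd
        my ($month, $day) = ($2, $3);
        # Range check only; does not verify days per month.
        return ($month >= 1 && $month <= 12 && $day >= 1 && $day <= 31) ? 1 : 0;
    }
    return 0;
}

for my $date ('2005', '2005/06/15', '15/06/2005', '2005/13/01') {
    printf "%-12s %s\n", $date, is_valid_date_limit($date) ? 'ok' : 'rejected';
}
```

The check is intentionally loose (it does not know that February has fewer than 31 days); it is meant to catch a wrong field order or separator, not to validate calendars.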
rettype
Output type based on the database. Not all return types are compatible with
all return modes (-retmode). For more information, see the specific
literature or sequence database links at the URL above.
Literature databases have the below return types:
uilist (all databases),
abstract,
citation,
medline (not omim),
full (journals and omim)
Sequence databases have the below return types:
native (full record, all databases),
fasta,
seqid,
acc (nucleotide or protein),
gb,
gbc,
gbwithparts (nucleotide only),
est (dbEST only),
gss (dbGSS only),
gp,
gpc (protein only),
chr,
flt,
rsr,
brief,
docset (dbSNP only)
retmode
EFetch is set, by default, to return a specific format for each Entrez database;
this is set in the %DATABASE hash in
Bio::DB::EUtilities. To override this
format, you can set -retmode. The normal return modes are text, HTML, XML,
and ASN1. Error checking for the set return mode is currently not
implemented.
report
Used for the output format for Taxonomy; set to one of
uilist,
brief,
docsum,
xml
strand - sequence only
The strand of DNA to show: 1=plus, 2=minus
seq_start, seq_stop - sequence only
The start and end coordinates of the sequence to display
complexity - sequence only
The GI is often part of a biological blob containing other GIs:
* 0 - get the whole blob
* 1 - get the bioseq for gi of interest (default in Entrez)
* 2 - get the minimal bioseq-set containing the gi of interest
* 3 - get the minimal nuc-prot containing the gi of interest
* 4 - get the minimal pub-set containing the gi of interest
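To illustrate how the parameters described so far fit together, here is a minimal sketch that assembles them into a query string. It does not use Bio::DB::EUtilities and sends nothing over the network. The endpoint URL, the helper name `efetch_url`, and the GI numbers are assumptions made for this example only.

```perl
use strict;
use warnings;

# Assumed EFetch endpoint; it is not given in the text above.
my $BASE = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';

# Hypothetical helper: turn a hash of EFetch parameters into a query URL.
sub efetch_url {
    my (%param) = @_;
    # IDs may be given as an array reference; join them with commas.
    $param{id} = join ',', @{ $param{id} } if ref($param{id}) eq 'ARRAY';
    my @pairs;
    for my $key (sort keys %param) {
        # Skip only undefined values, so that 0 (a valid complexity) is kept.
        next unless defined $param{$key};
        # Percent-encode everything except unreserved characters and commas.
        (my $value = $param{$key}) =~
            s/([^A-Za-z0-9_.~,-])/sprintf('%%%02X', ord($1))/ge;
        push @pairs, "$key=$value";
    }
    return $BASE . '?' . join('&', @pairs);
}

# The GI numbers below are made up for illustration.
my $url = efetch_url(
    db         => 'nucleotide',
    id         => [ 12345, 67890 ],
    rettype    => 'fasta',
    retmode    => 'text',
    strand     => 2,      # minus strand
    seq_start  => 1,
    seq_stop   => 100,
    complexity => 0,      # whole blob
);
print "$url\n";
```

Parameters are emitted in sorted order only to make the output predictable; the order is not significant to the query itself.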
The following are Bioperl-related settings and are not used as CGI parameters:
eutil
The relevant EUtility to be used (efetch).
cookie
Uses a Cookie-based search (see below)
sub _initialize {
    my ($self, @args) = @_;
    $self->SUPER::_initialize(@args);
    # Pull the EFetch-specific named arguments (-retmode, -rettype, ...) out of @args.
    my ($retmode, $reldate, $mindate, $maxdate, $datetype, $rettype, $retstart,
        $retmax, $report, $seq_start, $seq_stop, $strand, $complexity) =
      $self->_rearrange([qw(RETMODE RELDATE MINDATE MAXDATE DATETYPE RETTYPE
        RETSTART RETMAX REPORT SEQ_START SEQ_STOP STRAND COMPLEXITY)], @args);
    # Note: $reldate, $mindate and $maxdate are extracted above but are not
    # passed to any accessor in this method.
    $self->_eutil($EUTIL);
    # datetype falls back to 'mdat' when not supplied.
    $datetype ||= 'mdat';
    $self->datetype($datetype) if $datetype;
    # retstart and complexity may legitimately be 0, so test definedness
    # rather than truth; the remaining settings are skipped when false.
    defined($retstart)   && $self->retstart($retstart);
    $retmode             && $self->retmode($retmode);
    $retmax              && $self->retmax($retmax);
    $rettype             && $self->rettype($rettype);
    $seq_start           && $self->seq_start($seq_start);
    $seq_stop            && $self->seq_stop($seq_stop);
    $strand              && $self->strand($strand);
    defined($complexity) && $self->complexity($complexity);
    $report              && $self->report($report);
}
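The initializer above mixes two guards: a plain truth test for most settings and `defined(...)` for retstart and complexity. The difference matters because 0 is a meaningful complexity value (the whole blob). The sketch below illustrates this in isolation; `stored_settings` is a hypothetical stand-in for the accessor calls and is not part of the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the accessor calls in _initialize: returns a
# hash reference of the settings that would actually be stored.
sub stored_settings {
    my (%arg) = @_;
    my %stored;
    # Truth test, as used for strand, retmax, rettype, ...: a value of 0 is skipped.
    $arg{strand} && ($stored{strand} = $arg{strand});
    # Definedness test, as used for complexity and retstart: 0 is kept.
    defined($arg{complexity}) && ($stored{complexity} = $arg{complexity});
    return \%stored;
}

my $stored = stored_settings(strand => 0, complexity => 0);
print join(',', sort keys %$stored), "\n";    # prints "complexity"
```

For strand the truth test is harmless, since the documented values are 1 and 2. It would silently drop a 0 passed for any setting where 0 is a valid value, which is presumably why retstart and complexity are guarded with `defined`.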