API documentation

class taxoniq.Accession(accession_id: str)[source]

An object representing an NCBI GenBank nucleotide or protein sequence accession ID. Taxoniq uses this class to represent sequences associated with taxa; use Taxon as the starting point.

property blast_db

The BLAST database in which this sequence accession ID was indexed.

property blast_db_volume

The numeric BLAST database volume ID in which this sequence accession was indexed.

property db_offset

The byte offset in the BLAST database volume at which this sequence starts.

get_from_gs()[source]

Returns a file-like object streaming the nucleotide sequence for this accession from the Google Storage NCBI BLAST database mirror (https://registry.opendata.aws/ncbi-blast-databases/), if available.

get_from_s3()[source]

Returns a file-like object streaming the nucleotide sequence for this accession from the AWS S3 NCBI BLAST database mirror (https://registry.opendata.aws/ncbi-blast-databases/), if available.
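Because get_from_gs() and get_from_s3() are described as returning file-like objects, the sequence can be consumed in chunks rather than loaded in one call. The sketch below is a minimal illustration that uses an in-memory io.BytesIO as a stand-in for the returned stream; the helper name read_sequence, the chunk size, and the assumption that the stream may yield ASCII bytes are all illustrative and not part of Taxoniq.

```python
import io


def read_sequence(stream, chunk_size=65536):
    """Read a file-like object to the end in fixed-size chunks and return text."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes/str value signals end of stream
            break
        # Assumption: the stream may yield bytes; decode them as ASCII text.
        if isinstance(chunk, bytes):
            chunk = chunk.decode("ascii")
        chunks.append(chunk)
    return "".join(chunks)


# In-memory stand-in for the object returned by get_from_s3() or get_from_gs().
demo_stream = io.BytesIO(b"ACGTACGTAC")
demo_sequence = read_sequence(demo_stream, chunk_size=4)
print(demo_sequence, len(demo_sequence))
```

With the library installed, the same helper would presumably be called as read_sequence(accession.get_from_s3()) on an Accession object.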

property length

The length of the sequence (number of nucleotides or amino acids).

property tax_id

The taxon ID associated with this sequence accession ID.

url()[source]

Returns the HTTPS URL for the NCBI GenBank web page representing this sequence accession ID.
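The members listed above can be gathered into a single record for logging or inspection. The following is a sketch only: summarize_accession is a hypothetical helper, and the stub object with made-up values stands in for a real Accession so that the example runs without the library or its data packages.

```python
from types import SimpleNamespace


def summarize_accession(accession):
    """Collect the documented Accession members into a plain dict."""
    return {
        "blast_db": accession.blast_db,
        "blast_db_volume": accession.blast_db_volume,
        "db_offset": accession.db_offset,
        "length": accession.length,
        "tax_id": accession.tax_id,
        "url": accession.url(),  # url() is documented as a method on Accession
    }


# Hypothetical stand-in with invented values; not real database content.
stub_accession = SimpleNamespace(
    blast_db="example_db",
    blast_db_volume=0,
    db_offset=1024,
    length=10,
    tax_id=123,
    url=lambda: "https://example.org/accession",
)
summary = summarize_accession(stub_accession)
print(summary)
```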

class taxoniq.BLASTDatabase(value)

An enumeration.

exception taxoniq.NoValue[source]

class taxoniq.Rank(value)

An enumeration.

class taxoniq.Taxon(tax_id: Optional[int] = None, accession_id: Optional[str] = None, scientific_name: Optional[str] = None)[source]

An object representing an NCBI Taxonomy taxon, identified by its taxon ID. The object can be instantiated by uniquely identifying a taxon using the numeric taxon ID, an alphanumeric accession ID of a sequence associated with the taxon ID, or the scientific name of the taxon.
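Since a taxon is identified by one of three alternative keys, calling code may want to check its inputs before constructing a Taxon. The sketch below is a hypothetical pre-check, not Taxoniq's own validation; it assumes that exactly one of the three keyword arguments should be supplied.

```python
def taxon_kwargs(tax_id=None, accession_id=None, scientific_name=None):
    """Return the single identifying keyword argument, or raise ValueError."""
    candidates = {
        "tax_id": tax_id,
        "accession_id": accession_id,
        "scientific_name": scientific_name,
    }
    given = {key: value for key, value in candidates.items() if value is not None}
    # Assumption: exactly one identifier is expected per Taxon.
    if len(given) != 1:
        raise ValueError("expected exactly one of: " + ", ".join(candidates))
    return given


print(taxon_kwargs(tax_id=123))
```

The result could then be passed on as taxoniq.Taxon(**taxon_kwargs(tax_id=123)).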

property best_available_description

Introductory paragraph from English Wikipedia for this taxon or the first parent taxon where a description is available.
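The fallback described here can be pictured as a walk up the parent chain until a description is found. This is a sketch of the idea rather than the library's implementation; it uses stub objects and assumes that a missing description reads as None, which may differ from how Taxoniq signals a missing value.

```python
from types import SimpleNamespace


def first_available_description(taxon):
    """Return the first non-empty description found on the taxon or its ancestors."""
    node = taxon
    while node is not None:
        text = getattr(node, "description", None)
        if text:
            return text
        node = node.parent  # the root's parent is None, which ends the loop
    return None


# Stub taxa: only the genus carries a description.
root = SimpleNamespace(description=None, parent=None)
genus = SimpleNamespace(description="A genus of example organisms.", parent=root)
species = SimpleNamespace(description=None, parent=genus)
print(first_available_description(species))
```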

property best_refseq_taxon

The best related taxon for which a RefSeq representative genome sequence is available. For viruses, this is the RefSeq “genome neighbor” as defined in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383986/ and retrieved from https://ftp.ncbi.nlm.nih.gov/genomes/Viruses/Viruses_RefSeq_and_neighbors_genome_data.tab. For other domains, this is the RefSeq representative genome for the taxon’s species, if available, as listed in the species_taxid column of https://ftp.ncbi.nlm.nih.gov/genomes/refseq/assembly_summary_refseq.txt.

The accessions for the genome can be accessed as follows:

Taxon(123).best_refseq_taxon.refseq_representative_genome_accessions
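Building on that expression, the lengths of the returned Accession objects can be summed to estimate the size of the representative genome. The sketch below runs against stub objects; representative_genome_length is a hypothetical helper, and treating an unavailable best_refseq_taxon as None is an assumption.

```python
from types import SimpleNamespace


def representative_genome_length(taxon):
    """Sum the lengths of all sequences in the representative genome."""
    best = taxon.best_refseq_taxon
    if best is None:  # assumption: None when no related RefSeq taxon exists
        return 0
    return sum(acc.length for acc in best.refseq_representative_genome_accessions)


# Stub data: a genome made of two sequences with invented lengths.
best_stub = SimpleNamespace(
    refseq_representative_genome_accessions=[
        SimpleNamespace(length=4000),
        SimpleNamespace(length=600),
    ]
)
taxon_stub = SimpleNamespace(best_refseq_taxon=best_stub)
total_length = representative_genome_length(taxon_stub)
print(total_length)
```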

property child_nodes

Returns a list of Taxon objects that list this taxon as their parent.
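Applying child_nodes repeatedly visits the whole subtree below a taxon. The following sketch shows an iterative depth-first traversal over stub objects; iter_descendants and the name attribute on the stubs are hypothetical and used only for illustration.

```python
from types import SimpleNamespace


def iter_descendants(taxon):
    """Yield every node below the given taxon, depth-first, without recursion."""
    stack = list(taxon.child_nodes)
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node.child_nodes)


# Stub tree: top -> mid -> (leaf_a, leaf_b)
leaf_a = SimpleNamespace(name="leaf_a", child_nodes=[])
leaf_b = SimpleNamespace(name="leaf_b", child_nodes=[])
mid = SimpleNamespace(name="mid", child_nodes=[leaf_a, leaf_b])
top = SimpleNamespace(name="top", child_nodes=[mid])
descendant_names = sorted(node.name for node in iter_descendants(top))
print(descendant_names)
```

On a large real taxon, a traversal like this could touch a very large number of nodes, so a depth or count limit may be worth adding.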

closest_taxon_with_refseq_genome()[source]

Returns the taxon that is closest by phylogenetic distance, as computed by WoL, and that has an associated RefSeq genome.

property common_name

Common name of the taxon. In Taxoniq, this is defined as the NCBI Taxonomy BLAST name if available, otherwise the GenBank common name if available, otherwise the first listed common name. See https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245000/ for definitions of these fields.

property description

Introductory paragraph for this taxon from English Wikipedia, if available.

classmethod distance(taxa)[source]

The phylogenetic distance between taxa, as computed by WoL.

property host

A text description of a symbiont host or hosts for this taxon’s organism, if any.

classmethod lca(taxa)[source]

Given a list of Taxon objects, returns the last common ancestor taxon.
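One common way to find such an ancestor is to compare lineages from the root downward and keep the deepest node they all share. The sketch below illustrates that technique on plain integer IDs; it is not Taxoniq's implementation, and the IDs are invented.

```python
def lca_from_lineages(lineages):
    """Return the deepest node shared by all lineages.

    Each lineage lists node IDs from a taxon up to the root.
    """
    if not lineages:
        return None
    # Reverse each lineage so that index 0 is the root.
    root_first = [list(reversed(lineage)) for lineage in lineages]
    ancestor = None
    # zip stops at the shortest lineage, so no index can run out of range.
    for level in zip(*root_first):
        if all(node == level[0] for node in level):
            ancestor = level[0]
        else:
            break
    return ancestor


example_lineages = [[5, 3, 2, 1], [6, 3, 2, 1], [4, 2, 1]]
common_ancestor = lca_from_lineages(example_lineages)
print(common_ancestor)
```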

property lineage

Lineage for this taxon (the list of parent nodes from the taxon to the root of the taxonomic tree).

property parent

The parent taxon for this taxon.

For the root of the tree, parent is None.
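Because the root's parent is None, following parent repeatedly is a natural way to terminate a walk up the tree. The sketch below rebuilds a taxon-to-root path this way using stub objects; walk_to_root and the name attribute are hypothetical, and including the starting taxon in the path is an assumption about how a lineage is laid out.

```python
from types import SimpleNamespace


def walk_to_root(taxon):
    """Return the nodes from the given taxon up to and including the root."""
    path = []
    node = taxon
    while node is not None:
        path.append(node)
        node = node.parent
    return path


# Stub chain: leaf_node -> middle_node -> root_node
root_node = SimpleNamespace(name="root", parent=None)
middle_node = SimpleNamespace(name="middle", parent=root_node)
leaf_node = SimpleNamespace(name="leaf", parent=middle_node)
path_names = [node.name for node in walk_to_root(leaf_node)]
print(path_names)
```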

property rank

Rank of the taxon. See https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7408187/#sec9title for more details.

property ranked_child_nodes

List of child nodes in the next main taxonomic rank (species, genus, family, order, class, phylum, kingdom, superkingdom).

property ranked_lineage

Lineage of main taxonomic ranks (species, genus, family, order, class, phylum, kingdom, superkingdom).
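The relationship between a full lineage and its ranked form can be sketched as a filter that keeps only nodes at the main ranks listed above. The example works on plain (name, rank) pairs with invented names; representing ranks as lowercase strings is an assumption and may not match how the Rank enumeration is exposed.

```python
MAIN_RANKS = (
    "species", "genus", "family", "order",
    "class", "phylum", "kingdom", "superkingdom",
)


def filter_main_ranks(lineage):
    """Keep only the (name, rank) pairs whose rank is a main taxonomic rank."""
    return [(name, rank) for name, rank in lineage if rank in MAIN_RANKS]


# Invented lineage with two intermediate nodes that are not main ranks.
full_lineage = [
    ("Example species", "species"),
    ("Example subgenus", "subgenus"),
    ("Example genus", "genus"),
    ("Example clade", "no rank"),
    ("Example family", "family"),
]
ranked = filter_main_ranks(full_lineage)
print(ranked)
```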

property refseq_genome_accessions

A list of Accession objects for sequences in the most recent RefSeq genome assembly for this taxon, if available.

property refseq_representative_genome_accessions

A list of Accession objects for sequences in the RefSeq representative genome assembly for this taxon, if available.

property scientific_name

The unique scientific name of the taxon.

property url

Returns the HTTPS URL for the NCBI Taxonomy web page representing this taxon.

property wikidata_id

Wikidata ID for this taxon.

property wikidata_url

URL of the Wikidata web page representing this taxon.

exception taxoniq.TaxoniqException[source]