Organism Taxonomy (taxonomy)

This module provides access to the NCBI organism taxonomy and unifies organism names across different modules.

orangecontrib.bioinformatics.ncbi.taxonomy.name(tax_id)

Return the scientific name of the organism with the given taxonomy ID.

Parameters: tax_id (str) – Taxonomy ID (NCBI taxonomy database)
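
Because tax_id is documented as a str, callers that hold integer IDs (for example, read from a spreadsheet) may want to normalize them first. The following is a minimal sketch of such a step; normalize_taxid is a hypothetical helper, not part of this module, and it assumes that taxonomy IDs consist of ASCII digits only.

```python
def normalize_taxid(value):
    """Return `value` as a digits-only taxonomy ID string.

    Hypothetical helper, not part of the taxonomy module.
    Raises ValueError if the value does not look like a taxonomy ID.
    """
    text = str(value).strip()
    # Assumption: a taxonomy ID is a non-empty run of ASCII digits.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a valid taxonomy ID: {value!r}")
    return text


print(normalize_taxid(9606))       # 9606
print(normalize_taxid(" 10090 "))  # 10090
```

The returned string could then be passed to name() or any other function on this page that takes tax_id.
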

orangecontrib.bioinformatics.ncbi.taxonomy.other_names(tax_id)

Return a list of (name, name_type) tuples excluding the scientific name.

Use name() to retrieve the scientific name.

Parameters: tax_id (str) – Taxonomy ID (NCBI taxonomy database)
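
Since the result is a flat list of (name, name_type) tuples, it can be convenient to group the names by their type. The sketch below shows one way to do that; group_names_by_type is a hypothetical helper, and the sample input is made up for illustration rather than copied from the database.

```python
from collections import defaultdict


def group_names_by_type(pairs):
    """Group (name, name_type) tuples into {name_type: [name, ...]}.

    Hypothetical helper, not part of the taxonomy module.
    """
    grouped = defaultdict(list)
    for name, name_type in pairs:
        grouped[name_type].append(name)
    return dict(grouped)


# Illustrative input only; real calls would pass the list
# returned by other_names(tax_id).
sample = [
    ("human", "common name"),
    ("man", "common name"),
    ("Homo sapiens Linnaeus", "authority"),
]
for name_type, names in group_names_by_type(sample).items():
    print(name_type, names)
```

With the module installed, the same helper could be applied to the list returned by other_names() for any taxonomy ID.
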

orangecontrib.bioinformatics.ncbi.taxonomy.search(string, onlySpecies=True, exact=False)

Search the NCBI taxonomy database for an organism.

Parameters:
  • string (str) – Search string.
  • onlySpecies (bool) – Return only taxids of species (and subspecies).
  • exact (bool) – Return only taxids of organisms that exactly match the string.
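
A common pattern is to try an exact match first and fall back to a looser search only when nothing is found. The sketch below illustrates that pattern under two assumptions: that search returns an iterable of taxids, and that an ambiguous result should be treated as an error. resolve_taxid and fake_search are hypothetical; fake_search stands in for the real function so the example runs without the taxonomy database.

```python
def resolve_taxid(query, search):
    """Resolve `query` to exactly one taxid, or raise LookupError.

    `search` is any callable with the signature
    search(string, onlySpecies=True, exact=False),
    assumed to return an iterable of taxids.
    """
    for exact in (True, False):
        hits = list(search(query, onlySpecies=True, exact=exact))
        if len(hits) == 1:
            return hits[0]
        if len(hits) > 1:
            raise LookupError(f"{query!r} is ambiguous: {hits}")
    raise LookupError(f"no organism found for {query!r}")


def fake_search(string, onlySpecies=True, exact=False):
    """Stand-in for taxonomy.search, backed by three known organisms."""
    names = {
        "9606": "Homo sapiens",
        "10090": "Mus musculus",
        "10116": "Rattus norvegicus",
    }
    needle = string.lower()
    if exact:
        return [t for t, n in names.items() if n.lower() == needle]
    return [t for t, n in names.items() if needle in n.lower()]


print(resolve_taxid("Homo sapiens", fake_search))  # 9606
print(resolve_taxid("rattus", fake_search))        # 10116
```

In real use, the search function from this module would be passed in place of fake_search.
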

orangecontrib.bioinformatics.ncbi.taxonomy.lineage(tax_id)

Return a list of taxids ordered from the topmost node (root) to tax_id.

Parameters: tax_id (str) – Taxonomy ID (NCBI taxonomy database)
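
Because the list is ordered root first, two lineages share a common prefix up to their last common ancestor. The sketch below relies only on that ordering; last_common_ancestor is a hypothetical helper, and the IDs in the sample lists are placeholders, not real taxonomy IDs.

```python
def last_common_ancestor(lineage_a, lineage_b):
    """Return the deepest taxid shared by two root-first lineages.

    Hypothetical helper, not part of the taxonomy module.
    Returns None if the lineages share no common prefix.
    """
    ancestor = None
    for node_a, node_b in zip(lineage_a, lineage_b):
        if node_a != node_b:
            break
        ancestor = node_a
    return ancestor


# Placeholder IDs for illustration; real calls would pass
# lineage(tax_id) for two organisms.
first = ["1", "100", "200", "301"]
second = ["1", "100", "250", "402"]
print(last_common_ancestor(first, second))  # 100
```
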

orangecontrib.bioinformatics.ncbi.taxonomy.common_taxids()

Return the taxonomy IDs of the most common organisms.

Return type: list of taxonomy IDs of common organisms

orangecontrib.bioinformatics.ncbi.taxonomy.common_taxid_to_name(tax_id)

Return the name of a common organism given its taxonomy ID.

Parameters: tax_id (str) – Taxonomy ID (NCBI taxonomy database)
Return type: str
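
Combined with common_taxids(), this function can be used to build a name-sorted lookup table, for instance to populate an organism selection list. The sketch below is one possible way to do so; common_organism_table is a hypothetical helper, and the stand-in mapping reuses names from the example output on this page, which was produced with name(), so common_taxid_to_name may return different strings.

```python
def common_organism_table(taxids, to_name):
    """Return (name, taxid) pairs sorted by name.

    Hypothetical helper, not part of the taxonomy module.
    `to_name` is any callable that maps a taxid to a name.
    """
    return sorted((to_name(taxid), taxid) for taxid in taxids)


# Stand-in data so the sketch runs without the taxonomy database.
stand_in_names = {
    "9606": "Homo sapiens",
    "10090": "Mus musculus",
    "3702": "Arabidopsis thaliana",
}
table = common_organism_table(stand_in_names, stand_in_names.__getitem__)
for organism, taxid in table:
    print(f"{taxid:<6} {organism}")
```

With the module installed, the call would presumably take the form common_organism_table(common_taxids(), common_taxid_to_name).
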

orangecontrib.bioinformatics.ncbi.taxonomy.taxname_to_taxid(organism_name)

Return the taxonomy ID for an organism name.

Parameters: organism_name (str) – Official organism name, e.g., ‘Homo sapiens’
Return type: str

Examples

The following script iterates over the taxonomy IDs of the common organisms and prints each ID together with its scientific name:

from orangecontrib.bioinformatics.ncbi import taxonomy

# Print each common taxonomy ID, left-aligned, followed by its scientific name.
for taxid in taxonomy.common_taxids():
    print("%-6s %s" % (taxid, taxonomy.name(taxid)))

The output of the script is:

3702   Arabidopsis thaliana
9913   Bos taurus
6239   Caenorhabditis elegans
5476   Candida albicans
3055   Chlamydomonas reinhardtii
7955   Danio rerio
352472 Dictyostelium discoideum AX4
7227   Drosophila melanogaster
562    Escherichia coli
11103  Hepatitis C virus
9606   Homo sapiens
10090  Mus musculus
2104   Mycoplasma pneumoniae
4530   Oryza sativa
5833   Plasmodium falciparum
4754   Pneumocystis carinii
10116  Rattus norvegicus
4932   Saccharomyces cerevisiae
4896   Schizosaccharomyces pombe
31033  Takifugu rubripes
8355   Xenopus laevis
4577   Zea mays