Explore expression metrics and annotation information for your gene



Remember: all expression values above the chosen maximum limit are shown in the highest color of the scale.
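The capping behaviour described above can be illustrated with a minimal base R sketch. The TPM values and the `max_limit` variable are made up for illustration, and this is not necessarily how the application implements its color scale.

```r
# Toy TPM values for one gene across five samples (made-up numbers)
tpm <- c(0.5, 12, 48, 150, 900)

# Hypothetical maximum limit chosen for the color scale
max_limit <- 100

# Values above the limit are capped, so they all map to the top color
capped <- pmin(tpm, max_limit)
capped
```

With this limit, the samples at 150 and 900 TPM become indistinguishable on the color scale, so raising the limit is the way to resolve differences among highly expressed samples.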

Expression per tissue

Median expression by tissue


Mean and Median expression by tissue








In the figure on the left, tissues are colored by mean expression values in TPM. The figure on the right shows the median and mean expression of the searched gene in each tissue. For simplicity, all trichome samples are grouped as Trichome, all seed samples are grouped as Seed, and male induced flower samples are grouped with male flowers in this figure and in the barplot. Because the Cannabis Expression Atlas is highly heterogeneous, this figure should never be interpreted alone.
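As a rough illustration of the per-tissue summary described above, the following base R sketch groups samples and computes the mean and median for one gene. The tissue labels and TPM values are hypothetical, and the grouping rule (matching "trichome" and "seed" in the label) is an assumption, not the application's actual code.

```r
# Toy data: TPM of one gene in eight samples (made-up labels and values)
samples <- data.frame(
  tissue = c("Glandular trichome", "Stalked trichome", "Leaf", "Leaf",
             "Mature seed", "Root", "Root", "Root"),
  tpm    = c(120, 80, 5, 7, 2, 10, 20, 90),
  stringsAsFactors = FALSE
)

# Collapse all trichome labels into "Trichome" and all seed labels into "Seed"
grp <- samples$tissue
grp[grepl("trichome", grp, ignore.case = TRUE)] <- "Trichome"
grp[grepl("seed", grp, ignore.case = TRUE)] <- "Seed"

# Mean and median expression per grouped tissue
means   <- tapply(samples$tpm, grp, mean)
medians <- tapply(samples$tpm, grp, median)
summary_by_tissue <- data.frame(
  tissue = names(means),
  mean   = as.vector(means),
  median = as.vector(medians),
  stringsAsFactors = FALSE
)
summary_by_tissue
```

In the toy Root group, one highly expressed sample pulls the mean well above the median, which is why looking at both statistics helps in a heterogeneous dataset.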

Global expression profile

t-SNE plot of all samples colored by tissue

t-SNE plot of all samples colored by expression


Visualization of the input gene's expression in each sample (right) with the support of a t-SNE representation of samples colored by tissue (left). Given the heterogeneity of the Cannabis Expression Atlas, these plots are particularly useful to find out if the input gene is expressed in only a subset of the samples of particular tissues.

Explore metadata and expression of multiple genes at a time.



Ex: Trichome-specific genes, Housekeeping genes

Download options

Filtered data (.csv)

Download data based on the columns used as filters. If no filters are used, you can download the complete data.



Remember: for performance reasons, the plots must be refreshed after the table is filtered.

Graphical Data summary

Number of genes per expression category


Distribution of Tau index across classes


Number of genes per specific tissue or group



Here you can see the 20 most frequent terms of:

Click on the + button to see the plot.

Gene Description


Gene Ontology (UniProt)


TF family


KEGG pathway


Ex: Male-flower-specific genes, Hypocotyl-specific genes

Download options

Download data (.tsv)

Available after search

Data summary

Gene expression heatmap


Sample Metadata


Use your own nucleotide or protein sequences to search for genes in this database.


Input sequence (FASTA format)

Note: Limit of 30 sequences at a time.

Ex: Example nucleotide sequence, Example amino acid sequence
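Before submitting, it can help to check how many sequences a FASTA input contains. The sketch below counts header lines in base R; `fasta_text` is a hypothetical stand-in for the pasted input, and the check is only an illustration, not the validation the application itself performs.

```r
# Hypothetical input: lines of text as pasted into the sequence box
fasta_text <- c(
  ">seq1",
  "ATGGCTAGCTAGGCTA",
  "GGCTAGCTA",
  ">seq2",
  "ATGCCGTTAGC"
)

# Each FASTA record starts with a header line beginning with ">"
n_sequences <- sum(startsWith(fasta_text, ">"))

# The 30-sequence limit is the one stated in the note on this page
max_sequences <- 30
within_limit <- n_sequences <= max_sequences
within_limit
```

For a file on disk, `readLines()` on the file path would produce the same kind of character vector.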

BLAST options

Download options

BLAST Results


Download data by tissue or BioProject of interest.





Download options

Data summary




Download data (.tsv)
Expression matrix

Sample metadata


Number of Samples per Tissue


Click the row corresponding to a BioProject to be able to download its expression matrix and sample metadata in the 'Download options' box. You can use the filter boxes above each column to filter based on a particular variable, or the global search box (above all columns) to find matches in any column.

NOTE: You can only download 1 BioProject at a time to avoid server overload.

Download options

Data summary
Download data (.tsv)

Frequently Asked Questions

  1. What was the pipeline used to create this atlas?

The pipeline used to create this atlas is summarized below.


  2. What does “bias-corrected counts” mean?

RNA-seq quantification tools (including salmon, the one used here) report the number of reads mapped to each transcript, typically called raw read counts. However, raw counts are biased estimates of transcript abundance, because they also vary with gene length and library size. To correct for these biases, we used the “bias correction without an offset” method implemented in the Bioconductor package tximport, which derives counts from abundances scaled by the average transcript length over samples, and then by library size.
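The two scaling steps can be sketched in base R on toy matrices. This is an illustration of the idea only: all values are made up, the variable names are hypothetical, and the atlas itself relied on tximport rather than on code like this. The description above appears to correspond to tximport's `countsFromAbundance = "lengthScaledTPM"` option, but that mapping is an assumption.

```r
# Toy inputs for 3 transcripts x 2 samples (all values are made up)
dn <- list(c("tx1", "tx2", "tx3"), c("s1", "s2"))
tpm        <- matrix(c(10, 20, 70, 30, 30, 40), nrow = 3, dimnames = dn)
eff_len    <- matrix(c(1000, 2000, 500, 1200, 1800, 500), nrow = 3, dimnames = dn)
raw_counts <- matrix(c(100, 400, 500, 600, 900, 500), nrow = 3, dimnames = dn)

# Step 1: scale abundances by the average transcript length over samples
length_scaled <- tpm * rowMeans(eff_len)

# Step 2: rescale each sample (column) so that it sums to its library size
lib_size <- colSums(raw_counts)
scaled_counts <- t(t(length_scaled) * (lib_size / colSums(length_scaled)))
scaled_counts
```

After step 2, each column of `scaled_counts` sums to the same total as the corresponding column of `raw_counts`, so library sizes are preserved while per-sample length differences are averaged out.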


  3. Can I obtain transcript-level abundance estimates with this web application?

This application allows the exploration, visualization, and download of gene-level transcript abundances (i.e., “gene expression”) only. However, transcript-level abundances are available in the FigShare repository associated with this project, in se_atlas_transcript.rda, an RData file that stores a SummarizedExperiment object.

The SummarizedExperiment object stores two assays named tx_TPM and tx_counts with transcript-level abundances in TPM and read counts, respectively.


  4. Can I obtain a single file with all expression data in the Cannabis Expression Atlas?

Yes. Quantitative data for gene- and transcript-level abundances can be found in the FigShare repository associated with this project, in RData files named se_atlas_gene.rda and se_atlas_transcript.rda, respectively. These RData files store SummarizedExperiment objects with the following assays:

  • se_atlas_gene.rda: assays named gene_TPM and gene_counts.
  • se_atlas_transcript.rda: assays named tx_TPM and tx_counts.

To load the SummarizedExperiment object into an R session and access the data, you would run the following R code:

library(SummarizedExperiment)

# Load gene-level abundance data
load("se_atlas_gene.rda")

# Access the matrix with gene expression in TPM
assay(se_atlas_gene, "gene_TPM")

# Access the matrix with gene expression in bias-corrected counts
assay(se_atlas_gene, "gene_counts")

# Access sample metadata
colData(se_atlas_gene)

For more information on how to work with SummarizedExperiment objects, check the package’s documentation.

NOTE: these files are very large, as they store matrices with more than 27000 rows (genes) and 390 columns (samples). Because R holds data in memory, make sure you have enough memory available if you want to work with the entire dataset.
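For a rough sense of scale, the sketch below estimates the memory needed for one dense numeric matrix with the dimensions given above. It is a lower bound under stated assumptions: 8 bytes per value (R's double type), exactly 27000 rows, and a single assay. The actual objects hold two assays plus metadata, so real usage will be higher.

```r
n_genes   <- 27000  # lower bound stated in the note above
n_samples <- 390
bytes_per_value <- 8  # R stores numeric values as 8-byte doubles

# Approximate size of one dense numeric matrix
matrix_bytes <- n_genes * n_samples * bytes_per_value
matrix_mb <- matrix_bytes / 1e6
matrix_mb
```

Once an object is loaded, `object.size()` from base R reports its actual in-memory size.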

  5. How do I report a bug or issue?

You can open an issue in this GitHub repository, which was created specifically as a communication channel with our users.