CannabisExpressionAtlas

Explore expression metrics and annotation information for your gene

Search by gene ID

Ex: THCAS, Pectate lyase, ATP synthase

Choose color scale for expression visualization (maximum TPM):

Remember: All expression values above the chosen maximum limit are shown in the highest color of the scale.

Expression per tissue

Median expression by tissue

Mean and Median expression by tissue

In the left figure, tissues are colored based on mean expression values in TPM. The figure on the right shows the median and mean expression of the searched gene in each tissue. For simplicity, all trichome samples are grouped as Trichome, all seed samples are grouped as Seed, and male induced flower samples are grouped with male flowers in this figure and barplot. Because the samples in the Cannabis Expression Atlas are highly heterogeneous, this figure should never be interpreted in isolation.
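The per-tissue summary described above can be approximated offline from downloaded data. The following is a minimal sketch on a small made-up table; the column names (`sample`, `tissue`, `TPM`), the tissue labels, and the values are assumptions for illustration and do not reflect the atlas's actual schema.

```r
# Toy data: TPM of one gene across samples (labels and values are hypothetical)
expr <- data.frame(
  sample = paste0("S", 1:9),
  tissue = c("Glandular trichome", "Trichome head", "Trichome stalk",
             "Leaf", "Leaf", "Seed coat", "Seed embryo",
             "Male flower", "Male induced flower"),
  TPM    = c(120, 80, 10, 5, 7, 0.5, 1.5, 10, 30)
)

# Collapse related tissues into broader groups, mirroring the grouping in the text
expr$group <- expr$tissue
expr$group[grepl("trichome", expr$tissue, ignore.case = TRUE)] <- "Trichome"
expr$group[grepl("seed", expr$tissue, ignore.case = TRUE)] <- "Seed"
expr$group[grepl("^male", expr$tissue, ignore.case = TRUE)] <- "Male flower"

# Mean and median TPM per tissue group
tissue_summary <- aggregate(
  TPM ~ group, data = expr,
  FUN = function(x) c(mean = mean(x), median = median(x))
)
# Flatten the matrix column produced by aggregate() into TPM.mean / TPM.median
tissue_summary <- do.call(data.frame, tissue_summary)
print(tissue_summary)
```

Reporting both statistics is useful here because a few highly expressing samples can pull the mean well above the median within a heterogeneous tissue group.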

Global expression profile

t-SNE plot of all samples colored by tissue

t-SNE plot of all samples colored by expression

Expression of the input gene in each sample (right), shown alongside a t-SNE representation of all samples colored by tissue (left). Given the heterogeneity of the Cannabis Expression Atlas, these plots are particularly useful for checking whether the input gene is expressed in only a subset of the samples from a particular tissue.

Explore metadata and expression of multiple genes at a time.

Use your own nucleotide or protein sequences to search for genes in this database.

Input sequence (FASTA format)

Note: A maximum of 30 sequences can be submitted at a time.
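Before submitting, the number of sequences in a FASTA input can be checked locally. This is a minimal sketch; the helper name `count_fasta_records` and the example sequences are hypothetical, and the check simply counts header lines starting with `>`.

```r
# Hypothetical helper: count FASTA records by counting header lines (">")
count_fasta_records <- function(fasta_text) {
  lines <- unlist(strsplit(fasta_text, "\r?\n"))
  sum(startsWith(lines, ">"))
}

# Made-up example input with two records
fasta_text <- ">seq1\nATGGCTAGCTAG\n>seq2\nATGCCCGGGTTT\n"
n_records <- count_fasta_records(fasta_text)

max_records <- 30  # limit stated in the text
if (n_records > max_records) {
  stop("Too many sequences: ", n_records, " (limit is ", max_records, ")")
}
n_records
```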

Ex: Example nucleotide sequence, Example amino acid sequence

Or upload a file...

BLAST options

Program:

perc_identity:

similarity_matrix:

e-value:

qcov_hsp_perc:

max_target_seqs:

Download options

BLAST results (.tsv)

BLAST Results

Copy Gene ID

Download data by tissue or BioProject of interest.

Download by tissue
Download by BioProject

Select Tissue(s):

Quantification measure

Download options

Data summary

Download data (.tsv)

Expression matrix

Sample metadata

Number of Samples per Tissue

Choose the columns to display:

BioProject | N_samples | Tissue | Cultivar | Chemotype | PMID | DOI | Study title | Study abstract

Click the row corresponding to a BioProject to be able to download its expression matrix and sample metadata in the 'Download options' box. You can use the filter boxes above each column to filter based on a particular variable, or the global search box (above all columns) to find matches in any column.

NOTE: You can only download 1 BioProject at a time to avoid server overload.

Download options

Quantification measure

Data summary

Download data (.tsv)

Expression matrix

Sample metadata

Frequently Asked Questions

What was the pipeline used to create this atlas?

The pipeline used to create this atlas is summarized below.

What does “bias-corrected counts” mean?

RNA-seq quantification tools (including salmon, the one used here) report the number of reads assigned to each transcript, typically called raw read counts. However, raw counts are not directly comparable, because differences between them may reflect transcript length and library size rather than expression. To correct for these biases, we used the “bias correction without an offset” method implemented in the Bioconductor package tximport, which scales raw counts using the average transcript length over samples, and then library size.
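The two scaling steps described above can be illustrated in base R. This is a sketch on made-up numbers, not the atlas pipeline itself; the description appears to correspond to tximport's `countsFromAbundance = "lengthScaledTPM"` option, but that mapping is an assumption, and the sketch works at the transcript level only.

```r
# Toy inputs for 3 transcripts x 2 samples (all values are made up)
counts <- matrix(c(100, 200, 700,
                   300, 300, 400), nrow = 3,
                 dimnames = list(paste0("tx", 1:3), c("A", "B")))
lengths <- matrix(c(1000, 2000, 500,
                    1200, 1800, 500), nrow = 3,
                  dimnames = dimnames(counts))

# TPM: length-normalize counts, then scale each sample to sum to one million
rate <- counts / lengths
tpm  <- t(t(rate) / colSums(rate)) * 1e6

# Step 1: scale TPM by the average transcript length over samples
scaled <- tpm * rowMeans(lengths)

# Step 2: rescale each sample so its total matches the original library size
corrected <- t(t(scaled) * (colSums(counts) / colSums(scaled)))

colSums(corrected)  # same per-sample totals as colSums(counts)
```

After the second step each sample keeps its original total, so the corrected values stay on the scale of counts while no longer depending on per-sample differences in transcript length.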

Can I obtain transcript-level abundance estimates with this web application?

This application allows the exploration, visualization, and download of gene-level transcript abundances (i.e., “gene expression”) only. However, transcript-level abundances are available in the FigShare repository associated with this project, in se_atlas_transcript.rda, an RData file that stores a SummarizedExperiment object.

The SummarizedExperiment object stores two assays named tx_TPM and tx_counts with transcript-level abundances in TPM and read counts, respectively.

Can I obtain a single file with all expression data in the Cannabis Expression Atlas?

Yes. Quantitative data for gene- and transcript-level abundances can be found in the FigShare repository associated with this project, in RData files named se_atlas_gene.rda and se_atlas_transcript.rda, respectively. These RData files store SummarizedExperiment objects with the following assays:

se_atlas_gene.rda: assays named gene_TPM and gene_counts.
se_atlas_transcript.rda: assays named tx_TPM and tx_counts.

To load the SummarizedExperiment object into an R session and access the data, you would run the following R code:

library(SummarizedExperiment)

# Load gene-level abundance data
load("se_atlas_gene.rda")

# Access the matrix with gene expression in TPM
assay(se_atlas_gene, "gene_TPM")

# Access the matrix with gene expression in bias-corrected counts
assay(se_atlas_gene, "gene_counts")

# Access sample metadata
colData(se_atlas_gene)

For more information on how to work with SummarizedExperiment objects, check the package’s documentation.

NOTE: these files are very large, as they store matrices with more than 27000 rows (genes) and 390 columns (samples). As R stores data in memory, make sure you have enough memory if you want to work with the entire quantitative data.
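One way to limit memory use is to subset the object to the genes and samples of interest before extracting a matrix. The sketch below builds a tiny stand-in object so it can be run without the real file; the gene IDs, sample IDs, and the `Tissue` metadata column are assumptions and may not match the names in se_atlas_gene.

```r
library(SummarizedExperiment)

# Toy stand-in for se_atlas_gene (IDs and Tissue labels are made up)
tpm <- matrix(c(10, 0, 5,  20, 1, 6,  0, 3, 7,  2, 4, 8), nrow = 3,
              dimnames = list(paste0("gene", 1:3), paste0("S", 1:4)))
se <- SummarizedExperiment(
  assays  = list(gene_TPM = tpm),
  colData = DataFrame(Tissue = c("Leaf", "Leaf", "Root", "Seed"),
                      row.names = colnames(tpm))
)

# Keep only the genes (rows) and samples (columns) of interest
se_small <- se[c("gene1", "gene3"), se$Tissue == "Leaf"]

# Extract the much smaller TPM matrix
assay(se_small, "gene_TPM")
```

With the real object, checking `colnames(colData(se_atlas_gene))` first would show which metadata columns are actually available for filtering.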

How do I report a bug or issue?

You can open an issue in this GitHub repository that was specifically created as a communication channel with our users.

Explore metadata and expression of multiple genes at a time.

Download options

Download data based on the columns used as filters. If no filters are used, you can download the complete data.

Remember: for performance reasons, the plots must be refreshed after the table is filtered.

Graphical Data summary

Click on the + button to see the plot.

Gene Description

Gene_Ontology(UniProt)

TF family

KEGG pathway

Download options

Download data (.tsv)

Data summary

Gene expression heatmap

Sample Metadata

Cite us:

Barbosa-Xavier, K. et al. (2024) ‘Cannabis Expression Atlas: a comprehensive resource for integrative analysis of Cannabis sativa L. gene expression’. bioRxiv, p. 2024.09.27.615413. Available at: https://doi.org/10.1101/2024.09.27.615413.
