Eight additional genomes have been assembled to varying levels of completeness (Table 1). Among these are the genome sequences of a father-mother-daughter trio from the Jamaican Lion (JL) cultivar, which were sequenced using PacBio (McKernan et al., 2020). The parental genome assemblies, including gene annotations, are available on the NCBI database, while all three genome assemblies are available on the Medicinal Genomics website (https://www.medicinalgenomics.com/jamaican-lion-data-release/). In addition to these three genomes, 40 genomes from a diverse range of cultivars were sequenced with Illumina short-read sequencing as part of the Medicinal Genomics ‘Cannabis Pan-Genome Project’ (McKernan et al., 2020). The whole-genome sequencing (WGS) data generated in this project are available on the NCBI Sequence Read Archive (Supplementary Table 2). These genome sequences will be an invaluable resource for characterising the genetic basis of the wide phenotypic diversity observed within Cannabis. Specifically, they will facilitate the development of a Cannabis pan-genome, in which gene sets unique to specific cultivars can be defined. Such cultivar-specific genes often reflect niche phenotypic adaptations that have evolved in response to specific environmental conditions (Montenegro et al., 2017; Tao et al., 2019). Cultivar-specific genes could therefore be key targets for breeding new cultivars with desirable traits for specific production purposes (Tao et al., 2019).
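As a minimal illustration of how cultivar-specific gene sets might be defined once a pan-genome is available, the sketch below partitions genes into core (present in every cultivar), dispensable (present in some) and cultivar-specific (present in exactly one) categories. The cultivar names, gene identifiers and the function `partition_pangenome` are hypothetical and chosen purely for illustration; they are not taken from any of the cited studies, and a real analysis would also need to account for differences in assembly completeness and annotation quality.

```python
from collections import defaultdict


def partition_pangenome(presence):
    """Split genes into core, dispensable and cultivar-specific sets.

    presence: dict mapping a cultivar name to the set of gene IDs
    annotated in that cultivar's genome (hypothetical input format).
    """
    cultivars = list(presence)

    # Record which cultivars carry each gene.
    carriers = defaultdict(set)
    for cultivar, genes in presence.items():
        for gene in genes:
            carriers[gene].add(cultivar)

    core = set()
    dispensable = set()
    specific = {cultivar: set() for cultivar in cultivars}

    for gene, found_in in carriers.items():
        if len(found_in) == len(cultivars):
            core.add(gene)                 # present in every cultivar
        elif len(found_in) == 1:
            (only,) = found_in
            specific[only].add(gene)       # unique to a single cultivar
        else:
            dispensable.add(gene)          # shared by some, not all

    return core, dispensable, specific


# Toy presence/absence data (invented, not real Cannabis genes).
toy = {
    "cultivar_A": {"g1", "g2", "g3"},
    "cultivar_B": {"g1", "g2", "g4"},
    "cultivar_C": {"g1", "g5"},
}

core, dispensable, specific = partition_pangenome(toy)
print("core:", sorted(core))
print("dispensable:", sorted(dispensable))
for cultivar, genes in specific.items():
    print(cultivar, "specific:", sorted(genes))
```

In this toy example, the genes unique to a single cultivar are the candidates that the text describes as potential breeding targets.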
A wealth of additional genomics data is also available. These data include organellar genome sequences, with seven mitochondrial and nine chloroplast genome assemblies available (Supplementary Table 3). Organellar genomes are particularly useful for resolving phylogenetic relationships. The rate of nucleotide substitution in mitochondrial coding sequences is lower than that in the nuclear and plastid genomes, making them useful molecular markers for resolving deep taxonomic relationships (Knoop, 2004; Wolfe et al., 1987). Despite this high intragenic sequence conservation, angiosperm mitochondria can exhibit high variation in genome organisation both within and between species (Cole et al., 2018; Davila et al., 2011; Palmer and Herbon, 1988). A comparative genomics approach that investigates organisational variation in the mitochondrial genome between different Cannabis cultivars could therefore help to resolve relationships within the Cannabis genus. In contrast, the chloroplast genome is characterised by both stable genome organisation and sequence conservation between species (Palmer and Herbon, 1988). Hence, the chloroplast genome is often used to resolve phylogenies at the ordinal and familial taxonomic levels (Oh et al., 2016; Vergara et al., 2015; H. Zhang et al., 2018).
Furthermore, genotyping by sequencing (GBS), amplicon sequencing, bisulfite sequencing and Hi-C data are available for many hemp and marijuana varieties (Supplementary Table 2). GBS is an efficient and cost-effective method to genotype a large number of samples, providing insight into the population structure and genetic diversity within a species (He et al., 2014). At least three population-based studies have generated GBS data for ~400 samples, representing both hemp and marijuana lines (Lynch et al., 2016; Sawler et al., 2015; Soorni et al., 2017). These studies find that hemp and marijuana often form distinct populations that differ not only at the BT and BD loci but at a genome-wide level (Lynch et al., 2016; Sawler et al., 2015; Soorni et al., 2017). Bisulfite sequencing detects DNA methylation and is useful for understanding epigenetic gene regulation (Elhamamsy, 2016; Li et al., 2020). Two bisulfite sequencing datasets are available for analysis (McKernan et al., 2020; Niederhuth et al., 2016). Given that economically important traits such as sex expression and flowering time are under strong environmental control, it will be interesting to explore to what extent those traits are epigenetically regulated. This may open the possibility of breeding ‘climate smart’ Cannabis plants, as in other crops, where epigenetically regulated adaptation to heat, drought or cold is being explored for crop improvement (Varotto et al., 2020).
Lastly, the 3D organisation of the genome within the nucleus can be mapped with Hi-C data (Rodriguez-Granados et al., 2016). One Hi-C dataset exists for the JL cultivar and is available on NCBI (Gao et al., 2020). Additional Hi-C datasets are available for the Jamaican Lion genomes through the Medicinal Genomics website (https://www.medicinalgenomics.com/jamaican-lion-data-release/). The 3D organisation of the genome and its implications for gene regulation are currently under intensive investigation in plants (Santos et al., 2020). The available Cannabis Hi-C data are useful both for facilitating genome assembly and for understanding the epigenetic regulation of gene expression (Burton et al., 2013; Lieberman-Aiden et al., 2009; Xie et al., 2015).
Many studies have also focused on characterising the Cannabis transcriptome (Supplementary Table 2). Perhaps most notably, in 2019, an extensive ‘transcriptome atlas’ was generated for Cannabis (Braich et al., 2019). This study involved RNA-sequencing of 71 samples taken from multiple tissues of the Cannbio-2 cultivar (CN2) at various developmental stages. These transcriptome data will be useful for annotating new genome assemblies, as well as for inferring gene functions from spatiotemporal gene expression patterns. Other studies have characterised the transcriptome of hemp lines grown under salinity and drought stress (Gao et al., 2018; Liu et al., 2016), as well as during bast fibre development (Behr et al., 2016; Guerriero et al., 2017). Three further studies have sequenced the transcriptome of glandular trichomes, with the aim of profiling the expression of genes involved in terpene and phytocannabinoid biosynthesis (Booth et al., 2020; Livingston et al., 2020; Zager et al., 2019). Furthermore, two recent studies have sought to identify the sex chromosomes by characterising the expression of sex-linked genes in male and female plants (McKernan et al., 2020; Prentout et al., 2020). The transcriptomes of the PK and FN cultivars sequenced in 2011 are also available (van Bakel et al., 2011).
While the widespread illegalisation of Cannabis has stunted genomics research in the past, the field has clearly made major advances in recent years. With chromosome-level genome assemblies, genome-wide annotations and abundant transcriptome data now available, the resources for future research are plentiful.

8. More than the sum of its parts: Medical applications of phytocannabinoids

Cannabis plants are a rich source of biologically active compounds, including more than 100 plant-derived cannabinoids (phytocannabinoids) and more than 200 terpenoids (Russo, 2011). Thus far, research into the medicinal effects of Cannabis has largely focussed on phytocannabinoids. Among these, the best studied are the psychoactive THC and the non-psychoactive CBD, though other phytocannabinoids such as CBG and CBC also show therapeutic potential (Russo, 2011) (see chapter 3 for details on phytocannabinoid synthesis and genetics).