Title: | Interactively Visualize Genetic Mutation Data using a Lollipop-Diagram |
---|---|
Description: | Interface for 'g3-lollipop' 'JavaScript' library. Visualize genetic mutation data using an interactive lollipop diagram in 'RStudio' or your web browser. |
Authors: | Xin Guo [aut, cre], Rener Zhang [ctb], Ruining Zhu [ctb], Feng Zhao [ctb] |
Maintainer: | Xin Guo <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.2.0 |
Built: | 2025-03-09 04:27:13 UTC |
Source: | https://github.com/g3viz/g3viz |
Render g3lollipop diagram for the given mutation data
g3Lollipop( mutation.dat, gene.symbol, uniprot.id = NA, gene.symbol.col = "Hugo_Symbol", aa.pos.col = "AA_Position", protein.change.col = c("Protein_Change", "HGVSp_Short"), factor.col = "Mutation_Class", plot.options = g3Lollipop.options(), save.png.btn = TRUE, save.svg.btn = TRUE, btn.style = NA, output.filename = "output" )
Argument | Description |
---|---|
mutation.dat | Input genomic mutation data frame. |
gene.symbol | HGNC primary gene symbol. |
uniprot.id | UniProt ID, for cases where the specified gene symbol links to multiple UniProt entries (isoforms). For example, the AKAP7 gene has two isoforms in UniProt, O43687 and Q9P0M2. Default NA. |
gene.symbol.col | Column name of Hugo gene symbols (e.g., TP53). Default Hugo_Symbol. |
aa.pos.col | Column name of the parsed amino-acid change position. Default AA_Position. |
protein.change.col | Column name of protein change information (e.g., p.K960R, G658S, L14Sfs*15). Default is a vector of Protein_Change and HGVSp_Short. |
factor.col | Column of classes in the plot legend. Default Mutation_Class. |
plot.options | g3lollipop diagram options in list format. See g3Lollipop.options. Default g3Lollipop.options(). |
save.png.btn | Whether to add a save-as-PNG button to the diagram. Default TRUE. |
save.svg.btn | Whether to add a save-as-SVG button to the diagram. Default TRUE. |
btn.style | Button style: the browser default button style, or one of the two built-in styles, blue or gray. Default NA. |
output.filename | Output file name. Default "output". |
A lollipop diagram for the given mutation data. The chart is interactive within Shiny applications and Rmd documents through the bindings.
# system mutation data
maf.file <- system.file("extdata", "TCGA.BRCA.varscan.somatic.maf.gz", package = "g3viz")
# read in MAF file
mutation.dat <- readMAF(maf.file)
# use built-in chart theme
chart.options <- g3Lollipop.theme(theme.name = "default", title.text = "PIK3CA gene (default theme)")
# generate chart
g3Lollipop(mutation.dat, gene.symbol = "PIK3CA", plot.options = chart.options, btn.style = "blue", output.filename = "default_theme")
Output and render functions for using g3viz lollipop diagram within Shiny applications and interactive Rmd documents.
g3LollipopOutput(outputId, width = "100%", height = "520px")
renderG3Lollipop(expr, env = parent.frame(), quoted = FALSE)
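Argument | Description |
---|---|
outputId | Output variable to read from. |
width, height | Must be a valid CSS unit (like "100%" or "520px", the defaults). |
expr | An expression that generates a g3-lollipop. |
env | The environment in which to evaluate expr. |
quoted | Whether expr is a quoted expression. Default FALSE. |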
No value returned. It is the binding which enables interactive functions within Shiny applications and Rmd documents.
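As an illustration of how the output and render functions fit together, here is a minimal Shiny sketch. It assumes the shiny and g3viz packages are installed; the output id "lollipop" and the page title are arbitrary choices, and the data and gene are taken from the g3Lollipop example. Package functions are called with the `pkg::` prefix and the UI is written as a function, so nothing is evaluated until the app is actually launched.

```r
# Minimal Shiny sketch; assumes the 'shiny' and 'g3viz' packages are installed.

# UI written as a function of the request (a form Shiny accepts),
# so the page is only built when the app runs.
ui <- function(request) {
  shiny::fluidPage(
    shiny::titlePanel("PIK3CA mutations"),
    # placeholder for the widget; "lollipop" is an arbitrary output id
    g3viz::g3LollipopOutput("lollipop", width = "100%", height = "520px")
  )
}

server <- function(input, output, session) {
  output$lollipop <- g3viz::renderG3Lollipop({
    # example MAF file shipped with the package
    maf.file <- system.file("extdata", "TCGA.BRCA.varscan.somatic.maf.gz",
                            package = "g3viz")
    mutation.dat <- g3viz::readMAF(maf.file)
    g3viz::g3Lollipop(mutation.dat, gene.symbol = "PIK3CA")
  })
}

# launch only in an interactive session
if (interactive()) {
  shiny::shinyApp(ui = ui, server = server)
}
```

The id passed to g3LollipopOutput must match the name assigned on `output` in the server function.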
G3Lollipop plot options
g3Lollipop.options( chart.width = 800, chart.type = "circle", chart.margin = list(left = 40, right = 20, top = 15, bottom = 25), chart.background = "transparent", transition.time = 600, y.axis.label = "# of mutations", axis.label.font = "normal 12px Arial", axis.label.color = "#4f4f4f", axis.label.alignment = "middle", axis.label.dy = "-2em", y.axis.line.color = "#c4c8ca", y.axis.line.style = "dash", y.axis.line.width = 1, y.max.range.ratio = 1.1, legend.margin = list(left = 10, right = 0, top = 5, bottom = 5), legend.interactive = TRUE, legend.title = NA, lollipop.track.height = 420, lollipop.track.background = "rgb(233,233,233)", lollipop.pop.min.size = 2, lollipop.pop.max.size = 12, lollipop.pop.info.limit = 8, lollipop.pop.info.color = "#EEE", lollipop.pop.info.dy = "0.35em", lollipop.line.color = "rgb(42,42,42)", lollipop.line.width = 0.5, lollipop.circle.color = "wheat", lollipop.circle.width = 0.5, lollipop.label.ratio = 1.4, lollipop.label.min.font.size = 10, lollipop.color.scheme = "accent", highlight.text.angle = "90", title.text = "", title.font = "normal 16px Arial", title.color = "#424242", title.alignment = "middle", title.dy = "0.35em", anno.height = 30, anno.margin = list(top = 4, bottom = 0), anno.background = "transparent", anno.bar.fill = "#e5e3e1", anno.bar.margin = list(top = 2, bottom = 2), domain.color.scheme = "category10", domain.margin = list(top = 0, bottom = 0), domain.text.font = "normal 11px Arial", domain.text.color = "#f2f2f2", brush = TRUE, brush.selection.background = "#666", brush.selection.opacity = 0.2, brush.border.color = "#969696", brush.handler.color = "#333", brush.border.width = 1, legend = TRUE, tooltip = TRUE, zoom = TRUE )
Argument | Description |
---|---|
chart.width | Chart width. Default 800. |
chart.type | pie or circle. Default circle. |
chart.margin | Chart margin in list format. Default list(left = 40, right = 20, top = 15, bottom = 25). |
chart.background | Chart background. Default transparent. |
transition.time | Animation transition time, in milliseconds, when clicking lollipop pops to show labels. Default 600. |
y.axis.label | Y-axis label text. Default "# of mutations". |
axis.label.font | CSS font shorthand (font-style font-variant font-weight font-size/line-height font-family). Default "normal 12px Arial". |
axis.label.color | Axis label text color. Default #4f4f4f. |
axis.label.alignment | Axis label text alignment (start/end/middle). Default middle. |
axis.label.dy | Text adjustment of axis label text. Default -2em. |
y.axis.line.color | Color of y-axis in-chart lines (ticks). Default #c4c8ca. |
y.axis.line.style | Style of y-axis in-chart lines (ticks), "dash" or "line". Default dash. |
y.axis.line.width | Width of y-axis in-chart lines (ticks). Default 1. |
y.max.range.ratio | Ratio of y-axis range to data value range. Default 1.1. |
legend.margin | Legend margin in list format. Default list(left = 10, right = 0, top = 5, bottom = 5). |
legend.interactive | Legend interactive mode. Default TRUE. |
legend.title | Legend title. Default NA. |
lollipop.track.height | Height of lollipop track. Default 420. |
lollipop.track.background | Background of lollipop track. Default rgb(233,233,233). |
lollipop.pop.min.size | Lollipop pop minimal size. Default 2. |
lollipop.pop.max.size | Lollipop pop maximal size. Default 12. |
lollipop.pop.info.limit | Threshold of lollipop pop size above which count information is shown in the middle of the pop. Default 8. |
lollipop.pop.info.color | Lollipop pop information text color. Default #EEE. |
lollipop.pop.info.dy | Y-axis direction text adjustment of lollipop pop information. Default 0.35em. |
lollipop.line.color | Lollipop line color. Default rgb(42,42,42). |
lollipop.line.width | Lollipop line width. Default 0.5. |
lollipop.circle.color | Lollipop circle border color. Default wheat. |
lollipop.circle.width | Lollipop circle border width. Default 0.5. |
lollipop.label.ratio | Ratio of lollipop click-out label font size to circle size. Default 1.4. |
lollipop.label.min.font.size | Lollipop click-out label minimal font size. Default 10. |
lollipop.color.scheme | Color scheme to fill lollipop pops. Default accent. |
highlight.text.angle | Pop-on-click highlight text angle. Default 90. |
title.text | Title of chart. Default is empty. |
title.font | Font of chart title. Default normal 16px Arial. |
title.color | Color of chart title. Default #424242. |
title.alignment | Text alignment of chart title (start/middle/end). Default middle. |
title.dy | Text adjustment of chart title. Default 0.35em. |
anno.height | Height of protein structure annotation track. Default 30. |
anno.margin | Margin of protein structure annotation track. Default list(top = 4, bottom = 0). |
anno.background | Background of protein structure annotation track. Default transparent. |
anno.bar.fill | Background of protein bar in protein structure annotation track. Default #e5e3e1. |
anno.bar.margin | Margin of protein bar in protein structure annotation track. Default list(top = 2, bottom = 2). |
domain.color.scheme | Color scheme of protein domains. Default category10. |
domain.margin | Margin of protein domains. Default list(top = 0, bottom = 0). |
domain.text.font | Domain label text font in shorthand format. Default normal 11px Arial. |
domain.text.color | Domain label text color. Default #f2f2f2. |
brush | Whether to show the brush. Default TRUE. |
brush.selection.background | Background color of selection brush. Default #666. |
brush.selection.opacity | Background opacity of selection brush. Default 0.2. |
brush.border.color | Border color of selection brush. Default #969696. |
brush.handler.color | Color of left and right handlers of selection brush. Default #333. |
brush.border.width | Border width of selection brush. Default 1. |
legend | Whether to show the legend. Default TRUE. |
tooltip | Whether to show tooltips. Default TRUE. |
zoom | Whether to enable the zoom feature. Default TRUE. |
a list with g3Lollipop plot options
G3Lollipop chart options of built-in themes.
g3Lollipop.theme( theme.name = "default", title.text = "", y.axis.label = "# of mutations", legend.title = NA )
Argument | Description |
---|---|
theme.name | Theme name: one of default, cbioportal, nature, nature2, dark, blue, ggplot2, and simple. Default default. |
title.text | Title of chart. Default is empty. |
y.axis.label | Y-axis label text. Default "# of mutations". |
legend.title | Legend title. Default NA. |
a list with g3Lollipop plot options
Retrieve and parse mutation data from cBioPortal by the given cBioPortal cancer study ID and the gene symbol.
getMutationsFromCbioportal( study.id, gene.symbol, output.file = NA, mutation.type.to.class.df = NA )
Argument | Description |
---|---|
study.id | cBioPortal study ID. |
gene.symbol | HGNC gene symbol. |
output.file | If specified, output to a file in CSV format. Default is NA. |
mutation.type.to.class.df | Mapping table from mutation type to class. See mapMutationTypeToMutationClass. Default NA. |
a data frame with columns
Hugo gene symbol
Protein change information (cBioportal uses HGVSp format)
Sample ID
mutation type, aka, variant classification.
chromosome
start position
end position
reference allele
variant allele
mutation class (e.g., Truncating/Missense/Inframe/Other)
amino-acid position of the variant; if the variant is not in a protein-coding region, NA
# Usage:
Guess column name for MAF file
guessMAFColumnName(maf.df, alt.column.names)
Argument | Description |
---|---|
maf.df | MAF data frame. |
alt.column.names | A vector of alternative column names. |
If one of the alternative column names is found in the MAF data frame, that name is returned; otherwise, NA.
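The lookup described above can be sketched in base R. The helper name `first.matching.column` is hypothetical and this is not the package's source; it assumes that candidates are tried in the order given, which is consistent with the "first match" wording used for the readMAF defaults.

```r
# Hypothetical helper illustrating the documented behavior; not the package source.
# Returns the first candidate name present among the data frame's columns, else NA.
first.matching.column <- function(df, alt.column.names) {
  hits <- alt.column.names[alt.column.names %in% colnames(df)]
  if (length(hits) > 0) hits[1] else NA
}

# Toy MAF-like data frame with only two columns
maf.df <- data.frame(Hugo_Symbol = "TP53", HGVSp_Short = "p.R175H",
                     stringsAsFactors = FALSE)

# One of the candidates exists, so its name is returned
protein.col <- first.matching.column(maf.df, c("Protein_Change", "HGVSp_Short"))

# Neither candidate exists, so NA is returned
variant.col <- first.matching.column(maf.df, c("Variant_Classification", "Mutation_Type"))
```

Here `protein.col` is "HGVSp_Short" and `variant.col` is NA.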
Mapping from Hugo symbol to Pfam-A domain composition. If the given Hugo symbol has multiple UniProt ID mappings and guess == TRUE, the longest UniProt protein is selected. Returns either a list or a JSON.
hgnc2pfam(hgnc.symbol, guess = TRUE, uniprot.id = NA, output.format = "json")
Argument | Description |
---|---|
hgnc.symbol | Primary Hugo symbol. |
guess | If the given Hugo symbol links to multiple UniProt IDs, choose the longest one (guess == TRUE). Default TRUE. |
uniprot.id | UniProt ID, for cases where the gene symbol maps to multiple UniProt entries. Default NA. |
output.format | Output format: "json" or "list". Default "json". |
A list or a JSON with attributes: symbol, uniprot, length, and a list of Pfam entries, including hmm.acc, hmm.name, start, end, and type.
# general usage
hgnc2pfam("TP53")
hgnc2pfam("TP53", output.format = "json")
hgnc2pfam("TP53", output.format = "list")
hgnc2pfam("TP53", output.format = "json", uniprot.id = "P04637") # OK
# for gene mapping to multiple UniProt entries
hgnc2pfam("GNAS", guess = TRUE)
hgnc2pfam("GNAS", guess = FALSE)
hgnc2pfam("GNAS", output.format = "list")
hgnc2pfam("GNAS", output.format = "list", uniprot.id = "P84996")
## Not run:
hgnc2pfam("GNAS", output.format = "list", uniprot.id = "P84997") # returns FALSE
## End(Not run)
hgnc2pfam("PRAMEF9", output.format = "list") # no Pfam mappings
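The "choose the longest UniProt protein" rule applied when guess is TRUE can be illustrated with a small base-R sketch. The helper `pick.longest` and the toy table are hypothetical: the symbols, IDs, and lengths are invented, and only the column names (symbol, uniprot, length) follow the layout documented for the package's mapping functions. This is not the package's implementation.

```r
# Toy mapping table; symbols, IDs and lengths are invented for illustration only.
mapping.df <- data.frame(
  symbol  = c("GENE1", "GENE1", "GENE2"),
  uniprot = c("X00001", "X00002", "X00003"),
  length  = c(394, 1037, 393),
  stringsAsFactors = FALSE
)

# Hypothetical helper: among the entries for a symbol, return the UniProt ID
# of the longest protein; NA if the symbol is not in the table.
pick.longest <- function(df, hgnc.symbol) {
  hits <- df[df$symbol == hgnc.symbol, , drop = FALSE]
  if (nrow(hits) == 0) {
    return(NA_character_)
  }
  hits$uniprot[which.max(hits$length)]
}

longest.id <- pick.longest(mapping.df, "GENE1")
```

With this toy table, `longest.id` is "X00002", the entry with length 1037.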
A dataset containing the mapping table between Hugo symbol, UniProt ID, and Pfam ACC.
hgnc2pfam.df
A data frame with columns:
Gene symbol
UniProt ID
protein length
starting position of Pfam domain
ending position of Pfam domain
Pfam accession number
Pfam name
Pfam type, i.e., domain/family/motif/repeat/disordered/coiled-coil
Pfam (v31.0) and UniProt
hgnc2pfam.df
Mapping from Hugo Symbol to UniProt ID using internal mapping table. Return a data frame with columns symbol (Hugo symbol), uniprot (UniProt ID), and length (protein length).
hgnc2uniprot(hgnc.symbol)
Argument | Description |
---|---|
hgnc.symbol | Primary HUGO symbol. |
a data frame with columns symbol (Hugo symbol), uniprot (UniProt ID), and length (protein length).
# maps to single UniProt entry
hgnc2uniprot("TP53")
# maps to multiple UniProt entries
hgnc2uniprot("GNAS")
hgnc2uniprot("AKAP7")
Map from mutation type (aka, variant classification) to mutation class. Default mappings are as follows,
Missense
Missense_Mutation — a point mutation in which a single nucleotide change results in a codon that codes for a different amino acid See https://en.wikipedia.org/wiki/Missense_mutation.
Inframe
In_Frame_Del — a deletion that keeps the sequence in frame
In_Frame_Ins — an insertion that keeps the sequence in frame
Silent — variant is in coding region of the chosen transcript, but protein structure is identical (i.e., a synonymous mutation)
Targeted_Region — targeted region
Truncating
Frame_Shift — a variant caused by indels of a number of nucleotides in a DNA sequence that is not divisible by three. See https://en.wikipedia.org/wiki/Frameshift_mutation.
Frame_Shift_Ins — a variant caused by insertion that moves the coding sequence out of frame. See https://en.wikipedia.org/wiki/Frameshift_mutation.
Frame_Shift_Del — a variant caused by deletion that moves the coding sequence out of frame. See https://en.wikipedia.org/wiki/Frameshift_mutation.
Nonsense_Mutation — a premature stop codon that is created by the variant. See https://en.wikipedia.org/wiki/Nonsense_mutation.
Nonstop_Mutation — a variant that removes stop codon.
Splice_Site — a variant that is within two bases of a splice site.
Splice_Region — a variant that is within splice region.
Other
5'UTR — a variant that is on the 5'UTR for the chosen transcript.
3'UTR — a variant that is on the 3'UTR for the chosen transcript.
5'Flank — a variant that is upstream of the chosen transcript (generally within 3kb).
3'Flank — a variant that is downstream of the chosen transcript (generally within 3kb).
Fusion — a gene fusion.
IGR — an intergenic region. Does not overlap any transcript.
Intron — a variant that lies between exons within the bounds of the chosen transcript.
Translation_Start_Site — a variant that is in translation start site.
De_novo_Start_InFrame — a novel start codon that is created by the given variant using the chosen transcript. However, it is in frame relative to the coded protein.
De_novo_Start_OutOfFrame — a novel start codon that is created by the given variant using the chosen transcript. However, it is out of frame relative to the coded protein.
Start_Codon_SNP — a point mutation that overlaps the start codon.
Start_Codon_Ins — an insertion that overlaps the start codon.
Start_Codon_Del — a deletion that overlaps the start codon.
RNA — a variant that lies on one of the RNA transcripts.
lincRNA — a variant that lies on one of the lincRNAs.
Unknown — Unknown
mapMutationTypeToMutationClass( mutation.type.vec, mutation.type.to.class.df = NA )
Argument | Description |
---|---|
mutation.type.vec | A vector of mutation type information. |
mutation.type.to.class.df | A mapping table from mutation type (header Mutation_Type) to mutation class (header Mutation_Class). Default NA. |
a vector of mapped mutation class information
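To make the shape of a custom mapping table concrete, the sketch below builds a data frame with the two documented headers, Mutation_Type and Mutation_Class, and applies it with a plain base-R lookup. The helper `lookup.class` is hypothetical and is not the package's implementation; in particular, sending unmapped types to "Other" is an assumption made for this example.

```r
# Custom mapping table using the two documented headers
custom.map.df <- data.frame(
  Mutation_Type  = c("Missense_Mutation", "Nonsense_Mutation", "In_Frame_Del"),
  Mutation_Class = c("Missense", "Truncating", "Inframe"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: map each type to its class via match();
# types missing from the table get the fallback value (an assumption).
lookup.class <- function(mutation.type.vec, map.df, fallback = "Other") {
  idx <- match(mutation.type.vec, map.df$Mutation_Type)
  out <- map.df$Mutation_Class[idx]
  out[is.na(out)] <- fallback
  out
}

mapped <- lookup.class(c("Nonsense_Mutation", "Missense_Mutation", "IGR"),
                       custom.map.df)
```

Here `mapped` is c("Truncating", "Missense", "Other"). A data frame of this shape is what the mutation.type.to.class.df argument describes; whether the package expects further columns is not stated here.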
A dataset containing the mapping table between genomic mutation type (aka, variant classification) and mutation class. See mapMutationTypeToMutationClass for details.
mutation.table.df
A data frame with three columns:
Mutation type, aka, variant classification
mutation class
short name of mutation type
mutation.table.df
Parse amino_acid_position according to HGVSp_short format.
For example, p.Q16Rfs*28, amino-acid position is 16.
See http://varnomen.hgvs.org/recommendations/protein/ or https://www.hgvs.org/mutnomen/recs-prot.html.
parseProteinChange(protein.change.vec, mutation.class.vec)
Argument | Description |
---|---|
protein.change.vec | A vector of strings with protein change information, usually in HGVSp_short format. |
mutation.class.vec | A vector of strings with mutation class (or so-called variant classification) information. |
a vector of parsed amino-acid position
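The position extraction can be approximated in base R by taking the first run of digits in each protein change string. The helper `extract.aa.position` is a hypothetical simplification, not the package's parser: it ignores the mutation.class.vec argument and does not implement the full HGVS grammar.

```r
# Hypothetical helper: take the first run of digits as the amino-acid position.
# Strings without any digits yield NA.
extract.aa.position <- function(protein.change.vec) {
  m <- regexpr("[0-9]+", protein.change.vec)
  pos <- rep(NA_integer_, length(protein.change.vec))
  found <- !is.na(m) & m > 0
  # regmatches() returns only the matched substrings, in input order
  pos[found] <- as.integer(regmatches(protein.change.vec, m))
  pos
}

positions <- extract.aa.position(
  c("p.Q16Rfs*28", "p.K960R", "G658S", "L14Sfs*15", "p.?")
)
```

For these inputs `positions` is 16, 960, 658, 14, NA, which matches the p.Q16Rfs*28 example above, where the position is 16 and not 28.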
Read mutation information from MAF file. For MAF format specification, see https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/.
readMAF( maf.file, gene.symbol.col = "Hugo_Symbol", variant.class.col = c("Variant_Classification", "Mutation_Type"), protein.change.col = c("Protein_Change", "HGVSp_Short"), if.parse.aa.pos = TRUE, if.parse.mutation.class = TRUE, mutation.class.col = "Mutation_Class", aa.pos.col = "AA_Position", mutation.type.to.class.df = NA, sep = "\t", quote = "", ... )
Argument | Description |
---|---|
maf.file | MAF file name. Gzipped input files are allowed, with the ".gz" file extension. |
gene.symbol.col | Column name of Hugo gene symbols (e.g., TP53). Default Hugo_Symbol. |
variant.class.col | Column name for variant class information (e.g., Missense_Mutation, Nonsense_Mutation). Default is the first match of Variant_Classification or Mutation_Type. |
protein.change.col | Column name for protein change information (e.g., p.K960R, G658S, L14Sfs*15). Default is the first match of Protein_Change or HGVSp_Short. |
if.parse.aa.pos | Whether to parse the amino-acid position of mutations. Default is TRUE. |
if.parse.mutation.class | Whether to parse the mutation class from mutation type (variant classification) information. Default is TRUE. |
mutation.class.col | Column name of the parsed mutation class. Default Mutation_Class. |
aa.pos.col | Column name of the parsed amino-acid change position. Default AA_Position. |
mutation.type.to.class.df | Mapping table from mutation type to class. Default NA. |
sep | Separator of columns. Default "\t". |
quote | The set of quoting characters. To disable quoting altogether, use quote = "" (the default). |
... | Additional parameters passed on to the underlying file-reading call. |
A data frame containing MAF information, plus optional columns for the parsed mutation class and amino-acid position (named by mutation.class.col and aa.pos.col; Mutation_Class and AA_Position by default).
Map from UniProt ID to Pfam-A domain composition.
uniprot2pfam(uniprot.id)
Argument | Description |
---|---|
uniprot.id | UniProt ID. |
a data frame with columns
uniprot — UniProt ID
length — protein length
hmm.acc — accession number of Pfam HMM model, e.g., PF08563
hmm.name — Pfam name, e.g., P53_TAD
start — Pfam domain start position
end — Pfam domain end position
type — Pfam type, including domain/motif/family
uniprot2pfam("Q5VWM5") # PRAMEF9; PRAMEF15
uniprot2pfam("P04637")