Gene set analysis workflow with gVenn: identifying overlaps and shared genes
Supplementary data S2
Last updated: 11 February 2026
1 Overview
This supplementary material demonstrates a complete gene set analysis workflow using the gVenn R package. The workflow includes computing overlaps between gene lists, visualizing results with Venn diagrams and UpSet plots, extracting overlap groups, and exporting results for downstream analysis.
2 Installation
The gVenn package can be installed from Bioconductor or GitHub:
# From Bioconductor
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("gVenn")
# From GitHub (development version)
# install.packages("pak")
# pak::pak("ckntav/gVenn")
3 Load required packages
library(gVenn)
library(knitr)
3.1 Checking gVenn version
packageVersion("gVenn")
[1] '1.1.1'
4 Example dataset
The gVenn package includes a synthetic gene list dataset (gene_list) comprising three sets of human gene symbols with designed overlaps. This dataset was generated from the first 250 gene symbols in the org.Hs.eg.db package using a reproducible random seed.
# Load the example gene list dataset
data(gene_list)
# Display the structure of the dataset
str(gene_list)
List of 3
$ random_genes_A: chr [1:125] "ALPP" "ACTG1P9" "AHSG" "ASIC2" ...
$ random_genes_B: chr [1:115] "AFM" "ADPRH" "AIF1" "ACVR2A" ...
$ random_genes_C: chr [1:70] "ACTN1" "ALDOA" "CRYBG1" "AK4" ...
The dataset contains three gene lists with the following sizes:
# Create a summary table
summary_df <- data.frame(
  "gene list" = names(gene_list),
  "number of genes" = sapply(gene_list, length),
  check.names = FALSE
)
kable(summary_df,
      caption = "Summary of example gene lists")
| | gene list | number of genes |
|---|---|---|
| random_genes_A | random_genes_A | 125 |
| random_genes_B | random_genes_B | 115 |
| random_genes_C | random_genes_C | 70 |
5 Computing overlaps
The computeOverlaps() function analyzes the intersection patterns across all gene lists and returns a structured object containing:
- A vector of all unique elements across sets
- A logical matrix indicating set membership for each element
- Category labels encoding the overlap pattern (e.g., “110”, “101”)
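To make these three components concrete, here is a minimal base-R sketch of how such an object could be derived from a named list. It is an illustration only, not gVenn's actual implementation; the toy sets and the variable names (toy_sets, all_elements, membership, category) are assumptions made for this example.

```r
# Toy input: a named list of character vectors (hypothetical gene symbols)
toy_sets <- list(
  set_A = c("G1", "G2", "G3", "G4"),
  set_B = c("G3", "G4", "G5"),
  set_C = c("G4", "G6")
)

# 1. All unique elements across sets, in order of first appearance
all_elements <- unique(unlist(toy_sets, use.names = FALSE))

# 2. Logical membership matrix: one row per element, one column per set
membership <- vapply(toy_sets, function(s) all_elements %in% s,
                     logical(length(all_elements)))
rownames(membership) <- all_elements

# 3. Category code: paste the 1/0 flags of each row together, e.g. "110"
category <- apply(membership, 1,
                  function(r) paste(as.integer(r), collapse = ""))

print(membership)
print(category)
```

In this toy case G3 belongs to set_A and set_B only, so it receives the code "110", while G4 belongs to all three sets and receives "111".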
# Compute overlaps between gene lists
gene_overlaps <- computeOverlaps(gene_list)
# Display the structure of the result
class(gene_overlaps)
[1] "SetOverlapResult"
names(gene_overlaps)
[1] "unique_elements" "overlap_matrix" "intersect_category"
5.1 Overlap matrix
The overlap matrix shows which genes belong to which sets:
# Display the first 10 rows of the overlap matrix
kable(head(gene_overlaps$overlap_matrix, 10),
      caption = "First 10 rows of the overlap matrix (TRUE = gene present in set)")
| | random_genes_A | random_genes_B | random_genes_C |
|---|---|---|---|
| ALPP | TRUE | FALSE | FALSE |
| ACTG1P9 | TRUE | FALSE | FALSE |
| AHSG | TRUE | FALSE | FALSE |
| ASIC2 | TRUE | FALSE | FALSE |
| ACTG1P10 | TRUE | FALSE | FALSE |
| ALAS1 | TRUE | FALSE | FALSE |
| AKT2 | TRUE | FALSE | FALSE |
| PARP1P1 | TRUE | FALSE | FALSE |
| ABCD1 | TRUE | FALSE | FALSE |
| SLC25A6 | TRUE | FALSE | FALSE |
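A logical membership matrix of this shape can be summarized directly with base R. The sketch below uses a small hypothetical matrix (the gene and set names are placeholders, not values from gene_overlaps) to show how column and row sums give set sizes and the number of sets containing each gene.

```r
# Toy logical membership matrix (hypothetical genes and sets)
m <- matrix(
  c(TRUE,  FALSE, FALSE,
    TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  TRUE,
    FALSE, TRUE,  FALSE),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("G1", "G2", "G3", "G4"),
                  c("set_A", "set_B", "set_C"))
)

# TRUE counts as 1, so sums over a logical matrix are membership counts
set_sizes <- colSums(m)       # number of genes in each set
sets_per_gene <- rowSums(m)   # number of sets containing each gene

# Genes found in at least two sets
shared <- rownames(m)[sets_per_gene >= 2]

print(set_sizes)
print(shared)
```

The same calls should work on any logical matrix with genes as rows and sets as columns.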
5.2 Overlap categories
Each gene is assigned a category code representing its overlap pattern:
# Show the distribution of overlap categories
category_table <- table(gene_overlaps$intersect_category)
category_df <- data.frame(
  "Overlap pattern" = names(category_table),
  "Number of genes" = as.integer(category_table),
  check.names = FALSE
)
kable(category_df,
      caption = "Distribution of overlap patterns")
| Overlap pattern | Number of genes |
|---|---|
| 001 | 17 |
| 010 | 45 |
| 011 | 16 |
| 100 | 67 |
| 101 | 4 |
| 110 | 21 |
| 111 | 33 |
Interpretation of overlap patterns:
- 100: Genes only in set A (random_genes_A)
- 010: Genes only in set B (random_genes_B)
- 001: Genes only in set C (random_genes_C)
- 110: Genes in A ∩ B (not in C)
- 101: Genes in A ∩ C (not in B)
- 011: Genes in B ∩ C (not in A)
- 111: Genes in A ∩ B ∩ C (all three sets)
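As a consistency check, the pattern counts reported in the table above can be summed back into the original set sizes: every pattern with a 1 in the first position contributes to set A, and likewise for B and C. The sketch below copies the counts from that table; the helper in_set() is a hypothetical name introduced for this example.

```r
# Counts per overlap pattern, copied from the distribution table
counts <- c("001" = 17, "010" = 45, "011" = 16, "100" = 67,
            "101" = 4, "110" = 21, "111" = 33)

# Positions 1, 2 and 3 of each code flag membership in sets A, B and C
in_set <- function(position) {
  substr(names(counts), position, position) == "1"
}

size_A <- sum(counts[in_set(1)])  # 67 + 4 + 21 + 33 = 125
size_B <- sum(counts[in_set(2)])  # 45 + 16 + 21 + 33 = 115
size_C <- sum(counts[in_set(3)])  # 17 + 16 + 4 + 33 = 70
total_unique <- sum(counts)       # 203 distinct genes across all sets

print(c(A = size_A, B = size_B, C = size_C, unique = total_unique))
```

The recovered sizes (125, 115 and 70) match the lengths of random_genes_A, random_genes_B and random_genes_C shown in the dataset summary.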
6 Visualization
6.1 Venn diagram
The plotVenn() function creates area-proportional Venn diagrams to visualize the overlaps:
# Create a basic Venn diagram
plotVenn(gene_overlaps)
6.1.1 Customized Venn diagram
The appearance can be customized by adjusting colors, transparency, labels, and other parameters:
# Create a customized Venn diagram
plotVenn(gene_overlaps,
         fills = list(fill = c("#2B70AB", "#FFB027", "#3EA742"), alpha = 0.6),
         edges = list(col = "gray30", lwd = 1.5),
         labels = list(col = "black", fontsize = 12, font = 2),
         quantities = list(type = c("counts", "percent"),
                           col = "black", fontsize = 10),
         main = list(label = "gene set overlaps",
                     fontsize = 14, font = 2, col = "navy"),
         legend = list(side = "right", fontsize = 10))
6.2 UpSet plot
UpSet plots provide a clearer alternative to Venn diagrams, particularly when more than three sets are compared:
# Create an UpSet plot
plotUpSet(gene_overlaps)
6.2.1 Customized UpSet plot
The UpSet plot can also be customized with colors:
# Create a customized UpSet plot with colored dots
plotUpSet(gene_overlaps,
          comb_col = c("#2B70AB", "#FFB027", "#3EA742", "#CD3301",
                       "#9370DB", "#008B8B", "#D87093"))
7 Extracting overlap groups
The extractOverlaps() function separates genes into distinct groups based on their overlap patterns:
# Extract genes grouped by overlap pattern
gene_groups <- extractOverlaps(gene_overlaps)
# Display the number of genes per group
group_sizes <- sapply(gene_groups, length)
group_df <- data.frame(
  "Overlap group" = names(group_sizes),
  "Number of genes" = as.integer(group_sizes),
  check.names = FALSE
)
kable(group_df,
      caption = "Number of genes in each overlap group")
| Overlap group | Number of genes |
|---|---|
| group_001 | 17 |
| group_010 | 45 |
| group_100 | 67 |
| group_011 | 16 |
| group_101 | 4 |
| group_110 | 21 |
| group_111 | 33 |
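Conceptually, this grouping amounts to splitting the unique elements by their category code. The base-R sketch below illustrates the idea on hypothetical values; it is not the package's implementation, and the group order it produces (sorted by code) may differ from the order gVenn returns.

```r
# Toy elements with their overlap codes (hypothetical values)
elements <- c("G1", "G2", "G3", "G4", "G5", "G6")
category <- c("100", "100", "110", "111", "010", "001")

# Group elements by code, then prefix the names in the "group_" style
groups <- split(elements, category)
names(groups) <- paste0("group_", names(groups))

# Number of elements per group
print(lengths(groups))
```

Each list entry holds the elements sharing one overlap pattern, so groups[["group_100"]] contains G1 and G2 in this toy case.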
7.1 Examining specific groups
Individual overlap groups can be accessed for downstream analysis:
7.1.1 Extract genes present in all three sets (group_111)
# Extract genes present in all three sets (group_111)
genes_in_all_three <- gene_groups[["group_111"]]
cat("Genes present in all three sets (A ∩ B ∩ C):\n")
Genes present in all three sets (A ∩ B ∩ C):
print(genes_in_all_three)
 [1] "ACP2" "ALDH3A1" "ACTB" "ACACA" "ASIC1" "SLC25A5"
[7] "ACTL6A" "AMY2B" "AMH" "AMPH" "ADK" "ALDH3A2"
[13] "ACTG1P3" "ACO1" "ACTG1P7" "ALPI" "ANXA4" "AGL"
[19] "ADRB2" "ABCF1" "ABO" "AMD1" "ALS3" "ALOX12"
[25] "AMBP" "AMPD2" "ALDH1A1" "AFG3L1P" "ADFN" "ADCYAP1R1"
[31] "ADD3" "ALOX12P2" "BIN1"
7.1.2 Extract genes unique to random_genes_A (group_100)
# Extract genes unique to random_genes_A (group_100)
genes_unique_to_A <- gene_groups[["group_100"]]
cat("Genes unique to random_genes_A:\n")
Genes unique to random_genes_A:
print(genes_unique_to_A)
 [1] "ALPP" "ACTG1P9" "AHSG" "ASIC2" "ACTG1P10" "ALAS1"
[7] "AKT2" "PARP1P1" "ABCD1" "SLC25A6" "AAMP" "ADCP1"
[13] "ACADVL" "ACTG1" "ANGPT2" "AGTR1" "ACACB" "ACTBP9"
[19] "ALDH1B1" "ADAR" "ABCD2" "AMHR2" "ABCB7" "ABCA1"
[25] "PARP4" "ACTG1P1" "JAG1" "ACTA2" "ADH7" "AP1B1"
[31] "ACVR1" "ACTN4" "A2MP1" "ABCA4" "ALAD" "ADRA1A"
[37] "ADCY5" "ALDOB" "AP2B1" "AMELY" "ABL1" "ACTC1"
[43] "AK2" "ALOX12B" "ACTN3" "AIC" "ALB" "NATP"
[49] "ANG" "AHR" "ABCA2" "ALPL" "ANXA2P1" "AMELX"
[55] "AHCY" "PARP1P2" "ALOX5" "AMPD1" "AFA" "ACADSB"
[61] "AIH3" "ACAN" "AGA" "AMY1C" "ADSS2" "ALDH2"
[67] "ALOX15B"
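Groups such as these can be cross-checked against base R set operations. The sketch below uses hypothetical toy sets (toy_sets and its gene names are placeholders) to show the two operations that correspond to the "all sets" and "unique to A" groups.

```r
# Toy input sets (hypothetical gene symbols)
toy_sets <- list(
  set_A = c("G1", "G2", "G3", "G4"),
  set_B = c("G3", "G4", "G5"),
  set_C = c("G4", "G6")
)

# Elements present in every set: fold intersect() over the list
in_all <- Reduce(intersect, toy_sets)

# Elements unique to set_A: drop anything found in the other sets
only_A <- setdiff(toy_sets$set_A,
                  union(toy_sets$set_B, toy_sets$set_C))

print(in_all)
print(only_A)
```

With the objects from this workflow, a call such as setequal(Reduce(intersect, gene_list), gene_groups[["group_111"]]) would be expected to return TRUE, assuming the groups follow the interpretation of overlap patterns given earlier.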
8 Exporting results
8.1 Export to Excel
The exportOverlaps() function exports each overlap group to a separate sheet in an Excel file:
# Export overlap groups to Excel
exportOverlaps(gene_groups,
               output_dir = "results",
               output_file = "gene_overlap_groups",
               with_date = TRUE,
               verbose = TRUE)
This creates an Excel file with one sheet per overlap group, making it easy to:
- Review genes in each category
- Perform functional enrichment analysis
- Share results with collaborators
- Import into other analysis tools
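Some downstream tools expect plain text input with one gene per line rather than an Excel workbook. As a base-R alternative to exportOverlaps() for that case, the sketch below writes each group of a named list to its own text file. The toy groups, the temporary output directory, and the file naming are assumptions made for this example.

```r
# Toy overlap groups (hypothetical gene symbols)
groups <- list(
  group_100 = c("G1", "G2"),
  group_111 = "G4"
)

# Write to a temporary directory; replace with a project folder as needed
out_dir <- file.path(tempdir(), "overlap_groups")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# One file per group, one gene symbol per line
for (group_name in names(groups)) {
  out_file <- file.path(out_dir, paste0(group_name, ".txt"))
  writeLines(groups[[group_name]], out_file)
}

print(list.files(out_dir))
```

The same loop should apply to the gene_groups list from this workflow if it is a named list of character vectors, as the earlier output suggests.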
8.2 Saving visualizations
Visualizations can be exported in multiple formats (PDF, PNG, SVG):
# Create a Venn diagram
venn_plot <- plotVenn(gene_overlaps)
# Save as PDF
saveViz(venn_plot,
        output_dir = "figures",
        output_file = "gene_venn_diagram",
        format = "pdf",
        width = 6,
        height = 4)
# Save as high-resolution PNG
saveViz(venn_plot,
        output_dir = "figures",
        output_file = "gene_venn_diagram",
        format = "png",
        width = 6,
        height = 4,
        resolution = 300)
# Save with transparent background for presentations
saveViz(venn_plot,
        output_dir = "figures",
        output_file = "gene_venn_diagram_transparent",
        format = "png",
        bg = "transparent")
9 References
For more information about the gVenn package, visit:
- Package documentation: https://ckntav.github.io/gVenn
- GitHub repository: https://github.com/ckntav/gVenn
10 Session information
sessionInfo()
R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.7.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] fr_CA.UTF-8/fr_CA.UTF-8/fr_CA.UTF-8/C/fr_CA.UTF-8/fr_CA.UTF-8
time zone: America/Toronto
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.51 gVenn_1.1.1
loaded via a namespace (and not attached):
[1] ComplexHeatmap_2.26.1 jsonlite_2.0.0 compiler_4.5.2
[4] rjson_0.2.23 crayon_1.5.3 Rcpp_1.1.1
[7] stringr_1.6.0 magick_2.9.0 parallel_4.5.2
[10] cluster_2.1.8.2 IRanges_2.44.0 png_0.1-8
[13] yaml_2.3.12 fastmap_1.2.0 generics_0.1.4
[16] shape_1.4.6.1 Cairo_1.7-0 BiocGenerics_0.56.0
[19] iterators_1.0.14 GetoptLong_1.1.0 htmlwidgets_1.6.4
[22] polyclip_1.10-7 circlize_0.4.17 lubridate_1.9.5
[25] RColorBrewer_1.1-3 polylabelr_1.0.0 rlang_1.1.7
[28] stringi_1.8.7 xfun_0.56 GlobalOptions_0.1.3
[31] otel_0.2.0 doParallel_1.0.17 timechange_0.4.0
[34] cli_3.6.5 magrittr_2.0.4 digest_0.6.39
[37] foreach_1.5.2 grid_4.5.2 lifecycle_1.0.5
[40] clue_0.3-67 eulerr_7.0.4 vctrs_0.7.1
[43] S4Vectors_0.48.0 glue_1.8.0 evaluate_1.0.5
[46] codetools_0.2-20 stats4_4.5.2 colorspace_2.1-2
[49] rmarkdown_2.30 matrixStats_1.5.0 tools_4.5.2
[52] htmltools_0.5.9