|  | 
|  | 1 | +# --- | 
|  | 2 | +# jupyter: | 
|  | 3 | +#   jupytext: | 
|  | 4 | +#     text_representation: | 
|  | 5 | +#       extension: .py | 
|  | 6 | +#       format_name: percent | 
|  | 7 | +#       format_version: '1.3' | 
|  | 8 | +#       jupytext_version: 1.17.2 | 
|  | 9 | +#   kernelspec: | 
|  | 10 | +#     display_name: vuecore-dev | 
|  | 11 | +#     language: python | 
|  | 12 | +#     name: python3 | 
|  | 13 | +# --- | 
|  | 14 | + | 
|  | 15 | +# %% [markdown] | 
|  | 16 | +# # Histogram Plot | 
|  | 17 | +# | 
|  | 18 | +# ![VueCore logo][vuecore_logo] | 
|  | 19 | +# | 
|  | 20 | +# [![Open In Colab][colab_badge]][colab_link] | 
|  | 21 | +# | 
|  | 22 | +# [VueCore][vuecore_repo] is a Python package for creating interactive and static visualizations of multi-omics data. | 
|  | 23 | +# It is part of a broader ecosystem of tools—including [ACore][acore_repo] for data processing and [VueGen][vuegen_repo] for automated reporting—that together enable end-to-end workflows for omics analysis. | 
|  | 24 | +# | 
|  | 25 | +# This notebook demonstrates how to generate histogram plots using plotting functions from VueCore. We showcase basic and advanced plot configurations, highlighting key customization options such as color grouping, overlaid bars, probability density normalization, hover tooltips, and export to multiple file formats. | 
|  | 26 | +# | 
|  | 27 | +# ## Notebook structure | 
|  | 28 | +# | 
|  | 29 | +# First, we will set up the work environment by installing the necessary packages and importing the required libraries. Next, we will create basic and advanced histogram plots. | 
|  | 30 | +# | 
|  | 31 | +# 0. [Work environment setup](#0-work-environment-setup) | 
|  | 32 | +# 1. [Basic histogram plot](#1-basic-histogram-plot) | 
|  | 33 | +# 2. [Advanced histogram plot](#2-advanced-histogram-plot) | 
|  | 34 | +# | 
|  | 35 | +# ## Credits and Contributors | 
|  | 36 | +# | 
|  | 37 | +# - This notebook was created by Sebastián Ayala-Ruano under the supervision of Henry Webel and Alberto Santos, head of the [Multiomics Network Analytics Group (MoNA)][Mona] at the [Novo Nordisk Foundation Center for Biosustainability (DTU Biosustain)][Biosustain]. | 
|  | 38 | +# - You can find more details about the project in this [GitHub repository][vuecore_repo]. | 
|  | 39 | +# | 
|  | 40 | +# [colab_badge]: https://colab.research.google.com/assets/colab-badge.svg | 
|  | 41 | +# [colab_link]: https://colab.research.google.com/github/Multiomics-Analytics-Group/vuecore/blob/main/docs/api_examples/bar_plot.ipynb | 
|  | 42 | +# [vuecore_logo]: https://raw.githubusercontent.com/Multiomics-Analytics-Group/vuecore/main/docs/images/logo/vuecore_logo.svg | 
|  | 43 | +# [Mona]: https://multiomics-analytics-group.github.io/ | 
|  | 44 | +# [Biosustain]: https://www.biosustain.dtu.dk/ | 
|  | 45 | +# [vuecore_repo]: https://github.com/Multiomics-Analytics-Group/vuecore | 
|  | 46 | +# [vuegen_repo]: https://github.com/Multiomics-Analytics-Group/vuegen | 
|  | 47 | +# [acore_repo]: https://github.com/Multiomics-Analytics-Group/acore | 
|  | 48 | + | 
|  | 49 | +# %% [markdown] | 
|  | 50 | +# ## 0. Work environment setup | 
|  | 51 | + | 
|  | 52 | +# %% [markdown] | 
|  | 53 | +# ### 0.1. Installing libraries and creating global variables for platform and working directory | 
|  | 54 | +# | 
|  | 55 | +# To run this notebook locally, you should create a virtual environment with the required libraries. If you are running this notebook on Google Colab, everything should be set. | 
|  | 56 | + | 
|  | 57 | +# %% tags=["hide-output"] | 
|  | 58 | +# VueCore library | 
|  | 59 | +# %pip install vuecore | 
|  | 60 | + | 
|  | 61 | +# %% tags=["hide-cell"] | 
|  | 62 | +import os | 
|  | 63 | + | 
|  | 64 | +IN_COLAB = "COLAB_GPU" in os.environ | 
|  | 65 | + | 
|  | 66 | +# %% tags=["hide-cell"] | 
|  | 67 | +# Create a directory for outputs | 
|  | 68 | +output_dir = "./outputs" | 
|  | 69 | +os.makedirs(output_dir, exist_ok=True) | 
|  | 70 | + | 
|  | 71 | +# %% [markdown] | 
|  | 72 | +# ### 0.2. Importing libraries | 
|  | 73 | + | 
|  | 74 | +# %% | 
|  | 75 | +# Imports | 
|  | 76 | +import pandas as pd | 
|  | 77 | +import numpy as np | 
|  | 78 | +from pathlib import Path | 
|  | 79 | +import plotly.io as pio | 
|  | 80 | + | 
|  | 81 | +from vuecore.plots.basic.histogram import create_histogram_plot | 
|  | 82 | + | 
|  | 83 | +# Set the Plotly renderer based on the environment | 
|  | 84 | +pio.renderers.default = "colab" if IN_COLAB else "notebook" | 
|  | 85 | + | 
|  | 86 | +# %% [markdown] | 
|  | 87 | +# ### 0.3. Create sample data | 
|  | 88 | +# We create a synthetic dataset simulating gene expression data across two experimental conditions to demonstrate how histograms can visualize data distribution. | 
|  | 89 | + | 
|  | 90 | +# %% | 
|  | 91 | +# Set a random seed for reproducibility of the synthetic data | 
|  | 92 | +np.random.seed(42) | 
|  | 93 | + | 
|  | 94 | +# Define parameters for synthetic gene expression data | 
|  | 95 | +num_genes = 1000 | 
|  | 96 | +conditions = ["Control", "Treated"] | 
|  | 97 | +gene_names = [f"Gene_{i}" for i in range(num_genes)] | 
|  | 98 | + | 
|  | 99 | +# Simulate expression data with a slight shift in the "Treated" group | 
|  | 100 | +expression_values = np.concatenate( | 
|  | 101 | +    [ | 
|  | 102 | +        np.random.normal(loc=10, scale=2, size=num_genes // 2), | 
|  | 103 | +        np.random.normal(loc=12, scale=2, size=num_genes // 2), | 
|  | 104 | +    ] | 
|  | 105 | +) | 
|  | 106 | +condition_values = np.concatenate( | 
|  | 107 | +    [[conditions[0]] * (num_genes // 2), [conditions[1]] * (num_genes // 2)] | 
|  | 108 | +) | 
|  | 109 | + | 
|  | 110 | +# Create the DataFrame | 
|  | 111 | +gene_exp_df = pd.DataFrame( | 
|  | 112 | +    { | 
|  | 113 | +        "Gene_ID": gene_names, | 
|  | 114 | +        "Expression": expression_values, | 
|  | 115 | +        "Condition": condition_values, | 
|  | 116 | +    } | 
|  | 117 | +) | 
|  | 118 | + | 
|  | 119 | +gene_exp_df.head() | 
|  | 120 | + | 
|  | 121 | +# %% [markdown] | 
|  | 122 | +# ## 1. Basic Histogram Plot | 
|  | 123 | +# A basic histogram plot can be created by providing the DataFrame and the `x` column to bin, along with style options like `title`. | 
|  | 124 | + | 
|  | 125 | +# %% | 
|  | 126 | +# Define output file path for the PNG basic histogram | 
|  | 127 | +file_path_basic_hist_png = Path(output_dir) / "histogram_plot_basic.png" | 
|  | 128 | + | 
|  | 129 | +# Generate the basic histogram plot | 
|  | 130 | +histogram_plot_basic = create_histogram_plot( | 
|  | 131 | +    data=gene_exp_df, | 
|  | 132 | +    x="Expression", | 
|  | 133 | +    title="Distribution of Gene Expression Levels", | 
|  | 134 | +    file_path=file_path_basic_hist_png, | 
|  | 135 | +) | 
|  | 136 | + | 
|  | 137 | +histogram_plot_basic.show() | 
|  | 138 | + | 
|  | 139 | +# %% [markdown] | 
|  | 140 | +# ## 2. Advanced Histogram Plot | 
|  | 141 | +# Here is an example of an advanced histogram plot with more descriptive parameters, including color grouping (`color`), overlaid bars (`barmode="overlay"`), probability density normalization (`histnorm="probability density"`), hover tooltips (`hover_data`), and export to HTML. | 
|  | 142 | + | 
|  | 143 | +# %% | 
|  | 144 | +# Define the output file path for the advanced HTML histogram | 
|  | 145 | +file_path_adv_hist_html = Path(output_dir) / "histogram_plot_advanced.html" | 
|  | 146 | + | 
|  | 147 | +# Generate the advanced histogram plot | 
|  | 148 | +histogram_plot_adv = create_histogram_plot( | 
|  | 149 | +    data=gene_exp_df, | 
|  | 150 | +    x="Expression", | 
|  | 151 | +    color="Condition", | 
|  | 152 | +    barmode="overlay", | 
|  | 153 | +    histnorm="probability density", | 
|  | 154 | +    title="Gene Expression Distribution by Treatment Condition", | 
|  | 155 | +    subtitle="Histogram normalized to probability density", | 
|  | 156 | +    labels={"Expression": "Gene Expression", "Condition": "Treatment Condition"}, | 
|  | 157 | +    hover_data=["Gene_ID"], | 
|  | 158 | +    opacity=0.75, | 
|  | 159 | +    file_path=file_path_adv_hist_html, | 
|  | 160 | +) | 
|  | 161 | + | 
|  | 162 | +histogram_plot_adv.show() | 
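
The `histnorm="probability density"` setting used in the advanced plot rescales the bars so that the bar areas of each group sum to 1, which keeps groups of different sizes comparable when they are overlaid. The following is a minimal NumPy sketch of that normalization, not VueCore's or Plotly's internal implementation. The helper `density_histogram` and the bin count of 20 are hypothetical choices for illustration, and the data are redrawn with `np.random.default_rng`, so the values differ from the notebook's `gene_exp_df`.

```python
import numpy as np


def density_histogram(values, bins=20):
    """Return (density, edges) such that sum(density * bin_width) equals 1.

    Hypothetical helper for illustration only.
    """
    counts, edges = np.histogram(values, bins=bins)
    widths = np.diff(edges)
    # Divide by the total count (probability) and by the bin width (density)
    density = counts / (counts.sum() * widths)
    return density, edges


# Synthetic data with the same distribution parameters as the notebook
rng = np.random.default_rng(42)
control = rng.normal(loc=10, scale=2, size=500)
treated = rng.normal(loc=12, scale=2, size=500)

for name, values in {"Control": control, "Treated": treated}.items():
    density, edges = density_histogram(values)
    area = float(np.sum(density * np.diff(edges)))
    print(f"{name}: area under histogram = {area:.3f}")
```

Because each group is normalized on its own, the two overlaid histograms can be compared by shape and location even if the groups contain different numbers of genes.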