scanpy-gseapy pipeline

This application is a SCANPY-GSEAPY pipeline with a web interface. The pipeline is written in Python (3.8.5). A Flask server connects the pipeline to the ReactJS frontend. The application accepts either a ranked gene list or a 10xGenomics gene expression dataset as input. If a ranked list is provided, only the GSEA part of the pipeline is executed. If a 10xGenomics gene expression dataset is provided, a ranked gene list is computed first and GSEA is then performed on that list. The pipeline outputs enrichment plots.

Set up the application:

Open a terminal.
Clone the repository to your computer:

git clone https://github.com/ernohanninen/scanpy-gseapy_pipeline.git

Navigate to the repository folder:

cd scanpy-gseapy_pipeline

Set up the conda virtual environment:

conda env create -f gsea_env.yml

Activate the conda environment:

conda activate gsea_env

Navigate to the frontend folder:

cd gsa_app

Start the Flask server:

yarn run start-api

Open a new terminal window and navigate to the folder ~/scanpy-gseapy_pipeline/gsa_app.
Activate the conda environment:

conda activate gsea_env

Start the ReactJS frontend:

yarn start

The application is now running at: http://localhost:3000/

Using the application:

Select the input file type using the radio button:

Input types:
- Ranked gene list (.csv): genes should be ranked according to their differential expression, with gene names in column one and the ranking metric in column two. An example ranked gene list, ranked.csv, is provided in the input_data folder.
- 10xGenomics files: a gene expression dataset in the three-file Matrix Market layout (not a single HDF5 file). The dataset consists of three files named matrix.mtx, barcodes.tsv, and either features.tsv or genes.tsv. An example 10xGenomics dataset can be found in the input_data/hg folder. Submit all three files.
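
The input requirements above can be checked before uploading. The following is a minimal sketch using only the Python standard library; it is not part of the pipeline, and the function names are hypothetical. It assumes the ranked list may start with a header row, and it treats "ranked" as sorted from highest to lowest metric, which the README does not state explicitly.

```python
import csv
from pathlib import Path


def read_ranked_list(path):
    """Read (gene, score) pairs from a two-column CSV file.

    Assumption: a non-numeric second column is tolerated only on the
    first row, where it is treated as a header and skipped.
    """
    pairs = []
    with open(path, newline="") as handle:
        for index, row in enumerate(csv.reader(handle)):
            if not row:
                continue  # ignore blank lines
            if len(row) < 2:
                raise ValueError(f"row {index + 1} has fewer than two columns")
            try:
                score = float(row[1])
            except ValueError:
                if index == 0:
                    continue  # assumed header row
                raise
            pairs.append((row[0].strip(), score))
    return pairs


def is_ranked_descending(pairs):
    """Return True if the scores never increase from one row to the next."""
    return all(a[1] >= b[1] for a, b in zip(pairs, pairs[1:]))


def missing_10x_files(folder):
    """List the expected 10xGenomics files that are absent from a folder."""
    present = {p.name for p in Path(folder).iterdir() if p.is_file()}
    missing = [n for n in ("matrix.mtx", "barcodes.tsv") if n not in present]
    # Either features.tsv or genes.tsv is acceptable.
    if not ({"features.tsv", "genes.tsv"} & present):
        missing.append("features.tsv or genes.tsv")
    return missing
```

For example, missing_10x_files("input_data/hg") should return an empty list if the example dataset is complete, and read_ranked_list("input_data/ranked.csv") should return the gene and metric pairs of the example list.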

Select the gene set. Two example gene sets, KEGG_2021_Human.txt and NCI-60_Cancer_Cell_Lines.txt, are provided in the input_data folder.
Choose the number of GSEA plots you want as output.
Run the analysis by pressing the Run analysis button. The pipeline is computationally heavy, so it takes a while to run. If the application is working correctly, the text "Starting the analysis" is printed to the terminal window where the Flask server is running.
The results (GSEA plots) are written to the /scanpy-gseapy_pipeline/data/GSEA_Prerank folder.
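
To see what a gene set file contains before selecting it, the file can be read with a few lines of Python. This is a sketch under an assumption: that the example gene set files use the tab-separated GMT layout (set name, description field, then one gene per field). The README does not confirm the layout, and the function names are hypothetical.

```python
def parse_gmt_line(line):
    """Split one GMT line into (set_name, genes).

    Assumed layout: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    Empty fields (for example from a trailing tab) are dropped.
    """
    fields = line.rstrip("\n").split("\t")
    name = fields[0]
    genes = [gene for gene in fields[2:] if gene]
    return name, genes


def read_gene_sets(path):
    """Return a dict mapping each gene set name to its list of genes."""
    gene_sets = {}
    with open(path) as handle:
        for line in handle:
            if line.strip():
                name, genes = parse_gmt_line(line)
                gene_sets[name] = genes
    return gene_sets
```

Calling read_gene_sets("input_data/KEGG_2021_Human.txt") and inspecting the keys gives the names of the gene sets that can appear in the enrichment plots, provided the file follows the assumed layout.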
