|
1 |
| -<img src="https://img.shields.io/badge/fastapi-109989?style=for-the-badge&logo=FASTAPI&logoColor=white" /> <img src="https://img.shields.io/badge/Python-FFD43B?style=for-the-badge&logo=python&logoColor=blue" /> <img src="https://img.shields.io/badge/PostgreSQL-316192?style=for-the-badge&logo=postgresql&logoColor=white" /> |
| 1 | +<p align="center"><img src="https://img.shields.io/badge/fastapi-109989?style=for-the-badge&logo=FASTAPI&logoColor=white" /> <img src="https://img.shields.io/badge/Python-FFD43B?style=for-the-badge&logo=python&logoColor=blue" /> <img src="https://img.shields.io/badge/PostgreSQL-316192?style=for-the-badge&logo=postgresql&logoColor=white" /> |
| 2 | +</p> |
2 | 3 |
|
3 |
| -# pephub |
| 4 | +<p align="center"> |
| 5 | + <a href="https://pephub.databio.org"><img src="./docs/imgs/pephub_logo_big.svg" alt="PEPhub"></a> |
4 | 6 |
|
5 |
| -**pephub** is a biological metadata server that lets you view, store, and share your sample metadata in the form of [PEPs](https://pep.databio.org/en/latest/). It has 3 components: 1) a _database_ where PEPs are stored; 2) an _API_ to programmatically read and write PEPs in the database; and 3) a web-based _user interface_ to view and manage these PEPs via a front-end. |
6 | 7 |
|
7 |
| -## Organization |
| 8 | +</p> |
8 | 9 |
|
9 |
| -## Setting up a development environment |
| 10 | +**PEPhub** is a biological metadata server that lets you view, store, and share your sample metadata in the form of [PEPs](https://pep.databio.org/en/latest/). It has 3 components: 1) a _database_ where PEPs are stored; 2) an _API_ to programmatically read and write PEPs in the database; and 3) a web-based _user interface_ to view and manage these PEPs via a front-end. |
10 | 11 |
|
11 |
| -PEPhub consists of 3 components: 1) A postgres database; 2) the PEPhub API; 3) the PEPhub UI. |
| 12 | +--- |
12 | 13 |
|
13 |
| -### 1. Database setup |
| 14 | +**Deployed public instance**: <a href="https://pephub.databio.org/" target="_blank">https://pephub.databio.org/</a> |
14 | 15 |
|
15 |
| -_pephub_ stores PEPs in a [POSTGRES](https://www.postgresql.org/) database. Create a new pephub-compatible postgres instance locally: |
| 16 | +**API**: <a href="https://pephub-api.databio.org/api/v1/docs" target="_blank">https://pephub-api.databio.org/api/v1/docs</a> |
16 | 17 |
|
17 |
| -``` |
18 |
| -docker pull postgres |
19 |
| -docker run \ |
20 |
| - -e POSTGRES_USER=postgres \ |
21 |
| - -e POSTGRES_PASSWORD=docker \ |
22 |
| - -e POSTGRES_DB=pep-db \ |
23 |
| - -p 5432:5432 \ |
24 |
| - postgres |
25 |
| -``` |
| 18 | +**Documentation**: <a href="https://pep.databio.org/pephub" target="_blank">https://pep.databio.org/pephub</a> |
26 | 19 |
|
27 |
| -You should now have a pephub-compatible postgres instance running at `localhost:5432`. |
28 |
| -You can use [load_db.py](scripts/load_db.py) to load a directory of PEPs into the database. |
| 20 | +**Source Code**: <a href="https://github.com/pepkit/pephub" target="_blank">https://github.com/pepkit/pephub</a> |
29 | 21 |
|
30 |
| -### 2. `pephub` API setup |
| 22 | +--- |
31 | 23 |
|
32 |
| -#### Install |
33 |
| - |
34 |
| -Install dependencies using `pip` (_We suggest using virtual environments_): |
35 |
| - |
36 |
| -``` |
37 |
| -python -m venv venv && source venv/bin/activate |
38 |
| -pip install -r requirements/requirements-all.txt |
39 |
| -``` |
40 |
| - |
41 |
| -#### Running |
42 |
| - |
43 |
| -_pephub_ may be run in several ways. In every case, pephub requires configuration. Configuration settings are supplied to pephub through environment variables. The following settings are **required**. While pephub has built-in defaults for these settings, you should provide them to ensure compatibility: |
44 |
| - |
45 |
| -- `POSTGRES_HOST`: The hostname of the PEPhub database server |
46 |
| -- `POSTGRES_DB`: The name of the database inside the postgres server |
47 |
| -- `POSTGRES_USER`: Username for the database |
48 |
| -- `POSTGRES_PASSWORD`: Password for the user |
49 |
| -- `POSTGRES_PORT`: Port for postgres database |
50 |
| -- `GH_CLIENT_ID`: Client ID for the GitHub application that authenticates users |
51 |
| -- `GH_CLIENT_SECRET`: Client secret for the GitHub application that authenticates users |
52 |
| -- `BASE_URI`: A BASE URI of the PEPhub (e.g. localhost:8000) |
53 |
| - |
54 |
| -You must set these environment variables prior to running PEPhub. We've provided `env` files inside [`environment`](./environment) which you may `source` to load your environment. Alternatively, you may store them locally in a `.env` file. This file will get loaded and exported to your environment when the server starts up. We've included an [example](environment/template.env) `.env` file with this repository. You can read more about server settings and configuration [here](docs/server-settings.md). |
55 |
| - |
56 |
| -Once the configuration variables are set, run pephub natively with: |
57 |
| - |
58 |
| -``` |
59 |
| -uvicorn pephub.main:app --reload |
60 |
| -``` |
61 |
| - |
62 |
| -The _pephub_ API should now be running at http://localhost:8000. |
63 |
| - |
64 |
| -### 3. React PEPhub UI setup |
65 |
| - |
66 |
| -_Important:_ To make the development server work, you must include a `.env.local` file inside `web/` with the following contents: |
67 |
| - |
68 |
| -``` |
69 |
| -VITE_API_HOST=http://localhost:8000 |
70 |
| -``` |
71 |
| - |
72 |
| -This ensures that the frontend development server will proxy requests to the backend server. You can now run the frontend development server: |
73 |
| - |
74 |
| -```bash |
75 |
| -cd web |
76 |
| -npm install # yarn install |
77 |
| -npm start # yarn dev |
78 |
| -``` |
79 |
| - |
80 |
| -The pephub frontend development server should now be running at http://localhost:5173/. |
81 |
| - |
82 |
| -### 3. (_Optional_) GitHub Authentication Client Setup |
83 |
| - |
84 |
| -_pephub_ uses GitHub for namespacing and authentication. As such, a GitHub application capable of logging in users is required. We've included [instructions for setting up GitHub authentication locally](https://github.com/pepkit/pephub/blob/master/docs/authentication.md#setting-up-github-oauth-for-your-own-server) using your own GitHub account. |
85 |
| - |
86 |
| -### 4. (_Optional_) Vector Database Setup |
87 |
| - |
88 |
| -We've added [semantic-search](https://huggingface.co/course/chapter5/6?fw=tf#using-embeddings-for-semantic-search) capabilities to pephub. Optionally, you may host an instance of the [qdrant](https://qdrant.tech/) **vector database** to store embeddings computed using a sentence transformer that has mined and processed any relevant metadata from PEPs. If no qdrant connection settings are supplied, pephub will default to SQL search. Read more [here](docs/semantic-search.md). To run qdrant locally, simply run the following: |
89 |
| - |
90 |
| -``` |
91 |
| -docker pull qdrant/qdrant |
92 |
| -docker run -p 6333:6333 \ |
93 |
| - -v $(pwd)/qdrant_storage:/qdrant/storage \ |
94 |
| - qdrant/qdrant |
95 |
| -``` |
96 |
| - |
97 |
| -## Running with docker: |
98 |
| - |
99 |
| -### Option 1. Standalone `docker`: |
100 |
| - |
101 |
| -If you already have a public database instance running, you can choose to build and run the server container only. **A note to Apple Silicon (M1/M2) users**: If you have issues running, try setting your default docker platform with `export DOCKER_DEFAULT_PLATFORM=linux/amd64` to get the container to build and run properly. See [this issue](https://github.com/pepkit/pephub/issues/87) for more information. |
102 |
| - |
103 |
| -**1. Environment:** |
104 |
| -Ensure that you have your [environment](docs/server-settings.md) properly configured. To manage secrets in your environment, we leverage `pass` and curated [`.env` files](environment/production.env). You can use our `launch_docker.sh` script to start your container with these `.env` files. |
105 |
| - |
106 |
| -**2. Build and start container:** |
107 |
| - |
108 |
| -``` |
109 |
| -docker build -t pephub . |
110 |
| -./launch_docker.sh |
111 |
| -
|
112 |
| -``` |
113 |
| - |
114 |
| -Alternatively, you can inject your environment variables one-by-one: |
115 |
| - |
116 |
| -``` |
117 |
| -
|
118 |
| -docker run -p 8000:8000 \ |
119 |
| - -e POSTGRES_HOST=localhost \ |
120 |
| - -e POSTGRES_DB=pep-db \ |
121 |
| - ... |
122 |
| -pephub |
123 |
| -
|
124 |
| -``` |
125 |
| - |
126 |
| -Or, provide your own `.env` file: |
127 |
| - |
128 |
| -``` |
129 |
| -
|
130 |
| -docker run -p 8000:8000 \ |
131 |
| - --env-file path/to/.env \ |
132 |
| - pephub |
133 |
| -
|
134 |
| -``` |
135 |
| - |
136 |
| -### Option 2. `docker compose`: |
137 |
| - |
138 |
| -The server has been Dockerized and packaged with a [postgres](https://hub.docker.com/_/postgres) image to be run with [`docker compose`](https://docs.docker.com/compose/). This lets you run everything at once and develop without having to manage database instances. |
139 |
| - |
140 |
| -You can start a development environment in two steps: |
141 |
| - |
142 |
| -**1. Curate your environment:** |
143 |
| -Since we are running in `docker`, we need to supply environment variables to the container. The `docker-compose.yaml` file is written such that you can supply a `.env` file at the root with your configurations. See the [example env file](environment/template.env) for reference. See [here](docs/server-settings.md) for a detailed explanation of all configurable server settings. For now, you can simply copy the `env` file: |
144 |
| - |
145 |
| -``` |
146 |
| -cp environment/template.env .env |
147 |
| -``` |
148 |
| - |
149 |
| -**2. Build and start the containers:** |
150 |
| - |
151 |
| -```console |
152 |
| -docker compose up --build |
153 |
| -``` |
154 |
| - |
155 |
| -`pephub` now runs/listens on http://localhost:8000 |
156 |
| -`postgres` now runs/listens on `localhost:5432` |
157 |
| - |
158 |
| -**3. (_Optional_) Utilize the [`load_db`](scripts/load_db.py) script to populate the database with `examples/`:** |
159 |
| - |
160 |
| -```console |
161 |
| -cd scripts |
162 |
| -python load_db.py \ |
163 |
| ---username docker \ |
164 |
| ---password password \ |
165 |
| ---database pephub \ |
166 |
| -../examples |
167 |
| -``` |
168 |
| - |
169 |
| -**4. (_Optional_) GitHub Authentication Client Setup** |
170 |
| - |
171 |
| -_pephub_ uses GitHub for namespacing and authentication. As such, a GitHub application capable of logging in users is required. We've [included instructions](https://github.com/pepkit/pephub/blob/master/docs/authentication.md#setting-up-github-oauth-for-your-own-server) for setting this up locally using your own GitHub account. |
172 |
| - |
173 |
| -**5. (_Optional_) Vector Database Setup** |
174 |
| - |
175 |
| -We've added [semantic-search](https://huggingface.co/course/chapter5/6?fw=tf#using-embeddings-for-semantic-search) capabilities to pephub. Optionally, you may host an instance of the [qdrant](https://qdrant.tech/) **vector database** to store embeddings computed using a sentence transformer that has mined and processed any relevant metadata from PEPs. If no qdrant connection settings are supplied, pephub will default to SQL search. Read more [here](docs/semantic-search.md). To run qdrant locally, simply run the following: |
176 |
| - |
177 |
| -``` |
178 |
| -docker pull qdrant/qdrant |
179 |
| -docker run -p 6333:6333 \ |
180 |
| - -v $(pwd)/qdrant_storage:/qdrant/storage \ |
181 |
| - qdrant/qdrant |
182 |
| -``` |
183 |
| - |
184 |
| -_Note: If you wish to run the development environment with a public database, curate your `.env` file as such._ |
185 | 24 |
|
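The README text removed in this diff lists eight environment variables as required configuration for the API. As a rough illustration of what checking them before startup could look like, here is a minimal Python sketch. It assumes only the variable names listed in that removed text; the helper `missing_settings` and the constant `REQUIRED_SETTINGS` are hypothetical and are not part of pephub.

```python
import os
from typing import Mapping

# Names taken from the removed README text; pephub itself may read more settings.
REQUIRED_SETTINGS = (
    "POSTGRES_HOST",
    "POSTGRES_DB",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_PORT",
    "GH_CLIENT_ID",
    "GH_CLIENT_SECRET",
    "BASE_URI",
)


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return the required names that are unset or empty in `env` (hypothetical helper)."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


if __name__ == "__main__":
    # Check the current process environment and report what is absent.
    missing = missing_settings(os.environ)
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the check easy to exercise with a plain dictionary.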