Review insights from your organization's GitHub activity and gain a deeper understanding of its repositories, member activity, and languages.
- Prerequisites
- Installation
- Setting Up Virtual Environment
- Installing Dependencies
- Running the Project
Before you begin, ensure you have the following installed:
- Clone the repository:
    git clone [email protected]:Muftawo/GitMetrica.git
    cd GitMetrica
- Set up and activate a virtual environment:
    python -m venv .venv
    source .venv/bin/activate
- Install required packages:
    pip install -r requirements.txt
Activate the virtual environment before running the project:
source .venv/bin/activate
The project consists of three Python files: extract.py, transform.py, and main.py.
- extract.py pulls all repositories from the organization's GitHub account, together with all of their pull requests, and saves the resulting data as JSON. Run it to extract all repository data from the GitHub organization.
- transform.py reads the saved JSON data, applies the required transformations per the description, and saves the resulting DataFrame as a Parquet file. Run it to apply the transformations and write the Parquet file.
- main.py is the primary application file. It imports from the extract and transform modules and runs the entire application:
    $ python main.py
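The extract-then-transform flow described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names, the `repos.json` and `pulls.json` file names, and the summary columns are all hypothetical, and the real transformations may differ. It assumes the public GitHub REST API endpoints for listing an organization's repositories and a repository's pull requests.

```python
import json
import urllib.request
from pathlib import Path

API_ROOT = "https://api.github.com"


def fetch_json(url, token=None):
    """GET a GitHub REST API URL and decode the JSON body."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_pages(url, fetch=fetch_json, per_page=100):
    """Request page 1, 2, ... until the API returns an empty page."""
    items, page = [], 1
    while True:
        sep = "&" if "?" in url else "?"
        batch = fetch(f"{url}{sep}per_page={per_page}&page={page}")
        if not batch:
            return items
        items.extend(batch)
        page += 1


def extract(org, out_dir, fetch=fetch_json):
    """Save the organization's repos and their pull requests as JSON.

    The output file names are assumptions made for this sketch.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    repos = fetch_all_pages(f"{API_ROOT}/orgs/{org}/repos", fetch)
    pulls = {}
    for repo in repos:
        url = f"{API_ROOT}/repos/{org}/{repo['name']}/pulls?state=all"
        pulls[repo["name"]] = fetch_all_pages(url, fetch)
    (out / "repos.json").write_text(json.dumps(repos))
    (out / "pulls.json").write_text(json.dumps(pulls))
    return out


def transform(in_dir):
    """Build one summary row per repository from the saved JSON.

    The columns below are illustrative, not the project's real schema.
    """
    src = Path(in_dir)
    repos = json.loads((src / "repos.json").read_text())
    pulls = json.loads((src / "pulls.json").read_text())
    rows = []
    for repo in repos:
        repo_pulls = pulls.get(repo["name"], [])
        rows.append({
            "repo": repo["name"],
            "language": repo.get("language"),
            "num_prs": len(repo_pulls),
            "num_prs_merged": sum(1 for pr in repo_pulls if pr.get("merged_at")),
        })
    return rows


def save_parquet(rows, path):
    """Write the rows to Parquet; needs pandas plus pyarrow or fastparquet."""
    import pandas as pd  # imported lazily so the rest runs without pandas

    pd.DataFrame(rows).to_parquet(path, index=False)
```

A driver in the spirit of main.py would call `extract(...)`, then `transform(...)`, then `save_parquet(...)` in sequence. The `fetch` parameter is injectable so the flow can be exercised offline with a stub, and an access token can be supplied by wrapping `fetch_json` with `functools.partial`.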