Update
Lin-jun-xiang committed Sep 21, 2023
1 parent 051e0bd commit cb4515b
Showing 4 changed files with 115 additions and 2 deletions.
113 changes: 113 additions & 0 deletions README.md
@@ -4,16 +4,129 @@

[English version](README.md) | [Chinese version README.md](README.zh-TW.md)

[Documents](https://lin-jun-xiang.github.io/pyga4/) | [Pypi](https://pypi.org/project/pyga4/)

## Overview

- [Overview](#overview)
- [Introduction](#introduction)
- [How to Stream GA4 Data to Bigquery in Real-time?](#how-to-stream-ga4-data-to-bigquery-in-real-time)
- [Features](#features)
- [How to Use?](#how-to-use)
  - [Download the Package](#download-the-package)
  - [Connect to Your Bigquery](#connect-to-your-bigquery)
  - [Connect to GA4 Tables](#connect-to-ga4-tables)
  - [Evaluate Query Cost with dry run](#evaluate-query-cost-with-dry-run)
  - [Analyze User Properties](#analyze-user-properties)
  - [Analyze Device Properties](#analyze-device-properties)
  - [Analyze Events](#analyze-events)

---

## Introduction

* `pyGA4` is a Python toolkit designed for **extracting, processing, and analyzing** data from **Google Analytics 4 (GA4)**.
* Whether you're a digital marketing professional, a data analyst, or anyone interested in gaining insights from GA4 data, this package simplifies the process of working with your GA4 data.

## How to Stream GA4 Data to Bigquery in Real-time?

This guide assumes you have already set up GA4 tracking on your website (many online tutorials cover this).

Next, link your GA4 property to BigQuery so that events are streamed into it. For detailed instructions, refer to the [official documentation](https://support.google.com/analytics/answer/9823238?hl=en#zippy=%2Cin-this-article).

If the link succeeds, you will see a dataset in BigQuery named `analytics_xxxx` containing tables similar to the following ([image source](https://analyticscanvas.com/knowledge-base/ga4-bigquery-export-tutorial-002-querying-event-params/)):

![Example of GA4 export tables in BigQuery](static/images/2023-09-21-15-04-30.png)

## Features

- **Query Cost Estimation**: Provides the Bigquery `dry run` feature to estimate query cost before execution.
- **Data Extraction**: Easily connect to your GA4 property, retrieve data, and save it for analysis.
- **Data Preprocessing**: Prepare and clean your GA4 data for analysis with built-in data preprocessing functions.
- **Custom Queries**: Execute custom queries to filter and aggregate data based on your specific needs.
- **Data Analysis**: Perform various types of analysis, including user behavior analysis, conversion tracking, and more.
- **Data Visualization**: Create informative visualizations and reports to communicate your findings effectively.
- **Simple Integration**: Seamlessly integrate `pyGA4` into your data pipeline or analytics workflow.

## How to Use?

For more features, please refer to the [package documentation](https://lin-jun-xiang.github.io/pyga4/).

#### Download the Package

`pip install pyga4`

#### Connect to Your Bigquery
```python
from google.cloud import bigquery

client = bigquery.Client()
# Or you can use:
# client = bigquery.Client.from_service_account_json(
# './private/service-project-data-dev-01d11c742ba1.json'
# )
```

#### Connect to GA4 Tables
```python
from pyga4.model import Ga4Table

# Replace these placeholders with your own project ID and dataset name (analytics_xxxx)
PROJECT_ID = 'your-project-id'
DATASET_NAME = 'analytics_xxxx'
ga4_table = Ga4Table(client, PROJECT_ID, DATASET_NAME)

# List the tables in the dataset, e.g., events_20200811, events_intraday_20200812
table_id_list = ga4_table.all_tables_list
print(table_id_list)

# Select the table you want to analyze
ga4_table.table_id = 'events_intraday_20200812'
```
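
The table IDs listed above follow the GA4 BigQuery export naming: `events_YYYYMMDD` for daily tables and `events_intraday_YYYYMMDD` for streaming tables. The sketch below is not part of `pyGA4`; it shows one possible way to pick the most recent table from such a list. It assumes `all_tables_list` yields plain string IDs, and `parse_table_id` and `latest_table` are hypothetical helper names.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional, Sequence

# GA4 export tables are named events_YYYYMMDD (daily)
# or events_intraday_YYYYMMDD (streaming).
_TABLE_ID_PATTERN = re.compile(r"events_(intraday_)?(\d{8})")


class ExportTable(NamedTuple):
    table_id: str
    day: date
    intraday: bool


def parse_table_id(table_id: str) -> Optional[ExportTable]:
    """Return the parsed table, or None if the ID is not a GA4 events table."""
    match = _TABLE_ID_PATTERN.fullmatch(table_id)
    if match is None:
        return None
    try:
        day = datetime.strptime(match.group(2), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None
    return ExportTable(table_id, day, match.group(1) is not None)


def latest_table(table_ids: Sequence[str]) -> Optional[str]:
    """Pick the most recent events table; prefer the daily table on a tie."""
    parsed = [t for t in map(parse_table_id, table_ids) if t is not None]
    if not parsed:
        return None
    # Newer day wins; for the same day, the daily (non-intraday) table wins.
    best = max(parsed, key=lambda t: (t.day, not t.intraday))
    return best.table_id
```

Under the same assumption, `ga4_table.table_id = latest_table(table_id_list)` would select the newest table instead of a hard-coded ID.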

#### Evaluate Query Cost with dry run
```python
# Enable dry run so the query is only estimated, not executed
ga4_table.query_config.dry_run = True
query = """
SELECT event_timestamp FROM `<project_id>.<dataset_name>.<data_table>`
"""
results = ga4_table.query(query)  # Returns None on a dry run, but you can see the estimated query usage
```
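
A dry run reports how many bytes a query would scan, which is the quantity that on-demand BigQuery billing is based on. As a rough sketch (not part of `pyGA4`), that byte count can be turned into a cost estimate. The helper names are hypothetical, and the price per TiB used in the usage lines is only a placeholder assumption; check the current BigQuery pricing for your region.

```python
BYTES_PER_TIB = 2 ** 40


def estimate_query_cost_usd(total_bytes_processed: int, usd_per_tib: float) -> float:
    """Convert a dry-run byte count into an estimated on-demand cost in USD."""
    if total_bytes_processed < 0:
        raise ValueError("total_bytes_processed must be non-negative")
    return total_bytes_processed / BYTES_PER_TIB * usd_per_tib


def format_bytes(num_bytes: int) -> str:
    """Render a byte count with a binary unit, e.g. 1536 -> '1.50 KiB'."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.2f} {unit}"
        size /= 1024
    raise AssertionError("unreachable")


# With the google-cloud-bigquery client, the byte count of a dry run is
# available as QueryJob.total_bytes_processed, for example:
#   job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
#   job = client.query(sql, job_config=job_config)
#   scanned = job.total_bytes_processed
scanned = 250 * 2 ** 30  # pretend the dry run reported 250 GiB
print(format_bytes(scanned))
print(estimate_query_cost_usd(scanned, usd_per_tib=6.25))  # 6.25 is an assumed price
```

Passing the price as a parameter keeps the helper independent of any particular pricing tier.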

#### Analyze User Properties

**Query User ID and Country List**

```python
# User attribute
user_id_list = ga4_table.user_id_list
user_country_list = ga4_table.geo_country_list
```

**Query User ID and Country Distribution**

```python
from pyga4.analytic import UserAnalytic

# UserAnalytic
user_analytic = UserAnalytic(ga4_table)
countries_dist = user_analytic.countries_distribution
userid_dist = user_analytic.user_id_distribution
```
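
To make the idea of a "distribution" concrete, here is a minimal sketch of counting how often each value occurs in a list such as the country list above. This is not `pyGA4`'s implementation, and the return type of `countries_distribution` may differ; `distribution` and `relative_distribution` are hypothetical names, and the input is assumed to be a plain Python list.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def distribution(values: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Count occurrences of each value, most common first."""
    return dict(Counter(values).most_common())


def relative_distribution(values: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Return each value's share of the total (shares sum to 1)."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: count / total for value, count in counts.most_common()}


countries = ["Taiwan", "United States", "Taiwan", "Japan"]
print(distribution(countries))
print(relative_distribution(countries))
```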

#### Analyze Device Properties

```python
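# DeviceAnalytic (assumed to be importable from pyga4.analytic, like UserAnalytic)
from pyga4.analytic import DeviceAnalytic
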
device_analytic = DeviceAnalytic(ga4_table)
mobile_brand_dist = device_analytic.mobile_brand_distribution
```

#### Analyze Events

```python
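# EventAnalytic (assumed to be importable from pyga4.analytic, like UserAnalytic)
from pyga4.analytic import EventAnalytic
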
event_analytic = EventAnalytic(ga4_table)
page_loc_dist = event_analytic.pages_distribution
```
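
For comparison, a page distribution can also be computed with a custom query. In the GA4 export schema, `page_location` is stored inside the repeated `event_params` field, so it has to be extracted with `UNNEST`. The sketch below only builds the SQL string; whether `pages_distribution` works this way internally is not confirmed by this README, and `build_page_views_query` is a hypothetical helper.

```python
import re

# Conservative check so that placeholder values cannot break out of the
# backtick-quoted table reference.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_\-]+")


def build_page_views_query(project_id: str, dataset_name: str, table_id: str) -> str:
    """Build a query that counts page_view events per page_location."""
    for part in (project_id, dataset_name, table_id):
        if not _SAFE_NAME.fullmatch(part):
            raise ValueError(f"unexpected characters in identifier: {part!r}")
    table = f"`{project_id}.{dataset_name}.{table_id}`"
    return f"""
SELECT
  (SELECT value.string_value
   FROM UNNEST(event_params)
   WHERE key = 'page_location') AS page_location,
  COUNT(*) AS views
FROM {table}
WHERE event_name = 'page_view'
GROUP BY page_location
ORDER BY views DESC
""".strip()


query = build_page_views_query("your-project-id", "analytics_xxxx", "events_intraday_20200812")
print(query)
```

The resulting string could then be passed to `ga4_table.query(query)`, ideally with dry run enabled first to check the query usage.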

<a href="#top">Back to top</a>
4 changes: 2 additions & 2 deletions README.zh-TW.md
@@ -32,9 +32,9 @@

Next, we will use a free third-party service to stream the data into Bigquery. For detailed steps, see the [official documentation](https://support.google.com/analytics/answer/9823238?hl=en#zippy=%2Cin-this-article)

If the connection succeeds, you will see tables in Bigquery similar to the following (`analytics_xxxx`):
If the connection succeeds, you will see tables in Bigquery similar to the following (`analytics_xxxx`), [image source](https://analyticscanvas.com/knowledge-base/ga4-bigquery-export-tutorial-002-querying-event-params/):

![](static/imgs/2023-09-21-14-15-55.png)
![https://analyticscanvas.com/knowledge-base/ga4-bigquery-export-tutorial-002-querying-event-params/](static/images/2023-09-21-15-04-30.png)


## Features
Binary file removed static/images/2023-09-21-14-15-55.png
Binary file not shown.
Binary file added static/images/2023-09-21-15-04-30.png