Merge pull request #81 from oceanbase/xiaofeng_branch
Add Chapter 7 of the English version of the getting-started tutorial
liboyang0730 authored Oct 1, 2024
2 parents 7c61f1a + a6010f0 commit 8672b32
Showing 55 changed files with 7,395 additions and 36 deletions.
8 changes: 4 additions & 4 deletions docs/blogs/tech/query-engines.md
@@ -65,7 +65,7 @@ The push model was initially used in stream computing. With the advent of the bi
The preceding figure shows the different control and data flow directions of the pull and push models. As shown, the control flow of the pull model aligns more intuitively with our understanding of query execution. In this model, higher-level operators request and process data from lower-level operators on demand, which is essentially a series of nested function calls. Conversely, the push engine pushes computations from higher-level operators down to the operators that produce the data. Data producers then drive the consumption of this data by the higher-level operators.

To better compare the impact of the pull and push models on code structure, we illustrate the implementation of the next() interface of each operator in the aforementioned query in pseudocode, as shown in the following figure.
![1679571653](/static/img/blog/tech/query_engins/1679571653397.png)
![1679571653](/img/blog/tech/query_engins/1679571653397.png)
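The pull-model structure described above can be sketched in a few lines of Python. This is a hypothetical, minimal illustration (the operator and class names are invented, not OceanBase's actual interfaces): each operator exposes a `next()` method that pulls one row at a time from its child, so the whole plan is driven by nested function calls from the root.

```python
class Scan:
    """Leaf operator: yields rows from an in-memory table."""
    def __init__(self, rows):
        self.it = iter(rows)

    def next(self):
        return next(self.it, None)  # None signals end of data


class Filter:
    """Pulls rows from its child and keeps those matching a predicate."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred

    def next(self):
        row = self.child.next()
        while row is not None and not self.pred(row):
            row = self.child.next()
        return row


class Project:
    """Pulls rows from its child and applies a per-row transform."""
    def __init__(self, child, fn):
        self.child, self.fn = child, fn

    def next(self):
        row = self.child.next()
        return None if row is None else self.fn(row)


# Top-level driver: repeatedly calls next() on the root operator.
plan = Project(Filter(Scan([1, 2, 3, 4]), lambda r: r % 2 == 0),
               lambda r: r * 10)
out = []
row = plan.next()
while row is not None:
    out.append(row)
    row = plan.next()
print(out)  # [20, 40]
```

Note how control flows top-down (driver calls the root's `next()`) while data flows bottom-up, which is exactly the pull-model shape shown in the figure; a push engine would invert this, with `Scan` driving its consumers.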



@@ -95,13 +95,13 @@ Compared to interpreted execution, compiled execution offers the following benefits

In OceanBase V2.0, we use the low-level virtual machine (LLVM) framework to optimize compiled execution for expression operations and Procedural Language (PL) code in the execution engine. Here we introduce the compiled execution of expressions in OceanBase.

![1679571705](/static/img/blog/tech/query_engins/1679571705458.png)
![1679571705](/img/blog/tech/query_engins/1679571705458.png)

> The compilation phase involves three main steps:
> 1\. Intermediate representation (IR) code generation: Consider the expression (c1+c2)\*c1+(c1+c2)\*3, where all operands are of the BIGINT type. By analyzing the semantic tree of the expression, the LLVM CodeGen API generates IR code, as shown in Figure (a).
> ![1679571741](/static/img/blog/tech/query_engins/1679571741482.png)
> ![1679571741](/img/blog/tech/query_engins/1679571741482.png)
> 2\. Code optimization: In the original code, the expression c1+c2 is computed twice. LLVM extracts it as a common subexpression. As shown in Figure (b), the optimized IR code computes c1+c2 only once, and the total number of executed instructions also decreases. If you use interpreted execution for expressions, all intermediate results are materialized in memory. Compiled code, however, allows you to store intermediate results in CPU registers for direct use in the next computation, boosting execution efficiency. LLVM also offers many similar optimizations, which we can use directly to speed up expression computation.
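The effect of the common-subexpression elimination in step 2 can be illustrated in plain Python (a hedged sketch of the optimization idea, not LLVM IR): compute `c1+c2` once, hold it in a temporary (analogous to a CPU register), and reuse it in both multiplications.

```python
def eval_naive(c1, c2):
    # Unoptimized form: the subexpression c1 + c2 is evaluated twice.
    return (c1 + c2) * c1 + (c1 + c2) * 3

def eval_cse(c1, c2):
    # After common-subexpression elimination: c1 + c2 is computed once
    # and kept in a temporary, analogous to holding it in a register.
    t = c1 + c2
    return t * c1 + t * 3

# Both forms yield the same result; the optimized one issues fewer operations.
assert eval_naive(5, 7) == eval_cse(5, 7) == 96
```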
@@ -118,7 +118,7 @@ We compared the performance of several databases in the same test environment by
END) AS result
FROM lineitem;

![1679571812](/static/img/blog/tech/query_engins/1679571812379.png)
![1679571812](/img/blog/tech/query_engins/1679571812379.png)

As shown in the preceding figure, compiled execution offers a significant performance advantage over interpreted execution when dealing with large data volumes. This advantage increases proportionally with the data size. However, compiled execution has its drawbacks:

12 changes: 6 additions & 6 deletions docs/blogs/users/KYE.md
@@ -38,7 +38,7 @@ i. Selection and comparison test of real-time computing engines

We compared three mainstream real-time computing frameworks: Storm, Spark Streaming, and Flink. The figure below shows that both Storm and Flink keep data latency within milliseconds. However, Flink has the edge in state management, unified batch and stream processing, data integration ecosystem, and ease of use. Therefore, Flink was our first choice of computing framework.

![1706665000](/static/img/blog/users/KYE/images/141727164473_.pic.jpg)
![1706665000](/img/blog/users/KYE/images/141727164473_.pic.jpg)

ii. Selection of database services based on business-specific custom benchmark standards

@@ -47,13 +47,13 @@ We started database service selection right after deciding on the computing engine
Considering that vendors would optimize their products based on their actual business scenarios during testing, we did not rely fully on the published information. Instead, based on our specific business analysis needs, we came up with custom benchmark standards, including unified test servers and testing environment, standard data sets and standard SQL statements based on actual waybill analysis scenarios, and feature test sets based on our needs.
Then, we tested and compared the query performance of DB-U (a distributed HTAP database), OceanBase Database, DB-X (a real-time analytical database), Doris, and Trino. In the test, the databases were deployed on three servers, each with 32 CPU cores and a 128 GB SSD, and the largest test table contained 100 million rows, or 35 GB of data. OceanBase Database and DB-X exhibited better performance, as shown in the figure below.

![1706665049](/static/img/blog/users/KYE/images/1706665049385.png)
![1706665049](/img/blog/users/KYE/images/1706665049385.png)

iii. Settling on OceanBase Database after comprehensive consideration

After testing the query performance, we compared the candidate databases in terms of common features, big data ecosystem integration, and maintainability. The results are shown in the figure below. OceanBase Database supports all the evaluated features except Hive integration and federated queries. Although DB-X, Doris, and Trino performed better in big data ecosystem integration, they are not as maintainable as OceanBase Database.

![1706665065](/static/img/blog/users/KYE/images/1706665065246.png)
![1706665065](/img/blog/users/KYE/images/1706665065246.png)

Then, we compared the data write performance of OceanBase Connector, Java Database Connectivity (JDBC) Connector, and DB-X Connector. With the degree of parallelism (DOP) set to 10, OceanBase Connector wrote 10 million rows, each with 280 fields, in 10 minutes. This write speed is roughly the same as that of DB-X Connector, but about two times faster than that of JDBC Connector. These results further proved the advantages of OceanBase Database in database connectivity and data processing performance.

@@ -69,7 +69,7 @@ III. Application of Flink and OceanBase Database in real-time waybill analysis

The following figure shows the logic of our real-time waybill processing public layer. Business data flows through a series of systems, such as the order, tracking, load plan, scheduling, quality control, and financial systems. After aggregation of basic fields and complex join calculations, it is written in real time into the wide table in the waybill domain of the data warehouse detail (DWD) layer, which is stored in OceanBase Database. The data can then be analyzed and queried by KYE ERP through the big data platform.

![1706665196](/static/img/blog/users/KYE/images/1706665196241.png)
![1706665196](/img/blog/users/KYE/images/1706665196241.png)

With the help of OceanBase Change Data Capture (CDC) and the state management feature of Flink, we perform hierarchical calculations and light aggregation at the data warehouse summary (DWS) layer, so as to analyze the tables of time-sensitive services over the last 15 days and the cargo volume of each route. Users can query the aggregated data through the data access service of our big data platform.
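The DWS-layer light aggregation described above can be sketched as follows. This is a hypothetical Python illustration (field names, dates, and volumes are invented; the real pipeline runs on Flink with OceanBase CDC): detail rows are filtered to a 15-day window and summed per route.

```python
from collections import defaultdict
from datetime import date, timedelta

today = date(2024, 1, 20)          # assumed "current" day for the sketch
cutoff = today - timedelta(days=15)

# Detail-layer (DWD) rows: one per waybill, with route and cargo volume.
dwd_rows = [
    {"route": "A-B", "day": date(2024, 1, 18), "volume": 2.0},
    {"route": "A-B", "day": date(2024, 1, 19), "volume": 1.5},
    {"route": "B-C", "day": date(2024, 1, 1),  "volume": 9.9},  # outside window
]

# Light aggregation for the DWS layer: cargo volume per route, last 15 days.
dws = defaultdict(float)
for r in dwd_rows:
    if r["day"] >= cutoff:
        dws[r["route"]] += r["volume"]

print(dict(dws))  # {'A-B': 3.5}
```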

@@ -92,7 +92,7 @@ We researched real-time wide table solutions and came up with the following

Based on the preceding real-time wide table solution, we built our real-time waybill analytics architecture 1.0 (hereinafter referred to as architecture 1.0).

![1706665229](/static/img/blog/users/KYE/images/1706665229603.png)
![1706665229](/img/blog/users/KYE/images/1706665229603.png)

ii. Optimization of the real-time waybill analytics architecture

@@ -114,7 +114,7 @@ However, we must develop a custom HBase CDC service on our own and invest more

Therefore, we have upgraded it to the real-time waybill analytics architecture 2.0 (hereinafter referred to as architecture 2.0).

![1706665268](/static/img/blog/users/KYE/images/KYE_6.png)
![1706665268](/img/blog/users/KYE/images/KYE_6.png)

From the preceding figure, you may have noticed some changes in the implementation logic. In Step 1, Canal is deployed to listen to the MySQL business databases, and the generated binlogs are written to Kafka. In Step 2, Flink SQL tasks are scheduled to read data from Kafka, and table fields that share a primary key but come from different modules are written to OceanBase Database. At this point, the real-time waybill wide table is built and can be used directly by services of the big data platform. In Step 3, OceanBase CDC and the Flink state management feature are used to perform hierarchical calculations. The results are then aggregated, classified, written to OceanBase Database, and provided to various services of the big data platform based on specific business requirements.
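The Step 2 logic — merging fields that share a primary key but come from different upstream modules into one wide-table row — can be sketched as follows. This is a hypothetical Python illustration (field and module names are invented; the real pipeline uses Flink SQL and the OceanBase connector with upsert semantics):

```python
# Simulated change events from different business modules, keyed by waybill_id.
events = [
    {"waybill_id": 1, "module": "order",    "fields": {"price": 12.5}},
    {"waybill_id": 1, "module": "tracking", "fields": {"status": "in_transit"}},
    {"waybill_id": 2, "module": "order",    "fields": {"price": 3.0}},
]

wide_table = {}  # stands in for the OceanBase wide table, keyed by primary key
for ev in events:
    # Upsert semantics: later events for the same key add or overwrite columns,
    # so rows from different modules converge into one wide row.
    row = wide_table.setdefault(ev["waybill_id"], {"waybill_id": ev["waybill_id"]})
    row.update(ev["fields"])

print(wide_table[1])  # {'waybill_id': 1, 'price': 12.5, 'status': 'in_transit'}
```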

@@ -25,4 +25,4 @@ You can contact us in the following ways:

> **Note**
>
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs.
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs (in Chinese).
@@ -1,4 +1,4 @@
---
title: overview of the oceanbase database
title: Overview of OceanBase Database
weight: 1
---
@@ -32,7 +32,6 @@ You can contact us in the following ways:
* Forum on the official website of OceanBase Database Community Edition: [https://ask.oceanbase.com/](https://ask.oceanbase.com/)

* GitHub page for reporting issues of OceanBase Database Community Edition: [https://github.com/oceanbase/oceanbase/issues](https://github.com/oceanbase/oceanbase/issues)
* DingTalk group ID: 33254054

> **Note**
>
Expand Down
@@ -1,4 +1,4 @@
---
title: deploy oceanbase database
title: Deploy OceanBase Database
weight: 2
---
@@ -31,8 +31,6 @@ You can contact us in the following ways:

* GitHub page for reporting issues of OceanBase Database Community Edition: [https://github.com/oceanbase/oceanbase/issues](https://github.com/oceanbase/oceanbase/issues)

* DingTalk group ID: 33254054

> **Note**
>
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs (in Chinese).
@@ -1,4 +1,4 @@
---
title: deploy oceanbase database
title: Test OceanBase Database
weight: 3
---
@@ -31,8 +31,6 @@ You can contact us in the following ways:

* GitHub page for reporting issues of OceanBase Database Community Edition: [https://github.com/oceanbase/oceanbase/issues](https://github.com/oceanbase/oceanbase/issues)

* DingTalk group ID: 33254054

> **Note**
>
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs.
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs (in Chinese).
@@ -0,0 +1,4 @@
---
title: Migrate and synchronize data
weight: 4
---
@@ -25,8 +25,6 @@ You can contact us in the following ways:

* GitHub page for reporting issues of OceanBase Database Community Edition: [https://github.com/oceanbase/oceanbase/issues](https://github.com/oceanbase/oceanbase/issues)

* DingTalk group ID: 33254054

> **Note**
>
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs.
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs (in Chinese).
@@ -1,4 +1,4 @@
---
title: migration and synchronization oceanbase
weight: 4
title: Operation and maintenance
weight: 5
---
@@ -25,8 +25,6 @@ You can contact us in the following ways:

* GitHub page for reporting issues of OceanBase Database Community Edition: [https://github.com/oceanbase/oceanbase/issues](https://github.com/oceanbase/oceanbase/issues)

* DingTalk group ID: 33254054

> **Note**
>
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs.
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs (in Chinese).
@@ -0,0 +1,4 @@
---
title: Using OceanBase for business development
weight: 6
---
@@ -0,0 +1,39 @@
---
title: Introduction
weight: 1
---

# 7.0 Introduction

This chapter describes how to perform performance diagnostics and tuning for OceanBase Database.

## Topics

* 7.1 Overview

* 7.2 Principles of ODP SQL routing

* 7.3 Manage OceanBase Database connections

* 7.4 Perform analysis based on SQL monitoring views

* 7.5 Read and manage SQL execution plans in OceanBase Database

* 7.6 Common SQL tuning methods

* 7.7 Typical scenarios and troubleshooting logic for SQL performance issues

* 7.8 Use SQL Diagnoser to diagnose and analyze SQL performance issues

* 7.9 Use obdiag for diagnostics and analytics

## Contact us

You can contact us in the following ways:

* Forum on the official website of OceanBase Database Community Edition: [https://ask.oceanbase.com/](https://ask.oceanbase.com/)

* GitHub page for reporting issues of OceanBase Database Community Edition: [https://github.com/oceanbase/oceanbase/issues](https://github.com/oceanbase/oceanbase/issues)

> **Note**
>
> You can click [here](https://open.oceanbase.com/course/275) to learn the supporting course of this tutorial — From Beginner to Pro: A Guide for DBAs (in Chinese).
@@ -0,0 +1,10 @@
---
title: Overview
weight: 2
---

# 7.1 Overview

You can diagnose and analyze performance issues of OceanBase Database and tune its performance to improve resource utilization, reduce business costs, minimize the operational risks of application systems, and enhance system stability, so that OceanBase Database can provide services more efficiently.

This chapter describes performance diagnostics, troubleshooting, and SQL tuning for OceanBase Database. For more information, see the following topics: Principles of ODP SQL routing, Perform analysis based on SQL monitoring views, Read and manage SQL execution plans in OceanBase Database, Common SQL tuning methods, Typical scenarios and troubleshooting logic for SQL performance issues, Use SQL Diagnoser to diagnose and analyze SQL performance issues, and Use obdiag for diagnostics and analytics.