Commit 7b22de1

Cluvio: Implement suggestions by CodeRabbit
1 parent 270b9da commit 7b22de1

1 file changed: +36 -22 lines changed

docs/integrate/cluvio/tutorial.md

Lines changed: 36 additions & 22 deletions
@@ -3,7 +3,8 @@
33

44
## Introduction
55

6-
In this tutorial, we'll explore how to leverage the power of [Cluvio](https://www.cluvio.com), a modern data analysis platform with [CrateDB Cloud](https://console.cratedb.cloud/) as the underlying database.
6+
Use [Cluvio] with [CrateDB Cloud] to analyze data and build interactive
7+
dashboards.
78

89
## Prerequisites
910

@@ -15,9 +16,9 @@ In this tutorial, we'll explore how to leverage the power of [Cluvio](https://ww
1516

1617
Deploying a CrateDB cloud cluster has never been easier, simply follow our tutorial [here](https://crate.io/docs/cloud/en/latest/tutorials/cluster-deployment/stripe.html#cluster-deployment-stripe) and you can have a cluster up and running within minutes. We offer a CRFREE plan which offers up to 2 vCPUs, 2 GiB of memory, and 8 GiB of storage completely for free. Ideal for small-scale testing and evaluation purposes.
1718

18-
### Load data into CrateDB cluster
19+
### Load data into CrateDB
1920

20-
In this tutorial we'll use 2 tables as our datasource. [flights](http://stat-computing.org/dataexpo/2009) and [airports](https://openflights.org/data.php) from January of 2008.
21+
In this tutorial, you use two tables, [flights](http://stat-computing.org/dataexpo/2009) and [airports](https://openflights.org/data.php), from January 2008.
2122

2223
#### Create tables
2324

@@ -77,16 +78,16 @@ This creates 2 empty tables in your database. `flights` and `airports`, with the
7778

7879
Now you should import the data into the tables. We will use Console "Import" feature in this example. Use the following links:
7980

80-
* airports - https://s3.amazonaws.com/crate.sampledata/flights/dataset-airports.csv.gz
81-
* flights - https://s3.amazonaws.com/crate.sampledata/flights/dataset-flights.csv.gz
81+
* [Airports CSV (GZIP)]
82+
* [Flights CSV (GZIP)]
8283

8384
![.csv import|690x315](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/0/059227c592c98e2025b64c1d2d22e20c24624359.png)
8485

8586
Make sure to use your pre-created tables in the "Table name" field, otherwise the column types may be created incorrectly. Do this for both .csv files:
8687

8788
![Import summary|690x224](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/3/364db0dc98d56d8e27e274fd072cb5f076cf1110.png)
8889

89-
After this your tables should no longer be empty. `airports` contains 5876 records, and `flights` 150 000 records.
90+
After import, `airports` should contain about 5,876 rows and `flights` about 150,000 rows.
9091

9192
## Connect CrateDB to Cluvio
9293

@@ -118,33 +119,34 @@ A dashboard is the main point of Cluvio. It is a collection of interactive repor
118119
* [Word Cloud Chart](https://docs.cluvio.com/chart-types/word-cloud-chart)
119120
* [Histogram Chart](https://docs.cluvio.com/chart-types/histogram-chart)
120121

121-
![Example dashboard|690x343](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/4/4cc716d7a71c91476dfe2affa55881d39afe5d93.png)
122+
![Example dashboard|690x343](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/4/4cc716d7a71c91476dfe2affa55881d39afe5d93.png){width=800px}
122123

123124
Now, let's create some and see how Cluvio works. Head to **[Dashboards](https://app.cluvio.com/dashboards)** -> `New Dashboard`. After naming your Dashboard, you can create your first report. Click the `New report` in the upper right.
124125

125126
### Number of flights and delays
126127

127-
The first piece information you might be interested in, for a given period, is the number of flights and average delays of departures and arrivals. This is the code for this report:
128+
The first piece of information for a given period is the number of flights and
129+
the average departure and arrival delays. Use this query:
128130

129-
```
131+
```sql
130132
SELECT
131133
COUNT(*) AS "Number of flights",
132134
AVG(dep_delay) AS "Average Departure Delay",
133135
AVG(arr_delay) AS "Average Arrival Delay"
134136
FROM doc.flights
135-
ORDER BY 1
137+
ORDER BY 1
136138
```
137139
This is a pretty simple query that counts the number of rows in the `flights` as the number of flights, and averages values in the `dep_delay` and `arr_delay` for the departure delays and arrival delays respectively.
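
The behavior of `COUNT(*)` and `AVG` described above can be checked locally with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for CrateDB, and the four rows are hypothetical, not taken from the real dataset. It assumes that some flights may have no delay value recorded. In standard SQL, `AVG` skips `NULL` values while `COUNT(*)` counts every row.

```python
import sqlite3

# In-memory SQLite table standing in for CrateDB's doc.flights (assumption:
# only the two delay columns are modeled; the rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (dep_delay INTEGER, arr_delay INTEGER)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [(10, 20), (0, -5), (None, None), (20, 15)],  # one row has no delay data
)

# Same shape as the report query: one row with a count and two averages.
row = conn.execute(
    """
    SELECT COUNT(*)       AS number_of_flights,
           AVG(dep_delay) AS avg_dep_delay,
           AVG(arr_delay) AS avg_arr_delay
    FROM flights
    """
).fetchone()

number_of_flights, avg_dep_delay, avg_arr_delay = row
print(number_of_flights, avg_dep_delay, avg_arr_delay)
```

The count includes the row without delay data, but both averages are computed over the three rows that have values. Because the query returns a single row, the `ORDER BY 1` clause in the report has no visible effect.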
138140

139-
![Number of flights and delays|690x117](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/4/4841404a21b56cb1e5b92736af8b79656b0912ec.png){width=800}
141+
![Number of flights and delays|690x117](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/4/4841404a21b56cb1e5b92736af8b79656b0912ec.png){width=800px}
140142

141-
To see the information displayed this way, you need to switch to "Number" chart after running query.
143+
After running the query, switch the visualization to the “Number” chart.
142144

143145
### Country distribution
144146

145147
This query looks at the country distribution in the `airports` table:
146148

147-
```
149+
```sql
148150
SELECT country,
149151
COUNT(1)
150152
FROM doc.airports
@@ -154,15 +156,15 @@ ORDER BY 2 DESC
154156

155157
In this one, it's suitable to use pie chart to better see the distribution. We also used the `Value(%)` option for the legend, and edited the legend to show up to 25 values (countries).
156158

157-
![Country distribution|690x452](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/2/2f11e42d61e93395267f847b3ee91d5be0d076f9.png){width=800}
159+
![Country distribution|690x452](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/2/2f11e42d61e93395267f847b3ee91d5be0d076f9.png){width=800px}
158160

159161
## Filters
160162

161163
[Filters](https://app.cluvio.com/settings/filters) offer a great way to quickly specify the condition under which you want to display your data.
162164

163165
In the `flights` table in `day_of_week` column 1 represents Monday, 2 means Tuesday, etc. Using that, we can create a filter to display data for a specific day of the week without changing the SQL in our reports.
164166

165-
```
167+
```sql
166168
VALUES
167169
(1, 'Monday'),
168170
(2, 'Tuesday'),
@@ -176,17 +178,17 @@ ORDER BY 1
176178

177179
Now we can filter the data by day of the week:
178180

179-
![Using filter to display data for specific day of the week|690x255](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/9/90335d44316d329ebe6d70a6a63879dec52ee5e8.png){width=800}
181+
![Using filter to display data for specific day of the week|690x255](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/9/90335d44316d329ebe6d70a6a63879dec52ee5e8.png){width=800px}
180182

181-
Find out more about filters [here](https://docs.cluvio.com/filters/overview).
183+
Learn more in the [Cluvio Filters overview](https://docs.cluvio.com/filters/overview).
182184

183185
## SQL snippets
184186

185187
SQL snippets are small reusable pieces of code that can make your work easier within larger dataset. They are managed [here](https://app.cluvio.com/settings/sql-snippets).
186188

187189
We used them to create JOIN statements:
188190

189-
```
191+
```sql
190192
JOIN doc.airports AS origin_airport ON flights.origin = origin_airport.code
191193
JOIN doc.airports AS dest_airport ON flights.dest = dest_airport.code
192194
```
@@ -195,7 +197,7 @@ This snippet creates two joins between the `flights` and `airports` tables, alia
195197

196198
Then create a report using the snippet:
197199

198-
```
200+
```sql
199201
SELECT flights.year,
200202
flights.month,
201203
origin_airport.city AS origin_city,
@@ -209,10 +211,22 @@ ORDER BY number_of_flights DESC
209211
LIMIT 100;
210212
```
211213

212-
Using the SQL snippets and filters, we can quickly find out what is the most popular destination departing from Los Angeles (LAX) on a Tuesday. Pretty cool.
214+
Using SQL snippets and filters, you can quickly find the most popular
215+
destination departing from Los Angeles (LAX) on a Tuesday.
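
To illustrate how the two joins and a day-of-week condition combine, here is a minimal sketch that runs locally with Python's built-in `sqlite3` module instead of CrateDB. The sample rows, the reduced column set, and the hard-coded `WHERE` clause are assumptions for illustration. In Cluvio, the joins would come from the SQL snippet and the day of the week from the dashboard filter.

```python
import sqlite3

# In-memory stand-ins for doc.airports and doc.flights (hypothetical rows,
# reduced to the columns needed for this example).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE airports (code TEXT, city TEXT);
    CREATE TABLE flights (origin TEXT, dest TEXT, day_of_week INTEGER);
    INSERT INTO airports VALUES
      ('LAX', 'Los Angeles'), ('SFO', 'San Francisco'), ('JFK', 'New York');
    INSERT INTO flights VALUES
      ('LAX', 'SFO', 2), ('LAX', 'SFO', 2), ('LAX', 'JFK', 2),
      ('LAX', 'JFK', 1), ('SFO', 'LAX', 2);
    """
)

# The two JOIN lines mirror the snippet from the tutorial; the WHERE clause
# hard-codes what the filter would supply (2 = Tuesday).
query = """
SELECT dest_airport.city AS dest_city,
       COUNT(*)          AS number_of_flights
FROM flights
JOIN airports AS origin_airport ON flights.origin = origin_airport.code
JOIN airports AS dest_airport ON flights.dest = dest_airport.code
WHERE origin_airport.code = 'LAX'
  AND flights.day_of_week = 2
GROUP BY dest_airport.city
ORDER BY number_of_flights DESC
LIMIT 100
"""

rows = conn.execute(query).fetchall()
for dest_city, number_of_flights in rows:
    print(dest_city, number_of_flights)
```

With this sample data, San Francisco ranks first with two Tuesday departures from LAX, followed by New York with one.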
213216

214-
![Popular destinations|690x309](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/7/7257af47c58215459e1fe2de135a966c19fedbd5.png)
217+
![Popular destinations|690x309](https://us1.discourse-cdn.com/flex020/uploads/crate/original/2X/7/7257af47c58215459e1fe2de135a966c19fedbd5.png){width=800px}
215218

216219
## Conclusion
217220

218-
That's it for this tutorial. If using Cluvio could help you make sense of your data, feel free to head to [Cloud Console](https://console.cratedb.cloud/), connect your cluster to [Cluvio](https://app.cluvio.com/) and get started! Make sure to visit their [documentation](https://docs.cluvio.com/) to explore all the features.
221+
That’s it for this tutorial. Get started in the [CrateDB Cloud Console],
222+
connect your cluster to [Cluvio], and begin analyzing your data. Explore
223+
features in the [Cluvio documentation].
224+
225+
226+
[Airports CSV (GZIP)]: https://s3.amazonaws.com/crate.sampledata/flights/dataset-airports.csv.gz
227+
[Flights CSV (GZIP)]: https://s3.amazonaws.com/crate.sampledata/flights/dataset-flights.csv.gz
228+
229+
[Cluvio]: https://www.cluvio.com/
230+
[Cluvio documentation]: https://docs.cluvio.com/
231+
[CrateDB Cloud]: https://console.cratedb.cloud/
232+
[CrateDB Cloud Console]: https://console.cratedb.cloud/
