add the dat tables to the readme (#44)
* add the dat tables to the readme

* remove print statements
MrPowers authored Jan 29, 2024
1 parent 96e1950 commit 93ada65
Showing 1 changed file with 173 additions and 0 deletions.
173 changes: 173 additions & 0 deletions README.md
@@ -60,6 +60,179 @@ For an example implementation of this, see the example PySpark tests in `tests/p

TBD.

## Generated tables

**all_primitive_types**

Table containing all non-nested types.

```
+----+-----+-----+-----+----+-------+-------+-----+-------------+-------+----------+-------------------+
|utf8|int64|int32|int16|int8|float32|float64| bool|       binary|decimal|    date32|          timestamp|
+----+-----+-----+-----+----+-------+-------+-----+-------------+-------+----------+-------------------+
|   0|    0|    0|    0|   0|    0.0|    0.0| true|           []| 10.000|1970-01-01|1970-01-01 00:00:00|
|   1|    1|    1|    1|   1|    1.0|    1.0|false|         [00]| 11.000|1970-01-02|1970-01-01 01:00:00|
|   2|    2|    2|    2|   2|    2.0|    2.0| true|      [00 00]| 12.000|1970-01-03|1970-01-01 02:00:00|
|   3|    3|    3|    3|   3|    3.0|    3.0|false|   [00 00 00]| 13.000|1970-01-04|1970-01-01 03:00:00|
|   4|    4|    4|    4|   4|    4.0|    4.0| true|[00 00 00 00]| 14.000|1970-01-05|1970-01-01 04:00:00|
+----+-----+-----+-----+----+-------+-------+-----+-------------+-------+----------+-------------------+
```
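The rows above appear to follow a simple per-index pattern. The sketch below reproduces that pattern in plain Python, as an illustration only: `primitive_row` is a hypothetical helper inferred from the values shown, not the generator this repository uses, and it assumes the timestamps are displayed in UTC.

```python
from datetime import date, datetime, timedelta
from decimal import Decimal


def primitive_row(i: int) -> dict:
    """Build row i following the pattern visible in the table (an inference)."""
    return {
        "utf8": str(i),
        "int64": i,
        "int32": i,
        "int16": i,
        "int8": i,
        "float32": float(i),
        "float64": float(i),
        "bool": i % 2 == 0,  # alternates true, false, true, ...
        "binary": bytes(i),  # i zero bytes: b"", b"\x00", ...
        "decimal": Decimal(10 + i).quantize(Decimal("0.001")),  # scale of 3
        "date32": date(1970, 1, 1) + timedelta(days=i),  # one day per row
        "timestamp": datetime(1970, 1, 1) + timedelta(hours=i),  # one hour per row
    }


rows = [primitive_row(i) for i in range(5)]
```

A connector test could compare the values it reads against `rows`, after converting its own types to these Python equivalents.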

**basic_append**

A basic table with two append writes.

```
+------+------+-------+
|letter|number|a_float|
+------+------+-------+
|     a|     1|    1.1|
|     b|     2|    2.2|
|     c|     3|    3.3|
|     d|     4|    4.4|
|     e|     5|    5.5|
+------+------+-------+
```

**basic_partitioned**

A basic partitioned table.

```
+------+------+-------+
|letter|number|a_float|
+------+------+-------+
|     b|     2|    2.2|
|  NULL|     6|    6.6|
|     c|     3|    3.3|
|     a|     1|    1.1|
|     a|     4|    4.4|
|     e|     5|    5.5|
+------+------+-------+
```

**multi_partitioned**

Multiple levels of partitioning, with boolean, timestamp, and decimal partition columns.

```
+-----+-------------------+--------------------+---+
| bool|               time|              amount|int|
+-----+-------------------+--------------------+---+
|false|1970-01-02 08:45:00|12.00000000000000...|  3|
| true|1970-01-01 00:00:00|200.0000000000000...|  1|
| true|1970-01-01 12:30:00|200.0000000000000...|  2|
+-----+-------------------+--------------------+---+
```

**multi_partitioned_2**

Multiple levels of partitioning, with boolean, timestamp, and decimal partition columns.

```
+-----+-------------------+--------------------+---+
| bool|               time|              amount|int|
+-----+-------------------+--------------------+---+
|false|1970-01-02 08:45:00|12.00000000000000...|  3|
| true|1970-01-01 00:00:00|200.0000000000000...|  1|
| true|1970-01-01 12:30:00|200.0000000000000...|  2|
+-----+-------------------+--------------------+---+
```

**nested_types**

Table containing various nested types.

```
+---+------------+---------------+--------------------+
| pk|      struct|          array|                 map|
+---+------------+---------------+--------------------+
|  0| {0.0, true}|            [0]|                  {}|
|  1|{1.0, false}|         [0, 1]|            {0 -> 0}|
|  2| {2.0, true}|      [0, 1, 2]|    {0 -> 0, 1 -> 1}|
|  3|{3.0, false}|   [0, 1, 2, 3]|{0 -> 0, 1 -> 1, ...|
|  4| {4.0, true}|[0, 1, 2, 3, 4]|{0 -> 0, 1 -> 1, ...|
+---+------------+---------------+--------------------+
```
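The nested values also look like they are derived from the primary key. The following is a minimal sketch of that apparent pattern, not the repository's generator: `nested_row` is a hypothetical helper, the struct is modeled as a plain tuple because its field names are not shown, the map key type is assumed to be integer, and the map entries for rows 3 and 4 are assumed to continue the pattern that the truncated display suggests.

```python
def nested_row(i: int) -> dict:
    """Build row i following the pattern visible in the table (an inference)."""
    return {
        "pk": i,
        "struct": (float(i), i % 2 == 0),  # (float, bool) pair
        "array": list(range(i + 1)),  # 0 through i inclusive
        "map": {j: j for j in range(i)},  # i entries, each key mapped to itself
    }


nested_rows = [nested_row(i) for i in range(5)]
```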

**no_replay**

Table with a checkpoint and prior commits cleaned up.

```
+------+---+----------+
|letter|int|      date|
+------+---+----------+
|     a| 93|1975-06-01|
|     b|753|2012-05-01|
|     c|620|1983-10-01|
|     a|595|2013-03-01|
|  NULL|653|1995-12-01|
+------+---+----------+
```

**no_stats**

Table with no stats.

```
+------+---+----------+
|letter|int|      date|
+------+---+----------+
|     a| 93|1975-06-01|
|     b|753|2012-05-01|
|     c|620|1983-10-01|
|     a|595|2013-03-01|
|  NULL|653|1995-12-01|
+------+---+----------+
```

**stats_as_structs**

Table with a checkpoint whose stats are written only as a struct (not JSON).

```
+------+---+----------+
|letter|int|      date|
+------+---+----------+
|     a| 93|1975-06-01|
|     b|753|2012-05-01|
|     c|620|1983-10-01|
|     a|595|2013-03-01|
|  NULL|653|1995-12-01|
+------+---+----------+
```

**with_checkpoint**

Table with a checkpoint.

```
+------+---+----------+
|letter|int|      date|
+------+---+----------+
|     a| 93|1975-06-01|
|     b|753|2012-05-01|
|     c|620|1983-10-01|
|     a|595|2013-03-01|
|  NULL|653|1995-12-01|
+------+---+----------+
```

**with_schema_change**

Table whose schema was changed by a write with `overwriteSchema=True`.

```
+----+----+
|num1|num2|
+----+----+
|  22|  33|
|  44|  55|
|  66|  77|
+----+----+
```

## Models

The test cases contain several JSON files to be read by connector tests. To make it easier to read them, we provide [JSON schemas](https://json-schema.org/) for each of the file types in `out/schemas/`. They can be read to understand