![](https://private-user-images.githubusercontent.com/11178512/251612237-036dc50c-57bd-4439-855b-3c09eeba72ab.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjExMzAyNjgsIm5iZiI6MTcyMTEyOTk2OCwicGF0aCI6Ii8xMTE3ODUxMi8yNTE2MTIyMzctMDM2ZGM1MGMtNTdiZC00NDM5LTg1NWItM2MwOWVlYmE3MmFiLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MTYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzE2VDExMzkyOFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWM3MTg4MDFkNWViZWE0M2QzNWUwMzcyOThiNWQzZWEzMjA2YWM0MWJlYmY2MzExMWExMGE4YWZjMzkwNWNlNTEmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.fw48JO1uBN7R7EOyOogFiMi_cyDw9NEWwKQz5Dt5rTw)
Parquimetro is a small (10 MB), simple tool for interacting with Parquet files, built around parquet-go.
To inspect a Parquet file's schema:
parquimetro schema ~/path/to/file.parquet
Options available:
- Format: `-f` or `--format` output format, `json` or `go` (default `json`).
- Tags: `--tags` show Go struct tags (only available when the format is `go`).
- Threads: `-t` or `--threads` number of threads to use (default 1).
The schema command works well together with `jq`:
parquimetro schema ~/path/to/file.parquet | jq .
To read Parquet files:
parquimetro read ~/path/to/file.parquet
Options available:
- Count: `-c` or `--count` number of rows to show (default 25).
- Skip: `-s` or `--skip` number of rows to skip from the beginning.
- Threads: `-t` or `--threads` number of threads to use (default 1).
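The interaction between `--skip` and `--count` can be pictured as a window over the rows. The following is a minimal sketch, assuming the skip is applied first and the count limits what remains; the `window` helper is hypothetical and is not taken from Parquimetro's source.

```go
package main

import "fmt"

// window is a hypothetical helper: it drops the first `skip` rows and
// returns at most `count` of the rows that follow.
func window(rows []string, skip, count int) []string {
	if skip < 0 {
		skip = 0
	}
	if count < 0 {
		count = 0
	}
	if skip >= len(rows) {
		return nil // nothing left after skipping
	}
	end := skip + count
	if end > len(rows) {
		end = len(rows) // fewer rows remain than were requested
	}
	return rows[skip:end]
}

func main() {
	rows := []string{"r0", "r1", "r2", "r3", "r4"}
	// Roughly what `--skip 1 --count 2` would select under this assumption.
	fmt.Println(window(rows, 1, 2))
}
```

Under this reading, paging through a file means keeping `--count` fixed and increasing `--skip` by that amount on each call.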
Like schema, the read command works well together with `jq`:
parquimetro read ~/path/to/file.parquet | jq .
To get size information:
parquimetro size ~/path/to/file.parquet
Options available:
- Uncompressed: `--uncompressed` show the uncompressed size (default `true`).
- Compressed: `--compressed` show the compressed size (default `false`).
- Pretty: `--pretty` show a human-readable size, choosing the most suitable unit (default `true`).
- Format: `--format` or `-f` unit used to print the output. Accepted values: `KB`, `MB`, `GB`, `TB`. This has lower priority than `--pretty`, so set `--pretty=false` to use it.
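The priority between `--pretty` and `--format` can be illustrated with a short Go sketch. It is not Parquimetro's implementation: the function names are hypothetical, and it assumes 1 KB = 1024 bytes and two decimal places, neither of which is stated above.

```go
package main

import "fmt"

// units lists the accepted formats, smallest first.
var units = []string{"KB", "MB", "GB", "TB"}

// formatSize converts a byte count to a fixed unit (assumes 1 KB = 1024 B).
func formatSize(bytes int64, unit string) (string, error) {
	value := float64(bytes)
	for _, u := range units {
		value /= 1024
		if u == unit {
			return fmt.Sprintf("%.2f %s", value, u), nil
		}
	}
	return "", fmt.Errorf("unknown unit %q", unit)
}

// prettySize picks the largest unit that keeps the value at or above 1.
func prettySize(bytes int64) string {
	if bytes < 1024 {
		return fmt.Sprintf("%d B", bytes)
	}
	value, unit := float64(bytes), "B"
	for _, u := range units {
		if value < 1024 {
			break
		}
		value /= 1024
		unit = u
	}
	return fmt.Sprintf("%.2f %s", value, unit)
}

// render mirrors the documented priority: pretty wins over an explicit unit.
func render(bytes int64, pretty bool, unit string) (string, error) {
	if pretty {
		return prettySize(bytes), nil
	}
	return formatSize(bytes, unit)
}

func main() {
	out, _ := render(1536, true, "MB") // unit is ignored while pretty is true
	fmt.Println(out)
	out, _ = render(1048576, false, "KB") // explicit unit is honoured
	fmt.Println(out)
}
```

In this sketch the explicit unit only takes effect once `pretty` is false, which is why the option list asks for `--pretty=false` alongside `--format`.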
To install, if you have Go installed:
go install github.com/otaviohenrique/parquimetro@latest
Alternatively, download a release from the releases page and install it.