Skip to content

hderms/qsv

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

qsv

Performant CLI tool to query CSVs through SQL

Installation

After cloning the repository, you can install a binary locally using cargo install --path .

Features

Usage/Features

The intention is for all SQLite syntax to be supported with CSVs as the data source including: joins, subqueries, CTEs, unions, etc...

Simple queries

qsv supports syntactically valid SQLite queries run on CSV data:

qsv query "SELECT * FROM foo.csv AS foo INNER JOIN bar.csv AS bar ON (foo.id = bar.foo_id);"

qsv query "WITH ages(age) AS (SELECT age FROM testdata/people.csv) SELECT * FROM testdata/people.csv AS people INNER JOIN ages ON (people.age = ages.age);"

you can escape spaces in a filename like so (you may have to escape backticks depending on shell):

qsv query "select * from `testdata/occupations with spaces.csv`"

you can load from gzipped CSV data:

qsv query "select * from testdata/people.csv.gz"

Statistical analysis

qsv can run some limited statistical analyses on a CSV given to it, returning things like the mean, standard deviation, top 10 most common values for each column:

qsv stats testdata/statistical.csv

SQLite user defined functions

In order to make some common data analysis tasks simpler, qsv has a number of user defined functions added to SQLite. If there's something you'd like added, please request it as a Github Issue.

  • md5(text)
  • sqrt(real)
  • stddev(real)
  • mean(real)

Options

  • --delimiter= to set a custom delimiter in the CSVs. Only set globally on the query
  • --textonly force all columns to be inferred as strings/text
  • --trim trim fields in CSVs in case there is additional whitespace. Will not remove whitespace from the middle of a string
  • --output-header outputs the header alongside the results of the query. Off by default to make the output more in line with SQL