Generate Temp Table SQL

Generate Temp Table SQL is a Python package that generates SQL statements for creating a temporary table and inserting data from a CSV file. It's useful when you need to move data between disconnected databases and data warehouses. You can now simply unload a CSV, turn that CSV into SQL statements creating a temp table and inserting data with a CLI command, and copy those SQL statements into your query editor so you can start using the data in a different warehouse.

Why did I build this? I often need to move data between disconnected warehouses. E.g. when I need to join data in my finance warehouse with data in my operational warehouse to see how a feature is linked to customer spend. I can use excel to convert a CSV to SQL statements...but it's a PITA.

With generate_temp_table_sql, I can now move the data in seconds. I simply download the CSV, run generate-tt-sql csv_name.csv in my terminal, and copy it into my query editor. It's immediately available to query so I can go straight to analysis.

If you have the same problem, this will save you a ton of time!

Also available in PyPi.

Features

Load data from a CSV file
Generate a CREATE TEMP TABLE SQL statement
Generate INSERT INTO SQL statements for the data
Command-line interface (CLI) for easy usage

Installation

Prerequisites

Python 3.6 or higher
pandas library

Installing

Clone the repository:

git clone https://github.com/rywaldor/generate_temp_table_sql.git
cd generate_temp_table_sql

Install the package locally:
```
pip install -e .
```

Usage

Command-Line Interface (CLI)

After installing the package, you can use the generate-tt-sql command to generate SQL statements from a CSV file.

Basic Usage

To generate SQL statements and save them to a file:

generate-tt-sql path/to/your/file.csv

Additional Options

--o The path to the output SQL file.  Defaults to the director you call the command in.
--overwrite: Allow overwriting the output file if it exists.
--table_name: Specify the name of the temporary table to create.
--column_type: Specify the data type of the columns in the temporary table. Defaults to TEXT which works for Redshift and Snowflake. Use STRING for BigQuery.
--batch_size: Specify the number of rows inserted per insert statement.  If present, it creates multiple insert statements based on the batch size specified.

Example

Assume you have a CSV file example.csv with the following content:

name,age,city
John,30,New York
Jane,25,Los Angeles

Run the following command to generate SQL statements:

generate-tt-sql example.csv -o output.sql --table_name my_temp_table --column_type STRING

The output.sql file will contain:

CREATE TEMP TABLE my_temp_table (
    name STRING,
    age STRING,
    city STRING
);

--Insert Data SQL:
INSERT INTO my_temp_table (name, age, city) VALUES 
    ('John', '30', 'New York'),
    ('Jane', '25', 'Los Angeles');

All arguments are optional except the csv

Running Tests

To run the tests, use the following command:

python -m unittest discover -s tests

Contributing

Contributions are welcome! Please follow these steps:

Fork the repository
Create a new branch (git checkout -b feature-branch)
Commit your changes (git commit -am 'Add new feature')
Push to the branch (git push origin feature-branch)
Create a new Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Ryan Waldorf - ryan@ryanwaldorf.com
GitHub LinkedIn

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Generate Temp Table SQL

Features

Installation

Prerequisites

Installing

Usage

Command-Line Interface (CLI)

Basic Usage

Additional Options

Example

Running Tests

Contributing

License

Author

Files

README.md

Latest commit

History

README.md

File metadata and controls

Generate Temp Table SQL

Features

Installation

Prerequisites

Installing

Usage

Command-Line Interface (CLI)

Basic Usage

Additional Options

Example

Running Tests

Contributing

License

Author