Welcome to the official repository of the book Python Polars: The Definitive Guide by Jeroen Janssens and Thijs Nieuwdorp. The book is still being written and is scheduled to be published by O'Reilly in February 2025.
- Read the Early Release version on the O'Reilly Learning Platform.
- Become a member of the Polars Discord server to discuss anything related to Polars and to download Early Release PDFs from the `#book-the-definitive-guide` channel.
Get ready to speed up your data analysis and start working with larger-than-memory datasets. Polars offers a blazingly fast, multi-threaded, elegant API for data loading, manipulation, and processing. Authors Jeroen Janssens and Thijs Nieuwdorp walk you through every aspect of Python Polars as they tackle practical use cases using real-world datasets. You’ll not only learn the syntax, but also understand the underlying concepts. You don’t need to have any experience with Pandas or Spark, but if you do, this book will help you make a smooth transition.
With this definitive guide at your side, you’ll be able to:
- Process larger-than-memory datasets at record speed
- Apply the eager, lazy, and streaming APIs of Polars and decide when to use which
- Transition smoothly from Pandas or Spark to Polars
- Integrate Polars into your existing codebase
- Work with Arrow and Parquet to efficiently read and write data
- Translate complex ETL tasks into efficient and elegant queries
Note that this outline is subject to change.
- Foreword by Ritchie Vink, creator of Polars
- Acknowledgements
- Introducing Polars
- First Steps
- Transitioning from Pandas to Polars
- Data Types and Data Structures
- Eager and Lazy APIs
- Reading and Writing Data
- Beginning Expressions
- Continuing Expressions
- Combining Expressions
- Selecting and Creating Columns
- Filtering and Sorting Rows
- Working with Special Data Types
- Summarizing and Aggregating
- Joining and Concatenating
- Reshaping
- Creating Visualizations
- Extending Polars
- Polars Internals