📊 Pandas Dataset Explorer

This project is designed to help you systematically explore and analyze datasets using Python and pandas, following the structure of the Real Python tutorial “Using pandas and Python to Explore Your Dataset.” It provides scripts, notebooks, and examples that demonstrate:

📥 Environment setup
- Install Python 3, pandas, matplotlib (and optionally Jupyter/Anaconda).
- Sample install commands using pip or conda.
🧰 Data ingestion
- Use pd.read_csv() (or read_json(), read_html(), etc.) to load data.
- Example uses real-world data: e.g. NBA results CSV.
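
As a minimal sketch of the loading step, the snippet below reads a small CSV from an in-memory string so it runs without any file on disk. The column names and values are hypothetical stand-ins, not the actual NBA dataset; with a real file you would pass its path to `pd.read_csv()` instead.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = """game_id,team_id,pts,opp_pts
1,BOS,102,99
2,LAL,95,110
3,BOS,120,101
"""

# read_csv accepts a file path, a URL, or any file-like object.
games = pd.read_csv(io.StringIO(csv_text))
print(games)
```
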
🔍 Initial data inspection
- Use .head(), .tail(), .info(), .shape, and len() to get a quick overview.
- Adjust display settings (display.max_columns, display.precision) with pd.set_option() for better visibility.
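
The inspection calls above might look like the following sketch. The DataFrame is a small hypothetical example; the same calls apply to any dataset you have loaded.

```python
import pandas as pd

# Hypothetical data used only to illustrate the inspection calls.
df = pd.DataFrame(
    {
        "team": ["BOS", "LAL", "BOS", "NYK"],
        "pts": [102, 95, 120, 99],
        "win_prob": [0.61234, 0.48765, 0.70111, 0.50501],
    }
)

# Show every column and round floats to two decimals when printing.
pd.set_option("display.max_columns", None)
pd.set_option("display.precision", 2)

print(df.head(2))   # first two rows
print(df.tail(2))   # last two rows
df.info()           # column dtypes and non-null counts
print(df.shape)     # (rows, columns)
print(len(df))      # number of rows
```
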
🗂️ Data structure understanding
- Explore Series and DataFrame basics.
- Compare indexing methods: bracket [], .loc, .iloc.
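
To make the difference between the three indexing styles concrete, here is a short sketch on hypothetical data with a string index. Brackets select columns, `.loc` works with labels, and `.iloc` works with integer positions.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
points = pd.Series([102, 95, 120], index=["g1", "g2", "g3"], name="pts")

# A DataFrame is a table whose columns are Series sharing one index.
df = pd.DataFrame(
    {"team": ["BOS", "LAL", "BOS"], "pts": [102, 95, 120]},
    index=["g1", "g2", "g3"],
)

col = df["pts"]                 # brackets: select a column as a Series
by_label = df.loc["g2", "pts"]  # .loc: row label and column label
by_pos = df.iloc[1, 1]          # .iloc: row position and column position

print(col.tolist(), by_label, by_pos)
```
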
❓ Querying & filtering
- Filter rows via conditions (e.g., df[df["col"] > X]).
- Use .loc, .iloc to select specific rows/columns.
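
A possible sketch of these filtering patterns, again on hypothetical data. Note that combined conditions need parentheses and the `&` / `|` operators rather than `and` / `or`.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {
        "team": ["BOS", "LAL", "BOS", "NYK"],
        "pts": [102, 95, 120, 99],
        "opp_pts": [99, 110, 101, 99],
    }
)

# Boolean mask: keep rows where the condition is True.
high_scoring = df[df["pts"] > 100]

# .loc with a combined condition and a list of columns.
bos_high = df.loc[(df["pts"] > 100) & (df["team"] == "BOS"), ["team", "pts"]]

# .iloc with positional slices: first two rows, first two columns.
top_left = df.iloc[:2, :2]

print(high_scoring)
print(bos_high)
print(top_left)
```
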
📊 Grouping & aggregating
- Summarize data using .groupby(), .sum(), .mean(), .count().
- Combine datasets (concat, merge) when working with multiple sources.
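
The grouping and combining steps could be sketched as below. All tables and column names here are hypothetical; `how="left"` is one reasonable choice for the merge, not the only one.

```python
import pandas as pd

# Hypothetical game results.
games = pd.DataFrame(
    {"team": ["BOS", "LAL", "BOS", "LAL"], "pts": [102, 95, 120, 110]}
)

# Group by team and compute several aggregates at once.
summary = games.groupby("team")["pts"].agg(["sum", "mean", "count"])
print(summary)

# merge: attach columns from another table using a shared key.
cities = pd.DataFrame({"team": ["BOS", "LAL"], "city": ["Boston", "Los Angeles"]})
merged = games.merge(cities, on="team", how="left")

# concat: stack tables with the same columns on top of each other.
more_games = pd.DataFrame({"team": ["NYK"], "pts": [99]})
combined = pd.concat([games, more_games], ignore_index=True)

print(merged)
print(combined)
```
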
🧼 Cleaning & casting
- Detect and handle missing/inconsistent/invalid values.
- Convert types (df["col"] = df["col"].astype(...)) as needed.
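
One way the cleaning and casting steps might look, assuming a hypothetical table where a numeric column arrived as text and some entries are missing or invalid. Dropping incomplete rows is only one option; filling values with `.fillna()` may suit other datasets better.

```python
import pandas as pd

# Hypothetical raw data: "pts" is stored as text and contains an invalid entry.
raw = pd.DataFrame(
    {
        "team": ["BOS", "LAL", None, "BOS"],
        "pts": ["102", "95", "88", "n/a"],
    }
)

# Count missing values per column.
print(raw.isna().sum())

# Convert text to numbers; values that cannot be parsed become NaN.
raw["pts"] = pd.to_numeric(raw["pts"], errors="coerce")

# Drop rows with any missing value, then cast to the final types.
clean = raw.dropna().copy()
clean["pts"] = clean["pts"].astype("int64")
clean["team"] = clean["team"].astype("category")

print(clean.dtypes)
```
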
📈 Visualization
- Use pandas' built-in .plot() (histograms, scatter, bar, etc.) to visualize distributions, trends, and categories.
- Leverage matplotlib integration within Jupyter or standalone scripts.
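
As a rough sketch of the plotting step in a standalone script, the example below draws a histogram and a bar chart from hypothetical data and saves them to a file. The `Agg` backend and the output filename `points_overview.png` are assumptions made so the script runs without a display; in Jupyter the figures render inline and neither is needed.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, assumed for headless runs

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data for illustration.
games = pd.DataFrame(
    {
        "team": ["BOS", "LAL", "BOS", "LAL", "BOS"],
        "pts": [102, 95, 120, 110, 98],
    }
)

fig, (ax_hist, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

# Distribution of a numeric column.
games["pts"].plot(kind="hist", bins=5, ax=ax_hist, title="Points per game")

# Mean of a numeric column per category.
games.groupby("team")["pts"].mean().plot(
    kind="bar", ax=ax_bar, title="Mean points by team"
)

fig.tight_layout()
fig.savefig("points_overview.png")  # hypothetical output filename
```
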

📁 What's Included

environment_setup/ – shell scripts and instructions to configure your Python environment.
data/ – sample datasets (e.g. NBA ELO, FiveThirtyEight, etc.).
notebooks/ – Jupyter notebooks illustrating each key step:
1. Overview & loading
2. Inspection & display settings
3. Indexing & selection
4. Querying & filtering
5. GroupBy and aggregation
6. Cleaning & typing
7. Merging datasets
8. Visualizing data
scripts/ – Python files that reproduce key tasks outside Jupyter.
requirements.txt – minimal dependencies (pandas, matplotlib, Jupyter optional).
README.md – (this file).

📝 How to Use

Clone the repo
```
git clone <repo-url>
cd <repo-folder>
```

The virtual environment (pandas_env/) is included for convenience.

Set up your environment

pip install -r requirements.txt
# or
conda install pandas matplotlib jupyter

Run a notebook
```
jupyter notebook
```
Explore!

Follow the notebooks step-by-step to learn:
- Inspecting your data with .info(), .head(), .shape, .describe()
- Subsetting using .loc, .iloc, and filtering expressions
- Grouping and summarizing by category
- Cleaning missing and inconsistent entries
- Visualizing distributions and relationships with .plot()

🎯 Learning Outcomes

By the end of this project, you’ll be able to:
- Load data from multiple formats into pandas
- Understand core data structures: Series & DataFrame
- Access and filter data efficiently
- Aggregate and group information to extract insights
- Clean data and prepare it for analysis
- Create visualizations that highlight key patterns
- Combine multiple datasets for comprehensive analysis

📚 References

Reka Horvath (2020, January 06). Using pandas and Python to Explore Your Dataset. Real Python. https://realpython.com/pandas-python-explore-dataset/
