Image Experiments

Why does this repository exist?

This repository is a collection of experiments that I have done with images, so that I can learn how to manipulate images and understand the various deep learning architectures used to generate them.

What does this repository contain?

1. UNet 2D

A basic diffusion model for generating images.
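To make the idea concrete, here is a minimal sketch of the forward (noising) half of a diffusion model. It is not taken from this repository: it assumes a DDPM-style linear noise schedule, uses plain Python instead of PyTorch so it runs without dependencies, and all names (`linear_betas`, `cumulative_alphas`, `add_noise`) are hypothetical. In a setup like this, the UNet would typically be trained to predict the noise `eps` given the noisy pixels `xt` and the step `t`.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per diffusion step (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta): the share of signal variance kept up to each step."""
    kept, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        kept.append(product)
    return kept


def add_noise(pixels, t, alpha_bars, rng):
    """Jump from a clean image x_0 straight to step t; returns (noisy pixels, noise used)."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [signal_scale * p + noise_scale * e for p, e in zip(pixels, noise)]
    return noisy, noise


alpha_bars = cumulative_alphas(linear_betas(1000))
rng = random.Random(0)           # fixed seed so the sketch is reproducible
x0 = [0.5, -0.25, 1.0, 0.0]      # a toy "image" of four pixel values
xt, eps = add_noise(x0, 500, alpha_bars, rng)
```

With this schedule almost no signal remains at the final step, which is why generation can start from pure Gaussian noise and denoise step by step.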

2. Vector Quantized Generative Adversarial Networks (VQGAN)

A model that learns to represent images as discrete tokens, so it can be used for image tokenization. The repository includes PyTorch training code for both the Generator and the Discriminator.
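The tokenization step can be illustrated with a small, dependency-free sketch of vector quantization: each latent vector produced by an encoder is replaced by the index of its nearest codebook entry. This is an illustration under assumptions, not the repository's PyTorch code; the function names, the toy codebook, and the toy latents are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vectors, codebook):
    """Map each vector to the index (token) of its nearest codebook entry."""
    tokens = []
    for vector in vectors:
        nearest = min(
            range(len(codebook)),
            key=lambda k: squared_distance(vector, codebook[k]),
        )
        tokens.append(nearest)
    return tokens


def lookup(tokens, codebook):
    """Turn tokens back into codebook vectors, as a decoder input would need."""
    return [codebook[t] for t in tokens]


# Toy codebook with three 2-D entries; a real one is learned during training.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
# Toy encoder outputs, one vector per image patch.
latents = [[0.9, 1.2], [0.1, -0.2], [-0.8, 0.7]]

tokens = quantize(latents, codebook)
quantized = lookup(tokens, codebook)
print(tokens)  # [1, 0, 2]
```

The list of integer tokens is what makes the image usable as a sequence, in the same way text is a sequence of token ids.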

What is the purpose of this repository?

This is my playground for experimenting with images. I will be adding more models and experiments as I learn more about other architectures and techniques. The final goal is to add multimodality to Smol-LM.

Notes

The code is not optimized for performance. It is written in a way that is easy for me to experiment with.
Improvements and suggestions are welcome. Feel free to open an issue or a pull request.
Experiments with audio are done over in AudioExpts.


andrew264/ImageExpts

Folders and files

Latest commit

History

Repository files navigation

Image Experiments

Why does this repository exist?

What does this repository contain?

1. UNet 2D

2. Vector Quantized Generative Adversarial Networks (VQGAN)

What is the purpose of this repository?

Notes

About

Topics

Resources

Stars

Watchers

Forks

Languages