- Follow
- Copilot was used to help with completion
- Only 1 dependency: PIL, for parsing/displaying images
- To predict on test set
py -3 mnist.py -i model.pkl
- To train
py -3 mnist.py -t [optional_starting_model]
- Add visualization of the layer
Download mnist dataset from https://www.kaggle.com/datasets/hojjatk/mnist-dataset?resource=download
Figure out the dataset format
Write a dataset loader.
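A minimal sketch of such a loader, assuming the download contains the standard MNIST IDX files: a big-endian header (magic number 2051 for images, 2049 for labels, then the dimensions) followed by raw unsigned bytes. The function names `parse_idx_images` and `parse_idx_labels` are hypothetical and not taken from mnist.py.

```python
import struct


def parse_idx_images(data: bytes) -> list:
    """Parse IDX3 image bytes into a list of flat pixel lists (values 0-255)."""
    magic, count, rows, cols = struct.unpack_from(">IIII", data, 0)
    if magic != 2051:
        raise ValueError(f"not an IDX3 image file (magic={magic})")
    size = rows * cols
    offset = 16  # the header is four 4-byte big-endian integers
    return [
        list(data[offset + i * size: offset + (i + 1) * size])
        for i in range(count)
    ]


def parse_idx_labels(data: bytes) -> list:
    """Parse IDX1 label bytes into a list of integer labels (0-9)."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != 2049:
        raise ValueError(f"not an IDX1 label file (magic={magic})")
    return list(data[8:8 + count])  # the header is two 4-byte integers
```

To use it, read a file in binary mode and pass the bytes in, for example `parse_idx_images(open(path, "rb").read())`, where `path` points at whichever image file you extracted.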
A neuron takes in multiple inputs and produces a single output: its activation
Each neuron in one layer is connected to all neurons in the previous layer
output = sigmoid(w*a+b)
- a: activation from the previous layer
- w: weights of the connections between neurons
- b: the bias of this neuron
The sigmoid function is only there to normalize the output into the range (0, 1)
sigmoid(0) == 0.5, so if we initialize the weights and biases to 0, the output of every neuron will be 0.5
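The neuron formula can be sketched in a few lines. This is an illustration only: `neuron_output` is a hypothetical helper name, and the weights and activations are plain Python lists rather than whatever structure mnist.py actually uses.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the interval (0, 1)."""
    # Note: math.exp overflows for very negative x (below about -709);
    # this sketch does not guard against that.
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(weights, activations, bias):
    """Compute sigmoid(w . a + b) for a single neuron."""
    z = sum(w * a for w, a in zip(weights, activations)) + bias
    return sigmoid(z)


# With all weights and the bias set to 0, the inputs do not matter.
print(neuron_output([0.0, 0.0, 0.0], [0.2, 0.5, 0.9], 0.0))  # 0.5
```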
Our model will have 4 layers: 784 - 16 - 16 - 10
The first layer is the input layer: 784 == 28*28, one input per pixel
The last layer is the output layer, one neuron for each of the 10 digits.
Each neuron in one layer connects to every neuron in its previous layer, so we will have:
- 784*16 + 16*16 + 16*10 = 12960 weights
- 16 + 16 + 10 = 42 biases
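The same totals can be double-checked with a short calculation over the layer sizes (the variable names here are just for illustration):

```python
layers = [784, 16, 16, 10]

# One weight per connection between consecutive layers.
weights = sum(n * m for n, m in zip(layers, layers[1:]))

# One bias per neuron, except for the input layer.
biases = sum(layers[1:])

print(weights, biases)  # 12960 42
```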
How to feed forward
The number of weights between 2 layers of size n and m will be n*m
The number of biases of a layer of size m will be m
The activations of a layer will be the input of the next layer
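A minimal feed-forward sketch under stated assumptions: weights are stored as nested lists where `weights[l][j][k]` connects neuron `k` of layer `l` to neuron `j` of layer `l + 1`, and the initial weights are drawn uniformly from [-0.5, 0.5). Both the layout and the initialization range are choices made for this example, not necessarily what mnist.py does.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def feed_forward(inputs, weights, biases):
    """Propagate inputs through every layer and return the output activations."""
    activations = inputs
    for layer_w, layer_b in zip(weights, biases):
        # Build a fresh list for each layer; its activations become
        # the input of the next layer.
        activations = [
            sigmoid(sum(w * a for w, a in zip(neuron_w, activations)) + b)
            for neuron_w, b in zip(layer_w, layer_b)
        ]
    return activations


sizes = [784, 16, 16, 10]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
weights = [
    [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(m)]
    for n, m in zip(sizes, sizes[1:])
]
biases = [[0.0] * m for m in sizes[1:]]  # all-zero biases, identical on purpose

output = feed_forward([0.0] * 784, weights, biases)
print(len(output))  # 10
```

The index of the largest value in `output` would be the predicted digit.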
Back propagation
Learning rate of
is considered too high, but it seems to work for me?
- Input needs to be normalized
- Remember to reset the
in feed forward
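# This is RIGHT: each bias gets its own random value
self.biases = [random.random() for _ in range(self.dim)]
# This is WRONG: one random value is drawn once and repeated self.dim times
self.biases = [random.random()] * self.dim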
- Be careful with the sign of
- The sigmoid function returns 1 from 30 or something, making the activations in the 1st hidden layer all become 1
- My model finally converges
- It turns out that how we normalize the input, initialize the weights, and choose the activation function is really important.
- The sigmoid function's resolution is really poor: sigmoid(40) is already 1
- It's really important to get the math right in order for your model to converge.
- Trying to invent something from basic intuition alone won't work
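The saturation problem from the notes above can be reproduced in isolation. This is a sketch with made-up numbers: an image with 100 fully lit pixels and every weight set to 0.01 are assumptions chosen for illustration. In double precision, `sigmoid` rounds to exactly 1.0 somewhere between 30 and 40, and once that happens the sigmoid derivative s*(1-s) is 0, so the neuron stops learning.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


print(sigmoid(30) < 1.0)   # True: still just below 1
print(sigmoid(40) == 1.0)  # True: exp(-40) is lost when added to 1.0

# Hypothetical image: 100 fully lit pixels out of 784, every weight 0.01.
pixels = [255] * 100 + [0] * 684
weights = [0.01] * 784

raw_z = sum(w * p for w, p in zip(weights, pixels))           # about 255
scaled_z = sum(w * p / 255 for w, p in zip(weights, pixels))  # about 1

raw_s = sigmoid(raw_z)        # saturated
scaled_s = sigmoid(scaled_z)  # roughly 0.73

# Sigmoid derivative s * (1 - s): zero gradient means no learning.
print(raw_s * (1 - raw_s))        # 0.0
print(scaled_s * (1 - scaled_s))  # clearly above 0
```

Dividing the pixels by 255 is only one possible normalization, and large weights can still saturate the neuron on their own, which fits the note that initialization matters too.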