Jan 28 ~ Feb 3 Progress
I downloaded the data set of pixel values and answers from the internet and read the README to find out how to use the data in the program.
Then I wrote the code to run the network. It goes through each layer and computes that layer's output as the product of the inputs and the weights, plus the biases. I used vectors and matrices to compute this efficiently.
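As a rough illustration of that layer-by-layer pass, here is a minimal sketch in plain Python. The names feed_forward, sigmoid, weights and biases, the layer sizes, and the random starting values are all assumptions made for the example, not the project's actual code; the sigmoid activation is taken from the sketches described later in this journal.

```python
import math
import random

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def feed_forward(inputs, weights, biases):
    """Run the inputs through every layer and return the final activations.

    weights[l][j][k] is the weight from neuron k in layer l to neuron j in
    layer l + 1, and biases[l][j] is the bias of neuron j in layer l + 1.
    """
    activation = inputs
    for layer_weights, layer_biases in zip(weights, biases):
        activation = [
            sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(layer_weights, layer_biases)
        ]
    return activation

# Hypothetical tiny network: 3 inputs -> 3 hidden neurons -> 2 outputs.
random.seed(0)
sizes = [3, 3, 2]
weights = [
    [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]
biases = [[0.0] * n_out for n_out in sizes[1:]]

output = feed_forward([0.5, 0.1, 0.9], weights, biases)
print(output)  # two numbers, each between 0 and 1
```

Each row of a layer's weight list is one neuron, so the inner sum is a matrix-vector product written out by hand.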
Then I wrote the get_cost() function, which takes in the final output and the answer number and returns the cost of the network for that case.
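A cost function of this kind might look like the sketch below. The squared-error form is an assumption suggested by the 2(a-1) derivatives in the notebook sketches, and the exact signature and the one-hot target are guesses, not the project's real get_cost().

```python
def get_cost(output, answer):
    """Squared-error cost of one case.

    output is the list of final activations and answer is the index of the
    correct class. The target is assumed to be 1.0 at that index and 0.0
    everywhere else.
    """
    target = [1.0 if i == answer else 0.0 for i in range(len(output))]
    return sum((o - t) ** 2 for o, t in zip(output, target))

# A perfect guess costs nothing; a confident wrong guess costs the most.
print(get_cost([0.0, 1.0, 0.0], 1))
print(get_cost([1.0, 0.0], 1))
```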
I also watched some of the videos below to understand how backpropagation works, how to compute the gradients used in gradient descent, and how to implement it.
After that, at the end of the week and over the weekend, I created smaller versions of the network and attempted to code the backpropagation mechanism. I also made sketches in my notebook to help outline the lists and variables that I need in the process and which indexes I needed to use.
The picture on the left is the mini-network, with 3 inputs, 1 layer with 3 neurons, and 2 outputs. Each column in the picture on the right is a list that will contain values used to calculate the gradients for each weight and bias.
The first column contains the derivatives of the cost function for each output. It contains 2(a-1) and 2(b-1). The second column is the derivative for the sigmoid function in each neuron. The third column is the derivative for each input of the neurons. The fourth column labeled w is the derivative for each weight.
Backpropagation works by calculating the effect (the gradient) of each weight and bias on the cost function. So, you go backward through the network and calculate how much each step affects the final cost.
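To make the backward pass concrete, here is a hedged sketch for a mini-network of the same shape (3 inputs, 3 hidden neurons, 2 outputs). It assumes a squared-error cost and sigmoid neurons, as in the notebook columns above. All function names, variable names, and numeric values are made up for the illustration. The last few lines compare one gradient against a finite-difference estimate, which is one possible way to check the result without working it out by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, W, b):
    """Activations of one layer: sigmoid(W . inputs + b)."""
    return [sigmoid(sum(w * v for w, v in zip(row, inputs)) + bias)
            for row, bias in zip(W, b)]

def forward(x, W1, b1, W2, b2):
    h = layer(x, W1, b1)   # hidden activations
    a = layer(h, W2, b2)   # output activations
    return h, a

def cost(a, target):
    return sum((ai - ti) ** 2 for ai, ti in zip(a, target))

def backprop(x, target, W1, b1, W2, b2):
    h, a = forward(x, W1, b1, W2, b2)
    # Derivative of the cost with respect to each output: 2(a - t).
    dC_da = [2 * (ai - ti) for ai, ti in zip(a, target)]
    # Derivative of the sigmoid at each output neuron: a(1 - a).
    da_dz = [ai * (1 - ai) for ai in a]
    delta2 = [c * s for c, s in zip(dC_da, da_dz)]
    # Each output weight's gradient is its neuron's delta times its input.
    grad_W2 = [[d * hi for hi in h] for d in delta2]
    grad_b2 = list(delta2)
    # Go one layer back: how much each hidden activation affects the cost.
    dC_dh = [sum(delta2[j] * W2[j][k] for j in range(len(W2)))
             for k in range(len(h))]
    delta1 = [c * hi * (1 - hi) for c, hi in zip(dC_dh, h)]
    grad_W1 = [[d * xi for xi in x] for d in delta1]
    grad_b1 = list(delta1)
    return grad_W1, grad_b1, grad_W2, grad_b2

# Made-up example values.
x = [0.2, 0.7, 0.1]
target = [1.0, 0.0]
W1 = [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6], [-0.7, 0.8, 0.9]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.3, -0.5, 0.2], [-0.4, 0.6, 0.1]]
b2 = [0.05, -0.05]
grad_W1, grad_b1, grad_W2, grad_b2 = backprop(x, target, W1, b1, W2, b2)

# Finite-difference check on one weight: nudge it up and down and see how
# the cost changes. The estimate should be close to grad_W2[0][1].
eps = 1e-6
W2[0][1] += eps
cost_plus = cost(forward(x, W1, b1, W2, b2)[1], target)
W2[0][1] -= 2 * eps
cost_minus = cost(forward(x, W1, b1, W2, b2)[1], target)
W2[0][1] += eps  # restore the original weight
numeric = (cost_plus - cost_minus) / (2 * eps)
print(grad_W2[0][1], numeric)
```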
Feb 4 ~ Feb 10 Progress
This week was Chinese New Year break. After many lines of code, I got the mini-network to do backpropagation. I then changed the number of outputs to 3 to test that it scales up accordingly.
After that, I implemented the backpropagation in the bigger network and tried to make it work. (I will note that it was quite hard to know whether it worked correctly, since it has so many parts that I can’t calculate by hand and compare.) Some minor errors occurred, but I got them fixed quickly.
Then I worked to actually apply the changes calculated in the backpropagation to the network to make it learn. The sources I read said that it is efficient to apply the changes in mini-batches of random cases. This is so that the network does not favor one of the numbers but also does not have to go through 10000 cases just to create slight changes.
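The mini-batch idea can be sketched roughly as follows. This is not the project's training code. It is a generic loop under stated assumptions: the parameters are a flat list, grad_fn is a hypothetical function that returns the gradient of the cost for one case, and the toy problem at the bottom (finding the number closest to a set of values) stands in for the digit network.

```python
import random

def train_mini_batches(params, cases, grad_fn, batch_size=10,
                       learning_rate=0.1, epochs=1, seed=0):
    """Shuffle the cases, then apply one averaged update per mini-batch."""
    rng = random.Random(seed)
    for _ in range(epochs):
        shuffled = list(cases)
        rng.shuffle(shuffled)  # random order, so no one answer dominates a batch
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]
            total = [0.0] * len(params)
            for case in batch:
                for i, g in enumerate(grad_fn(params, case)):
                    total[i] += g
            # Step against the average gradient of this batch.
            params = [p - learning_rate * t / len(batch)
                      for p, t in zip(params, total)]
    return params

# Toy stand-in problem: the cost of one case is (p - x)^2, gradient 2(p - x).
cases = [1.0, 2.0, 3.0, 4.0] * 25

def toy_grad(params, x):
    return [2 * (params[0] - x)]

trained = train_mini_batches([0.0], cases, toy_grad, batch_size=10, epochs=20)
print(trained)  # ends up somewhere near the average of the cases
```

With 100 cases and a batch size of 10, the parameters are updated 10 times per pass through the data instead of once.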
Then, I tested with multiple examples and printed the cost of the network each time to make sure it was working correctly.
Although the cost went down in the first 10 cases, it would plateau after a certain point. So, I created a checking system that takes in an input, and spits out the output and also prints the actual answer.
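A checking system along these lines could be sketched as below. The names predict, check and check_multiple, and the idea of passing the network in as a function, are assumptions for the example. The stand-in network at the bottom always favors 0, to show how such a check exposes a bias toward one answer.

```python
from collections import Counter

def predict(output):
    """The guess is the index of the largest output activation."""
    return max(range(len(output)), key=lambda i: output[i])

def check(network_fn, pixels, answer):
    """Run one case, print the guess next to the actual answer."""
    guess = predict(network_fn(pixels))
    print(f"guess: {guess}, answer: {answer}")
    return guess == answer

def check_multiple(network_fn, cases):
    """Return the fraction of (pixels, answer) cases guessed correctly."""
    correct = sum(predict(network_fn(pixels)) == answer
                  for pixels, answer in cases)
    return correct / len(cases)

# Hypothetical stand-in for a badly trained network: it always prefers 0.
def always_zero(pixels):
    return [0.9] + [0.1] * 9

cases = [([0.0], digit) for digit in range(10)]
accuracy = check_multiple(always_zero, cases)
guesses = Counter(predict(always_zero(pixels)) for pixels, _ in cases)
print(accuracy, guesses)
```

Counting the guesses as well as the accuracy makes it easier to see when the network is stuck on a single answer.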
From doing multiple test runs of the system, I saw that the system clearly prefers to output the answer ‘0’, so I went through the code line-by-line to see what could be causing this.
I found that the code I used for the mini-network arbitrarily set the answer of any case to 0, so I got that fixed and trained the network some more. By Sunday of this week, the network guessed the majority of the answers correctly.
Feb 11 ~ Feb 17 Progress
Since the network is now working I decided to find a way to store the data of the current network somewhere so that the network does not have to be trained every time you run the program. I found out that Python 3 has a built-in module called pickle that can store values in a .pickle file, so I read quick tutorials online and created save & loading code for the network.
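Saving and loading with pickle can be as short as the sketch below. The function names, the dictionary layout, and the file name are assumptions for illustration; only pickle.dump and pickle.load come from the standard library.

```python
import os
import pickle
import tempfile

def save_network(path, weights, biases):
    """Write the network's values to a file in binary mode."""
    with open(path, "wb") as f:
        pickle.dump({"weights": weights, "biases": biases}, f)

def load_network(path):
    """Read the values back. Only unpickle files you created yourself,
    because loading a pickle can run arbitrary code."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data["weights"], data["biases"]

# Round trip with made-up values in a temporary folder.
weights = [[[0.1, -0.2], [0.3, 0.4]]]
biases = [[0.0, 0.5]]
path = os.path.join(tempfile.mkdtemp(), "network1.pickle")
save_network(path, weights, biases)
loaded_weights, loaded_biases = load_network(path)
print(loaded_weights == weights, loaded_biases == biases)
```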
In this process, I decided to create a 2D list containing lists of file names for each network so that you can load in the appropriate files easily. I also created a ‘save-as’ (pas) feature and a ‘create new’ feature.
While creating these features, I thought I could create a small user-interface to navigate and use the network’s features better.
Features include: train(t), which trains the network x times, where x is a number that you give it / check(c), which runs the network on one case and compares the output with the answer / check_multiple(ch), which checks x cases and returns the accuracy of the network / save and save_as (p, pas), which save the network / delete(del), which deletes a saved network's file names from the list / stop(s), which terminates the program.
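A small text interface like this is often built as a loop that reads a command and looks it up in a dictionary. The sketch below is one possible shape, not the project's interface. The run_menu function and the two handlers are hypothetical, and only the short command names (t, ch, s) are taken from the list above. Commands are read through a function so that the loop can be driven by a script here and by input in real use.

```python
def run_menu(read_command, handlers):
    """Read commands until 's' (stop) and return a log of what happened."""
    log = []
    while True:
        parts = read_command().split()
        if not parts:
            continue  # ignore empty lines
        name, args = parts[0], parts[1:]
        if name == "s":
            log.append("stopped")
            break
        handler = handlers.get(name)
        if handler is None:
            log.append(f"unknown command: {name}")
        else:
            log.append(handler(*args))
    return log

# Hypothetical handlers that only report what they would do.
handlers = {
    "t": lambda n: f"trained {int(n)} times",
    "ch": lambda n: f"checked {int(n)} cases",
}

# Scripted commands for the example; pass input instead to type them in.
scripted = iter(["t 5", "ch 100", "xyz", "s"])
log = run_menu(lambda: next(scripted), handlers)
print(log)
```

Keeping the commands in a dictionary means a new feature only needs one new entry rather than another branch in the loop.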
Finally, I wrote a few comments on each function and a few loops so that it is easier to read and understand for me and others.
Now I am thinking about where this could have a possible use around me or the school and I am going to try to implement the network for that use.
Passion Project – Spring 2019 Planner
This is my plan for the rest of my Passion Project class. It feels a bit tight, and I know that work for other classes is most likely going to increase in the coming weeks. Still, I was pleasantly surprised by the speed at which the network was built, so I felt that I should set a goal that pushes me further.