February 15th Update

Jan 28 ~ Feb 3 Progress

I downloaded the data set of pixel values and answers from the internet and read the README to find out how to use the data in the program.

Then I wrote the code to run the network: it goes through each layer and computes the layer's output as the product of the inputs and the weights, plus the biases. I used vectors and matrices to compute this efficiently.
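A minimal sketch of that feedforward step, using NumPy for the vector and matrix math (the real code may be organized differently):

```python
import numpy as np

def sigmoid(z):
    # squash each value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, weights, biases):
    """Run the input vector x through the network, layer by layer.

    weights is a list of matrices (one per layer) and biases is a
    list of vectors; each layer's output is sigmoid(W @ a + b).
    """
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a
```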

Then I wrote the get_cost() function, which takes in the final output and the answer number and returns the cost.
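Something like this, assuming the answer digit is turned into a one-hot target vector (a sketch, not the exact code):

```python
import numpy as np

def get_cost(output, answer):
    # Build the target vector: 1.0 at the answer digit, 0.0 elsewhere.
    target = np.zeros(len(output))
    target[answer] = 1.0
    # The cost is the sum of squared differences from the target.
    return np.sum((output - target) ** 2)
```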

I also watched some of the videos below to understand how backpropagation works, how the gradient for gradient descent is computed, and how to implement it.

After that, at the end of the week and over the weekend, I created smaller versions of the network and attempted to code the backpropagation mechanism. I also made sketches in my notebook to help outline the lists and variables I need in the process and which indices I needed to use.

The picture on the left is the mini-network, with 3 inputs, 1 hidden layer of 3 neurons, and 2 outputs. Each column in the picture on the right is a list that will contain the values needed to calculate the gradient for each weight and bias.

The first column contains the derivatives of the cost function for each output: 2(a-1) and 2(b-1). The second column is the derivative of the sigmoid function in each neuron. The third column is the derivative for each input of the neurons. The fourth column, labeled w, is the derivative for each weight.

Backpropagation works by calculating the effect (the gradient) of each weight and bias on the cost function. So, you go backward through the network and calculate how much each step affects the final cost.
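For the mini-network above (3 inputs, a hidden layer of 3 neurons, 2 outputs), the chain rule works out to something like this sketch, assuming sigmoid activations and the squared-error cost, so the output derivatives are the 2(a-1)-style terms when the target is 1:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop(x, y, W1, b1, W2, b2):
    """One backward pass through the 3-3-2 mini-network.

    Returns the gradient of the squared-error cost with respect to
    every weight and bias, layer by layer.
    """
    # Forward pass, keeping the pre-activations z for later.
    z1 = W1 @ x + b1
    a1 = sigmoid(z1)
    z2 = W2 @ a1 + b2
    a2 = sigmoid(z2)

    # Column 1: the derivative of the cost for each output, 2(a - y),
    # which is 2(a - 1) when the target is 1, multiplied by
    # column 2: the sigmoid derivative at each output neuron.
    delta2 = 2.0 * (a2 - y) * sigmoid_prime(z2)

    # Column 4: the derivative for each weight (and bias) feeding
    # the output layer.
    grad_W2 = np.outer(delta2, a1)
    grad_b2 = delta2

    # Column 3: the derivative for each input of the output neurons,
    # pushed back through the hidden layer the same way.
    delta1 = (W2.T @ delta2) * sigmoid_prime(z1)
    grad_W1 = np.outer(delta1, x)
    grad_b1 = delta1

    return grad_W1, grad_b1, grad_W2, grad_b2
```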

Feb 4 ~ Feb 10 Progress

This week was the Chinese New Year break, and after many lines of code I got the mini-network to do backpropagation. I then changed the number of outputs from 2 to 3 to test that it scales up accordingly.

After that, I implemented backpropagation in the bigger network and tried to make it work (however, I will note that it was quite hard to actually know whether it worked correctly, since the network has so many parts that I can't calculate them myself and compare). Some minor errors occurred, but I got them fixed quickly.
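One standard way to sanity-check backpropagation on a network too big to work out by hand (a general trick, not something described above) is to compare the backpropagated gradient against a numerical estimate for a handful of weights:

```python
def numerical_gradient(cost_fn, W, index, eps=1e-5):
    """Estimate dC/dw for the single weight W[index] by nudging it.

    cost_fn() should recompute the network's total cost from scratch,
    and W is assumed to be a NumPy array indexed by a tuple like (0, 2).
    If backprop is correct, this estimate and the analytic gradient
    should agree to several decimal places.
    """
    original = W[index]
    W[index] = original + eps
    cost_plus = cost_fn()
    W[index] = original - eps
    cost_minus = cost_fn()
    W[index] = original  # put the weight back
    return (cost_plus - cost_minus) / (2 * eps)
```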

Then I worked on actually applying the changes calculated by backpropagation to the network so that it learns. According to the sources I read, it is efficient to apply the changes in mini-batches of random cases. That way the network does not favor any one of the numbers, but also does not have to go through 10,000 cases just to make slight changes.
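A sketch of that mini-batch loop (the batch size and learning rate are made-up values, and `backprop(x, y)` stands in for the project's own gradient function):

```python
import random

def train_mini_batches(data, weights, biases, backprop,
                       batch_size=32, eta=3.0):
    """One pass over the data in small random batches.

    backprop(x, y) is assumed to return (grad_w, grad_b), lists of
    gradients matching weights and biases in shape.  Averaging the
    gradient over a small random batch keeps the network from
    favoring any one digit, without going through every case for
    each tiny adjustment.
    """
    random.shuffle(data)
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # Accumulate the gradients over the whole batch first.
        total_w = [0.0 * w for w in weights]
        total_b = [0.0 * b for b in biases]
        for x, y in batch:
            grad_w, grad_b = backprop(x, y)
            total_w = [tw + gw for tw, gw in zip(total_w, grad_w)]
            total_b = [tb + gb for tb, gb in zip(total_b, grad_b)]
        # Then step every parameter downhill by the average gradient.
        for i in range(len(weights)):
            weights[i] -= (eta / len(batch)) * total_w[i]
            biases[i] -= (eta / len(batch)) * total_b[i]
```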

Then, I tested with multiple examples and printed the cost of the network each time to make sure it was working correctly.

Although the cost went down over the first 10 cases, it would plateau after a certain point. So I created a checking system that takes in an input, spits out the network's output, and also prints the actual answer.
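The checking system boils down to something like this (taking the network's guess to be the output neuron with the largest value; `feedforward` is any function mapping an input to the output vector, like the one sketched earlier):

```python
import numpy as np

def check(feedforward, x, answer):
    # Run the case through the network and pick the digit whose
    # output neuron fires the strongest.
    output = feedforward(x)
    guess = int(np.argmax(output))
    print(f"network guessed: {guess}, actual answer: {answer}")
    return guess == answer
```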

From multiple test runs of the system, I saw that it clearly prefers to output the answer ‘0’, so I went through the code line by line to see what could be causing this.

I found that the code I had carried over from the mini-network arbitrarily set the answer of every case to 0, so I fixed that and trained the network some more. On Sunday of this week, the network guessed the majority of the answers correctly.

Feb 11 ~ Feb 17 Progress

Since the network is now working, I decided to find a way to store the current network's data somewhere, so that the network does not have to be retrained every time you run the program. I found out that Python 3 has a built-in module called pickle that can store values in a .pickle file, so I read a few quick tutorials online and created save and load code for the network.
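The save and load code comes down to a few lines (the filename and exactly which objects get pickled here are assumptions):

```python
import pickle

def save_network(weights, biases, filename="network.pickle"):
    # Dump the learned weights and biases to a .pickle file so the
    # network doesn't have to be retrained on every run.
    with open(filename, "wb") as f:
        pickle.dump((weights, biases), f)

def load_network(filename="network.pickle"):
    # Read the weights and biases back out of the file.
    with open(filename, "rb") as f:
        return pickle.load(f)
```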

In the process, I decided to create a 2D list containing the lists of file names for each network, so that you can easily load in the appropriate files. I also created a ‘save as’ (pas) feature and a ‘create new’ feature.

While creating these features, I realized I could build a small user interface to make it easier to navigate and use the network's features.

Features include:

  • train (t): trains the network however many times you tell it to
  • check (c): checks the network on one case and shows the actual answer
  • check_multiple (ch): checks a given number of cases and reports back the network's accuracy
  • save and save_as (p, pas): save the network
  • delete (del): deletes a saved network's file names from the list
  • stop (s): terminates the program
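A stripped-down sketch of what that command loop might look like (the method names on `network` are placeholders for the project's actual functions, and the real interface has more options and error handling):

```python
def run_interface(network):
    # Map each short command to an action on the network.
    while True:
        command = input("command (t/c/ch/p/pas/del/s): ").strip()
        if command == "t":
            times = int(input("train how many times? "))
            network.train(times)
        elif command == "c":
            network.check_one()
        elif command == "ch":
            times = int(input("check how many cases? "))
            print("accuracy:", network.check_multiple(times))
        elif command in ("p", "pas"):
            network.save()
        elif command == "del":
            network.delete_saved()
        elif command == "s":
            break  # terminate
```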

Finally, I wrote a few comments on each function and loop so that the code is easier for me and others to read and understand.

Now I am thinking about where this could have a possible use around me or around the school, and I am going to try to adapt the network for that use.

Passion Project – Spring 2019 Planner

This is my hope for the future of my Passion Project class. It feels a bit tight, but I was pleasantly surprised by the speed at which the network was built, to be honest; and although I know that work for other classes is most likely going to increase in the coming weeks, I felt that I should set a goal that pushes me further.

January 21st Progress

This week I finished the Udemy Python tutorial that I had been working on for the past few weeks.

Some of the files I made along the way are basically just test/playground files. Others are useful programs and small games: blackjack, tic-tac-toe, a Collatz conjecture ‘simulation’, and the Sieve of Eratosthenes (which lists the primes up to a certain limit). I also made a program to solve a math probability problem under the name Lily (or one of the files with a variation of that name). Overall, after this experience, I feel much more comfortable working in Python: creating basic classes and objects, manipulating lists, etc. I do still feel a bit slow in Python, but I guess that will get better over time.
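The sieve, for example, fits in a few lines (a sketch, not necessarily how my version is written):

```python
def sieve_of_eratosthenes(limit):
    """List all primes up to limit by crossing out multiples."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # n is prime, so every multiple of n from n*n up is not.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n in range(limit + 1) if is_prime[n]]
```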

I also started an attempt to build a basic neural network. I got carried away when “trying stuff out”, so I spent extra time this weekend building the basis of the system. It's going to be the same neural network as the one in 3Blue1Brown's video series.

However, at the time of writing this (Jan 27), some problems have arisen. First, I can't seem to find a simple way to store variable values between runs of the program in Python, which would undermine the “learning” part of the network. Second, I haven't found out how to get the pixel color values from an image file into the program. Lastly, I haven't figured out how I am going to do any of the backpropagation required for the network to actually learn efficiently.
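On the second problem, one possible approach (just an option, not something settled on at the time) is the Pillow imaging library:

```python
from PIL import Image  # the Pillow library, installed with pip

def image_to_pixels(path):
    # Open the image, convert it to grayscale ("L" mode), and flatten
    # the pixel values (0 to 255) into one long list for the network.
    img = Image.open(path).convert("L")
    return list(img.getdata())
```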

In the course of doing this, I have also watched 3Blue1Brown's (the creator above) series of videos on linear algebra, so that I understand what I am programming when I do vector calculations in Python.

I have a lot more work to do if I am going to make anything out of this… so more work I guess.

Other ideas I have thought about in conjunction with neural networks include voice recognition, and figuring out things like emotion or topic from texts; so, text analysis, I guess.

January 7th & January 14th Progress

These weeks I have mostly been working on learning basic Python and its syntax, and also brainstorming goals for my project.

Python stuff:

https://www.udemy.com/complete-python-bootcamp/

  • This tutorial helped me get used to the syntax of Python, using classes, and manipulating lists. As of now (Jan 23rd), I have made a tic-tac-toe game and a blackjack game.

https://www.cheatography.com/davechild/cheat-sheets/python/

  • Just a basic cheat sheet of syntax in case I forget.

https://automatetheboringstuff.com/#toc

  • Syntax and information manipulation. It showcases a lot of different and interesting ways to use lists and dictionaries, and introduces some new concepts like regular expressions.

Brainstorming:

Neural networking:

http://neuralnetworksanddeeplearning.com/chap1.html

  • This website is from the video below. It explains how neural networks work and a lot of the math behind them. I used it more as an add-on to the video, but I suspect that this article will be more useful when I start making one.

http://colah.github.io/

  • This also has a lot of math and complicated concepts in it, but a few pages were an interesting read nonetheless.

3Blue1Brown’s amazing video series on neural networks.

Sudoku(?):

https://stackoverflow.com/questions/6963922/java-sudoku-generatoreasiest-solution

  • A forum thread about solutions to a sudoku-generating program. It led to the Wikipedia article below, but I didn't understand a whole lot. The code was all in Java, so I had some difficulty reading it. Still interesting, though.

https://en.wikipedia.org/wiki/Dancing_Links


Rubik’s Cube:

and this guy’s channel in general, Code Bullet:

https://www.youtube.com/channel/UC0e3QhIYukixgh5VVpKHH9Q

  • He makes algorithms to play games and solve puzzles, and he does some neural-network-type stuff too. A lot of his solutions seem to use brute force for specific situations instead of a general solution, though the Rubik’s Cube one isn’t like that.

Other ideas:

A few game ideas

Visualizing Math (3Blue1Brown style) and also 3D visualizations

Simulations (Virus/scientific simulations) + (Physics engine) + (random other simulations)