
My First Models


Last week I set myself up with JupyterLab on GCP. This week I wrote notebooks and trained my first models. My goal was to learn enough to train a model that recognizes handwritten digits from the MNIST database. After much confusion, wriggling around, and asking ChatGPT for help, I was able to produce a model that achieved 95% accuracy on my validation set.

Recognizing Digits

The notebook with the code that did this is here on GitHub. I'm sure that when I look back on this, it will be clear that I could have done it in 5 lines of code and that I just didn't know the magic incantation. I also wanted to be able to do it in about 100 lines of code (that is, without as many of the helpers from the fastai library), but I struggled with that too. I plan to do both of these at some point.
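
For reference, the "5 lines" version with fastai's high-level helpers probably looks roughly like this. This is a sketch rather than the exact code from my notebook; it assumes the full MNIST dataset bundled with fastai (`URLs.MNIST`), which unpacks into `training` and `testing` folders:

```python
from fastai.vision.all import *

# Download the full MNIST dataset bundled with fastai; it unpacks into
# 'training' and 'testing' folders of labeled digit images.
path = untar_data(URLs.MNIST)

# Build dataloaders straight from the folder structure, then fine-tune a
# small pretrained CNN and track accuracy on the validation (testing) set.
dls = ImageDataLoaders.from_folder(path, train='training', valid='testing')
learn = vision_learner(dls, resnet18, metrics=accuracy)
learn.fine_tune(1)
```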

The model started at a mere 15% accuracy after the first training epoch and was up to 50% by the end of 10 epochs:

[Image: training output after each epoch]

Eventually, additional epochs seemed to have diminishing or negative returns, so I stopped.

[Image: training output from later epochs]

Looking at the graph of my model's accuracy over the course of many epochs is very satisfying:

[Image: plot of validation accuracy across epochs]
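
If you want to reproduce a plot like this, the per-epoch metrics from the most recent fit are kept on the learner's recorder. A rough sketch, assuming the `learn` object from above and that accuracy is the last metric column:

```python
import matplotlib.pyplot as plt

# learn.recorder.values holds one row per epoch:
# [train_loss, valid_loss, *metrics]. With accuracy as the only metric,
# it is the last entry in each row.
accs = [float(row[-1]) for row in learn.recorder.values]

plt.plot(range(1, len(accs) + 1), accs, marker='o')
plt.xlabel('epoch')
plt.ylabel('validation accuracy')
plt.title('Accuracy over training epochs')
plt.show()
```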

An interesting post-training activity is to look at which images from the validation set produce the worst results. In some sense, these are "outliers": the images that look the most different from the ones the model was trained on:

[Image: validation images with the worst predictions]
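
fastai has a helper for exactly this kind of inspection. Something along these lines should show the validation images with the highest loss (a sketch, assuming the `learn` object from earlier):

```python
from fastai.vision.all import *

# Build an interpretation object from the trained learner, then show the
# validation images the model got most wrong (highest loss), along with the
# predicted label, actual label, loss, and probability for each.
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(9, figsize=(7, 7))
```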

Other Image Classification Problems

I trained a few other models along the way (following along with some fastai guides). Here, I classified some pets:

[Image: pet breed classification results]
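
The pets example follows the fastai guides pretty closely. Roughly, and as a sketch rather than my exact notebook code: the breed name is embedded in each filename, so a regex extracts the label:

```python
from fastai.vision.all import *

# Oxford-IIIT Pet dataset: the breed is the part of the filename before
# the trailing "_<number>.jpg", so a regex pulls out the label.
path = untar_data(URLs.PETS)/'images'
dls = ImageDataLoaders.from_name_re(
    path, get_image_files(path), pat=r'(.+)_\d+.jpg',
    item_tfms=Resize(224), valid_pct=0.2, seed=42)

# Fine-tune a pretrained ResNet and track error rate on the validation set.
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(2)
```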

And here, I added multiple labels to images as a way to indicate what was in them:

[Image: multi-label classification results]
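
The multi-label setup roughly follows the fastai material on the PASCAL 2007 dataset, where each image comes with a space-separated list of labels in a CSV. A sketch (not my exact code):

```python
from functools import partial
import pandas as pd
from fastai.vision.all import *

# PASCAL 2007: train.csv maps each filename to a space-separated list of
# labels, so label_delim=' ' turns this into a multi-label problem.
path = untar_data(URLs.PASCAL_2007)
df = pd.read_csv(path/'train.csv')
dls = ImageDataLoaders.from_df(
    df, path, folder='train', label_delim=' ', item_tfms=Resize(224))

# accuracy_multi checks each label independently against a threshold.
learn = vision_learner(dls, resnet18,
                       metrics=partial(accuracy_multi, thresh=0.5))
learn.fine_tune(3)
```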

What's next

I followed along with chapter 4 of the book and ended up typing all the commands from scratch, because just running the clean version of the notebook they provided didn't give me the kind of insight I needed to solve the digit classification problem. That notebook is here.
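
Chapter 4 builds the training loop up from raw tensors, and that lower-level view is what I was missing. The spirit of it is something like the sketch below: a single linear layer trained with a hand-written SGD step. Random tensors stand in for the real digit images here, so the numbers it prints are meaningless; it just shows the shape of the loop:

```python
import torch
from torch import nn

# Stand-in data: 1000 flattened 28x28 "images" and random digit labels.
x_train = torch.randn(1000, 28 * 28)
y_train = torch.randint(0, 10, (1000,))

model = nn.Linear(28 * 28, 10)
loss_fn = nn.CrossEntropyLoss()
lr = 0.1

for epoch in range(10):
    # Forward pass, loss, and gradients.
    preds = model(x_train)
    loss = loss_fn(preds, y_train)
    loss.backward()

    # The manual SGD step: nudge each parameter against its gradient.
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad
            p.grad.zero_()

    acc = (preds.argmax(dim=1) == y_train).float().mean()
    print(f"epoch {epoch + 1}: loss {loss.item():.3f}, acc {acc.item():.3f}")
```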

I'm excited to keep going. There's more to learn in the space of image classifiers, but I think I will steer towards two different kinds of problems: tabular data (the Titanic Kaggle challenge) and natural language processing. I haven't decided what problem I'll try to solve with NLP just yet, but I suspect those skills will be especially useful for the project I'm on at work, which relates to finding useful insights from web browsing history.

More fun images

[A row of brain images]