I spent Sunday going a little deeper on derivatives, but I didn't think that deserved a post on its own.

Ten days in (sort of), I am starting to realize that what I set out to do here is really three things:

  1. Do one data science thing a day
  2. Pick up deep learning
  3. Write about it all for a non-technical audience

And it's clear that I've really been focused on (2), to the detriment of (1) and (3). Maybe when I'm done with this course it would make sense to take on another project where I actually make things that are valuable to people, and to write in a way that provides clarity on specific topics within data science instead of rambling on about my incremental breakthrough of the day.

That all makes sense, but I'm going to continue with this format for now - I think it's been doing a pretty good job of keeping me honest about my progress.

Not Hotdog

Earlier today, I read How HBO's Silicon Valley built "Not Hotdog" with mobile TensorFlow, Keras & React Native and the accompanying Hacker News discussion, and I actually understood what people were talking about. A pre-trained ImageNet model? VGG IS A PRE-TRAINED IMAGENET MODEL. Getting a neural network to run natively on a mobile phone sounds like an entirely different beast, though (they decided against a system that would make a call to the cloud, for both user experience and cost reasons).
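
For my own benefit, here's roughly what "a pre-trained ImageNet model" means in Keras terms. This is just a sketch of mine (the image file name is made up), not anything from the Not Hotdog codebase:

```python
# Minimal sketch: load VGG16 with ImageNet weights and classify one image.
import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from keras.preprocessing import image

model = VGG16(weights='imagenet')  # downloads the pre-trained weights on first run

# 'maybe_a_hotdog.jpg' is a hypothetical file, resized to VGG's expected 224x224 input
img = image.load_img('maybe_a_hotdog.jpg', target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top-3 ImageNet labels with probabilities
```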

There were a couple of perspectives I particularly appreciated. One dealt with the problem of finding pictures of hot dogs that would match the kind of pictures the app would be likely to encounter in real life (after all, real life isn't a Kaggle competition). From the article, emphasis mine:

  • Matching image types to expected production inputs. Our guess was people would mostly try to photograph actual hotdogs, other foods, or would sometimes try to trick the system with random objects, so our dataset reflected that [...]
  • Expect distortions: in mobile situations, most photos will be worse than the “average” picture taken with a DSLR or in perfect lighting conditions. Mobile photos are dim, noisy, taken at an angle. Aggressive data augmentation was key to counter this.
  • Additionally we figured that users may lack access to real hotdogs, so may try photographing hotdogs from Google search results, which led to its own types of distortion [...]
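
The "aggressive data augmentation" point is the one I can actually picture in Keras. Something like this (my own sketch, with made-up parameter values and a made-up directory name, not their actual setup):

```python
# Sketch of aggressive data augmentation with Keras's ImageDataGenerator.
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=40,         # phones get held at odd angles
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    channel_shift_range=30.0,  # crude stand-in for dim or weird lighting
    horizontal_flip=True,
    fill_mode='nearest')

# 'train_dir' is a hypothetical folder with hotdog / not-hotdog subfolders
train_generator = datagen.flow_from_directory(
    'train_dir', target_size=(224, 224), batch_size=32, class_mode='binary')
```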

Another was about their development machine, which I was surprised to learn was just a MacBook with a GPU attached.

MacBook plus GPU

From the Hacker News discussion, again emphasis mine:

  Cloud does not quite totally make sense to me until the costs come down, unless you are 1) pressed for time and 2) will not be doing more than 1 machine learning training in your lifetime. Building your own local cluster becomes cost-efficient after 2 or 3 AI projects per year, I'd say.

Oh. What are my AWS bills again? It's worth noting, of course, that the author ran 240 epochs over a total of 80 hours to train his model, so my own puny models (3-5 epochs that take about 30 minutes total to run) are in a different league when it comes to cost.
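
For perspective, here's the back-of-the-envelope version, using assumed prices rather than anything I've actually been billed:

```python
# Rough break-even arithmetic. Both prices below are assumptions, not quotes.
cloud_rate = 0.90        # assumed $/hour for a single-GPU cloud instance
local_cost = 1300.0      # assumed one-time cost of a GPU-equipped machine

hours_not_hotdog = 80    # the article's 240-epoch training run
hours_mine = 0.5         # one of my 3-5 epoch runs

print(local_cost / cloud_rate)        # ~1,444 hours of cloud training to break even
print(hours_not_hotdog * cloud_rate)  # ~$72 for a run like theirs
print(hours_mine * cloud_rate)        # ~$0.45 for one of mine
```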

I was hoping to get deeper into neural networks today, but I was at work until 6, in data analysis class until 8:30, and at rehearsal until 11:30, so I guess it'll have to wait. Is this starting to feel like a tired excuse? Um... have a distracting Tweet!