One Data Science a Day
Welcome to One Data Science a Day! This page is home to my current obsession, where I will do one data science a day for 30 days, and write about it.
First time here?
Read the about section
What kind of unit is 'one data science'?
Honestly I don't know. The plan was to work on one data science-related concept every day. As you're about to see, it didn't quite work out that way.
Why would you do this?
I've been interested in data science for a long time! (My Twitter bio has contained the words "data scientist wannabe" since at least 2015.)
Also, I like to write things, but I haven't been doing that very often. I'm trying to be more prolific with both my data science practice and my writing, and this seems like an ok way to do it.
I might also try to drink less during these 30 days or something, but let's not get too crazy.
Read Day Nineteen on collaborative filtering
Deep learning with AWS P2
Where I make the decision to try the Practical Deep Learning for Coders course (Fast AI from here on), and use Amazon Web Services for the first time.
Where I write about one of my favorite things I've learned all year (programmable GPUs), and explore the Fast AI teaching philosophy.
Computer vision with VGG-16
Where I teach a computer to see, and encounter some difficult truths on the internet.
Training, validation, and test data
Where I write about the sorting of data into different sets for machine learning, and... Sort a bunch of data into different sets for machine learning.
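For a rough idea of what that sorting looks like, here is a minimal sketch in plain Python. It is not the code from the post: the function name `split_dataset`, the 60/20/20 proportions, and the made-up file names are all assumptions for illustration.

```python
import random


def split_dataset(items, valid_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle items and cut them into train, validation, and test lists.

    The fractions and the seed are illustrative defaults, not values
    taken from the original post.
    """
    shuffled = list(items)
    # A seeded Random instance keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_frac)
    n_valid = int(len(shuffled) * valid_frac)

    test = shuffled[:n_test]
    valid = shuffled[n_test:n_test + n_valid]
    train = shuffled[n_test + n_valid:]
    return train, valid, test


# Hypothetical file names standing in for real images.
files = [f"img_{i:03d}.jpg" for i in range(100)]
train, valid, test = split_dataset(files)
print(len(train), len(valid), len(test))  # 60 20 20
```

The model learns from the training set, the validation set is for checking progress while tuning, and the test set stays untouched until the very end.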
VGG in the wild, part I
Where I try to submit predictions for a Kaggle competition.
VGG in the wild, part II
Where I actually submit predictions for a Kaggle competition, and fantasize about the future.
VGG in the wild, part III
Where I improve my Kaggle ranking, and run into an old enemy.
High school math - matrix multiplication
Where I learn about matrices in an attempt to better understand neural networks, and recall the difficulties of learning math in school.
High school math - derivatives
Where I develop a basic understanding of a subject that has been evading me since 2006.
Looking back at it, this was when it became clear that this project was going to be more "One Deep Learning a Day" than "One Data Science a Day".
What's the difference? I find it useful to think of:
- Data Science as a field where we try to solve data-related problems using computer science techniques*
- Machine Learning as a data science specialization where we try to get the computers to solve those problems on their own
- Deep Learning as a machine learning technique where the way the computers solve those problems is by using neural networks
* There's some debate about whether data science is meaningfully different from statistics, but I think bringing computer science into it helps us split that baby.
Where I attempt to explain neural networks by building one in a spreadsheet.
Linear model in Keras
Where I implement our spreadsheet neural network in code as a linear model.
Where I improve our spreadsheet neural network by adding a weight initialization method, and share maybe a little too much about other things I've been doing in the past few weeks.
Data for good
Where I don't actually write about a data science concept, and instead go on a tangent about an event I attended where I met one of the people who made this whole project possible.
And this was when I realized that all I wanted to do was finish the Fast AI course, which was turning out to be much, much harder than I'd anticipated. Almost two months after starting, I'd barely finished Lesson 2... And my posts were ballooning in size and complexity.
Neural network layers
Where we briefly discuss famous datasets (ImageNet, MNIST), and dive into the different neural network layers (Lambda, ZeroPadding, Convolution, MaxPooling, Flatten, Dense) in the Keras library.
CONVOLUTIONS DEEP DIVE BAYBEEE. Where I make some 90s pixel art in a spreadsheet and call it deep learning. Not really. But kinda.
Where we explore gradient descent, the algorithm that helps neural networks find truth in the world.
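If you want a taste before reading the post, here is a minimal sketch of gradient descent on a single number, in plain Python. It is not the code from the post: the function `gradient_descent`, the learning rate, the step count, and the toy function f(x) = (x - 3)^2 are all assumptions chosen to keep the example tiny.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to walk downhill.

    grad is a function returning the slope at x. The learning rate
    and step count are illustrative defaults.
    """
    x = start
    for _ in range(steps):
        # Move a small distance in the direction that lowers the function.
        x -= learning_rate * grad(x)
    return x


# Toy problem: f(x) = (x - 3)^2 has its minimum at x = 3,
# and its derivative is f'(x) = 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 4))  # 3.0
```

A neural network does the same thing with many weights at once: measure how the error changes with each weight, then nudge every weight a little in the direction that reduces it.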