Blog

Life and all its glorious distractions

Tell me if you’ve seen this pattern before. Some blogger, content creator or ethereal internet presence has some platform where they post things for a bit and then evaporate. Time passes and they come back briefly. They claim that life things happened and vow that production will kick back up. If you’re lucky, they may post once or twice more before returning to the void from whence they came.

I had a baby this year. She’s wonderful. But she has also been feeding directly off of my life force. She’s almost a year old and she’s only now beginning to sleep the night (sorta). I’ve been in full-on survival mode, trying to keep my head above water. LIFE THINGS HAPPENED.

The purpose of this blog was for me to kick-start my data science learnings. That happened to a degree but, eventually, I had to throw some things overboard to keep the plane in flight. I’m circling back and, while I still have a deep interest in data science, I have many other interests I’d like to share and it’d be a shame not to.

For example, in the last couple months, I bought myself a Raspberry Pi and found that it makes an absolute beast of an emulation machine. That’s led me to spend all sorts of time geeking out on it: overclocking it, buying heatsinks and fans to keep it cool. Modding and overclocking PCs used to be my favorite thing to do on the planet.

So, my future readers, accept my proclamation of production increase. I give it to you with the best of intentions. If you see nothing else from me, well, shit.

Don’t fear the math

ML can be intimidating when you first check out some online courses or do something gonzo like try to read a white paper. None are shy about flashing mathematical formulas and functions. It’s easy to allow your eyes to go out of focus, ignore what they scan and trust that the magic ML gnomes will continue to work their mystical powers under the hood.

Personally, I never liked that feeling. In earlier attempts to take online courses, it turned out to be one of my biggest sticking points. I’d stall on progress because every new concept would send me googling around to try to learn what the hell things like the mean squared error were and why it mattered. It quickly became the proverbial rabbit hole and I plunged right down it.
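For anyone about to plunge down the same rabbit hole: mean squared error turned out to be much less scary than its name. It’s just the average of the squared differences between what your model predicted and what actually happened. A minimal pure-Python sketch (my own toy numbers, not from any particular course):

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Two predictions, off by 1 and by 2: MSE = (1 + 4) / 2 = 2.5
print(mean_squared_error([3.0, 5.0], [4.0, 7.0]))  # 2.5
```

Squaring the errors punishes big misses much harder than small ones, which is why it matters for training models.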

My background is certainly not mathematical; I did only what I needed to satisfy my high school and college math requirements. Some things stuck, some things didn’t. Part of the problem is like this TED talk describes. Math can be dry, purposefully hyper-abstract, divorced from the history of what made it important and lacking a human touch.

If you’re trying to be proficient in ML, it just doesn’t help to keep living in ignorance. I had a massive math deficit and I had to pay it down BUT I also needed to make progress on my online courses. The promise I made to myself was that I would meet the deadlines of my online course first before I sated my math curiosities. Put into practice, I’d make sure to do my ML homework first and then, time permitting, take a deeper dive into the math of things.

That’s worked out pretty well and I’ve gained a level of math literacy that I haven’t had in the past. With math formulas, my biggest trouble was the profusion of abstract, single-character variables that show up as coefficients, superscripts and subscripts. When I slowed down and learned to identify how they worked, it made things much more approachable and unlocked a whole host of formulas that were formerly beyond my reach. If you want an easy-to-digest sample for yourself, start with the formula for calculating averages since it has a formal definition to go along with its simple concepts.
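To show what I mean: the formal definition of the mean, x̄ = (1/n) Σ xᵢ, translates almost line-for-line into code once you know that Σ means “sum over” and the subscript i just indexes each value. A quick sketch:

```python
def mean(values):
    # x-bar = (1/n) * sum of x_i for i = 1..n
    n = len(values)
    return sum(values) / n

print(mean([2, 4, 6, 8]))  # 5.0
```

Once the notation clicks for a formula this simple, the scarier ones start to look like the same pattern with more moving parts.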

Outside of just being able to read the mathematical sheet music, I’ve been trying to understand the relationships in math. That’s really where much of the magic of math happens. Everything is related in some way. If you know one or more pieces of information, you can calculate or discover new data. I just went through this whole unit on word embeddings and the vector space and one of the measures of word similarity is done by calculating the cosine between word vectors. There are a variety of related math concepts that made such an interesting feat possible.
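If you want to see that cosine trick without any course material in the way, here’s a bare-bones sketch of it (toy 2D vectors of my own choosing; real word embeddings have hundreds of dimensions, but the math is identical):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same direction score ~1.0; perpendicular ones score 0.0
print(cosine_similarity([1, 2], [2, 4]))  # ~1.0
print(cosine_similarity([1, 0], [0, 1]))  # 0.0
```

That one measure ties together the dot product, vector norms and trigonometry, which is exactly the kind of relationship I was talking about.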

What I truly desire is a mathematical intuition. I want to see a math formula and just have a feeling for what it does without having to be told exactly what it does and why. I want to know my way around such that I can diagnose when my ML model isn’t performing well and how I could improve it. I want to proactively improve my models with what I understand.

My understanding continues to evolve. Learning about linear algebra, trigonometry, calculus, and other areas has been a blast. I highly recommend YouTube tutorials and intros to math. There’s so much good information out there and there are some really enthusiastic and prolific math YouTubers. Khan Academy is excellent too with clear explanations and with some examples you can try. Give math a chance!

What’s on ye old plate

I’ve got a lot of stuff cooking. I’m chasing a bunch of interests and my own professional improvement. To name a few,

On that last point, my wife recently gave birth to our daughter, now 4 weeks old. I’ve got 8 weeks of parental leave from work (thanks, GLG!) and I had some crazy thought that I could leverage that time and really dig in on ML. While I have been able to do some stuff, this kiddo is way more work than I remembered (she’s daughter #3). Most of the time, I’m a glorified heat rock for this little lizard to coil up on.

I’m not squandering a moment, though, and I’m trying to maintain as much learning immersion as possible. When I’m doing one of my many parental duties, I’ve got YouTube running in the background playing introductory videos about machine learning, statistics and math topics. I’ll probably make a post sharing some of my favs.

One quick one I’ve been loving is 3Blue1Brown’s Essence of Linear Algebra. It’s a visual tour of operations like transformation and scaling that I think is going to be key to building intuition about how this math works. It’s quickly expanded my understanding of vectors and how they relate to some of these machine learning parameters that feature hundreds of vector values. Definitely check it out.
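If you’d rather poke at the same ideas in code, the “transformation” the series visualizes is just matrix-vector multiplication, and a scaling matrix is about the simplest example there is. A little sketch (2x2 for readability; nothing here is from the videos themselves):

```python
def transform(matrix, vector):
    """Apply a 2x2 matrix to a 2D vector: a linear transformation."""
    return [
        matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
        matrix[1][0] * vector[0] + matrix[1][1] * vector[1],
    ]

scale_by_2 = [[2, 0], [0, 2]]  # stretches every vector to twice its length
print(transform(scale_by_2, [3, 1]))  # [6, 2]
```

Watching the videos and then multiplying a few matrices by hand is what finally made the pictures and the notation click together for me.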

Sometimes there are things that unlock my understanding that could be useful to others (like the aforementioned YouTube series). I’ll try to condense that into some TIL tidbit that I can quickly blog about. I’d tweet it but…I just never got into Twitter. Maybe that will change but I’m not holding my breath about it. So far I’ve only used it to complain to companies when I can’t get good customer service.

Data Sciencing: The Journey Begins

In recent months, I’ve taken a massive interest in doing some machine learning which is turning me on to the whole world of data science. Not for the typical, “ML, so fashionable” reasons. I have an actual use case I want to solve at work.

Great, so I wanna learn some cool ML tricks to help define and group experts using the secrets hidden in our data. Let’s do a quick google search on how to learn about this stuff…*face melts*

Long story short, everyone and their grandma has got an online course. There’s no shortage of YouTube videos/Coursera/edX/Udemy/LinkedIn Learning/Kaggle/etc. It’s dizzying. I decided to spite my inner FOMO engineer, took a rough cut of courses and selected Duke University’s Introduction To Machine Learning using some weak-ass selection criteria: basically, anything that looked like I could reasonably complete it.

The good news: it was a great choice. It had a fairly low time commitment, which is essential when you have 3 kids (one of whom is a 3-week-old infant) and a full-time job. I was able to get the work done each week. What made this a great match for me was the repetition, visualization and breakdown of the different ML tools like logistic regression, RNNs and CNNs, to name a few. Pretty easy to digest and internalize for a newb. It alleviated much of the mystique around ML without destroying me with equations and math formulae. The labs were mostly good, though there could have been a tiny bit more hand-holding on the RNN lab. Nonetheless, it did the trick and I feel smarter.

I’m jazzed about learning all of this. Each time I’m introduced to a new concept, I run off and read about it on Wikipedia or find some YouTube videos that speak slowly and use simple words for my newly forming ML brain to comprehend. I’ll actually take the time to decipher the mathematical equations just to gain some more insight into why things are the way they are.

There’s going to be some method to the madness though. I have an actual path for the learning and I’m actively resisting any analysis paralysis of all the learning resources out there. I’ve got the following going on:

  • I like having some structure to my learning so I’ve signed up for the DeepLearning.AI Natural Language Processing Specialization.
  • I’m practicing as much immersion as I can. For example, if I’m driving, I’ll listen to a podcast or YouTube video discussing things like principal component analysis. I’m trying to fit as much of this into my day as I can.
  • I’m writing down any crazy idea for an ML project that I have. What’s going to accelerate this learning is executing on some of these new skills.
  • I’ve picked up some books to read when I want a break from online learning. I’ll follow up with some reviews when I’ve had a chance to dig in.
  • I started this damn blog to chronicle my progress and keep me accountable.

On that note, apologies for the shitty blogging and very shitty looking website. I’ve got some work to do on sprucing the place up. I’m waaay out of practice on writing altogether and it shows. Expect better in the near future (thumbsup)