Radek Osmulski: How to Teach Yourself ML

Radek details his journey switching careers into software engineering and then into machine learning. He talks about mistakes he made, how he would do it now, and gives a preview of his forthcoming book.

Guest Bio

Radek Osmulski is a fully self-taught machine learning engineer. After getting tired of his corporate job, he taught himself programming and started a new career as a Ruby on Rails developer. He then set out to learn machine learning. Since then, he's been a Fast AI International Fellow, become a Kaggle Master, and is now an AI Data Engineer at the Earth Species Project.

Show Notes

  • 02:15 How Radek got interested in programming and computer science
    • Always interested in the concept of programming
      • Used to peruse programming books in bookstores
    • Had a corporate job with a project to move data from one place to another, estimated to take 100 contractor-hours
      • Decided to hack together a Ruby script
      • Even though the code was not very pretty, he was "still very happy with [it] because it delivered economic value"
    • Started to get bored at his job and started taking online courses in programming
    • Eventually became a Ruby on Rails developer
  • 09:00 How Radek taught himself machine learning
    • Used to think that "learning was this process where you would sit down with a book and in two hours you came out on the other end and could talk about something abstract, understanding the theory"
    • Took Andrew Ng's machine learning course and thought it was amazing
    • Wanted to learn more: "on the internet, people said you have to learn linear algebra. You have to learn probability and statistics"
    • "So I started doing this curriculum and then it seemed that I need some real analysis to understand some of the proofs. On my commute to work, I would read these books"
    • Implemented papers from Geoffrey Hinton from scratch
    • "But I put in all this effort and I felt that if I were to be presented with a real life machine learning problem, I wouldn't even know where to start."
    • Tried to learn with Kaggle: "to say that I was intimidated by what people were doing there would be an understatement. I would go into the forums and I didn't understand the language that people were talking. This led me to believe that maybe this is not something that I can do"
    • Found Fast AI through a post by Rachel Thomas on Hacker News (HN)
      • "The things I was hearing in that course, they sounded like blasphemy"
        • Jeremy Howard would say that you don't need to learn to take the derivatives yourself, since the computer does it for you
      • "I was not ready to go all in on that. It sounded too far out there for me. And the approach to learning was completely new. So I gave up."
    • Half a year later, found Fast AI again
      • "I just couldn't stop myself. I said, Alright, this is the last time. This is the last time where I give this a try. But this time I was so desperate I went in with the idea that I will do exactly what I'm being told to do, no questioning."
  • 26:40 The skills Radek learned from Fast AI
    • Found the importance of community in the learning process
      • "You get to see people who are just like yourself doing something and then showing you the results. You can look at their journey. So three months ago they couldn't train and deploy a new model, but now they can. And in another three months, they seem to do very well in a Kaggle competition."
      • Asked tons of questions on the forums and tried to give answers as well
        • Generation effect: we retain more when we produce an explanation ourselves than when we only read one; learning comes from reflecting on experience, not from experience itself
        • "If I said something that was not completely right, others could step in and correct [my understanding]"
    • "99% of what I know that is practical about machine learning, I learned from Jeremy [Howard]. He has been my greatest and possibly only mentor."
    • Even in academic research, empiricism rules in deep learning. Math and theory usually come after the result. More important to have good intuition, which can be obtained by seeing and solving lots of problems.
    • Iteration speed is key
  • 39:20 Radek's recommendations for people learning ML now
    • Start out with courses: Andrew Ng's and Fast AI
    • Do Kaggle competitions
    • Gain understanding by reproducing example notebooks yourself, looking at the originals as few times as possible
    • Root your learning in practice: pull in concepts and theories as you need them for a project, and let reality be the teacher
    • Learn in public
      • "You are the best person in the entire world to help somebody who is just a few steps behind you. There are blog posts that I would not be able to write right now that I was able to write a couple of years ago from the perspective of somebody who just picked up the skills. So I found something that was hard. I learned how to do it. And then I explained it in the language that Radek from three months ago would understand"
      • "With every piece of work that you share, you are building your credibility, more people are hearing about you... Just through being active on the Fast AI forums, I got many job offers... and also through Kaggle competitions"
      • "So there is value in putting your name out there and sharing your work. It sounds bizarre that this is how the life works now, but I keep running into people who are having these freak accidents: They post something online, they keep doing it for a year or two, and suddenly they wake up in a completely different place professionally."
  • 51:30 Why Radek is writing a book
    • Giving back to the community that's given so much to him
    • Wants to document the learnings from his journey
    • Wants to have the experience of selling something online
  • 01:01:20 Radek's work at the Earth Species Project
    • ESP is trying to decode non-human communication
    • "This is something that many people in the machine learning field are not aware of, but if you train embeddings on texts from various languages, you can then translate between the languages without the Rosetta Stone. You align the embeddings [latent spaces] and in the process you can go from word to word in one language to the other."
    • Animals are more "intelligent" than most people think, but this is obscured by the communication barrier
    • Hopes that the project will increase empathy towards animals
  • 01:10:15 How the ESP collects animal language data
    • Both collecting data themselves via animal microphone tags and finding existing data
    • "A big part of the work is taking the data and making it available to researchers"
    • Lots of work to be done pre-processing it, cleaning it, organizing it
  • 01:21:05 Rapid fire questions
    • For fun: learning new things
    • Books: A Guide to the Good Life, The Origin of Wealth, Make Time, You Are Here
    • Advice: Take the Fast AI courses
    • Recently changed mind: Need to progress through each learning stage, can't jump straight to the end
    • Important truth: all ML practitioners have a moral responsibility to make sure their models are unbiased and used ethically
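
The embedding-alignment idea Radek describes in the Earth Species Project segment can be illustrated with a toy sketch. This is not ESP's method, and it simplifies the idea in two ways that are assumptions of this example: the "embeddings" are invented 2-D vectors (real ones have hundreds of dimensions), and the alignment is fitted from two known seed pairs, whereas the approach in the quote needs no such dictionary. The word lists, the 40-degree offset, and the function names are all hypothetical. What the sketch does show is the core step: if two embedding spaces share the same geometry, a single rotation maps one onto the other, and nearest-neighbor lookup then translates words that were never paired.

```python
import math

# Hypothetical 2-D "embeddings" for one language; values are invented.
ENGLISH = {
    "dog": (1.0, 0.0),
    "cat": (0.8, 0.6),
    "bird": (0.0, 1.0),
    "fish": (-0.6, 0.8),
}


def rotate(vec, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    x, y = vec
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


# Assumption: the second language has the same geometry, rotated by 40 degrees.
_TRUE_PAIRS = {"dog": "perro", "cat": "gato", "bird": "ave", "fish": "pez"}
SPANISH = {
    es: rotate(ENGLISH[en], math.radians(40)) for en, es in _TRUE_PAIRS.items()
}


def fit_rotation(src, tgt):
    """Least-squares rotation angle that maps src points onto tgt points.

    This is the closed-form 2-D case of orthogonal Procrustes alignment,
    restricted to rotations.
    """
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(src, tgt))
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(src, tgt))
    return math.atan2(cross, dot)


# Fit the alignment on two seed pairs only (a simplification, see above).
SEEDS = [("dog", "perro"), ("cat", "gato")]
theta = fit_rotation(
    [ENGLISH[en] for en, _ in SEEDS],
    [SPANISH[es] for _, es in SEEDS],
)


def translate(word):
    """Map an English vector into the Spanish space, return the nearest word."""
    mapped = rotate(ENGLISH[word], theta)
    return min(SPANISH, key=lambda w: math.dist(mapped, SPANISH[w]))


# "bird" and "fish" were not among the seed pairs.
print(translate("bird"), translate("fish"))
```

Because the toy Spanish vectors are an exact rotation of the English ones, the fitted angle recovers the offset and the held-out words land on their counterparts. With real embeddings the match is only approximate, and the mapping is a higher-dimensional orthogonal matrix rather than a single angle.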