Advanced Machine Learning by Hilary Mason; O’Reilly Media
September 5, 2012
The Advanced Machine Learning video collection is a quick presentation of some machine learning techniques and algorithms, covered in just over 2 hours. That is a short time in which to cover any of the many algorithms in machine learning, but there is still a chance that you will learn something. If you are experienced in machine learning, you already know that nothing can be covered in enough detail in 2 hours, so this collection will not be much help to you. If you are fairly new to the topic, you might learn a little about some interesting algorithms, but if your expectation is that you will be able to apply them directly or explain them to co-workers, you will have to dive deeper somewhere else.
Some of the algorithms covered were ones I had not worked with before, such as Bloom filters, simhashing, and Hamming distance. Hilary explains these algorithms through examples written in Python that use libraries which already implement them. The problem is that she does not go into enough detail for you to implement them in another language, so you will need to research them yourself to get a better understanding.
I did enjoy her advice that one way to become a better data analyst is to watch and talk with other data analysts to see the tools they use and the approaches they take. If you treat what she presents here in that spirit, you will pick up some new topics to apply to data mining, with the caveat that you will need to spend time researching the details of the algorithms.
Personally, I would have enjoyed learning more about random forests of decision trees and about dimensionality reduction. Since Hilary had the opportunity to create a whole collection of videos on advanced machine learning, she could have gone a bit deeper into the different algorithms and the situations in which you can apply them. She could also have taken the time to explain the results that are presented after running the algorithms.
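To give a flavor of what two of these ideas look like without a library, here is a minimal sketch of my own. It is not code from the videos or their GitHub repository, and the names `hamming_distance` and `BloomFilter`, the filter size, and the salted SHA-256 hashing scheme are all assumptions chosen for illustration. Hamming distance counts the bit positions where two values differ, which is one common way to compare simhash fingerprints. A Bloom filter answers "have I seen this item?" using a fixed-size bit array, at the cost of occasional false positives.

```python
import hashlib


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two integers differ."""
    return bin(a ^ b).count("1")


class BloomFilter:
    """A tiny Bloom filter: may give false positives, never false negatives.

    Illustrative only; size_bits and num_hashes are arbitrary defaults.
    """

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        # The item is "possibly present" only if all of its bits are set.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


seen = BloomFilter()
seen.add("http://example.com/a")
print("http://example.com/a" in seen)     # True: added items are always found
print(hamming_distance(0b1011, 0b1001))   # 1: the values differ in one bit
```

A real implementation would size the bit array and the number of hash functions from the expected item count and the false-positive rate you can tolerate, which is exactly the kind of detail you have to look up elsewhere.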
On a side note, the students in the class do not seem to fully understand what is being presented to them. Not to mention, what is the person with the iPad even doing for the entire length of the videos? She never seems to look up, and there is no way that she is coding. If she is, then I would like to know what application she is using.
The only person I would recommend this collection of videos to is someone who is interested in becoming a data scientist and has not yet taken a machine learning class. Otherwise, I would look at books like “R in a Nutshell”, the course provided by Stanford, or other free online courses.
If you do get this collection of videos, I recommend downloading the source and data files from GitHub. Then you can follow along with the examples, even if you are using Windows. I would recommend installing Cygwin before watching the videos and making sure that you have Python and an editor configured.
I did forget to mention that the data she uses in the examples comes from real data sources or data feeds. She states that this is one of the differences from the introductory video collection that she did on machine learning.