Machine Learning Study Group Recap – Week 1

Hello World!

Many of you reading this are probably part of the study group. For those who are not, or who are referencing this at a later time, this recap covers the following course on Coursera. If you would like to join our study group, please see one of the following meetup pages: Fort Lauderdale Machine Learning or Florida Dot Net.

Here in South Florida we have a strong Machine Learning and Data Science community, and therefore it is easy to get a study group together. This article is a recap of the first meeting of our study group. Note that this first meeting took place the week before the class started. Therefore this article serves as an introduction to machine learning, language choices, time commitments and other generally applicable questions and concerns.

Course Time Commitment

As discussed, the course realistically appears to take ~5 hours/week to do well in. That said, I am personally attempting to bring what I learn in this course into my everyday development life, and doing this adds ~5 more hours’ worth of effort. To derive significant value from the course, I highly suggest doing the course, coming to the study group and also implementing all equations and code in your chosen language. This is what I am doing and it is proving very successful for my own learning.

Battle of the Languages

A more detailed write-up on how I came to this ranking can be found here. I recommend reading that article, as Python or perhaps R might be your number 1 pick in certain situations. It all depends.

  1. F# is your best bet if you need to deliver code to modern client devices and want to work in as few languages as possible while still having significant power.
  2. C# is your next bet. From an ML workload perspective it is not very good; however, it hits all modern client targets. If your machine learning is not targeting modern clients, I would skip this language and use Python or R instead.
  3. Python is next up. It is very server-side friendly and has a nice licensing model for delivering client applications; however, it falls over for modern mobile targets.
  4. R is best if you are in a large organization and your algorithms will live primarily server side. Through the F# R type provider, though, you do have a path to bring it onto modern mobile applications. Just realize that you are on a team and have extra steps. For a jump-start on the R language, see this video series (episodes 2, 3, 5 and 7), or simply follow this guided tour.
  5. Octave is just garbage, and we are only going to use it for the course because we have to. I wouldn’t use it in production unless there are significant improvements to the runtime of this language.

For those who are completely new to programming and machine learning, I would suggest learning R or F# first. R will be the easiest to learn and is very applicable in the enterprise space, while F# allows you to be extremely flexible but may have a higher barrier to entry.

Submitting Homework

First things first: create a C:\Courses\ML folder. After that, add a folder for each week.

  • C:\Courses\ML\Week1
  • C:\Courses\ML\Week2
  • C:\Courses\ML\Week n
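If you would rather script the folder layout above than create it by hand, here is a minimal Python sketch. The function name create_week_folders and the week count passed to it are my own placeholders, not anything from the course; adjust the count to however many weeks you need.

```python
from pathlib import Path


def create_week_folders(root, weeks):
    """Create root/Week1 ... root/Week<weeks> and return the folder paths.

    Hypothetical helper: 'root' and 'weeks' are whatever you choose.
    """
    root = Path(root)
    created = []
    for week in range(1, weeks + 1):
        folder = root / f"Week{week}"
        # parents=True also creates C:\Courses\ML itself if it is missing;
        # exist_ok=True makes the script safe to run more than once.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Assumption: 2 weeks, mirroring the list above. Raise as the course goes on.
    for folder in create_week_folders(r"C:\Courses\ML", weeks=2):
        print(folder)
```

Running it again later with a larger week count only adds the missing folders and leaves existing ones untouched.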

Download the zip for each week into the matching folder. Try to submit first, before applying the patch we discussed in the course. If submission fails for a reason along the lines of “url read” or “CA Certificate”, then download this zip into the folder and unzip it. It should replace various components in your lib folder.

Practical Applications of Machine Learning

This came up quite frequently, so I will lay out a few that I personally have done.

  1. A Chinese bank was having issues with the OCR library they were using. Understanding the technology underlying OCR allowed me to make recommendations on improving their recognition rates.
  2. I am currently building a financial modelling application for my own purposes, but also to eventually ship to production as a mobile application. Due to the nature of how I’m building it, I am not able to take advantage of server-side machine learning in all scenarios. That forces me to use code libraries and to have deep knowledge in order to deliver this application to production.
  3. Improving farming. Common problems are over-application of fertilizer, under-application, not enough water, etc. The machine learning techniques we will discuss and learn can be applied in an IoT scenario to fix these issues.
  4. Improving customer retention for restaurants. A 20% retention rate isn’t bad, but it can be improved by using these techniques to identify which behaviors succeed and which fail at retaining top customers. This process also helped identify who the top customers even are. They may not be who you think they are.

So that’s just a few projects. I’ve been on so many that required some form of machine learning or intelligence that I can’t keep tabs on them anymore. The sum of it is that machine learning is not only practical but highly profitable.

I don’t know math or how to program.

Who cares? Nobody starts off good at this stuff. I can literally look at stuff I wrote a week ago and feel disgusted by it. You will start off terrible, but if you don’t start, then you never get good. The best way to get good is to surround yourself with the best of the best and just keep showing up. Many folks know I do Brazilian jiu-jitsu. I’m actually pretty good now. I got good by getting my ass kicked for 5 years. Time for you to start getting your ass kicked, and then you will finally be good. It’s just like any new skill. You always start off sucking. Just like anything, it’s just sheer time commitment.

The nice thing is that, at least here in South Florida, there is a support structure. Come on out to one of our groups and we will help you get where you want to be, but it will require hard work, effort and time.

What about the Server Side Platforms?

Great question. There are fantastic server side platforms out there at very reasonable prices. Azure Machine Learning is one of my favorites. I use it all the time for customers. The key here is twofold.

  1. It’s server side. If you have disconnected needs, this will simply not work. You need a network connection.
  2. Understanding the underlying technology allows you to take advantage of the wrapping technology more effectively.  Your accuracy rates will increase, your training times will decrease, your architectures will get better.

Beyond those two key points, those platforms are fantastic for rolling quick models that can then be ported to a language for your disconnected experiences. On the other hand, they may not offer exactly what you want. You can end up spending more time writing custom code to fill the gap, and you have added an architectural component to maintain that you may not have needed. It’s all ROI. Each item has its place in a given scenario. Server side platforms are great and fill that particular need. In this course we are focusing on filling a different need, as well as increasing our own skills on those platforms.

What is the difference between Machine Learning and Artificial Intelligence?

Another great question! Artificial intelligence can be thought of as an umbrella term for machine intelligence, and Machine Learning fits within it. Artificial intelligence allows for rules-based intelligence (“If this, then that”) and various nestings and versions of that. Machine Learning is focused on taking the information at hand and applying mathematical models to that information to come up with a result. In rules-based intelligence you are guaranteed the same result every time for a given input. In machine learning the output changes with the training as well as the input, which provides an adaptive framework for intelligent decision making.

AI is an umbrella term and therefore may encompass machine learning, but does not require it. That said, you will often find in Machine Learning applications that there is a marriage between mathematical models and rules. It all depends on the use case and the ROI.
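To make the distinction concrete, here is a small Python sketch. Everything in it is made up for illustration: the "reading" feature, the hand-written cutoff of 3, and the very simple learning method (the midpoint between the two class averages). It is not an algorithm from the course, just the smallest thing I could think of that changes its answer based on training data.

```python
def rule_based_flag(reading):
    # Rules-based: the cutoff is written by hand and never changes,
    # so the same input always gives the same output.
    return reading > 3


def train_threshold(examples):
    """Learn a cutoff from (reading, flagged) pairs.

    Illustrative method only: the midpoint between the average flagged
    reading and the average unflagged reading.
    """
    flagged = [x for x, label in examples if label]
    unflagged = [x for x, label in examples if not label]
    return (sum(flagged) / len(flagged) + sum(unflagged) / len(unflagged)) / 2


def learned_flag(reading, threshold):
    # Machine learning: the decision depends on what was learned in training.
    return reading > threshold


if __name__ == "__main__":
    # Two hypothetical training sets with different labelled examples.
    data_a = [(0, False), (1, False), (6, True), (8, True)]
    data_b = [(0, False), (1, False), (2, True), (3, True)]

    print(rule_based_flag(3))                        # False, always
    print(learned_flag(3, train_threshold(data_a)))  # False (cutoff 3.75)
    print(learned_flag(3, train_threshold(data_b)))  # True (cutoff 1.5)
```

The same input of 3 gets a different answer from the learned version depending on which data it was trained on, while the rule never moves. A real application could also combine the two, for example by applying a hard rule first and falling back to the learned cutoff.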

So that’s about it! Feel free to post comments and questions at the bottom of this page.
