Choosing the proper machine learning method

The scikit-learn site publishes a cheat-sheet map for choosing the right estimator for a particular job. Around the edges of the map are the most common tasks: clustering, classification, regression and dimensionality reduction. From the starting point, the graph asks a few questions about the problem you want to solve. First, it suggests getting more data if you have fewer than 50 observations 🙂 For a classification problem, the suggested techniques are: Linear SVC, SGD Classifier or kernel approximation (for large datasets), Naive Bayes, KNeighbors Classifier, SVC and ensemble classifiers.
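
To make the question-asking idea concrete, here is a minimal Python sketch of the classification branch only. The function name `suggest_classifiers` and the `is_text` flag are made up for illustration. The 100,000-sample cut-off for "large datasets" and the text-data question are my reading of the map, not something it exposes as code. It returns estimator names as plain strings, so it runs without scikit-learn installed.

```python
def suggest_classifiers(n_samples, is_text=False):
    """Return candidate classifier names in the order the map suggests trying them.

    Hypothetical helper: thresholds and ordering are assumptions based on
    reading the cheat-sheet map, not an official scikit-learn API.
    """
    # The map's first question: too little data to do anything useful.
    if n_samples < 50:
        return ["get more data"]

    # Large datasets: start with SGD, fall back to kernel approximation.
    if n_samples >= 100_000:
        return ["SGDClassifier", "kernel approximation"]

    # Smaller datasets: Linear SVC first, then branch on the kind of data.
    candidates = ["LinearSVC"]
    if is_text:
        candidates.append("Naive Bayes")
    else:
        candidates += ["KNeighborsClassifier", "SVC", "ensemble classifiers"]
    return candidates


if __name__ == "__main__":
    for n, text in [(30, False), (5_000, True), (5_000, False), (1_000_000, False)]:
        print(n, text, suggest_classifiers(n, is_text=text))
```

Each returned list is meant as "try these in order", which mirrors how the map sends you to the next box when the previous estimator does not work.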


It would be great if somebody extended this map to more problems and frameworks (not only scikit-learn) and built a website that quickly suggests a robust method by asking questions.

Also, check out a very similar (but much larger!) map on the dlib C++ machine learning library page.

Source: Jassim Moideen on “Big Data and Analytics” LinkedIn group