Choosing the proper machine learning method

Scikit-learn portal published a cheat sheet map for choosing a right estimator for the particular job. On the edge of map there are most common jobs: clustering, customization, regression and dimensions reduction. From the start point, graph asks a couple of questions on your problem which you want to solve. Firstly, it suggest to get more data if there are less than 50 observations 🙂 On the classification problem, possible given solving techniques are: Linear SVC, SGD Classifier or kernel approximation (for large datasets), Naive Bayes, KNeighbour Classifiers, SVC (ensemble classifiers).

machine-learning
Click to view larger

It would be great if somebody improved this map for more problems, frameworks (not only scikit-learn) and made a website for fast robust method suggestion (through question asking).

Also, check out a very similar (but much larger!) to the map described on dlib C++ Machine Learning library page. http://dlib.net/ml_guide.svg

Source: Jassim Moideen on “Big Data and Analytics” LinkedIn group

Advertisements