Implementing Decision Trees using Scikit-Learn

What is Scikit-Learn?

Scikit-Learn is a popular machine learning library for the Python programming language. If you want to try out what you know with just a few lines of code, scikit-learn is what you need. From linear and logistic regression to SVM and KNN, you name it and scikit-learn has it.

What is pandas?

You will often need to prepare and transform your data into a form that scikit-learn can use to train models. Pandas is an awesome Python library for this purpose. It offers a lot of functionality, including reading data from and writing data to a variety of sources.
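As a rough illustration of that kind of preparation, here is a minimal sketch. The column names and values are made up for the example (they are not from any real dataset), and in practice the DataFrame would more likely come from something like pd.read_csv. It separates the target column from the features and one-hot encodes a text column so that every feature is numeric.

```python
import pandas as pd

# Hypothetical toy data, invented for this example only.
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "bought": [0, 1, 1, 0],
})

# Separate the target column from the feature columns.
y = df["bought"]
X = df.drop(columns=["bought"])

# One-hot encode the text column so that all features are numeric.
X = pd.get_dummies(X, columns=["city"])

print(X.columns.tolist())
```

After this step, X and y are in a shape that scikit-learn estimators can accept in their fit method.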

What are decision trees?

Decision trees are a machine learning algorithm that can be used for both classification and regression. Their interpretability makes them really popular among data scientists. You can get started implementing decision trees with scikit-learn knowing very little about them. But of course, that won’t take you far in the long run. So, how about getting a clear understanding of the following first:

Decision Trees in Machine Learning

Understanding the Decision Tree Structure — Scikit-Learn
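To give a feel for the interpretability mentioned above, here is a small sketch that prints the rules a fitted tree has learned. It assumes the bundled iris dataset and a depth limit of 2 purely to keep the output short; neither choice comes from the articles listed above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# The iris dataset ships with scikit-learn; it is used here only as an example.
iris = load_iris()

# A shallow tree (max_depth=2 is an arbitrary choice) keeps the printed rules short.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# export_text renders the learned splits as indented if/else style rules.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```

Each line of the printed output is a threshold test on one feature, which is what makes a small tree easy to read and explain.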

Decision Tree Classifier — Scikit-Learn

Scikit-Learn has nice documentation of its API for learning a decision tree classifier from your data. The parameters can be tuned to achieve higher accuracy, but you can always use the defaults to start classifying your data in just three lines of code. All you need to do is create an object of the DecisionTreeClassifier class and fit it to your data. The methods and attributes the API provides let you get predictions, feature importances and much more. You will often need to process your data before scikit-learn can use it, but with pandas it’s a cakewalk.
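The steps described above can be sketched roughly as follows. This assumes the bundled iris dataset and a 75/25 train/test split, both chosen only for illustration. The core really is three lines: create the classifier, fit it, and predict.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Example data: the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# The three core lines: create, fit, predict (default parameters otherwise).
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)

# Mean accuracy on the held-out data, and the importance of each feature.
accuracy = clf.score(X_test, y_test)
importances = clf.feature_importances_
print(accuracy)
print(importances)
```

The random_state argument is only there to make the run repeatable; leaving it out is fine when you just want to experiment.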

Decision Tree Regressor — Scikit-Learn

Decision trees for regression problems, where you predict a number rather than a class, are just as easy to implement using Scikit-Learn. You will notice that the basic parameters and attributes provided by the API are mostly the same as for the Decision Tree Classifier. The major difference is in the underlying logic of the two algorithms. You can read more about it here.
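Here is a minimal sketch of the regressor in use, on synthetic data generated just for this example (a noisy sine curve, which is an assumption and not data from this article). The calls mirror the classifier: create, fit, predict. One visible difference in the defaults is the split criterion: the classifier uses Gini impurity, while the regressor uses squared error.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic example data: a noisy sine curve over [0, 5).
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.randn(80)

# max_depth=3 is an arbitrary choice that keeps the tree from chasing the noise.
reg = DecisionTreeRegressor(max_depth=3, random_state=0)
reg.fit(X, y)

# Predict numeric values for a few new inputs.
X_new = np.array([[1.0], [2.5], [4.0]])
y_pred = reg.predict(X_new)
print(y_pred)
```

Because each leaf predicts a single constant value, a tree of depth 3 can output at most 8 distinct numbers, so the fitted curve looks like a staircase.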

This should get you going with decision tree learning using Scikit-Learn. Just get your data ready, train your model and get predictions in a few lines of code. Stay tuned to learn how to tune the parameters to achieve the best performance.

If you liked this article, be sure to click ❤ below to recommend it and if you have any questions, leave a comment and I will do my best to answer.

To keep up with the world of machine learning, follow me. It’s the best way to find out when I write more articles like this.

You can also follow me on Twitter, email me directly or find me on LinkedIn. I’d love to hear from you.

That’s all, folks. Have a nice day :)
