ML | Logistic Regression v/s Decision Tree Classification

Last Updated : 25 Aug, 2021

Logistic Regression and Decision Tree classification are two of the most popular and basic classification algorithms in use today. Neither algorithm is inherently better than the other; which one performs better usually depends on the nature of the data being worked on. We can compare the two algorithms on several criteria:

| Criteria | Logistic Regression | Decision Tree Classification |
|---|---|---|
| Interpretability | Less interpretable | More interpretable |
| Decision Boundaries | A single, linear decision boundary | Recursively splits the feature space into smaller regions |
| Ease of Decision Making | A decision threshold has to be set | Automatically handles decision making |
| Overfitting | Less prone to overfitting | Prone to overfitting |
| Robustness to noise | Relatively robust to noise | Strongly affected by noise |
| Scalability | Requires a large enough training set | Can be trained on a small training set |

As a simple experiment, we run the two models on the same dataset and compare their performance.
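Before turning to that experiment, two rows of the comparison can be made concrete with a small sketch. It uses synthetic two-feature data from `make_classification` rather than the article's dataset, and the 0.8 cut-off and `max_depth=3` are hypothetical values chosen only for illustration. The sketch shows that logistic regression produces probabilities that still need a threshold and is described by one linear boundary, while a fitted tree returns labels directly from its leaf regions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-feature data (an assumption for illustration,
# not the dataset used in the article).
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

lr = LogisticRegression().fit(X, y)
dt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Logistic regression yields a probability for class 1;
# turning it into a label requires choosing a threshold.
proba = lr.predict_proba(X)[:, 1]
default_labels = (proba >= 0.5).astype(int)  # the usual 0.5 cut-off
strict_labels = (proba >= 0.8).astype(int)   # hypothetical stricter cut-off

print("positives at 0.5:", default_labels.sum())
print("positives at 0.8:", strict_labels.sum())

# The whole model is one linear boundary: w . x + b = 0
print("weights:", lr.coef_[0], "intercept:", lr.intercept_[0])

# The tree needs no threshold; each leaf is one region of the feature space.
tree_labels = dt.predict(X)
print("number of leaf regions:", dt.get_n_leaves())
```

Raising the threshold can only reduce the number of samples labelled positive, which is the kind of trade-off the "decision threshold" row refers to.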
Step 1: Importing the required libraries

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
```

Step 2: Reading and cleaning the Dataset

```python
# Run this from the folder that contains the file. The original session
# changed the working location with the IPython command:
#   cd C:\Users\Dev\Desktop\Kaggle\Sinking Titanic
df = pd.read_csv('_train.csv')

y = df['Survived']
X = df.drop('Survived', axis=1)
X = X.drop(['Name', 'Ticket', 'Cabin', 'Embarked'], axis=1)

# Encoding the categorical 'male'/'female' values as integers
# (a simple label encoding, not one-hot encoding)
X = X.replace(['male', 'female'], [2, 3])

# Handling the missing values by forward-filling from the previous row
# (fillna(method='ffill') is deprecated in recent pandas versions)
X = X.ffill()
```

Step 3: Training and evaluating the Logistic Regression model

```python
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

lr = LogisticRegression()
lr.fit(X_train, y_train)

# Accuracy on the held-out test set
print(lr.score(X_test, y_test))
```

Step 4: Training and evaluating the Decision Tree Classifier model

```python
criteria = ['gini', 'entropy']
scores = {}

for c in criteria:
    # random_state is fixed so that the scores are reproducible
    dt = DecisionTreeClassifier(criterion=c, random_state=0)
    dt.fit(X_train, y_train)
    test_score = dt.score(X_test, y_test)
    scores[c] = test_score

print(scores)
```

Comparing the scores, the logistic regression model performed better on this dataset, but this will not always be the case.

Author: AlindGupta
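The overfitting row of the comparison can be examined with a rough sketch that reports training and test accuracy side by side. This is not the experiment above: the data is synthetic, generated with `make_classification`, the 20% label noise (`flip_y=0.2`) is an assumed setting, and the depth-limited tree is a hypothetical extra model added only for contrast.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 20% randomly flipped labels (assumed for illustration).
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "full tree": DecisionTreeClassifier(random_state=0),
    # Hypothetical depth limit, included only to contrast with the full tree.
    "depth-3 tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    results[name] = (train_acc, test_acc)
    print(f"{name}: train={train_acc:.3f}, test={test_acc:.3f}")
```

An unrestricted tree can memorise the training set, noisy labels included, so a large gap between its training and test accuracy is the usual sign of overfitting. The exact numbers depend on the data and the seed, so they should be read as indicative only.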