Data Prediction using Decision Tree of rpart

Last Updated : 23 Jul, 2024

Decision trees are a popular choice for predictive modeling because they are simple, easy to interpret, and able to handle both numerical and categorical data. The rpart (Recursive Partitioning) package in R builds these trees for classification and regression problems.

Overview of rpart

rpart stands for Recursive Partitioning. It builds a tree from a sequence of binary rules: at each node it chooses the split that makes the resulting subgroups as homogeneous as possible, then repeats the process on each subgroup. The same procedure handles both classification and regression tasks.
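To make "homogeneity" concrete, the sketch below computes the Gini impurity of the Iris labels before and after one candidate split. Gini is rpart's default splitting criterion for classification; the helper names gini and split_impurity and the threshold 2.45 are chosen here for illustration and are not part of the rpart API.

```r
# Gini impurity of a vector of class labels: 1 - sum(p_k^2)
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Size-weighted impurity of the two groups produced by a binary split
split_impurity <- function(labels, goes_left) {
  n <- length(labels)
  left <- labels[goes_left]
  right <- labels[!goes_left]
  (length(left) / n) * gini(left) + (length(right) / n) * gini(right)
}

data(iris)
parent <- gini(iris$Species)  # impurity before splitting
child <- split_impurity(iris$Species, iris$Petal.Length < 2.45)  # illustrative threshold
c(parent = parent, child = child)
```

A lower value after the split means the subgroups are purer than the parent node. A tree builder evaluates many candidate splits in this way and keeps the best one.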

Setting Up rpart

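To set up a decision tree using rpart, you need:

  • A properly formatted dataset: Store categorical variables as factors and drop unused factor levels. rpart can work with missing predictor values through surrogate splits, but under the default settings rows with a missing response are dropped.
  • A formula specifying the model: This formula determines which variable is predicted and which variables are used as predictors.
  • Parameters such as method, minsplit, cp, and maxdepth, which control the complexity and performance of the tree. method is passed to rpart() directly; minsplit, cp, and maxdepth are usually set through rpart.control().

The following steps walk through data prediction with rpart in the R programming language.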
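The tuning parameters mentioned above can be passed through rpart.control(). The sketch below is only an illustration: the specific values (minsplit = 10, cp = 0.01, maxdepth = 3) are arbitrary assumptions, not recommendations, and the formula Species ~ . is shorthand for using all other columns as predictors.

```r
library(rpart)

# Hypothetical control settings, chosen only for illustration
ctrl <- rpart.control(
  minsplit = 10,  # minimum observations in a node before a split is attempted
  cp = 0.01,      # a split must improve the fit by at least this factor
  maxdepth = 3    # maximum depth of any node (root counts as depth 0)
)

fit <- rpart(Species ~ ., data = iris, method = "class", control = ctrl)
print(fit)
```

Smaller cp or minsplit values and a larger maxdepth generally allow a bigger tree, which can fit the training data more closely at the risk of overfitting.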

Step 1: Load the Necessary Library

Load the rpart package, which is required to build decision tree models. If rpart is not installed, install it first with install.packages("rpart").

R
library(rpart) 

Step 2: Load the Dataset

Load the built-in Iris dataset and display its first rows. The dataset includes four features (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) and a target variable (Species).

R
data(iris)
head(iris)

Output:

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Step 3: Build the Decision Tree Model

Fit a classification tree that predicts Species from the four measurements.

R
model <- rpart(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
               data = iris, method = "class")

  • Formula: Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width. This formula specifies that the Species is the dependent variable, and the four measurements are the independent variables.
  • Data: Specifies the dataset (iris) to use for the model.
  • Method: "class" indicates that the task is a classification. It instructs rpart to treat the Species variable as a categorical outcome.
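Once the model is fitted, it can help to inspect it as text before plotting. The sketch below refits the same tree so that it runs on its own (Species ~ . is shorthand for the four predictors) and then calls print() and printcp() from rpart; the variable.importance element is part of the fitted object whenever the tree has at least one split.

```r
library(rpart)

# Same model as in Step 3, written with the "." shorthand for all predictors
model <- rpart(Species ~ ., data = iris, method = "class")

print(model)               # one line per node: split rule, counts, predicted class
printcp(model)             # complexity table: CP, number of splits, error estimates
model$variable.importance  # named vector, larger values mean more influence
```

The complexity table is the usual starting point for deciding whether the tree should be pruned to a smaller size.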

Step 4: Plot the Decision Tree

Plot the fitted tree and add labels to it.

R
plot(model)
text(model, use.n = TRUE)

Output:

Figure: Decision Tree

  • plot(model): Draws the basic structure of the tree.
  • text(model, use.n=TRUE): Labels the splits and leaves. With use.n=TRUE, each leaf also shows how many training observations of each class end up in it.
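With the default settings the labels are sometimes clipped at the edge of the plot. The sketch below shows one way to adjust the layout and save the figure to a file, using the uniform and margin arguments of plot() for rpart objects. The file name iris_tree.pdf and the sizes are arbitrary assumptions.

```r
library(rpart)

model <- rpart(Species ~ ., data = iris, method = "class")

# Write the plot to a PDF in the session's temporary directory (illustrative path)
out_file <- file.path(tempdir(), "iris_tree.pdf")
pdf(out_file, width = 8, height = 6)
plot(model, uniform = TRUE, margin = 0.1)  # even vertical spacing, extra white border
text(model, use.n = TRUE, cex = 0.9)       # split labels plus class counts per leaf
dev.off()
```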

Step 5: Create New Data for Prediction

Define a new observation to classify. The column names must match the predictor names used in the formula.

R
new_data <- data.frame(Sepal.Length = 5.5, Sepal.Width = 3.5,
                       Petal.Length = 1.4, Petal.Width = 0.2)

Values: The measurements provided are hypothetical and are used to demonstrate how the model performs predictions.

Step 6: Make Predictions

R
prediction <- predict(model, new_data, type = "class")
print(prediction)

Output:

     1 
setosa
Levels: setosa versicolor virginica
  • model: The decision tree model built in step 3.
  • new_data: The new data point defined in step 5.
  • type="class": Specifies that the prediction should return the class (species) rather than probabilities.
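A single prediction does not show how well the tree generalizes. The sketch below is one possible check, not part of the original walkthrough: it holds out 30% of the rows, asks predict() for class probabilities with type = "prob", and builds a confusion matrix from the predicted labels. The seed and the 70/30 split are arbitrary assumptions.

```r
library(rpart)

set.seed(42)  # arbitrary seed so the split is reproducible
train_idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train <- iris[train_idx, ]
test <- iris[-train_idx, ]

model <- rpart(Species ~ ., data = train, method = "class")

# Class probabilities: one row per test observation, one column per species
probs <- predict(model, test, type = "prob")
head(probs)

# Predicted labels compared with the true labels
pred <- predict(model, test, type = "class")
conf_mat <- table(predicted = pred, actual = test$Species)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

The diagonal of the confusion matrix counts correct predictions, so the accuracy is the diagonal sum divided by the number of test rows.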

Conclusion

The rpart package offers a straightforward way to build decision trees in R, for novice and experienced data scientists alike. The steps in this article (loading the data, fitting a model with rpart(), plotting the tree, and calling predict() on new observations) cover a basic classification task, and the same workflow extends to regression by changing the method argument. In both cases the result is a model whose decisions can be read directly from the tree.


poojashu00qn