Make_pipeline() function in Sklearn

Last Updated : 04 Sep, 2022

In this article, let's learn how to use the make_pipeline() function from scikit-learn in Python.

The make_pipeline() function creates a Pipeline from the estimators you pass in. It is a shorthand for the Pipeline constructor: naming the estimators is neither required nor allowed. Instead, each step is named automatically after the lowercase of its estimator's class. When we want to apply a sequence of operations to data step by step, we can chain all the estimators into a single pipeline this way.
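As a minimal sketch of this naming behavior, the two constructions below build equivalent pipelines; the step names that make_pipeline() generates are simply the lowercased class names:

```python
from sklearn.pipeline import make_pipeline, Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# make_pipeline() names each step after its class, lowercased
pipe = make_pipeline(StandardScaler(), LogisticRegression())
print([name for name, _ in pipe.steps])
# ['standardscaler', 'logisticregression']

# equivalent explicit construction with the Pipeline constructor,
# where the names must be supplied by hand
explicit = Pipeline([('standardscaler', StandardScaler()),
                     ('logisticregression', LogisticRegression())])
```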

Syntax: make_pipeline(*steps, memory=None, verbose=False)

Parameters:

  • steps: list of estimator objects: The scikit-learn estimators to chain, in order.
  • memory: str or object with the joblib.Memory interface, default=None: Used to cache the pipeline's fitted transformers. No caching is done by default. If a string is given, it is the path to the cache directory. When caching is enabled, a copy of each transformer is made before fitting, so the transformer instances originally passed to the pipeline cannot be inspected directly; use the pipeline's named_steps or steps attribute instead. Caching is useful when fitting is time-consuming.
  • verbose: bool, default=False: If True, the time elapsed while fitting each step is printed as the step completes.

Returns:

p: Pipeline: A Pipeline object.
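A quick illustration of the memory and verbose parameters (a sketch, using a temporary directory for the cache and tiny synthetic data as assumptions, not from the article):

```python
from tempfile import mkdtemp
from shutil import rmtree

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# cache fitted transformers in a temporary directory;
# verbose=True prints the elapsed time of each step during fit
cachedir = mkdtemp()
pipe = make_pipeline(StandardScaler(), LogisticRegression(),
                     memory=cachedir, verbose=True)

# tiny synthetic data just to exercise the pipeline
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
pipe.fit(X, y)

# inspect the fitted scaler through named_steps,
# not through the instance originally passed in
print(pipe.named_steps['standardscaler'].mean_)

rmtree(cachedir)  # clean up the cache directory
```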

Example: Classification using the make_pipeline() function

This example starts by importing the necessary packages and reading the 'diabetes.csv' file. X holds the independent features and y the dependent variable ('Outcome'). train_test_split() splits X and y into train and test sets; test_size=0.3 reserves 30% of the data for testing. make_pipeline() then builds a pipeline consisting of a StandardScaler followed by a LogisticRegression model: the scaler runs first, then the model. fit() fits the pipeline to the training data, predict() produces predictions on the test set, and accuracy_score() measures the accuracy of those predictions.

To read and download the dataset click here. 

Python3

# import packages
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import pandas as pd

# read the csv file
df = pd.read_csv('diabetes.csv')

# feature variables
X = df.drop('Outcome', axis=1)
y = df['Outcome']

# split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.3,
                                                    random_state=101)

# create a pipeline using the make_pipeline() function
pipe = make_pipeline(StandardScaler(),
                     LogisticRegression())

# fit the training data with the pipeline
pipe.fit(X_train, y_train)

# predict on the test set
y_pred = pipe.predict(X_test)

# calculate the accuracy score (avoid shadowing the
# accuracy_score function with the result variable)
accuracy = accuracy_score(y_test, y_pred)
print('accuracy score : ', accuracy)

Output:

accuracy score :  0.7878787878787878
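Because the auto-generated step names are predictable, they can also be used to tune any step's hyperparameters through the pipeline with the <step name>__<parameter> convention. A sketch with GridSearchCV, where the synthetic dataset and the parameter grid are illustrative assumptions, not part of the article's example:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# synthetic stand-in for the diabetes data
X, y = make_classification(n_samples=200, n_features=8, random_state=101)

pipe = make_pipeline(StandardScaler(), LogisticRegression())

# address the model's C parameter via the lowercase step name
param_grid = {'logisticregression__C': [0.1, 1.0, 10.0]}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)
print(grid.best_params_)
```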


Author: isitapol2002
