
Map Reduce and its Phases with numerical example.

Last Updated : 24 Jun, 2025

MapReduce is a framework for writing applications that reliably process huge amounts of data in parallel on large clusters of commodity hardware.

Phases of MapReduce

The MapReduce model has three major phases and one optional phase (Combining).

  1. Mapping
  2. Shuffling and Sorting
  3. Reducing
  4. Combining

1) Mapping

Mapping is the first phase of MapReduce programming. The Mapping phase accepts key-value pairs (k, v) as input, where the key represents the address of each record and the value represents the entire record content. The output of the Mapping phase is also in key-value format (k’, v’).
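The mapping step can be sketched in plain Python. This is only an illustration, not Hadoop's API: the function name `map_record`, the record positions used as input keys, and the tab-separated record layout are assumptions made for the example. Each input pair (k, v) is a record position and its text, and each output pair (k’, v’) is a user ID and a movie ID.

```python
def map_record(position, line):
    """Map one (k, v) input pair to a list of (k', v') output pairs.

    position: where the record sits in the input (the input key k)
    line:     the full record text (the input value v)
    """
    user_id, movie_id, rating, timestamp = line.split("\t")
    # Emit the user ID as the new key and the movie ID as the new value.
    return [(user_id, movie_id)]


# Hypothetical input: record positions paired with tab-separated records.
records = [
    (0, "196\t242\t3\t881250949"),
    (1, "186\t302\t3\t891717742"),
]

mapped = [pair for position, line in records for pair in map_record(position, line)]
print(mapped)  # [('196', '242'), ('186', '302')]
```

The mapper ignores the input key here, which is common when the key is only a record position.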

2) Shuffling and Sorting 

The outputs of the various mappers (k’, v’) then go into the Shuffling and Sorting phase. The pairs are sorted by key, and all values that share the same key are grouped together. The output of the Shuffling and Sorting phase is again a set of key-value pairs, each made of a key and an array of its values (k’, v’[]).
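A minimal sketch of this grouping in plain Python follows. The function name `shuffle_and_sort` and the sample pairs are hypothetical, and a real framework performs this step across machines rather than in one process.

```python
from collections import defaultdict


def shuffle_and_sort(mapped_pairs):
    """Group values by key and return the groups sorted by key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)  # append the value to its key's list
    return sorted(groups.items())


# Hypothetical mapper output: a flat list of (key, value) pairs.
mapped_pairs = [("b", 1), ("a", 1), ("b", 1), ("c", 1), ("a", 1)]

grouped = shuffle_and_sort(mapped_pairs)
print(grouped)  # [('a', [1, 1]), ('b', [1, 1]), ('c', [1])]
```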

3) Reducing

The output of the Shuffling and Sorting phase (k’, v’[]) is the input of the Reducing phase. In this phase the reducer function's logic is executed, and all the values collected against each key are processed together. The reducer consolidates the outputs of the various mappers and computes the final output.
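As an illustration, a reducer that sums the values of each key could look like the sketch below. The function name `reduce_group` and the input data are hypothetical, and summing is only one possible reducer logic.

```python
def reduce_group(key, values):
    """Reduce one (key, list of values) group to a single (key, result) pair."""
    return key, sum(values)


# Hypothetical shuffle-and-sort output: each key with its list of values.
grouped = [("a", [1, 1]), ("b", [1, 1, 1]), ("c", [1])]

results = [reduce_group(key, values) for key, values in grouped]
print(results)  # [('a', 2), ('b', 3), ('c', 1)]
```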

4) Combining 

Combining is an optional phase of MapReduce. A combiner runs on the output of each mapper and aggregates values that share a key locally, before they are sent to the Shuffling and Sorting phase. Because less intermediate data has to be transferred and sorted, the Shuffling and Sorting phase runs faster.
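The effect of a combiner can be sketched as follows. The function name `combine` and the two mapper outputs are hypothetical; the sketch assumes the reduce logic is a sum, so partial sums can safely be computed on each mapper first.

```python
from collections import Counter


def combine(mapper_output):
    """Locally sum the values per key on one mapper's output."""
    local = Counter()
    for key, value in mapper_output:
        local[key] += value
    return sorted(local.items())


# Hypothetical output of two mappers, before any data is shuffled.
mapper_1 = [("a", 1), ("b", 1), ("a", 1), ("a", 1)]
mapper_2 = [("b", 1), ("b", 1), ("c", 1)]

combined_1 = combine(mapper_1)
combined_2 = combine(mapper_2)
print(combined_1)  # [('a', 3), ('b', 1)]
print(combined_2)  # [('b', 2), ('c', 1)]

# Number of pairs that would be shuffled without and with the combiner.
pairs_before = len(mapper_1) + len(mapper_2)
pairs_after = len(combined_1) + len(combined_2)
print(pairs_before, pairs_after)  # 7 4
```

A combiner like this is only safe when the operation is associative and commutative (for example a sum or a maximum), because the reducer still has to merge the partial results from all mappers.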

Figure: flow chart of the MapReduce phases

Numerical Example

We will be using MovieLens data.

USER_ID   MOVIE_ID   RATING   TIMESTAMP
196       242        3        881250949
186       302        3        891717742
196       377        1        878887116
244       51         2        880606923
166       346        1        886397596
186       474        4        884182806
186       265        2        881171488

Solution

Step 1: First we map the values. This happens in the first phase of the MapReduce model. Each record is mapped to a USER_ID:MOVIE_ID pair.

196:242   ;  186:302   ;  196:377   ;  244:51   ;  166:346   ;  186:474   ;  186:265

Step 2: After mapping, we shuffle and sort the values, grouping the movie IDs by user ID.

166:346   ;  186:302,474,265   ;  196:242,377   ;  244:51

Step 3: After completing Step 1 and Step 2, we reduce each key's values.

Now, put all values together

Figure: Solution
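The three steps can be reproduced on the seven sample records with a short plain-Python sketch. It uses the movie IDs exactly as listed in the table. The reduce logic is an assumption: the text does not state what the reducer computes, so the sketch simply counts how many movies each user rated.

```python
from collections import defaultdict

# The seven sample records: (USER_ID, MOVIE_ID, RATING, TIMESTAMP).
rows = [
    (196, 242, 3, 881250949),
    (186, 302, 3, 891717742),
    (196, 377, 1, 878887116),
    (244, 51, 2, 880606923),
    (166, 346, 1, 886397596),
    (186, 474, 4, 884182806),
    (186, 265, 2, 881171488),
]

# Step 1 (map): emit a USER_ID:MOVIE_ID pair for every record.
mapped = [(user_id, movie_id) for user_id, movie_id, _, _ in rows]

# Step 2 (shuffle and sort): group movie IDs by user ID, sorted by key.
groups = defaultdict(list)
for user_id, movie_id in mapped:
    groups[user_id].append(movie_id)
shuffled = sorted(groups.items())

# Step 3 (reduce): ASSUMPTION - count the movies each user rated.
reduced = [(user_id, len(movie_ids)) for user_id, movie_ids in shuffled]

print(shuffled)
# [(166, [346]), (186, [302, 474, 265]), (196, [242, 377]), (244, [51])]
print(reduced)
# [(166, 1), (186, 3), (196, 2), (244, 1)]
```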

Common Use Cases of MapReduce

  • Counting word frequency
  • Log analysis
  • Indexing web pages
  • Processing large datasets for ETL (Extract, Transform, Load)
  • Recommendation systems and data mining

Python Code For Mapper and Reducer Together

Python
from mrjob.job import MRJob
from mrjob.step import MRStep


class RatingsBreak(MRJob):
    def steps(self):
        return [
            MRStep(mapper=self.mapper_get_ratings,
                   reducer=self.reducer_count_ratings)
        ]

    # MAPPER CODE: emit (rating, 1) for each tab-separated input line
    def mapper_get_ratings(self, _, line):
        (user_id, movie_id, rating, timestamp) = line.split('\t')
        yield rating, 1

    # REDUCER CODE: sum the 1s to count how often each rating occurs
    def reducer_count_ratings(self, key, values):
        yield key, sum(values)


if __name__ == '__main__':
    RatingsBreak.run()

Advantages of MapReduce

  • Simple and easy abstraction for large-scale data processing
  • Efficient for batch processing of massive datasets
  • Fault-tolerant and scalable
  • Integrates well with Hadoop Distributed File System (HDFS)

Limitations of MapReduce

  • Not ideal for real-time processing
  • Complex data workflows can be hard to express
  • Debugging and testing are more challenging
  • High latency due to intermediate disk I/O (especially in the shuffle phase)




dikshantmalidev
Article Tags :
  • Data Science
  • Hadoop
  • MapReduce
