
Apply a 2D Max Pooling in PyTorch

Last Updated : 28 Apr, 2025

Pooling is a technique used in CNN models to down-sample the feature maps coming from the previous layer and produce new, summarised feature maps. In computer vision, it reduces the spatial dimensions of an image or feature map while retaining important features. The goal of pooling is to reduce the computational cost of the model and make it less sensitive to small translations in the input image.

Types of Pooling

There are two main types of pooling used in deep learning: Max Pooling and Average Pooling.

Max Pooling: Max Pooling selects the maximum value inside each pooling window as the window slides over the feature map, and passes this maximum value to the next layer. This retains the strongest activation in each region while reducing the size of the representation.

Average Pooling: Average Pooling computes the average of the values inside each pooling window and passes this average to the next layer. This retains a smoother, more general summary of the feature information, again at a reduced spatial resolution.
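The difference between the two types is easiest to see on a tiny input. The following is a minimal sketch, assuming a hand-made 4x4 tensor and a 2x2 window with stride 2; the tensor values are illustrative only.

```python
import torch
import torch.nn.functional as F

# A small single-channel "image", shaped (batch, channels, height, width)
x = torch.tensor([[1., 1., 2., 4.],
                  [5., 6., 7., 8.],
                  [3., 2., 1., 0.],
                  [1., 2., 3., 4.]]).reshape(1, 1, 4, 4)

# Same 2x2 window and stride for both, only the aggregation differs
max_out = F.max_pool2d(x, kernel_size=2, stride=2)
avg_out = F.avg_pool2d(x, kernel_size=2, stride=2)

print(max_out)  # per-window maxima:  [[6, 8], [3, 4]]
print(avg_out)  # per-window averages: [[3.25, 5.25], [2, 2]]
```

Both outputs have shape (1, 1, 2, 2); max pooling keeps the largest value in each window, while average pooling blends all four values.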

Pooling is usually applied after a convolution operation. By shrinking the feature maps it reduces the computation needed in later layers, and it can help reduce overfitting and improve the generalization performance of the model.

2D Max Pooling

As the name suggests, 2D max pooling selects the maximum value in each pooling region and passes it on to the next layer. This retains the strongest response in each region while discarding the weaker ones. Max pooling is therefore well suited to detecting whether a feature is present in a region of an image, regardless of its exact position within that region.
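To illustrate the idea of detecting presence rather than exact position, here is a small sketch. It assumes two hypothetical 4x4 inputs that each contain a single active pixel, placed at different positions inside the same 2x2 pooling window.

```python
import torch
import torch.nn.functional as F

# Two inputs with one "feature" pixel each, shifted by one pixel diagonally
a = torch.zeros(1, 1, 4, 4)
b = torch.zeros(1, 1, 4, 4)
a[0, 0, 0, 0] = 1.0   # feature at row 0, col 0
b[0, 0, 1, 1] = 1.0   # feature at row 1, col 1 (same 2x2 window)

# stride defaults to kernel_size, so the windows do not overlap
pooled_a = F.max_pool2d(a, kernel_size=2)
pooled_b = F.max_pool2d(b, kernel_size=2)

print(torch.equal(pooled_a, pooled_b))  # True
```

The invariance only holds for shifts that stay inside one window; a shift that crosses a window boundary changes the pooled output.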

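Syntax : 

torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)

where,

kernel_size : The size of the pooling window, given as an int or a tuple (k_h, k_w).
stride : The number of pixels the window shifts over the input feature map at each step (default is kernel_size).
padding : Implicit padding added on both sides of each spatial dimension (default is 0). For max pooling the padded positions are treated as negative infinity, so they never become the maximum.
dilation : The spacing between the elements of the pooling window (default is 1, meaning adjacent elements).
return_indices : True or False (default is False). When True, the indices of the maximum values are returned along with the output.
ceil_mode : True or False (default is False). When True, ceil is used instead of floor to compute the output shape.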
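The effect of the less obvious parameters can be seen on a small input. This is a minimal sketch, assuming an illustrative 5x5 tensor filled with the values 0 to 24; the variable names are placeholders.

```python
import torch
import torch.nn as nn

# 5x5 input whose values equal their flattened position (0..24)
x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)

# ceil_mode: floor drops the incomplete last window, ceil keeps it
floor_out = nn.MaxPool2d(kernel_size=2, stride=2)(x)
ceil_out = nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)(x)
print(floor_out.shape)  # torch.Size([1, 1, 2, 2])
print(ceil_out.shape)   # torch.Size([1, 1, 3, 3])

# dilation: a 2x2 window with dilation 2 spans a 3x3 area of the input
dilated_out = nn.MaxPool2d(kernel_size=2, stride=1, dilation=2)(x)
print(dilated_out.shape)  # torch.Size([1, 1, 3, 3])

# return_indices: also returns the flattened position of each maximum
values, indices = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)(x)
print(values.flatten().tolist())   # [6.0, 8.0, 16.0, 18.0]
print(indices.flatten().tolist())  # [6, 8, 16, 18]
```

Because each input value equals its own flattened position, the returned indices match the returned values here, which makes the meaning of the indices easy to check.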

Let the input tensor have shape (n, c, h, w) and let the kernel size be (k_h, k_w). With no padding and a dilation of 1, each output element is computed as:

\begin{aligned}out(n_i, c_j, h, w) &= \max_{m=0, \ldots, k_h-1} \;\; \max_{n=0, \ldots, k_w-1} \text{input}(n_i, c_j, \text{stride[0]} \times h + m,\; \text{stride[1]} \times w + n)\end{aligned}
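The formula can be turned into code almost literally. The sketch below is a hypothetical reference implementation (the name `naive_max_pool2d` is not part of PyTorch) that assumes no padding, dilation 1 and floor mode, and compares its result with nn.MaxPool2d on a random input.

```python
import torch
import torch.nn as nn

def naive_max_pool2d(x, kernel_size, stride):
    """Hypothetical helper: max pooling with no padding and dilation 1."""
    k_h, k_w = kernel_size
    s_h, s_w = stride
    n, c, h_in, w_in = x.shape
    h_out = (h_in - k_h) // s_h + 1
    w_out = (w_in - k_w) // s_w + 1
    out = torch.empty(n, c, h_out, w_out, dtype=x.dtype)
    for h in range(h_out):
        for w in range(w_out):
            # Window starting at (stride[0] * h, stride[1] * w)
            window = x[:, :, s_h * h:s_h * h + k_h, s_w * w:s_w * w + k_w]
            # Maximum over the two window axes (the double max in the formula)
            out[:, :, h, w] = window.amax(dim=(2, 3))
    return out

torch.manual_seed(0)
x = torch.randn(2, 3, 7, 9)
expected = nn.MaxPool2d(kernel_size=(3, 2), stride=(2, 1))(x)
result = naive_max_pool2d(x, kernel_size=(3, 2), stride=(2, 1))
print(result.shape)                    # torch.Size([2, 3, 3, 8])
print(torch.equal(result, expected))   # True
```

The loop version is far slower than the built-in layer and is meant only to make the indexing in the formula concrete.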

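The output shape is then given by the following expressions, which assume a dilation of 1. The square brackets denote rounding down (or rounding up when ceil_mode=True):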
h_{out} = \left[\frac{h_{in} - \text{kernel\_size[0]}+ 2 * \text{padding[0]}}{\text{stride[0]}} + 1 \right]

w_{out} = \left[\frac{w_{in} - \text{kernel\_size[1]}+ 2 * \text{padding[1]}}{\text{stride[1]}} + 1 \right]
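The shape formula is simple enough to check with a few lines of plain Python. This is a sketch only: `pool_output_size` is a hypothetical helper, it covers floor mode only, and it includes the dilation term, which reduces to the formula above when dilation is 1.

```python
def pool_output_size(size_in, kernel_size, stride, padding=0, dilation=1):
    """Hypothetical helper: output length along one spatial axis (floor mode)."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (size_in + 2 * padding - effective_kernel) // stride + 1

# Sizes that appear in this article's examples
print(pool_output_size(4, kernel_size=2, stride=2))               # 2
print(pool_output_size(561, kernel_size=5, stride=3, padding=1))  # 187
print(pool_output_size(799, kernel_size=3, stride=2, padding=1))  # 400
```

The helper is applied once per axis, so a 2D output shape needs one call for the height and one for the width.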

In this example, the input image is 4x4 and the max-pooling operation uses a 2x2 pooling kernel with a stride of 2 in both directions. The stride defines how many pixels the window shifts over the input image at each step.

Figure: 2D max pooling with a 2x2 kernel and stride 2

Here's an example of how Max-pooling can be implemented in PyTorch:

Python
import torch
import torch.nn as nn

# Define the input tensor
input_tensor = torch.tensor(
    [
        [1, 1, 2, 4],
        [5, 6, 7, 8],
        [3, 2, 1, 0],
        [1, 2, 3, 4]
    ], dtype=torch.float32)

# Reshape the input tensor to (batch, channels, height, width)
input_tensor = input_tensor.reshape(1, 1, 4, 4)

# Initialize the max-pooling layer with a 2x2 kernel and stride 2
pool = nn.MaxPool2d(kernel_size=2, stride=2)

# Apply the max-pooling layer to the input tensor
output = pool(input_tensor)

# Print the output tensor
print(output)

Output :

tensor([[[[6., 8.],
          [3., 4.]]]])

Mathematically, the Output shape can be calculated as: 

\begin{aligned} h_{out} & =\left[\frac{h_{in}  - \text{kernel\_size[0]} + 2 * \text{padding[0]} }{\text{stride[0]}} + 1\right] \\&=\left [\frac{4 -2 + 2 * 0}{2} + 1\right] \\ &= \left[\frac{4 -2 + 0}{2} + 1\right] \\ &= \left[\frac{2}{2} + 1\right] \\ &= \left [1+1 \right] \\ &= 2 \end{aligned}

\begin{aligned} w_{out} & =\left[\frac{w_{in}  - \text{kernel\_size[1]} + 2 * \text{padding[1]} }{\text{stride[1]}} + 1\right] \\&=\left [\frac{4 -2 + 2 * 0}{2} + 1\right] \\ &= \left[\frac{4 -2 + 0}{2} + 1\right] \\ &= \left[\frac{2}{2} + 1\right] \\ &= \left [1+1 \right] \\ &= 2 \end{aligned}

Apply 2d max pooling on a real image

Python3
import torch
from PIL import Image
import torchvision.transforms as T

# Read the image file
image = Image.open('GFG.jpg')

# Convert the input image to a torch tensor of shape (C, H, W)
input_tensor = T.ToTensor()(image)

# Add a batch dimension to make the tensor 4D: (1, C, H, W)
input_tensor = input_tensor.unsqueeze(0)
print('Input Tensor :', input_tensor.shape)

# Define 2D max pooling with a rectangular window:
# kernel_size=(5, 3), stride=(3, 2) and padding=(1, 1)
pooling = torch.nn.MaxPool2d(kernel_size=(5, 3),
                             stride=(3, 2),
                             padding=(1, 1),
                             dilation=1)
output = pooling(input_tensor)
print('Output Tensor :', output.shape)

# Remove the batch dimension
out_img = output.squeeze(0)

# Convert the tensor back to a PIL image
out_img = T.ToPILImage()(out_img)

# In a notebook, this bare expression displays the image
out_img

Output:

Input Tensor : torch.Size([1, 3, 561, 799])
Output Tensor : torch.Size([1, 3, 187, 400])

Figure: Output image after 2D max pooling

The output shape can be calculated as:

\begin{aligned} h_{out} & =\left[\frac{h_{in}  -\text{kernel\_size[0]}+ 2 * \text{padding[0]}}{\text{stride[0]}} + 1\right] \\&=\left [\frac{561 - 5 + 2 * 1 }{3} + 1\right] \\ &= \left[\frac{561 - 5+2}{3} + 1\right] \\ &= \left[\frac{561 - 3}{3} + 1\right] \\ &= \left[\frac{558}{3} + 1\right] \\ &= \left [186+1 \right] \\ &= 187 \end{aligned}

\begin{aligned} w_{out} & =\left[\frac{w_{in} -\text{kernel\_size[1]}+ 2 * \text{padding[1]}}{\text{stride[1]}} + 1\right] \\&=\left [\frac{799 - 3 + 2 * 1 }{2} + 1\right] \\ &= \left[\frac{799 -3+2}{2} + 1\right] \\ &= \left[\frac{799 - 1}{2} + 1\right] \\ &= \left[\frac{798}{2} + 1\right] \\ &= \left [399+1 \right] \\ &= 400 \end{aligned}
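The arithmetic above can also be checked without an image file. The following sketch assumes a random tensor with the same shape as the image tensor, (1, 3, 561, 799), as a stand-in for the real image; only the shapes matter here.

```python
import torch
import torch.nn as nn

# Random stand-in for the image tensor: (batch, channels, height, width)
dummy = torch.rand(1, 3, 561, 799)

# Same pooling configuration as in the image example
pooling = nn.MaxPool2d(kernel_size=(5, 3), stride=(3, 2), padding=(1, 1))
pooled = pooling(dummy)

print(pooled.shape)  # torch.Size([1, 3, 187, 400])
```

The batch and channel dimensions are unchanged, because pooling acts on each channel independently.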

1D, 2D, 3D pooling

In PyTorch, the terms "1D," "2D," and "3D" pooling refer to the number of spatial dimensions in the input that are being reduced by the pooling operation.

1D Pooling is used to reduce the resolution of 1D signals, such as time series or audio signals. In 1D pooling (for example torch.nn.MaxPool1d), a window slides along the time axis and the values inside each window are aggregated into a single output value. The windows are non-overlapping when the stride equals the kernel size, which is the default.

2D Pooling is used to reduce the spatial resolution of 2D images or feature maps. In 2D pooling (for example torch.nn.MaxPool2d), the window slides along both the row and column axes, and the values inside each window are aggregated into a single output value.

3D Pooling is used to reduce the resolution of 3D signals, such as video sequences or volumetric data. In 3D pooling (for example torch.nn.MaxPool3d), the window slides along all three dimensions (depth, height, and width), and the values inside each window are aggregated into a single output value.
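The three variants differ mainly in the input shape they expect. Here is a minimal sketch, assuming random tensors with illustrative sizes and a kernel size of 2 in every pooled dimension.

```python
import torch
import torch.nn as nn

signal = torch.rand(1, 2, 16)        # (N, C, L): e.g. a 2-channel audio clip
image = torch.rand(1, 3, 8, 8)       # (N, C, H, W): e.g. an RGB image
volume = torch.rand(1, 1, 4, 8, 8)   # (N, C, D, H, W): e.g. a short video

# stride defaults to kernel_size, so every pooled dimension is halved
out_1d = nn.MaxPool1d(kernel_size=2)(signal)
out_2d = nn.MaxPool2d(kernel_size=2)(image)
out_3d = nn.MaxPool3d(kernel_size=2)(volume)

print(out_1d.shape)  # torch.Size([1, 2, 8])
print(out_2d.shape)  # torch.Size([1, 3, 4, 4])
print(out_3d.shape)  # torch.Size([1, 1, 2, 4, 4])
```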


siddyamgond
Article Tags :
  • Computer Vision
  • AI-ML-DS
  • Python-PyTorch
