Skip to content
geeksforgeeks
  • Tutorials
    • Python
    • Java
    • Data Structures & Algorithms
    • ML & Data Science
    • Interview Corner
    • Programming Languages
    • Web Development
    • CS Subjects
    • DevOps And Linux
    • School Learning
    • Practice Coding Problems
  • Courses
    • DSA to Development
    • Get IBM Certification
    • Newly Launched!
      • Master Django Framework
      • Become AWS Certified
    • For Working Professionals
      • Interview 101: DSA & System Design
      • Data Science Training Program
      • JAVA Backend Development (Live)
      • DevOps Engineering (LIVE)
      • Data Structures & Algorithms in Python
    • For Students
      • Placement Preparation Course
      • Data Science (Live)
      • Data Structure & Algorithm-Self Paced (C++/JAVA)
      • Master Competitive Programming (Live)
      • Full Stack Development with React & Node JS (Live)
    • Full Stack Development
    • Data Science Program
    • All Courses
  • Open CV
  • scikit-image
  • pycairo
  • Pyglet
  • Python
  • Numpy
  • Pandas
  • Python Database
  • Data Analysis
  • ML Math
  • Machine Learning
  • NLP
  • Deep Learning
  • Deep Learning Interview Questions
  • ML Projects
  • ML Interview Questions
  • 100 Days of Machine Learning
Open In App
Next Article:
Transform a 2D image into a 3D space using OpenCV
Next article icon

Transform a 2D image into a 3D space using OpenCV

Last Updated : 28 Apr, 2025
Comments
Improve
Suggest changes
Like Article
Like
Report

Transforming a 2D image into a 3D environment requires depth estimation, which can be a difficult operation depending on the amount of precision and information required. OpenCV supports a variety of depth estimation approaches, including stereo vision and depth from focus/defocus.

In this article we'll see a basic technique utilizing stereovision:

Transform a 2D image into a 3D space using OpenCV

Transforming a 2D image into a 3D space using OpenCV refers to the process of converting a two-dimensional image into a three-dimensional spatial representation using the Open Source Computer Vision Library (OpenCV). This transformation involves inferring the depth information from the 2D image, typically through techniques such as stereo vision, depth estimation, or other computer vision algorithms, to create a 3D model with depth perception. This process enables various applications such as 3D reconstruction, depth sensing, and augmented reality.

Importance of transformations of a 2D image into a 3D space

Transforming 2D images into 3D space becomes crucial in various fields due to its numerous applications and benefits:

  • Depth Perception: We are able to detect depth by transforming 2D pictures into 3D space. This makes it possible to use augmented reality, object recognition, and scene understanding.
  • 3D Reconstruction: Converting 2D photos into 3D space makes it easier to recreate 3D scenes, which is crucial in industries like robotics, computer vision, and the preservation of cultural assets.
  • Stereo Vision: Stereo vision depends on converting 2D images into 3D space. It entails taking pictures from various angles and calculating depth from the difference between matching spots. It is employed in 3D modeling, autonomous navigation, and depth sensing, among other applications.
  • Medical Imaging: Improved visualization, diagnosis, and treatment planning are possible in medical imaging when 2D medical scans—such as CT or MRI scans—are converted into 3D space.
  • Virtual Reality and Simulation: In virtual reality, simulation, and gaming, realistic 3D worlds must be constructed from 2D photos or video. This requires translating 2D visuals into 3D space.

How you get a 3D image from a 2D?

  • In conventional photography, you can either utilize a mirror and attach a camera to it to create an immediate 3D effect, or you can take a shot, step to your right (or left), and then shoot another, ensuring that all components from the first photo are present in the second.
  • However, if you just move a 2D picture left by 10 pixels, nothing changes. This is because you are shifting the entire environment, and no 3D information is saved.
  • Instead, there must be a bigger shift distance between the foreground and backdrop. In other words, the farthest point distant from the lens remains motionless while the nearest point moves.

OpenCV

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library written in C++ with interfaces for Python, Java, and MATLAB. It provides a wide range of features for processing and evaluating pictures and movies.

Some of its capabilities are:

  • Image Processing: Filtering, edge detection, morphological operations, and color space conversions are just a few of the functions that OpenCV can do for image processing jobs.
  • Feature detection and description: OpenCV offers methods for keypoint detection and descriptor computation, which are crucial for applications like motion tracking, object identification, and picture stitching.
  • Object Identification and Recognition: OpenCV incorporates deep learning-based techniques, Haar cascades, and HOG algorithms for object detection and recognition.
  • Stereo vision and camera calibration: These are supported by OpenCV, which may be used to estimate parameters and correct distortion. Additionally, it makes stereo vision jobs easier, such 3D reconstruction from stereo pictures and disparity mapping.
  • Machine Learning Integration: Users may create and implement models for computer vision applications by integrating OpenCV with machine learning frameworks such as TensorFlow and PyTorch.
  • Real-time Processing: Suitable for applications like gesture detection, augmented reality, and surveillance, OpenCV is designed for real-time processing and may be used with cameras and video streams.

How is this related to using a 2D image to create a 3D image?

We need a method to move the pixels since an picture becomes three-dimensional when the foreground moves more than the background.

Fortunately, a technique known as depth detection exists that generates what is known as a depth map.

Now remember that this is only an estimate. Furthermore, it won't reach every nook and corner. All that depth detection does is use cues like shadows, haze, and depth of focus to determine if an object is in the forefront or background.

We can instruct a program on how far to move pixels now that we have a depth map.

Approach:

  • Obtain Stereo Images: Take or obtain two pictures of the same scene taken from two separate perspectives.
  • Stereo Calibration: For precise depth estimation, ascertain each camera's intrinsic and extrinsic properties.
  • Rectification: To make matching easier, make sure corresponding spots in stereo pictures line up on the same scanlines.
  • Stereo matching: Use methods such as block matching or SGBM to find correspondences between corrected stereo pictures.
  • Disparity: Calculate the disparity by dividing the horizontal locations of corresponding points by their pixels.
  • Depth Estimation: Using camera settings and stereo geometry, convert disparity map to depth map.
  • 3D Reconstruction: Using the depth map as a guide, reconstruct the scene's three dimensions.

Implementations of a 2D image into a 3D space Transformations using OpenCV

Input image:

  • cube 1
  • cube 2

Code steps:

  • First we have imported the required libraries.
    • PIL (Python Imaging Library), which is used to work with images
    • numpy is used for numerical computations
  • Then we have defined a function named `shift_image` that takes three parameters: `img`, `depth_img`, and `shift_amount`. `img` is the base image that will be shifted, `depth_img` is the depth map providing depth information for the shift, and `shift_amount`specifies the maximum amount of horizontal shift allowed.
  • After that we have converted the input images (`img` and `depth_img`) into arrays using NumPy for further processing. The base image (img) is converted to RGBA format to ensure it has an alpha channel, while the depth image (depth_img) is converted to grayscale (`L`) to ensure it contains only one channel representing depth values.
  • Then calculates the shift amounts based on the depth values obtained from the depth image. It scales the depth values to the range [0, 1] and then multiplies it by the `shift_amount`. The result is then converted to integers using `astype(int)`.
  • Then initializes an array (`shifted_data`) with the same shape as the base image (`data`) filled with zeros. This array will store the shifted image data.
  • The nested loops iterate over the elements of the `deltas` array to perform the shift operation. For each pixel position (x, y) in the base image, the corresponding shift amount `dx` is applied horizontally. If the resulting position (x + dx, y) is within the bounds of the image, the pixel value at (x, y) in the base image is copied to (x + dx, y) in the shifted image.
  • Image.fromarray converts the shifted image data (`shifted_data`) back to a PIL Image object. The data is converted to np.uint8 type before creating the image.
  • Then we read the images from our local files and applied to the newly created functions.
  • shifted_img.show will plot the newly transformed image
Python
from PIL import Image import numpy as np  def shift_image(img, depth_img, shift_amount=10):     # Ensure base image has alpha     img = img.convert("RGBA")     data = np.array(img)      # Ensure depth image is grayscale (for single value)     depth_img = depth_img.convert("L")     depth_data = np.array(depth_img)     deltas = ((depth_data / 255.0) * float(shift_amount)).astype(int)      # This creates the transparent resulting image.     # For now, we're dealing with pixel data.     shifted_data = np.zeros_like(data)      height, width, _ = data.shape      for y, row in enumerate(deltas):         for x, dx in enumerate(row):             if x + dx < width and x + dx >= 0:                 shifted_data[y, x + dx] = data[y, x]      # Convert the pixel data to an image.     shifted_image = Image.fromarray(shifted_data.astype(np.uint8))      return shifted_image  img = Image.open("cube1.jpg") depth_img = Image.open("cube2.jpg") shifted_img = shift_image(img, depth_img, shift_amount=10) shifted_img.show() 

Output:


transformed image-Geeksforgeeks
transformed image



Limitations of OpenCV to 2D image into a 3D space image tranformations

The process of simulating a 3D effect by shifting pixels based on depth information has several limitations also:

  • Simplified Depth Representation: Since the depth map is a grayscale picture, intricate depth fluctuations might not be properly captured.
  • It Only moves pixels horizontally, assuming that all depth fluctuations are horizontal. This assumption might not hold true in scenarios found in the actual world.
  • Uniform Shift Amount: Unrealistic effects may result from the linear connection between depth and pixel shift, which may not adequately reflect real-world depth perception.
  • Limited Depth Cues: It does not use other depth cues such as perspective distortion or occlusion, instead depending solely on the parallax effect.
  • Boundary Artifacts: Boundary artifacts may arise from the disregard of pixels that have been pushed outside the limits of the picture.
  • Dependency on Parameter: The shift amount parameter that is selected has a significant impact on the 3D effect's quality.
  • Limited Application Scope: Although it works well for basic 3D effects, it might not be accurate enough for complex 3D activities like medical imaging or augmented reality.

Conclusion

In summary, using OpenCV in Python to convert a 2D picture into a 3D space entails a number of steps, including the capture of stereo images, calibration, rectification, matching, disparity computation, depth estimate, and, in the end, 3D scene reconstruction. The extraction of spatial information from 2D pictures is made possible by this all-inclusive methodology, which makes depth sensing, augmented reality, and computer vision applications easier to use.


Next Article
Transform a 2D image into a 3D space using OpenCV

D

darken7egs
Improve
Article Tags :
  • Computer Vision
  • AI-ML-DS
  • Python-OpenCV
  • AI-ML-DS With Python

Similar Reads

    Image Transformations using OpenCV in Python
    In this tutorial, we are going to learn Image Transformation using the OpenCV module in Python. What is Image Transformation? Image Transformation involves the transformation of image data in order to retrieve information from the image or preprocess the image for further usage. In this tutorial we
    5 min read
    Negative transformation of an image using Python and OpenCV
    Image is also known as a set of pixels. When we store an image in computers or digitally, it’s corresponding pixel values are stored. So, when we read an image to a variable using OpenCV in Python, the variable stores the pixel values of the image. When we try to negatively transform an image, the b
    4 min read
    Reading an image in OpenCV using Python
    Prerequisite: Basics of OpenCVIn this article, we'll try to open an image by using OpenCV (Open Source Computer Vision) library.  Following types of files are supported in OpenCV library:Windows bitmaps - *.bmp, *.dibJPEG files - *.jpeg, *.jpgPortable Network Graphics - *.png WebP - *.webp Sun raste
    6 min read
    Dividing Images Into Equal Parts Using OpenCV In Python
    In this article, we are going to see how to divide images into equal parts using OpenCV in Python. We are familiar with python lists and list slicing in one dimension. But here in the case of images, we will be using 2D list comprehension since images are a 2D matrix of pixel intensity.  Image Used:
    3 min read
    Image Translation using OpenCV | Python
    Translation refers to the rectilinear shift of an object i.e. an image from one location to another. If we know the amount of shift in horizontal and the vertical direction, say (tx, ty) then we can make a transformation matrix e.g. \begin{bmatrix} 1 & 0 & tx \\ 0 & 1 & ty \end{bmatr
    2 min read
geeksforgeeks-footer-logo
Corporate & Communications Address:
A-143, 7th Floor, Sovereign Corporate Tower, Sector- 136, Noida, Uttar Pradesh (201305)
Registered Address:
K 061, Tower K, Gulshan Vivante Apartment, Sector 137, Noida, Gautam Buddh Nagar, Uttar Pradesh, 201305
GFG App on Play Store GFG App on App Store
Advertise with us
  • Company
  • About Us
  • Legal
  • Privacy Policy
  • In Media
  • Contact Us
  • Advertise with us
  • GFG Corporate Solution
  • Placement Training Program
  • Languages
  • Python
  • Java
  • C++
  • PHP
  • GoLang
  • SQL
  • R Language
  • Android Tutorial
  • Tutorials Archive
  • DSA
  • Data Structures
  • Algorithms
  • DSA for Beginners
  • Basic DSA Problems
  • DSA Roadmap
  • Top 100 DSA Interview Problems
  • DSA Roadmap by Sandeep Jain
  • All Cheat Sheets
  • Data Science & ML
  • Data Science With Python
  • Data Science For Beginner
  • Machine Learning
  • ML Maths
  • Data Visualisation
  • Pandas
  • NumPy
  • NLP
  • Deep Learning
  • Web Technologies
  • HTML
  • CSS
  • JavaScript
  • TypeScript
  • ReactJS
  • NextJS
  • Bootstrap
  • Web Design
  • Python Tutorial
  • Python Programming Examples
  • Python Projects
  • Python Tkinter
  • Python Web Scraping
  • OpenCV Tutorial
  • Python Interview Question
  • Django
  • Computer Science
  • Operating Systems
  • Computer Network
  • Database Management System
  • Software Engineering
  • Digital Logic Design
  • Engineering Maths
  • Software Development
  • Software Testing
  • DevOps
  • Git
  • Linux
  • AWS
  • Docker
  • Kubernetes
  • Azure
  • GCP
  • DevOps Roadmap
  • System Design
  • High Level Design
  • Low Level Design
  • UML Diagrams
  • Interview Guide
  • Design Patterns
  • OOAD
  • System Design Bootcamp
  • Interview Questions
  • Inteview Preparation
  • Competitive Programming
  • Top DS or Algo for CP
  • Company-Wise Recruitment Process
  • Company-Wise Preparation
  • Aptitude Preparation
  • Puzzles
  • School Subjects
  • Mathematics
  • Physics
  • Chemistry
  • Biology
  • Social Science
  • English Grammar
  • Commerce
  • World GK
  • GeeksforGeeks Videos
  • DSA
  • Python
  • Java
  • C++
  • Web Development
  • Data Science
  • CS Subjects
@GeeksforGeeks, Sanchhaya Education Private Limited, All rights reserved
We use cookies to ensure you have the best browsing experience on our website. By using our site, you acknowledge that you have read and understood our Cookie Policy & Privacy Policy
Lightbox
Improvement
Suggest Changes
Help us improve. Share your suggestions to enhance the article. Contribute your expertise and make a difference in the GeeksforGeeks portal.
geeksforgeeks-suggest-icon
Create Improvement
Enhance the article with your expertise. Contribute to the GeeksforGeeks community and help create better learning resources for all.
geeksforgeeks-improvement-icon
Suggest Changes
min 4 words, max Words Limit:1000

Thank You!

Your suggestions are valuable to us.

What kind of Experience do you want to share?

Interview Experiences
Admission Experiences
Career Journeys
Work Experiences
Campus Experiences
Competitive Exam Experiences