
Parallelism in Query in DBMS

Last Updated : 13 Apr, 2022

Parallelism in a query allows a DBMS to execute queries in parallel by decomposing them into parts that run concurrently. This can be achieved with a shared-nothing architecture. Parallelism also speeds up query execution as more resources, such as processors and disks, are provided. Parallelism in a query can be achieved by the following methods:

  1. I/O parallelism
  2. Intra-query parallelism
  3. Inter-query parallelism
  4. Intra-operation parallelism
  5. Inter-operation parallelism

1. I/O parallelism :
I/O parallelism is a form of parallelism in which relations are partitioned across multiple disks in order to reduce the time needed to retrieve them from disk. The input data is partitioned, each partition is processed in parallel, and the results are merged once all partitions have been processed. It is also known as data partitioning. Partitioning speeds up sequential scans of an entire table: if the table is spread over n disks, the time taken to scan the relation is approximately 1/n of the time required on a single-disk system. Hash partitioning, described below, has the added advantage that it tends to distribute data evenly across the disks and is well suited to point queries on the partitioning attribute. There are four types of partitioning in I/O parallelism:

  • Hash partitioning - 
    A hash function is a fast mathematical function. Each row of the original relation is hashed on the partitioning attributes, and the hash value determines the disk on which the row is stored. For example, assume there are 4 disks, disk1, disk2, disk3, and disk4, across which the data is to be partitioned. If the hash function returns 3 for a row, that row is placed on disk3.
     
  • Range partitioning - 
    Range partitioning assigns a contiguous range of attribute values to each disk. For example, with 3 disks numbered 0, 1, and 2, we may assign tuples with a value less than 5 to disk0, values from 5 to 40 to disk1, and values greater than 40 to disk2. Its advantage is that tuples whose attribute values fall within a given range are placed on the same disk. See Figure 1 (range partitioning).

  • Round-robin partitioning - 
    In round-robin partitioning, the relation is scanned in any order and the ith tuple is sent to disk number (i % n), where n is the number of disks. The disks therefore take turns receiving new rows. This technique ensures an even distribution of tuples across the disks and is ideally suited to applications that read the entire relation sequentially for each query.
     
  • Schema partitioning - 
    In schema partitioning, different tables within a database are placed on different disks. See figure 2 below:
     
Figure 2: Schema partitioning
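
The hash, range, and round-robin schemes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the `distribute` helper are hypothetical, disks are numbered from 0, keys are small non-negative integers (so Python's built-in `hash` is deterministic), and the range boundaries mirror the example of values below 5, from 5 to 40, and above 40.

```python
from bisect import bisect_right


def hash_partition(key, n_disks):
    """Pick a disk by hashing the partitioning attribute (0-based disk index)."""
    return hash(key) % n_disks


def range_partition(key, boundaries):
    """Pick a disk by range; boundaries[i] is the first integer key of disk i + 1."""
    # With boundaries [5, 41]: key < 5 -> disk 0, 5..40 -> disk 1, > 40 -> disk 2.
    return bisect_right(boundaries, key)


def round_robin_partition(i, n_disks):
    """Send the i-th tuple to disk i % n_disks."""
    return i % n_disks


def distribute(rows, n_disks, choose_disk):
    """Place each row on the disk chosen by choose_disk(position, row)."""
    disks = [[] for _ in range(n_disks)]
    for i, row in enumerate(rows):
        disks[choose_disk(i, row)].append(row)
    return disks


rows = [3, 12, 40, 41, 7, 55]  # hypothetical partitioning-attribute values
by_hash = distribute(rows, 3, lambda i, row: hash_partition(row, 3))
by_range = distribute(rows, 3, lambda i, row: range_partition(row, [5, 41]))
by_round_robin = distribute(rows, 3, lambda i, row: round_robin_partition(i, 3))
```

Note that only round-robin ignores the row's value entirely, which is why it balances the disks evenly but cannot direct a point or range query to a single disk.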

2. Intra-query parallelism : 
Intra-query parallelism refers to the execution of a single query in parallel on different CPUs, using a shared-nothing parallel architecture. There are two approaches:

  • First approach - 
    In this approach, each CPU executes the same task against a different portion of the data.
  • Second approach -
    In this approach, the task is divided into distinct subtasks, with each CPU executing a different subtask.
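
The first approach (the same task applied to different portions of the data) might look roughly like the sketch below. The `price` column and the function names are assumptions made for illustration. Threads are used only to show the structure; in CPython they do not give true CPU parallelism for this kind of work, and a real DBMS would use separate processes or nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """The same task, run by every worker on its own portion of the data."""
    return sum(row["price"] for row in chunk)


def parallel_total(rows, n_workers):
    """Compute SUM(price) by combining per-portion partial sums."""
    # Split the rows into n_workers portions by strided slicing.
    chunks = [rows[w::n_workers] for w in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Merge step: combine the partial results into the final answer.
    return sum(partials)


rows = [{"price": p} for p in range(1, 101)]  # hypothetical table
total = parallel_total(rows, 4)
```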

3. Inter-query parallelism :
In inter-query parallelism, multiple transactions are executed in parallel, each on a different CPU. It is also called parallel transaction processing. The DBMS uses transaction dispatching to carry out inter-query parallelism, and may also rely on other methods, such as efficient lock management. Each individual query still runs sequentially, so this form of parallelism does not speed up long-running queries. The DBMS must also keep track of the locks held by different transactions running on different processes. Inter-query parallelism on a shared-disk architecture performs best when the transactions that execute in parallel do not access the same data. It is the easiest form of parallelism to support in a DBMS, and it increases transaction throughput.
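
As a rough sketch of the idea, the snippet below dispatches two independent read-only queries to separate workers. Each query runs sequentially inside its own worker; only the set of queries as a whole runs concurrently. The `VEHICLES` data, its `wheels` column, and the function names are hypothetical, and because both queries only read, no lock management is shown.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a Vehicles table.
VEHICLES = [
    {"model_number": 30, "wheels": 4},
    {"model_number": 10, "wheels": 2},
    {"model_number": 20, "wheels": 4},
]


def count_four_wheelers():
    """Query 1: count the vehicles that have four wheels."""
    return sum(1 for v in VEHICLES if v["wheels"] == 4)


def max_model_number():
    """Query 2: find the largest model number."""
    return max(v["model_number"] for v in VEHICLES)


def run_queries_concurrently(queries):
    """Dispatch each query to its own worker and collect results in order."""
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        futures = [pool.submit(q) for q in queries]
        return [f.result() for f in futures]


results = run_queries_concurrently([count_four_wheelers, max_model_number])
```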

4. Intra-operation parallelism :
Intra-operation parallelism is a form of parallelism in which the execution of each individual operation of a query, such as a sort, join, or projection, is parallelized. The degree of parallelism achievable this way is very high, and this type of parallelism is natural in database systems. Let's take an SQL query as an example:

SELECT * FROM Vehicles ORDER BY Model_Number; 

In the above query, the relational operation is sorting. Since a relation can hold a large number of records, the sort can be performed on different subsets of the relation on multiple processors, which reduces the time required to sort the data.
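
One way to picture a parallel sort is to sort each subset independently and then merge the sorted runs. The sketch below is an assumption about how this could be arranged, not a description of any particular DBMS; `parallel_sort` and the sample model numbers are hypothetical, and threads stand in for separate processors.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def parallel_sort(values, n_workers):
    """Sort subsets of the input independently, then merge the sorted runs."""
    # Each worker receives its own subset of the relation.
    chunks = [values[w::n_workers] for w in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # heapq.merge lazily combines already-sorted inputs into one sorted stream.
    return list(merge(*sorted_chunks))


model_numbers = [42, 7, 19, 88, 3, 56, 21, 64]  # hypothetical Model_Number values
ordered = parallel_sort(model_numbers, 3)
```

The final merge is sequential here; range-partitioning the input first would let each worker produce a disjoint, already-ordered slice so that the runs only need to be concatenated.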

5. Inter-operation parallelism :
When different operations in a query expression are executed in parallel, it is called inter-operation parallelism. It is of two types:

  • Pipelined parallelism -
     In pipelined parallelism, the output rows of one operation are consumed by a second operation even before the first operation has produced its entire output. The two operations can run simultaneously on different CPUs, so that one operation consumes tuples in parallel with the other operation producing them. It is useful with a small number of CPUs and avoids writing intermediate results to disk.
  • Independent parallelism -
    In independent parallelism, operations in a query expression that do not depend on each other are executed in parallel. It is most useful when only a low degree of parallelism is required.
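
The pipelined case can be sketched with a producer and a consumer connected by a small bounded queue. This is a minimal illustration with hypothetical names (`scan`, `select_even`, `run_pipeline`): the scan operation emits rows one at a time, and the selection operation starts consuming them before the scan has finished, so the full intermediate result is never materialized.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the producer's output


def scan(rows, out_q):
    """First operation: emit rows one at a time."""
    for row in rows:
        out_q.put(row)
    out_q.put(_DONE)


def select_even(in_q, results):
    """Second operation: consume rows as they arrive and keep the even ones."""
    while True:
        row = in_q.get()
        if row is _DONE:
            break
        if row % 2 == 0:
            results.append(row)


def run_pipeline(rows):
    """Run both operations concurrently, linked by a bounded buffer."""
    q = queue.Queue(maxsize=2)  # small buffer instead of a full intermediate result
    results = []
    producer = threading.Thread(target=scan, args=(rows, q))
    consumer = threading.Thread(target=select_even, args=(q, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


evens = run_pipeline(range(10))
```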

Author: tarunsinghwap7