
Loop Level Parallelism in Computer Architecture

Last Updated : 22 Jun, 2022

Since the beginning of multiprocessors, programmers have faced the challenge of how to take advantage of the processing power available. Sometimes parallelism is available, but in a form that is too complicated for the programmer to reason about. In addition, there is a large body of sequential code that for years enjoyed incremental performance improvements from advances in single-core execution. For a long time, automatic parallelization has been seen as a good solution to some of these challenges, because it removes the programmer's burden of understanding and expressing the parallelism that exists in the algorithm.

Loop-level parallelism extracts parallel tasks from loops in order to speed up a program. The opportunity for it typically arises where data is stored in random-access data structures such as arrays. A program that runs sequentially iterates over the array and operates on one index at a time, whereas a program that exploits loop-level parallelism uses multiple threads or processes that operate on several indices concurrently.
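
The contrast between the sequential and the parallel form of a loop can be sketched in Python as follows. This is only an illustration: the function `scale` and the pool size of 4 are assumptions, not part of the article. In standard CPython builds the global interpreter lock generally prevents CPU-bound threads from running simultaneously, so the sketch shows the structure of the transformation rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def scale(value):
    # Hypothetical per-element operation; it depends only on its own input.
    return value * 2


data = list(range(8))

# Sequential version: the loop touches one index at a time.
sequential = [scale(v) for v in data]

# Loop-level parallel version: iterations are handed to a pool of worker threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(scale, data))

# Both versions produce the same result because the iterations are independent.
assert parallel == sequential
```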

Loop Level Parallelism Types:

  1. DO-ALL parallelism (Independent multithreading (IMT))
  2. DO-ACROSS parallelism (Cyclic multithreading (CMT))
  3. DO-PIPE parallelism (Pipelined multithreading (PMT))

1. DO-ALL parallelism (Independent multithreading (IMT)):

In DO-ALL parallelism, the iterations of the loop are executed in parallel and completely independently, with no inter-thread communication. The iterations are assigned to threads in a round-robin fashion; for example, with 4 cores, core 0 executes iterations 0, 4, 8, 12, and so on (see Figure). This type of parallelization is possible only when the loop contains no loop-carried dependencies, or can be transformed so that no conflicts occur between iterations that execute simultaneously. Loops that can be parallelized in this way are likely to see good speedups, since there is no inter-thread communication overhead. However, the lack of communication also limits the applicability of the technique, because many loops are not amenable to this form of parallelization.

 

Figure: DO-ALL parallelism (Independent multithreading (IMT))
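
The round-robin assignment described above can be sketched with plain Python threads. The names `do_all`, `iterations_for`, and `NUM_CORES` are hypothetical helpers introduced for this illustration, and the worker threads stand in for cores.

```python
import threading

NUM_CORES = 4  # assumed worker count, matching the 4-core example in the text


def iterations_for(worker_id, n_iterations, num_workers=NUM_CORES):
    # Worker k gets iterations k, k + num_workers, k + 2 * num_workers, ...
    return list(range(worker_id, n_iterations, num_workers))


def do_all(body, n_iterations, num_workers=NUM_CORES):
    """Run body(i) for every i, with no communication between workers."""
    results = [None] * n_iterations

    def worker(worker_id):
        for i in iterations_for(worker_id, n_iterations, num_workers):
            results[i] = body(i)  # each iteration writes only its own slot

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


squares = do_all(lambda i: i * i, 16)
core0 = iterations_for(0, 16)  # the iterations that "core 0" executes
```

Because every iteration writes to a distinct slot of `results` and reads nothing written by another iteration, no locks or signals are needed, which is the property that makes a loop DO-ALL.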

2. DO-ACROSS parallelism (Cyclic multithreading (CMT)):

DO-ACROSS parallelism, like independent multithreading, assigns iterations to threads in a round-robin manner. The optimization techniques used to increase parallelism in independent multithreading loops are also available in cyclic multithreading. In this technique, the compiler identifies the dependencies, and the start of each loop iteration is delayed until all dependencies from previous iterations are satisfied. In this manner, the parallel portion of one iteration is overlapped with the sequential portion of the subsequent iteration, which results in parallel execution. For example, in the figure the statement x = x->next; causes a loop-carried dependence, since it cannot be evaluated until the same statement has completed in the previous iteration. Once all cores have started their first iteration, this scheme can approach linear speedup if the parallel part of the loop is large enough to keep the cores fully utilized.

 

 

Figure: DO-ACROSS parallelism (Cyclic multithreading (CMT))
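
A minimal sketch of this scheme for a linked-list loop is given below. Here the synchronization is written by hand with `threading.Event` objects rather than generated by a compiler, and `Node`, `build_list`, `do_across`, and the `work` callback are hypothetical names chosen for the illustration. Each iteration waits until the previous iteration has advanced the shared pointer, performs its own `x = x.next` step, releases the next iteration, and only then runs the independent part of the body.

```python
import threading


class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def build_list(values):
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def do_across(head, n_iterations, work, num_workers=4):
    results = [None] * n_iterations
    # ready[i] is set once iteration i may run its sequential part.
    ready = [threading.Event() for _ in range(n_iterations + 1)]
    cursor = {"x": head}  # loop-carried state shared by all iterations
    ready[0].set()

    def worker(worker_id):
        for i in range(worker_id, n_iterations, num_workers):
            ready[i].wait()          # delay until the dependence is satisfied
            node = cursor["x"]       # sequential part: read the carried pointer
            cursor["x"] = node.next  # x = x->next
            ready[i + 1].set()       # let the next iteration begin
            results[i] = work(node.value)  # parallel part overlaps later iterations

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


values = [3, 1, 4, 1, 5, 9, 2, 6]
doubled = do_across(build_list(values), len(values), lambda v: v * 2)
```

The pointer update is the only serialized section. The longer `work` takes relative to that update, the more iterations overlap, which matches the condition for near-linear speedup stated above.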

3. DO-PIPE parallelism (Pipelined multithreading (PMT)):

DO-PIPE parallelism is a way of parallelizing loops that have cross-iteration dependencies. In this approach, the loop body is divided into a number of pipeline stages, and each stage is assigned to a different core. Each iteration of the loop is then distributed across the cores, with every stage of the iteration executed by the core that was assigned that stage. An individual core executes only the code associated with its own stage. For instance, in the figure the loop body is divided into 4 stages: A, B, C, and D. Each iteration is distributed across all four cores, but each stage is executed by only one core.

Figure: DO-PIPE parallelism (Pipelined multithreading (PMT))
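
The four-stage arrangement can be sketched with one thread per stage and FIFO queues between stages. The helper names `stage_worker` and `do_pipe`, the sentinel `_DONE`, and the four arithmetic stage functions are assumptions made for this illustration; they stand in for the stages A, B, C, and D of the figure.

```python
import queue
import threading

_DONE = object()  # sentinel that tells each stage the loop has finished


def stage_worker(fn, inbox, outbox):
    # One thread ("core") runs exactly one stage for every iteration.
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(fn(item))


def do_pipe(stages, inputs):
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage_worker, args=(fn, queues[k], queues[k + 1]))
        for k, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in inputs:       # each input value is one loop iteration
        queues[0].put(item)
    queues[0].put(_DONE)

    outputs = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        outputs.append(item)
    for t in threads:
        t.join()
    return outputs


# Hypothetical stages A, B, C, D of the loop body.
stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3, lambda v: v * v]
result = do_pipe(stages, range(5))
```

While stage D processes iteration 0, stage C can already work on iteration 1, stage B on iteration 2, and so on. Any state that one stage carries from iteration to iteration stays local to its thread, so cross-iteration dependencies within a stage require no extra synchronization.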



author
aniruddharouth
Article Tags :
  • Computer Organization and Architecture
  • GATE CS

@GeeksforGeeks, Sanchhaya Education Private Limited, All rights reserved