Peterson’s Algorithm for Mutual Exclusion | Set 2 (CPU Cycles and Memory Fence)

Last Updated : 07 Apr, 2025

Problem: Given two processes i and j, write a program that guarantees mutual exclusion between the two without any additional hardware support.

Wastage of CPU clock cycles

In layman's terms, while a thread was waiting for its turn in the earlier code, it sat in a long while loop that tested the condition millions of times per second, doing unnecessary computation. There is a better way to wait, known as “yield”.

To understand what it does, we need to look at how the process scheduler works in Linux. The idea described here is a simplified version of the scheduler; the actual implementation is considerably more complicated.

Consider the following example.
There are three processes, P1, P2 and P3. Process P3 has a while loop similar to the one in our code, doing no useful computation, and it exits the loop only when P2 finishes its execution. The scheduler puts all of them in a round-robin queue. Now, say the clock speed of the processor is 1,000,000 cycles per second (1 MHz), and the scheduler allocates 100 clock cycles to each process in each iteration. Then P1 runs first for 100 cycles (0.0001 seconds), then P2 (0.0001 seconds), followed by P3 (0.0001 seconds). Since there are no more processes, this cycle repeats until P2 ends, after which P3 leaves its loop, executes and eventually terminates.

Every time slice that P3 spends spinning is a complete waste of 100 CPU clock cycles. To avoid this, we voluntarily give up the CPU time slice, i.e. yield, which ends the time slice so that the scheduler picks the next process to run. Now we test our condition once and then give up the CPU. If the test takes 25 clock cycles, we save 75% of the computation in each time slice.

Considering a processor clock speed of 1 MHz, this is a lot of saving!
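The arithmetic in this example can be checked with a few lines of C. This is only a sketch of the simplified scheduler model used above; the function names slice_seconds, fraction_saved and print_example are made up for illustration and are not part of any scheduler API.

```c
#include <stdio.h>

/* Duration of one time slice in seconds, given the clock speed in Hz
 * and the number of clock cycles in a slice. */
double slice_seconds(double clock_hz, unsigned cycles_per_slice)
{
    return cycles_per_slice / clock_hz;
}

/* Fraction of a time slice that is saved when the thread tests the
 * condition once and then yields instead of spinning. */
double fraction_saved(unsigned cycles_per_slice, unsigned cycles_per_test)
{
    return 1.0 - (double)cycles_per_test / cycles_per_slice;
}

/* Prints the figures used in the text:
 * 1 MHz clock, 100 cycles per slice, 25 cycles per test. */
void print_example(void)
{
    printf("slice = %.4f s, saved = %.0f%%\n",
           slice_seconds(1e6, 100),
           100.0 * fraction_saved(100, 25));
}
```

Calling print_example() from main() prints a slice length of 0.0001 s and a saving of 75%, matching the numbers in the example.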
Different operating systems provide different functions for this. Linux provides sched_yield(), declared in <sched.h>.

C
void lock(int self)
{
    flag[self] = 1;
    turn = 1-self;

    while (flag[1-self] == 1 &&
           turn == 1-self)

        // Only change is the addition of
        // sched_yield() call
        sched_yield();
}

Memory fence.

The code in the earlier tutorial might have worked on most systems, but it was not 100% correct. The logic was perfect, but most modern CPUs employ performance optimizations that can result in out-of-order execution. This reordering of memory operations (loads and stores) normally goes unnoticed within a single thread of execution, but can cause unpredictable behaviour in concurrent programs.
Consider this example:

C
while (f == 0)
    ;

// Memory fence required here

printf("%d\n", x);

In the above example, the compiler considers the two statements independent of each other and may re-order them to make the code more efficient, which can lead to problems in concurrent programs. To avoid this, we place a memory fence, which tells the compiler (and the processor) that memory operations must not be moved across the barrier.
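The example above can be turned into a small two-thread program. The sketch below uses C11 <stdatomic.h> fences rather than the gcc built-in used elsewhere in this article; that substitution, and the names writer, wait_and_read and run_fence_demo, are assumptions made for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>

static int x = 0;          /* ordinary data, written before the flag */
static atomic_int f = 0;   /* flag the reader spins on */

/* Writer thread: publish x, then raise the flag. */
static void *writer(void *arg)
{
    (void)arg;
    x = 42;
    /* Fence: the write to x may not move below the write to f. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&f, 1, memory_order_relaxed);
    return NULL;
}

/* Reader: spin until f is set, then return the value of x it observes. */
int wait_and_read(void)
{
    while (atomic_load_explicit(&f, memory_order_relaxed) == 0)
        ;   /* while (f == 0); */

    /* Memory fence required here: the read of x may not move
     * above the read of f. */
    atomic_thread_fence(memory_order_acquire);
    return x;
}

/* Starts the writer, waits for the flag and returns the value read
 * from x, or -1 if the thread could not be created. */
int run_fence_demo(void)
{
    pthread_t t;
    x = 0;
    atomic_store(&f, 0);
    if (pthread_create(&t, NULL, writer, NULL) != 0)
        return -1;
    int seen = wait_and_read();
    pthread_join(t, NULL);
    return seen;
}
```

With both fences in place, a reader that sees f == 1 is guaranteed to see x == 42. Without them, the compiler or the CPU would be free to perform the read of x before the loop has finished.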

So the order of statements,  

flag[self] = 1; 
turn = 1-self; 
while (turn condition check) 
yield(); 
 

has to be exactly the same in order for the lock to work; otherwise mutual exclusion can break, with both threads entering the critical section at the same time.

To ensure this, compilers provide a built-in that prevents the reordering of memory operations across the barrier. In the case of gcc, it is __sync_synchronize().
So the modified code becomes:
Full implementation:

C++
// Filename: peterson_yieldlock_memoryfence.cpp
// Use below command to compile:
// g++ -pthread peterson_yieldlock_memoryfence.cpp -o peterson_yieldlock_memoryfence

#include<iostream>
#include<thread>
#include<atomic>

std::atomic<int> flag[2];
std::atomic<int> turn;
const int MAX = 1e9;
int ans = 0;

void lock_init()
{
    // Initialize lock by resetting the desire of
    // both the threads to acquire the locks.
    // And, giving turn to one of them.
    flag[0] = flag[1] = 0;

    turn = 0;
}

// Executed before entering critical section
void lock(int self)
{
    // Set flag[self] = 1 saying you want
    // to acquire lock
    flag[self]=1;

    // But, first give the other thread the
    // chance to acquire lock
    turn = 1-self;

    // Memory fence to prevent the reordering
    // of instructions beyond this barrier.
    std::atomic_thread_fence(std::memory_order_seq_cst);

    // Wait until the other thread loses the
    // desire to acquire  lock or it is your
    // turn to get the lock.
    while (flag[1-self]==1 && turn==1-self)

        // Yield to avoid wastage of resources.
        std::this_thread::yield();
}

// Executed after leaving critical section
void unlock(int self)
{
    // You do not desire to acquire lock in future.
    // This will allow the other thread to acquire
    // the lock.
    flag[self]=0;
}

// A Sample function run by two threads created
// in main()
void func(int s)
{
    int i = 0;
    int self = s;
    std::cout << "Thread Entered: " << self << std::endl;
    lock(self);

    // Critical section (Only one thread
    // can enter here at a time)
    for (i=0; i<MAX; i++)
        ans++;

    unlock(self);
}

// Driver code
int main()
{
    // Initialize the lock
    lock_init();

    // Create two threads (both run func)
    std::thread t1(func, 0);
    std::thread t2(func, 1);

    // Wait for the threads to end.
    t1.join();
    t2.join();

    std::cout << "Actual Count: " << ans << " | Expected Count: " << MAX*2 << std::endl;

    return 0;
}
C
// Filename: peterson_yieldlock_memoryfence.c
// Use below command to compile:
// gcc -pthread peterson_yieldlock_memoryfence.c -o peterson_yieldlock_memoryfence

#include<stdio.h>
#include<stdint.h>
#include<pthread.h>
#include<sched.h>
#include "mythreads.h"

// volatile keeps the compiler from caching these
// shared variables in registers.
volatile int flag[2];
volatile int turn;
const int MAX = 1e9;
int ans = 0;

void lock_init()
{
    // Initialize lock by resetting the desire of
    // both the threads to acquire the locks.
    // And, giving turn to one of them.
    flag[0] = flag[1] = 0;

    turn = 0;
}

// Executed before entering critical section
void lock(int self)
{
    // Set flag[self] = 1 saying you want
    // to acquire lock
    flag[self]=1;

    // But, first give the other thread the
    // chance to acquire lock
    turn = 1-self;

    // Memory fence to prevent the reordering
    // of instructions beyond this barrier.
    __sync_synchronize();

    // Wait until the other thread loses the
    // desire to acquire  lock or it is your
    // turn to get the lock.
    while (flag[1-self]==1 && turn==1-self)

        // Yield to avoid wastage of resources.
        sched_yield();
}

// Executed after leaving critical section
void unlock(int self)
{
    // Make sure the writes of the critical section
    // are completed before the lock is released.
    __sync_synchronize();

    // You do not desire to acquire lock in future.
    // This will allow the other thread to acquire
    // the lock.
    flag[self]=0;
}

// A Sample function run by two threads created
// in main()
void* func(void *s)
{
    int i = 0;

    // The thread id (0 or 1) is passed in the
    // pointer value itself.
    int self = (int)(intptr_t)s;
    printf("Thread Entered: %d\n",self);
    lock(self);

    // Critical section (Only one thread
    // can enter here at a time)
    for (i=0; i<MAX; i++)
        ans++;

    unlock(self);
    return NULL;
}

// Driver code
int main()
{
    pthread_t p1, p2;

    // Initialize the lock
    lock_init();

    // Create two threads (both run func)
    Pthread_create(&p1, NULL, func, (void*)0);
    Pthread_create(&p2, NULL, func, (void*)1);

    // Wait for the threads to end.
    Pthread_join(p1, NULL);
    Pthread_join(p2, NULL);

    printf("Actual Count: %d | Expected Count:"
           " %d\n",ans,MAX*2);

    return 0;
}
Java
import java.util.concurrent.atomic.AtomicInteger;

public class PetersonYieldLockMemoryFence {
    static AtomicInteger[] flag = new AtomicInteger[2];
    static AtomicInteger turn = new AtomicInteger();
    static final int MAX = 1000000000;
    static int ans = 0;

    static void lockInit() {
        flag[0] = new AtomicInteger();
        flag[1] = new AtomicInteger();
        flag[0].set(0);
        flag[1].set(0);
        turn.set(0);
    }

    static void lock(int self) {
        flag[self].set(1);
        turn.set(1 - self);
        // Memory fence to prevent the reordering of instructions beyond this barrier.
        // In Java, volatile variables provide this guarantee implicitly.
        // No direct equivalent to atomic_thread_fence is needed.
        while (flag[1 - self].get() == 1 && turn.get() == 1 - self)
            Thread.yield();
    }

    static void unlock(int self) {
        flag[self].set(0);
    }

    static void func(int s) {
        int i = 0;
        int self = s;
        System.out.println("Thread Entered: " + self);
        lock(self);

        // Critical section (Only one thread can enter here at a time)
        for (i = 0; i < MAX; i++)
            ans++;

        unlock(self);
    }

    public static void main(String[] args) {
        // Initialize the lock
        lockInit();

        // Create two threads (both run func)
        Thread t1 = new Thread(() -> func(0));
        Thread t2 = new Thread(() -> func(1));

        // Start the threads
        t1.start();
        t2.start();

        try {
            // Wait for the threads to end.
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        System.out.println("Actual Count: " + ans + " | Expected Count: " + MAX * 2);
    }
}
Python
import threading
import time

flag = [0, 0]
turn = 0
MAX = 10**9
ans = 0

def lock_init():
    # This function initializes the lock by resetting the flags and turn.
    global flag, turn
    flag = [0, 0]
    turn = 0

def lock(self):
    # This function is executed before entering the critical section. It sets the flag for the current thread and gives the turn to the other thread.
    global flag, turn
    flag[self] = 1
    turn = 1 - self
    while flag[1-self] == 1 and turn == 1-self:
        # Yield to the other thread instead of spinning.
        time.sleep(0)

def unlock(self):
    # This function is executed after leaving the critical section. It resets the flag for the current thread.
    global flag
    flag[self] = 0

def func(s):
    # This function is executed by each thread. It locks the critical section, increments the shared variable, and then unlocks the critical section.
    global ans
    self = s
    print(f"Thread Entered: {self}")
    lock(self)
    for _ in range(MAX):
        ans += 1
    unlock(self)

def main():
    # This is the main function where the threads are created and started.
    lock_init()
    t1 = threading.Thread(target=func, args=(0,))
    t2 = threading.Thread(target=func, args=(1,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print(f"Actual Count: {ans} | Expected Count: {MAX*2}")

if __name__ == "__main__":
    main()
JavaScript
class PetersonYieldLockMemoryFence {
    static flag = [0, 0];
    static turn = 0;
    static MAX = 1000000000;
    static ans = 0;

    // Function to acquire the lock
    static async lock(self) {
        PetersonYieldLockMemoryFence.flag[self] = 1;
        PetersonYieldLockMemoryFence.turn = 1 - self;

        // Asynchronous loop with a small delay to yield
        while (PetersonYieldLockMemoryFence.flag[1 - self] == 1 &&
            PetersonYieldLockMemoryFence.turn == 1 - self) {
            await new Promise(resolve => setTimeout(resolve, 0));
        }
    }

    // Function to release the lock
    static unlock(self) {
        PetersonYieldLockMemoryFence.flag[self] = 0;
    }

    // Function representing the critical section
    static func(s) {
        let i = 0;
        let self = s;
        console.log("Thread Entered: " + self);

        // Lock the critical section
        PetersonYieldLockMemoryFence.lock(self).then(() => {
            // Critical section (Only one thread can enter here at a time)
            for (i = 0; i < PetersonYieldLockMemoryFence.MAX; i++) {
                PetersonYieldLockMemoryFence.ans++;
            }

            // Release the lock
            PetersonYieldLockMemoryFence.unlock(self);
        });
    }

    // Main function
    static main() {
        // Create two threads (both run func)
        const t1 = new Thread(() => PetersonYieldLockMemoryFence.func(0));
        const t2 = new Thread(() => PetersonYieldLockMemoryFence.func(1));

        // Start the threads
        t1.start();
        t2.start();

        // Wait for the threads to end.
        setTimeout(() => {
            console.log("Actual Count: " + PetersonYieldLockMemoryFence.ans + " | Expected Count: " + PetersonYieldLockMemoryFence.MAX * 2);
        }, 1000); // Delay for a while to ensure threads finish
    }
}

// Define a simple Thread class for simulation
class Thread {
    constructor(func) {
        this.func = func;
    }

    start() {
        this.func();
    }
}

// Run the main function
PetersonYieldLockMemoryFence.main();
C++
// mythreads.h (A wrapper header file with assert statements)
#ifndef __MYTHREADS_h__
#define __MYTHREADS_h__

#include <pthread.h>
#include <cassert>
#include <sched.h>

// Function to lock a pthread mutex
void Pthread_mutex_lock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    assert(rc == 0); // Assert that the mutex was locked successfully
}

// Function to unlock a pthread mutex
void Pthread_mutex_unlock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_unlock(m);
    assert(rc == 0); // Assert that the mutex was unlocked successfully
}

// Function to create a pthread
void Pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                void *(*start_routine)(void*), void *arg)
{
    int rc = pthread_create(thread, attr, start_routine, arg);
    assert(rc == 0); // Assert that the thread was created successfully
}

// Function to join a pthread
void Pthread_join(pthread_t thread, void **value_ptr)
{
    int rc = pthread_join(thread, value_ptr);
    assert(rc == 0); // Assert that the thread was joined successfully
}

#endif // __MYTHREADS_h__
C
// mythreads.h (A wrapper header file with assert
// statements)
#ifndef __MYTHREADS_h__
#define __MYTHREADS_h__

#include <pthread.h>
#include <assert.h>
#include <sched.h>

void Pthread_mutex_lock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    assert(rc == 0);
}

void Pthread_mutex_unlock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_unlock(m);
    assert(rc == 0);
}

void Pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                void *(*start_routine)(void*), void *arg)
{
    int rc = pthread_create(thread, attr, start_routine, arg);
    assert(rc == 0);
}

void Pthread_join(pthread_t thread, void **value_ptr)
{
    int rc = pthread_join(thread, value_ptr);
    assert(rc == 0);
}

#endif // __MYTHREADS_h__
Python
import threading

# Function to lock a thread lock
def Thread_lock(lock):
    lock.acquire()  # Acquire the lock
    # No need for assert in Python, acquire will raise an exception if it fails

# Function to unlock a thread lock
def Thread_unlock(lock):
    lock.release()  # Release the lock
    # No need for assert in Python, release will raise an exception if it fails

# Function to create a thread
def Thread_create(target, args=()):
    thread = threading.Thread(target=target, args=args)
    thread.start()  # Start the thread
    # No need for assert in Python, thread.start() will raise an exception if it fails
    return thread  # Return the thread so that it can be joined later

# Function to join a thread
def Thread_join(thread):
    thread.join()  # Wait for the thread to finish
    # No need for assert in Python, thread.join() will raise an exception if it fails

Output: 

Thread Entered: 1
Thread Entered: 0
Actual Count: 2000000000 | Expected Count: 2000000000
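
For a quicker check of the lock, the count can be kept small and the lock taken around every single increment, which exercises lock() and unlock() far more often. The sketch below is one possible way to do this, assuming C11 <stdatomic.h> with its default sequentially consistent operations in place of __sync_synchronize(); the names p_lock, p_unlock, worker and run_check are hypothetical.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_int flag[2];   /* seq_cst atomics, so no explicit fence */
static atomic_int turn;
static long counter;         /* shared data protected by the lock */

static void p_lock(int self)
{
    atomic_store(&flag[self], 1);
    atomic_store(&turn, 1 - self);
    while (atomic_load(&flag[1 - self]) == 1 &&
           atomic_load(&turn) == 1 - self)
        sched_yield();       /* give up the time slice while waiting */
}

static void p_unlock(int self)
{
    atomic_store(&flag[self], 0);
}

struct job { int self; long iters; };

/* Each worker takes the lock once per increment. */
static void *worker(void *arg)
{
    struct job *j = arg;
    for (long i = 0; i < j->iters; i++) {
        p_lock(j->self);
        counter++;           /* critical section */
        p_unlock(j->self);
    }
    return NULL;
}

/* Runs two workers and returns the final counter, which should be
 * 2 * iters. Returns -1 if a thread could not be created. */
long run_check(long iters)
{
    pthread_t t[2];
    struct job jobs[2] = { {0, iters}, {1, iters} };
    int started = 0;

    counter = 0;
    atomic_store(&flag[0], 0);
    atomic_store(&flag[1], 0);
    atomic_store(&turn, 0);

    for (; started < 2; started++)
        if (pthread_create(&t[started], NULL, worker, &jobs[started]) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);

    return started == 2 ? counter : -1;
}
```

A call such as run_check(100000) should return 200000 if mutual exclusion holds. A smaller value would indicate lost updates.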


 




Pinkesh Badjatiya