Rabin-Karp Algorithm for Pattern Searching

Last Updated : 26 Feb, 2025

Given a text T[0...n-1] and a pattern P[0...m-1], write a function search(char P[], char T[]) that prints all occurrences of P[] in T[] using the Rabin-Karp algorithm. You may assume that n > m.

Examples: 

Input:  T[] = “THIS IS A TEST TEXT”, P[] = “TEST”
Output: Pattern found at index 10

Input:  T[] = “AABAACAADAABAABA”, P[] = “AABA”
Output: Pattern found at index 0
        Pattern found at index 9
        Pattern found at index 12

Rabin-Karp Algorithm

The Naive String Matching algorithm slides a window of the pattern’s length over the text and compares each window with the pattern character by character.
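As a baseline for comparison, the naive approach can be sketched as follows. The function name `naive_search` and its list return value are illustrative choices, not part of the original article.

```python
def naive_search(pattern, text):
    """Return every index at which pattern occurs in text (naive window scan)."""
    m, n = len(pattern), len(text)
    matches = []
    for start in range(n - m + 1):
        # Compare the window text[start:start+m] with the pattern,
        # one character at a time, stopping at the first mismatch.
        if all(text[start + k] == pattern[k] for k in range(m)):
            matches.append(start)
    return matches


print(naive_search("TEST", "THIS IS A TEST TEXT"))   # [10]
print(naive_search("AABA", "AABAACAADAABAABA"))      # [0, 9, 12]
```

Every window costs up to m character comparisons here, which is the work Rabin-Karp tries to avoid by comparing hash values first.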

Like the Naive algorithm, the Rabin-Karp algorithm examines every window of the text. Unlike the Naive algorithm, it first compares the hash value of the pattern with the hash value of the current window, and compares individual characters only when the two hash values match. The Rabin-Karp algorithm therefore needs hash values for the following strings:

  • The pattern itself
  • Every substring of the text of length m, where m is the length of the pattern

How is Hash Value calculated in Rabin-Karp?

The hash value is used to efficiently check for potential matches between the pattern and substrings of a larger text. It is calculated using a rolling hash function, which updates the hash value for a new substring by removing the contribution of the outgoing character and adding the contribution of the incoming character. This makes it possible to slide the pattern over the text and obtain the hash value of each substring without recalculating the entire hash from scratch.

Here’s how the hash value is typically calculated in Rabin-Karp:

Step 1: Choose a suitable base and a modulus:

  • Select a prime number ‘p’ as the modulus. Reducing every intermediate result modulo p keeps the hash value within a fixed range, which avoids overflow, and a prime modulus helps distribute hash values evenly.
  • Choose a base ‘b’, which is often the size of the character set (e.g., 256 for 8-bit characters). A prime base is sometimes used instead.

Step 2: Initialize the hash value:

  • Set an initial hash value ‘hash‘ to 0.

Step 3: Calculate the initial hash value for the pattern:

  • Iterate over each character in the pattern from left to right.
  • For each character ‘c’ at position ‘i’, calculate its contribution to the hash value as ‘(c * b^(pattern_length - i - 1)) % p’ and add it to ‘hash’, reducing the sum modulo p.
  • This gives you the hash value for the entire pattern.
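The calculation in Step 3 can be sketched in two equivalent ways: term by term, as described above, and with Horner's rule, which is how the implementations in this article build the hash. The constants `BASE` and `MODULUS` and both function names are assumptions made for illustration.

```python
BASE = 256      # assumed base: one value per 8-bit character
MODULUS = 101   # assumed small prime, picked only to keep the numbers readable


def pattern_hash(s, base=BASE, modulus=MODULUS):
    """Hash s left to right with Horner's rule, reducing modulo `modulus` each step."""
    value = 0
    for ch in s:
        value = (value * base + ord(ch)) % modulus
    return value


def pattern_hash_by_terms(s, base=BASE, modulus=MODULUS):
    """Same hash, written as the per-character sum from Step 3."""
    m = len(s)
    total = 0
    for i, ch in enumerate(s):
        # Contribution of character i: c * b^(m - i - 1) mod p
        total += (ord(ch) * pow(base, m - i - 1, modulus)) % modulus
    return total % modulus


print(pattern_hash("GEEK"), pattern_hash_by_terms("GEEK"))  # prints: 27 27
```

Horner's rule avoids computing each power of the base separately, so the hash of a string of length m takes m multiplications.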

Step 4: Slide the pattern over the text:

  • Start by calculating the hash value for the first substring of the text that is the same length as the pattern.

Step 5: Update the hash value for each subsequent substring:

  • To slide the window one position to the right, remove the contribution of the leftmost character and add the contribution of the new character on the right.
  • The formula for updating the hash value when the window moves from ending at position ‘i - 1’ to ending at position ‘i’ is:

hash = ((hash - text[i - pattern_length] * b^(pattern_length - 1)) * b + text[i]) % p

If the % operator of the language can return a negative result, add p to the result to make it non-negative.
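A minimal sketch of the rolling update, checked against a full recomputation for every window. The helper names `direct_hash`, `roll`, and `window_hashes`, and the default base 256 and modulus 101, are assumptions for illustration.

```python
def direct_hash(s, base=256, modulus=101):
    """Recompute the hash of s from scratch (used only to check the rolling update)."""
    value = 0
    for ch in s:
        value = (value * base + ord(ch)) % modulus
    return value


def roll(prev_hash, outgoing, incoming, high_power, base=256, modulus=101):
    """Slide the window by one: drop `outgoing` on the left, append `incoming` on the right."""
    # high_power is base^(m - 1) mod modulus, the weight of the leftmost character.
    # Python's % yields a non-negative result for a positive modulus,
    # so no "+ modulus" correction is needed here.
    return ((prev_hash - ord(outgoing) * high_power) * base + ord(incoming)) % modulus


def window_hashes(text, m, base=256, modulus=101):
    """Return the hash of every length-m window of text, using the rolling update."""
    if m == 0 or m > len(text):
        return []
    high_power = pow(base, m - 1, modulus)
    current = direct_hash(text[:m], base, modulus)
    hashes = [current]
    for i in range(m, len(text)):
        current = roll(current, text[i - m], text[i], high_power, base, modulus)
        hashes.append(current)
    return hashes


text = "GEEKS FOR GEEKS"
rolled = window_hashes(text, 4)
recomputed = [direct_hash(text[i:i + 4]) for i in range(len(text) - 3)]
print(rolled == recomputed)  # True
```

Each call to `roll` does a constant amount of work, whereas `direct_hash` costs time proportional to the window length.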

Step 6: Compare hash values:

  • When the hash value of a substring in the text matches the hash value of the pattern, it’s a potential match.
  • If the hash values match, we should perform a character-by-character comparison to confirm the match, as hash collisions can occur.
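To see why the character-by-character check in Step 6 is needed, the sketch below searches for a different string with the same hash as a given pattern. The function names, the uppercase-only search alphabet, and the small modulus 101 are assumptions chosen to make a collision easy to find.

```python
from itertools import product
from string import ascii_uppercase


def poly_hash(s, base=256, modulus=101):
    """Polynomial hash of s, reduced modulo `modulus` at every step."""
    value = 0
    for ch in s:
        value = (value * base + ord(ch)) % modulus
    return value


def find_collision(target, alphabet=ascii_uppercase):
    """Return some string != target, of the same length, with the same hash (or None)."""
    wanted = poly_hash(target)
    for letters in product(alphabet, repeat=len(target)):
        candidate = "".join(letters)
        if candidate != target and poly_hash(candidate) == wanted:
            return candidate
    return None


pattern = "GEEK"
other = find_collision(pattern)
print(other, poly_hash(other) == poly_hash(pattern), other == pattern)
```

The two hash values are equal even though the strings differ, so a hash match alone cannot be reported as an occurrence.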

Illustration of the above algorithm:

[Figure: rabin-karp-final]

Step-by-step approach:

  • First, calculate the hash value of the pattern.
  • Iterate over the text from its first character:
    • Calculate the hash value of the current substring of length m.
    • If the hash values of the current substring and the pattern are equal, check whether the substring itself equals the pattern.
    • If it does, store the starting index as a valid answer. Otherwise, continue with the next substring.
  • Return the starting indices as the required answer.
C++
```cpp
/* Following program is a C++ implementation of Rabin Karp
   Algorithm given in the CLRS book */
#include <iostream>
#include <string>
using namespace std;

// Search the pat string in the txt string
void search(const string& pat, const string& txt, int q)
{
    int M = pat.size();
    int N = txt.size();
    int i, j;
    int p = 0; // hash value for pattern
    int t = 0; // hash value for txt
    int h = 1;
    int d = 256; // d is the number of characters in the input alphabet

    // A pattern longer than the text cannot occur in it
    if (M > N)
        return;

    // The value of h would be "pow(d, M-1)%q"
    for (i = 0; i < M - 1; i++)
        h = (h * d) % q;

    // Calculate the hash value of pattern and first
    // window of text
    for (i = 0; i < M; i++) {
        p = (d * p + pat[i]) % q;
        t = (d * t + txt[i]) % q;
    }

    // Slide the pattern over text one by one
    for (i = 0; i <= N - M; i++) {

        // Check the hash values of current window of text
        // and pattern. If the hash values match then only
        // check for characters one by one
        if (p == t) {
            /* Check for characters one by one */
            for (j = 0; j < M; j++) {
                if (txt[i + j] != pat[j]) {
                    break;
                }
            }

            // if p == t and pat[0...M-1] = txt[i, i+1,
            // ...i+M-1]
            if (j == M)
                cout << "Pattern found at index " << i
                     << endl;
        }

        // Calculate hash value for next window of text:
        // Remove leading digit, add trailing digit
        if (i < N - M) {
            t = (d * (t - txt[i] * h) + txt[i + M]) % q;

            // We might get negative value of t, converting
            // it to positive
            if (t < 0)
                t = (t + q);
        }
    }
}

/* Driver code */
int main()
{
    string txt = "GEEKS FOR GEEKS";
    string pat = "GEEK";

    // A prime number. A larger q gives fewer collisions, but
    // q must stay small enough that the intermediate products
    // in the int arithmetic above cannot overflow (INT_MAX
    // would overflow in h * d and d * p)
    int q = 101;

    // Function Call
    search(pat, txt, q);
    return 0;
}

// This code is contributed by rathbhupendra
```
C
```c
/* Following program is a C implementation of Rabin Karp
   Algorithm given in the CLRS book */
#include <stdio.h>
#include <string.h>

// Search the pat string in the txt string
void search(char pat[], char txt[], int q)
{
    int M = strlen(pat);
    int N = strlen(txt);
    int i, j;
    int p = 0; // hash value for pattern
    int t = 0; // hash value for txt
    int h = 1;
    int d = 256; // d is the number of characters in the input alphabet

    // A pattern longer than the text cannot occur in it
    if (M > N)
        return;

    // The value of h would be "pow(d, M-1)%q"
    for (i = 0; i < M - 1; i++)
        h = (h * d) % q;

    // Calculate the hash value of pattern and first
    // window of text
    for (i = 0; i < M; i++) {
        p = (d * p + pat[i]) % q;
        t = (d * t + txt[i]) % q;
    }

    // Slide the pattern over text one by one
    for (i = 0; i <= N - M; i++) {

        // Check the hash values of current window of text
        // and pattern. If the hash values match then only
        // check for characters one by one
        if (p == t) {
            /* Check for characters one by one */
            for (j = 0; j < M; j++) {
                if (txt[i + j] != pat[j])
                    break;
            }

            // if p == t and pat[0...M-1] = txt[i, i+1,
            // ...i+M-1]
            if (j == M)
                printf("Pattern found at index %d\n", i);
        }

        // Calculate hash value for next window of text:
        // Remove leading digit, add trailing digit
        if (i < N - M) {
            t = (d * (t - txt[i] * h) + txt[i + M]) % q;

            // We might get negative value of t, converting
            // it to positive
            if (t < 0)
                t = (t + q);
        }
    }
}

/* Driver Code */
int main()
{
    char txt[] = "GEEKS FOR GEEKS";
    char pat[] = "GEEK";

    // A prime number
    int q = 101;

    // Function call
    search(pat, txt, q);
    return 0;
}
```
Java
```java
// Following program is a Java implementation
// of Rabin Karp Algorithm given in the CLRS book

public class Main {
    // d is the number of characters in the input alphabet
    public final static int d = 256;

    // Search the pat string in the txt string
    static void search(String pat, String txt, int q)
    {
        int M = pat.length();
        int N = txt.length();
        int i, j;
        int p = 0; // hash value for pattern
        int t = 0; // hash value for txt
        int h = 1;

        // A pattern longer than the text cannot occur in it
        if (M > N)
            return;

        // The value of h would be "pow(d, M-1)%q"
        for (i = 0; i < M - 1; i++)
            h = (h * d) % q;

        // Calculate the hash value of pattern and first
        // window of text
        for (i = 0; i < M; i++) {
            p = (d * p + pat.charAt(i)) % q;
            t = (d * t + txt.charAt(i)) % q;
        }

        // Slide the pattern over text one by one
        for (i = 0; i <= N - M; i++) {

            // Check the hash values of current window of
            // text and pattern. If the hash values match
            // then only check for characters one by one
            if (p == t) {
                /* Check for characters one by one */
                for (j = 0; j < M; j++) {
                    if (txt.charAt(i + j) != pat.charAt(j))
                        break;
                }

                // if p == t and pat[0...M-1] = txt[i, i+1,
                // ...i+M-1]
                if (j == M)
                    System.out.println(
                        "Pattern found at index " + i);
            }

            // Calculate hash value for next window of text:
            // Remove leading digit, add trailing digit
            if (i < N - M) {
                t = (d * (t - txt.charAt(i) * h)
                     + txt.charAt(i + M))
                    % q;

                // We might get negative value of t,
                // converting it to positive
                if (t < 0)
                    t = (t + q);
            }
        }
    }

    /* Driver Code */
    public static void main(String[] args)
    {
        String txt = "GEEKS FOR GEEKS";
        String pat = "GEEK";

        // A prime number
        int q = 101;

        // Function Call
        search(pat, txt, q);
    }
}

// This code is contributed by nuclode
```
Python
```python
# Following program is the python implementation of
# Rabin Karp Algorithm given in CLRS book

# d is the number of characters in the input alphabet
d = 256


# Search the pat string in the txt string
def search(pat, txt, q):
    M = len(pat)
    N = len(txt)
    p = 0    # hash value for pattern
    t = 0    # hash value for txt
    h = 1

    # A pattern longer than the text cannot occur in it
    if M > N:
        return

    # The value of h would be "pow(d, M-1)%q"
    for i in range(M-1):
        h = (h*d) % q

    # Calculate the hash value of pattern and first window
    # of text
    for i in range(M):
        p = (d*p + ord(pat[i])) % q
        t = (d*t + ord(txt[i])) % q

    # Slide the pattern over text one by one
    for i in range(N-M+1):
        # Check the hash values of current window of text and
        # pattern. If the hash values match then only check
        # the characters
        if p == t:
            # if p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1]
            if txt[i:i+M] == pat:
                print("Pattern found at index " + str(i))

        # Calculate hash value for next window of text: Remove
        # leading digit, add trailing digit. Python's % already
        # returns a non-negative value for a positive q, so no
        # correction for negative t is needed
        if i < N-M:
            t = (d*(t-ord(txt[i])*h) + ord(txt[i+M])) % q


# Driver Code
if __name__ == '__main__':
    txt = "GEEKS FOR GEEKS"
    pat = "GEEK"

    # A prime number
    q = 101

    # Function Call
    search(pat, txt, q)

# This code is contributed by Bhavya Jain
```
C#
```csharp
// Following program is a C# implementation
// of Rabin Karp Algorithm given in the CLRS book
using System;

public class GFG {
    // d is the number of characters in the input alphabet
    public readonly static int d = 256;

    // Search the pat string in the txt string
    static void search(String pat, String txt, int q)
    {
        int M = pat.Length;
        int N = txt.Length;
        int i, j;
        int p = 0; // hash value for pattern
        int t = 0; // hash value for txt
        int h = 1;

        // A pattern longer than the text cannot occur in it
        if (M > N)
            return;

        // The value of h would be "pow(d, M-1)%q"
        for (i = 0; i < M - 1; i++)
            h = (h * d) % q;

        // Calculate the hash value of pattern and first
        // window of text
        for (i = 0; i < M; i++) {
            p = (d * p + pat[i]) % q;
            t = (d * t + txt[i]) % q;
        }

        // Slide the pattern over text one by one
        for (i = 0; i <= N - M; i++) {

            // Check the hash values of current window of
            // text and pattern. If the hash values match
            // then only check for characters one by one
            if (p == t) {
                /* Check for characters one by one */
                for (j = 0; j < M; j++) {
                    if (txt[i + j] != pat[j])
                        break;
                }

                // if p == t and pat[0...M-1] = txt[i, i+1,
                // ...i+M-1]
                if (j == M)
                    Console.WriteLine(
                        "Pattern found at index " + i);
            }

            // Calculate hash value for next window of text:
            // Remove leading digit, add trailing digit
            if (i < N - M) {
                t = (d * (t - txt[i] * h) + txt[i + M]) % q;

                // We might get negative value of t,
                // converting it to positive
                if (t < 0)
                    t = (t + q);
            }
        }
    }

    /* Driver Code */
    public static void Main()
    {
        String txt = "GEEKS FOR GEEKS";
        String pat = "GEEK";

        // A prime number
        int q = 101;

        // Function Call
        search(pat, txt, q);
    }
}

// This code is contributed by PrinciRaj19992
```
JavaScript
```javascript
// Following program is a JavaScript implementation
// of Rabin Karp Algorithm given in the CLRS book

// d is the number of characters in the input alphabet
const d = 256;

// Search the pat string in the txt string
function search(pat, txt, q) {
    const M = pat.length;
    const N = txt.length;
    let i, j;

    // Hash value for pattern
    let p = 0;

    // Hash value for txt
    let t = 0;
    let h = 1;

    // A pattern longer than the text cannot occur in it
    if (M > N)
        return;

    // The value of h would be "pow(d, M-1) % q"
    for (i = 0; i < M - 1; i++)
        h = (h * d) % q;

    // Calculate the hash value of pattern and first window of text
    for (i = 0; i < M; i++) {
        p = (d * p + pat.charCodeAt(i)) % q;
        t = (d * t + txt.charCodeAt(i)) % q;
    }

    // Slide the pattern over text one by one
    for (i = 0; i <= N - M; i++) {
        // Check the hash values of current window of text and pattern
        if (p === t) {
            /* Check for characters one by one */
            for (j = 0; j < M; j++) {
                if (txt[i + j] !== pat[j])
                    break;
            }

            // If p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1]
            if (j === M)
                console.log("Pattern found at index " + i);
        }

        // Calculate hash value for next window of text:
        // Remove leading digit, add trailing digit
        if (i < N - M) {
            t = (d * (t - txt.charCodeAt(i) * h) + txt.charCodeAt(i + M)) % q;

            // We might get negative value of t, converting it to positive
            if (t < 0)
                t = (t + q);
        }
    }
}

// Driver code
const txt = "GEEKS FOR GEEKS";
const pat = "GEEK";

// A prime number
const q = 101;

// Function Call
search(pat, txt, q);

// This code is contributed by target_2
```

Output
Pattern found at index 0
Pattern found at index 10
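The step-by-step approach speaks of returning the starting indices, while the listings above print them. A minimal variant that returns a list instead might look like this. The name `rabin_karp` and the default base and modulus are assumptions, and the empty-pattern case is handled by returning an empty list, which is a design choice rather than something the article specifies.

```python
def rabin_karp(pattern, text, base=256, modulus=101):
    """Return the starting indices of all occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    high_power = pow(base, m - 1, modulus)  # weight of the leftmost character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus

    matches = []
    for start in range(n - m + 1):
        # Verify the characters only when the hash values agree.
        if p_hash == t_hash and text[start:start + m] == pattern:
            matches.append(start)
        if start < n - m:
            t_hash = ((t_hash - ord(text[start]) * high_power) * base
                      + ord(text[start + m])) % modulus
    return matches


print(rabin_karp("TEST", "THIS IS A TEST TEXT"))   # [10]
print(rabin_karp("AABA", "AABAACAADAABAABA"))      # [0, 9, 12]
print(rabin_karp("GEEK", "GEEKS FOR GEEKS"))       # [0, 10]
```

Returning a list keeps the search logic separate from output formatting, so the same function can be reused in tests or larger programs.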

Time Complexity: 

  • The average and best-case running time of the Rabin-Karp algorithm is O(n+m), but its worst-case time is O(nm).
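  • The worst case occurs when the hash value of every substring of T[] matches the hash value of P[], so every window has to be verified character by character. This happens, for example, when all characters of the pattern and the text are the same, as in T[] = “AAAAAAA” and P[] = “AAA”.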
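The gap between the two cases can be made concrete by counting the character comparisons performed during verification. This is a sketch with a hypothetical helper, `count_verification_comparisons`, using the same base 256 and modulus 101 as the listings above.

```python
def count_verification_comparisons(pattern, text, base=256, modulus=101):
    """Count character comparisons made while verifying hash matches."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return 0

    high_power = pow(base, m - 1, modulus)
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus

    comparisons = 0
    for start in range(n - m + 1):
        if p_hash == t_hash:
            for k in range(m):
                comparisons += 1
                if text[start + k] != pattern[k]:
                    break
        if start < n - m:
            t_hash = ((t_hash - ord(text[start]) * high_power) * base
                      + ord(text[start + m])) % modulus
    return comparisons


# Every one of the 17 windows matches, so each costs 4 comparisons.
print(count_verification_comparisons("AAAA", "A" * 20))  # 68
# No window hash equals the pattern hash, so nothing is verified.
print(count_verification_comparisons("AAAB", "A" * 20))  # 0
```

In the first call the verification work is (n - m + 1) * m, which is the O(nm) behaviour described above. In the second call only the O(n + m) hashing work remains.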

Auxiliary Space: O(1)

Limitations of Rabin-Karp Algorithm

Spurious Hit: When the hash value of the pattern matches the hash value of a window of the text but the window does not equal the pattern, it is called a spurious hit. Each spurious hit costs an extra character-by-character comparison, which increases the running time of the algorithm. To minimize spurious hits, use a good hash function, for example one with a large prime modulus.
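The effect of the modulus on spurious hits can be observed with a small counting sketch. The helper `count_hits` is hypothetical, and the moduli 11, 101, and 2**61 - 1 (a Mersenne prime) are assumptions chosen for illustration. Because 2**61 - 1 is larger than 256**4, every 4-character window of 8-bit characters gets a distinct hash under it, so no spurious hit is possible in this example.

```python
def count_hits(pattern, text, modulus, base=256):
    """Return (true_matches, spurious_hits) for one run of Rabin-Karp."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return 0, 0

    high_power = pow(base, m - 1, modulus)
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus

    true_matches = spurious_hits = 0
    for start in range(n - m + 1):
        if p_hash == t_hash:
            if text[start:start + m] == pattern:
                true_matches += 1
            else:
                spurious_hits += 1  # hash agreed, characters did not
        if start < n - m:
            t_hash = ((t_hash - ord(text[start]) * high_power) * base
                      + ord(text[start + m])) % modulus
    return true_matches, spurious_hits


text = "AABAACAADAABAABA"
pattern = "AABA"
for q in (11, 101, 2**61 - 1):
    print(q, count_hits(pattern, text, q))
```

The number of true matches is the same for every modulus. Only the number of wasted verifications changes. In languages with fixed-width integers, a modulus this large would also require wider arithmetic to avoid overflow.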

Related Posts: 
Searching for Patterns | Set 1 (Naive Pattern Searching) 
Searching for Patterns | Set 2 (KMP Algorithm)




author
kartik