Find the Longest Common Substring using Binary search and Rolling Hash

Last Updated : 17 Dec, 2023

Given two strings X and Y, the task is to find the length of the longest common substring. 

Examples:

Input: X = “GeeksforGeeks”, Y = “GeeksQuiz”
Output: 5 
Explanation: The longest common substring is “Geeks” and is of length 5.

Input: X = “abcdxyz”, Y = “xyzabcd”
Output: 4 
Explanation: The longest common substring is “abcd” and is of length 4.

Input: X = “zxabcdezy”, Y = “yzabcdezx”
Output: 6 
Explanation: The longest common substring is “abcdez” and is of length 6.

Longest Common Substring using Dynamic Programming:

This problem can be solved using dynamic programming in O(len(X) * len(Y)) time. This article discusses a more efficient approach.
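For reference, the dynamic programming baseline mentioned above can be sketched as follows. This is the standard longest-common-suffix table, kept to two rows; the function name `lcs_length_dp` is chosen here for illustration and does not come from the article.

```python
def lcs_length_dp(x: str, y: str) -> int:
    """Length of the longest common substring in O(len(x) * len(y)) time."""
    best = 0
    # prev[j] = length of the longest common suffix of x[:i-1] and y[:j]
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                # Extend the common suffix that ended one character earlier
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


print(lcs_length_dp("GeeksforGeeks", "GeeksQuiz"))
```

A baseline like this is also handy for cross-checking a hash-based solution on small inputs.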

Longest Common Substring using Binary Search and Rolling Hash

Pre-requisites:

  • Binary search
  • Polynomial rolling hash function

Observation:

If the two strings share a common substring of length K, then they also share common substrings of every length 0, 1, ..., K - 1, because any shorter piece of that substring occurs in both strings as well. Hence, binary search on the answer can be applied.
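The observation can be checked on the first example with a small sketch. The helper `has_common_substring` is a hypothetical brute-force check written for this illustration (it compares actual slices, not hashes), so it only demonstrates the monotone shape of the predicate, not the efficient method.

```python
def has_common_substring(x: str, y: str, k: int) -> bool:
    """Brute-force check: do x and y share a substring of length k?"""
    windows = {x[i:i + k] for i in range(len(x) - k + 1)}
    return any(y[j:j + k] in windows for j in range(len(y) - k + 1))


x, y = "GeeksforGeeks", "GeeksQuiz"
flags = [has_common_substring(x, y, k) for k in range(min(len(x), len(y)) + 1)]
print(flags)
```

The list is a run of True values followed by a run of False values, and the last True sits at the answer. That shape is what makes binary search valid.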

Follow the steps below to implement the idea:

  • The smallest possible answer (low) is 0 and the largest possible answer (high) is min(len(X), len(Y)), so the range of the binary search is [0, min(len(X), len(Y))].
    • For every mid, check whether a common substring of length mid exists. If it does, record mid as the answer and set low = mid + 1; otherwise set high = mid - 1.
    • To check whether a common substring of length K exists, a polynomial rolling hash function can be used.
      • Iterate over all the windows of size K in string X and store their hashes in a set.
      • Iterate over all the windows of size K in string Y. If any hash is already in the set, return True; otherwise return False.
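The window hash used in the check can be illustrated in isolation. The sketch below assumes a single hash with base 31 and modulus 10^9 + 7 (the implementations further down combine two such hashes); the names `prefix_hashes` and `window_hash` are illustrative only.

```python
MOD = 10**9 + 7
P = 31


def prefix_hashes(s: str) -> list[int]:
    """hashes[i] = sum of value(s[j]) * P**j for j <= i, modulo MOD."""
    hashes, p_pow, running = [], 1, 0
    for ch in s:
        running = (running + (ord(ch) - ord("A") + 1) * p_pow) % MOD
        hashes.append(running)
        p_pow = (p_pow * P) % MOD
    return hashes


def window_hash(hashes: list[int], l: int, r: int) -> int:
    """Hash of s[l..r], divided by P**l so that it does not depend on l."""
    raw = hashes[r] - (hashes[l - 1] if l > 0 else 0)
    inv = pow(pow(P, l, MOD), MOD - 2, MOD)  # Fermat inverse; MOD is prime
    return raw * inv % MOD


h = prefix_hashes("GeeksforGeeks")
print(window_hash(h, 0, 4) == window_hash(h, 8, 12))  # both windows are "Geeks"
print(window_hash(h, 0, 4) == window_hash(h, 1, 5))   # "Geeks" vs "eeksf"
```

Multiplying by the inverse of P**l is what lets equal substrings at different positions produce the same value, so hashes taken from X and from Y can be compared directly.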

Below is the implementation of this approach.

C++
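#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>

// Computes (base^exp) % mod by binary exponentiation
long long modPow(long long base, long long exp, long long mod) {
    long long result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) {
            result = (result * base) % mod;
        }
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

class ComputeHash {
private:
    std::vector<long long> hash;
    std::vector<long long> invMod;
    long long mod;
    long long p;

public:
    // Generates prefix hashes in O(n * log(mod))
    ComputeHash(const std::string& s, long long p, long long mod) {
        int n = static_cast<int>(s.length());
        this->hash.resize(n);
        this->invMod.resize(n);
        this->mod = mod;
        this->p = p;

        long long pPow = 1;
        long long hashValue = 0;

        for (int i = 0; i < n; i++) {
            long long c = s[i] - 'A' + 1;
            // Extra "+ mod" keeps the value non-negative for characters below 'A'
            hashValue = ((hashValue + c * pPow) % this->mod + this->mod) % this->mod;
            this->hash[i] = hashValue;
            // Modular inverse of p^i (Fermat's little theorem, mod is prime)
            this->invMod[i] = modPow(pPow, this->mod - 2, this->mod);
            pPow = (pPow * this->p) % this->mod;
        }
    }

    // Returns the hash of the window [l, r] in O(1)
    long long getHash(int l, int r) const {
        if (l == 0) {
            return this->hash[r];
        }

        long long window = (this->hash[r] - this->hash[l - 1] + this->mod) % this->mod;
        return (window * this->invMod[l]) % this->mod;
    }
};

// Returns true if X and Y share a common substring of length k
bool exists(int k, int n, int m, const ComputeHash& hashX1, const ComputeHash& hashX2,
            const ComputeHash& hashY1, const ComputeHash& hashY2) {
    if (k == 0) {
        return true;
    }

    // Both hashes are below 2^30, so a pair fits into one 64-bit key
    std::unordered_set<long long> seen;

    // Store the hash pair of every window of size k in X
    for (int i = 0; i + k <= n; i++) {
        long long h1 = hashX1.getHash(i, i + k - 1);
        long long h2 = hashX2.getHash(i, i + k - 1);
        seen.insert((h1 << 32) | h2);
    }

    // Look up the hash pair of every window of size k in Y
    for (int j = 0; j + k <= m; j++) {
        long long h1 = hashY1.getHash(j, j + k - 1);
        long long h2 = hashY2.getHash(j, j + k - 1);
        if (seen.count((h1 << 32) | h2)) {
            return true;
        }
    }
    return false;
}

int longestCommonSubstr(const std::string& X, const std::string& Y) {
    int n = static_cast<int>(X.length());
    int m = static_cast<int>(Y.length());

    const long long p1 = 31;
    const long long p2 = 37;
    const long long m1 = 1000000009LL;
    const long long m2 = 1000000007LL;

    // Initialize two hash objects
    // with different p1, p2, m1, m2
    // to reduce collision
    ComputeHash hashX1(X, p1, m1);
    ComputeHash hashX2(X, p2, m2);

    ComputeHash hashY1(Y, p1, m1);
    ComputeHash hashY2(Y, p2, m2);

    // Binary search on the length
    int low = 0, high = std::min(n, m);
    int answer = 0;

    while (low <= high) {
        int mid = (low + high) / 2;
        if (exists(mid, n, m, hashX1, hashX2, hashY1, hashY2)) {
            answer = mid;
            low = mid + 1;
        } else {
            high = mid - 1;
        }
    }

    return answer;
}

int main() {
    std::string X = "GeeksforGeeks";
    std::string Y = "GeeksQuiz";
    std::cout << longestCommonSubstr(X, Y) << std::endl;

    return 0;
}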
Java
import java.util.HashSet;
import java.util.Set;

class ComputeHash {
    private final long[] hash;
    private final long[] invMod;
    private final long mod;
    private final long p;

    // Generates prefix hashes in O(n * log(mod))
    public ComputeHash(String s, long p, long mod) {
        int n = s.length();
        this.hash = new long[n];
        this.invMod = new long[n];
        this.mod = mod;
        this.p = p;

        long pPow = 1;
        long hashValue = 0;

        for (int i = 0; i < n; i++) {
            long c = s.charAt(i) - 'A' + 1;
            // floorMod keeps the value non-negative for characters below 'A'
            hashValue = Math.floorMod(hashValue + c * pPow, this.mod);
            this.hash[i] = hashValue;
            // Modular inverse of p^i (Fermat's little theorem, mod is prime)
            this.invMod[i] = modPow(pPow, this.mod - 2, this.mod);
            pPow = (pPow * this.p) % this.mod;
        }
    }

    // Computes (base^exp) % mod by binary exponentiation
    private static long modPow(long base, long exp, long mod) {
        long result = 1;
        base %= mod;
        while (exp > 0) {
            if ((exp & 1) == 1) {
                result = (result * base) % mod;
            }
            base = (base * base) % mod;
            exp >>= 1;
        }
        return result;
    }

    // Returns the hash of the window [l, r] in O(1)
    public long getHash(int l, int r) {
        if (l == 0) {
            return this.hash[r];
        }

        long window = (this.hash[r] - this.hash[l - 1] + this.mod) % this.mod;
        return (window * this.invMod[l]) % this.mod;
    }
}

public class Main {
    // Returns true if X and Y share a common substring of length k
    private static boolean exists(int k, int n, int m,
                                  ComputeHash hashX1, ComputeHash hashX2,
                                  ComputeHash hashY1, ComputeHash hashY2) {
        if (k == 0) {
            return true;
        }

        // Both hashes are below 2^30, so a pair fits into one 64-bit key
        Set<Long> seen = new HashSet<>();

        // Store the hash pair of every window of size k in X
        for (int i = 0; i + k <= n; i++) {
            long h1 = hashX1.getHash(i, i + k - 1);
            long h2 = hashX2.getHash(i, i + k - 1);
            seen.add((h1 << 32) | h2);
        }

        // Look up the hash pair of every window of size k in Y
        for (int j = 0; j + k <= m; j++) {
            long h1 = hashY1.getHash(j, j + k - 1);
            long h2 = hashY2.getHash(j, j + k - 1);
            if (seen.contains((h1 << 32) | h2)) {
                return true;
            }
        }
        return false;
    }

    // Function to get the longest common substring
    public static int longestCommonSubstr(String X, String Y) {
        int n = X.length();
        int m = Y.length();

        long p1 = 31;
        long p2 = 37;
        long m1 = 1_000_000_009L;
        long m2 = 1_000_000_007L;

        // Initialize two hash objects
        // with different p1, p2, m1, m2
        // to reduce collision
        ComputeHash hashX1 = new ComputeHash(X, p1, m1);
        ComputeHash hashX2 = new ComputeHash(X, p2, m2);

        ComputeHash hashY1 = new ComputeHash(Y, p1, m1);
        ComputeHash hashY2 = new ComputeHash(Y, p2, m2);

        // Binary search on the length
        int low = 0, high = Math.min(n, m);
        int answer = 0;
        while (low <= high) {
            int mid = (low + high) / 2;
            if (exists(mid, n, m, hashX1, hashX2, hashY1, hashY2)) {
                answer = mid;
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return answer;
    }

    public static void main(String[] args) {
        String X = "GeeksforGeeks";
        String Y = "GeeksQuiz";
        System.out.println(longestCommonSubstr(X, Y));
    }
}
Python3
# Python code to implement the approach

# Class to implement rolling hash
class ComputeHash:

    # Generates hash in O(n * log(mod))
    def __init__(self, s, p, mod):
        n = len(s)
        self.hash = [0] * n
        self.inv_mod = [0] * n
        self.mod = mod
        self.p = p

        p_pow = 1
        hash_value = 0

        for i in range(n):
            c = ord(s[i]) - 65 + 1
            hash_value = (hash_value + c * p_pow) % self.mod
            self.hash[i] = hash_value
            # Modular inverse of p^i (Fermat's little theorem, mod is prime)
            self.inv_mod[i] = pow(p_pow, self.mod - 2, self.mod)
            p_pow = (p_pow * self.p) % self.mod

    # Return hash of a window in O(1)
    def get_hash(self, l, r):

        if l == 0:
            return self.hash[r]

        window = (self.hash[r] - self.hash[l - 1]) % self.mod
        return (window * self.inv_mod[l]) % self.mod


# Function to get the longest common substring
def longestCommonSubstr(X, Y, n, m):

    p1, p2 = 31, 37
    m1, m2 = pow(10, 9) + 9, pow(10, 9) + 7

    # Initialize two hash objects
    # with different p1, p2, m1, m2
    # to reduce collision
    hash_X1 = ComputeHash(X, p1, m1)
    hash_X2 = ComputeHash(X, p2, m2)

    hash_Y1 = ComputeHash(Y, p1, m1)
    hash_Y2 = ComputeHash(Y, p2, m2)

    # Function that returns the existence
    # of a common substring of length k
    def exists(k):

        if k == 0:
            return True

        st = set()

        # Iterate on X and get hash tuple
        # of all windows of size k
        for i in range(n - k + 1):
            h1 = hash_X1.get_hash(i, i + k - 1)
            h2 = hash_X2.get_hash(i, i + k - 1)

            cur_window_hash = (h1, h2)

            # Put the hash tuple in the set
            st.add(cur_window_hash)

        # Iterate on Y and get hash tuple
        # of all windows of size k
        for i in range(m - k + 1):
            h1 = hash_Y1.get_hash(i, i + k - 1)
            h2 = hash_Y2.get_hash(i, i + k - 1)

            cur_window_hash = (h1, h2)

            # If hash exists in st return True
            if cur_window_hash in st:
                return True
        return False

    # Binary Search on length
    answer = 0
    low, high = 0, min(n, m)

    while low <= high:
        mid = (low + high) // 2

        if exists(mid):
            answer = mid
            low = mid + 1
        else:
            high = mid - 1

    return answer


# Driver Code
if __name__ == '__main__':
    X = 'GeeksforGeeks'
    Y = 'GeeksQuiz'
    print(longestCommonSubstr(X, Y, len(X), len(Y)))
C#
using System;
using System.Collections.Generic;

class ComputeHash
{
    private List<long> hash;
    private List<long> invMod;
    private long mod;
    private long p;

    // Constructor to initialize hash values for the given string
    public ComputeHash(string s, long p, long mod)
    {
        int n = s.Length;
        this.hash = new List<long>(n);
        this.invMod = new List<long>(n);
        this.mod = mod;
        this.p = p;

        long pPow = 1;
        long hashValue = 0;

        for (int i = 0; i < n; i++)
        {
            // Convert character to numeric value
            long c = s[i] - 'A' + 1;
            // Extra "+ mod" keeps the value non-negative for characters below 'A'
            hashValue = ((hashValue + c * pPow) % this.mod + this.mod) % this.mod;
            this.hash.Add(hashValue);
            // Compute modular inverse of p^i
            this.invMod.Add(ModularInverse(pPow, this.mod));
            pPow = (pPow * this.p) % this.mod;
        }
    }

    // Helper function to compute modular inverse using extended Euclidean algorithm
    private long ModularInverse(long a, long m)
    {
        long m0 = m, t, q;
        long x0 = 0, x1 = 1;

        if (m == 1)
            return 0;

        while (a > 1)
        {
            q = a / m;
            t = m;
            m = a % m;
            a = t;
            t = x0;
            x0 = x1 - q * x0;
            x1 = t;
        }

        if (x1 < 0)
            x1 += m0;

        return x1;
    }

    // Function to get hash value for a substring [l, r]
    public long GetHash(int l, int r)
    {
        if (l == 0)
        {
            return this.hash[r];
        }

        long window = (this.hash[r] - this.hash[l - 1] + this.mod) % this.mod;
        return (window * this.invMod[l]) % this.mod;
    }
}

class LongestCommonSubstring
{
    // Function to check if a common substring of length k exists
    private static bool Exists(int k, int n, int m,
                               ComputeHash hashX1, ComputeHash hashX2,
                               ComputeHash hashY1, ComputeHash hashY2)
    {
        if (k == 0)
        {
            return true;
        }

        var seen = new HashSet<(long, long)>();

        // Store the hash pair of every window of size k in X
        for (int i = 0; i + k <= n; i++)
        {
            seen.Add((hashX1.GetHash(i, i + k - 1), hashX2.GetHash(i, i + k - 1)));
        }

        // Look up the hash pair of every window of size k in Y
        for (int j = 0; j + k <= m; j++)
        {
            if (seen.Contains((hashY1.GetHash(j, j + k - 1), hashY2.GetHash(j, j + k - 1))))
            {
                return true;
            }
        }
        return false;
    }

    // Function to find the length of the longest common substring of X and Y
    private static int LongestCommonSubstr(string X, string Y)
    {
        int n = X.Length;
        int m = Y.Length;

        long p1 = 31;
        long p2 = 37;
        long m1 = 1000000009L;
        long m2 = 1000000007L;

        // Initialize two hash objects with different p1, p2, m1, m2 to reduce collision
        ComputeHash hashX1 = new ComputeHash(X, p1, m1);
        ComputeHash hashX2 = new ComputeHash(X, p2, m2);

        ComputeHash hashY1 = new ComputeHash(Y, p1, m1);
        ComputeHash hashY2 = new ComputeHash(Y, p2, m2);

        // Binary search to find the length of the longest common substring
        int low = 0, high = Math.Min(n, m);
        int answer = 0;

        while (low <= high)
        {
            int mid = (low + high) / 2;
            if (Exists(mid, n, m, hashX1, hashX2, hashY1, hashY2))
            {
                answer = mid;
                low = mid + 1;
            }
            else
            {
                high = mid - 1;
            }
        }

        return answer;
    }

    static void Main()
    {
        string X = "GeeksforGeeks";
        string Y = "GeeksQuiz";
        Console.WriteLine(LongestCommonSubstr(X, Y));
    }
}
JavaScript
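// JavaScript code to implement the approach

// Computes (base^exp) % mod with BigInt arithmetic
function modPow(base, exp, mod) {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

class ComputeHash {
  // Generates prefix hashes in O(n * log(mod))
  constructor(s, p, mod) {
    this.mod = mod;
    this.hash = [];
    this.invMod = [];
    let pPow = 1n;
    let hashValue = 0n;
    for (let i = 0; i < s.length; i++) {
      const c = BigInt(s.charCodeAt(i) - "A".charCodeAt(0) + 1);
      hashValue = (((hashValue + c * pPow) % mod) + mod) % mod;
      this.hash.push(hashValue);
      // Modular inverse of p^i (Fermat's little theorem, mod is prime)
      this.invMod.push(modPow(pPow, mod - 2n, mod));
      pPow = (pPow * p) % mod;
    }
  }

  // Returns the hash of the window [l, r] in O(1)
  getHash(l, r) {
    if (l === 0) return this.hash[r];
    const window = (this.hash[r] - this.hash[l - 1] + this.mod) % this.mod;
    return (window * this.invMod[l]) % this.mod;
  }
}

function longestCommonSubstr(X, Y) {
  const m1 = 1000000009n, m2 = 1000000007n;

  // Two hash objects with different bases and moduli to reduce collision
  const hashX1 = new ComputeHash(X, 31n, m1);
  const hashX2 = new ComputeHash(X, 37n, m2);
  const hashY1 = new ComputeHash(Y, 31n, m1);
  const hashY2 = new ComputeHash(Y, 37n, m2);

  // Returns true if a common substring of length k exists
  function exists(k) {
    if (k === 0) return true;
    const seen = new Set();
    for (let i = 0; i + k <= X.length; i++) {
      seen.add(`${hashX1.getHash(i, i + k - 1)},${hashX2.getHash(i, i + k - 1)}`);
    }
    for (let j = 0; j + k <= Y.length; j++) {
      const key = `${hashY1.getHash(j, j + k - 1)},${hashY2.getHash(j, j + k - 1)}`;
      if (seen.has(key)) return true;
    }
    return false;
  }

  // Binary search on the length
  let low = 0, high = Math.min(X.length, Y.length), answer = 0;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (exists(mid)) {
      answer = mid;
      low = mid + 1;
    } else {
      high = mid - 1;
    }
  }
  return answer;
}

console.log(longestCommonSubstr("GeeksforGeeks", "GeeksQuiz"));

// Code contributed by chetanbargal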

Output
5

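Time Complexity: O(n * log(m1)) + O(m * log(m1)) + O((n + m) * log(min(n, m)))

  1. Generating a hash object takes O(n * log(m1)), where n is the length of the string and m1 is the modulus (pow(10, 9) + 9 or pow(10, 9) + 7), because a modular inverse is computed for every position.
  2. Binary search takes O(log(min(n, m))) iterations, where n, m are the lengths of both strings.
  3. Hash of a window takes O(1) time.
  4. Each call to the exists function takes O(n + m) expected time when the hashes are stored in a hash set.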

Auxiliary Space: O(n + m)
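The log(m1) factor above comes from computing one modular inverse per position. As an optional refinement that is not part of the implementations above, the inverse powers could be built incrementally from a single inverse of p. A minimal sketch, assuming a prime modulus, with the hypothetical helper name `inverse_powers`:

```python
def inverse_powers(p: int, mod: int, n: int) -> list[int]:
    """Return the modular inverses of p**0, p**1, ..., p**(n-1) for a prime mod."""
    inv_p = pow(p, mod - 2, mod)  # the only modular exponentiation needed
    inv = [1] * n
    for i in range(1, n):
        # inverse of p**i = inverse of p**(i-1) times inverse of p
        inv[i] = inv[i - 1] * inv_p % mod
    return inv


mod, p, n = 10**9 + 9, 31, 8
inv = inverse_powers(p, mod, n)
print(all(pow(p, i, mod) * inv[i] % mod == 1 for i in range(n)))
```

Under this variant, building a hash object would cost O(n + log(m1)) instead of O(n * log(m1)); the binary search term is unchanged.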


shivampkrr