Suffix Arrays for Competitive Programming

Last Updated : 12 Mar, 2024

A suffix array is a sorted array of all suffixes of a given string. More formally, given a string 'S' of length n, the suffix array of S contains the indices 0 to n - 1, arranged so that the suffixes starting at those indices are in lexicographic order.

Example:

Input: banana

Suffixes by starting index      After sorting alphabetically
0 banana                        5 a
1 anana                         3 ana
2 nana                          1 anana
3 ana                           0 banana
4 na                            4 na
5 a                             2 nana

So the suffix array for "banana" is {5, 3, 1, 0, 4, 2}

Construction of Suffix Arrays:

  1. Naive way to construct suffix array
  2. Using Radix Sort to construct suffix array in O(n * Log(n))
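
Both construction ideas can be sketched in a few lines of Python. The function names below are illustrative, not from the article. The second function uses prefix doubling with Python's built-in comparison sort, which gives O(n * log^2(n)) rather than the O(n * log(n)) of the radix sort variant. It is meant only to show the doubling idea under that simplifying assumption.

```python
def build_suffix_array_naive(s: str) -> list[int]:
    """Sort the start indices by the suffix text itself.

    Simple, but each comparison may scan O(n) characters,
    so the worst case is O(n^2 * log(n)).
    """
    return sorted(range(len(s)), key=lambda i: s[i:])


def build_suffix_array_doubling(s: str) -> list[int]:
    """Prefix doubling: sort suffixes by their first k characters, k = 1, 2, 4, ...

    Uses a comparison sort per round (assumption: radix sort is replaced
    by list.sort for brevity), so the total cost is O(n * log^2(n)).
    """
    n = len(s)
    if n == 0:
        return []
    sa = list(range(n))
    rank = [ord(c) for c in s]  # rank by first character
    k = 1
    while True:
        # A suffix is identified by the ranks of its two halves of length k.
        def key(i: int) -> tuple[int, int]:
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)
        new_rank = [0] * n
        for idx in range(1, n):
            prev, cur = sa[idx - 1], sa[idx]
            new_rank[cur] = new_rank[prev] + (key(prev) != key(cur))
        rank = new_rank
        if rank[sa[-1]] == n - 1:  # all ranks distinct: fully sorted
            break
        k *= 2
    return sa


print(build_suffix_array_naive("banana"))     # [5, 3, 1, 0, 4, 2]
print(build_suffix_array_doubling("banana"))  # [5, 3, 1, 0, 4, 2]
```

The naive version is usually enough for short strings. The doubling version is the one to adapt when |S| is large.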

Use Cases of Suffix Array:

1. Searching a Substring in a string:

Problem: Given a string 'S' and a string 'T', determine whether T is a substring of S. If so, return an index at which T occurs in S.

Example:

Input: S = "bannana" , T = "nan"
Output: 3

Naive Solution: In O(|S| * |T|) we can iterate over each index of 'S' and check whether the substring of length |T| starting at that index matches 'T'.

Solution using Suffix Array: Notice that any substring is a prefix of some suffix. If we keep only the first |T| characters of each suffix in the suffix array of 'S', the truncated strings are still in sorted order, and T occurs in S exactly when one of them equals T. To find T we can binary search over the suffix array, comparing the first |T| characters of the suffix at mid (the "mid string") with T.

  • If the mid string is lexicographically smaller than 'T', binary search on the right half.
  • If the mid string is lexicographically greater than 'T', binary search on the left half.
  • If the two strings match, return the starting index of that suffix as the result.

Time Complexity: O(|S| * log(|S|) + |T| * log(|S|) ), where O(|S| * log(|S|)) is to construct suffix array for string S and O(|T| * log(|S|)) is to search and compare string T.
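
The binary search described above can be sketched as follows. The function names are illustrative, and the suffix array is built with a naive sort to keep the example short. The search returns the start of some occurrence of T, not necessarily the leftmost one.

```python
def build_suffix_array_naive(s: str) -> list[int]:
    """Naive construction, used here only to keep the sketch self-contained."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def find_substring(s: str, t: str, sa: list[int]) -> int:
    """Return a start index of t in s, or -1 if t does not occur.

    sa must be the suffix array of s. Each step compares at most |t|
    characters, so the search costs O(|t| * log(|s|)).
    """
    lo, hi = 0, len(sa) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start = sa[mid]
        mid_string = s[start:start + len(t)]  # first |t| chars of the suffix
        if mid_string == t:
            return start
        if mid_string < t:
            lo = mid + 1   # T can only be in the right half
        else:
            hi = mid - 1   # T can only be in the left half
    return -1


s = "bannana"
sa = build_suffix_array_naive(s)
print(find_substring(s, "nan", sa))  # 3
print(find_substring(s, "xyz", sa))  # -1
```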

2. Finding Longest Common Prefix (LCP):

Problem: Given a string 'S' and Q queries of the form {i, j}. Find the LCP(i, j) i.e. length of the Longest Common Prefix(LCP) for the suffixes starting at index i and j.

Example:

Input: S = "banana" , Query = {{0, 5}, {4, 2}, {1, 3}}
Output: 0 2 3
Explanation: Query[0] = {0, 5} = LCP (banana, a) = ' ' = 0
Query[1] = {4, 2} = LCP (na, nana) = 'na' = 2
Query[2] = {1, 3} = LCP (anana, ana) = 'ana' = 3

Naive Solution: For each query we can compare the two suffixes starting at i and j character by character in O(|S|), giving a total time complexity of O(Q * |S|).

Solution using Suffix Array: Let our suffix array be Suffix[]. To solve the problem, construct an array lcp[] such that lcp[i] = LCP(Suffix[i], Suffix[i+1]). In other words, lcp[] stores the longest common prefix of each pair of suffixes that are adjacent in the suffix array. For S = "banana", Suffix[] = {5, 3, 1, 0, 4, 2} and lcp[] = {1, 3, 0, 0, 2}.

[Figure: construction of the lcp[] array]

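Now, to calculate LCP(i, j), find the positions of suffix i and suffix j in the suffix array. Let pos[x] denote the position of suffix x, i.e. Suffix[pos[x]] = x. Assuming pos[i] < pos[j] (swap i and j otherwise), the answer is the minimum value in the range lcp[pos[i]] to lcp[pos[j] - 1]. If i = j, the answer is simply |S| - i.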

Proof: Let LCP(i, j) = k. Since the suffixes are sorted in lexicographical order, every suffix that lies between suffix i and suffix j in the suffix array shares its first k characters with both of them. So every lcp value on the segment between their positions is at least k, and therefore the minimum on this segment is at least k. On the other hand, the minimum cannot be greater than k: that would mean every adjacent pair of suffixes on the segment has more than k common characters, and chaining these pairs would give suffixes i and j more than k common characters.

Note: Since lcp[] never changes, we can build a sparse table over it in O(|S| * log|S|) and answer each range-minimum query in O(1).
The lcp[] array itself can be constructed in O(N), for example with Kasai's algorithm.

Time Complexity: O((|S| * log|S|) + Q)
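
The pieces of this approach can be sketched together in Python. All names below are illustrative. The sketch assumes Kasai's algorithm for the O(N) lcp[] construction and a standard sparse table for the range minimum, and it builds the suffix array naively to stay short.

```python
def build_suffix_array_naive(s: str) -> list[int]:
    """Naive construction, used here only to keep the sketch self-contained."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def build_lcp(s: str, sa: list[int]) -> tuple[list[int], list[int]]:
    """Kasai's algorithm: lcp[r] = LCP of the suffixes sa[r] and sa[r + 1].

    Also returns pos, the inverse of sa (pos[start] = position in sa).
    """
    n = len(s)
    pos = [0] * n
    for r, start in enumerate(sa):
        pos[start] = r
    lcp = [0] * max(n - 1, 0)
    h = 0  # current match length, reused between iterations
    for i in range(n):
        if pos[i] == n - 1:  # last suffix in sorted order has no successor
            h = 0
            continue
        j = sa[pos[i] + 1]
        while i + h < n and j + h < n and s[i + h] == s[j + h]:
            h += 1
        lcp[pos[i]] = h
        if h > 0:
            h -= 1  # the next suffix i + 1 keeps at least h - 1 matches
    return lcp, pos


def build_sparse_table(values: list[int]) -> list[list[int]]:
    """table[k][i] = min(values[i : i + 2**k])."""
    table = [list(values)]
    length = 1
    while 2 * length <= len(values):
        prev = table[-1]
        table.append([min(prev[i], prev[i + length])
                      for i in range(len(values) - 2 * length + 1)])
        length *= 2
    return table


def range_min(table: list[list[int]], lo: int, hi: int) -> int:
    """Minimum of values[lo..hi], inclusive, using two overlapping windows."""
    level = (hi - lo + 1).bit_length() - 1
    return min(table[level][lo], table[level][hi - (1 << level) + 1])


def lcp_query(i: int, j: int, n: int, pos: list[int],
              table: list[list[int]]) -> int:
    """LCP of the suffixes starting at i and j."""
    if i == j:
        return n - i
    lo, hi = sorted((pos[i], pos[j]))
    return range_min(table, lo, hi - 1)


s = "banana"
sa = build_suffix_array_naive(s)
lcp, pos = build_lcp(s, sa)
table = build_sparse_table(lcp)
print(lcp)  # [1, 3, 0, 0, 2]
print([lcp_query(i, j, len(s), pos, table)
       for i, j in [(0, 5), (4, 2), (1, 3)]])  # [0, 2, 3]
```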

3. Number of Different Substrings:

Problem: Given a string 'S', the task is to find the total number of unique substrings of S.

Example:

Input: S='abab'
Output: 7
Explanation: Unique substrings of "abab" = {"abab","aba","ab","a","bab","ba","b"}

Solution using Suffix array: As we know, any substring is a prefix of some suffix. To count the distinct substrings we can iterate over the suffix array (where the suffixes are sorted). Each suffix contributes as many prefixes as its length. The prefixes that have already occurred in earlier suffixes are exactly those it shares with the previous suffix in sorted order, so we subtract the LCP of this suffix with the previous one. Summed over all suffixes, this gives n * (n + 1) / 2 minus the sum of lcp[], where n = |S|.

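For example, for the string "banana" with suffix array {5, 3, 1, 0, 4, 2}, the number of distinct substrings is calculated as follows:

Suffix    Length   LCP with previous   New substrings
a         1        0                   1
ana       3        1                   2
anana     5        3                   2
banana    6        0                   6
na        2        0                   2
nana      4        2                   2

Total number of distinct substrings = 1 + 2 + 2 + 6 + 2 + 2 = 15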
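
The counting rule can be sketched in Python as below. The function name is illustrative, and for brevity the suffix array is built naively and each LCP is computed by direct comparison instead of using a precomputed lcp[] array.

```python
def count_distinct_substrings(s: str) -> int:
    """Count distinct non-empty substrings of s.

    Each suffix contributes (its length) - (LCP with the previous suffix
    in sorted order) substrings that have not been seen before.
    """
    n = len(s)
    sa = sorted(range(n), key=lambda i: s[i:])  # naive suffix array
    total = 0
    prev = None
    for start in sa:
        common = 0
        if prev is not None:
            # LCP of this suffix with the previous one in sorted order
            while (prev + common < n and start + common < n
                   and s[prev + common] == s[start + common]):
                common += 1
        total += (n - start) - common
        prev = start
    return total


print(count_distinct_substrings("abab"))    # 7
print(count_distinct_substrings("banana"))  # 15
```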

Practice problems on Suffix Array:

  • Check if given words are present in a string
  • Counting k-mers via Suffix Array
  • Count of distinct substrings of a string using Suffix Array
  • Print Kth character in sorted concatenated substrings of a string
  • Construct array B as last element left of every suffix array obtained by performing given operations on every suffix of given array



vaibhav_gfg