
Regular Expression to DFA

Last Updated : 04 Oct, 2024

Regular expressions define patterns for matching strings, and automata theory provides a structured way to recognize those patterns using finite automata. A common method for constructing a Deterministic Finite Automaton (DFA) from a regular expression is to first build an NFA and then convert it into an equivalent DFA using subset construction. This two-step procedure can be avoided by constructing the DFA directly from the regular expression.

What is DFA?

A DFA is a finite automaton in which, for every state and every input symbol, there is exactly one transition to a next state. Unlike NFAs, DFAs have no ε-transitions (transitions that consume no input). Because the next state is completely determined by the current state and the current input symbol, DFAs are an efficient model for pattern recognition tasks.

Construction of DFA

In order to construct a DFA directly from a regular expression, we need to follow the steps listed below:

Example: Suppose the given regular expression is r = (a|b)*abb

1. First, we construct the augmented regular expression for the given expression. Concatenating a unique right-end marker '#' to the regular expression r gives the accepting state for r a transition on '#', which makes it an important state (a state with a non-ε outgoing transition) of the NFA for r#.

So, the augmented expression is r# = (a|b)*abb#

2. Then we construct the syntax tree for r#.

Syntax tree for (a|b)*abb#
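
As a rough illustration of step 2, the tree can be written down by hand in Python. The `Node` class and the `build_tree` and `leaves` helpers are names invented for this sketch, and the left-to-right numbering of the leaves (positions 1 to 6) is assumed from the followpos table used later in the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    kind: str                        # "leaf", "or", "cat", or "star"
    symbol: Optional[str] = None     # input symbol (leaves only)
    pos: Optional[int] = None        # position number (leaves only)
    left: Optional["Node"] = None    # only child of a star-node
    right: Optional["Node"] = None   # unused for star-nodes and leaves


def build_tree():
    """Hand-build the syntax tree for (a|b)*abb#, numbering leaves left to right."""
    counter = 0

    def leaf(symbol):
        nonlocal counter
        counter += 1
        return Node("leaf", symbol=symbol, pos=counter)

    # (a|b)* : a star-node over an or-node with leaves 1 and 2
    tree = Node("star", left=Node("or", left=leaf("a"), right=leaf("b")))
    # Concatenate a, b, b and the endmarker #, one cat-node per symbol
    for symbol in "abb#":
        tree = Node("cat", left=tree, right=leaf(symbol))
    return tree


def leaves(node):
    """Return (position, symbol) pairs for all leaves, left to right."""
    if node.kind == "leaf":
        return [(node.pos, node.symbol)]
    result = leaves(node.left)
    if node.right is not None:
        result += leaves(node.right)
    return result


tree = build_tree()
print(leaves(tree))
```

The root is a cat-node whose right child is the leaf for '#', so the endmarker always receives the highest position number.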

3. Next we need to evaluate four functions: nullable, firstpos, lastpos, and followpos.

  1. nullable(n) is true for a syntax tree node n if and only if the regular expression represented by n has ε in its language.
  2. firstpos(n) gives the set of positions that can match the first symbol of a string generated by the subexpression rooted at n.
  3. lastpos(n) gives the set of positions that can match the last symbol of a string generated by the subexpression rooted at n.
  4. followpos(p), for a position p, gives the set of positions q such that, in some string of the language of the augmented expression, the symbol matched at p is immediately followed by the symbol matched at q.

We refer to an interior node as a cat-node, or-node, or star-node if it is labeled by a concatenation, | or * operator, respectively.

Rules for Computing nullable, firstpos, and lastpos

| Node n | nullable(n) | firstpos(n) | lastpos(n) |
|---|---|---|---|
| Leaf labeled ε | true | ∅ | ∅ |
| Leaf labeled with position i | false | { i } | { i } |
| or-node with left child c1 and right child c2 | nullable(c1) or nullable(c2) | firstpos(c1) ∪ firstpos(c2) | lastpos(c1) ∪ lastpos(c2) |
| cat-node with left child c1 and right child c2 | nullable(c1) and nullable(c2) | if nullable(c1) then firstpos(c1) ∪ firstpos(c2) else firstpos(c1) | if nullable(c2) then lastpos(c1) ∪ lastpos(c2) else lastpos(c2) |
| star-node with child c1 | true | firstpos(c1) | lastpos(c1) |
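
The rules in this table translate almost line for line into recursive functions. The following is a minimal sketch, not a reference implementation: the tuple encoding of tree nodes is an assumption made here for brevity, and the positions 1 to 6 are the leaf numbers of (a|b)*abb# read left to right.

```python
# Tuple encoding assumed for this sketch:
#   ("eps",)  ("leaf", position)  ("or", c1, c2)  ("cat", c1, c2)  ("star", c1)

def nullable(n):
    kind = n[0]
    if kind in ("eps", "star"):
        return True
    if kind == "leaf":
        return False
    if kind == "or":
        return nullable(n[1]) or nullable(n[2])
    if kind == "cat":
        return nullable(n[1]) and nullable(n[2])
    raise ValueError(f"unknown node kind: {kind}")


def firstpos(n):
    kind = n[0]
    if kind == "eps":
        return set()
    if kind == "leaf":
        return {n[1]}
    if kind == "or":
        return firstpos(n[1]) | firstpos(n[2])
    if kind == "cat":
        # The right child can only start the string if the left child can be empty
        if nullable(n[1]):
            return firstpos(n[1]) | firstpos(n[2])
        return firstpos(n[1])
    if kind == "star":
        return firstpos(n[1])
    raise ValueError(f"unknown node kind: {kind}")


def lastpos(n):
    kind = n[0]
    if kind == "eps":
        return set()
    if kind == "leaf":
        return {n[1]}
    if kind == "or":
        return lastpos(n[1]) | lastpos(n[2])
    if kind == "cat":
        # The left child can only end the string if the right child can be empty
        if nullable(n[2]):
            return lastpos(n[1]) | lastpos(n[2])
        return lastpos(n[2])
    if kind == "star":
        return lastpos(n[1])
    raise ValueError(f"unknown node kind: {kind}")


# Syntax tree for (a|b)*abb# with leaf positions 1..6
star = ("star", ("or", ("leaf", 1), ("leaf", 2)))
root = star
for position in (3, 4, 5, 6):
    root = ("cat", root, ("leaf", position))

print(nullable(root), sorted(firstpos(root)), sorted(lastpos(root)))
```

For the root of this tree, firstpos evaluates to {1, 2, 3} and lastpos to {6}, since only the endmarker can end a string of the augmented expression.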

Rules for computing followpos:

  1. If n is a cat-node with left child c1 and right child c2, and i is a position in lastpos(c1), then all positions in firstpos(c2) are in followpos(i).
  2. If n is a star-node and i is a position in lastpos(n), then all positions in firstpos(n) are in followpos(i).

With the rules for computing firstpos and lastpos in hand, we calculate their values for each node in the syntax tree of the given regular expression (a|b)*abb#.

firstpos and lastpos for nodes in syntax tree for (a|b)*abb#

Let us now compute followpos for each position by traversing the syntax tree bottom-up and applying the two rules above at every cat-node and star-node.

| Position | followpos |
|---|---|
| 1 | {1, 2, 3} |
| 2 | {1, 2, 3} |
| 3 | {4} |
| 4 | {5} |
| 5 | {6} |
| 6 | ∅ |
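
The followpos values can be reproduced with a single post-order traversal that applies the two followpos rules at cat-nodes and star-nodes. This is a sketch under stated assumptions: the tuple encoding of the tree and the `analyze` function are invented for this example, and nullable, firstpos, and lastpos are computed inline so that the block stands on its own.

```python
# Tuple encoding assumed for this sketch:
#   ("leaf", position)  ("or", c1, c2)  ("cat", c1, c2)  ("star", c1)

def analyze(n, followpos):
    """Return (nullable, firstpos, lastpos) for n and fill in followpos."""
    kind = n[0]
    if kind == "leaf":
        followpos.setdefault(n[1], set())
        return False, {n[1]}, {n[1]}
    if kind == "or":
        null1, first1, last1 = analyze(n[1], followpos)
        null2, first2, last2 = analyze(n[2], followpos)
        return null1 or null2, first1 | first2, last1 | last2
    if kind == "cat":
        null1, first1, last1 = analyze(n[1], followpos)
        null2, first2, last2 = analyze(n[2], followpos)
        # Rule 1: whatever can end c1 may be followed by whatever can start c2
        for i in last1:
            followpos[i] |= first2
        first = (first1 | first2) if null1 else first1
        last = (last1 | last2) if null2 else last2
        return null1 and null2, first, last
    if kind == "star":
        _, first, last = analyze(n[1], followpos)
        # Rule 2: the end of one iteration may be followed by the start of the next
        for i in last:
            followpos[i] |= first
        return True, first, last
    raise ValueError(f"unknown node kind: {kind}")


# Syntax tree for (a|b)*abb# with leaf positions 1..6
root = ("star", ("or", ("leaf", 1), ("leaf", 2)))
for position in (3, 4, 5, 6):
    root = ("cat", root, ("leaf", position))

followpos = {}
analyze(root, followpos)
for position in sorted(followpos):
    print(position, sorted(followpos[position]))
```

Because leaves are visited before their parents, every position already has an entry in `followpos` by the time a cat-node or star-node adds to it.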

4. Now we construct Dstates, the set of states of the DFA D, and Dtran, the transition table for D. The start state of D is firstpos(root), and the accepting states are all those containing the position associated with the endmarker symbol #.

According to our example, the firstpos of the root is {1, 2, 3}. Let this state be A and consider the input symbol a. Positions 1 and 3 are for a, so let B = followpos(1) ∪ followpos(3) = {1, 2, 3, 4}. Since this set has not yet been seen, we set Dtran[A, a] := B.

When we consider input b, we find that out of the positions in A, only 2 is associated with b, thus we consider the set followpos(2) = {1, 2, 3}. Since this set has already been seen before, we do not add it to Dstates but we add the transition Dtran[A, b]:= A.

Continuing in the same way with the remaining states gives C = {1, 2, 3, 5} and D = {1, 2, 3, 6}, and we arrive at the transition table below.

| State | Input a | Input b |
|---|---|---|
| ⇢ A | B | A |
| B | B | C |
| C | B | D |
| D | B | A |

Here, A is the start state and D is the accepting state.
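
Step 4 can be sketched as a worklist loop over sets of positions. The followpos values and the symbol at each position are copied from the example above; the function name `build_dfa`, the letter naming of states in order of discovery, and the choice to leave a transition undefined when the target set is empty (an implicit dead state) are assumptions of this sketch.

```python
from collections import deque

# Data taken from the worked example for (a|b)*abb#
followpos = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
symbol_at = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}
start = frozenset({1, 2, 3})   # firstpos(root)
end_position = 6               # position of the endmarker #


def build_dfa(start, followpos, symbol_at, alphabet, end_position):
    """Build Dstates (as a name map) and Dtran from followpos."""
    names = {start: "A"}       # Dstates: set of positions -> state name
    dtran = {}                 # (state name, symbol) -> state name
    queue = deque([start])     # states whose transitions are not yet computed
    while queue:
        state = queue.popleft()
        for symbol in alphabet:
            # Union of followpos(p) over all positions p in the state labeled `symbol`
            target = frozenset().union(
                *(followpos[p] for p in state if symbol_at[p] == symbol)
            )
            if not target:
                continue       # no transition: treated as an implicit dead state
            if target not in names:
                names[target] = chr(ord("A") + len(names))
                queue.append(target)
            dtran[names[state], symbol] = names[target]
    accepting = {name for s, name in names.items() if end_position in s}
    return names, dtran, accepting


names, dtran, accepting = build_dfa(start, followpos, symbol_at, "ab", end_position)
for positions, name in names.items():
    print(name, sorted(positions))
print(dtran)
print(accepting)
```

With breadth-first discovery the states are named in the same order as in the table, so the resulting `dtran` can be compared with it entry by entry.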

5. Finally, we draw the DFA for the above transition table.

The final DFA is:

DFA for (a|b)*abb
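
To check the result, the DFA can be run on a few input strings. This is a small sketch: the dictionary `DTRAN` is transcribed from the transition table above, and the function name `accepts` is a name chosen for this example.

```python
# Transition table of the DFA for (a|b)*abb, keyed by (state, input symbol)
DTRAN = {
    ("A", "a"): "B", ("A", "b"): "A",
    ("B", "a"): "B", ("B", "b"): "C",
    ("C", "a"): "B", ("C", "b"): "D",
    ("D", "a"): "B", ("D", "b"): "A",
}
START = "A"
ACCEPTING = {"D"}


def accepts(text):
    """Return True if the DFA ends in an accepting state after reading text."""
    state = START
    for ch in text:
        if (state, ch) not in DTRAN:
            return False   # symbol outside the alphabet {a, b}
        state = DTRAN[state, ch]
    return state in ACCEPTING


for word in ["abb", "aabb", "babb", "ab", "abba", ""]:
    print(repr(word), accepts(word))
```

Each input symbol triggers exactly one dictionary lookup, which reflects the determinism described earlier: the running time is linear in the length of the input.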

Conclusion

Constructing a DFA from a regular expression is a fundamental process in automata theory that ties formal languages to practical uses such as lexical analysis in compilers. Building the DFA directly from the regular expression skips the intermediate NFA, so the procedure has fewer stages while still producing a deterministic automaton. Understanding how DFAs work not only deepens knowledge of formal languages but also helps in implementing efficient pattern recognition and parsing algorithms in many computer science applications.


duttasneha25122000