K Most Frequent Words in a File

Last Updated : 26 Mar, 2025

Given a book (a file of words) and an integer K, design a dynamic data structure that finds the K most frequent words in the book. Assume there is enough main memory to hold all the words. The structure should allow new words to be added while it is in memory.

Examples:

Input: fileData = "Welcome to the world of Geeks. This portal has been created to provide well written well thought and well explained solutions for selected questions If you like Geeks for Geeks and would like to contribute here is your chance You can write article and mail your article to contribute at geeksforgeeks org See your article appearing on the Geeks for Geeks main page and help thousands of other Geeks"
Output:
"your" : 3
"well" : 3
"and" : 4
"to" : 4
"Geeks" : 6

Using Hash Map and Heap

  • Store all words and their frequencies in a hash map.
  • Keep the k most frequent items in a min heap of size k (refer to k Largest Elements in an Array for details).
  • Print the words and their frequencies in decreasing order of frequency.
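
The three steps above can be sketched in Python with an explicit min heap, since the Python listing on this page relies on most_common() instead. This is a minimal sketch under stated assumptions, not the article's implementation: top_k_words is a hypothetical name, words are split on whitespace, and ties are broken by comparing the words themselves.

```python
import heapq
from collections import Counter


def top_k_words(text, k):
    """Return up to k (word, count) pairs, most frequent first."""
    if k <= 0:
        return []

    # Step 1: count every word with a hash map.
    freq = Counter(text.split())

    # Step 2: keep at most k (count, word) pairs in a min heap.
    # heap[0] is always the weakest of the current top k.
    heap = []
    for word, count in freq.items():
        if len(heap) < k:
            heapq.heappush(heap, (count, word))
        elif (count, word) > heap[0]:
            # Assumption: ties on count are decided by the word itself.
            heapq.heapreplace(heap, (count, word))

    # Step 3: report in decreasing order of frequency.
    return [(word, count) for count, word in sorted(heap, reverse=True)]


if __name__ == "__main__":
    for word, count in top_k_words("a b a c a b", 2):
        print(f"{word} : {count}")
```

The heap never grows beyond k entries, so each distinct word costs at most O(log k) heap work.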

Important Points about Implementations

  • In Python, Counter.most_common() from the collections module returns the most frequent items directly.
  • JavaScript has no built-in min heap, so the JavaScript version sorts the entries instead.
C++
#include <bits/stdc++.h>
using namespace std;

void processText(const string& text, int k) {

    // Store frequencies of all words
    unordered_map<string, int> freqMap;
    istringstream iss(text);
    for (string word; iss >> word;) freqMap[word]++;

    // Keep the k most frequent items in a priority queue
    // (min heap) with frequency as key
    priority_queue<pair<int, string>, vector<pair<int, string>>, greater<>> pq;
    for (const auto& x : freqMap) {
        pq.emplace(x.second, x.first);
        if ((int)pq.size() > k) pq.pop();
    }

    // Get the top frequent items (least frequent first)
    vector<pair<int, string>> res;
    while (!pq.empty()) {
        res.push_back(pq.top());
        pq.pop();
    }

    // Reverse to get the desired order
    reverse(res.begin(), res.end());
    for (const auto& x : res)
        cout << x.second << " : " << x.first << endl;
}

int main() {

    // using string instead of file to
    // test and run the code
    string text = "Welcome to the world of Geeks Geeks for Geeks is great";
    processText(text, 5);

    // to read from file
    // ifstream file("file.txt");
    // if (!file) {
    //     cerr << "File doesn't exist" << endl;
    //     return 1;
    // }
    // stringstream buffer;
    // buffer << file.rdbuf();
    // processText(buffer.str(), 5);

    return 0;
}
Java
import java.util.*;
import java.io.*;

public class Main {
    public static void processText(String text, int k) {

        // Store frequencies of all words
        Map<String, Integer> freqMap = new HashMap<>();
        String[] words = text.split(" ");
        for (String word : words) {
            freqMap.put(word, freqMap.getOrDefault(word, 0) + 1);
        }

        // Keep the k most frequent items in a priority queue
        // (min heap) with frequency as key
        PriorityQueue<Map.Entry<String, Integer>> pq = new PriorityQueue<>(
            (a, b) -> a.getValue() - b.getValue()
        );
        for (Map.Entry<String, Integer> entry : freqMap.entrySet()) {
            pq.offer(entry);
            if (pq.size() > k) pq.poll();
        }

        // Get the top frequency items (least frequent first)
        List<Map.Entry<String, Integer>> res = new ArrayList<>();
        while (!pq.isEmpty()) {
            res.add(pq.poll());
        }

        // Reverse to get the desired order
        Collections.reverse(res);
        for (Map.Entry<String, Integer> entry : res)
            System.out.println(entry.getKey() + " : " + entry.getValue());
    }

    public static void main(String[] args) {

        // using string instead of file to
        // test and run the code
        String text = "Welcome to the world of Geeks Geeks for Geeks is great";
        processText(text, 5);

        // to read from file (main must then declare throws IOException)
        // String fileText = new String(
        //     java.nio.file.Files.readAllBytes(java.nio.file.Paths.get("file.txt")));
        // processText(fileText, 5);
    }
}
Python
from collections import Counter


def process_text(text, k):

    # Store frequencies of all words
    freq_map = Counter(text.split())

    # Get the top k frequent items
    res = freq_map.most_common(k)

    for word, freq in res:
        print(f'{word} : {freq}')


if __name__ == '__main__':
    text = 'Welcome to the world of Geeks Geeks for Geeks is great'
    process_text(text, 5)

    # to read from file
    # with open('file.txt', 'r') as file:
    #     text = file.read()
    #     process_text(text, 5)
C#
using System;
using System.Collections.Generic;
using System.Linq;

class MainClass {
    public static void ProcessText(string text, int k) {

        // Store frequencies of all words
        Dictionary<string, int> freqMap = new Dictionary<string, int>();
        string[] words = text.Split(' ');
        foreach (string word in words) {
            if (freqMap.ContainsKey(word)) {
                freqMap[word]++;
            } else {
                freqMap[word] = 1;
            }
        }

        // Keep the k most frequent items in a priority queue.
        // PriorityQueue (.NET 6 and later) is a min heap, so using the
        // frequency as priority makes Dequeue() drop the least frequent item.
        var pq = new PriorityQueue<string, int>();
        foreach (var entry in freqMap) {
            pq.Enqueue(entry.Key, entry.Value);
            if (pq.Count > k) pq.Dequeue();
        }

        // Get the top frequency items (least frequent first)
        List<KeyValuePair<string, int>> res = new List<KeyValuePair<string, int>>();
        while (pq.TryDequeue(out string w, out int f)) {
            res.Add(new KeyValuePair<string, int>(w, f));
        }
        res.Reverse(); // To get the highest frequency first
        foreach (var entry in res)
            Console.WriteLine(entry.Key + " : " + entry.Value);
    }

    public static void Main(string[] args) {
        string text = "Welcome to the world of Geeks Geeks for Geeks is great";
        ProcessText(text, 5);
    }
}
JavaScript
function processText(text, k) {

    // Store frequencies of all words
    const freqMap = {};
    const words = text.split(' ');
    for (let word of words) {
        freqMap[word] = (freqMap[word] || 0) + 1;
    }

    // Store frequency map items in an array and sort
    // in increasing order of frequency
    const sortedWords = Object.entries(freqMap).sort((a, b) => a[1] - b[1]);

    // Get the top k frequent items, most frequent first
    const res = sortedWords.slice(-k).reverse();

    for (const [word, freq] of res) {
        console.log(`${word} : ${freq}`);
    }
}

const text = 'Welcome to the world of Geeks Geeks for Geeks is great';
processText(text, 5);

// to read from file
// const fs = require('fs');
// fs.readFile('file.txt', 'utf8', (err, data) => {
//     if (err) {
//         console.error("File doesn't exist");
//         return;
//     }
//     processText(data, 5);
// });


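Time Complexity: O(n + d log k), where n is the number of words in the file and d ≤ n is the number of distinct words, which is O(n + n log k) in the worst case. We assume that every word is of constant length. The JavaScript version sorts all d distinct words, so it takes O(n + d log d).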


Using Trie and Min Heap

This approach uses a Trie to store and look up words as they are read from the file, while keeping track of each word's occurrence count. Each Trie node has an additional field, indexMinHeap (named ind in the code below), which holds the position of the word in the Min Heap if it is currently among the top k frequent words, or -1 if it is not. In parallel, a Min Heap of fixed size k records the k most frequent words encountered so far. Each node in the Min Heap contains the word, its frequency, and a pointer to the Trie node where that word ends. As words are processed, the algorithm updates their frequencies in the Trie and then reflects the change in the Min Heap in one of three ways: it updates an existing entry, inserts a new entry if space is available, or replaces the root of the Min Heap (the least frequent word among the top k) when the new word’s frequency exceeds the root's.

Step-by-Step Process to Execute the Code

  • Open the input file and ensure it is accessible; report an error if the file cannot be opened.
  • Read words from the file one by one. For each word, insert it into the Trie: if the word already exists, increment its frequency counter; if not, create a new node and initialize its count to 1.
  • For every word inserted or updated in the Trie, update the Min Heap as follows:
    • If the word is already present in the Min Heap (i.e., its indexMinHeap is not -1), simply increment its frequency in the heap and call heapify() at that index.
    • If the word is not present and the Min Heap has available space, insert the new word into the heap, set its Trie node's indexMinHeap, and rebuild the heap with build().
    • If the Min Heap is full, compare the frequency of the new word with the frequency at the root of the heap (the smallest frequency among the top k). If the new word’s frequency is not higher, do nothing; if it is higher, replace the root with the new word, set the indexMinHeap of the replaced word's Trie node to -1, and call heapify() on the root to restore the heap property.
  • After processing all words, the Min Heap will contain the k most frequent words. Finally, iterate over the Min Heap and print each word along with its frequency.
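
The update rules above can also be packaged as a structure that accepts words one at a time, which is what the problem statement asks for. The sketch below is a simplified illustration, not the article's implementation: TopKTracker and its method names are hypothetical, a plain dict stands in for the Trie, and a new entry is sifted up instead of rebuilding the whole heap.

```python
class TopKTracker:
    """Streaming top-k word counter backed by an indexed min heap."""

    def __init__(self, k):
        self.k = k
        self.count = {}  # word -> frequency (assumption: dict replaces the Trie)
        self.pos = {}    # word -> index in self.heap (plays the role of indexMinHeap)
        self.heap = []   # words arranged as a min heap on self.count

    def _swap(self, i, j):
        h = self.heap
        h[i], h[j] = h[j], h[i]
        self.pos[h[i]] = i
        self.pos[h[j]] = j

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if (child < n and
                        self.count[self.heap[child]] < self.count[self.heap[smallest]]):
                    smallest = child
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.count[self.heap[i]] >= self.count[self.heap[parent]]:
                return
            self._swap(i, parent)
            i = parent

    def add(self, word):
        self.count[word] = self.count.get(word, 0) + 1
        if self.k <= 0:
            return
        if word in self.pos:
            # Case 1: already in the heap; its key grew, so push it down.
            self._sift_down(self.pos[word])
        elif len(self.heap) < self.k:
            # Case 2: room left; append and restore the heap property.
            self.heap.append(word)
            self.pos[word] = len(self.heap) - 1
            self._sift_up(self.pos[word])
        elif self.count[word] > self.count[self.heap[0]]:
            # Case 3: heap full and the word beats the root; replace the root.
            del self.pos[self.heap[0]]
            self.heap[0] = word
            self.pos[word] = 0
            self._sift_down(0)

    def top(self):
        """Current top words, most frequent first (ties in alphabetical order)."""
        return sorted(((w, self.count[w]) for w in self.heap),
                      key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    tracker = TopKTracker(2)
    for w in "a b a c a b c c c".split():
        tracker.add(w)
    print(tracker.top())
```

Because top() sorts only the k heap entries, it can be called at any point between insertions without disturbing the heap.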

The implementation is given below:

C++
#include <bits/stdc++.h>
using namespace std;

class Node {
public:
    bool isEnd;
    unsigned freq;
    int ind;
    vector<Node*> child;

    Node() : isEnd(false), freq(0),
    ind(-1), child(26, nullptr) {}

    // Free the whole subtree when a node is deleted
    ~Node() {
        for (Node* c : child)
            delete c;
    }
};

class minHeapNode {
public:
    Node* root;
    unsigned freq;
    string word;

    minHeapNode() :
    root(nullptr), freq(0), word("") {}
};

class MinHeap {
public:
    int cap;
    int count;
    vector<minHeapNode> arr;

    MinHeap(int cap) :
    cap(cap), count(0), arr(cap) {}

    // Swap two heap entries and keep the
    // Trie nodes' heap indices in sync
    void swapNodes(int a, int b) {
        swap(arr[a], arr[b]);
        arr[a].root->ind = a;
        arr[b].root->ind = b;
    }

    void heapify(int idx) {
        int left = 2 * idx + 1;
        int right = 2 * idx + 2;
        int mini = idx;
        if (left < count &&
            arr[left].freq < arr[mini].freq)
            mini = left;
        if (right < count &&
            arr[right].freq < arr[mini].freq)
            mini = right;
        if (mini != idx) {
            swapNodes(idx, mini);
            heapify(mini);
        }
    }

    void build() {
        for (int i = (count - 1) / 2; i >= 0; --i)
            heapify(i);
    }
};

void insert(MinHeap& mH, Node* root,
                    const string& word) {

    // Case 1: word is already in mH,
    // so update its freq.
    if (root->ind != -1) {
        ++mH.arr[root->ind].freq;
        mH.heapify(root->ind);
    }

    // Case 2: Word is not in mH and
    // there's still room.
    else if (mH.count < mH.cap) {
        minHeapNode node;
        node.root = root;
        node.freq = root->freq;
        node.word = word;
        mH.arr[mH.count] = node;
        root->ind = mH.count++;
        mH.build();
    }

    // Case 3: Heap is full and freq of new
    // word is greater than the root.
    else if (root->freq > mH.arr[0].freq) {
        mH.arr[0].root->ind = -1;
        minHeapNode node;
        node.root = root;
        node.freq = root->freq;
        node.word = word;
        mH.arr[0] = node;
        root->ind = 0;
        mH.heapify(0);
    }
}

void insertUtil(Node*& root, MinHeap& mH,
    const string& word, size_t index = 0) {
    if (!root)
        root = new Node();
    if (index < word.size()) {

        // Words containing a non-letter
        // character are skipped
        int pos = tolower(word[index]) - 'a';
        if (pos >= 0 && pos < 26)
            insertUtil(root->child[pos],
                mH, word, index + 1);
    } else {
        if (root->isEnd)
            ++root->freq;
        else {
            root->isEnd = true;
            root->freq = 1;
        }
        insert(mH, root, word);
    }
}

void insertTrieAndHeap(const string& word,
    Node*& root, MinHeap& mH) {
    insertUtil(root, mH, word);
}

void displayMinHeap(const MinHeap& mH) {
    for (int i = 0; i < mH.count; ++i)
        cout << mH.arr[i].word << " : "
        << mH.arr[i].freq << endl;
}

void printKMostFreq(ifstream& file, int k) {
    MinHeap mH(k);
    Node* root = nullptr;

    // to process the words in file
    string word;
    while (file >> word) {
        insertTrieAndHeap(word, root, mH);
    }
    displayMinHeap(mH);

    // Clean up the Trie memory
    delete root;
}

void printKMostFreq(string str, int k) {
    MinHeap mH(k);
    Node* root = nullptr;

    istringstream iss(str);
    string word;
    while (iss >> word) {
        insertTrieAndHeap(word, root, mH);
    }

    displayMinHeap(mH);

    // Clean up the Trie memory
    delete root;
}

int main() {
    int k = 5;

    // using string instead of file to
    // test and run the code
    string str = "Welcome to the world of Geeks . This portal has been created to provide well written well thought and well explained solutions for selected questions If you like Geeks for Geeks and would like to contribute here is your chance You can write article and mail your article to contribute at geeksforgeeks org See your article appearing on the Geeks for Geeks main page and help thousands of other Geeks";
    printKMostFreq(str, k);

    // to read from file
    // ifstream file("file.txt");
    // if (!file) {
    //     cerr << "File doesn't exist" << endl;
    //     return 1;
    // }
    // printKMostFreq(file, k);

    return 0;
}
Java
import java.io.*;
import java.util.*;
import java.util.regex.*;

class Node {
    boolean isEnd;
    int freq;
    int ind;
    Node[] child;

    Node() {
        isEnd = false;
        freq = 0;
        ind = -1;
        child = new Node[26];
    }
}

class MinHeapNode {
    Node root;
    int freq;
    String word;

    MinHeapNode() {
        root = null;
        freq = 0;
        word = "";
    }
}

class MinHeap {
    int cap;
    int count;
    MinHeapNode[] arr;

    MinHeap(int cap) {
        this.cap = cap;
        count = 0;
        arr = new MinHeapNode[cap];
        for (int i = 0; i < cap; i++) {
            arr[i] = new MinHeapNode();
        }
    }

    void swapNodes(int a, int b) {
        MinHeapNode temp = arr[a];
        arr[a] = arr[b];
        arr[b] = temp;
        arr[a].root.ind = a;
        arr[b].root.ind = b;
    }

    void heapify(int idx) {
        int left = 2 * idx + 1;
        int right = 2 * idx + 2;
        int mini = idx;
        if (left < count && arr[left].freq < arr[mini].freq)
            mini = left;
        if (right < count && arr[right].freq < arr[mini].freq)
            mini = right;
        if (mini != idx) {
            swapNodes(idx, mini);
            heapify(mini);
        }
    }

    void build() {
        for (int i = (count - 1) / 2; i >= 0; --i)
            heapify(i);
    }
}

class GfG {

    static void insert(MinHeap mH, Node root, String word) {

        // Case 1: word is already in mH,
        // so update its freq.
        if (root.ind != -1) {
            ++mH.arr[root.ind].freq;
            mH.heapify(root.ind);
        }

        // Case 2: Word is not in mH and
        // there's still room.
        else if (mH.count < mH.cap) {
            MinHeapNode node = new MinHeapNode();
            node.root = root;
            node.freq = root.freq;
            node.word = word;
            mH.arr[mH.count] = node;
            root.ind = mH.count++;
            mH.build();
        }

        // Case 3: Heap is full and freq of new
        // word is greater than the root.
        else if (root.freq > mH.arr[0].freq) {
            mH.arr[0].root.ind = -1;
            MinHeapNode node = new MinHeapNode();
            node.root = root;
            node.freq = root.freq;
            node.word = word;
            mH.arr[0] = node;
            root.ind = 0;
            mH.heapify(0);
        }
    }

    static void insertUtil(Node root, MinHeap mH, String word, int index) {
        if (index < word.length()) {
            int pos = Character.toLowerCase(word.charAt(index)) - 'a';
            if (pos >= 0 && pos < 26) {
                if (root.child[pos] == null) {
                    root.child[pos] = new Node();
                }
                insertUtil(root.child[pos], mH, word, index + 1);
            }
        } else {
            if (root.isEnd)
                ++root.freq;
            else {
                root.isEnd = true;
                root.freq = 1;
            }
            insert(mH, root, word);
        }
    }

    static void insertTrieAndHeap(String word, Node root, MinHeap mH) {
        insertUtil(root, mH, word, 0);
    }

    static void displayMinHeap(MinHeap mH) {
        for (int i = 0; i < mH.count; ++i)
            System.out.println(mH.arr[i].word + " : " + mH.arr[i].freq);
    }

    static void printKMostFreq(BufferedReader file, int k) throws IOException {
        MinHeap mH = new MinHeap(k);
        Node root = new Node();
        String line;

        while ((line = file.readLine()) != null) {
            for (String word : line.split("\\W+")) {
                if (!word.isEmpty()) {
                    insertTrieAndHeap(word.toLowerCase(), root, mH);
                }
            }
        }
        displayMinHeap(mH);
    }

    static void printKMostFreq(String str, int k) {
        MinHeap mH = new MinHeap(k);
        Node root = new Node();

        for (String word : str.split("\\W+")) {
            if (!word.isEmpty()) {
                insertTrieAndHeap(word.toLowerCase(), root, mH);
            }
        }

        displayMinHeap(mH);
    }

    public static void main(String[] args) throws IOException {
        int k = 5;

        // to read from file
        // BufferedReader file = new BufferedReader(new FileReader("file.txt"));
        // printKMostFreq(file, k);

        // using string instead of file to
        // test and run the code
        String str = "Welcome to the world of Geeks . This portal has been created to provide well written well thought and well explained solutions for selected questions If you like Geeks for Geeks and would like to contribute here is your chance You can write article and mail your article to contribute at geeksforgeeks org See your article appearing on the Geeks for Geeks main page and help thousands of other Geeks";
        printKMostFreq(str, k);
    }
}
Python
class Node:
    def __init__(self):
        self.isEnd = False
        self.freq = 0
        self.ind = -1
        self.child = [None] * 26


class MinHeapNode:
    def __init__(self):
        self.root = None
        self.freq = 0
        self.word = ""


class MinHeap:
    def __init__(self, cap):
        self.cap = cap
        self.count = 0
        self.arr = [MinHeapNode() for _ in range(cap)]

    def swapNodes(self, a, b):
        self.arr[a], self.arr[b] = self.arr[b], self.arr[a]
        self.arr[a].root.ind = a
        self.arr[b].root.ind = b

    def heapify(self, idx):
        left = 2 * idx + 1
        right = 2 * idx + 2
        mini = idx
        if left < self.count and self.arr[left].freq < self.arr[mini].freq:
            mini = left
        if right < self.count and self.arr[right].freq < self.arr[mini].freq:
            mini = right
        if mini != idx:
            self.swapNodes(idx, mini)
            self.heapify(mini)

    def build(self):
        for i in range((self.count - 1) // 2, -1, -1):
            self.heapify(i)


def insert(mH, root, word):

    # Case 1: word is already in mH,
    # so update its freq.
    if root.ind != -1:
        mH.arr[root.ind].freq += 1
        mH.heapify(root.ind)

    # Case 2: Word is not in mH and
    # there's still room.
    elif mH.count < mH.cap:
        node = MinHeapNode()
        node.root = root
        node.freq = root.freq
        node.word = word
        mH.arr[mH.count] = node
        root.ind = mH.count
        mH.count += 1
        mH.build()

    # Case 3: Heap is full and freq of new
    # word is greater than the root.
    elif root.freq > mH.arr[0].freq:
        mH.arr[0].root.ind = -1
        node = MinHeapNode()
        node.root = root
        node.freq = root.freq
        node.word = word
        mH.arr[0] = node
        root.ind = 0
        mH.heapify(0)


def insertUtil(root, mH, word, index=0):
    if index < len(word):

        # Words containing a non-letter
        # character are skipped
        pos = ord(word[index].lower()) - ord('a')
        if 0 <= pos < 26:
            if root.child[pos] is None:
                root.child[pos] = Node()
            insertUtil(root.child[pos], mH, word, index + 1)
    else:
        if root.isEnd:
            root.freq += 1
        else:
            root.isEnd = True
            root.freq = 1
        insert(mH, root, word)


def insertTrieAndHeap(word, root, mH):
    insertUtil(root, mH, word)


def displayMinHeap(mH):
    for i in range(mH.count):
        print(mH.arr[i].word, ":", mH.arr[i].freq)


def printKMostFreq(file, k):
    mH = MinHeap(k)
    root = Node()

    # to process the words in file
    for word in file.read().split():
        insertTrieAndHeap(word, root, mH)
    displayMinHeap(mH)


def printKMostFreqString(text, k):
    mH = MinHeap(k)
    root = Node()

    for word in text.split():
        insertTrieAndHeap(word, root, mH)
    displayMinHeap(mH)


if __name__ == "__main__":
    k = 5

    # to read from file
    # with open("file.txt", "r") as file:
    #     printKMostFreq(file, k)

    # using string instead of file to
    # test and run the code
    text = "Welcome to the world of Geeks . This portal has been created to provide well written well thought and well explained solutions for selected questions If you like Geeks for Geeks and would like to contribute here is your chance You can write article and mail your article to contribute at geeksforgeeks org See your article appearing on the Geeks for Geeks main page and help thousands of other Geeks"
    printKMostFreqString(text, k)
C#
using System;
using System.IO;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class Node {
    public bool isEnd;
    public int freq;
    public int ind;
    public Node[] children;

    public Node() {
        isEnd = false;
        freq = 0;
        ind = -1;
        children = new Node[26];
    }
}

class MinHeapNode {
    public Node root;
    public int freq;
    public string word;

    public MinHeapNode() {
        root = null;
        freq = 0;
        word = "";
    }
}

class MinHeap {
    public int cap;
    public int count;
    public MinHeapNode[] arr;

    public MinHeap(int cap) {
        this.cap = cap;
        count = 0;
        arr = new MinHeapNode[cap];
        for (int i = 0; i < cap; i++) {
            arr[i] = new MinHeapNode();
        }
    }

    public void SwapNodes(int a, int b) {
        MinHeapNode temp = arr[a];
        arr[a] = arr[b];
        arr[b] = temp;
        arr[a].root.ind = a;
        arr[b].root.ind = b;
    }

    public void Heapify(int idx) {
        int left = 2 * idx + 1;
        int right = 2 * idx + 2;
        int mini = idx;
        if (left < count && arr[left].freq < arr[mini].freq)
            mini = left;
        if (right < count && arr[right].freq < arr[mini].freq)
            mini = right;
        if (mini != idx) {
            SwapNodes(idx, mini);
            Heapify(mini);
        }
    }

    public void Build() {
        for (int i = (count - 1) / 2; i >= 0; --i)
            Heapify(i);
    }
}

class GfG {

    static void Insert(MinHeap mH, Node root, string word) {

        // Case 1: word is already in mH,
        // so update its freq.
        if (root.ind != -1) {
            ++mH.arr[root.ind].freq;
            mH.Heapify(root.ind);
        }

        // Case 2: Word is not in mH and
        // there's still room.
        else if (mH.count < mH.cap) {
            MinHeapNode node = new MinHeapNode();
            node.root = root;
            node.freq = root.freq;
            node.word = word;
            mH.arr[mH.count] = node;
            root.ind = mH.count++;
            mH.Build();
        }

        // Case 3: Heap is full and freq of new
        // word is greater than the root.
        else if (root.freq > mH.arr[0].freq) {
            mH.arr[0].root.ind = -1;
            MinHeapNode node = new MinHeapNode();
            node.root = root;
            node.freq = root.freq;
            node.word = word;
            mH.arr[0] = node;
            root.ind = 0;
            mH.Heapify(0);
        }
    }

    static void InsertUtil(Node root, MinHeap mH, string word, int index = 0) {
        if (index < word.Length) {
            int pos = Char.ToLower(word[index]) - 'a';
            if (pos >= 0 && pos < 26) {
                if (root.children[pos] == null) {
                    root.children[pos] = new Node();
                }
                InsertUtil(root.children[pos], mH, word, index + 1);
            }
        } else {
            if (root.isEnd)
                ++root.freq;
            else {
                root.isEnd = true;
                root.freq = 1;
            }
            Insert(mH, root, word);
        }
    }

    static void InsertTrieAndHeap(string word, Node root, MinHeap mH) {
        InsertUtil(root, mH, word);
    }

    static void DisplayMinHeap(MinHeap mH) {
        for (int i = 0; i < mH.count; ++i)
            Console.WriteLine(mH.arr[i].word + " : " + mH.arr[i].freq);
    }

    static void PrintKMostFreq(StreamReader file, int k) {
        MinHeap mH = new MinHeap(k);
        Node root = new Node();

        // to process the words in file
        string line;
        while ((line = file.ReadLine()) != null) {
            foreach (string word in Regex.Split(line, @"\W+")) {
                if (!string.IsNullOrEmpty(word)) {
                    InsertTrieAndHeap(word.ToLower(), root, mH);
                }
            }
        }
        DisplayMinHeap(mH);
    }

    static void PrintKMostFreq(string str, int k) {
        MinHeap mH = new MinHeap(k);
        Node root = new Node();

        foreach (string word in Regex.Split(str, @"\W+")) {
            if (!string.IsNullOrEmpty(word)) {
                InsertTrieAndHeap(word.ToLower(), root, mH);
            }
        }

        DisplayMinHeap(mH);
    }

    public static void Main() {
        int k = 5;

        // to read from file
        // using (StreamReader file = new StreamReader("file.txt")) {
        //     PrintKMostFreq(file, k);
        // }

        // using string instead of file to
        // test and run the code
        string str = "Welcome to the world of Geeks . This portal has been created to provide well written well thought and well explained solutions for selected questions If you like Geeks for Geeks and would like to contribute here is your chance You can write article and mail your article to contribute at geeksforgeeks org See your article appearing on the Geeks for Geeks main page and help thousands of other Geeks";
        PrintKMostFreq(str, k);
    }
}
JavaScript
class Node {
    constructor() {
        this.isEnd = false;
        this.freq = 0;
        this.ind = -1;
        this.child = new Array(26).fill(null);
    }
}

class MinHeapNode {
    constructor() {
        this.root = null;
        this.freq = 0;
        this.word = "";
    }
}

class MinHeap {
    constructor(cap) {
        this.cap = cap;
        this.count = 0;
        this.arr = new Array(cap).fill(null).map(() => new MinHeapNode());
    }

    swapNodes(a, b) {
        [this.arr[a], this.arr[b]] = [this.arr[b], this.arr[a]];
        this.arr[a].root.ind = a;
        this.arr[b].root.ind = b;
    }

    heapify(idx) {
        let left = 2 * idx + 1;
        let right = 2 * idx + 2;
        let mini = idx;
        if (left < this.count && this.arr[left].freq < this.arr[mini].freq)
            mini = left;
        if (right < this.count && this.arr[right].freq < this.arr[mini].freq)
            mini = right;
        if (mini !== idx) {
            this.swapNodes(idx, mini);
            this.heapify(mini);
        }
    }

    build() {
        for (let i = Math.floor((this.count - 1) / 2); i >= 0; --i)
            this.heapify(i);
    }
}

function insert(mH, root, word) {

    // Case 1: word is already in mH,
    // so update its freq.
    if (root.ind !== -1) {
        mH.arr[root.ind].freq++;
        mH.heapify(root.ind);
    }

    // Case 2: Word is not in mH and
    // there's still room.
    else if (mH.count < mH.cap) {
        let node = new MinHeapNode();
        node.root = root;
        node.freq = root.freq;
        node.word = word;
        mH.arr[mH.count] = node;
        root.ind = mH.count++;
        mH.build();
    }

    // Case 3: Heap is full and freq of new
    // word is greater than the root.
    else if (root.freq > mH.arr[0].freq) {
        mH.arr[0].root.ind = -1;
        let node = new MinHeapNode();
        node.root = root;
        node.freq = root.freq;
        node.word = word;
        mH.arr[0] = node;
        root.ind = 0;
        mH.heapify(0);
    }
}

function insertUtil(root, mH, word, index = 0) {
    if (!root)
        root = new Node();
    if (index < word.length) {
        let pos = word[index].toLowerCase().charCodeAt(0) - 'a'.charCodeAt(0);
        if (pos >= 0 && pos < 26) {
            if (!root.child[pos])
                root.child[pos] = new Node();
            insertUtil(root.child[pos], mH, word, index + 1);
        }
    } else {
        if (root.isEnd)
            root.freq++;
        else {
            root.isEnd = true;
            root.freq = 1;
        }
        insert(mH, root, word);
    }
}

function insertTrieAndHeap(word, root, mH) {
    insertUtil(root, mH, word);
}

function displayMinHeap(mH) {
    for (let i = 0; i < mH.count; ++i)
        console.log(mH.arr[i].word + " : " + mH.arr[i].freq);
}

function printKMostFreq(str, k) {
    let mH = new MinHeap(k);
    let root = new Node();

    let words = str.split(/\s+/);
    for (let word of words) {
        insertTrieAndHeap(word, root, mH);
    }

    displayMinHeap(mH);
}

function main() {
    let k = 5;

    // using string instead of file to
    // test and run the code
    let str = "Welcome to the world of Geeks . This portal has been created to provide well written well thought and well explained solutions for selected questions If you like Geeks for Geeks and would like to contribute here is your chance You can write article and mail your article to contribute at geeksforgeeks org See your article appearing on the Geeks for Geeks main page and help thousands of other Geeks";
    printKMostFreq(str, k);
}

main();

Output
your : 3
well : 3
and : 4
to : 4
Geeks : 6

The output above corresponds to a file with the following content.

Welcome to the world of Geeks . This portal has been created to provide well written well thought and well explained solutions for selected questions If you like Geeks for Geeks and would like to contribute here is your chance You can write article and mail your article to contribute at geeksforgeeks org See your article appearing on the Geeks for Geeks main page and help thousands of other Geeks.
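
As a cross-check for the trie and min-heap approach, the same counts can be reproduced with a plain Map followed by a sort. The sketch below is only an illustration and is not part of the original solution: the function name topKWithMap and its alphabetical tie-breaking rule are assumptions made here. It mirrors two behaviours of the code above: words are compared case-insensitively, and any token containing a non-letter character is skipped.

```javascript
// Hypothetical helper (not from the original article): counts words
// case-insensitively with a Map and returns the k most frequent entries
// as [word, frequency] pairs.
function topKWithMap(text, k) {
    const counts = new Map();
    for (const raw of text.split(/\s+/)) {
        // Mirror the trie version: skip empty tokens and tokens
        // that contain anything other than the letters a-z.
        if (!/^[A-Za-z]+$/.test(raw)) continue;
        const key = raw.toLowerCase();
        counts.set(key, (counts.get(key) || 0) + 1);
    }
    // Sort by descending frequency. Ties are broken alphabetically,
    // which is an assumption of this sketch, so the result is deterministic.
    return [...counts.entries()]
        .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
        .slice(0, k);
}

// Small usage example.
console.log(topKWithMap("a b a c b a", 2));
```

Sorting every distinct word costs O(n log n) for n distinct words, whereas the heap of size K keeps each update at O(log K), so this sketch is better suited to verification than to the dynamic setting the problem asks for. Note also that in the sample text several words appear to share frequency 3 (for example "well", "your", "article" and "for"), so which of them fill the last slots of a top 5 depends on the tie-breaking rule and can differ between the two approaches.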


kartik