In computer science, data structures are crucial in efficiently managing and organizing data. Among these, the B+ tree is a powerful and important data structure, widely used in databases and file systems. In this article, we will discuss the concept of B+ trees, exploring their structure, operations, and implementation in Python.
B+ Tree in Python
A B+ tree is a self-balancing tree data structure designed for efficient storage and retrieval of data in secondary memory such as disk storage. It is a variant of the B-tree, characterized by its ability to store multiple keys in each node, with only the leaf nodes containing actual data pointers. The internal nodes act as index nodes, facilitating fast searching and traversal.
Key Characteristics of B+ Trees:
- Ordered Structure: Keys in a B+ tree are stored in sorted order, enabling efficient searching through binary search.
- Balanced Tree: B+ trees are self-balancing, ensuring that operations such as insertion and deletion maintain a balanced tree structure, which results in optimal performance.
- Leaf Node Linked List: All leaf nodes are linked together, forming a linked list. This feature facilitates a range of queries and sequential access.
Operations on B+ Trees:
- Search: Searching in a B+ tree involves traversing the tree from the root node to the leaf node, and performing a binary search to locate the desired key.
- Insertion: Inserting a new key-value pair into a B+ tree begins with a search operation to find the appropriate leaf node. If the leaf node has space, the key-value pair is inserted. Otherwise, the node is split, and the insertion is propagated upwards.
- Deletion: Deleting a key-value pair from a B+ tree follows a similar process to insertion. After locating the leaf node containing the key to be deleted, the key is removed. If the node becomes underflowed, it may borrow keys from sibling nodes or merge with them to maintain balance.
Searching in B+ Tree:
Searching in a B-tree involves traversing the tree from the root to find the node that might contain the desired key.
Step-by-step algorithm:
- Start at the root of the B+ tree.
- Check if the current node is a leaf node.
- If not a leaf node, compare the search key with the keys in the node to determine the appropriate child node to traverse.
- If a leaf node, search for the key within the keys stored in the leaf node.
- If the key is found, return the corresponding value.
- If the key is not found in any leaf node, return a failure indication (e.g., null or false).
- Repeat steps 3-6 until the key is found or until there are no more child nodes to traverse.
Below is the implementation of the above idea:
Python3 import math # Node creation class Node: def __init__(self, order): self.order = order self.values = [] self.keys = [] self.nextKey = None self.parent = None self.check_leaf = False # Search operation def search(self, value): current_node = self while(not current_node.check_leaf): temp2 = current_node.values for i in range(len(temp2)): if (value == temp2[i]): current_node = current_node.keys[i + 1] break elif (value < temp2[i]): current_node = current_node.keys[i] break elif (i + 1 == len(current_node.values)): current_node = current_node.keys[i + 1] break return current_node # B plus tree class BplusTree: def __init__(self, order): self.root = Node(order) self.root.check_leaf = True # Search operation def search(self, value): current_node = self.root while(not current_node.check_leaf): temp2 = current_node.values for i in range(len(temp2)): if (value == temp2[i]): current_node = current_node.keys[i + 1] break elif (value < temp2[i]): current_node = current_node.keys[i] break elif (i + 1 == len(current_node.values)): current_node = current_node.keys[i + 1] break return current_node # Print the tree def printTree(tree): lst = [tree.root] level = [0] leaf = None flag = 0 lev_leaf = 0 node1 = Node(str(level[0]) + str(tree.root.values)) while (len(lst) != 0): x = lst.pop(0) lev = level.pop(0) if (x.check_leaf == False): for i, item in enumerate(x.keys): print(item.values) else: for i, item in enumerate(x.keys): print(item.values) if (flag == 0): lev_leaf = lev leaf = x flag = 1 # Main code record_len = 3 bplustree = BplusTree(record_len) bplustree.search('5') printTree(bplustree) if(bplustree.search('5')): print("Found") else: print("Not found")
Time Complexity: O(log n)
Auxiliary Space: O(1)
Insertion in B+ Tree:
Insertion in a B+ tree involves placing a new key-value pair into the appropriate leaf node and ensuring tree balance through splits and updates.
Step-by-step algorithm:
- Insertion:
- Search for the leaf node to insert the key-value pair.
- If the leaf node is full, split it.
- Propagate median key to parent if necessary.
- Split parent if full.
- Repeat until balanced.
- Search:
- Traverse down the tree to find leaf node.
- Search for key in leaf node.
- Return value if found, else indicate failure.
- Print Tree:
- Traverse tree to print its structure.
- Example Operations:
- Insert key-value pairs.
- Search for specific keys.
- Print the tree structure.
Below is the implementation of the above idea:
Python3 import math # Node creation class Node: def __init__(self, order): self.order = order self.values = [] self.keys = [] self.nextKey = None self.parent = None self.check_leaf = False # Insert at the leaf def insert_at_leaf(self, leaf, value, key): if (self.values): temp1 = self.values for i in range(len(temp1)): if (value == temp1[i]): self.keys[i].append(key) break elif (value < temp1[i]): self.values = self.values[:i] + [value] + self.values[i:] self.keys = self.keys[:i] + [[key]] + self.keys[i:] break elif (i + 1 == len(temp1)): self.values.append(value) self.keys.append([key]) break else: self.values = [value] self.keys = [[key]] # B plus tree class BplusTree: def __init__(self, order): self.root = Node(order) self.root.check_leaf = True # Insert operation def insert(self, value, key): value = str(value) old_node = self.search(value) old_node.insert_at_leaf(old_node, value, key) if (len(old_node.values) == old_node.order): node1 = Node(old_node.order) node1.check_leaf = True node1.parent = old_node.parent mid = int(math.ceil(old_node.order / 2)) - 1 node1.values = old_node.values[mid + 1:] node1.keys = old_node.keys[mid + 1:] node1.nextKey = old_node.nextKey old_node.values = old_node.values[:mid + 1] old_node.keys = old_node.keys[:mid + 1] old_node.nextKey = node1 self.insert_in_parent(old_node, node1.values[0], node1) # Search operation for different operations def search(self, value): current_node = self.root while(current_node.check_leaf == False): temp2 = current_node.values for i in range(len(temp2)): if (value == temp2[i]): current_node = current_node.keys[i + 1] break elif (value < temp2[i]): current_node = current_node.keys[i] break elif (i + 1 == len(current_node.values)): current_node = current_node.keys[i + 1] break return current_node # Find the node def find(self, value, key): l = self.search(value) for i, item in enumerate(l.values): if item == value: if key in l.keys[i]: return True else: return False return False # Inserting at the parent def insert_in_parent(self, n, value, ndash): if (self.root == n): rootNode = Node(n.order) rootNode.values = [value] rootNode.keys = [n, ndash] self.root = rootNode n.parent = rootNode ndash.parent = rootNode return parentNode = n.parent temp3 = parentNode.keys for i in range(len(temp3)): if (temp3[i] == n): parentNode.values = parentNode.values[:i] + \ [value] + parentNode.values[i:] parentNode.keys = parentNode.keys[:i + 1] + [ndash] + parentNode.keys[i + 1:] if (len(parentNode.keys) > parentNode.order): parentdash = Node(parentNode.order) parentdash.parent = parentNode.parent mid = int(math.ceil(parentNode.order / 2)) - 1 parentdash.values = parentNode.values[mid + 1:] parentdash.keys = parentNode.keys[mid + 1:] value_ = parentNode.values[mid] if (mid == 0): parentNode.values = parentNode.values[:mid + 1] else: parentNode.values = parentNode.values[:mid] parentNode.keys = parentNode.keys[:mid + 1] for j in parentNode.keys: j.parent = parentNode for j in parentdash.keys: j.parent = parentdash self.insert_in_parent(parentNode, value_, parentdash) # Print the tree def printTree(tree): lst = [tree.root] level = [0] leaf = None flag = 0 lev_leaf = 0 node1 = Node(str(level[0]) + str(tree.root.values)) while (len(lst) != 0): x = lst.pop(0) lev = level.pop(0) if (x.check_leaf == False): for i, item in enumerate(x.keys): print(item.values) else: for i, item in enumerate(x.keys): print(item.values) if (flag == 0): lev_leaf = lev leaf = x flag = 1 record_len = 3 bplustree = BplusTree(record_len) bplustree.insert('5', '33') bplustree.insert('15', '21') bplustree.insert('25', '31') bplustree.insert('35', '41') bplustree.insert('45', '10') printTree(bplustree) if(bplustree.find('5', '34')): print("Found") else: print("Not found")
Output['15', '25'] ['35', '45'] ['5'] Not found
Time Complexity: O(log n)
Auxiliary Space: O(1)
Deletion in B+ Tree:
Deletion in a B+ tree involves removing a key-value pair, adjusting the structure of the tree, and potentially merging or redistributing nodes to maintain B+ tree properties.
Step-by-step algorithm:
- Define Node with order, values, keys, nextKey, parent, and check_leaf.
- Implement insert_at_leaf in Node to insert values and keys.
- Define BplusTree with root as leaf.
- Implement insert to add key-value, splitting leaf nodes if needed.
- Implement search to find leaf node with given value.
- Implement find to locate specific key-value pair.
- Implement insert_in_parent to handle parent node splitting.
- Implement delete to remove key-value, adjusting structure.
- Implement deleteEntry for deletion at various levels.
- Define printTree to visualize tree.
- Initialize B+ tree, insert key-value pairs, print tree structure, and search.
Below is the implementation of the above idea:
Python3 # B+ tee in python import math # Node creation class Node: def __init__(self, order): self.order = order self.values = [] self.keys = [] self.nextKey = None self.parent = None self.check_leaf = False # Insert at the leaf def insert_at_leaf(self, leaf, value, key): if (self.values): temp1 = self.values for i in range(len(temp1)): if (value == temp1[i]): self.keys[i].append(key) break elif (value < temp1[i]): self.values = self.values[:i] + [value] + self.values[i:] self.keys = self.keys[:i] + [[key]] + self.keys[i:] break elif (i + 1 == len(temp1)): self.values.append(value) self.keys.append([key]) break else: self.values = [value] self.keys = [[key]] # B plus tree class BplusTree: def __init__(self, order): self.root = Node(order) self.root.check_leaf = True # Insert operation def insert(self, value, key): value = str(value) old_node = self.search(value) old_node.insert_at_leaf(old_node, value, key) if (len(old_node.values) == old_node.order): node1 = Node(old_node.order) node1.check_leaf = True node1.parent = old_node.parent mid = int(math.ceil(old_node.order / 2)) - 1 node1.values = old_node.values[mid + 1:] node1.keys = old_node.keys[mid + 1:] node1.nextKey = old_node.nextKey old_node.values = old_node.values[:mid + 1] old_node.keys = old_node.keys[:mid + 1] old_node.nextKey = node1 self.insert_in_parent(old_node, node1.values[0], node1) # Search operation for different operations def search(self, value): current_node = self.root while(current_node.check_leaf == False): temp2 = current_node.values for i in range(len(temp2)): if (value == temp2[i]): current_node = current_node.keys[i + 1] break elif (value < temp2[i]): current_node = current_node.keys[i] break elif (i + 1 == len(current_node.values)): current_node = current_node.keys[i + 1] break return current_node # Find the node def find(self, value, key): l = self.search(value) for i, item in enumerate(l.values): if item == value: if key in l.keys[i]: return True else: return False return False # Inserting at the parent def insert_in_parent(self, n, value, ndash): if (self.root == n): rootNode = Node(n.order) rootNode.values = [value] rootNode.keys = [n, ndash] self.root = rootNode n.parent = rootNode ndash.parent = rootNode return parentNode = n.parent temp3 = parentNode.keys for i in range(len(temp3)): if (temp3[i] == n): parentNode.values = parentNode.values[:i] + \ [value] + parentNode.values[i:] parentNode.keys = parentNode.keys[:i + 1] + [ndash] + parentNode.keys[i + 1:] if (len(parentNode.keys) > parentNode.order): parentdash = Node(parentNode.order) parentdash.parent = parentNode.parent mid = int(math.ceil(parentNode.order / 2)) - 1 parentdash.values = parentNode.values[mid + 1:] parentdash.keys = parentNode.keys[mid + 1:] value_ = parentNode.values[mid] if (mid == 0): parentNode.values = parentNode.values[:mid + 1] else: parentNode.values = parentNode.values[:mid] parentNode.keys = parentNode.keys[:mid + 1] for j in parentNode.keys: j.parent = parentNode for j in parentdash.keys: j.parent = parentdash self.insert_in_parent(parentNode, value_, parentdash) # Delete a node def delete(self, value, key): node_ = self.search(value) temp = 0 for i, item in enumerate(node_.values): if item == value: temp = 1 if key in node_.keys[i]: if len(node_.keys[i]) > 1: node_.keys[i].pop(node_.keys[i].index(key)) elif node_ == self.root: node_.values.pop(i) node_.keys.pop(i) else: node_.keys[i].pop(node_.keys[i].index(key)) del node_.keys[i] node_.values.pop(node_.values.index(value)) self.deleteEntry(node_, value, key) else: print("Value not in Key") return if temp == 0: print("Value not in Tree") return # Delete an entry def deleteEntry(self, node_, value, key): if not node_.check_leaf: for i, item in enumerate(node_.keys): if item == key: node_.keys.pop(i) break for i, item in enumerate(node_.values): if item == value: node_.values.pop(i) break if self.root == node_ and len(node_.keys) == 1: self.root = node_.keys[0] node_.keys[0].parent = None del node_ return elif (len(node_.keys) < int(math.ceil(node_.order / 2)) and node_.check_leaf == False) or (len(node_.values) < int(math.ceil((node_.order - 1) / 2)) and node_.check_leaf == True): is_predecessor = 0 parentNode = node_.parent PrevNode = -1 NextNode = -1 PrevK = -1 PostK = -1 for i, item in enumerate(parentNode.keys): if item == node_: if i > 0: PrevNode = parentNode.keys[i - 1] PrevK = parentNode.values[i - 1] if i < len(parentNode.keys) - 1: NextNode = parentNode.keys[i + 1] PostK = parentNode.values[i] if PrevNode == -1: ndash = NextNode value_ = PostK elif NextNode == -1: is_predecessor = 1 ndash = PrevNode value_ = PrevK else: if len(node_.values) + len(NextNode.values) < node_.order: ndash = NextNode value_ = PostK else: is_predecessor = 1 ndash = PrevNode value_ = PrevK if len(node_.values) + len(ndash.values) < node_.order: if is_predecessor == 0: node_, ndash = ndash, node_ ndash.keys += node_.keys if not node_.check_leaf: ndash.values.append(value_) else: ndash.nextKey = node_.nextKey ndash.values += node_.values if not ndash.check_leaf: for j in ndash.keys: j.parent = ndash self.deleteEntry(node_.parent, value_, node_) del node_ else: if is_predecessor == 1: if not node_.check_leaf: ndashpm = ndash.keys.pop(-1) ndashkm_1 = ndash.values.pop(-1) node_.keys = [ndashpm] + node_.keys node_.values = [value_] + node_.values parentNode = node_.parent for i, item in enumerate(parentNode.values): if item == value_: p.values[i] = ndashkm_1 break else: ndashpm = ndash.keys.pop(-1) ndashkm = ndash.values.pop(-1) node_.keys = [ndashpm] + node_.keys node_.values = [ndashkm] + node_.values parentNode = node_.parent for i, item in enumerate(p.values): if item == value_: parentNode.values[i] = ndashkm break else: if not node_.check_leaf: ndashp0 = ndash.keys.pop(0) ndashk0 = ndash.values.pop(0) node_.keys = node_.keys + [ndashp0] node_.values = node_.values + [value_] parentNode = node_.parent for i, item in enumerate(parentNode.values): if item == value_: parentNode.values[i] = ndashk0 break else: ndashp0 = ndash.keys.pop(0) ndashk0 = ndash.values.pop(0) node_.keys = node_.keys + [ndashp0] node_.values = node_.values + [ndashk0] parentNode = node_.parent for i, item in enumerate(parentNode.values): if item == value_: parentNode.values[i] = ndash.values[0] break if not ndash.check_leaf: for j in ndash.keys: j.parent = ndash if not node_.check_leaf: for j in node_.keys: j.parent = node_ if not parentNode.check_leaf: for j in parentNode.keys: j.parent = parentNode # Print the tree def printTree(tree): lst = [tree.root] level = [0] leaf = None flag = 0 lev_leaf = 0 node1 = Node(str(level[0]) + str(tree.root.values)) while (len(lst) != 0): x = lst.pop(0) lev = level.pop(0) if (x.check_leaf == False): for i, item in enumerate(x.keys): print(item.values) else: for i, item in enumerate(x.keys): print(item.values) if (flag == 0): lev_leaf = lev leaf = x flag = 1 record_len = 3 bplustree = BplusTree(record_len) bplustree.insert('5', '33') bplustree.insert('15', '21') bplustree.insert('25', '31') bplustree.insert('35', '41') bplustree.insert('45', '10') printTree(bplustree) if(bplustree.find('5', '34')): print("Found") else: print("Not found")
Output['15', '25'] ['35', '45'] ['5'] Not found
Time Complexity: O(log n)
Auxiliary Space: O(1)
Similar Reads
B Tree in Python
A B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. Table of Content Characteristics of B-TreesTraversal of B-Tree in PythonSearch operation in B Tree in PythonInsert operation in B Tree in
14 min read
AVL Tree in Python
The AVL tree in Python is a selfâbalancing binary search tree that guarantees the difference of the heights of the left and right subtrees of a node is at most 1. The algorithm is named after its inventors, Georgy Adelson-Velsky, and Evgenii Landis who published their paper in 1962. The AVL tree kee
6 min read
Binary Tree in Python
Binary Tree is a non-linear and hierarchical data structure where each node has at most two children referred to as the left child and the right child. The topmost node in a binary tree is called the root, and the bottom-most nodes are called leaves. Representation of Binary TreeEach node in a Binar
9 min read
Tree Sort in Python
Tree sort is a sorting algorithm that builds a Binary Search Tree (BST) from the elements of the array to be sorted and then performs an in-order traversal of the BST to get the elements in sorted order. In this article, we will learn about the basics of Tree Sort along with its implementation in Py
3 min read
Red Black Tree in Python
Red Black Tree is a self-balancing binary search tree where each node has an extra bit representing its color, either red or black. By constraining how nodes are colored during insertions and deletions, red-black trees ensure the tree remains approximately balanced during all operations, allowing fo
11 min read
Fractal Trees in Python
Implementation of Fractal Binary Trees in python. Introduction A fractal tree is known as a tree which can be created by recursively symmetrical branching. The trunk of length 1 splits into two branches of length r, each making an angle q with the direction of the trunk. Both of these branches divid
3 min read
Binary Search Tree In Python
A Binary search tree is a binary tree where the values of the left sub-tree are less than the root node and the values of the right sub-tree are greater than the value of the root node. In this article, we will discuss the binary search tree in Python. What is a Binary Search Tree(BST)?A Binary Sear
11 min read
Insertion in a B+ tree
Prerequisite: Introduction of B+ treesIn this article, we will discuss that how to insert a node in B+ Tree. During insertion following properties of B+ Tree must be followed:Â Each node except root can have a maximum of M children and at least ceil(M/2) children.Each node can contain a maximum of M
15+ min read
B-Tree in Java
A B-tree is a self-balanced tree data structure that will maintain the sorted data and allow for operations such as insertion, deletion and search operations. B-tree is particularly well-suited for systems that need to perform disk-based operations and it minimizes the number of disk accesses requir
6 min read
Insert Operation in B-Tree
In this post, we'll discuss the insert() operation in a B-Tree. A new key is always inserted into a leaf node. To insert a key k, we start from the root and traverse down the tree until we reach the appropriate leaf node. Once there, the key is added to the leaf. Unlike Binary Search Trees (BSTs), n
15+ min read