Python – Flatten Nested Keys
Last Updated : 06 Apr, 2023
Sometimes, while working with Python data, we need to flatten certain keys in a list of nested records, so that every inner record becomes its own flat dictionary. This kind of problem often occurs during data preprocessing. Let us discuss some ways in which this task can be performed.
Method #1: Using loop
This is a brute-force method to perform this task. In this, we construct a new dictionary by assigning the base keys and then flatten the inner key elements using a nested loop.
Python3
# Each record holds the base keys plus a nested 'data' list
test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]},
             {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]

print("The original list is : " + str(test_list))

res = []
for sub in test_list:
    # Copy the base keys once per outer record
    temp1 = {'Gfg': sub['Gfg'], 'id': sub['id']}
    # Emit one flat dictionary per inner 'data' element
    for data in sub.get('data', []):
        res.append({**temp1, 'rating': data['rating'], 'price': data['price']})

print("The flattened list : " + str(res))
Output
The original list is : [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}, {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
The flattened list : [{'Gfg': 1, 'id': 1, 'rating': 7, 'price': 4}, {'Gfg': 1, 'id': 1, 'rating': 17, 'price': 8}, {'Gfg': 1, 'id': 2, 'rating': 18, 'price': 19}]
Time Complexity: O(n * m), where n is the number of elements in the outer list and m is the average number of elements in each inner ‘data’ list.
Auxiliary Space: O(n * m), with n and m as above. This is because the res list receives one flattened dictionary for every inner 'data' element, not one per outer element.
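The loop above hard-codes both the base keys and the inner keys. A minimal sketch of a more general variant follows, assuming each record holds a single nested list under a known key. The helper name flatten_records and its nested_key parameter are illustrative and are not part of the original method.

```python
def flatten_records(records, nested_key="data"):
    """Yield one flat dict per inner record, copying the outer keys into each."""
    for record in records:
        # Everything except the nested list is treated as a base key.
        base = {k: v for k, v in record.items() if k != nested_key}
        for inner in record.get(nested_key, []):
            # On a key clash, the inner value overwrites the outer one.
            yield {**base, **inner}


test_list = [
    {'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]},
    {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]},
]
res = list(flatten_records(test_list))
print("The flattened list : " + str(res))
```

Because the helper is a generator, the flattened records can also be consumed one at a time without building the whole list. A record without the nested key simply yields nothing.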
Method #2: Using list comprehension + zip() + itemgetter()
The combination of the above functions can be used to perform this task. In this, we extract the required values using itemgetter() and pair them with their keys using zip(). The result is assembled using a list comprehension.
Python3
from operator import itemgetter

test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]},
             {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]

print("The original list is : " + str(test_list))

# Keys taken from the outer record and from each inner record
base_keys = 'Gfg', 'id'
flatten_keys = 'rating', 'price'

# itemgetter(*keys) returns a tuple of values; zip() pairs them with the key names
res = [dict(zip(base_keys + flatten_keys,
                itemgetter(*base_keys)(sub) + itemgetter(*flatten_keys)(data)))
       for sub in test_list for data in sub['data']]

print("The flattened list : " + str(res))
Output
The original list is : [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}, {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
The flattened list : [{'Gfg': 1, 'id': 1, 'rating': 7, 'price': 4}, {'Gfg': 1, 'id': 1, 'rating': 17, 'price': 8}, {'Gfg': 1, 'id': 2, 'rating': 18, 'price': 19}]
Time complexity: O(n), where n is the total number of inner 'data' records across all outer dictionaries, since the list comprehension visits each of them once.
Auxiliary space: O(n), as the result list holds one flattened dictionary per inner record.
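One detail of itemgetter() is worth keeping in mind here: with two or more keys it returns a tuple, but with exactly one key it returns the bare value. The sketch below illustrates this and shows one possible workaround. The helper pick is hypothetical and only stands in for itemgetter() when the number of keys may be one.

```python
from operator import itemgetter

record = {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}

# With two or more keys, itemgetter returns a tuple of the values.
pair = itemgetter('Gfg', 'id')(record)

# With exactly one key, it returns the bare value, not a 1-tuple,
# so the `+` used in Method #2 would no longer concatenate tuples.
single = itemgetter('id')(record)


def pick(keys, mapping):
    """Hypothetical helper: always return a tuple, whatever len(keys) is."""
    return tuple(mapping[k] for k in keys)


base_keys = ('id',)
flatten_keys = ('rating', 'price')
res = [
    dict(zip(base_keys + flatten_keys,
             pick(base_keys, record) + pick(flatten_keys, data)))
    for data in record['data']
]
print(pair, single, res)
```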
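Method #3: Using update() method
In this approach, we iterate over each inner dictionary in 'data' and use update() to copy the outer 'id' and 'Gfg' values into it. Note that update() modifies the inner dictionaries in place, so the dictionaries inside the original test_list are changed as well.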
Python3
test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]},
             {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]

print("The original list is : " + str(test_list))

res = []
for sub in test_list:
    for j in sub["data"]:
        # update() adds the outer keys to the inner dictionary in place
        j.update({"id": sub["id"]})
        j.update({"Gfg": sub["Gfg"]})
        res.append(j)

print("The flattened list : " + str(res))
Output
The original list is : [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}, {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
The flattened list : [{'rating': 7, 'price': 4, 'id': 1, 'Gfg': 1}, {'rating': 17, 'price': 8, 'id': 1, 'Gfg': 1}, {'rating': 18, 'price': 19, 'id': 2, 'Gfg': 1}]
Time complexity: O(n*m), where n is the length of the input list and m is the average length of the “data” list in each dictionary.
Auxiliary Space: O(nm), because we are creating a new list to store the flattened data, which can have a size of up to nm elements.
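Since update() works in place, the code in this method also changes the dictionaries stored inside test_list. If the input has to stay intact, one option is to copy each inner dictionary before updating it. The following is only a sketch of that idea; the snapshot variable exists purely to check that the input was not modified.

```python
import copy

test_list = [
    {'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]},
    {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]},
]
snapshot = copy.deepcopy(test_list)  # kept only to compare with the input afterwards

res = []
for sub in test_list:
    for inner in sub['data']:
        flat = dict(inner)  # shallow copy, so the input dictionary is left alone
        flat.update({'id': sub['id'], 'Gfg': sub['Gfg']})
        res.append(flat)

print("The flattened list : " + str(res))
print("Input unchanged : " + str(test_list == snapshot))
```

A shallow copy is enough here because the inner values are plain numbers. Nested mutable values would still be shared between the input and the result.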
Method #4: Using recursion
We can also flatten the nested keys using recursion. Here’s how we can implement it:
- Define a function flatten_keys that takes a dictionary as an input.
- Initialize an empty dictionary result.
- Loop through each key-value pair in the dictionary.
- If the value is a dictionary, call flatten_keys recursively with the value as the input and add each returned pair to result, prefixing its key with the current key and a dot.
- If the value is a list, loop through each element in the list, call flatten_keys recursively with the element as the input, and add each returned pair to result, prefixing its key with the current key and the element's index.
- If the value is not a dictionary or list, add it to the result dictionary with the same key.
- Return the result dictionary.
Python3
def flatten_keys(d):
    result = {}
    for k, v in d.items():
        if isinstance(v, dict):
            # Nested dictionary: prefix each of its flattened keys with k
            flat_v = flatten_keys(v)
            for flat_k, flat_val in flat_v.items():
                result[k + '.' + flat_k] = flat_val
        elif isinstance(v, list):
            # List of dictionaries: include the element index in the key
            for i, item in enumerate(v):
                flat_item = flatten_keys(item)
                for flat_k, flat_val in flat_item.items():
                    result[f"{k}.{i}.{flat_k}"] = flat_val
        else:
            result[k] = v
    return result

test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]},
             {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]

flattened_list = []
for d in test_list:
    flattened_dict = flatten_keys(d)
    flattened_list.append(flattened_dict)

print("The flattened list : " + str(flattened_list))
Output
The flattened list : [{'Gfg': 1, 'id': 1, 'data.0.rating': 7, 'data.0.price': 4, 'data.1.rating': 17, 'data.1.price': 8}, {'Gfg': 1, 'id': 2, 'data.0.rating': 18, 'data.0.price': 19}]
Time complexity of this approach is O(n), where n is the number of elements in the nested dictionary.
Auxiliary space required is also O(n), where n is the number of elements in the flattened dictionary.
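The recursive function in this method assumes that every list element is a dictionary; a list of plain values such as ['a', 'b'] would raise an AttributeError because strings have no items() method. A minimal sketch of a variant that also keeps scalar list elements is shown below. The name flatten_any and its prefix and sep parameters are assumptions made for illustration.

```python
def flatten_any(value, prefix="", sep="."):
    """Sketch: flatten dicts and lists, keeping scalars found inside lists."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Leaf value: the accumulated path becomes its key.
        return {prefix: value}
    result = {}
    for key, child in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        result.update(flatten_any(child, path, sep))
    return result


record = {'id': 1, 'tags': ['a', 'b'], 'data': [{'rating': 7, 'price': 4}]}
flat = flatten_any(record)
print(flat)
```

Passing the path down as an argument means each leaf is written to the result once with its full key, instead of being re-keyed at every level on the way back up.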
Method #5: Using a stack
We can flatten the nested dictionaries using a stack data structure. We start with an empty dictionary and a stack containing the input dictionary. We pop a dictionary from the stack, iterate through its key-value pairs, and if the value is a dictionary, we add the current key to each of its keys and push it onto the stack. If the value is a list, we iterate through each element of the list and add the current index and key to its keys and push it onto the stack. Otherwise, we add the key-value pair to the flattened dictionary.
Step-by-step approach:
- Define a function named flatten_keys_stack that takes in a dictionary as its only argument.
- Create an empty dictionary named flattened_dict to hold the flattened key-value pairs.
- Create a list named stack containing a tuple of the input dictionary and an empty string (representing the current prefix for keys).
- While the stack is not empty, pop a tuple from the stack and unpack it into curr_dict and prefix.
- Iterate through the key-value pairs of curr_dict.
- If the value is a dictionary, add the current key to the prefix and push the dictionary and new prefix onto the stack.
- If the value is a list, iterate through each element of the list, add the current index and key to the prefix, and push the element and new prefix onto the stack.
- Otherwise, add the prefix and current key as a concatenated string as the key to flattened_dict, with the value being the current value.
- Return flattened_dict.
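- Call flatten_keys_stack on each dictionary in test_list, in the same way flatten_keys was called in Method #4.
- Run the code and verify that each flattened dictionary contains the same key-value pairs as in Method #4. The key order can differ, because the stack processes the most recently pushed element first.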
Python3
def flatten_keys_stack(d):
    flattened_dict = {}
    # Each stack entry is (dictionary to process, key prefix accumulated so far)
    stack = [(d, '')]
    while stack:
        curr_dict, prefix = stack.pop()
        for k, v in curr_dict.items():
            if isinstance(v, dict):
                stack.append((v, prefix + k + '.'))
            elif isinstance(v, list):
                for i, item in enumerate(v):
                    stack.append((item, prefix + k + '.' + str(i) + '.'))
            else:
                flattened_dict[prefix + k] = v
    return flattened_dict

test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]},
             {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]

flattened_list = []
for d in test_list:
    flattened_dict = flatten_keys_stack(d)
    flattened_list.append(flattened_dict)
    print("Flattened dictionary: " + str(flattened_dict))

print("The flattened list: " + str(flattened_list))
Output
Flattened dictionary: {'Gfg': 1, 'id': 1, 'data.1.rating': 17, 'data.1.price': 8, 'data.0.rating': 7, 'data.0.price': 4}
Flattened dictionary: {'Gfg': 1, 'id': 2, 'data.0.rating': 18, 'data.0.price': 19}
The flattened list: [{'Gfg': 1, 'id': 1, 'data.1.rating': 17, 'data.1.price': 8, 'data.0.rating': 7, 'data.0.price': 4}, {'Gfg': 1, 'id': 2, 'data.0.rating': 18, 'data.0.price': 19}]
The time and auxiliary space complexity of this method are both O(N), where N is the total number of key-value pairs in the input dictionary.
The auxiliary space comes from the use of the stack to traverse the nested dictionaries and lists.
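In the output of this method, the 'data.1' keys appear before the 'data.0' keys, because a stack returns the most recently pushed element first. If the original key order matters, one possible adjustment is to push every value (not only dictionaries) onto the stack and to push the children in reverse. This is a sketch under that assumption; the function name flatten_keys_ordered and the sep parameter are illustrative.

```python
def flatten_keys_ordered(d, sep="."):
    """Iterative flatten that keeps the order a recursive walk would produce."""
    flat = {}
    stack = [("", d)]  # entries are (key path so far, value to process)
    while stack:
        path, value = stack.pop()
        if isinstance(value, dict):
            children = list(value.items())
        elif isinstance(value, list):
            children = [(str(i), item) for i, item in enumerate(value)]
        else:
            # Leaf value: store it under its full key path.
            flat[path] = value
            continue
        # Push in reverse so that the first child is popped first.
        for key, child in reversed(children):
            stack.append((f"{path}{sep}{key}" if path else key, child))
    return flat


record = {'Gfg': 1, 'id': 1,
          'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}
flat = flatten_keys_ordered(record)
print("Flattened dictionary: " + str(flat))
```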
Method #6: Using pandas json_normalize()
Step-by-step approach:
- Import the pandas library with the alias pd.
- Define the flatten_dict_pandas() function that takes a dictionary d as input.
- Inside the function, use pd.json_normalize() to convert the dictionary to a one-row pandas DataFrame. Nested dictionaries are flattened, with their keys joined by the sep string to form the column names. Values that are lists, such as 'data' here, are kept as they are unless the record_path and meta arguments are passed.
- Then, use the to_dict() method of the DataFrame to convert it back to a dictionary, with orient='records' to return a list of dictionaries, and take the first dictionary from the list.
- Return the flattened dictionary.
- Define an input list test_list that contains nested dictionaries and lists.
- Create an empty list flattened_list to store the flattened dictionaries.
- Loop through each dictionary d in test_list.
- Call flatten_dict_pandas() on d to flatten it.
- Append the flattened dictionary to flattened_list.
- Print the flattened dictionary.
- After the loop, print the entire flattened list.
Python3
import pandas as pd

def flatten_dict_pandas(d):
    # json_normalize returns a one-row DataFrame for a single dictionary
    df = pd.json_normalize(d, sep='.')
    # Convert the row back into a plain dictionary
    return df.to_dict(orient='records')[0]

test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]},
             {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]

flattened_list = []
for d in test_list:
    flattened_dict = flatten_dict_pandas(d)
    flattened_list.append(flattened_dict)
    print("Flattened dictionary: " + str(flattened_dict))

print("The flattened list: " + str(flattened_list))
Output
Flattened dictionary: {'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}
Flattened dictionary: {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}
The flattened list: [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}, {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
The time complexity of this method is O(nlogn), where n is the total number of keys and values in the dictionary.
The auxiliary space is also O(nlogn), due to the creation of the pandas DataFrame.