Python Program To Remove all control characters
Last Updated : 14 Feb, 2023
In telecommunications and computing, control characters are non-printable characters that are part of a character set. They do not represent a written symbol; instead, they are used in signaling to cause effects other than adding symbols to text, such as a line feed or a carriage return. Removing them is a common string-cleaning task. In this article, we will discuss how to remove all control characters from a string.
Example:
Input : test_str = 'Geeks\0\r for \n\bge\tee\0ks\f'
Output : Geeks for geeeks
Explanation : \n, \0, \f, \r, \b, \t being control characters are removed from string.
Input : test_str = 'G\0\r\n\fg'
Output : Gg
Explanation : \0, \r, \n and \f being control characters are removed from string, giving Gg as output. Note that \f is the form feed escape sequence, so the string contains no literal "f".
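Control characters are invisible when printed, so examples like the ones above are easier to verify with repr() and ord(). The following is a small sketch for inspection only; the helper name show_controls is hypothetical and not part of the original article.

```python
def show_controls(text):
    """Return (index, code point) pairs for each character below code point 32."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if ord(ch) < 32]

test_str = 'G\0\r\n\fg'

# repr() shows escape sequences instead of sending raw control codes to the terminal
print(repr(test_str))           # 'G\x00\r\n\x0cg'
print(show_controls(test_str))  # [(1, 0), (2, 13), (3, 10), (4, 12)]
```

The second line of output confirms that position 4 holds code point 12 (form feed) rather than the letter "f".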
Method 1 : Using translate().
The logic applied here is that the ASCII control characters occupy the first 32 code points (0 to 31). A translation table that maps each of these code points to None makes translate() delete them and leave every other character unchanged. DEL (code point 127) lies outside this range and is not removed.
Python3
# Python3 code to demonstrate working of
# Remove all control characters
# Using translate()

# initializing string
test_str = 'Geeks\0\r for \n\bge\tee\0ks\f'

# printing original string
print("The original string is : " + str(test_str))

# using translate() and fromkeys()
# to delete all control characters:
# code points 0-31 are mapped to None
mapping = dict.fromkeys(range(32))
res = test_str.translate(mapping)

# printing result
print("String after removal of control characters : " + str(res))
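Output:
String after removal of control characters : Geeks for geeeks
The line that prints the original string is not reproduced here: it contains the raw control characters, so its appearance depends on the terminal (for example, \r moves the cursor back to the start of the line, and later text overwrites earlier text).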
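A table built from range(32) covers only the C0 control codes. If DEL (127) and the C1 range (128 to 159) should be removed as well, the same translate() approach can be extended. This is a minimal sketch assuming that wider definition is wanted; the names CONTROL_TABLE and strip_controls are hypothetical.

```python
from itertools import chain

# Map C0 controls (0-31), DEL (127) and C1 controls (128-159) to None,
# which tells str.translate() to delete those characters.
CONTROL_TABLE = dict.fromkeys(chain(range(32), range(127, 160)))

def strip_controls(text):
    """Remove C0, DEL and C1 control characters from text."""
    return text.translate(CONTROL_TABLE)

# Same sample string, with DEL (\x7f) and NEL (\x85) appended
print(strip_controls('Geeks\0\r for \n\bge\tee\0ks\f\x7f\x85'))  # Geeks for geeeks
```

Building the table once at module level avoids recreating the dictionary on every call.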
Method 2: Using unicodedata library
In this, unicodedata.category() returns the two-letter Unicode general category of each character. Categories beginning with "C" cover control characters (Cc) as well as format (Cf), surrogate (Cs), private-use (Co) and unassigned (Cn) code points, so keeping only characters whose category does not start with "C" removes all of them. This is broader than the ASCII control characters alone.
Python3
# Python3 code to demonstrate working of
# Remove all control characters
# Using unicodedata library
import unicodedata

# initializing string
test_str = 'Geeks\0\r for \n\bge\tee\0ks\f'

# printing original string
print("The original string is : " + str(test_str))

# removing all control characters
# by skipping every character whose category starts with C
res = "".join(char for char in test_str if unicodedata.category(char)[0] != "C")

# printing result
print("String after removal of control characters : " + str(res))
Output:
String after removal of control characters : Geeks for geeeks
The first printed line contains the raw control characters, so how it is displayed varies by terminal.
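Filtering on the first letter of the category drops every "C" category, not only Cc, and it also removes newlines and tabs. The sketch below removes only category Cc and can keep selected whitespace; the function name remove_cc and its keep parameter are hypothetical, introduced here for illustration.

```python
import unicodedata

def remove_cc(text, keep="\n\t"):
    """Drop characters in category Cc, except those listed in keep."""
    return "".join(
        ch for ch in text
        if ch in keep or unicodedata.category(ch) != "Cc"
    )

# Newline and tab survive; the other control characters are removed
print(repr(remove_cc('Geeks\0\r for \n\bge\tee\0ks\f')))
# 'Geeks for \nge\teeks'

# With keep="" every Cc character is removed
print(remove_cc('Geeks\0\r for \n\bge\tee\0ks\f', keep=""))
# Geeks for geeeks
```

Format characters such as the zero-width joiner (U+200D, category Cf) pass through this version unchanged, which may matter for text containing emoji sequences or some scripts.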
Method 3: Using Regular Expression
In this, the re library's sub() function replaces every character matched by the character class [\x00-\x1f], that is, code points 0 through 31, with an empty string.
Python3
# Python3 code to demonstrate working of
# Remove all control characters
# Using Regular Expression
import re

# initializing string
test_str = 'Geeks\0\r for \n\bge\tee\0ks\f'

# printing original string
print("The original string is : " + str(test_str))

# removing all control characters
# using sub()
res = re.sub(r'[\x00-\x1f]', '', test_str)

# printing result
print("String after removal of control characters : " + str(res))
# This code is contributed by Edula Vinay Kumar Reddy
Time Complexity: O(N), where N is the length of the string
Space Complexity: O(N), for the newly built result string
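The three methods agree on the sample string but not on every input. The sketch below wraps each one in a function and compares them; the function names and the C0_PATTERN constant are hypothetical, chosen only for this comparison.

```python
import re
import unicodedata

# Compile once so repeated calls reuse the same pattern object
C0_PATTERN = re.compile(r'[\x00-\x1f]')

def via_translate(text):
    """Method 1: delete code points 0-31 with a translation table."""
    return text.translate(dict.fromkeys(range(32)))

def via_unicodedata(text):
    """Method 2: drop characters whose category starts with C."""
    return "".join(ch for ch in text if unicodedata.category(ch)[0] != "C")

def via_regex(text):
    """Method 3: delete code points 0-31 with a regular expression."""
    return C0_PATTERN.sub('', text)

sample = 'Geeks\0\r for \n\bge\tee\0ks\f'
for func in (via_translate, via_unicodedata, via_regex):
    print(func.__name__, ':', func(sample))  # each prints Geeks for geeeks

# DEL (\x7f) is category Cc but lies outside 0-31, so the methods diverge
print(repr(via_translate('a\x7fb')))    # 'a\x7fb'
print(repr(via_unicodedata('a\x7fb')))  # 'ab'
print(repr(via_regex('a\x7fb')))        # 'a\x7fb'
```

Which behavior is preferable depends on whether the input is plain ASCII or arbitrary Unicode text.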