How to Remove Characters from String in Python

Software developers often need to remove characters from a string in Python. This is necessary when the string data you receive differs from what you need to process or display, and it is a routine part of data cleansing. Python strings are immutable, that is, they cannot be modified in place. So you need to create a new string that omits the characters you want to remove from the original string. In this article, we will learn the most common ways to remove characters from a string in Python.

How to Remove Characters from String in Python

There are many ways to remove or replace one or more characters in a Python string. Let us look at the most common methods.

1. Using replace()

As the name suggests, the replace() method replaces a character or substring in a string. It can be called directly on any string literal or variable. Here is its syntax.

string.replace(old_character,new_character)

It returns a new string in which the specified character or substring has been replaced; the original string is left unchanged. If you want to update the original variable, assign the result of replace() back to it.

a = "good morning"
a=a.replace('o','')
print(a) # gd mrning

In the above code, replace() finds every occurrence of ‘o’ and replaces it with the empty string '', effectively removing it. Optionally, you can pass the maximum number of replacements as the third argument.

Here is an example to replace only 1 occurrence of ‘o’, not all occurrences.

a = "good morning"
a=a.replace('o','',1)
print(a) # god morning
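
If you need to remove several different characters, one option is to call replace() once per character. The sketch below assumes a hypothetical helper named remove_chars(); it is not part of Python itself, just an illustration of chaining replace() calls.

```python
def remove_chars(text, unwanted):
    """Return text with every character in `unwanted` removed (hypothetical helper)."""
    for ch in unwanted:
        # Each call returns a new string; the string passed in is never modified.
        text = text.replace(ch, "")
    return text

original = "good morning"
cleaned = remove_chars(original, "og")
print(cleaned)   # d mrnin
print(original)  # good morning
```

Because strings are immutable, the variable original still holds the full text after the call.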

2. Using For loop

In the above case, replace() always works from left to right, so even with a count it can only remove the first few occurrences. What if you want to remove only the second occurrence of the character? In such cases, it is better to use the good old for loop, where you iterate through the string one character at a time and skip the ones you do not want.

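a = "good morning"
b = ""  # result string
c = 0   # number of 'o' characters seen so far
for i in a:
    if i == 'o':
        c = c + 1
    # keep every character except the second 'o'
    if not (i == "o" and c == 2):
        b += i
print(b) # god morning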

In the above code, we loop through the string one character at a time and check whether each character is ‘o’. For every occurrence of the letter ‘o’, we increment the count variable c by 1. Unless the current letter is ‘o’ and c equals 2, we append the letter to the string b, which starts out empty. When the for loop completes, b contains the original text without the second occurrence of the letter ‘o’.
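
The same idea can be wrapped in a reusable function. This is only a sketch: the name remove_nth_occurrence() and its parameters are made up for illustration, and it collects characters in a list before joining them instead of concatenating strings.

```python
def remove_nth_occurrence(text, target, n):
    """Return text without the n-th occurrence (counting from 1) of target."""
    count = 0
    kept = []
    for ch in text:
        if ch == target:
            count += 1
            if count == n:
                continue  # skip only the n-th match
        kept.append(ch)
    return "".join(kept)

print(remove_nth_occurrence("good morning", "o", 2))  # god morning
print(remove_nth_occurrence("good morning", "o", 3))  # good mrning
print(remove_nth_occurrence("good morning", "o", 5))  # good morning
```

If the string has fewer than n occurrences, it is returned unchanged.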

3. Using List Comprehension

List comprehension is a more concise, and often faster, way to iterate through the string and keep only the characters you want. Here is an example to remove ‘d’ from the string.

a='good morning'
a = "".join([c for c in a if c != "d"])
print(a) # goo morning

In the above code, we use a list comprehension to collect the characters of the original string that are not ‘d’. This generates a list of all characters except ‘d’. We then use the join() function to concatenate these characters into a new string.
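
The condition in the comprehension can test for more than one character. As a small sketch, the example below assumes you want to drop all lowercase vowels; the variable name unwanted is just an illustrative choice.

```python
unwanted = set("aeiou")  # hypothetical set of characters to drop
a = "good morning"
# Keep only the characters that are not in the unwanted set.
result = "".join([c for c in a if c not in unwanted])
print(result)  # gd mrnng
```

Using a set keeps the membership test fast even when the list of unwanted characters grows.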

4. Using filter()

If you want to remove characters based on specific conditions, and your string is very large, then you can use the filter() function. It returns an iterator, which yields the matching characters one at a time instead of building an intermediate list. The result can be converted back to a string using the join() function, as done earlier.

a = "good morning"
a = "".join(filter(lambda c: c != "o", a))
print(a) # gd mrning

An iterator is an object that produces a sequence of values one at a time. The values are not all stored in memory at once; each one is computed only when you ask for the next item. Note that join() still has to collect all the characters to build the result string, so the saving matters most when you consume the iterator piece by piece.
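
To see this lazy behavior, you can pull items from the filter object by hand with the built-in next() function. The following is a minimal sketch; the variable names kept and rest are illustrative.

```python
a = "good morning"
kept = filter(lambda c: c != "o", a)  # nothing has been examined yet

first = next(kept)    # examines 'g' and returns it
second = next(kept)   # skips both 'o' characters and returns 'd'
rest = "".join(kept)  # consumes whatever is left in the iterator

print(first)       # g
print(second)      # d
print(repr(rest))  # ' mrning'
```

An iterator can be consumed only once, so calling join() on kept a second time would give an empty string.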

5. Using Slicing

Sometimes, you may need to remove a character at a specific position. In such cases, you can use slicing in Python. Slicing allows you to extract specific parts of a string, based on index values. You can also use it to remove characters.

Here is its syntax. Both starting and ending indexes below are optional.

string[starting_index:ending_index]

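Here is an example to remove the character located at index 3. The slice a[:index] keeps everything before that position and a[index+1:] keeps everything after it.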

a='good morning'
index=3
a = a[:index]+a[index+1:]
print(a) # goo morning
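
The slicing approach can be wrapped in small helpers. The functions remove_at() and remove_range() below are hypothetical names used only for this sketch. The index check is there because a negative index would give a wrong result with this slicing formula.

```python
def remove_at(text, index):
    """Return text without the character at index (hypothetical helper)."""
    if not 0 <= index < len(text):
        return text  # negative or too-large index: leave the text unchanged
    return text[:index] + text[index + 1:]

def remove_range(text, start, end):
    """Return text without the characters from start up to, not including, end."""
    return text[:start] + text[end:]

print(remove_at("good morning", 3))        # goo morning
print(remove_range("good morning", 0, 5))  # morning
print(remove_at("good morning", 50))       # good morning
```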

6. Using strip()

Sometimes your text may contain lots of whitespace characters at the beginning and end of the string. In such cases, you can use the strip() function to remove whitespace characters at both the start and end of the string. You can call it directly on any string literal or variable.

a='  good morning\n\t   '
a = a.strip()
print(a) # good morning

In the above case, strip() has removed the spaces as well as the newline and tab characters. If your string has different leading and trailing characters, you can pass those characters to strip().

a=',,,l.good morning;;;,'
a = a.strip(",.l;")
print(a) # good morning

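In the above code, we have removed the comma (,), dot (.), ‘l’ and semicolon (;) characters from both ends of the string. Note that strip() treats its argument as a set of characters to remove, not as a substring, so their order does not matter. This is a very simple way to clean up unnecessary characters around your text.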

Similarly, if you want to remove only the leading characters, use lstrip().

a=',,,l.good morning;;;,'
a = a.lstrip(",.l;")
print(a) # good morning;;;,

a=' good morning '
a = a.lstrip()
print(a) # 'good morning '

On the other hand, if you want to remove trailing characters, use rstrip() function.

a=',,,l.good morning;;;,'
a = a.rstrip(",.l;")
print(a) # ,,,l.good morning

a=' good morning '
a = a.rstrip()
print(a) # ' good morning'
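
One thing to watch out for is that strip(), lstrip() and rstrip() treat their argument as a set of individual characters, not as a prefix or suffix. The sketch below illustrates the difference. It also uses removeprefix() and removesuffix(), which are string methods available only in Python 3.9 and later, so treat that part as an assumption about your Python version.

```python
a = "good morning"

# The argument is a set of characters, not a prefix or suffix.
left = a.lstrip("god ")  # strips 'g', 'o', 'o', 'd', ' ' and stops at 'm'
right = a.rstrip("ing")  # strips 'g', 'n', 'i', 'n' and stops at 'r'
print(left)   # morning
print(right)  # good mor

# Exact prefix/suffix removal (requires Python 3.9 or newer).
no_prefix = a.removeprefix("good ")
no_suffix = a.removesuffix("ing")
print(no_prefix)  # morning
print(no_suffix)  # good morn
```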

7. Using Regular Expressions

Sometimes, you may need to remove only digits, or only letters, from a text. This requirement is a little more complicated and can lead to a tedious solution if you use loops and list comprehensions. In such cases, you can use regular expressions along with the sub() function from the re module. Here is its syntax.

re.sub(pattern, repl, string, count=0, flags=0)

Here is an example to use regular expressions to remove only numbers from a string.

import re
a = "good4 morning3"
a = re.sub("[0-9]", "", a)
print(a) # good morning

In the above code, we use regular expression [0-9] in re.sub() function to find and replace numbers with empty string.

Here is an example to remove lowercase letters from a string, using the [a-z] regular expression. Note that [a-z] does not match capital letters.

a="good4 morning3"
import re
a=re.sub("[a-z]","",a)
print(a) # 4 3

You can also use this solution to remove only capital letters from a string. For this, we use the regular expression [A-Z].

a="Good4 Morning3"
import re
a=re.sub("[A-Z]","",a)
print(a) # ood4 orning3

You can even combine the above patterns into a single regular expression that removes only lowercase letters and digits.

a="Good4 Morning3"
import re
a=re.sub("[a-z0-9]","",a)
print(a) # G M

As you can see, using regular expressions is a superb way to easily clean up your data, and remove characters located at really complicated positions. You just need to come up with the right regular expression.
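
The count and flags parameters shown in the syntax of re.sub() can narrow or widen what gets removed. The sketch below uses the same sample string; the result variable names are illustrative, and the last pattern uses a negated character class [^...] to remove everything except letters and spaces.

```python
import re

a = "Good4 Morning3"

# count=1 removes only the first digit that matches.
first_digit_removed = re.sub("[0-9]", "", a, count=1)
print(first_digit_removed)  # Good Morning3

# re.IGNORECASE makes [a-z] match capital letters as well.
letters_removed = re.sub("[a-z]", "", a, flags=re.IGNORECASE)
print(letters_removed)  # 4 3

# A negated class removes everything that is NOT a letter or a space.
only_letters = re.sub("[^a-zA-Z ]", "", a)
print(only_letters)  # Good Morning
```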

Conclusion

In this article, we have learnt several different ways to remove characters from a string in Python. If you want to find and remove one or more occurrences of a specific character, you can use the replace() function. If you want to remove a specific occurrence (e.g. the 3rd occurrence), then you can use a for loop to iterate through the string and skip the desired character. If you want to remove the character at a specific position, then you can use slicing. If you want to remove leading/trailing whitespace characters, then use the strip() function. If you want to remove whole classes of characters, such as digits or letters, then use regular expressions with re.sub(). If you are working with a large string, then you can try the filter() function with a lambda, which returns a memory-efficient iterator. Depending on your use case, you will need to pick the appropriate solution.
