Python replace characters not in list

The re.sub() function in Python's re module searches a string for a pattern and replaces each match with another value.

Syntax:

re.sub(pattern, replacement, string, count=0, flags=0)

First, let us look at what each parameter means:
pattern: The regular expression to search for in the given string.
replacement: The string that replaces each match. It can also be a function that receives the match object and returns the replacement string.
string: The string on which you want to perform the operation.
count: The maximum number of matches to replace. It is optional; the default value 0 replaces every match.
flags: Optional regex flags that modify how the pattern is matched, such as re.IGNORECASE.
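
A minimal sketch of the two optional parameters, using a made-up sample string:

```python
import re

text = "Cat cat CAT dog"

# count=1 replaces only the first match; later matches are left alone.
first_only = re.sub("cat", "bird", text, count=1)

# flags=re.IGNORECASE makes the pattern match regardless of letter case.
all_cases = re.sub("cat", "bird", text, flags=re.IGNORECASE)

print(first_only)  # Cat bird CAT dog
print(all_cases)   # bird bird bird dog
```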

Input:

import re  

str = "[email protected]"
print(re.sub("[a-z]*@", "[email protected]", str))

Output:

[email protected]
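
The same function can replace every character that is not in a given list by using a negated character class, [^...]. The sketch below is one way to do it; the helper name replace_not_in and the sample values are assumptions for illustration, and the list of allowed characters is assumed to be non-empty.

```python
import re

def replace_not_in(text, allowed, replacement=""):
    # re.escape protects characters such as "]", "^" or "-" that are
    # special inside a character class.
    pattern = "[^" + re.escape("".join(allowed)) + "]"
    return re.sub(pattern, replacement, text)

allowed_chars = ["a", "b", "c", "-"]
cleaned = replace_not_in("a1b2-c3!", allowed_chars, "_")
print(cleaned)  # a_b_-c__
```

Leaving replacement at its default empty string deletes the unwanted characters instead of substituting them.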

Replacing multiple patterns using regex

We can also replace multiple patterns in a single call. Join the patterns with the alternation operator | inside one pattern string, as in the following syntax.

Syntax:

re.sub("pattern_1|pattern_2", replacement, string, count=0, flags=0)

Input:

import re  

text = "Joe-Kim Ema Max Aby Liza"
# No spaces around "|": a space inside the pattern is matched literally.
print(re.sub(r"\s|-", ", ", text))

Output:

Joe, Kim, Ema, Max, Aby, Liza

Replacing multiple patterns with multiple replacements using regex

If you want to replace multiple patterns with different replacements, regex can handle that too. Pass a function as the replacement argument, as the following example shows.

Input:

import re  

def convert_case(match_obj):
  if match_obj.group(1) is not None:
    return match_obj.group(1).lower()
  if match_obj.group(2) is not None:
    return match_obj.group(2).upper()

text = "jOE kIM mAx ABY lIzA"
# No spaces around "|"; group 1 captures uppercase runs, group 2 lowercase runs.
print(re.sub(r"([A-Z]+)|([a-z]+)", convert_case, text))

In this example, the string contains uppercase and lowercase letters that we need to swap: uppercase becomes lowercase and vice versa.
To do that, the pattern defines two groups, and the replacement function checks which group matched and converts its case accordingly.

Output:

Joe Kim MaX aby LiZa

Closing thoughts

To replace text in a Python string by pattern, use the re.sub() function. It belongs to the built-in re module, which must be imported first, and it returns a new string in which every match of the pattern has been replaced with the given replacement.

I don’t think this restriction needs to be solved, because wanting one replacement to capture the result of another looks like a very rare use case, where a replacement is applied on top of another. For that rare occasion you could just chain two calls to replace. I also created this restriction to make the replacement algorithm faster.

methane:

I think there are some TRIE implementations in PyPI.

I see how a trie can solve this problem, but memory usage would be a concern, and linked lists cannot take advantage of sequential memory caching (the complexity is lower, but the runtime can be larger).
Maybe a linear sweep similar to the current implementation of replace is a better solution.
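
To make the trie idea concrete, here is a small dict-based sketch. It is not one of the PyPI implementations mentioned above, and the names build_trie and trie_replace are hypothetical. It prefers the longest target at each position and says nothing about the memory and caching concerns raised in the post.

```python
def build_trie(changes):
    # Each node is a dict of child characters; the key None stores the
    # replacement for a target that ends at that node.
    root = {}
    for old, new in changes:
        node = root
        for ch in old:
            node = node.setdefault(ch, {})
        node[None] = new
    return root

def trie_replace(text, changes):
    root = build_trie(changes)
    out = []
    i = 0
    while i < len(text):
        node = root
        best_end = None
        best_new = None
        j = i
        # Walk the trie as far as the text allows, remembering the
        # longest complete target seen so far.
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if None in node:
                best_end, best_new = j, node[None]
        if best_end is None:
            out.append(text[i])  # no target starts here; copy one character
            i += 1
        else:
            out.append(best_new)
            i = best_end
    return "".join(out)

result = trie_replace(
    "red pepper and pepper",
    [("pepper", "tomato"), ("red pepper", "chili")],
)
print(result)  # chili and tomato
```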

steven.daprano:

Having str.replace support multiple targets (but with a single
replacement) has been suggested many times before, and it has always
foundered over the problem of what to do when the targets overlap.

text.replace(('pepper', 'red pepper', 'green pepper'), 'tomato')

The conclusion was always to recommend that if your replacement needs
were more complex than just changing a single substring at a time, you
should move to using regular expressions.
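
The overlap problem can be seen with the targets from the quote. The sketch below compares a loop of str.replace calls with a single re.sub pass. The sample text is made up, and sorting the targets longest first is one possible policy, not something the thread prescribes.

```python
import re

targets = ["pepper", "red pepper", "green pepper"]
text = "red pepper, green pepper and pepper"

# Sequential str.replace: "pepper" is replaced first, so the longer
# targets no longer occur in the string when their turn comes.
sequential = text
for target in targets:
    sequential = sequential.replace(target, "tomato")

# One regex pass: sorting longest first lets a target win over any shorter
# target that is a prefix of it; re.escape guards regex metacharacters.
ordered = sorted(targets, key=len, reverse=True)
pattern = "|".join(re.escape(t) for t in ordered)
single_pass = re.sub(pattern, "tomato", text)

print(sequential)   # red tomato, green tomato and tomato
print(single_pass)  # tomato, tomato and tomato
```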

The idea you cited is very different from mine, so let’s not hastily close this, thinking it is the same thing.


If someone is replacing text of a string, it is very likely that it’s being done more than just once.

Let’s compare the available solutions.

Consider that we want to make 3 changes to the string text:

text: str = get_text()

changes: List[Tuple[str, str]] = [
	(a, b),
	(c, d),
	(e, f)
] # Imagine those variables are strings

Common solution (same as what @luciano teaches):

for from_, to in changes:
    text = text.replace(from_, to)
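
One property of this loop is that each replacement sees the output of the previous one. A small sketch with made-up values shows the effect when two changes are meant to swap words:

```python
changes = [("cat", "dog"), ("dog", "cat")]
text = "cat chases dog"

for from_, to in changes:
    # The second pass also rewrites the "dog" produced by the first pass.
    text = text.replace(from_, to)

print(text)  # cat chases cat
```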

RegEx solution (suggested by @steven.daprano and based on the re.sub() documentation):

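import re
from typing import Optional

changes: dict[str, str] = dict(changes)

def callback_repl(matchobj) -> str:
	# Look up the replacement for the exact text that matched.
	replacement: Optional[str] = changes.get(matchobj.group(0), None)
	if replacement is not None:
		return replacement
	raise Exception('The match object doesn\'t match!')

# re.escape keeps regex metacharacters in a, c and e from being interpreted.
re.sub(rf'({re.escape(a)}|{re.escape(c)}|{re.escape(e)})', callback_repl, text)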
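
The same approach can be wrapped in a reusable helper. This is only a sketch: the name multi_replace is hypothetical, and preferring the longest target is an assumption about the desired behavior.

```python
import re

def multi_replace(text, changes):
    """Apply all (old, new) pairs in a single left-to-right pass."""
    mapping = dict(changes)
    if not mapping:
        return text  # an empty pattern would match everywhere
    # Longest targets first so a target wins over its own prefixes.
    ordered = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(old) for old in ordered))
    # Every match is one of the keys, so the lookup cannot fail.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

swapped = multi_replace("cat chases dog", [("cat", "dog"), ("dog", "cat")])
print(swapped)  # dog chases cat
```

Because the text is scanned once, the replacement strings are never scanned again, so the two words are swapped correctly.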

New suggested solution:

text.replace((a, b), (c, d), (e, f))

I can’t complain about the first solution: it works. The only reason I’m posting this is that I think writing multiple replaces is a very common operation that can be optimized.

The second solution is complicated for the simple job it does (I can see people copying it from StackOverflow, haha). Jokes aside, the function call adds unnecessary overhead to the algorithm.

Hi, another possible solution could be to implement the replace algorithm using the find method…

For my solution I downloaded a txt version of “El Quijote de la mancha”, in order to have a string long enough to measure time.

from random import choices
from urllib.request import urlopen

with urlopen("https://gist.githubusercontent.com/jsdario/6d6c69398cb0c73111e49f1218960f79/raw/8d4fc4548d437e2a7203a5aeeace5477f598827d/el_quijote.txt") as f:
    text = f.read()
text = str(text, 'utf-8')
# Sample 4000 random words, keep those longer than 3 characters, and drop duplicates.
to_replace = list(set([t for t in choices(text.split(), k=4000) if len(t)>3 ]))
replace_map =  list(map(lambda x: (x, f'new_string_to_replace_with_{x}'), to_replace))
print(replace_map)
print(len(replace_map))

Then I created a function that calls the replace method once per change:

def multireplace_v1(s, changes):
    for old, new in changes:
        s = s.replace(old, new)
    return s

And another function that uses the find method to build a list of all possible replacements from the changes:

def multireplace_v2(s, changes):
    # Collect every occurrence of every target as (start, end, replacement).
    replacements = []
    for old, new in changes:
        i = 0
        while True:
            n = s.find(old, i)
            if n == -1:
                break
            i = n + len(old)
            replacements.append((n, i, new))
    # Process matches left to right; ties keep the order given in changes.
    replacements.sort(key=lambda x: x[0])

    i = 0
    parts = []
    for b, e, t in replacements:
        # Skip any match that overlaps text already consumed by an earlier match.
        if b < i:
            continue
        parts.append(s[i:b])
        parts.append(t)
        i = e
    parts.append(s[i:])
    return "".join(parts)

The call

result1 = multireplace_v1(text, replace_map)

took 3.06 seconds to finish

And

result2 = multireplace_v2(text, replace_map)

took 914ms
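
The post does not say how these timings were taken. One way to take such a measurement is the standard timeit module, sketched here on a small made-up input rather than the full book text, so the numbers will not match the ones above.

```python
import timeit

def multireplace_v1(s, changes):
    for old, new in changes:
        s = s.replace(old, new)
    return s

# Hypothetical sample data, much smaller than the text used in the post.
sample_text = "the quick brown fox jumps over the lazy dog " * 1000
sample_changes = [("quick", "slow"), ("brown", "red"), ("lazy", "sleepy")]

# number=10 runs the call ten times; dividing gives the mean time per call.
elapsed = timeit.timeit(
    lambda: multireplace_v1(sample_text, sample_changes), number=10
)
print(f"{elapsed / 10 * 1000:.3f} ms per call")
```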

The proposed solution is faster and also avoids replacing text that has already been replaced. Priority goes to whichever string in changes occurs first in the text.

How do you replace a specific character in a list Python?

How to replace a string in a list in Python:
strings = ["a", "ab", "aa", "c"]
new_strings = []
for string in strings:
    new_string = string.replace("a", "1")  # Build a modified copy of the old string.
    new_strings.append(new_string)  # Add the new string to the list.
print(new_strings)

How do you replace a specific character in a list?

Replace a specific string in a list. To replace a substring in each element of a list, call the string method replace() on every element inside a list comprehension. If an element does not contain the substring, replace() returns it unchanged, so you don't need to filter elements with an if condition.
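
A short sketch of the list-comprehension form, using sample values:

```python
strings = ["a", "ab", "aa", "c"]

# replace() is applied to every element; "c" contains no "a" and is
# returned unchanged, so no if condition is needed.
new_strings = [s.replace("a", "1") for s in strings]

print(new_strings)  # ['1', '1b', '11', 'c']
```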

How do you replace multiple characters in a list Python?

Using the replace() method: Python offers the replace() method for replacing characters (single or multiple) in a string. It returns a new string in which the specified characters are replaced with the new values.
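
replace() handles one old/new pair per call. When several single characters each need their own replacement, one option not covered in the answer above is str.translate with a table built by str.maketrans. The sample values below are made up.

```python
# Map "a" to "1" and "b" to "2"; a value of None deletes the character.
table = str.maketrans({"a": "1", "b": "2", "c": None})

single = "abcabc".translate(table)
many = [s.translate(table) for s in ["abc", "cab", "xyz"]]

print(single)  # 1212
print(many)    # ['12', '12', 'xyz']
```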

How do you replace a single character in a string in Python?

The replace() method replaces the old character with a new character. We can also use a loop to replace a character in a string. String slicing can replace a character at a known position. The built-in re module can also be used to replace text in a string.
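
As a sketch of the slicing approach, assuming the position of the character is known:

```python
text = "hello"
index = 1  # position of the character to replace

# Strings are immutable, so build a new one from the parts around the index.
replaced = text[:index] + "a" + text[index + 1:]

print(replaced)  # hallo
```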