Does anyone know the meaning of these regular expressions?

I found this code in a PyTorch tutorial, but I couldn’t find out exactly what it means, so I came here for some help.

s = re.sub(r"([.!?])",r"\1",s)
s = re.sub(r"[^a-zA-Z.!?]+",r" ",s)

Many thanks to anyone willing to answer my question.

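The first expression replaces each punctuation mark (., ! or ?) with a space followed by the first capture group, i.e. the matched punctuation mark itself, so the punctuation ends up separated from the preceding word.
Note that the space is missing from the replacement expression you pasted. As written, r"\1" replaces each match with itself and leaves the string unchanged. It should be: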

re.sub(r"([.!?])", r" \1", s)


import re

s = 'This is a sentence.'
# Insert a space before each ., ! or ? (the captured group \1)
print(re.sub(r"([.!?])", r" \1", s))
> This is a sentence .
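
To make the role of the capture group a bit more concrete, here is a small sketch with several punctuation marks in a row. The sample string and variable names are made up for illustration and are not from the tutorial.

```python
import re

# Hypothetical sample text, not taken from the tutorial.
text = "Wait!? Really."

# ([.!?]) captures exactly one punctuation character per match;
# in the replacement, \1 puts that same character back after a space.
spaced = re.sub(r"([.!?])", r" \1", text)
print(spaced)  # Wait ! ? Really .

# \g<1> is an equivalent, more explicit way to reference group 1.
spaced_explicit = re.sub(r"([.!?])", r" \g<1>", text)
print(spaced_explicit == spaced)  # True
```

Each match is handled separately, so "!?" becomes " ! ?" rather than " !?".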

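The second regular expression matches every run of one or more characters that are not ASCII letters or the punctuation marks ., ! and ?, and replaces each run with a single space. This is often used to get rid of multiple whitespace characters in the text, but it also removes digits and other symbols such as commas.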

s = 'This  is  a  sentence  with         whitespaces.'
# Each run of characters outside a-z, A-Z, ., !, ? collapses to one space
print(re.sub(r"[^a-zA-Z.!?]+", r" ", s))
> This is a sentence with whitespaces.
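
Putting both steps together, a minimal sketch could look like the following. The function name normalize and the sample sentence are my own assumptions, and the tutorial's actual helper may do more than this.

```python
import re

def normalize(s):
    # Hypothetical helper that combines the two substitutions from the question.
    # Step 1: put a space in front of each ., ! or ?
    s = re.sub(r"([.!?])", r" \1", s)
    # Step 2: replace every run of characters that are not ASCII letters
    # or ., !, ? (spaces, digits, commas, apostrophes, ...) with one space.
    s = re.sub(r"[^a-zA-Z.!?]+", r" ", s)
    return s

print(normalize("Hello, world! It's 2 late?"))  # Hello world ! It s late ?
```

Note that the comma, the apostrophe and the digit are all gone in the result. Accented letters also fall outside a-zA-Z, so they would be replaced with a space as well.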

Thanks a lot.
Sorry for replying so late; I had almost forgotten about this question…
I’m very grateful for your kind help.