I found this code in a PyTorch tutorial, but I can't figure out exactly what it does, so I came here for some help.
s = re.sub(r"([.!?])",r"\1",s)
s = re.sub(r"[^a-zA-Z.!?]+",r" ",s)
Many thanks to anyone who is willing to answer my question.
The first expression should replace each punctuation mark (., ! or ?) with a whitespace followed by the first matching group, i.e. the punctuation mark itself.
Note that you’ve forgotten to paste the whitespace in the replacement expression:
re.sub(r"([.!?])", r" \1", s)
Example:
s = 'This is a sentence.'
print(re.sub(r"([.!?])", r" \1", s))
> 'This is a sentence .'
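To make the difference between the two replacement strings visible, here is a small sketch. The sample sentence is my own, not from the tutorial. With r"\1" the match is replaced by itself, so the string should come back unchanged; with r" \1" a space is inserted before each mark.

```python
import re

# Sample input chosen for illustration only (not from the tutorial).
s = "Hello world! How are you?"

# r"\1" puts back exactly what group 1 matched, so nothing changes.
unchanged = re.sub(r"([.!?])", r"\1", s)

# r" \1" inserts a space in front of the matched punctuation mark.
spaced = re.sub(r"([.!?])", r" \1", s)

print(unchanged)  # Hello world! How are you?
print(spaced)     # Hello world ! How are you ?
```

Separating the punctuation this way means a later split on whitespace treats "!" and "?" as their own tokens.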
The second regular expression matches every run of characters that are not letters or the punctuation marks ., ! and ?, and replaces each run with a single whitespace. This is often used to get rid of multiple whitespace characters in the text.
Example:
s = 'This   is a    sentence with   whitespaces.'
print(re.sub(r"[^a-zA-Z.!?]+", r" ", s))
> 'This is a sentence with whitespaces.'
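Putting both substitutions together, a minimal sketch could look like the following. The function name normalize_string and the final strip() are my own assumptions for illustration, and the sample input is made up.

```python
import re

def normalize_string(s):
    # Hypothetical helper name; combines the two substitutions discussed above.
    # Step 1: put a space in front of each ., ! or ? so it becomes its own token.
    s = re.sub(r"([.!?])", r" \1", s)
    # Step 2: replace every run of characters other than letters and . ! ?
    # (spaces, digits, commas, apostrophes, ...) with a single space.
    s = re.sub(r"[^a-zA-Z.!?]+", r" ", s)
    # Assumption: trim leading/trailing spaces left over from step 2.
    return s.strip()

print(normalize_string("Hi there,   it's 9 o'clock!"))
# Hi there it s o clock !
```

Note that the second pattern does more than collapse whitespace: digits, commas and apostrophes are replaced by a space as well, which is why "it's" turns into "it s" here.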
Thanks a lot.
Sorry for replying so late,
I had almost forgotten about this question…
I'm very grateful for your kind help.