Hi, I need to write a Python script that:
- Gets a random length for the string to generate
- Generates a string of this length, using random characters from [the supported/valid character set]
- Writes the generated strings, one per line, to a text file (at least 2 million lines) and saves this file to disk.
Following is the supported/valid character set:
Any kind of help would be appreciated.
This question doesn’t seem to be PyTorch-specific, so you might get a faster and better answer e.g. on StackOverflow.
In any case, something like this would work:
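import random

# List of all valid characters
chars = [c for c in '0123456789abcdefghijklmnopqrstuvwxyzäöüßABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ!\"#$%&\'()*+,-./:;<=>?@[\\]_~€§£¥']
length = random.randint(1, 10)  # random length; both ends are inclusive
s = []
for _ in range(length):
    c = random.choice(chars)  # pick one random valid character
    s.append(c)
s = "".join(s)  # join the sampled characters into the final string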
Note that this code is slow, since it spells out each step explicitly and only generates a single string.
You could easily speed it up, e.g. by indexing the valid characters and creating the string directly instead of using the loop.
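As a rough sketch of that speed-up, something like the following could generate the strings and write them to disk. It assumes Python 3.6+ (for random.choices) and reuses the length range of 1 to 10 from the snippet above; the names random_string and write_random_strings as well as the output file name are just placeholders, so adapt them as needed.

```python
import random

# Valid characters, copied from the snippet above
CHARS = '0123456789abcdefghijklmnopqrstuvwxyzäöüßABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ!\"#$%&\'()*+,-./:;<=>?@[\\]_~€§£¥'


def random_string(min_len=1, max_len=10):
    # Hypothetical helper: draw a random length, then sample that many
    # characters (with replacement) in a single call instead of a loop.
    length = random.randint(min_len, max_len)
    return "".join(random.choices(CHARS, k=length))


def write_random_strings(path, num_lines, min_len=1, max_len=10):
    # Hypothetical helper: write one random string per line.
    # UTF-8 is set explicitly because the character set is not pure ASCII.
    with open(path, "w", encoding="utf-8") as f:
        for _ in range(num_lines):
            f.write(random_string(min_len, max_len) + "\n")


# Example call (file name is an assumption):
# write_random_strings("random_strings.txt", 2_000_000)
```

Writing line by line keeps memory usage flat, so the same call should also work for files much larger than 2 million lines.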
Thank you @ptrblck. It works.