Python: how to process a huge single-line file?


I have a huge single-line file containing only space-separated words, and I need to run additional filtering on it. How can I do this quickly?

Currently I have the following code:

with open("words.txt") f:     lines = f.readlines()      line in lines:         words = str(line).split(' ')                  w in words:             if is_allowed(w):                 another_file.write(w + " ") 

but it is extremely slow (~1 MB/s). How can I speed it up?

Given that you describe the file as "huge", the problem comes down to your code loading the entire file into memory at once, and then making a copy of it in order to carry out the split operation.

It ought to be faster if you treat the file as a stream. Read it character by character (char = f.read(1)); if the character is anything other than a space or EOF, append it to a temporary string. When you hit a space, process the temporary string, blank it, and start over; when you hit EOF, process the temporary string and break out of the loop.

That way you should never have more than a single word in memory at any given moment, which should vastly speed up processing.
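
For illustration, here is a minimal sketch of that streaming approach. It reuses the file names from the question and assumes a placeholder is_allowed() predicate, since the question doesn't show its implementation:

    def is_allowed(word):
        # placeholder filter; substitute the question's real predicate here
        return word.isalpha()

    with open("words.txt") as f, open("filtered.txt", "w") as out:
        buf = []                          # characters of the word being read
        while True:
            char = f.read(1)              # one character at a time
            if char == "" or char == " ":  # EOF (empty string) or word boundary
                word = "".join(buf)
                if word and is_allowed(word):   # skip empty words from repeated spaces
                    out.write(word + " ")
                buf = []
                if char == "":            # EOF: stop the loop
                    break
            else:
                buf.append(char)

Note that Python's file objects buffer reads internally, so f.read(1) does not hit the disk once per character; the remaining cost is the per-character loop overhead in the interpreter.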

