Python: End of File when writing a parser that reads multiple lines at a go
I'm writing a parser object, and I'd like to understand the best practice for indicating end of file. It seems like the calling code should look like this:
    myfp = newparser(filename)  # the package opens a file pointer and scans through multiple lines per entry
    for entry in myfp:
        yourcode(entry)
I can see that raising an exception might be better than returning a status value, but this case seems to be handled differently.
Since the parser reads multiple lines, I can't just send the result of each readline() call up the chain, and the while loop runs infinitely in my current implementation:
    def read(this):
        entry = ''
        while 1:
            line = this.fp.readline()
            if line == '\\':  # readline() returns '' at EOF, so this never matches and the loop never exits
                return entry
            else:
                entry += line
        return entry
Can someone show me how the object should be structured so that the while loop exits in this scenario?
So here is a simplified example, since you haven't indicated exactly what your parser does. It should at least give you the general concept. First, let's build a generator that yields tokens as it iterates across the file.
In this simplified example, I'm going to assume that each token is contained on a single line and that each line contains exactly one token, but you can deduce how to expand it to allow for cases where those constraints don't hold.
    def produce_token(line):
        # produce a token from the line here
        return token

    def tokenizer(file_to_tokenize):
        # iterate over the file, producing a token for each line
        for line in file_to_tokenize:
            # if a token isn't constrained to a single line, you don't
            # have to yield on every iteration: you can yield
            # more than once per iteration
            yield produce_token(line)
Next, let's produce a contextmanager that will allow us to automagically produce a tokenizer from a file name, and handle closing the file at the end:
    from contextlib import contextmanager

    @contextmanager
    def tokenize_file(filename):
        with open(filename) as f:
            yield tokenizer(f)
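One nice property of this setup: the file stays open for exactly as long as the tokenize_file block is active, because control only resumes past the yield (and hence past the enclosing with open(...)) when that block exits, at which point the file is closed automatically.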
And here is how you'd use it:
    filename = 'tokens.txt'
    with tokenize_file(filename) as tokens:
        for token in tokens:
            # do something with the token
            ...
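To tie this back to the end-of-file question: the for loop above exits on its own. When the tokenizer generator runs out of lines and returns, Python raises StopIteration behind the scenes, the for loop consumes it and stops, and the calling code never needs an explicit EOF check or status value.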
Hopefully this gets you pointed in the right direction. In this toy example it's simple enough that there isn't much benefit over iterating over the lines directly (it would be faster to just use [produce_token(line) for line in token_file]). But if your tokenization procedure is more complex and you expand it accordingly, this structure can make the process simpler when you go to use it, as in the sketch below.
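As a rough sketch of that kind of expansion, and to match the usage shape from the question (myfp = newparser(filename); for entry in myfp: ...), here is one way the generator idea could be wrapped in a parser class that yields multi-line entries. It assumes, as in the question's read() method, that each entry is terminated by a line containing only a backslash; the class name newparser and that delimiter are placeholders taken from the question, not a real library API.

    class newparser:
        """Hypothetical parser that yields one multi-line entry at a time."""

        def __init__(self, filename):
            self.filename = filename

        def __iter__(self):
            # A generator method: each yield hands back one complete entry,
            # and simply returning at end of file signals StopIteration to
            # the for loop, so iteration stops by itself.
            with open(self.filename) as fp:
                entry = ''
                for line in fp:
                    if line.strip() == '\\':  # entry delimiter, as in the question
                        yield entry
                        entry = ''
                    else:
                        entry += line
                if entry:  # flush a final entry that has no trailing delimiter
                    yield entry

    myfp = newparser('entries.txt')
    for entry in myfp:
        print(entry)  # replace with yourcode(entry)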