Can't write human readable words to file in Python
I'm trying to build a list of the words appearing in the files in a specified directory, and save that list to a file. When I print out any of the list's positions it appears OK (it is human readable), but after I write the list to a file I see byte numbers. Here is the code:
    import os

    directorylist = ['/users/kuba/desktop/articles/1',
                     '/users/kuba/desktop/articles/2',
                     '/users/kuba/desktop/articles/4']
    bigbagofwords = []

    for directory in directorylist:
        for filename in os.listdir(directory):
            filename = os.path.join(directory, filename)
            currentfile = open(filename, 'rt', encoding = 'latin-1')
            for line in currentfile:
                currentline = line.split(' ')
                for word in currentline:
                    if word.lower() not in bigbagofwords:
                        bigbagofwords.append(word.lower())
            currentfile.close()

    savefile = open('dictionary.txt', 'wt', encoding = 'latin-1')
    for word in bigbagofwords:
        savefile.write(word)
        savefile.write('\n')
    savefile.close()
The file "dictionary.txt" contains lines like the ones below:
0000 0007 0078 0064 006b 002e 0074 0078 0074 696c 6f63 626c 6f62 0000 0010 0000 00ec 0000 09e8 ffff ffff ffff 0000 0000
How do I force Python to write the words in a human-readable encoding? What am I doing wrong here?
You've opened a .DS_Store OS X desktop information file and added its contents to your output file. When you then opened that output file in the Sublime Text editor, it was displayed the way Sublime Text shows binary files: as a columned hex dump.

The character sequence locblob is characteristic of that proprietary format. The text xdk.txt is also hidden, encoded as UTF-16, in the hex dump you showed us; a .DS_Store file is where the Finder stores icon positions and other display attributes for the files in a folder.
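As a quick check of that reading, the bytes from the dump in the question can be decoded by hand. This is only a sketch: the variable names are made up for illustration, and treating 0000 0007 as a big-endian length field is an assumption about the format, although it does match the seven characters of xdk.txt.

```python
# Bytes copied from the hex dump in the question.
length_field = bytes.fromhex('0000 0007')
name_bytes = bytes.fromhex('0078 0064 006b 002e 0074 0078 0074')
marker_bytes = bytes.fromhex('696c 6f63 626c 6f62')

# Assumption: the four bytes before the name are a big-endian character count.
name_length = int.from_bytes(length_field, 'big')

# Each character takes two bytes with the high byte first, i.e. UTF-16 big-endian.
filename = name_bytes.decode('utf-16-be')

# The next eight bytes are plain ASCII and contain the locblob sequence.
marker = marker_bytes.decode('ascii')

print(name_length, filename, marker)  # prints: 7 xdk.txt ilocblob
```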
Filter these files out when looping over directories. Typically, you want to ignore any file whose name starts with a dot (.):
    for filename in os.listdir(directory):
        if filename[0] == '.':
            continue  # skip hidden files
        filename = os.path.join(directory, filename)
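Put together with the rest of the script, the filter might look something like the sketch below. The function names collect_words and save_words are hypothetical, not from the question. A few other things differ from the original code and are assumptions on my part: files are opened with "with" so they get closed automatically, a set is used for the "already seen" check, subdirectories are skipped, and line.split() is used instead of line.split(' ') so that trailing newlines and repeated spaces don't end up inside the words.

```python
import os


def collect_words(directories, encoding='latin-1'):
    """Return the distinct lowercased words from the visible files, in first-seen order."""
    seen = set()   # fast membership test
    words = []     # keeps the order in which words were first seen
    for directory in directories:
        # sorted() only makes the file order predictable; it is not required.
        for name in sorted(os.listdir(directory)):
            if name.startswith('.'):
                continue  # skip hidden files such as .DS_Store
            path = os.path.join(directory, name)
            if not os.path.isfile(path):
                continue  # assumption: subdirectories are not descended into
            with open(path, 'rt', encoding=encoding) as currentfile:
                for line in currentfile:
                    for word in line.split():
                        word = word.lower()
                        if word not in seen:
                            seen.add(word)
                            words.append(word)
    return words


def save_words(words, path='dictionary.txt', encoding='latin-1'):
    """Write one word per line to the given file."""
    with open(path, 'wt', encoding=encoding) as savefile:
        for word in words:
            savefile.write(word + '\n')
```

With the list from the question this would be called as save_words(collect_words(directorylist)). Any other binary file in those directories that does not start with a dot would still end up in the output, so the dot check only covers the hidden-file case.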