Can't write human-readable words to file in Python


I am trying to build a list of the words appearing in the files in the specified directories, and to save that list to a file. When I print some of the list's entries they look fine (they are human readable), but after I write the list to a file I see byte numbers. Here is the code:

import os

directorylist = ['/users/kuba/desktop/articles/1',
                 '/users/kuba/desktop/articles/2',
                 '/users/kuba/desktop/articles/4']
bigbagofwords = []

for directory in directorylist:
    for filename in os.listdir(directory):
        filename = os.path.join(directory, filename)
        currentfile = open(filename, 'rt', encoding='latin-1')
        for line in currentfile:
            currentline = line.split(' ')
            for word in currentline:
                if word.lower() not in bigbagofwords:
                    bigbagofwords.append(word.lower())
        currentfile.close()

savefile = open('dictionary.txt', 'wt', encoding='latin-1')
for word in bigbagofwords:
    savefile.write(word)
    savefile.write('\n')
savefile.close()

The file "dictionary.txt" contains lines like the one below:

0000 0007 0078 0064 006b 002e 0074 0078 0074 696c 6f63 626c 6f62 0000 0010 0000 00ec 0000 09e8 ffff ffff ffff 0000 0000

How do I force Python to write the words in a human-readable encoding? What am I doing wrong here?

You've opened a .DS_Store OS X desktop information file and added its contents to your output file. When you then opened the output file in Sublime Text, the editor showed it as a columned hex dump, which is how it displays binary files.

The character sequence ilocblob (the bytes 696c 6f63 626c 6f62 in your dump) is characteristic of that proprietary format. You also have the text xdk.txt, encoded as UTF-16, hidden in the hex dump you showed us; the .DS_Store file stores icon positions and other attributes for files on non-native OS X filesystems.
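To see this for yourself, you can decode the first part of the dump in Python. This is a minimal sketch: the hex words are copied from the question, and reading the first four bytes as a big-endian length is my assumption about how the record is laid out, not a documented specification.

```python
# Hex words copied from the start of the dump in the question.
dump = "0000 0007 0078 0064 006b 002e 0074 0078 0074 696c 6f63 626c 6f62"
raw = bytes.fromhex(dump.replace(" ", ""))

# Assumption: the first four bytes are a big-endian count of UTF-16 code units.
name_length = int.from_bytes(raw[:4], "big")
name_end = 4 + 2 * name_length

name = raw[4:name_end].decode("utf-16-be")   # the file name stored in the record
marker = raw[name_end:].decode("ascii")      # the bytes that follow the name

print(name_length, name, marker)
```

The decoded name is a file name from one of your article directories, which is why it turns up inside the desktop information file.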

Filter these files out when looping over the directories. Typically, you want to ignore all files whose names start with a dot (.):

for filename in os.listdir(directory):
    if filename[0] == '.':
        continue  # skip hidden files
    filename = os.path.join(directory, filename)
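Putting the filter into the whole script could look like the sketch below. The function names collect_words and save_words are hypothetical, introduced only for illustration. It also departs from your code in three ways that you may or may not want: it uses line.split() instead of line.split(' ') so that trailing newlines are not kept as part of a word, it tracks seen words in a set to speed up the membership test, and it skips subdirectories.

```python
import os


def collect_words(directories, encoding="latin-1"):
    """Return unique lower-cased words from the visible files in `directories`."""
    words = []    # keeps the words in first-seen order
    seen = set()  # fast membership test
    for directory in directories:
        for filename in sorted(os.listdir(directory)):
            if filename.startswith("."):
                continue  # skip hidden files such as .DS_Store
            path = os.path.join(directory, filename)
            if not os.path.isfile(path):
                continue  # skip subdirectories
            with open(path, "rt", encoding=encoding) as current_file:
                for line in current_file:
                    for word in line.split():
                        word = word.lower()
                        if word not in seen:
                            seen.add(word)
                            words.append(word)
    return words


def save_words(words, path, encoding="latin-1"):
    """Write one word per line to `path`."""
    with open(path, "wt", encoding=encoding) as save_file:
        for word in words:
            save_file.write(word + "\n")
```

With your variable names, the call would be save_words(collect_words(directorylist), 'dictionary.txt').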
