MATLAB - How to load and handle a big TXT file (32 GB)
First of all, sorry for my English.
I would like to know a better way to load and handle a big TXT file (around 32 GB, a matrix of 83,000,000 x 66). I have already tried some experiments with textscan, import (out of memory), fgets, fgetl, and so on. Except for the import approach, all of these methods work but take too much time (much more than 1 week).
I aim to use this database to run my sampling process and, after that, train a neural network to learn the behaviour.
Does someone know how to import this type of data faster? I am thinking of making a database dump in another format (instead of TXT), for example SQL Server, and trying to handle the data by accessing the database with queries.
Another doubt: after loading all the data, can I save it in .mat format and handle this format in my experiments? Any other better idea?
Thanks in advance.
It's impossible to hold such a big matrix (5,478,000,000 values) in your workspace/memory (unless you've got tons of RAM). So the file format (.mat or .csv) doesn't matter! You definitely have to use a database (or split the file into several smaller ones and calculate step by step, which takes long too).
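To make the "step by step" idea concrete, here is a minimal sketch, not the answerer's own method. It first estimates the memory needed if every value were stored as an 8-byte double, then reads a text file in fixed-size row blocks with textscan and keeps only running column sums. The file name demo_chunks.csv, the tiny 10 x 3 demo data and the chunk size are assumptions made up for illustration; the real file would need its own delimiter and its 66 columns.

```matlab
% Rough memory estimate for the full matrix stored as doubles (8 bytes each).
nRows = 83e6;
nCols = 66;
fprintf('Full matrix as double: about %.1f GB\n', nRows * nCols * 8 / 1e9);

% Hypothetical small demo file standing in for the real 32 GB file.
demoFile = 'demo_chunks.csv';
demoCols = 3;
data = reshape(1:30, demoCols, [])';      % 10 rows x 3 columns
fid = fopen(demoFile, 'w');
fprintf(fid, '%d,%d,%d\n', data');        % one comma-separated row per line
fclose(fid);

% Read the file in blocks of chunkRows rows and accumulate column sums,
% so only one block is held in memory at a time.
chunkRows = 4;                            % assumption; use e.g. 1e6 for real data
fmt = repmat('%f', 1, demoCols);
colSum = zeros(1, demoCols);
rowCount = 0;
fid = fopen(demoFile, 'r');
while ~feof(fid)
    C = textscan(fid, fmt, chunkRows, 'Delimiter', ',', 'CollectOutput', true);
    block = C{1};                         % numeric block, at most chunkRows rows
    if isempty(block)
        break;                            % nothing left but trailing whitespace
    end
    colSum = colSum + sum(block, 1);
    rowCount = rowCount + size(block, 1);
end
fclose(fid);
colMean = colSum / rowCount;
disp(colMean);
```

The same loop can carry any statistic that can be updated block by block (sums, counts, min/max). It still has to scan the whole file once, so it does not remove the long run time the answer warns about; it only keeps memory use bounded.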
Personally, I have experience with sqlite3 and did something similar with a 1.47 million x 23 matrix/CSV file: http://git.osuv.de/markus/sqlite-demo (remember that csv2sqlite.m was designed to run with GNU Octave [19k seconds at night ... well, it was badly scripted :) ]). After everything was imported into the sqlite3 database, I can access just the data I need within 8-12 seconds (take a look at the comment header of leistung.m).
If your CSV file is straightforward (plain delimited rows, nothing else), you can simply import it with sqlite3 itself, for example:
    ┌─[markus@x121e]─[/tmp]
    └──╼ cat file.csv
    0.9736834199195674,0.7239387515366997,0.3382008456696883
    0.6963824911102146,0.8328410999877027,0.5863203843393815
    0.2291736458336333,0.1427739134201017,0.8062332551565472
    ┌─[markus@x121e]─[/tmp]
    └──╼ sqlite3 csv.db
    SQLite version 3.8.4.3 2014-04-03 16:53:12
    Enter ".help" for usage hints.
    sqlite> create table csvtest (col1 text not null, col2 text not null, col3 text not null);
    sqlite> .separator ","
    sqlite> .import file.csv csvtest
    sqlite> select * from csvtest;
    0.9736834199195674,0.7239387515366997,0.3382008456696883
    0.6963824911102146,0.8328410999877027,0.5863203843393815
    0.2291736458336333,0.1427739134201017,0.8062332551565472
    sqlite> select col1 from csvtest;
    0.9736834199195674
    0.6963824911102146
    0.2291736458336333
All of this can also be done with https://github.com/markuman/go-sqlite (MATLAB and Octave compatible! But I guess no one but me has ever used it!). However, I recommend version 2-beta in branch 2 (git checkout -b 2 origin/2) running in coop mode (you'll hit the max string length of sqlite3 in ego mode). There's HTML documentation for version 2 too: http://go-sqlite.osuv.de/doc/
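For the sampling step mentioned in the question, once the data sits in SQLite a random subset can be pulled with a single query. The sketch below is only an illustration under stated assumptions: it uses the sqlite interface from MATLAB's Database Toolbox (not go-sqlite, whose API is not shown in this thread), and it assumes the csv.db file and csvtest table from the session above exist in the current folder. Because the columns were created as text, the query casts them to REAL.

```matlab
% Assumes csv.db with table csvtest (col1..col3 stored as text), as created
% in the sqlite3 session above, and that the Database Toolbox is installed.
conn = sqlite('csv.db', 'readonly');

% Draw 2 random rows; the sample size is an arbitrary choice for illustration.
query = ['SELECT CAST(col1 AS REAL), CAST(col2 AS REAL), CAST(col3 AS REAL) ' ...
         'FROM csvtest ORDER BY RANDOM() LIMIT 2'];
sample = fetch(conn, query);   % cell array or table, depending on the release

close(conn);
disp(sample);
```

ORDER BY RANDOM() has to touch every row, so on a table with tens of millions of rows a cheaper filter (for example on rowid) may be preferable; which one fits depends on how uniform the sample needs to be.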