python - Memory leak in Matplotlib savefig with PdfPages
Edit (update): I used objgraph to print out back-reference graphs of the Reference items that appeared in the memory leak. It seems that PdfPages is holding on to all of the images as I iterate through them and save each one to a page (so this is perhaps inherent to the PdfPages module). I think I'm going to modify my code to write a small PDF file on each iteration and then use pyPdf to merge these files into the desired larger PDF file.
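A minimal sketch of that per-page workaround, written for Python 3 and the current pypdf package rather than the Python 2.7 / pyPdf setup described here (both are assumptions, as are the helper names page_path, write_page, merge_pages and build_pdf). It has not been checked against the leak itself, so treat it as an outline of the idea rather than a confirmed fix.

```python
import os
import shutil
import tempfile


def page_path(directory, index):
    """Return the file name used for single-page PDF number `index`."""
    return os.path.join(directory, "page_%04d.pdf" % index)


def write_page(path, data):
    """Render one heatmap with a colorbar to its own one-page PDF file."""
    # Imported lazily so that only the rendering step needs matplotlib.
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    mesh = ax.pcolormesh(data)
    fig.colorbar(mesh, ax=ax)
    fig.savefig(path, format="pdf")
    plt.close(fig)  # release the figure before the next page is drawn


def merge_pages(paths, output):
    """Concatenate the single-page PDFs into one file, in the given order."""
    # Assumption: the modern `pypdf` package, not the older pyPdf.
    from pypdf import PdfWriter

    writer = PdfWriter()
    for path in paths:
        writer.append(path)
    with open(output, "wb") as handle:
        writer.write(handle)
    writer.close()


def build_pdf(output, count):
    """Write `count` random heatmaps to temporary PDFs, then merge them."""
    import numpy as np

    workdir = tempfile.mkdtemp()
    try:
        paths = []
        for i in range(count):
            path = page_path(workdir, i)
            write_page(path, np.random.uniform(size=(10, 10)))
            paths.append(path)
        merge_pages(paths, output)
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

Calling build_pdf('output.pdf', 15) would produce the same 15 pages as the script in this question. No PdfPages object stays open across iterations, so each page is fully written and closed before the next one starts.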
Edit: I am running Python 2.7.3 with Matplotlib 1.3.1. I have tried printing out gc.garbage, and it returns an empty list, so there do not appear to be any uncollectable objects. I have tried using both the PDF and Agg backends, and the memory leak is still present in both of them. I also tried closing the axes (ax1, cbaxes1) and explicitly using del on all of the variables (which had the effect of removing the +3 list growth after closing but increasing the +2 list growth after saving to +5).
I am trying to create multiple heatmaps via pcolormesh, save them to a single page in a PDF, and then repeat this process to create multiple pages of figures in the PDF file (I've dropped down to 1 figure per page for the sake of this example).
There seems to be a memory leak occurring in the savefig function. It seems small at first, but it adds up, and I want to be able to save large PDF files.
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt
    from matplotlib.backends.backend_pdf import PdfPages
    from matplotlib import gridspec
    from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
    import numpy as np
    import resource
    import gc
    import objgraph

    def plotfunction(i, pdf):
        # One wide figure: a heatmap on the left, its colorbar on the right
        fig = plt.figure()
        fig.set_figheight(25)
        fig.set_figwidth(64)
        gs = gridspec.GridSpec(1, 2)
        ax1 = plt.subplot(gs[0], rasterized=True)
        heatmap1 = ax1.pcolormesh(np.random.uniform(size=(10, 10)))
        cbaxes1 = plt.subplot(gs[1], rasterized=True)
        cb1 = plt.colorbar(heatmap1, cax=cbaxes1, use_gridspec=True)

        gc.collect()
        print 'memory growth before savefig {round}: %s ({mb} mb)'.format(
            round=i,
            mb=resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024/1024
        ) % resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        objgraph.show_growth()

        pdf.savefig(fig)  # memory leak seems to occur here

        gc.collect()
        print 'memory growth after savefig {round}: %s ({mb} mb)'.format(
            round=i,
            mb=resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024/1024
        ) % resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        objgraph.show_growth()

        # Release the figure in every way available
        fig.clf()
        plt.close(fig)
        plt.close('all')

        gc.collect()
        print 'memory growth after closing {round}: %s ({mb} mb)'.format(
            round=i,
            mb=resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024/1024
        ) % resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        objgraph.show_growth()

    def main():
        pdf = PdfPages('output.pdf')
        for i in range(15):
            plotfunction(i, pdf)
            print 'memory growth outside function {round}: %s ({mb} mb)'.format(
                round=i,
                mb=resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024/1024
            ) % resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            gc.collect()
            objgraph.show_growth()
        pdf.close()

    if __name__ == "__main__":
        main()
Here's the output showing the memory leak:
    memory growth before savefig 7: 877674496 (837 mb)
    memory growth after savefig 7: 988385280 (942 mb)
    Reference    32        +3
    list         1839      +2
    Name         16        +2
    tuple        2384      +2
    memory growth after closing 7: 988385280 (942 mb)
    list         1842      +3
    memory growth outside function 7: 988385280 (942 mb)
    memory growth before savefig 8: 988651520 (942 mb)
    memory growth after savefig 8: 1099231232 (1048 mb)
    Reference    35        +3
    list         1844      +2
    Name         18        +2
    tuple        2386      +2
    memory growth after closing 8: 1099231232 (1048 mb)
    list         1847      +3
    memory growth outside function 8: 1099231232 (1048 mb)
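To put a number on the growth, the short calculation below takes the ru_maxrss values from the log above and computes how much each savefig call added. It assumes the values are in bytes, as the script's /1024/1024 conversion implies (ru_maxrss is reported in kilobytes on Linux and in bytes on macOS). The names readings and savefig_growth are hypothetical.

```python
# (round, ru_maxrss before savefig, ru_maxrss after savefig), copied from the log
readings = [
    (7, 877674496, 988385280),
    (8, 988651520, 1099231232),
]

BYTES_PER_MB = 1024.0 * 1024.0


def savefig_growth(before, after):
    """Return the increase across one savefig call, in bytes."""
    return after - before


for round_number, before, after in readings:
    delta = savefig_growth(before, after)
    print("round %d: +%d bytes (%.1f MB)"
          % (round_number, delta, delta / BYTES_PER_MB))
```

Both rounds add a little over 105 MB. Note that ru_maxrss is the peak resident set size, a high-water mark, so it cannot decrease after the figure is closed even if memory is actually released. That is consistent with the "after closing" lines repeating the "after savefig" value, but on its own it does not show whether the memory was freed.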
I have tried using multiprocessing, avoiding pyplot, and moving the save function, as suggested in other posts, but none of these solutions has solved the memory leak problem.
Thanks.