I found this page: http://kbyanc.blogspot.com/2007/07/python-serializer-benchmarks.html
and made a few tests with cPickle, which is part of Python (no need to install extra module).
It is actually very efficient when used with the protocol=2 option.
Marshal seems to be even faster, but cannot handle the rec.array objects (at least it doesn't work for me...).
Example of the use:
In [2]: import CMorisset as CM
In [3]: data3 = CM.ReadFortran('test3.dat','a10,1x,f6.2,1x,f6.2,1x,i2',['name', 'ra', 'dec','mag'])
In [4]: import cPickle
In [5]: cPickle.dump(data3, open("data3.pickle", "wb"),protocol=2)
In [6]: data3=cPickle.load(open("data3.pickle","rb"))
Reading the 27Mo of the test3.dat file take me 30 seconds with the Fortran format, and only 1 with the cPickle function! The main problems are that 1) one need to know the name of the store variable, and 2) only one object can be saved at a time.
I think it can be bypassed using a dictionary containing the variables and the names.
My main issue now is to build the dicionnary from the arguments passed to a function, so that I could just have:
save(file='data',dat1,dat2,dat3)
Will se this latter...
Aucun commentaire:
Enregistrer un commentaire