Sunday, 3 October 2010

Reading a formatted ASCII file à la Fortran

I first learned to program in Fortran (after some introductions to BASIC, Ada and Turbo Pascal in the early 80s). Then I met IDL in 1994 (thanks to my friend Philippe) and life changed! Interactive + Data + Language was exactly what I needed. But as I already said at the beginning of this blog, I now want to switch to a free, open language.
But I find this change very difficult; I'm like a baby learning to walk and talk... For example, I had been looking for two weeks for a way to read a simple formatted ASCII file, like I used to do in IDL.
The file is just:

     alpha 193.63  18.40 19
      beta 280.12   0.52 16
     gamma 206.59   0.06 17
     delta  23.74  17.92 19
       eta  18.10  10.07 19

and the IDL process is:

data = replicate({name:'',ra:0.0,dec:0.0,mag:0},n_lines)
openr,lun,/get_lun,file
readf,lun,data,format='(a10,1x,f6.2,1x,f6.2,1x,i2)'

et voilà!

The format string uses the Fortran convention, which is powerful enough to describe almost any fixed format. It seemed that this was not possible in Python, until I found a module that provides this facility! Developed by French people from CNRS in Orléans, it is available here:
http://dirac.cnrs-orleans.fr/plone/software/scientificpython/
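
To make the format string concrete, here is a minimal sketch in plain Python 3 of how its descriptors map onto character columns. The helper name field_slices is my own invention, not part of IDL or of any library, and it handles only the descriptors used in this post (aW, iW, fW.D and nX); repeat counts and nested groups are left out.

```python
import re

def field_slices(fmt):
    """Return (kind, start, stop) for each data field of a simple format.

    Hypothetical helper: supports only aW, iW, fW.D and nX descriptors.
    """
    pos = 0
    slices = []
    for item in fmt.strip("() ").split(","):
        item = item.strip().lower()
        skip = re.fullmatch(r"(\d*)x", item)
        if skip:
            # nX skips n columns (a bare X would skip one)
            pos += int(skip.group(1) or 1)
            continue
        field = re.fullmatch(r"([aif])(\d+)(?:\.\d+)?", item)
        if field is None:
            raise ValueError("unsupported descriptor: %r" % item)
        width = int(field.group(2))
        slices.append((field.group(1), pos, pos + width))
        pos += width
    return slices

layout = field_slices("(a10,1x,f6.2,1x,f6.2,1x,i2)")
print(layout)
# [('a', 0, 10), ('f', 11, 17), ('f', 18, 24), ('i', 25, 27)]
```

So the name occupies the first 10 characters, each 1x skips one column, and the two reals and the integer follow at fixed positions.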

The part of the module I want is this one:

Module FortranFormat

Fortran-style formatted input/output
This module provides two classes that aid in reading and writing Fortran-formatted text files.
Examples:
Input::

   >>>s = '   59999'
   >>>format = FortranFormat('2I4')
   >>>line = FortranLine(s, format)
   >>>print line[0]
   >>>print line[1]

 prints::

   >>>5
   >>>9999


 Output::

   >>>format = FortranFormat('2D15.5')
   >>>line = FortranLine([3.1415926, 2.71828], format)
   >>>print str(line)

 prints::

   '3.14159D+00    2.71828D+00'
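
The input example above may look surprising at first: '2I4' means two integer fields of four characters each, so the string is cut by position and not at whitespace. A small sketch of the same cut in plain Python 3 (this only illustrates the idea, it is not how the module is implemented):

```python
s = '   59999'
width = 4  # each I4 field is 4 characters wide

# Cut the string into consecutive 4-character fields and convert each one.
fields = [int(s[i:i + width]) for i in range(0, len(s), width)]
print(fields)  # [5, 9999]
```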


I used it to read the same file as previously in IDL:

import numpy
from Scientific.IO import FortranFormat as FF

# a single record, overwritten for each line of the file
data = numpy.rec.array(['          ', 0., 0., 0], names=['name', 'ra', 'dec', 'mag'])
format = FF.FortranFormat('a10,1x,f6.2,1x,f6.2,1x,i2')
f = open('test1.dat', 'r')
for line in f:
    # equivalent: data['name'], data['ra'], data['dec'], data['mag'] = FF.FortranLine(line, format)
    data.name, data.ra, data.dec, data.mag = FF.FortranLine(line, format)
    print(data)
f.close()

The main remaining problem is that I don't know how to get the whole array into the data variable. Anyway, the problem of reading a fixed-format ASCII file is solved ;-)
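
One possible way to collect every line, sketched here in plain Python 3 without ScientificPython: slice each line at the column positions implied by the format (a10,1x,f6.2,1x,f6.2,1x,i2) and append the result to a list. The names FIELDS, parse_line and read_table are mine, chosen only for illustration, and the sample lines are the file shown earlier in this post.

```python
# Column layout implied by (a10,1x,f6.2,1x,f6.2,1x,i2):
# (field name, start, stop, converter); the 1x separators are skipped.
FIELDS = [
    ("name", 0, 10, str.strip),
    ("ra", 11, 17, float),
    ("dec", 18, 24, float),
    ("mag", 25, 27, int),
]

def parse_line(line):
    """Slice one fixed-width line into a (name, ra, dec, mag) tuple."""
    return tuple(conv(line[start:stop]) for _, start, stop, conv in FIELDS)

def read_table(lines):
    """Parse every non-blank line and return a list of tuples."""
    return [parse_line(line) for line in lines if line.strip()]

sample = [
    "     alpha 193.63  18.40 19",
    "      beta 280.12   0.52 16",
    "     gamma 206.59   0.06 17",
    "     delta  23.74  17.92 19",
    "       eta  18.10  10.07 19",
]

rows = read_table(sample)
for row in rows:
    print(row)
```

With a real file, read_table(open('test1.dat')) should work the same way. If a record array is wanted, the list can then be handed to numpy.rec.fromrecords(rows, names=['name', 'ra', 'dec', 'mag']), which gives back whole columns such as data.ra.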

5 comments:

  1. J. Venancio Hernandez, 3 December 2010 at 01:54

    Have you figured out how to make the entries of the "data" dictionary (I'm using your example notation), like "data['name']", multidimensional arrays and fill them with each line you read from the ASCII file?

  2. J. Venancio Hernandez, 3 December 2010 at 02:42

    Just figured it out: I read the ASCII file and append each line to the list I want, for example:

    file=open('J0644_200801.dat')
    lightc={'hjd':[''],'mag':[0.]}
    for lines in file:
        lightc['hjd'].append(lines)

    Guess I'll try more later before asking. Cheers

  3. Hi, there is also the fortranformat library, now available on PyPI. I'd appreciate any feedback!

  4. You mean this one: http://pypi.python.org/pypi/fortranformat/
    Worked fine for me.
    It could perhaps be cool to have only one class with read and write methods.

  5. Yes, that's the one. I'll keep it in mind re: having just one class. It wouldn't be difficult to do, but the naming has to be sensible. Ideally it would be called 'FortranFormat', but that is taken by the library you describe above.
