Sunday, 12 September 2010

Reading an ASCII file

I used to use the read_ascii() function in IDL (BTW, I use my own version, which returns an array of structures rather than a structure containing arrays...), and it seems there is no equivalent tool in Python... or I haven't found it yet!
Nevertheless, I tried to make the job simpler by reading a comma-separated values (CSV) file, since this format can be exported by any spreadsheet or downloaded from the net (e.g. Vizier tables).
I first found a module able to read such files: csv (!).
This is the first solution I used:

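import csv
from pylab import *  # plot() comes from pylab (matplotlib), not from scipy

x = []
y = []
with open('DIGEDA1.1.csv') as f:
    for row in csv.DictReader(f):
        # csv returns every field as a string, so convert to float
        x.append(float(row['5007']))
        y.append(float(row['3727']))
plot(x, y)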

The problem is that I didn't manage to apply a simple filter to the data: I want to plot the log of the values, so I have to filter out the zero and negative values... It seems that the type used for the x and y variables (namely "list") is not really the best one to use with where.
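One way around the list problem, as a minimal sketch: convert the lists to NumPy arrays, which accept boolean masks. The in-memory sample below is a made-up stand-in for DIGEDA1.1.csv (the numbers are invented), and the names sample, good, log_x and log_y are only for illustration.

```python
import csv
import io

import numpy as np

# Hypothetical sample standing in for DIGEDA1.1.csv; the values are invented.
sample = io.StringIO("5007,3727\n120.0,35.0\n-1.0,50.0\n300.0,-1.0\n80.0,10.0\n")

x, y = [], []
for row in csv.DictReader(sample):
    # csv gives strings, so convert each field to float
    x.append(float(row['5007']))
    y.append(float(row['3727']))

# Lists do not support element-wise comparisons, arrays do.
x = np.array(x)
y = np.array(y)

# Boolean mask: True where both values are strictly positive.
good = (x > 0) & (y > 0)
log_x = np.log(x[good])
log_y = np.log(y[good])
print(good)
```

With arrays, plot(log_x, log_y, 'o') should then work as expected in a pylab session.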
 
Finally, I found another function to read the file, which outputs something very similar to a structure.
It is part of pylab (matplotlib), so there is no need to import it, as I use the pylab argument when calling ipython. So here is my second (and successful) attempt to read, filter and plot my data:

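# csv2rec is defined in matplotlib.mlab and exposed in the pylab namespace
rec = csv2rec('DIGEDA1.1.csv')
# keep only the rows where both columns are positive, so the log is defined
tt = where(logical_and(rec['3727'] > 0, rec['5007'] > 0))
rec2 = rec[tt]
x = rec2['5007']
y = rec2['3727']
plot(log(y), log(x), 'o')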


The rec variable contains the whole file, in a form very similar to an IDL structure. A very nice thing is that the names of the columns are taken directly from the first line of my file, which contains this information.
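The same kind of record-like array can also be obtained with NumPy alone, which may be handy where csv2rec is not available (as far as I know it is no longer shipped with recent matplotlib releases). This is only a sketch using numpy.genfromtxt; the in-memory sample is a made-up stand-in for my file, and the variable names are just for illustration.

```python
import io

import numpy as np

# Hypothetical stand-in for DIGEDA1.1.csv; the numbers are invented.
sample = io.StringIO(
    "5007,3727\n"
    "120.0,35.0\n"
    "-1.0,50.0\n"
    "300.0,-1.0\n"
    "80.0,10.0\n"
)

# names=True takes the field names from the first line of the file,
# giving a structured array indexed by column name.
rec = np.genfromtxt(sample, delimiter=',', names=True)

# Keep only the rows where both columns are strictly positive.
keep = (rec['5007'] > 0) & (rec['3727'] > 0)
rec2 = rec[keep]

print(rec.dtype.names)
print(rec2['5007'], rec2['3727'])
```

For a real file, the StringIO object would simply be replaced by the file name.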

In the end it is quite easy (well, it took me 3 hours to find all this in the Google jungle... that's why I'm writing this blog!).

1 comment:

  1. I found another way to read a file, if it is simple:
    db2=loadtxt("randomBig2.dat",skiprows=1,dtype=float)
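
The loadtxt call from the comment can be tried on in-memory data, as a small sketch. The sample text below is hypothetical (it is not the content of randomBig2.dat), and it assumes a whitespace-delimited file with a single header line. Note that loadtxt used this way returns a plain 2-D array, so the column names from the header are skipped rather than kept.

```python
import io

import numpy as np

# Hypothetical whitespace-delimited data with one header line.
sample = io.StringIO("a b\n1.0 2.0\n3.0 4.0\n5.0 6.0\n")

# skiprows=1 ignores the header; every remaining value is parsed as float.
db2 = np.loadtxt(sample, skiprows=1, dtype=float)

# Columns are accessed by position, not by name.
a = db2[:, 0]
b = db2[:, 1]
print(db2.shape)
```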
