Monday, June 6, 2011

Structure-like objects (2): record arrays, access and views

Some additions to the previous post on structured arrays and record arrays:

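import numpy as np
# 'U16' gives the name field room for 16 characters (an arbitrary width);
# a bare str leaves the field with zero width, so it cannot hold any text.
a = np.zeros((10,), dtype=[('name', 'U16'), ('ra', float), ('dec', float)])
# Random coordinates: ra in [0, 360), dec in [-90, 90)
a['ra'] = np.random.random_sample(10)*360
a['dec'] = np.random.random_sample(10)*180-90
# Boolean mask: one True/False per row
tt = ((a['ra'] > 5.) & (abs(a['dec']) < 10.))
b = a[tt]
a.size
#10
tt.size
#10
tt.sum()
#1  (depends on the random draw)
b.size
#1  (always equal to tt.sum())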


If no names are given to the fields, they default to f0, f1, ..., fN.
One can access the data without knowing the field names:
a[a.dtype.names[2]]
This is more complicated than IDL> a.(2), but at least it's possible ;-) And it has one advantage: you can change the names:
a.dtype.names = ('obs','ra','dec')
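
The default names and the access by position can be tried on a small throwaway array. This is only a minimal sketch: the array c, its 'U8, f8, f8' layout and the values written into it are made up for the illustration.

```python
import numpy as np

# No field names given: NumPy assigns f0, f1, f2 automatically.
# (c and its dtype string are hypothetical, chosen for this example.)
c = np.zeros(3, dtype='U8, f8, f8')
print(c.dtype.names)          # ('f0', 'f1', 'f2')

# Access the third field by position, without knowing its name.
c[c.dtype.names[2]] = [1.0, 2.0, 3.0]

# Rename all fields in one assignment; the data are untouched.
c.dtype.names = ('obs', 'ra', 'dec')
print(c['dec'])               # [1. 2. 3.]
```

The rename only changes the labels, so the values written through the old name f2 are now reachable as 'dec'.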

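Be careful with subsets: a single element (or a slice) is a view on the original data, whereas a boolean-mask selection such as a[tt] is a copy: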
b=a[1]
a['dec'][1]
# 0.0
b['dec']
# 0.0
b['dec'] = 2
a['dec'][1]
# 2.0
But this is not easy to see with an identity test, because each access builds a new Python object:
b['dec'] is a['dec'][1]
# False
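
Since the identity test says nothing about the underlying memory, a more direct check is to ask NumPy whether two arrays overlap. The following is a minimal sketch, assuming np.shares_memory is available (it exists in recent NumPy releases); the names s and m are introduced only for this example.

```python
import numpy as np

a = np.zeros(10, dtype=[('name', 'U16'), ('ra', float), ('dec', float)])

s = a[1:4]                     # basic slice: a view on a
m = a[a['ra'] >= 0.]           # boolean mask: a copy of the selected rows

print(np.shares_memory(a, s))  # True
print(np.shares_memory(a, m))  # False

s['dec'][0] = 2.               # writes through to a[1]
m['dec'][0] = 5.               # only changes the copy
print(a['dec'][:2])            # [0. 2.]
```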

Now we can turn the structured array a into a recarray:

a2 = a.view(np.recarray)
This adds a new way to access the data, as attributes:
>>> a2.dec
array([ 32.61106119,  82.72958898, -18.46190884,  44.79729473,
       -54.65838972, -23.78818937,   3.56472044, -79.63061338,
        15.81108779,  73.37221597])

Be careful: this is a view, which means that the data are NOT duplicated; a2 and a share the same memory:
>>> a2.dec[1] = 2
>>>
>>> a2.dec[1]
2.0
>>>
>>> a[1]['dec']
2.0
>>>
>>> a['dec'][1]
2.0
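
If an independent record array is wanted instead, one option is to copy first and view afterwards. A minimal sketch, where a3 is a name introduced only for this example:

```python
import numpy as np

a = np.zeros(10, dtype=[('name', 'U16'), ('ra', float), ('dec', float)])

a2 = a.view(np.recarray)          # shares memory with a
a3 = a.copy().view(np.recarray)   # independent record array

a2.dec[1] = 2.                    # also changes a
a3.dec[1] = 7.                    # leaves a untouched
print(a['dec'][1])                # 2.0
print(a3.dec[1])                  # 7.0
```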


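Attribute access goes through extra Python-level lookups, which can slow down access to a2 (and, as observed here, to a as well!). So recarrays are mostly worth it for convenience, or for small tables.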
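
The overhead can be measured roughly with timeit. This is only a sketch under arbitrary assumptions (a table of 1000 rows, 10000 repetitions); the numbers will vary with the machine and the NumPy version, so no expected output is given.

```python
import timeit
import numpy as np

# Table size and repetition count are arbitrary choices for this sketch.
a = np.zeros(1000, dtype=[('name', 'U16'), ('ra', float), ('dec', float)])
a2 = a.view(np.recarray)

n = 10000
t_key = timeit.timeit(lambda: a['dec'], number=n)     # field access by key on a
t_key2 = timeit.timeit(lambda: a2['dec'], number=n)   # field access by key on a2
t_attr = timeit.timeit(lambda: a2.dec, number=n)      # attribute access on a2

print("a['dec']  : %.4f s" % t_key)
print("a2['dec'] : %.4f s" % t_key2)
print("a2.dec    : %.4f s" % t_attr)
```

Comparing the three timings shows how much of the cost comes from the attribute lookup and how much from the record array itself.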

Refs:
http://docs.scipy.org/doc/numpy/user/basics.rec.html
http://www.scipy.org/Cookbook/Recarray
