python - apply a hierarchy or MultiIndex to pandas columns

I have seen lots of examples of how to arrange DataFrame row indexes hierarchically, but I am trying to do the same for columns and don't understand the syntax.

I am reading the contents of a CSV file as follows:

import pandas
df = pandas.read_csv("data.csv")

and data.csv contains something like:

rno,mark1,lab1,mark2,lab2
1,78,45,34,54
2,23,54,87,46

so In [1]: df gives

   rno  mark1  lab1  mark2  lab2
0    1     78    45     34    54
1    2     23    54     87    46

What I would like to do is add a hierarchical index, or even something akin to a tag, to the columns, so that they look something like this:

        Subject1     Subject2   
   rno  mark1  lab1  mark2  lab2
0    1     78    45     34    54
1    2     23    54     87    46

Here is a quick-fix solution for you:

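>>> import pandas as pd
>>> data = pd.read_csv('data.csv')
>>> # First list is the new top level; data.columns becomes the second level
>>> arrays = [['', 'Subject1', 'Subject1', 'Subject2', 'Subject2'], data.columns]
>>> df = pd.DataFrame(data.values, columns=arrays)
>>> print(df)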
        Subject1        Subject2      
   rno     mark1  lab1     mark2  lab2
0    1        78    45        34    54
1    2        23    54        87    46

[2 rows x 5 columns]
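
A variant of the same idea, as a minimal sketch rather than the answer's own method: assigning a MultiIndex to data.columns directly should avoid rebuilding the frame from data.values, so each column keeps its dtype. The CSV is inlined with io.StringIO only to make the snippet self-contained, and the header names (mark1, lab1, ...) are assumed to match the printed frame.

```python
import io

import pandas as pd

# Inlined stand-in for data.csv (assumed header names)
csv_text = """rno,mark1,lab1,mark2,lab2
1,78,45,34,54
2,23,54,87,46
"""

data = pd.read_csv(io.StringIO(csv_text))

# One top-level label per existing column; '' leaves rno ungrouped
top = ['', 'Subject1', 'Subject1', 'Subject2', 'Subject2']

# Replace the flat column index with a two-level one, in place
data.columns = pd.MultiIndex.from_arrays([top, data.columns])

print(data)
```

The top list must have exactly one entry per column, in column order.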

Just another way to do the same:

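>>> import pandas as pd
>>> data = pd.read_csv('data.csv')
>>> # .ix was removed from pandas; .iloc does the same positional selection
>>> data_pieces = [data.iloc[:, [0]], data.iloc[:, [1, 2]], data.iloc[:, [3, 4]]]
>>> data = pd.concat(data_pieces, axis=1, keys=['', 'Subject1', 'Subject2'])
>>> print(data)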
        Subject1        Subject2      
   rno     mark1  lab1     mark2  lab2
0    1        78    45        34    54
1    2        23    54        87    46

[2 rows x 5 columns]
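
Once the columns are hierarchical, the top-level labels can be used for selection. The following is a small sketch of that, building the frame with pd.concat and keys as above (using .iloc for the positional slices). The inlined CSV and its header names are assumptions made so the snippet runs on its own.

```python
import io

import pandas as pd

# Inlined stand-in for data.csv (assumed header names)
csv_text = """rno,mark1,lab1,mark2,lab2
1,78,45,34,54
2,23,54,87,46
"""

flat = pd.read_csv(io.StringIO(csv_text))

# Split the columns into groups and label each group with a key
pieces = [flat.iloc[:, [0]], flat.iloc[:, [1, 2]], flat.iloc[:, [3, 4]]]
data = pd.concat(pieces, axis=1, keys=['', 'Subject1', 'Subject2'])

# Selecting a top-level label returns the sub-frame below it
subject1 = data['Subject1']

# A (top, bottom) tuple selects a single column as a Series
lab2 = data[('Subject2', 'lab2')]

print(subject1)
print(lab2)
```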

Source

Stackoverflow Blog

Wednesday, 9 April 2014

python - apply a hierarchy or MultiIndex to pandas columns - Stack Overflow
