Rank a 2-column array with a tie-breaker and save (Python, numpy)

I need to rank an array by a single column, rank it again using a second column as a tie-breaker, and then save both ranks to the database.

Array:

import numpy as np

array = np.array(
    [(70, 3, 100),
     (72, 3, 101),
     (70, 2, 102)],
    dtype=[('score', 'int8'),
           ('tiebreaker', 'int8'),
           ('row_id', 'int8')])
# array['score'] evaluates to array([70, 72, 70], dtype=int8)

The first rank, using only the 'score' column, would return:

(1,3,1)

The second rank, using the 'score' and 'tiebreaker' columns, would return:

(2,3,1)

Then I want to save those two ranks to the database, for example:

# Look up the row by its id (cast the numpy scalar to a plain int)
result1 = Result.objects.get(id=int(array[0]['row_id']))
result1.relative_rank = 1
result1.absolute_rank = 2
result1.save()
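
To apply the same update to every row rather than one, a minimal sketch could pair each row_id with its two ranks first. The ranks below are the expected values from the question, hard-coded for illustration; `build_rank_updates` is a hypothetical helper, and the Django call in the comment is only one possible way to write the values back.

```python
import numpy as np

array = np.array(
    [(70, 3, 100), (72, 3, 101), (70, 2, 102)],
    dtype=[('score', 'int8'), ('tiebreaker', 'int8'), ('row_id', 'int8')])

# Expected ranks from the question, hard-coded here for illustration.
relative_ranks = [1, 3, 1]
absolute_ranks = [2, 3, 1]


def build_rank_updates(rows, relative, absolute):
    """Pair each row_id with its two ranks, converted to plain Python ints."""
    return [
        {'id': int(row_id), 'relative_rank': int(rel), 'absolute_rank': int(ab)}
        for row_id, rel, ab in zip(rows['row_id'], relative, absolute)
    ]


updates = build_rank_updates(array, relative_ranks, absolute_ranks)
for update in updates:
    print(update)
    # With the question's Django model this could become, for example:
    # Result.objects.filter(id=update['id']).update(
    #     relative_rank=update['relative_rank'],
    #     absolute_rank=update['absolute_rank'])
```

Converting to plain ints before touching the database avoids handing numpy scalar types to the database layer.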

You can use scipy.stats.rankdata, as follows:

In [10]: a
Out[10]: 
array([(70, 3, 100), (72, 3, 101), (70, 2, 102)], 
      dtype=[('score', 'i1'), ('tiebreaker', 'i1'), ('row_id', 'i1')])

In [11]: from scipy.stats import rankdata

First rank:

In [12]: rankdata(a['score'], method='min').astype(int)
Out[12]: array([1, 3, 1])

Second rank:

In [13]: rankdata(256*a['score'] + a['tiebreaker'], method='min').astype(int)
Out[13]: array([2, 3, 1])

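The value used in the second rank (256*a['score'] + a['tiebreaker']) relies on the data having type int8: an int8 tiebreaker spans only 256 distinct values, so multiplying the score by 256 guarantees that the tiebreaker can never outweigh a difference in score. Wider integer types would need a larger multiplier. Recent NumPy versions may also refuse to multiply an int8 array by 256 directly, because 256 does not fit in int8; casting first, for example a['score'].astype(int), avoids that.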
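
If the columns are not int8, one alternative is to sort on both columns with np.lexsort and assign 'min'-style ranks by hand. This is a sketch of a swapped-in technique, not the approach from the answer; `rank_with_tiebreak` is a hypothetical helper name. The packed-key variant is included with an explicit widening cast for comparison.

```python
import numpy as np
from scipy.stats import rankdata

a = np.array(
    [(70, 3, 100), (72, 3, 101), (70, 2, 102)],
    dtype=[('score', 'int8'), ('tiebreaker', 'int8'), ('row_id', 'int8')])


def rank_with_tiebreak(primary, secondary):
    """Return 1-based 'min'-style ranks ordered by primary, then secondary."""
    primary = np.asarray(primary)
    secondary = np.asarray(secondary)
    n = primary.shape[0]
    # np.lexsort sorts by the LAST key first, so the primary key goes last.
    order = np.lexsort((secondary, primary))
    p, s = primary[order], secondary[order]
    # Mark the first element of every group of identical (primary, secondary) pairs.
    starts = np.ones(n, dtype=bool)
    starts[1:] = (p[1:] != p[:-1]) | (s[1:] != s[:-1])
    # Each element takes the 1-based sorted position of its group's first element.
    first_pos = np.maximum.accumulate(np.where(starts, np.arange(1, n + 1), 0))
    ranks = np.empty(n, dtype=int)
    ranks[order] = first_pos
    return ranks


absolute_rank = rank_with_tiebreak(a['score'], a['tiebreaker'])
print(absolute_rank)  # [2 3 1]

# Packed-key variant from the answer, widened to int64 before multiplying.
packed_key = 256 * a['score'].astype(np.int64) + a['tiebreaker']
packed_rank = rankdata(packed_key, method='min').astype(int)
```

Both variants agree on the sample data; the lexsort version does not depend on the width of the tiebreaker column.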

Check the rankdata docstring to see whether a different method would be more appropriate for the second rank. If you know there will be no ties in the second rank, the method doesn't matter.
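
To see what the choice of method changes, the following sketch runs rankdata on the question's scores with each documented method. For the scores 70, 72, 70 the results are: 'min' gives 1, 3, 1; 'max' gives 2, 3, 2; 'average' gives 1.5, 3, 1.5; 'dense' gives 1, 2, 1; 'ordinal' gives 1, 3, 2.

```python
import numpy as np
from scipy.stats import rankdata

scores = np.array([70, 72, 70])

# Rank the same tied data with every method rankdata supports.
ranks_by_method = {
    method: rankdata(scores, method=method).tolist()
    for method in ('min', 'max', 'average', 'dense', 'ordinal')
}

for method, ranks in ranks_by_method.items():
    print(method, ranks)
```

The methods differ only in how tied values are handled, which is why the choice stops mattering once the tie-breaker removes every tie.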

Source: Stack Overflow, via Stackoverflow Blog, Saturday 5 April 2014
