I am trying to discretize some numbers by checking which range each one falls into and assigning a number based on that range. However, the result I get is not correct.
mapp is a dictionary that defines the ranges and the value that corresponds to each range.
lst is the list of numbers that I want to match against those ranges and assign identifiers to.
mapp = {(0,100): 1, (100,400): 2, (400,800): 3}
lst = [3.5, 5.4, 300.12, 500.78, 600.45, 900.546]

def discretize(mapping_dict, list_of_values):
    print "\n"
    location = []
    for x in sorted(list_of_values):
        for (lower_bound,upper_bound),value in mapping_dict.items():
            if round(x) in range(lower_bound,upper_bound):
                print round(x), "yes", value
                distance = mapping_dict[(lower_bound,upper_bound)]
                location.append((distance))
        else:
            print round(x), "no"
            distance = len(mapping_dict.items())+10
            location.append((distance))
    return location
The result I expect is [1, 1, 2, 3, 3, 13], but that is not what I get.
This is the actual, incorrect result:
4.0 yes 1
4.0 no #wrong!
5.0 yes 1
5.0 no #wrong!
300.0 yes 2
300.0 no #wrong!
501.0 yes 3
501.0 no #wrong!
600.0 yes 3
600.0 no #wrong!
901.0 no #CORRECT
[1, 13, 1, 13, 2, 13, 3, 13, 3, 13, 13]
I get "no" for 4.0, which is not correct, and the same happens for the other values.
Where is the problem?
Thanks
mapp = {(0,100): 1, (100,400): 2, (400,800): 3}
lst = [3.5, 5.4, 300.12, 500.78, 600.45, 900.546]
result = []
for l in lst:
    for m in mapp:
        # Half-open test, so a boundary value such as 100 still lands in a range.
        if m[0] <= l < m[1]:
            result.append(mapp[m])
print(result)
Output:
[1, 1, 2, 3, 3]
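EDIT: to also record a placeholder (-1) for numbers that fall in no range:
result = []
for l in lst:
    flag = True  # stays True until some range matches
    for m in mapp:
        if m[0] <= l < m[1]:
            result.append(mapp[m])
            flag = False
            break
    if flag:
        result.append(-1)
print(result)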
Output:
[1, 1, 2, 3, 3, -1]
I think I faced a similar problem once, because I found this small RangeDict class:
class RangeDict(dict):
    def __init__(self, *args):
        super().__init__()

    def __setitem__(self, k, v):
        if not isinstance(k, slice):
            raise ValueError('Indices must be slices.')
        super().__setitem__((k.start, k.stop), v)

    def __getitem__(self, k):
        for (start, stop), v in self.items():
            if start <= k < stop:
                return v
        raise IndexError('{} out of bounds.'.format(k))
I hope this class wraps your desired functionality. Note that lookup is O(N), not O(1).
Sample usage:
r = RangeDict()
r[0:100] = 1
r[100:400] = 2
r[400:800] = 3
for x in [3.5, 5.4, 300.12, 500.78, 600.45, 900.546]:
    print(r[x])
# The last value raises IndexError
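Regarding the O(N) lookup mentioned above: if the ranges happen to be sorted, contiguous and half-open (as they are in the question), a binary search over the boundaries is one possible way to get O(log N) lookups. This is only a sketch under that assumption; the names boundaries, identifiers and lookup are made up for illustration and are not part of the RangeDict class.

```python
from bisect import bisect_right

# Assumption: ranges are sorted, contiguous and half-open, i.e.
# [0, 100), [100, 400), [400, 800) as in the question.
boundaries = [0, 100, 400, 800]
identifiers = [1, 2, 3]  # identifiers[i] belongs to [boundaries[i], boundaries[i + 1])


def lookup(x, default=None):
    """Return the identifier of the range containing x, or default if none does."""
    # Index of the last boundary that is <= x.
    i = bisect_right(boundaries, x) - 1
    if 0 <= i < len(identifiers):
        return identifiers[i]
    return default


print([lookup(x) for x in [3.5, 5.4, 300.12, 500.78, 600.45, 900.546]])
# [1, 1, 2, 3, 3, None]
```

If the ranges can have gaps or overlap, this shortcut does not apply and the linear scan is the safer choice.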
By putting an else after your for loop you were on the right track! When you put an else after a loop, that else block is executed each time the loop exits normally, i.e. without hitting a break. Thus (assuming that your ranges are non-overlapping) you just need to add a break statement at the end of your if block, i.e. after location.append((distance)). Then it works as expected.
Also, instead of checking whether the rounded number is in the range (which, in Python 2, creates and searches a list each time!), you should just use <= and <. And you already have the value, so why not use it?
for (lower_bound, upper_bound), value in mapping_dict.items():
    if lower_bound <= x < upper_bound:
        location.append(value)
        break
else:
    location.append(len(mapping_dict) + 10)
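For completeness, here is how the whole function might look with those changes applied. This is a sketch written for Python 3 (the question uses Python 2 print statements), with the debug printing left out; the fallback variable is just a name introduced here for the "no match" marker from the question.

```python
def discretize(mapping_dict, list_of_values):
    """Map each value to the identifier of the half-open range [lower, upper) containing it."""
    fallback = len(mapping_dict) + 10  # same "no match" marker as in the question
    location = []
    for x in sorted(list_of_values):
        for (lower_bound, upper_bound), value in mapping_dict.items():
            if lower_bound <= x < upper_bound:
                location.append(value)
                break  # a match was found, so the else clause is skipped
        else:
            # Reached only when the inner loop finished without a break.
            location.append(fallback)
    return location


mapp = {(0, 100): 1, (100, 400): 2, (400, 800): 3}
lst = [3.5, 5.4, 300.12, 500.78, 600.45, 900.546]
print(discretize(mapp, lst))  # [1, 1, 2, 3, 3, 13]
```

Note that this compares x directly instead of round(x), so a value like 99.7 stays in the first range rather than being rounded up into the second.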