Friday, 8 August 2014

python - django - Stack Overflow: difference between filter with multiple arguments and chained filters


What is the difference between filter with multiple arguments and chained filters in Django?




As the generated SQL statements show, the difference is not an "OR", as some may suspect. It is how the WHERE and JOIN clauses are placed.


Example 1 (same joined table):


(example from https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-valued-relationships)


Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)

This will give you all the Blogs that have at least one entry satisfying both (entry__headline__contains='Lennon') AND (entry__pub_date__year=2008), which is what you would expect from this query. Result: Blog with {entry.headline: 'Life of Lennon', entry.pub_date: 2008}


Example 2 (chained)


Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)

This will cover all the results from Example 1, but it can return more. The first filter keeps the blogs that have an entry matching (entry__headline__contains='Lennon'); the second then keeps, from that result, the blogs that have an entry matching (entry__pub_date__year=2008). The two entries need not be the same one.


The difference is that it will also give you results like: Blog with two entries, {entry.headline: 'Lennon', entry.pub_date: 2000} and {entry.headline: 'Bill', entry.pub_date: 2008}
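The join placement can be reproduced outside Django with the standard-library sqlite3 module. This is only a sketch: the table names, the integer pub_year column, and the hand-written SQL are assumptions that approximate the shape of the queries Django generates, not its exact output (DISTINCT and ORDER BY are added here just to keep the result tidy).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER REFERENCES blog(id),
        headline TEXT,
        pub_year INTEGER
    );
    INSERT INTO blog VALUES (1, 'first blog');
    INSERT INTO blog VALUES (2, 'second blog');
    -- blog 1: a single entry matches both conditions
    INSERT INTO entry VALUES (1, 1, 'Life of Lennon', 2008);
    -- blog 2: each condition is matched by a different entry
    INSERT INTO entry VALUES (2, 2, 'Lennon', 2000);
    INSERT INTO entry VALUES (3, 2, 'Bill', 2008);
""")

# Shape of filter(a, b): one JOIN, both conditions apply to the same entry row.
single_call = conn.execute("""
    SELECT DISTINCT blog.id FROM blog
    JOIN entry ON entry.blog_id = blog.id
    WHERE entry.headline LIKE '%Lennon%' AND entry.pub_year = 2008
    ORDER BY blog.id
""").fetchall()

# Shape of filter(a).filter(b): the entry table is joined twice,
# so each condition may be satisfied by a different entry row.
chained = conn.execute("""
    SELECT DISTINCT blog.id FROM blog
    JOIN entry e1 ON e1.blog_id = blog.id
    JOIN entry e2 ON e2.blog_id = blog.id
    WHERE e1.headline LIKE '%Lennon%' AND e2.pub_year = 2008
    ORDER BY blog.id
""").fetchall()

print(single_call)  # [(1,)]
print(chained)      # [(1,), (2,)]
```

Blog 2 appears only in the second result because no single entry of it satisfies both conditions.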


One table


But if the query doesn't involve joined tables, as in the examples from Yuji and DTing, the result is the same.




You can use the connection module to see the raw SQL queries and compare them. As Yuji explained, for the most part they are equivalent, as shown here:


>>> from django.db import connection
>>> samples1 = Unit.objects.filter(color="orange", volume=None)
>>> samples2 = Unit.objects.filter(color="orange").filter(volume=None)
>>> list(samples1)
[]
>>> list(samples2)
[]
>>> for q in connection.queries:
...     print(q['sql'])
...
SELECT `samples_unit`.`id`, `samples_unit`.`color`, `samples_unit`.`volume` FROM `samples_unit` WHERE (`samples_unit`.`color` = orange AND `samples_unit`.`volume` IS NULL)
SELECT `samples_unit`.`id`, `samples_unit`.`color`, `samples_unit`.`volume` FROM `samples_unit` WHERE (`samples_unit`.`color` = orange AND `samples_unit`.`volume` IS NULL)
>>>



Most of the time, there is only one possible set of results for a query.


Chaining filters becomes useful when you are dealing with many-to-many (m2m) relations:


Consider this:


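from django.db.models import Q

# returns every Model related to m2m object 1
Model.objects.filter(m2m_field=1)

# returns every Model related to both 1 AND 2
Model.objects.filter(m2m_field=1).filter(m2m_field=2)

# this will NOT work: within a single filter() one related row
# would have to equal both 1 and 2, so the result is empty
Model.objects.filter(Q(m2m_field=1) & Q(m2m_field=2))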

Other examples are welcome.




There is a difference when the query spans a related object, for example:


# Author is defined first so that Book can reference it
class Author(models.Model):
    name = models.ForeignKey(Region)

class Book(models.Model):
    author = models.ForeignKey(Author)
    name = models.ForeignKey(Region)

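The query


Author.objects.filter(Q(book__name='name1') & Q(book__name='name2'))

returns an empty set. (The two conditions are written as Q objects, imported from django.db.models, because Python does not allow the same keyword argument twice in one call.)


The query


Author.objects.filter(book__name='name1').filter(book__name='name2')

returns authors that have books with both 'name1' and 'name2'.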


For details, see https://docs.djangoproject.com/en/dev/topics/db/queries/#s-spanning-multi-valued-relationships




The results of a "multiple-arguments filter query" differ from those of a "chained filter query" in the following case:



Selecting referenced objects on the basis of referencing objects, where the relationship is one-to-many (or many-to-many).


Multiple filters:


    Referenced.objects.filter(referencing1__a=x, referencing1__b=y)
    # same referencing model: ^^^^^^^^^^^^       ^^^^^^^^^^^^

Chained filters:


    Referenced.objects.filter(referencing1__a=x).filter(referencing1__b=y)

The two queries can return different results if more than one row in the referencing model Referencing1 can refer to the same row in the referenced model Referenced. This is the case when Referenced and Referencing1 have either a 1:N (one-to-many) or an N:M (many-to-many) relationship.



Example:


Consider an application my_company with two models, Employee and Dependent. An employee in my_company can have more than one dependent (in other words, a dependent is the son or daughter of a single employee, while an employee can have more than one son or daughter).
I assume here that a husband and wife cannot both work at my_company, so this is a 1:N example.


So Employee is the referenced model, which can be referenced by more than one Dependent, the referencing model. Now consider the following relation state:



Employee:      Dependent:
+------+       +------+--------+-------------+--------------+
| name |       | name | E-name | school_mark | college_mark |
+------+       +------+--------+-------------+--------------+
| A    |       | a1   | A      | 79          | 81           |
| B    |       | b1   | B      | 80          | 60           |
+------+       | b2   | B      | 68          | 86           |
               +------+--------+-------------+--------------+

Dependent a1 refers to employee A, and dependents b1 and b2 refer to employee B.



Now my query is:


Find all employees who have a son/daughter with distinction marks (say >= 75%) in both college and school.


>>> Employee.objects.filter(dependent__school_mark__gte=75,
... dependent__college_mark__gte=75)

[<Employee: A>]

The output is 'A' because dependent 'a1', who has distinction marks in both college and school, depends on employee 'A'. Note that 'B' is not selected because neither of 'B''s children has distinction marks in both college and school. Relational algebra:



Employee ⋈[school_mark >= 75 AND college_mark >= 75] Dependent



In the second case, I need a query:


Find all employees who have some dependent with distinction marks in school and some dependent (not necessarily the same one) with distinction marks in college.


>>> Employee.objects.filter(
... dependent__school_mark__gte=75
... ).filter(
... dependent__college_mark__gte=75)

[<Employee: A>, <Employee: B>]

This time 'B' is also selected because 'B' has two children (more than one!): 'b1' has a distinction mark in school and 'b2' has a distinction mark in college.
The order of the filters doesn't matter; we can also write the above query as:


>>> Employee.objects.filter(
... dependent__college_mark__gte=75
... ).filter(
... dependent__school_mark__gte=75)

[<Employee: A>, <Employee: B>]

The result is the same! The relational algebra can be:



(Employee ⋈[school_mark >= 75] Dependent) ⋈[college_mark >= 75] Dependent
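The two semantics can also be emulated in plain Python, without a database. This is a sketch, not Django code: the tuples copy the rows of the Employee/Dependent tables above, and the helper names (dependents_of, single_filter, chained_filter) are made up for illustration.

```python
# (dependent name, employee name, school_mark, college_mark)
dependents = [
    ("a1", "A", 79, 81),
    ("b1", "B", 80, 60),
    ("b2", "B", 68, 86),
]
employees = ["A", "B"]


def dependents_of(employee):
    """Return the dependent rows that reference the given employee."""
    return [d for d in dependents if d[1] == employee]


def single_filter(employee):
    # Like filter(school >= 75, college >= 75):
    # one and the same dependent must satisfy both conditions.
    return any(d[2] >= 75 and d[3] >= 75 for d in dependents_of(employee))


def chained_filter(employee):
    # Like filter(school >= 75).filter(college >= 75):
    # each condition may be satisfied by a different dependent.
    deps = dependents_of(employee)
    return any(d[2] >= 75 for d in deps) and any(d[3] >= 75 for d in deps)


single_result = [e for e in employees if single_filter(e)]
chained_result = [e for e in employees if chained_filter(e)]

print(single_result)   # ['A']
print(chained_result)  # ['A', 'B']
```

The single any() in single_filter plays the role of one join, while the two independent any() calls in chained_filter play the role of two joins on the same table.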



Note the following:


dq1 = Dependent.objects.filter(college_mark__gte=75, school_mark__gte=75)
dq2 = Dependent.objects.filter(college_mark__gte=75).filter(school_mark__gte=75)

Both output the same result: [<Dependent: a1>]


I checked the SQL queries generated by Django using print dq1.query and print dq2.query; both are the same (Django 1.6).


But semantically the two look different to me: the first looks like a simple selection σ[school_mark >= 75 AND college_mark >= 75](Dependent), and the second like a slow nested query σ[school_mark >= 75](σ[college_mark >= 75](Dependent)).


If you need it, the code is on codepad.


By the way, this is covered in the documentation under "Spanning multi-valued relationships"; I have just added an example, which I think will be helpful for someone new.


