Usage of iterator() on django queryset

django, django-queryset, iterator

I came across some strange behaviour recently, and need to check my understanding.

I'm applying a simple filter to the model and then iterating over the results.

e.g.

allbooks = Book.objects.filter(author='A.A. Milne')

for book in allbooks:
    do_something(book)

Oddly, the loop was only processing a partial list of books.

However, when I run the same code with iterator(), every book is returned as expected.

i.e.

for book in allbooks.iterator():
    do_something(book)

Any idea why?

P.S. I did look through the Django documentation, but I can't see how the queryset would already be cached anywhere else…

iterator()
Evaluates the QuerySet (by performing the query) and returns an iterator over the results. A QuerySet typically caches its results internally so that repeated evaluations do not result in additional queries; iterator() will instead read results directly, without doing any caching at the QuerySet level. For a QuerySet which returns a large number of objects, this often results in better performance and a significant reduction in memory.

Note that using iterator() on a QuerySet which has already been evaluated will force it to evaluate again, repeating the query.
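
To make the caching behaviour in the quoted documentation concrete, here is a minimal sketch using only the standard library. ToyQuerySet, its query_count attribute, and the book table are hypothetical stand-ins invented for illustration; this is not Django's implementation, only a toy model of "cache on first iteration, bypass the cache in iterator()".

```python
import sqlite3


class ToyQuerySet:
    """Toy stand-in for a lazy, caching query (NOT Django's QuerySet)."""

    def __init__(self, conn, sql, params=()):
        self.conn = conn
        self.sql = sql
        self.params = params
        self._result_cache = None  # filled by the first plain iteration
        self.query_count = 0       # how many times the SQL was executed

    def _run(self):
        self.query_count += 1
        return self.conn.execute(self.sql, self.params)

    def __iter__(self):
        # The first iteration runs the query and stores every row;
        # later iterations reuse the stored rows.
        if self._result_cache is None:
            self._result_cache = list(self._run())
        return iter(self._result_cache)

    def iterator(self):
        # Always runs the query again and streams rows without caching them.
        for row in self._run():
            yield row


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?)",
    [
        ("Winnie-the-Pooh", "A.A. Milne"),
        ("The House at Pooh Corner", "A.A. Milne"),
        ("Peter Pan", "J.M. Barrie"),
    ],
)

allbooks = ToyQuerySet(
    conn, "SELECT title FROM book WHERE author = ? ORDER BY title", ("A.A. Milne",)
)
first = [row[0] for row in allbooks]                # runs the query
second = [row[0] for row in allbooks]               # served from the cache
streamed = [row[0] for row in allbooks.iterator()]  # runs the query again
```

In this toy model both styles see the same rows; the only difference is how often the query runs and whether the rows are kept in memory. That matches the documentation's description and is why a plain loop is not expected to return fewer rows than iterator().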

Best Answer

oddly, it was returning only a partial list of books.

That's not how a queryset is supposed to behave. Iterating over a queryset should give you every record your database returns for that query. Debug your code and you'll find the error; if you don't, debug it again.

It's easy to check in the REPL. Run python manage.py shell:

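from app.models import Model

# Iterate over the filtered queryset and print every object
for o in Model.objects.filter(fieldname="foo"):
    print(o)

# Let's see the DB queries (only recorded when DEBUG = True)
from django.db import connection
print(connection.queries)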
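
One way to narrow the problem down while debugging is to compare the primary keys each loop actually sees. The sketch below is only an illustration: compare_pks is a hypothetical helper, and the SimpleNamespace objects stand in for model instances so the snippet runs without a database. It relies only on each object having a pk attribute, which Django model instances do.

```python
from types import SimpleNamespace


def compare_pks(plain, streamed):
    """Return (only_in_plain, only_in_streamed) as sorted lists of primary keys."""
    plain_pks = {obj.pk for obj in plain}
    streamed_pks = {obj.pk for obj in streamed}
    return sorted(plain_pks - streamed_pks), sorted(streamed_pks - plain_pks)


# Stand-in objects for illustration; in the shell you would pass querysets.
plain = [SimpleNamespace(pk=1), SimpleNamespace(pk=2)]
streamed = [SimpleNamespace(pk=1), SimpleNamespace(pk=2), SimpleNamespace(pk=3)]

missing_from_streamed, missing_from_plain = compare_pks(plain, streamed)
print(missing_from_streamed, missing_from_plain)
```

In the Django shell the equivalent call would be compare_pks(allbooks, allbooks.iterator()). If both lists come back empty, the queryset itself is returning every row, and the missing books are more likely being lost somewhere inside the loop body.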