N+1 queries are a performance problem in which the application makes database queries in a loop instead of making a single query that returns all the information at once. Each query is a separate round trip to the database, so querying in a loop can be many times slower than querying once. This problem often occurs when you use an object-relational mapping (ORM) tool in web frameworks like Django or Ruby on Rails.
The detector for performance issues looks for a set of sequential, non-overlapping database spans with similar descriptions. It also uses the following criteria:
- Total duration of involved spans must exceed a threshold (usually 100 ms)
- Total count of involved spans must exceed a threshold (usually five spans)
- Involved spans cannot be truncated
If Sentry is not detecting an N+1 issue where you expect one, it's probably because the transaction didn't meet one of the above criteria.
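The criteria above can be sketched as a small check. This is an illustrative approximation, not Sentry's actual detector: the `Span` record, its field names, and the function name are hypothetical, "similar descriptions" is simplified to exact equality, and the thresholds are the "usual" values quoted above.

```python
from dataclasses import dataclass

# "Usual" thresholds from the criteria above; illustrative defaults only.
MIN_TOTAL_DURATION_MS = 100
MIN_SPAN_COUNT = 5


@dataclass
class Span:
    """Hypothetical, simplified span record (not Sentry's schema)."""
    description: str  # e.g. the SQL text of the query
    start_ms: float
    end_ms: float
    truncated: bool = False


def meets_n_plus_one_criteria(spans: list[Span]) -> bool:
    """Check one run of database spans against the listed criteria."""
    if not spans:
        return False
    ordered = sorted(spans, key=lambda s: s.start_ms)
    # Similar descriptions: simplified here to exact equality.
    if any(s.description != ordered[0].description for s in ordered):
        return False
    # Sequential and non-overlapping: no span starts before the previous one ends.
    if any(b.start_ms < a.end_ms for a, b in zip(ordered, ordered[1:])):
        return False
    # Involved spans cannot be truncated.
    if any(s.truncated for s in ordered):
        return False
    total_ms = sum(s.end_ms - s.start_ms for s in ordered)
    return total_ms > MIN_TOTAL_DURATION_MS and len(ordered) > MIN_SPAN_COUNT


query = "SELECT * FROM author WHERE id = %s"
spans = [Span(query, start_ms=i * 15, end_ms=i * 15 + 12) for i in range(10)]
print(meets_n_plus_one_criteria(spans))      # True: 10 spans, 120 ms in total
print(meets_n_plus_one_criteria(spans[:4]))  # False: only 4 spans, 48 ms in total
```

The second call shows the common reason for a missed detection: the loop is real, but the run of spans is too short or too fast to cross the thresholds.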
The evidence for an N+1 queries problem has four main aspects:
- Transaction name
- Parent span - This can be a view, a serializer, or another span that logically groups the queries.
- Repeating span - This is the "N" of N+1 queries. This is the looped query that should have been part of a bulk query.
Consider a book review website. It has two ORM models, `Book` and `Author`, each with a corresponding database table. The website shows a list of the ten oldest books and their respective authors. The code might look like this:
```python
from django.http import HttpResponse

def books(request):
    books = Book.objects.all()[:10]
    book_list = [book.title + " by " + book.author.name for book in books]
    return HttpResponse((", ").join(book_list))
```
This code has a subtle performance problem. Each call to `book.author.name` makes a query to fetch the book's author. In total, this code makes 11 queries: one query to fetch the list of books, and 10 more queries to fetch the author of each book. This results in a characteristic waterfall of query spans.
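To make the query count concrete without a Django project, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, the sample rows, and the helper names (`titles_with_n_plus_one`, `titles_with_join`) are assumptions for illustration, and the SQL only approximates what an ORM would emit. It counts the `SELECT` statements issued by the looped approach and, for contrast, by a single query that returns all the information at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
""")
conn.executemany("INSERT INTO author VALUES (?, ?)",
                 [(i, f"Author {i}") for i in range(1, 11)])
conn.executemany("INSERT INTO book VALUES (?, ?, ?)",
                 [(i, f"Book {i}", i) for i in range(1, 11)])
conn.commit()

# Record every SELECT statement sent to the database from here on.
statements = []

def record(sql):
    if sql.lstrip().upper().startswith("SELECT"):
        statements.append(sql)

conn.set_trace_callback(record)


def titles_with_n_plus_one(conn):
    """One query for the books, then one more query per book for its author."""
    books = conn.execute(
        "SELECT title, author_id FROM book ORDER BY id LIMIT 10").fetchall()
    lines = []
    for title, author_id in books:
        (name,) = conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)).fetchone()
        lines.append(f"{title} by {name}")
    return lines


def titles_with_join(conn):
    """A single query that fetches books and authors together."""
    rows = conn.execute(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id "
        "ORDER BY book.id LIMIT 10").fetchall()
    return [f"{title} by {name}" for title, name in rows]


statements.clear()
looped = titles_with_n_plus_one(conn)
n_plus_one_count = len(statements)

statements.clear()
joined = titles_with_join(conn)
join_count = len(statements)

print(n_plus_one_count, join_count)  # 11 1
print(looped == joined)              # True: same result, far fewer queries
```

Both functions return the same list; only the number of round trips differs, which is what the span waterfall makes visible.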
To fix this performance issue, you could use the `select_related` method in Django, like so:
```python
from django.http import HttpResponse

def books(request):
    books = Book.objects.select_related("author").all()[:10]
    book_list = [book.title + " by " + book.author.name for book in books]
    return HttpResponse((", ").join(book_list))
```
With `select_related`, Django will `JOIN` the tables ahead of time and preload the author information. That way, calling `book.author.name` does not need to make an extra query. Instead of a long waterfall, there is a single