8000 Fix pagination_total index option to prevent SELECT count(*) queries by losvedir · Pull Request #3848 · activeadmin/activeadmin · GitHub
[go: up one dir, main page]

Skip to content

Fix pagination_total index option to prevent SELECT count(*) queries #3848

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

losvedir
Copy link
Contributor
@losvedir losvedir commented Mar 9, 2015

Addresses #3847.

The pagination_total option is supposed to prevent expensive SELECT count(*) queries from being performed when viewing a resource's index page.

However, while this option was hiding the display of the total number of pages, it did not actually prevent the query from being performed.

There were two reasons:

First, in #build_pagination, the line: text_node paginate collection, options would trigger Kaminari to call #total_pages on the collection, which would trigger the query. Excerpt of kaminari code:

def paginate(scope, options = {}, &block)
  options[:total_pages] ||= options[:num_pages] || scope.total_pages

By passing in the :total_pages option, we short circuit this assignment and avoid the query.

The downside is we do so by hardcoding in an essentially random total pages count of 100. I chose the number 100, because I wanted to ensure ellipses would appear in Kaminari's "Page 1, 2, 3, ..., Next, Last" links, no matter how many links are configured to be displayed. However, if the resource only had 2 pages worth of items, then clicking the "3" would result in an empty page. That said, without querying the resource, there's no way to know how many pages there should be.

Second, in #page_entries_info, the call collection.num_pages also triggered a SELECT count(*). I re-arranged the conditionals to avoid that call if @display_total is false, while trying to keep the behavior the same.

@losvedir 8000
Copy link
Contributor Author
losvedir commented Mar 9, 2015

Do you want me to modify the line lengths? They're mostly from code I just moved around, and I didn't want to be too invasive. My comments are longer than 80 characters since I didn't realize the length restriction. I can change the line breaks if you want.

@losvedir losvedir force-pushed the 3847-fix-pagination-total-false branch from ae7145d to cc476fb Compare March 9, 2015 18:50
# To do so we must not call `collection.num_pages`
offset = (collection.current_page - 1) * collection.limit_value
I18n.t "active_admin.pagination.multiple_without_total", model: entries_name,
from: offset + 1, to: offset + collection_size

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Align the elements of a hash literal if they span more than one line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's not your code style, but please change this and do the same on line 138

@losvedir
Copy link
Contributor Author
losvedir commented Mar 9, 2015

Okay, I changed my comments to be within 80 characters, and changed single quotes to double.

@timoschilling
Copy link
Member

note: related to #2638

The downside is we do so by hardcoding in an essentially random total pages count of 100. I chose the number 100, because I wanted to ensure ellipses would appear in Kaminari's "Page 1, 2, 3, ..., Next, Last" links, no matter how many links are configured to be displayed. However, if the resource only had 2 pages worth of items, then clicking the "3" would result in an empty page. That said, without querying the resource, there's no way to know how many pages there should be.

Sorry but I can't merge that, that's a hack and not a solution. In my eyes this is not acceptable.

There are some unexpected behaviors if you trie all variants of index pagination_total: false do and config.paginate = false.

BTW: In my test app I still see count queries:

SELECT COUNT(count_column) FROM (SELECT  1 AS count_column FROM `companies`  WHERE (user_id IS NOT NULL) LIMIT 10000 OFFSET 0) subquery_for_count

Have you read #2638? I'm not very familiar with #2638 but in my understanding it's a kaminari bug.

@losvedir
Copy link
Contributor Author

Apologies, I didn't see that other issue. I tried searching for pagination_total and it didn't come up.

There are some unexpected behaviors if you trie all variants of index pagination_total: false do and config.paginate = false.

I'm sorry, I don't understand this. Can you explain a bit? And if you disable pagination then won't that blow up activeadmin if you have millions of records (the impetus for turning pagination_total: false?

SELECT COUNT(count_column) FROM (SELECT  1 AS count_column FROM `companies`  WHERE (user_id IS NOT NULL) LIMIT 10000 OFFSET 0) subquery_for_count
111

Yeah, I get these, too, but they run instantaneously for me. The LIMIT there narrows it down to just 10,000 records (and in my configuration it's just 30). I think it's to determine how many records are being displayed on the current page, rather than the total number of records?

The query to look out for is SELECT count(*) FROM "companies". Before my patch that one query can take upwards of seconds if you have millions of records.

Sorry but I can't merge that, that's a hack and not a solution. In my eyes this is not acceptable.

Thank you for the blunt feedback. I'll see if I can find another approach.

@losvedir
Copy link
Contributor Author

Okay, I've looked into this some more, including #2638.

Contrary to what #2638 says, I'd argue that this is not a bug in Kaminari, but rather a feature request. It is a bug in activeadmin, because the documentation advertises this as a feature that removes the count (*) and has some code intending that effect, but does not actually do so. I poked around the Kaminari codebase, and unfortunately don't feel I could cleanly implement the feature in the time I'm willing to spend.

While Kaminari does not have an "infinite pages" feature, it does let you specify the total_pages as an option, so it doesn't have to do the query. That's what my PR took advantage of, although I arbitrarily set it to 100.

I looked at #2638 and @azach links to a clever approach called activeadmin_fuzzy_paginate, which I think is a more elegant solution than the way I did. If I changed my PR to use his approach would that suffice?

Let me explain the issue and the approach:

Here's a screenshot of the pagination component:

screen shot 2015-03-10 at 11 11 53 am

The activeadmin code that generates this is:

def build_pagination
  options = {}
  options[:param_name] = @param_name if @param_name
  text_node paginate collection, options
end

The pagination component can be configured via the options that are passed in.

There are two parts to the pagination component that are relevant here:

  • The links 96, 97, 98, 99, 100. 98 is the current page, and the quantity of links on each side is determined in Kaminari via Kaminari.config.window.
  • The links 9999, 1000. These are the links starting from total_pages and counting backwards. The number of links is determined by options[:right].

So the clever approach is still to send in options[:total_pages] so Kaminari doesn't calculate it (like I did in my PR). But instead of hardcoding 100 as the total number of pages, we say the total is the "current page + the window". Since by definition we can't know the total number of pages without a count(*) this basically implements "infinite pages" by always saying there are at least window pages more than whatever we're on. We also set options[:right] = 0 so we don't display the links to the last few pages.

options[:total_pages] = collection.current_page + Kaminari.config.window + 1
options[:right] = 0

To be clear there is still one small downside: Kaminari seems to generate the "Last" link no matter what. The screenshot from my proposed fix looks like this:

screen shot 2015-03-10 at 12 36 13 pm

If you click "Last", then the pagination on that page will look the same except now 14 will be the "current page", and it will count up 15, 16, 17, 18, ....

So, at this point, I see two possibilities:

  • I update my PR with the approach above.
  • We wait for someone to implement a new "infinite pages" feature in Kaminari which does basically the above, but also removes the Last link. Then we pass in options[:infinite_pages] = true or something here. Activeadmin lives with a bug until then, and we bump the required kaminari version in the gemspec when it happens. But I don't have the time or understanding of Kaminari codebase to do this, unfortunately.

I hope this new approach is sufficiently "not hacky" so that you'll accept it, but I'll understand if not and close the PR. Please let me know.

@losvedir losvedir force-pushed the 3847-fix-pagination-total-false branch from cc476fb to 453a67d Compare March 10, 2015 16:54
@losvedir
Copy link
Contributor Author

Okay, since I had already worked out the code locally for the screenshots, I just pushed up my changes to the PR. Feel free to review and merge if you approve of the approach! :)

@timoschilling
Copy link
Member

First: 👍 for one of the best documented PR's

I still think it's to hacky.

But I have a idea:
What if we change this:

options[:total_pages] = collection.current_page + Kaminari.config.window + 1

to this:

next_page = collection.page(collection.current_page + 1).per(@per_page).limit(1).any?
options[:total_pages] = collection.current_page + (next_page ? 1 : 0)

That's just produce a query like this:

SELECT COUNT(count_column) FROM (SELECT  1 AS count_column FROM `companies` LIMIT 1 OFFSET 30) subquery_for_count

@losvedir
Copy link
Contributor Author

Oh, that's very interesting. I like it! Sorry I haven't had time to respond. I'll try out this approach over the weekend and update the PR.

@timoschilling
Copy link
Member

There is a more easy version:

offset = collection.page(collection.current_page + 1).per(@per_page).limit(1).count
options[:total_pages] = collection.current_page + offset

@losvedir
Copy link
Contributor Author

I looked into these approaches this weekend and unfortunately they didn't work. I put a byebug in the code right where those code portions go, to try them out and see what gets executed in the way of SQL:

(byebug) collection.page(collection.current_page + 1).per(@per_page).limit(1).any?
   (117.3ms)  SELECT COUNT(*) FROM "messages"
true
(byebug)

and

(byebug) offset = collection.page(collection.current_page + 1).per(@per_page).limit(1).count
   (108.0ms)  SELECT COUNT(*) FROM "messages"
989547
(byebug)

I believe it's the usage of Kaminari's #page and #per. Doing it with straight #offset and #limit works:

(byebug) collection.offset( collection.current_page * @per_page + 1).limit(1).count
   (2.1ms)  SELECT COUNT(count_column) FROM (SELECT 1 AS count_column FROM "messages" LIMIT 1 OFFSET 15001) subquery_for_count
1
(byebug)

However, I don't know enough about activeadmin: are we guaranteed that the collection in PaginatedCollection responds to offset and limit? Mine did, since I'm using postgres and activerecord, but I'm not sure what other data stores are supported.

If I can use #offset and #limit, I'll rewrite the PR to use them. Let me know.

If I can't use them, then I'm out of ideas. I'll put in one more plug for the PR's current approach which is to set the total pages as "current page + 1". The only downside is that when you're on the last page, it will still have a "Next" button. In practice, I don't think it's a problem, since we're talking about collections with tens of thousands of pages here, and if a user opts in to pagination_false: true, then they shouldn't expect us to actually know the last page.

@timoschilling
8000 Copy link
Member

collection should be a ActiveRecord::Base or ActiveRecord::Relation object. It's possible but not allowed that collection returns something else. So we can assume that it has offset and limit.

@losvedir
Copy link
Contributor Author

Ah, great! I'll update the PR to use offset and limit when I have a chance.

@timoschilling
Copy link
Member

Need we more test for that?

@timoschilling
Copy link
Member

@losvedir any updates?

@losvedir losvedir force-pushed the 3847-fix-pagination-total-false branch from 453a67d to 6ab8c65 Compare April 7, 2015 12:55
offset = (collection.current_page - 1) * collection.limit_value
total = collection.total_count
I18n.t "active_admin.pagination.multiple",
model: entries_name,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Align the parameters of a method call if they span more than one line.

@losvedir
Copy link
Contributor Author
losvedir commented Apr 7, 2015

@timoschilling - Sorry, for the delay. I made the offset and limit changes as well as hash formatting (just now pushed up) but have been working on tests for the feature. I'm not familiar with cucumber myself so the tests are slow going, and I don't have a ton of time on the side to work on it. Will try to complete the tests this week.

@timoschilling
Copy link
Member

@losvedir 👍

@losvedir losvedir force-pushed the 3847-fix-pagination-total-false branch from 6ab8c65 to d3b5a93 Compare April 9, 2015 03:42
@@ -5,3 +5,11 @@
Then /^I should see pagination with (\d+) pages$/ do |count|
expect(page).to have_css '.pagination span.page', count: count
end

Then /^I should see the pagination "Next" link/ do
expect(page).to have_css 'a', text: 'Next ›'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer double-quoted strings unless you need single quotes to avoid extra backslashes for escaping.

@losvedir losvedir force-pushed the 3847-fix-pagination-total-false branch 2 times, most recently from 7973eaa to c6dd98c Compare April 9, 2015 03:50
else
offset = (collection.current_page - 1) * collection.limit_value
total = collection.total_count
I18n.t 'active_admin.pagination.multiple',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer double-quoted strings unless you need single quotes to avoid extra backslashes for escaping.

@losvedir
Copy link
Contributor Author
losvedir commented Apr 9, 2015

@timoschilling - Okay, ready for review! I've never used cucumber before; please let me know if I should be doing that a different way.

@losvedir losvedir force-pushed the 3847-fix-pagination-total-false branch from c6dd98c to 61ea1ea Compare April 9, 2015 03:54
@timoschilling
Copy link
Member

There is a syntax error on Ruby 1.9 https://travis-ci.org/activeadmin/activeadmin/jobs/57748755#L303

@losvedir
Copy link
Contributor Author
losvedir commented Apr 9, 2015

Ah, bummer. I don't have Ruby 1.9 on this computer and it's slow to build for me. Will have to take a look tonight.

The pagination_total option is supposed to prevent expensive SELECT
count(*) queries from being performed when viewing a resource's index
page.

However, while this option was hiding the display of the total number of
pages, it did not actually prevent the query from being performed.

There were two reasons:

In `#build_pagination`, the line:

`text_node paginate collection, options`

would trigger Kaminari to call `#total_pages` on the collection, which would
trigger the query:

```
    def paginate(scope, options = {}, &block)
      options[:total_pages] ||= options[:num_pages] || scope.total_pages
```

By passing in the `:total_pages` option, we short circuit this assignment
and avoid the query.

That fix was not sufficient. In `#page_entries_info`, the call
`collection.num_pages` also triggered a SELECT count(*). I re-arranged
the conditionals to avoid that call if `@display_total` is false, while
trying to keep the behavior the same.
@losvedir losvedir force-pushed the 3847-fix-pagination-total-false branch from 61ea1ea to 10b9c8a Compare April 10, 2015 00:52
@losvedir
Copy link
Contributor Author

Ah, ruby 1.9.3 didn't like the unicode literal in my string! Fixed.

timoschilling added a commit that referenced this pull request Apr 11, 2015
Fix pagination_total index option to prevent SELECT count(*) queries
close #2638
refs #2333, #2283
@timoschilling timoschilling merged commit 3f9efd1 into activeadmin:master Apr 11, 2015
@sillylogger
Copy link

I think this collection.offset(collection.current_page * @per_page.to_i).limit(1).count is great in that it's faster than actually counting the items in the collection, but what about leveraging the already queried collection_size from ActiveAdmin::Helpers::Collection?

As it stands there are 3 queries actually run against the table:

SELECT COUNT(count_column) FROM (SELECT  1 AS count_column FROM "users" LIMIT 50 OFFSET 0) subquery_for_count
SELECT COUNT(count_column) FROM (SELECT  1 AS count_column FROM "users" LIMIT 1 OFFSET 50) subquery_for_count
SELECT  "users".* FROM "users"  ORDER BY "users"."id" desc LIMIT 50 OFFSET 0
  1. to count the number of items on the current page
  2. to peek if there is 1 item on the next page
  3. to load the items on the current page

I see two separate potential optimizations here:

  • What if we assume there is a next page if the current page as a complete collection? There is a loss in information here no doubt... but if the goal is performance, this would eliminate a potentially expensive query.
  • What about actually loading the items on the first collection_size call in ActiveAdmin::Helpers::Collection? Subsequent calls to iterate over the items will hit the cache.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0