8000 Speed up `bsearch_range_table`. by nnethercote · Pull Request #16 · unicode-rs/unicode-xid · GitHub
[go: up one dir, main page]

Skip to content

Speed up bsearch_range_table. #16

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 7, 2020
Merged

Speed up bsearch_range_table. #16

merged 1 commit into from
Feb 7, 2020

Conversation

nnethercote
Copy link
Contributor

The comparison function used for the binary search is sub-optimal,
because it looks for an Equal result first, even though that is the
least common result (happening 0 or 1 times per search).

This commit changes the comparison function to look for Equals last.
As a result, the number of tests executed within the comparison function
is reduced:

  • On a Greater result: 2 tests --> 1 test
  • On a Less result: 3 tests --> 2 tests
  • On an Equals result (next most common): 2 tests --> 2 tests

The comparison function used for the binary search is sub-optimal,
because it looks for an `Equal` result first, even though that is the
least common result (happening 0 or 1 times per search).

This commit changes the comparison function to look for `Equals` last.
As a result, the number of tests executed within the comparison function
is reduced:
- On a `Greater` result: 2 tests --> 1 test
- On a `Less` result: 3 tests --> 2 tests
- On an `Equals` result (next most common): 2 tests --> 2 tests
@Manishearth
Copy link
Member

Should we just have an ascii fast path with an array of 255 values?

Also, we should probably make this change for all unicode-rs crates.

@Manishearth Manishearth merged commit bb14270 into unicode-rs:master Feb 7, 2020
@nnethercote
Copy link
Contributor Author

Note that bsearch_range_table showed up in one profile for rustc, because proc-macro2-0.4.3 has this function which is reasonably hot:

fn xid_ok(string: &str) -> bool {
    let mut chars = string.chars(); 
    let first = chars.next().unwrap();
    if !(UnicodeXID::is_xid_start(first) || first == '_') {
        return false;
    }
    for ch in chars {
        if !UnicodeXID::is_xid_continue(ch) {
            return false;
        }
    }
    true
}

proc-macro2 is now up to version 1.0.8 and the current code has ASCII checks before calling out to unicode-xid functions, like so:
https://github.com/alexcrichton/proc-macro2/blob/8ce7c670a55fa2c8a312d9c107143b3bdb6e93ec/src/fallback.rs#L578-L591

Still, it's an easy change and a clear win, even if it doesn't end up helping rustc directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0