10000 perf: Use `for_each` in `Vec::extend` by Marwes · Pull Request #68046 · rust-lang/rust · GitHub
[go: up one dir, main page]

Skip to content

perf: Use for_each in Vec::extend #68046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
perf: Use for_each in Vec::extend
`for_each` are specialized for iterators such as `chain` allowing for
faster iteration than a normal `for/while` loop.

Note that since this only checks `size_hint` once at the start it may
end up needing to call `reserve` more in the case that `size_hint`
returns a larger and more accurate lower bound during iteration.

This could maybe be alleviated with an implementation closure like the current
one but the extra complexity will likely end up harming the normal case
of an accurate or 0 (think `filter`) lower bound.

```rust
while let Some(element) = iterator.next() {
    let (lower, _) = iterator.size_hint();
    self.reserve(lower.saturating_add(1));
    unsafe {
        let len = self.len();
        ptr::write(self.get_unchecked_mut(len), element);
        // NB can't overflow since we would have had to alloc the address space
        self.set_len(len + 1);
    }

    iterator.by_ref().take(self.capacity()).for_each(|element| {
        unsafe {
            let len = self.len();
            ptr::write(self.get_unchecked_mut(len), element);
            // NB can't overflow since we would have had to alloc the address space
            self.set_len(len + 1);
        }
    });
}

// OR

let (lower, _) = iterator.size_hint();
self.reserve(lower);
loop {
    let result = iterator.by_ref().try_for_each(|element| {
        if self.len() == self.capacity() {
            return Err(element);
        }
        unsafe {
            let len = self.len();
            ptr::write(self.get_unchecked_mut(len), element);
            // NB can't overflow since we would have had to alloc the address space
            self.set_len(len + 1);
        }
        Ok(())
    });

    match result {
        Ok(()) => break,
        Err(element) => {
            let (lower, _) = iterator.size_hint();
            self.reserve(lower.saturating_add(1));
            self.push(element);
        }
    }
}
```

Closes #63340
  • Loading branch information
Marwes authored and Markus Westerlind committed Feb 6, 2020
commit ddabd7643efb5cc7dc82d3024e6294bc98b9caf2
18 changes: 5 additions & 13 deletions src/liballoc/vec.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2122,26 +2122,18 @@ where
}

impl<T> Vec<T> {
fn extend_desugared<I: Iterator<Item = T>>(&mut self, mut iterator: I) {
fn extend_desugared<I: Iterator<Item = T>>(&mut self, iterator: I) {
// This is the case for a general iterator.
//
// This function should be the moral equivalent of:
//
// for item in iterator {
// self.push(item);
// }
while let Some(element) = iterator.next() {
let len = self.len();
if len == self.capacity() {
let (lower, _) = iterator.size_hint();
self.reserve(lower.saturating_add(1));
}
unsafe {
ptr::write(self.get_unchecked_mut(len), element);
// NB can't overflow since we would have had to alloc the address space
self.set_len(len + 1);
}
}
let (lower, _) = iterator.size_hint();
self.reserve(lower);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still has the issue of trying to reserve before advancing the iterator. taking an element and then checking the hint and capacity is strictly better because the hint will be more accurate and if you take a None you can skip the hint calculation and reserve attempt.

Copy link
Contributor Author
@Marwes Marwes Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit is that it avoids calling next at all. While the hint should be more accurate by calling next it would also be even more accurate if we called next twice so I don't quite buy that argument. Calling next once would give a fast path for the empty iterator at least which I find a more convincing argument, however it is only a benefit if size_hint(); reserve(0) is slow enough to warrant this fast path.

Pushed another variant which does next first which should optimize better for the empty iterator. Haven't got time to benchmark it yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more concerned about doing an suboptimal resize if we got an incorrect lower bound on the first attempt, but you're right, if it's just 0 it hopefully would be cheap enough and the next reserve will do it properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have gone a bit back and forth but I think the of reserving eagerly has a better tradeoff.

On size_hint == 0 it does call reserve unnecessarily but it is only one more branch in an already fast case. If size_hint > 0 then most of the time we will reserve either way and most of the time with the same value (size_hint_minus_removed_value + 1 vs size_hint).

The only case where I'd expect it to be faster to call next first is if size_hint is expensive, despite the iterator being empty and next being faster to call than size_hint which seems unlikely (happy to be corrected however!).


iterator.for_each(|element| self.push(element));
}

/// Creates a splicing iterator that replaces the specified range in the vector
Expand Down
0