Because this is focused on how the data structure works it doesn't mention lots of nice API design choices in Rust.
The one I particularly want to call out because it came up this morning is providing both Vec::reserve and Vec::reserve_exact
Vec::reserve lets us hint about our upcoming capacity expectations without damaging the O(1) amortized growth which is the whole point of this collection type, but it can waste some memory to achieve this. This is probably what you want in most code.
Vec::reserve_exact is a more narrow idea - we can hint about the ultimate capacity needed, if we're wrong and later need more capacity this has a significant performance cost because we thew away the amortized growth promise to get this, but we don't waste memory.
C++ only provides the equivalent of Vec::reserve_exact which makes this a footgun but if you use this call when you really needed the other one you trash your perf, as a result of which people teaching C++ tend to just say don't use reservation to uh, reserve capacity, but now you're leaving perf on the table so that's not great.
> Vec::reserve_exact is a more narrow idea - we can hint about the ultimate capacity needed, if we're wrong and later need more capacity this has a significant performance cost because we thew away the amortized growth promise to get this, but we don't waste memory.
The claim that using reserve_exact "throws away the amortized growth promise" is wrong. You don't disable amortized growth, you just won't get extra headroom from that call. If you guessed too small, you may pay an extra reallocation later. Rust's Vec still provides amortized O(1) push overall.
> C++ only provides the equivalent of Vec::reserve_exact which makes this a footgun but if you use this call when you really needed the other one you trash your perf, as a result of which people teaching C++ tend to just say don't use reservation to uh, reserve capacity, but now you're leaving perf on the table so that's not great.
It's not a reserve_exact only footgun. The standard says reserve(n) makes capacity at least n, implementations may allocate exactly n ( and many do), but they're allowed to overallocate and future growth remains amortized. Used properly (when you know or can bound the size), reserve is a common and recommended optimization.
> The claim that using reserve_exact "throws away the amortized growth promise" is wrong
Well, in some sense this depends on what exactly you think you're amortizing over. If you `Vec::reserve` with size N, fill to N, and then append a single element you get the usual amortized O(1) growth of an append (or at least you can, the docs for `Vec::reserve` say it may reserve additional space, not that it must). But if you `Vec::reserve_exact` with size N, fill to N, and then append a single element you are guaranteeing that that first append triggers a potentially O(N) resize.
From that point on in either case you would get the usual amortized growth of course.
> if you `Vec::reserve_exact` with size N, fill to N, and then append a single element you are guaranteeing that that first append triggers a potentially O(N) resize.
The documentation does not guarantee this, because memory allocators can't typically allocate arbitrarily-sized blocks of memory, instead rounding to the nearest supported size. For example, for small allocations glibc malloc allocates in multiples of 8 bytes, with a minimum size of 24 bytes. So if you make a 35-byte allocation, there will be 5 bytes of wasted space at the end which you could theoretically use to store more elements without reallocating if your collection grows.
If you're using the system allocator, Rust can't take advantage of this, because the C malloc/free APIs don't provide any (portable) way for the allocator to inform the application about this excess capacity. But Rust's (currently unstable) API for custom allocators does expose this information, and the documentation is written to allow Vec to take advantage of this information if available. If you reserve_exact space for 35 u8's, and your allocator rounds to 40 bytes (and informs the Vec of this the allocator API), then the vector is allowed to set its capacity to 40, meaning the next append would not trigger a resize.
On current stable Rust, this is all just theoretical and Vec works as you describe -- but the documentation specifically does not promise this because the situation is expected to change in the future.
Sure, but this isn't a job for reserve_exact. Vec::reserve is like a framing hammer made for speed and momentum. You might dent a little extra wood (overallocate), but you drive lots of nails fast and keep amortized O(1) growth.
reserve_exact is a jeweler's hammer so great when you know the exact setting and won't touch it again, precise fit, zer slack etc.
Now, try to frame a house with jeweler's hammer, tapping in tiny increments and resizing every few boards and you will crawl into quadratic time.
Who is using jeweler's hammer to frame their house?
So what I'm arguing is if you don't misuse reserve_exact, then as you said you still get amortized growth. The OP's example misuses the tool and then blames it for not behaving differently on that first append.
> The claim that using reserve_exact "throws away the amortized growth promise" is wrong
The key part is that multiple calls reserve_exact will cause repeated allocations. The classic example is if someone defines a `pushfive` function that uses reserve_exact to increase the size by 5, and then pushes 5 times. Calling this function in a loop will take quadratic time since each reserve_exact call increases the array size. With reserve, the runtime is linear as expected.
Quadratic pushfive is just a cautionary tale about misusing reserve_exact. Basically use reserve_exact when you know the final size (reserve once), or you're doing a one off tight sizing where memory footprint matters.
Don't pre reserve inside pushfive, just push the 5 elements (Vec handles growth)
or if you know how many times you'll call pushfive in a loop, reserve up front once vec.reserve(5 * times) or use reserve instead of reserve_exact for incremental growth.
that's exactly the footgun. reserve_exact usage needs to be analyzed globally, while `resere` has the extra fuzziness needed to ensure that you can't mess anything up too badly with it.
Yes, reserve is the safer default because it gives you slack and preserves amortized growth even if your usage pattern is messy.
But "needs global analysis" isn't unique to reserve_exact. Lots of perf knobs do (chunk sizes, buffering, locking granularity etc). The fix to this is to use the tool where its preconditions actually hold, not to avoid the tool.
So what I'm basically saying is that reserve_exact isn't inherently dangerous, it just assumes you know the final size and won't exceed it etc. If you keep bumping it in tiny steps (pushfive style), that's just misuse so treating it as a flaw is unwarranted.
Dangerous is perhaps overstating it, but the only utility reserve(_exact) has is performance and predictability. If using the API can be worse than doing nothing, I think it warrants being referred to as a footgun.
Just today I saw a (2 year old) video on that very topic in Rust and C++: https://www.youtube.com/watch?v=algDLvbl1YY
Towards the end they mention what to use instead in C++ to get the same characteristics as Rust's Vec::reserve:
This is roughly analogous to Vec's specialised Extend implementation. Like that feature it only helps if you've got a similar API shape you're adapting. If we need to make newElems then we're probably not getting a perf win from this.
Reserve(reserve_exact in rust lingo) as a concept doesn't throw away amortized growth promise, so this entire post makes no sense.
You would need a very shitty implementation to achieve anything close to what you are describing(perhaps only allocating in powers of 2), and even then it would recover fairly quickly.
Suppose I think I may need 128 entries at some point, but the vector is allocated with room for 16 entries by default. I may not want to allocate and then immediately allocate again. But if I get to a 17th entry then I’m already causing allocation. So I might as well allocate 128 at that time so there are no more allocations at all.
I believe that you are describing `Vec::with_capacity` which allows to change the initial reserved memory on construction.
`reserve` and `reserve_exact` are used when mutating an existing vec. What you provide is not the total wanted capacity but the additional wanted capacity.
`reserve` allows to avoid intermediate allocation.
Let's say that you have a vec with 50 items already and plan to run a loop to add 100 more (so 150 in total).
The initial internal capacity is most likely 64, if you just do regular `push` calls without anything else, there will be two reallocations: one from 64 to 128 and one from 128 to 256.
If you call `reserve(100)`, you'll be able to skip the intermediate 64 to 128 reallocation: it will do a single reallocation from 64 to 256 and it will be able to handle the 100 pushes without any reallocation.
If you call `reserve_exact(100)`, you'll get a single reallocation for from 64 to 150 capacity, and also guarantee no reallocation during the processing loop.
The difference is that `reserve_exact` is better if these 100 items were the last ones you intended to push as you get a full vec of capacity 150 and containing 150 items. However, if you intend to push more items later, maybe 100 more, then you'd need to reallocate and break the amortized cost guarantees. With `reserve`, you don't break the amortized cost if there are follow-up inserts; at the price of not being at 100% usage all the time. In the `reserve` case, the capacity of 256 would be enough and let you go from 150 to 250 items without any reallocation.
In short, a rule of thumb could be:
- If creating a vec and you know the total count, prefer `Vec::with_capacity`
- If appending a final chunk of items and then no longer adding items, prefer `Vec::reserve_exact`
- If appending a chunk of items which may not be final, prefer `Vec::reserve`
That's a nice idea, thank you. I have personal blog, I'll try to clean it up a bit and provide performance measurements so it's worth posting.
Regarding the official documentation, I've returned to read them. I agree that the docs would benefit from more discussion about when to use each method. In particular, the code examples are currently exactly the same which is not great. Still, the most critical piece of information is there [0]
> Prefer `reserve` if future insertions are expected.
If anyone wants to reuse my explanation above, feel free to do it; no need to credit.
All the functions mentioned above, even the cpp one, will reserve atleast the number of elements given to resize() or resize_exact(), but may reserve more than that.
After some pondering, and reading the rust documentation, I came to the conclusion that te difference is this:
reserve() will grow the underlaying memory area to the next increment, or more than one increment, while
reserve_exact() will only grow the underlaying memory area to the next increment, but no more than that.
Eg, if grow strategy is powers of two, and we are at pow(2), then reserve() may skip from pow(2) to pow(4), but reserve_exact() would be constrained to pow(3) as the next increment.
Or so i read the documentation. Hopefully someone can confirm?
> even the cpp one, will reserve atleast the number of elements given
The C++ one, however, will not reserve more than you ask for (in the case that you reserve greater than the current capacity). It's an exact reservation in the rust sense.
> reserve() will grow the underlaying memory area to the next increment, or more than one increment, while reserve_exact() will only grow the underlaying memory area to the next increment, but no more than that
No, not quite. Reserve will request as many increments as it needs, and reserve_exact will request the exact total capacity it needs.
Where the docs get confusing, is that the allocator also has a say here. In either case, if you ask for 21 items, and the allocator decides it prefers to give you a full page of memory that can contain, say, 32 items... then the Vec will use all the capacity returned by the allocator.
As far as I can tell, in the current implementation, reserve_exact is indeed exact. The only situation in which the capacity after calling reserve_exact will not equal length + additional is when it was already greater than that. Even if the allocator gives more than the requested amount of memory, the excess is ignored for the purposes of Vec's capacity: https://github.com/rust-lang/rust/blob/4b57d8154a6a74d2514cd...
Of course, this can change in the future; in particular, the entire allocator API is still unstable and likely won't stabilize any time soon.
Maybe more interestingly, line 659, slightly above that, explains that we know we got [u8] but today the ordinary Rust allocator promises capacity is correct, so we just ignore the length of that slice.
We could, as that comment suggests, check the slice and see if there's enough room for more than our chosen capacity. We could also debug_assert that it's not less room, 'cos the Allocator promised it would be big enough. I dunno if that's worthwhile.
> Increase the capacity of the vector (the total number of elements that the vector can hold without requiring reallocation) to a value that's greater or equal to new_cap.
I belive that the behaviour of reserve() is implementation defined.
Because there's only a single function here, it has to either be Vec::reserve or Vec::reserve_exact
If you don't offer Vec::reserve_exact then people who needed that run out of RAM and will dub your stdlib garbage. If you don't offer Vec::reserve as we've seen C++ programmers will say "Skill issue" whenever a noob gets awful performance as a result. So, it's an easy choice.
> In either case, if you ask for 21 items, and the allocator decides it prefers to give you a full page of memory that can contain, say, 32 items... then the Vec will use all the capacity returned by the allocator.
It would be nice if this were true but AFAIK the memory allocator interface is busted - Rust inherits the malloc-style from C/C++ which doesn’t permit the allocator to tell the application “you asked for 128 bytes but I gave you an allocation for 256”. The alloc method just returns a naked u8 pointer.
The global allocator GlobalAlloc::alloc method does indeed return a naked pointer
But the (not yet stable) Allocator::allocate returns Result<NonNull<[u8]>, AllocError> --- that is, either a slice of bytes OR a failure.
Vec actually relies on Allocator not GlobalAlloc (it's part of the standard library so it's allowed to use unstable features)
So that interface is allowed to say you asked for 128 bytes but here's 256. Or, more likely, you asked for 940 bytes, but here's 1024. So if you were trying to make a Vec<TwentyByteThing> and Vec::with_capacity(47) it would be practical to adjust this so that when the allocator has 1024 bytes available but not 940 we get back a Vec with capacity 51 not 47.
You misread the documentation. Reserve-exact is precisely that - the growth strategy is ignored and you are ensured that at least that many more elements can be inserted without a reallocation. Eg reserve_exact(100) on an empty Vec allocates space for 100 elements.
By contrast reserve will allocate space for the extra elements following the growth strategy. If you reserve(100) on an empty Vec the allocation will be able to actually insert 128 (assuming the growth strategy is pow(n))
Vec::reserve(100) on an empty Vec will give you capacity 100, not 128 even though our amortization is indeed doubling.
The rules go roughly like this, suppose length is L, present capacity is C, reserve(N):
1. L + N < C ? Enough capacity already, we're done, return
2. L + N <= C * 2 ? Ordinary doubling, grow to capacity C * 2
3. Otherwise, try to grow to L + N
This means we can grow any amount more quickly than the amortized growth strategy or at the same speed - but never less quickly. We can go 100, 250, 600, 1300 and we can go 100, 200, 400, 800, 1600 - but we can''t do 100, 150, 200, 250, 300, 350, 400, 450, 500...
reserve() reallocates by at least doubling the capacity.
reserve_exact() reallocates by exactly what you ask for.
If you reserve() space for 1 more element a 1000 times, you will get ~30 reallocations, not 1000.
This inexact nature is useful when the total size is unknown, but you append in batches. You could implement your own amortised growth strategy, but having one built-in makes it simple for different functions to cooperate.
In this case, the implementation is not free to ignore calls to reserve or reserve_exact -- the documentation makes explicit guarantees about this: "After calling reserve, capacity will be greater than or equal to self.len() + additional".
This matters for performance reasons, and also because unsafe code is allowed to use the extra capacity of the vector -- e.g. you can write to the unused space and then call set_len to insert elements in-place. If `reserve` did not guarantee the requested capacity was allocated, this would cause buffer overflows.
The documentation for these functions is very carefully worded to explain exactly what each method does and does not guarantee. Both functions have the exact same hard guarantees: after calling the function, the vector will have at least enough capacity for the additional elements.
The difference is in what the methods try to do without guaranteeing exact behavior: reserve follows the usual amortized growth pattern, while reserve_exact tries to avoid overallocating. These are described as best-effort performance optimizations rather than guarantees you can rely on because 1) the amortized growth strategy is subject to change between platforms and compiler versions, and 2) memory allocators typically don't support arbitrarily-sized blocks of memory, and instead round up to the nearest supported size.
> Reserves capacity for at least additional more elements to be inserted in the given Vec<T>. The collection may reserve more space to speculatively avoid frequent reallocations.
reserve_exact()
> Reserves the minimum capacity for at least additional more elements to be inserted in the given Vec<T>. Unlike reserve, this will not deliberately over-allocate to speculatively avoid frequent allocations... Prefer reserve if future insertions are expected.
So if you know you'll need n now but will probably need more later, use reserve(). If you known is all you'll need, use reserve_exact().
Vec<T> is a contiguous growable array, so the optimization game is to minimise allocations, reallocations (to keep it contiguous), and unused allocated memory.
I could look this up, but I’m enjoying reading this conversation.
Do reserve and reserve_exact increment the capacity, or ensure there’s still at least that much capacity remaining?
If the former, if I reserve(1) 10 times in a row, does that mean it could be rounding up to a thousand elements (say, because of the page table size) each time?
At least that much remaining. For both Vec::reserve and Vec::reserve_exact the parameter is an unsigned integer representing how many more items you expect to need space for.
So reserve(1) 10 times in a row will just repeatedly ensure there's enough space for at least 1 more item, which after the first time there certainly is†
There's an excellent chance the capacity check in particular got inlined, so the optimized machine code will not "actually" call a function ten times, it'll call it once and then maybe emit a single capacity check inline, the subsequent checks even a crap 1980s optimiser will go "We do not need to check that again" and skip it.
† If the Vec is full and can't grow, which really can happen for Zero Size Types in particular, then this panics, and what happens next is a compile time choice, in debug likely it just tells you what went wrong and suggests how to make a backtrace. In the ordinary case that we instead got to keep running then it succeeded.
That makes sense, and it's how I'd hope it'd work. I can imagine all sorts of cases where I'm getting data from an external source, like an API request returning some mudball of data monstrosity, where the easiest path doesn't give all the information at once.
Nice. I truly do appreciate the developer ergonomics that went into Rust. Its APIs are pleasantly well thought out.
The one I particularly want to call out because it came up this morning is providing both Vec::reserve and Vec::reserve_exact
Vec::reserve lets us hint about our upcoming capacity expectations without damaging the O(1) amortized growth which is the whole point of this collection type, but it can waste some memory to achieve this. This is probably what you want in most code.
Vec::reserve_exact is a more narrow idea - we can hint about the ultimate capacity needed, if we're wrong and later need more capacity this has a significant performance cost because we thew away the amortized growth promise to get this, but we don't waste memory.
C++ only provides the equivalent of Vec::reserve_exact which makes this a footgun but if you use this call when you really needed the other one you trash your perf, as a result of which people teaching C++ tend to just say don't use reservation to uh, reserve capacity, but now you're leaving perf on the table so that's not great.