There is no standard way to iterate over a sequence of values in Go. For lack of any convention, we have ended up with a wide variety of approaches. Each implementation has done what made the most sense in that context, but decisions made in isolation have resulted in confusion for users.
When you want to iterate over something, you first have to learn how the specific code you are calling handles iteration. This lack of uniformity hinders Go's goal of making it easy to move around in a large code base. People often mention as a strength that all Go code looks about the same. That's simply not true for code with custom iteration.
We should converge on a standard way to handle iteration in Go, and one way to incentivize that is to support it directly in range syntax. Specifically, the idea is to allow range over function values of certain types. If any kind of code providing iteration implements such a function, then users can write the same kind of range loop they use for slices and maps and stop worrying about whether they are using a bespoke iteration API correctly.
This GitHub Discussion is about this idea of allowing range over function values. This is obviously related to the iterator discussion (#54245), but one aim of this discussion is to separate out just the idea of a language change for customized range behavior, which should probably be done independently of an iterator library. A library for iterators can then be built using and augmenting the range change, not being the cause of it.
To date, range's behavior has depended only on the type of its argument, not methods the argument has, nor any other details of the argument. Range currently handles slice, (pointer to) array, map, chan, and string arguments. We can extend range to support user-defined behavior by adding certain forms of func arguments.
There are two natural kinds of func arguments we might want to support in range: push functions and pull functions (definitions below). These kinds of funcs are duals of each other, and while push functions are more suited to range loops, both are useful in different contexts.
This post suggests that for loops allow range over both push functions and pull functions. The end of the post also suggests range over int.
The rest of this post explains all this in more detail.
Push functions
A push function is a function with a type of one of these forms:
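func(yield func(...) bool)
func(yield func(...) bool) bool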
That is, a push function takes a single argument, here named yield, although that exact name is not a requirement. The yield argument is itself a function taking N arguments (0 ≤ N ≤ 2) (denoted by ... in the pseudo-syntax above) and returning a single bool. The push function itself must return nothing at all or else a single bool. The optional bool allows the push function to indicate whether it stopped early, which can be useful when composing push functions; when called using range syntax, the compiled code would ignore the result.
The push function enumerates a sequence of values by calling yield repeatedly. The bool result from yield indicates whether to keep going (true means continue, false means stop). Each call to yield runs the range loop body once and then returns. When there are no more values to pass to yield, or if yield returns false, the push function returns.
In short, a push function pushes a sequence of values into the yield function.
For example, here is a method to traverse a binary tree:
func (t *Tree[K, V]) All(f func(key K, val V) bool) bool {
    if t == nil {
        return true
    }
    return t.left.All(f) && f(t.key, t.value) && t.right.All(f)
}
The method value t.All is a push function: it has signature func(func(K, V) bool) bool.
(In this usage, the caller doesn’t care about the boolean result from t.All, only the fact that it calls f on every key-value pair.)
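With that method, one can write today:

t.All(func(k K, v V) bool {
    fmt.Println(k, v)
    return true
})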
Adding support for push functions to range would allow writing this equivalent code:
for k, v := range t.All {
    fmt.Println(k, v)
}
In fact, the Go compiler would effectively rewrite the second form into the first form, turning the loop body into a synthesized function to pass to t.All. However, that rewrite would also preserve the “on the page” semantics of code inside the loop like break, continue, defer, goto, and return, so that all those constructs would execute the same as in a range over a slice or map.
If you are worried about the subtle variable scoping difference, consider the change discussed in #56010 a prerequisite of adding func support to range.
Note that the results of the push function (if any) are discarded when using the range form. Most often a push function will return nothing at all, or else a bool indicating whether the loop stopped early, as the All method does to make recursion easier.
A method x.All(f), which may become a common pattern, has two different, equally valid interpretations. One is that f is a yield function and All passes all the tree's contents to f. The other is that f is a condition function and All reports whether the condition is true for all the contents of the tree, stopping the traversal once it determines the result.
Pull functions
A pull function is a function with a type of the form
func() (values, bool)
That is, a pull function takes no arguments and returns the next set of N values (0 ≤ N ≤ 2) from the sequence. Each valid set of values comes with a final true bool result. When there are no more values, the pull function returns arbitrary values and a false bool.
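Concretely, with N ranging over 0, 1, and 2, those forms are:

func() bool
func() (V, bool)
func() (K, V, bool)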
A pull function must maintain internal state, so that repeated calls return successive values.
In short, a pull function lets the caller pull successive elements from the sequence, one at a time.
For example, here is a method that returns a pull function to traverse a linked list:
func (l *List[V]) Iter() func() (V, bool) {
    cur := l
    return func() (v V, ok bool) {
        if cur == nil {
            return v, false
        }
        v, ok = cur.value, true
        cur = cur.next
        return
    }
}
The method value l.Iter is not a pull function, but it returns one.
With that method, one can write today:
next := l.Iter()
for v, ok := next(); ok; v, ok = next() {
    fmt.Println(v)
}
Adding support for pull functions to range would allow writing this equivalent code:
for v := range l.Iter() {
    fmt.Println(v)
}
In fact, the Go compiler would effectively rewrite the second form into the first form. Again, consider the scope change in #56010 a prerequisite.
If some iterator-like value had a Next method that returned (value, bool), we could write:
for v := range it.Next {
    ...
}
Note that range over pull functions has been proposed by itself as #43557, and the discussion also considered push functions (for example, #43557 (comment)). Both can be appropriate at different times.
Duality of push and pull functions
Any push function can be converted into a pull function and vice versa.
Converting a pull function into a push function is a few lines of code:
func push(next func() (V, bool)) func(func(V) bool) {
    return func(yield func(V) bool) {
        for {
            v, ok := next()
            if !ok || !yield(v) {
                break
            }
        }
    }
}
Converting a push function into a pull function is more involved. Because the push function has its own state maintained in its stack (like in the binary tree traversal), that code must run in a separate goroutine in order to give it a stack that persists across calls to the next function. The full code is in this playground snippet.
It can be arranged that the separate goroutine executes with its own stack but not actually running in parallel with the caller. With a bit of smarts in the compiler and runtime, but no changes to the Go language or any of its semantics, that lack of parallelism allows the separate goroutine to be optimized into a coroutine, so that switches between the caller and the push function are fairly cheap. The details of the optimization are beyond the scope of this discussion but are posted in the “Appendix” of #54245.
The signature for converting a push function to a pull function is
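func pull(push func(yield func(V) bool)) (next func() (V, bool), stop func())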
The conversion must return two functions: the pull function next and a cleanup function stop, which shuts down the goroutine.
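As a rough, non-optimized sketch (not the playground code and not the coroutine optimization), the conversion could use an ordinary goroutine and channels:

// pull converts a push function into a pull function plus a stop function.
// A minimal sketch: the real conversion is more careful about shutdown,
// repeated stop calls, and panics.
func pull[V any](push func(yield func(V) bool)) (next func() (V, bool), stop func()) {
    ch := make(chan V)
    done := make(chan struct{})
    go func() {
        defer close(ch)
        push(func(v V) bool {
            select {
            case ch <- v:
                return true // keep pushing values
            case <-done:
                return false // stop was called; tell push to stop
            }
        })
    }()
    next = func() (V, bool) {
        v, ok := <-ch
        return v, ok
    }
    stop = func() { close(done) }
    return next, stop
}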
Although push and pull functions are duals, they have important differences. Push functions are easier to write and somewhat more powerful to invoke, because they can store state on the stack and can automatically clean up when the traversal is over. That cleanup is made explicit by the stop callback when converting to the pull form.
For example, the binary tree traversal above was made very easy by being able to use recursion in its implementation. A direct implementation of a pull form would need to maintain its own explicit stack instead, like:
func (t *Tree[K, V]) Iter() func() (K, V, bool) {
    var stk []*Tree[K, V]
    for ; t != nil; t = t.left {
        stk = append(stk, t)
    }
    next := func() (k K, v V, ok bool) {
        if len(stk) == 0 {
            return k, v, false
        }
        t := stk[len(stk)-1]
        stk = stk[:len(stk)-1]
        for r := t.right; r != nil; r = r.left {
            stk = append(stk, r)
        }
        return t.key, t.value, true
    }
    return next
}
That implementation is much harder to reason about and probably contains a bug.
As another example of the power of push functions and automatic cleanup, consider this function that allows ranging over the lines from a file:
for line, err := range Lines("motd.txt") {
    if err != nil {
        log.Fatal(err)
    }
    fmt.Print(strings.ToUpper(line))
}
Note that the implementation of Lines can use defer to clean up automatically when the loop is done. An implementation using a pull function would need a separate stop function to close the file.
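For illustration, a minimal sketch of such a Lines push function (assuming it yields (line, error) pairs, and using bufio and os) might look like:

// Lines returns a push function that yields each line of the named file,
// followed by a final yield of any error. A minimal sketch.
func Lines(name string) func(yield func(line string, err error) bool) {
    return func(yield func(line string, err error) bool) {
        f, err := os.Open(name)
        if err != nil {
            yield("", err)
            return
        }
        defer f.Close() // runs when the traversal ends, even if the loop breaks or returns

        scanner := bufio.NewScanner(f)
        for scanner.Scan() {
            if !yield(scanner.Text()+"\n", nil) {
                return
            }
        }
        if err := scanner.Err(); err != nil {
            yield("", err)
        }
    }
}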
A push function usually represents an entire sequence of values, so that it can be called multiple times to traverse the sequence multiple times. It can usually also be called simultaneously from different goroutines if they both want to traverse the sequence, without any synchronization. In contrast, a pull function always represents a specific point in one traversal of the sequence. It can be advanced to the end of the sequence, but then it can't be reused. Goroutines cannot share a pull function without synchronization, but a pull function can be used from multiple call sites in a single goroutine, such as a lexer pulling bytes from an input source.
In terms of concepts in other languages, a push function can be thought of as representing an entire collection. The implementation of the push function maintains iterator state implicitly on its stack, so that multiple uses of the push function use separate instances of the iterator state. In contrast, a pull function can be thought of as representing an iterator, not an entire collection.
Push and pull functions represent different ways of interacting with data, and one way may be more appropriate than the other depending on the data. For example, many programs process the lines in a file in a single loop, so a push function is appropriate for lines in a file. In contrast, it is difficult to imagine any programs that would process the bytes in a file with a single loop (except maybe wc), while many process bytes in a file incrementally from many call sites (again, lexers are an example), so a pull function is more appropriate for bytes in a file.
Because both forms are appropriate in different contexts, range loops should support functions of both types. Note that there is no overlap between the two function kinds: push functions always have one argument, while pull functions always have no arguments.
Alternatives
An alternative would be to extend range by recognizing special methods. For example if range knew to call a .Range method, then we could define (*Tree).Range and then use
for k, v := range t {
    ...
}
instead of
for k, v := range t.All {
    ...
}
One aesthetic reason not to do this is that range today uses types to make the decision, and it seems cleaner to continue to do that. In fact, there is nothing in the language today that calls specially defined methods. (The closest to that is the definition of the error interface, but no language construct calls the Error method.) Aesthetic reasons aside, though, there are two practical problems with a method-based decision.
The first problem with a method-based decision is that only a single method can implement the behavior. Using functions, other methods can be called instead simply by naming them. For example we might define t.AllReverse that enumerates the tree in reverse order, and then a loop can use
for k, v := range t.AllReverse {
    ...
}
Similarly, an iterator that defines Next might also define Prev, allowing
for v := range it.Prev {
    ...
}
The second problem with a method-based decision is that it can conflict with the type-based decision. For example if the loop calls the Range method, what happens in a range over a channel value that also has a Range method? Is it treated like other channels, ignoring the Range method? It would seem that must be the case, for backwards compatibility. But then it's confusing that the Range method doesn't win.
Continuing the type-based decision instead of introducing a new method-based decision rule avoids these problems.
Range over ints
One common problem for developers not coming from the C family of languages is puzzling through the Go idiom
for i := 0; i < n; i++ { ... }
When you stop to explain it, that’s a lot of machinery to say “count to n”.
One common use case that people have mentioned for user-defined range behaviors is to have a standard function to simplify that pattern, like:
func count(n int) func(yield func(i int) bool) {
    return func(yield func(i int) bool) {
        i := 0
        for i < n && yield(i) {
            i++
        }
    }
}
used as:
for i := range count(n) { ... }
If this will become the new idiom for counting to n, it's unclear where the count function would be defined. Some package that essentially every program imports?
Counting from 0 to n is so incredibly common that it could merit a predefined function, but at that point we’re talking about a language change. And if we’re talking about a language change, it makes sense to continue to extend range in a type-based way, namely by ranging over ints.
Adding support for ints to range would allow writing this code:
for i := range n { ... }
instead of:
for i := 0; i < n; i++ { ... }
For former C, C++, and Java programmers, the idea of not writing the 3-clause for loop may seem foreign. It did to me at first too. But if we adopt this change, the range form would quickly become idiomatic, and the 3-clause loop would seem as archaic as ending statements with semicolons.
Discussion
What do people think about this idea?
Should we stop at push functions and not allow pull functions in range?
First, I'd like to note that the general idea of something iterator-related would be a welcome addition to the language.
When I started with Go (coming from a C# background), the differences between the ways of iteration were quite confusing. And they still are! C# has this notion of: everything that is an IEnumerable<T> can be accessed and manipulated with LINQ.
However, LINQ is a beast itself and introducing something like that is definitely not suitable for the goals of the Go programming language. And I would even argue that it's not needed.
The concept of pull and push functions is clear. Incorporating this even further into the language, e.g. by defining a Range() method as considered in the alternatives, would decrease the readability. Developers would need to know about this concept, because it definitely hides something. So I would consider this a no-go. Explicit readable code is the preferred way.
As for the range over ints proposal: Python has something similar, so this new pattern could ease adoption for Python developers. I, for one, don't have a strong opinion for or against it, as I'm already too used to the C way.
Edit: made some minor grammar corrections, simply because English is not my first language.
Incorporating this even further into the language, e.g. by defining a Range() method as considered in the alternatives would decrease the readability. Developers would need to know about this concept, because it definitely hides something.
I actually don't entirely agree with that. Yes it is a new concept, but it keeps the idea that a function is a function and has computational weight.
When doing a for x := range x.Range() you should be able to know that it's a function call and that the range will produce values from it. This seems to fit more with the Go theme than having predefined interfaces that an object must implement to get range functionality. It really doesn't seem to be hiding anything; the cards are out in the open. In other words, you can read this as range (produce values) from x.Range(), calling it every iteration. The only implicit part is the function signature, which shouldn't be foreign to anyone who is familiar with a HandlerFunc or other similar APIs in Go.
So this actually seems pretty explicit and actually far more readable as there would be one way of iterating through objects vs consulting the documentation on the specific iterator semantics.
I'm really interested in what the transform for push functions that allows flow control statements would be. This would effectively add a form of non-local return to Go, which other languages use to make these sorts of internal iterators feel nice.
This is an unfortunate limitation; do we need it? I don't think we want to allow racy calls to yield, but I can imagine push functions that, e.g., start worker goroutines to walk a data structure and call yield on all elements. Provided there is synchronization around yield calls, it feels like that should be fine.
This could always be worked around by having workers send values back to the original goroutine, which calls yield, but this feels like an awkward requirement in the language. I can't think of other functions that must be called from the correct goroutine (t.FailNow() is the closest I can think of), so this seems odd.
That said, I'm not sure how to reconcile this with what should happen if the loop body panics.
yield just returns false, I suppose? (This would imply that the yield implementation would use some internal communication mechanism to make the loop body run in the original goroutine) Edit: this doesn't make sense, as the original goroutine is likely blocked in something like sync.WaitGroup.Wait.
Indeed, a panic or a call to defer is the main reason the goroutine limitation exists. I doubt it will be much of a problem in practice. We can also always lift it later.
Yes, I believe the yield function will panic if it is called after the loop is done or from the wrong goroutine.
What do we expect the push function to do if the yield function panics, either for the reason above, or because of a call to panic within the loop body? I imagine we expect it to stop and not call the yield function any more; is that right?
If the push function uses defer to recover from the panic and call the yield function, it seems there is potential for an infinite loop of panics. Perhaps the yield function should check for this and after some number of panics do something else? Maybe exit the goroutine or exit the process?
The conventional names I've heard before are internal iterators ("push functions") and external iterators ("pull functions"). I'm not sure if you were avoiding these terms on purpose, but this may be helpful when comparing to what other languages are doing.
When using a few generic containers (sets, ordered maps, etc) the lack of native range iteration is, I think, the biggest point of friction that makes them feel like second-class collection types. So I'm cautiously optimistic that this idea would solve that problem.
If the iterator function doesn't get inlined (how likely is that, particularly for push functions?), then the call-per-iteration seems like it would make this fairly slow in some contexts (performance-optimized data structures).
The iteration on ints seems wholly unnecessary to me. It makes two ways to do a very common task (decreasing code readability) while saving hardly any typing. Concern (3) is also significant here: if the "nice" way to do it is 10x slower, then the choice of which form to use is more of a burden.
I wasn't aware of those terms. At first glance I'm not sure what would make something internal or external, so I think I will stick with pull or push, but thanks for the mapping for people who are already familiar with them.
👍
I believe that the most trivial push iterators will get inlined. Clearly we can't land this feature with terrible performance: we will do the work needed to make it perform well (or else rethink).
I'm not convinced that for i := range n decreases code readability, but that will depend on how quickly everyone moves to the new syntax. Regardless, it won't be 10X slower. It will be exactly the same speed.
I actually like the for i := range n syntax. I've made errors numerous times in Go writing that line. Coming from the Python world, where we are habituated to writing for i in range(n), this should be a welcome change.
tl;dr: External iteration is when the user's code "calls" the iteration, as with iterator types and for loops. Internal iteration is when the iteration "calls" the user's code, as with forEach-like functions in JavaScript, Ruby et al.
The push function case has a weird quality that I think is novel. The yield() function the compiler passes to it is a function that you can't write in Go, because it's a function which, when called, can execute a defer in a caller's context. I'm mildly afraid of that, not least because I have often wanted the ability to write "defer-but-in-parent" and also I would be absolutely miserable if anyone else (including "me three months ago") had access to it.
I don't think we could entirely dispose of the three-clause loop, but I do agree that I'd be fine with not needing it in the "count to n" case.
Anyway, as a person who's repeatedly wanted to request iterator support in the language, I will say that I like this a lot, and at least so far, this feels like something that I'd use and not hate, which is pretty high praise for programming languages.
Thanks for writing this up @rsc, and for providing a clear mental model around push and pull based iteration! I really like the direction this is going.
One thing that seemed a bit nuanced in the description was the push/pull distinction apparently requiring different parentheses.
for k, v := range t.All
for k, v := range t.Iter()
This is likely an artifact of your example having Iter() return a function instead of an intermediate value (as in the discussion), because I think that this clears up the nuance (despite being more verbose):
for k, v := range t.All
for k, v := range t.Iter().Next
I also think the decision to pass a function to range instead of passing a value that has a given method is a good one; although it adds a bit of syntactic noise, it makes it very clear how the feature works, and (although you don't call this out) allows people who are navigating a new code-base to click on a method to see where it's implemented as they would for a function call.
The biggest concern I can think of is not really a concern with the proposed changes to range itself, but with how it would interact with the rest of the language. In particular, if so many things are allowed, there's no way to specify that my function takes "something it can pass to range". This may not be a problem in practice (the examples I can think of are fairly mundane) but it might be irritating to have to write ~6 implementations of the same thing for various different push and pull function signatures.
This could be solved with some syntax in interface definitions (for example):
type Aller interface {
    All range[K, V]
    AllKeys range[K]
}
This is similar to the operator-based approach that was decided against for type parameters (in favor of the named types approach), so there may be issues there that led to that decision that I don't know about. It also has the downside that you lose information (once you have an Aller you cannot call its All() method directly because you don't know what its signature actually is).
Alternatively it could be solved by heavily restricting the proposal so that range only accepts functions with one signature (probably func(func(k K, v V) bool) bool). Although it seems reasonable to require All to always return a bool (just as Close() always returns an error), I'm not sure how reasonable it is to require two callback parameters - implementors could always pass nil as a second value, but it seems a bit meh. This would also mean that pull-iterators are not directly supported, and possibly the push function is made available so people can convert between the two. (I do think it would be reasonable to support i := range n if n is an integer type even if there was no way to pass "either an integer or a function with the right syntax".)
type Aller interface {
    All func(func(k K, v V) bool)
    AllKeys func(func(k K, v V) bool)
}
A third option is to split the difference, allow some number of types (more than one and fewer than six) so that if you want to write code that takes something that can be passed to range, you only need a couple of different copies.
In any case it would be nice to be able to write code like this and pass anything that could be passed to range to a function (but probably not a deal breaker if you can't):
func benchmarkAll(t Aller) {
    last := time.Now()
    for k, v := range t.All {
        fmt.Println(time.Since(last), k, v)
        last = time.Now()
    }
}
Thanks for writing this up @rsc, and for providing a clear mental model around push and pull based iteration! I really like the direction this is going.
Totally agree.
One thing that seemed a bit nuanced in the description was the push/pull distinction apparently requiring different parentheses.
for k, v := range t.All
for k, v := range t.Iter()
I noticed this too and I'm a little conflicted about it.
On the one hand, I don't want to have to remember whether any given loop needs the parentheses or not. Sounds like a great source of frustration while coding.
On the other hand, it could be nice to have a visual indicator of whether a loop is over a push-type (repeatable) or pull-type (consumable) value. (Attempting to reuse a spent iterator is a mistake that I still occasionally make in Python.)
Parenthesis would probably not be my first choice for that visual indicator. A different keyword, maybe. Or perhaps a naming convention would be sufficient.
That said, I guess this confusion already exists today - channels are consumable, but maps and slices are reusable - so maybe it's too late to do anything about it.
range t.Iter() would require Next() to be a magic method name so I think that was an oversight. If you assume magic methods are forbidden, ranging over an iterable must be range t.Iter().Next.
range t.Iter() would require Next() to be a magic method name so I think that was an oversight. If you assume magic methods are forbidden, ranging over an iterable must be range t.Iter().Next.
I believe t.Iter() returns a func - the name is immaterial - only the signature needs to match.
t.All could return a func as well. I think whether or not Iter() returning a func is more plausible depends on a) whether it was written before this proposal is implemented (it would most likely return a mundane iterator type), and b) whether we have an iterator library (it will likely return a canonical iterator type). In the space between implementing this proposal and us getting an iterator library, I could see an argument that returning a pull func from Iter() is easier in some cases.
Either way, whether you have to put a call expression into range doesn't actually depend on whether or not you use push or pull. It depends on what the type of range expression is. That's, FTR, the same as today - a range expression can be a function call, or it can not be.
What is the intent for using these as iterators? I know that the discussion here splits, but to me, if these are not usable to write an iterator library, I don't really see the point for a relatively invasive language change.
From what I can tell, there is no realistic way to write a function which takes "either a push or a pull function". There isn't even a way to write one which can take a push function, due to that having 6 (?) different forms. I mean, you can write a type-constraint for "it has to be any of these" and use a type-switch, but that isn't exactly ergonomic.
So all I could think of is iterator-compositions taking a form they need and returning a form they find convenient, with the user being expected to use the appropriate glue code to transform them back and forth. Especially given that some of these need a separate stop function, that sounds like a pain to manage.
So while I can totally see how this would enable us to iterate over user-defined collections (and I think it does that reasonably well, though I find the dangers of push persisting yield icky), I can't really see how this addresses the goal of "a standard way to do iteration".
AIUI one of the goals is to provide the language change needed to then do #54245. But #54245 really only needs pull-functions to be rangeable, doesn't it?
Of course, you'd also need FromPushFunc2 and perhaps FromPushFunc0.
@Merovius Right, this was my point. You can't abstract over varying numbers of values in func signatures.
There are also the variants where the push iterator func itself returns a bool. That doubles the cases.
Subjective does not mean insufficient.
I don't think I equated the two with the totality of what I wrote.
With regard to your less subjective argument, note that this proposal does not require any significant rewriting of the loop body.
I was thinking that for push iterators, the loop body would just be the body of the callback, but now I see that return and goto wouldn't work that way. So the range loop is basically only working with pull iterators anyway, after converting push to pull. Makes sense.
Currently I can't imagine that we would make that choice. We would need much better arguments than we've seen so far.
I haven't seen any counterarguments from "we" indicating any problems with the arguments so far, but if there's just no interest from the core Go team, then there's no point in continuing to talk about it. I'll end my remarks about it here.
I haven't seen any counterarguments from "we" indicating any problems with the arguments so far
You have seen them. You disagree with them. That's fine, but I'd personally be far more inclined to converse, if that's a distinction you could consistently internalize and reflect.
Because there are no defer, goto labels, etc., this is plausible to run in Go:
iter := CountN(3)
var xs []int
for iter(func(x int) bool {
    xs = append(xs, x)
    return true
}) { // gofmt formats this oddly, who writes an empty loop body here?
}
It's kind of funny that what should be the loop body appears in a function here, with an empty loop body. I think that's a good demonstration of why the minimally magical language change would be sensible: just allow writing the loop body where it should reasonably appear, while preserving the semantics of defer, goto, break, continue, panic (or anything I'm forgetting)
@Merovius No, I haven't, and please don't tell me what I've seen.
To be clear, this is regarding the discussion about including generator functions.
Many of the points I made in response to you and @ianlancetaylor remain unaddressed. Here are some of them:
I don't think panic is an exception; it's the lighthouse pointing the way...
It clutters up the code. Nobody wants to wrangle callbacks if they can help it.
I don't think any other language feature has needed to know how to interact with user types like this, to be fair.
I think the compiler would emit an error if yield is used inside a function that doesn't return the right type.
Many of the points made in response to me were subjective and vague. Here are some of them:
My feeling is that part of this discussion is exactly about...
Defining Iter and yield in ways that make sense for Go...
We should not rely on a special type Iter, because nothing in the language depends on special types
We should not rely on a special function yield that changes the flow of control, because we try to make flow of control very clear (here panic is the exception)...
It does not seem Go-like to me...
It doesn't really seem better in the sense of making it easier or more convenient or more robust to write code.
I'm not going to explain here how logic works, but suffice it to say that good premises and conclusions in arguments are falsifiable, and "it does not seem Go-like to me" is not falsifiable. There's no way to argue against it. The only things you can really say in response to statements like that is "I agree that you say you have that feeling" or "I have the opposite feeling." There's no "meat" on those bones to sink your teeth into.
I addressed all of the objective points that were made in response to me. We seemed to get sidetracked on how generators work, and I'm still unclear on whether my attempt to clarify how they work had any kind of effect, since what I wrote wasn't acknowledged.
I'd prefer to leave it there. If we must, let's agree to disagree.
Edit: I should add that I interpreted the "we" to mean the core Go team, not including @Merovius, and by arguments, I meant unaddressed arguments.
@willfaught I'm sorry it seems like we're ignoring your points. I personally don't find it productive to reply with a simple "I disagree". It doesn't seem to lead to useful conversations.
That said:
I made the point that a builtin yield function would lead to unexpected flow of control. For some reason you call that point "subjective and vague." I don't think it is. I think it is objective and clear.
You responded by saying, I think, that we should treat panic not as an exception but as a path to follow.
I disagree.
Given that disagreement, I don't find it necessary to keep responding to every other argument on this topic. At some point we have to be able to draw a line.
I don't know if we are going to adopt this proposal (for range over pull and push functions) or not. I'm in favor of it but I can live without it. But even if we don't adopt this proposal, I'm really pretty sure that we aren't going to adopt a new yield builtin function. So my interest in discussing that topic is naturally somewhat limited. I'm sorry if this seems harsh. I'm sure it does seem subjective and vague. That's OK with me: some aspects of language design are subjective and vague. I'm just trying to state my views clearly and honestly.
I like all of this proposal but it's not clear to me how control flow statements would be implemented within the body of a loop ranging over a push iterator.
Adding support for push functions to range would allow writing this equivalent code:
for k, v := range t.All {
    fmt.Println(k, v)
}
In fact, the Go compiler would effectively rewrite the second form into the first form, turning the loop body into a synthesized function to pass to t.All. However, that rewrite would also preserve the “on the page” semantics of code inside the loop like break, continue, defer, goto, and return, so that all those constructs would execute the same as in a range over a slice or map.
Break and continue are easy (return false and return true, respectively) but AFAIK the only way to implement return, defer, and goto without significant runtime changes (e.g. non-local return/jump) is with something like this:
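For instance, here is a hand-written sketch of how a return inside the loop body could be captured, reusing the Tree.All push function from above (findValue is a hypothetical helper, not part of the proposal):

// findValue shows the kind of rewrite needed for "return v" inside
// "for k, v := range t.All { ... }": record the result, stop the push
// function by returning false from the callback, then return afterwards.
func findValue[K comparable, V any](t *Tree[K, V], target K) (V, bool) {
    var (
        found bool
        val   V
    )
    t.All(func(k K, v V) bool {
        if k == target {
            found, val = true, v
            return false // acts like "break"/"return" in the loop body
        }
        return true // acts like falling off the end of the loop body
    })
    return val, found
}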
Sure, something like that. #47707 has a bunch of discussion about that. I'm trying to keep this discussion at a higher-level, but I'm confident it can be implemented.
I like the fact this avoids the introduction of specially-named methods. It's good that Go avoids this.
The "pull" range idiom is similar to what I envision as a typical iterator in most languages, the only difference being the iterator here is a function, each call advancing the iterator, rather than an implementation of some interface with a similar function that advances the iterator. In the end, they are similar. Does this addition improve the language? I'm not so sure, mostly because there's not a whole lot of difference between the code being replaced and the replacement, so I'm not so sure the small savings in code justifies the language addition.
The "push" idiom is a little more difficult to picture in one's mind, it takes a little bit more mental effort to traverse from the pull function to its use as part of a range loop. Once again, I'm not so sure the small savings in code justifies the language addition.
the "range over int", I find it less useful, because it is restricted to a range from 0 up to n, and I find my range loops are sometimes descending, sometimes starting from 1, and all sorts of other combinations. So I don't see it as all that valuable. It would be a bit more valuable if it looked like: for i := range m..n { } and then you'd have a few more options with m > n, m = 1, etc.
If people were to start writing
for i := range n {
    i++
    ....
}
or
for i := range n {
    i = n - i - 1
    ....
}
then that would be worse than what exists today.
So, overall, I'm skeptical of this proposal. For me, I'd probably be happier reading and writing the original code rather than these new 'range' equivalents.
For me, the main objective of adding iterators to the language is to provide common types shared by many. Code using or producing iterators written by different people would be automatically compliant, because they're both standardizing on the same common library types. Without that, additional code is being written to translate one iterator type to the other.
So, from this proposal I suppose the pull and push functions suggest that you might define standardized iterators to be:
type Iter[V any] func() (V, bool)
type PushIter[V any] func(V) bool
And then perhaps people might decide to standardize on these two iterator data types, but frankly, they are probably not what I would choose to standardize on (although that would be a whole new discussion).
I'm not going to try to convince you to change your mind, but I do want to point out that this reply is focusing on the language change by itself, not engaging with the point at the start of the post, namely that there is a tower of babel of iterators and that supporting canonical ones in range will both encourage implementers to use a standard pattern and make usages cleaner.
I agree that in these trimmed down examples the differences do not appear large, although with more complex expressions the linguistic benefit is greater. But focusing on the linguistic benefit ignores the ecosystem benefit of a way to standardize what an iterator interface looks like.
I agree with this comment, I do think that standardization around a common pattern is beneficial, to avoid the tower of babel. In fact, that's what I meant by my comment "For me, the main objective of adding iterators to the language is to provide common types shared by many".
I agree 100% that the primary benefit is to "standardize what an iterator interface looks like".
I did focus largely on the integration with range in my comment. So, I see your point in this reply. I think that you are right that a proposal like this would likely push most of the past and future iterators towards either the push or pull pattern proposed here. If that is the primary goal, it would likely do that, in my opinion. People would most likely want to support any new "range" functionality. Although, to be sure, you'd probably want to make the "range" functionality as attractive as possible.
Probably, most would gravitate towards the "pull" pattern.
Probably, most would gravitate towards the "pull" pattern.
You expect most developers would choose to implement Next() (T, bool) instead of Range(yield func(T) bool)? IMO the latter is far more intuitive, especially for complex structures such as a tree. Next() (T, bool) is easy to implement for queue-like values such as channels and random access values such as slices, but implementing Next() (T, bool) for a map or a tree is significantly more complex. Implementing Range(yield func(T) bool) is trivial for most iterable values.
I think the code search below by @rsc provides some evidence that most people in most situations gravitate towards pull functions.
Even though callbacks are sometimes the better choice to make code cleaner, many developers never use them at all. Pull is simpler, you call something and get something back, and then you repeat, case closed. But it's true that it can require more work maintaining state inside the pull function. Maybe for push, some people have a harder time picturing a call stack and the flow of control in their minds, and maybe pull is easy enough in most cases that it is the preferred choice.
I do agree, traversing binary trees, or many other data structures, is often much cleaner and simpler with callbacks like the push pattern, and so push can sometimes be the better choice.
I expect the main reason for package developers preferring pull iterators is that they feel more natural for the consumer than passing a callback, IMO. However if this proposal is accepted, I expect package developers will shift to writing push iterators with the barrier (the consumer reasoning about a callback) gone.
I like this. One tiny nit, though... I would prefer to leave out the possibility of a push function returning a bool.
It doesn't save much code, as any function returning a bool can be trivially wrapped in a function that returns nothing. The tree example above could be rewritten:
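func (t *Tree[K, V]) All(f func(key K, val V) bool) {
    t.Under(f)
}

func (t *Tree[K, V]) Under(f func(key K, val V) bool) bool {
    if t == nil {
        return true
    }
    return t.left.Under(f) && f(t.key, t.value) && t.right.Under(f)
}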
I would prefer either to say that a push function must return nothing, or to say that a push function can have any return type(s) at all, including nothing, and for...range will ignore the returned values.
If we drop the optional bool, we need a good, short name to replace All as the thing you range over in for t := range x.All {. If the answer is for t := range x.Under {, I don't understand what Under means in that context.
In the code snippet above, my intent was that one would continue to write for t := range x.All {. Under is just a support method for All, and it is identical to your All method, just renamed.
The reason for allowing specifically bool is that it is the same result as in the yield callback. Perhaps it should be dropped though, so that the function must return no results.
If we allowed arbitrary return types, I suspect there would be too many false positives or misuses, such as a function that returns error being used with range and the user then not noticing the error.
Sorry for misreading Under vs All. I would be reluctant to establish a convention of calling the non-bool-returning push method All, since that's not the signature that Python and Rust's all has. We'd probably have to pick some other name.
In the context of a method on an iterable type accepting a predicate, I would expect All, Each, and Every to behave the same. In most languages, one of those is the idiom for "return true iff the predicate returns true for all/every/each element of the collection". IMO it would be more natural for Range to be a function that enumerates a range (subset) from the collection. Using All, Each, or Every as the iterator is natural and intuitive to me.
The reason for allowing specifically bool is that it is the same result as in the yield callback. Perhaps it should be dropped though, so that the function must return no results.
To be clear, although I would prefer to drop the returning bool version, it's not terribly important to me one way or the other.
I really like the idea of getting some form of standardized enumeration/iteration in Go.
For my 2 cents, I'd like to start with as concise and explicit a TL;DR summary of push/pull functions as I've understood them:
You call push functions once in your code, and the given yield function is called repeatedly by the push function, once for each item in the "collection".
You call pull functions repeatedly in your code, each time it returns the next item in the "collection".
I generally like this, as it makes for a set of very small yet simple and flexible methods of enumeration/iteration over a collection.
Suggestion
Next, and feel free to disagree here, I'd like to suggest alternative names for push/pull functions:
"enumerator functions" instead of "push functions".
"iterator functions" instead of "pull functions".
Enumerator
My reasoning for "enumerator" is largely due to my history with Ruby, where any object can be made enumerable by simply defining a #each method that works very much like the push functions proposed here. (You should also include the Enumerable module to get #map, #inject, #select, etc., which all use #each under the hood.)
Personally at least, the word "push" feels suggestive of pushing values into the collection. Hence when reading the code examples, I realized push functions work very differently from the initial impression I got based on the name.
Iterator
As for "iterator", my reasoning is simply that it feels very similar to other types of iterator objects I've come across which may have Next(), Prev() and similar methods. Except it's not an iterator itself, it is a singular "iterator function", that simply iterates to the next item each time it's called, and nothing else.
Type safety?
The only thing I feel slightly uneasy about with these functions is that I don't see how the type system could be used to reliably ensure a function given to range is a push or pull function, and not simply something completely different that has a bool as its final return value, or a func arg with a bool as its final return value.
Range int
And finally, regarding range over ints, conceptually the wording of something like range 7 feels a bit forced to me. 7 is itself not something with a range. Something like range 2..7 feels less forced, and is more flexible too. But I assume that requires changes to Go's syntax.
Though I personally feel fine about using the three-clause for loop on the rare occasion I need to loop N times. And that's despite my history with Ruby and its 7.times { |n| ... } and (2..7).each { |n| ... } stuff.
Of these, two are arguably actual pull functions (reflect.Value.Recv and reflect.Value.TryRecv). The other 17 are accidental. One of the oldest (from Go 1) is regexp.Regexp.LiteralPrefix, which has signature:
func (*Regexp) LiteralPrefix() (string, bool)
So you could accidentally write
for prefix := range re.LiteralPrefix {
    use(prefix)
}
The loop would run either zero times or forever. This will be true of most "accidental" pull methods: if it's accidental, it probably returns the same thing every time you call it. Even the lightest testing seems likely to find this problem.
There are 7 instances in the standard library of methods that are push functions:
The first five are really all the same instance (Eval) and are accidental. Disallowing the returned bool would disqualify them. The last two are true push functions.
Accidental push functions seem to me far less common than accidental pull functions: plenty of methods take no arguments and return (T, bool). Very few take a callback returning a bool.
range n is not a duck, although you are not the first person to ask that.
The 3-clause "count to n" really is a significant stumbling block for new Go programmers, and it is a remarkable number of tokens to explain, to do something so incredibly common. A quick scan looks like the majority of 3-clause for loops in the Go repo can use range instead:
Using range for the majority that do count from 0 to N would make the others stand out more as unusual in some way, which would be helpful when reading the code. Skimming the for3 files created by that script, I often noticed lines and thought "wait, what's wrong with my regexp? why is this here?" only to read more carefully and see that the line really isn't a 0 to N loop, in a way that I missed at first glance.
I do admit that range n seems very un-C-like, and that aspect surprises people. But I don't believe that means it is un-Go-like, any more than not using semicolons.
Good to see pull/push function signatures aren't that common.
I believe passing the wrong function to range would probably be pretty rare, but I do like the idea of push/pull functions by their nature explicitly being push/pull functions without the need to read accompanying documentation to make sure.
I'm not sure it's a good idea. But the only way I can think of to make their signature explicitly indicate they are pull/push functions, is to swap out the final bool return type, with a new custom "iterator bool" style type.
If we had a regular type in an iter package, for example, something like:
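package iter

// Bool is a hypothetical, distinct "iterator bool" type. Using it instead of
// a plain bool as the final result of push yield callbacks and pull functions
// would make those signatures stand out in the type system from ordinary
// functions that merely happen to end in a bool. (Names are illustrative only.)
type Bool bool

// A pull function would then be written func() (V, iter.Bool), and a push
// function func(yield func(V) iter.Bool).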
I'm not sure an iter package really fits with what's proposed here though; I merely used it as a means of easily showing conceptually what I have in mind.
The end result though of something like the above, is that pull/push functions become distinct within the type system from other functions which have a final bool return value. And it also makes it very obvious to developers that it's a push/pull function by just looking at the function signature.
I'm fine with the integer range. Doesn't seem like a big deal either way. It has precedent in Vue templates, which have <element v-for="i in n"> or <element v-for="i of n">. My main issue with it in Vue is that in and of iteration behave differently in JavaScript, but the same in Vue templates, which is confusing. This doesn't apply to Go though.
If you have var f func() (T, error, bool) are for v := range f and (less importantly) for range f valid? Or do they need to be for v, _ := range f and for _, _ := range f? Making them valid is more consistent with slices and maps, but possibly more error prone since they allow accidentally omitting error checking.
The number of range variables would be required to match the number of pull results (minus the bool) or the number of push yield arguments. Only slice and map would allow dropping the _.
I think there is a definite improvement in discoverability and understandability of this feature if there is always a 1-1 correspondence between range-variables and returns from the function. If I know little about Go and read Go code and see all those range statements and wonder what they do, it'll be hard enough as it is to map the dozen or so different forms of functions which can be used and what they mean. Throw into the mix that the number of range variables can also differ from the number of returns…
I would even go so far as to argue that allowing different numbers of loop variables for maps and slices might have been a mistake. for i := range someSlice is still a source of confusion and bugs, when people assume that it yields values. If you'd always have to write for i, _ := range someSlice or for _, v := range someSlice, that would've been avoided, at the cost of a bit of verbosity.
The one exception I could see is for range x. Perhaps that should be allowed with any number of returns, as it looks sufficiently different.
range is already inconsistent since chans only have a single-value variant of it. I don't think it's too weird for funcs to have their own rule for that, too.
I don't understand why it needs to be as complicated as this. Can't Go define the standard iteration interfaces that range will support in the stdlib, and give an order of precedence and/or chose the interface based on the declared range variables. All you need is the standard pull-type interfaces.
I "think" this is complicated because there is still a design concern about supporting general iteration over a map - requiring a push interface. I think this is easily addressed with built-ins, e.g. iter(somemap) that return one of the above declared interfaces. For non-builtin containers this is not an issue.
I don't see how "generators" align with the Go team's concerns over flow control (which seems to have been the major barrier to exceptions). "generators" (aka hidden threads or coroutines) are the magic that Go typically tries to avoid.
I had envisioned scoped channels as simply tied to the creating routine. If that routine's references to the channel goes out of scope, the channel is automatically closed - essentially an implicit defer().
But I don't think it's really necessary. I don't think this is a real problem in practice. These leaked channels/routines are easily detected and then fixed in the design. In any long-lived system the monitoring will detect the leaks, and the overhead of the routine is fairly minimal in the meantime.
In either case, the stack trace is simple. If the iterator is a Map, the stack is range_body -> (*Map[K, V]).All -> run. If the iterator is a Tree, the stack is range_body -> (*Tree[K, V]).All ... -> run with All repeated a number of times depending on the recursion depth. The only thing that's not obvious about that is that the range body is a function. And unless the iterator does something like run the iteration in a separate goroutine, the stack trace will be simple.
Don't start in the range body - start at the function calling range - and imagine the debugger trying to step through the elements and the range body.
Based on the sample 'unranging' implementation it doesn't appear trivial to go backwards. A pull iterator seems trivial. Maybe each of the generated lines can be labelled with the source line but how the stack aligns is difficult for me to grasp - but I am sure I am missing something.
In most languages when you "compile for debug" it removes a lot of the optimizations because the debugger can't deal with it. This seems to require a level of code generation that won't be easy to work with.
I would expect the debugger to step into All if I tell it "step into" when the current line is for range it.All. If I "step over" that line, I would expect it to transparently step through All until it calls yield or returns. If I "step into" the range statement during the middle of iterating, I would expect that to be equivalent to stepping out of the body function.
I like this idea. It solves a bunch of the confusion that arose trying to deal with an iterator interface, and it also leaves it open to potentially add new function signatures later if something useful comes up. It also feels more consistent with the way that the rest of Go works by not relying on methods for a language feature, though it's definitely a bit strange in its own right in a completely different way.
I thought a bit about how a general iteration package could be implemented around this, and I think the best option is to deal with push functions primarily. Everything else can be very easily and cheaply converted to them, so it seems like the most general, simplest form. Here are some examples, assuming that #49085 or something similar isn't adopted:
package iter
type Push[T any] func(yield func(T) bool) bool

type Pair[T1, T2 any] struct {
    A T1
    B T2
}

// Uses Pair to get the index, too.
func FromSlice[E any, S ~[]E](s S) Push[Pair[int, E]] {
    return func(yield func(Pair[int, E]) bool) bool {
        for i, v := range s {
            ok := yield(Pair[int, E]{i, v})
            if !ok {
                return false
            }
        }
        return true
    }
}

func FromMap[K comparable, V any, M ~map[K]V](m M) Push[Pair[K, V]] { ... }

func FromChan[E any, C ~chan E](c C) Push[E] { ... }

func Map[T, R any](f Push[T], m func(T) R) Push[R] {
    return func(yield func(R) bool) bool {
        return f(func(v T) bool { return yield(m(v)) })
    }
}

func Filter[T any](f Push[T], keep func(T) bool) Push[T] { ... }

func Reduce[T, R any](f Push[T], initial R, combine func(R, T) R) R { ... }

// These might be redundant because of Reduce(), though it would be nice to
// have common functionality like this pre-written.
func IntoSlice[E any, S ~[]E](s S, f Push[E]) S { ... }

func IntoMap[K comparable, V any, M ~map[K]V](m M, f Push[Pair[K, V]]) { ... }
And so on. The composition of the push functions is kind of interesting, but it's a bit confusing to look at, I think. It might be easier with something like #21498, but I'm not entirely sure.
Yes, I think it is confusing to look at. It takes some concentration to follow. Returning from a function a function that takes a function argument, to be called by some calling function, is hard to follow without a lot of concentration.
I think the only way to make it easily readable is to do type definitions so we do not see all three functions at the same time.
I thought a bit about how a general iteration package could be implemented around this, and I think the best option is to deal with push functions primarily.
We're off on a tangent here, but I disagree.
The primary purpose of a standard iterator type would be as glue: one thing produces values, another consumes values, and the iterator type connects the two things. As far as this goes, the iterator type could be based around either push or pull functions.
But an iterator type based on push functions can't do anything else, at least not in any reasonable way I can see.
An iterator type based on pull functions can have other abilities, if the underlying source of values permits those abilities. For example,
move an iterator some number of steps backwards so we can revisit the last few values
clone an iterator to yield another that can be advanced over the same sequence of values, independently of the original iterator
use iterators to mark positions in ordered collections, as in the C++ standard library
An "iterator based on push functions" does not make sense. Push functions provide iteration, but they are not iterators in the sense of an iterator being an explicit object representing the state of an in-progress iteration (which is the meaning of "iterator" in almost every other language). I wrote above in a different comment:
Here's a different way to view things. The library being discussed in #54245 is about iterators (think "cursors"). Iteration, meaning what's possible with range loops, is a much broader topic. If you have a data structure that explicitly represents the paused state of a single iteration, which is what people usually mean by the term "iterator" (including in #54245, but also in C++ and Java), that's almost always a "pull function" (or an object with a pull method). Those have their place, and it would be fine to have a library to help with them, perhaps in std or perhaps in x. But range and looping generally is a much broader topic, and we shouldn't overfit to iterator objects.
Probably a stupid question, but in most cases where iteration has no side effects (on itself?), couldn't a properly defined, closure-based yield function be used to create a generator/iterator/pull function?
Essentially, extracting the iteration internal to the collection by iterating once and storing the values in a slice, or pushing them onto a channel, etc.?
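For what it's worth, a minimal sketch of that idea (the names are mine, not from the proposal): run the push function once, buffer its values in a slice, and hand out a pull function over the buffer. This only works for finite sequences you are willing to materialize.

// pullFromPush converts a push function into a pull function by eagerly
// buffering every value in a slice (illustrative sketch, not proposed API).
func pullFromPush[T any](push func(yield func(T) bool) bool) func() (T, bool) {
    var buf []T
    push(func(v T) bool {
        buf = append(buf, v)
        return true // always keep going: we want the whole sequence
    })
    i := 0
    return func() (T, bool) {
        if i == len(buf) {
            var zero T
            return zero, false
        }
        v := buf[i]
        i++
        return v, true
    }
}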
Your question is worth pondering - ISTM the really useful and interesting cases of for will be the ones where there's a lot going on behind the scenes. Persistent data structures would be a great example.
Just wanted to summarise my take on adding range F, F a push/pull func and range n, n an int, after digesting things further and some back and forth over a couple of sub-points in other threads.
First, thanks for setting up this discussion. It indeed addresses something missing in Go, a mechanism for custom iteration, in an interesting way that extends Go's range specialisation over types to functions whose bodies roughly correspond to the body of a for loop, and to range n for an integer n.
For range n, I think an approach that does not panic would be preferable. If that were the case, I'd be for it. If not, I'd lean toward not supporting it, because it doesn't save much and the need to take possible panics into consideration would counterbalance the benefits.
I think custom iteration in general, that is by any means, should be taken slowly and with due diligence to, as the top doc says:
People often mention as a strength that all Go code looks about the same. That’s simply not true for code with custom iteration.
Custom iteration is often difficult to work with in other languages for this very reason. To me, one of Go's strengths is that there is not much custom iteration so that the loops all look the same. Support for a uniform mechanism for custom iteration would IMO make Go less uniform wherever it is over-applied or non convergent w.r.t. best practices. Finding a balance for where custom iteration helps vs hurts will be a long road, and guidance along the way would help.
For range F, the body of F corresponds roughly to the loop body via compiler translation of the loop body.
The 'roughly' part feels too rough to me: there are cases in which such a function would panic where it wouldn't in ordinary usage. The argument to push/pull is not really a full-fledged func, but it looks like it should be. The scoping differences compared to today's range loops are confusing, and are cited as a reason to make changing loop semantics a prerequisite, even though the pre-declared for _ = range version (not for _ := range) has the original semantics. Should the loop semantics change, this would introduce non-uniformity in Go constructs where it is uniform today: := would no longer correspond to a single declaration within block boundaries {}, and the for loop itself would behave more differently between the two versions, = and :=.
I don't think custom iteration really needs the rough edges above. Personally, I'd find something along the lines below much simpler.
[the code below was edited]
var (
    a A
    b B
)
for a, b = range pull { }

func pull() (A, B, bool) {...} // any ordinary user defined func

==>

var ( // declared if the above is for a, b := range, not if for a, b = range
    a A
    b B
    brk bool // compiler generated variable
)
for {
    a, b, brk = pull()
    if brk { break }
    // user loop body here
}
Thanks for reading and your consideration!
[before edit, the incorrect code was]:
var (
    a A
    b B
)
for a, b = range pull, brk { }

func pull() (A, B) {...} // any ordinary user defined func
func brk(A, B) bool {...} // likewise

==>

var ( // declared if the above is for a, b := range, not if for a, b = range
    a A
    b B
)
for {
    a, b = pull()
    if brk(a, b) { break }
    // user loop body here
}
even in interface form
type Ranger[A, B any] interface {
Break(a A, b B) bool
Pull() (A, B)
}
for a, b = range R {...} // R implements Ranger, equivalent to for a, b = range R.Pull, R.Break { ... }
a pull function takes no arguments and returns the next set of N values (0 ≤ N ≤ 2) from the sequence
I am slightly concerned about ambiguity when an "element" bool is returned (i.e. ambiguity of use, as called out by @rsc as "accidental iterators"). While these may not show up all that often in the stdlib, I wonder if they may show up more frequently in community code.
For example:
// CheckHealth samples a system, returning true if healthy
func CheckHealth() (healthy bool)
The above would appear to be a valid iterator despite not likely being designed with iteration in mind. If designed to be an iterator, it'd probably look more like:
func MonitorHealth() (healthy, more bool)
Side note: in practice, such iterators may always return true for the more value.
Note that for ok := range CheckHealth() would be a compile-time error (too many values), while for range CheckHealth() would be equivalent to for CheckHealth(). The latter does suggest that 0-ary pull functions may be redundant, but it's not obvious to me that that's true of 0-ary push functions.
iirc, the rationale for not including custom iterators in the language initially was to prevent hiding cost and side effects (thus decreasing the ability to reason about code).
Today, if I see for range, I know that, except for iteration on channels, each iteration will be non-blocking and exceptionally efficient (~constant cost for slices and maps, and bounded cost for strings). I also know that even with channels, the iteration will have no side effects. There are many classes of bug I might encounter whose cause simply can't be the loop iteration itself, but with this proposal that would no longer be the case.
I wonder if the loss of these reasoning guarantees is truly outweighed by the convenience gained through this proposal.
This concern would be mitigated if we had distinct syntax of some kind, such as for x := func iterator or for x := range @iterator. I'm not suggesting a particular syntax, but would like to see us consider the idea of a variant syntax.
This may also ease integration with existing tooling, since I didn't see any behavior around omitted variables in the original proposal (i.e. for i := range iterator when iterator returns (int, string, bool)), and if we're forced to write for i, _ := range iterator, there will be some tooling, for some time, that will likely complain that the use of the blank identifier is unneeded.
Today, if I see for range, I know that, except for iteration on channels, each iteration will be non-blocking, and will be exceptionally efficient (~constant cost for slices and maps, and bounded cost for strings).
I would instead look at it as: the iteration behavior for built-in types was simple and fast. That would remain true with iterators.
I know also that even with channels, the iteration will have no side effects.
The sending goroutine can mutate shared state or perform other side effects between sends.
I would instead look at it as: the iteration behavior for built-in types was simple and fast. That would remain true with iterators.
That's true, but it's a weaker property: the minimum overhead remains fairly low, but the maximum overhead becomes unbounded. Further, where before the CPU overhead for a next-element (or channel receive) was consistent and negligible, with this proposal the overhead can vary per iteration.
Consider: at some hypothetical future time, there's stable code that uses a custom iterator which has long been assumed by readers to use a builtin collection (a plausible misidentification, since there's no syntactic difference in the proposal). Debugging can become an issue if that iterator, which generally has very tight performance per iteration, begins tripping on an edge case that blocks for an unusually long time.
My concern is that, based on that misassumption, the thing being iterated over may be one of the last parts of the code that is inspected to find the issue (because it's assumed the non-call expression being looped over couldn't possibly be the cause). Not all programmers review or keep up with language changes, and an identical-syntax change like this could end up resulting in one of those post-incident blog posts ("How Go magic iteration caused company X to have a 16 hour outage"). That hypothetical scenario could be avoided with visibly distinct syntax to signify custom iteration.
The sending goroutine can mutate shared state or perform other side effects between sends.
That's true, though it should be quite atypical, and contrary to "share memory by communicating" (if the calling goroutine wanted to share mutable state, then channels are not likely the appropriate mechanism).
In the general case, another goroutine (running in "parallel") could be mutating, without synchronization, a slice or map being iterated.
I specifically meant that loops themselves (specifically the evaluation of range expression) cannot cause side-effects today without a visible call: a visible call (with parens) sticks out as "something special may explicitly happen here, but only prior to iteration." With this proposal, even without parens, it's the case that "something special may implicitly happen here on each iteration."
@extemporalgenome I don't buy the scenario you are painting as a reason. Sure, it could happen, but I can conjure up similar scenarios for all kinds of code. For example, what if the code is it := myIter(); for { x, ok := it.Next(); if !ok { break } use(x) }, the reviewer checks myIter's code and sees that it just abstracts over a slice and approves. And at some point, another engineer changes it to iterate over a channel, after all "that's just an implementation detail of myIter". And then, at some point, some edge case…
If we always assume the worst case scenario that could happen under a language change, we will never change the language. It's not a practical approach.
I'm not saying I'm not a little bit worried about the potential hidden extra cost. But I'm less worried about this than I'd be about, say, appropriating Python's in operator, which is usually assumed constant time but often is linear time.
For example, what if the code is it := myIter(); for { x, ok := it.Next(); if !ok { break } use(x) }, the reviewer checks myIter's code and sees that it just abstracts over a slice and approves. And at some point, another engineer changes it to iterate over a channel, after all "that's just an implementation detail of myIter". And then, at some point, some edge case…
But the point is that there is a call there. However much you trust it.Next() to not change behavior, there's still a clear syntactic boundary between core language behavior and user-defined code, and as such it.Next() very clearly indicates that arbitrary side-effects or blocking may occur.
Granted, a programmer can switch an iteration over map keys to be an iteration over a channel without any other changes, and that would support your point even without the function wrapping. That is a risk inherent in language, but not necessarily one we should carry forward to other use-cases.
Aside: as relatively infrequently as it is used (in part because select statements are often needed), I do wish channel iteration had a slightly different syntax, as it doesn't have the same properties as slice and map iteration (can block, can be unbounded). As such, personally, I do not treat channel iteration as compelling precedent for extending existing range syntax to cover custom iterators.
If we always assume the worst case scenario that could happen under a language change, we will never change the language. It's not a practical approach.
I'm not suggesting no change, just adding a warning sign to things that can be dangerous. For example, if the syntax were changed slightly to be any of the following:
for v := range it... {}
for v := range @it {}
for v := range iter(it) {} // new builtin
(or even just dropping the range keyword alongside any of the above)
As long as identical syntax cannot be used for both builtin collection iteration and custom iteration, then there would be sufficient visual distinctiveness, without much typing cost, to make the code (and potential traps) much easier to reason about.
I understood your point. I just don't find it compelling. If I can construct a similarly asinine scenario with the same consequences in a universe without syntactic differentiation, then it seems obvious that syntactic differentiation isn't really the issue with your example.
I believe there's too much magic in the variety of function signatures that will be accepted by the compiler. I'm thinking about what it would be like to teach this to new Go programmers, and I suspect there'll be a mystical aspect to this which isn't present elsewhere in the language (aside from unintended design consequences, like loop iterator bugs).
A new programmer may learn that they can iterate over any function which returns (T1, bool) and (T1, T2, bool). Can they then iterate over a function which returns (T1, T2, T3, bool)? Why not? That's surprising (to a new Go programmer)!
They also learn that they can iterate on integers! It sounds like you can iterate on almost anything. They might infer that they can iterate on (T, int), where the int indicates the number of items remaining, since, orthogonally, iterating on functions, and iterating on integers could plausibly be combinable.
At this time, I'd favor a variation on #54245 that does not allow signatures to vary in anything but type (i.e. accept Next() (T, bool) but not Next() (T1, T2, bool)). If two-clause assignment is important, then where index/key value pairing is applicable, the element returned from Next would be considered an index/key, and a Get(K) V extension method would be defined to permit for k, v := assignments, just as Stop() was proposed as an extension method.
Even without extension methods, I believe there's more value (following the introduction of generics) of a single precise [generic] method signature that is accepted, rather than a family of function [and potentially method] signatures.
If supporting functions, and not just methods, is critical, I'd be more comfortable with a single signature for push iterators, and a single signature for pull iterators.
I think it's also just fine to encourage modeling iterators after bufio.Scanner, as that introduced a clean, predictable, and well-understood style and semantics.
Yes, I think some people have proposed to use a defined bool type instead as a signal. It also may help since the semantics of this boolean are a bit specific to iteration status.
And I am sympathetic to your view on push functions.
Using range on them doesn't seem to bring much but an alternate call syntax unless I'm mistaken.
I agree that there are far too many function signatures allowed here. As a general rule, Go prefers to explicitly convert to a new type or wrap one type with another in order to get something to have the features someone wants. I think all of the function signatures except for one and two value push functions should be removed, and then a new package should be added that has functions that convert from the other signatures, i.e. funcs.FromPull(somePullFunc). Push functions are the most general and add some minor functionality that is not currently available in Go otherwise so I think they're the most important ones to add directly, but the rest seem unnecessary to me.
@DeedleFake I feel like we're missing something (or being too hasty) if we say that push iterators are universal (or "the most general"). Having the iterator own the iteration loop can cause some awkwardness in a number of cases:
What if you want to selectively consume from multiple iterators based on a condition? Imagine merge sort or any merge algorithm. To make that work with push iterators, channels would need to be involved. It'd be trivial to write a Next method and that case would be supported efficiently and for free (see the sketch after this list).
If a push iterator defers a recover to catch its own panics, it'll end up catching panics from the callback as well, which the caller/callbacker likely does not expect.
Stack traces will be surprisingly longer whenever push iterators are used despite there being no visible call in the case of a sugared loop.
Certainly the above are all solvable/avoidable merely by not using sugared loops (or by not requiring a FromPull wrapping), but it does suggest there are usability issues with push iterators (they're not universally usable), and if we make them universal, people will tend to favor writing push iterators even if pull iterators would have been simpler or more appropriate.
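To make the merge point from the first item above concrete, here is a hedged sketch using pull functions of the shape discussed in this thread, func() (T, bool); the int element type and the names are illustrative:

// mergeSorted merges two already-sorted pull iterators into one.
// The caller decides, element by element, which source to advance,
// which is awkward to express with push iterators.
func mergeSorted(a, b func() (int, bool)) func() (int, bool) {
    av, aok := a()
    bv, bok := b()
    return func() (int, bool) {
        switch {
        case aok && (!bok || av <= bv):
            v := av
            av, aok = a()
            return v, true
        case bok:
            v := bv
            bv, bok = b()
            return v, true
        default:
            return 0, false
        }
    }
}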
Given that the most magical parts of the proposal are around push iterators (pull iterators don't need special defer/return behavior or the implicit transformation of a block into an anonymous function), the resultant language may be cleaner if we only solve for pull iterators to start with, and keep push iterators using explicit callbacks while we consider the impact of pull iteration in the wild.
If, for example, Go considered introducing concise lambdas with typeless parameters (i.e. only really usable for inline callbacks), that could solve push iterators, and other callback cases, as well as this proposal, albeit arguably with less magic.
All three of the problems that you outlined are usability concerns from the caller's side. Because of that, you've convinced me that push functions should not be the default.
I still think it makes sense to only support one type and require explicit conversions for the rest, but since the conversion happens on the caller's end, the form that it is converted to should not be one that removes power from the caller. Therefore, I think it makes more sense for pull functions to be the default after all.
If, for example, Go considered introducing concise lambdas with typeless parameters (i.e. only really usable for inline callbacks), that could solve push iterators, and other callback cases, as well as this proposal, albeit arguably with less magic.
I disagree; what sucks about that isn't just the syntax for the closure but that break, continue, goto, and return don't do what they should do.
My current worry about adding pull functions as range arguments is that it's a backdoor way to add coroutines to Go. I feel like coroutines should either be first class (have a yield statement like Python and maybe a different keyword in the declaration, like func F(In) (stream Out) { /**/ }) or they should be an implementation detail, like the iter.NewGenerator function and runtime channel operations in the old proposal. With this proposal, there's a whole new kind of procedure… but it only works if you use it in a range statement. In a way, that's more radical than "there's a new optimization, but it only applies in certain cases when the compiler is sure it's safe."
I worry that it's an avenue for possible abuse, and people are going to do "clever" things, like make "Twisted Go" (a la Twisted Python) that simulate an async system with pull functions in order to avoid the Go runtime scheduler, for whatever reason. ISTM something as important as a new mechanism that lets you suspend a function without using goroutines shouldn't be tied to the range statement.
So how would that be different, in regards to the specific criticism that push functions are only useful in the context of range?
As I understand your question and the proposal, to get the coroutine behavior of suspending and resuming with panic propagation across boundaries, you have to do range pf because if you do
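(The snippet was elided in this rendering; presumably it is a direct call of the push function with a panicking callback, something along these lines, with names taken from the stack traces mentioned below:)

func outerfunc() {
    // Direct call: the callback runs on Iter's stack, so a panic here
    // shows both frames.
    myContainer.Iter(func(k K, v V) bool {
        panic("boom")
    })
}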
The stack trace will look like outerfunc > Iter, but if you do the panic in a range loop (for range myContainer.iter { panic("boom") }), the stack trace will look like outerfunc alone.
If you convert the push func to a pull func with some library construct, then the stack trace will of course just have outerfunc, but that's because Iter will be off running in its own goroutine.
So, if you're in a range call and AFAICT only if you're in a range call, you can suspend Iter without spawning a goroutine and yielding to the scheduler.
You seemed to be responding to #56413 (reply in thread), and asked a straightforward question about what it would look like, so that's what I addressed.
How is that different from a hypothetical yield statement? I'm having trouble fitting that into Go in a way that doesn't come down to the same thing.
I agree that there's ultimately no difference. It's just an implementation detail. For all the user knows, the compiler actually will use a channel and real goroutine when converting from push to pull.
to get the coroutine behavior of suspending and resuming with panic propagation across boundaries
It is my understanding that the only reason we'd need to change this is to make converting from push functions into pull functions more efficient (i.e. exactly the optimization described in the Appendix to #54245, which, AIUI, you consider the preferable alternative). In particular, you say:
I feel like coroutines should either be first class […] or they should be an implementation detail, like the iter.NewGenerator function and runtime channel operations in the old proposal.
But in terms of these optimizations, this new design does not actually differ from the old design. The old design called "pull functions" iter.Iter, called "push functions" generators, and suggested naming the conversion from generators into iterators iter.NewGenerator. And it suggested applying the optimization to that case. None of that differs in the slightest from what we are discussing here. At least as far as I understand it.
It is not my understanding that the compiler should allow conversions from push to pull functions or vice versa, or that it should do them automatically. Rather, the compiler should implement range over pull with straightforward iteration code and range over push by translating the loop body into an opaque func, and an iteration library could convert between the two as glue.
Supporting push functions in range does require some magic to transform a loop body into an opaque func value, but that has nothing to do with coroutines.
Does that answer your question?
I'm not sure. Your original statement was
With this proposal, there's a whole new kind of procedure… but it only works if you use it in a range statement.
I still don't understand this statement, in relationship to the idea of a builtin yield statement/function. ISTM that such a coroutine would also only work in a range statement. Or, at least, it wouldn't work in any more contexts than push/pull functions do. And the explanations so far don't seem to really contradict that. So I'm still a bit confused what you meant.
Were you in fact referring to the "magic" translation of loop bodies into opaque func values? If so, I'd understand a bit better where you are coming from. Though I'd still be a bit confused about the criticism, because it seems fairly self-evident that this translation only happens for range statements.
In my opinion, yield is a complicated enough concept to cause a lot of bad, incomprehensible code to appear. This suggestion provides only syntactic sugar for writing something that is already more than possible in the language. I believe this goes against the rule of "one problem, one solution".
Please, let Go stay boring
So much this.
What happened between 2018 when Rob Pike said that adding more features would make the language bigger but less different and now?
I'm afraid that the current push to add more features is just going to turn the language into another flavour of the generic programming language.
Pardon me if this has already been discussed, but it occurs to me that push functions could be replaced by a simple variant of pull functions. I'm not sure whether this would be better than push functions (that's surely a subjective matter where opinions will vary), but it seems to me a quite reasonable alternative.
Let's consider just the case of a loop yielding one value at a time: for x := range something, where x has type T. The case of two values at a time isn't different in any significant way. In the initial discussion, a pull function for this case has signature
func() (T, bool)
We could also consider pull functions with signature
func(bool) (T, bool)
The intent here is when called with true, the pull function acts as the previous pull function, returning either (next value, true) or (arbitrary, false). When called with false, the pull function performs any cleanup necessary for the end of the loop. A for range statement would call the pull function with false when the loop is terminated prematurely.
Given a function pullx with signature func(bool) (int, bool), we could write
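(The example was elided here; presumably a loop along these lines, using the proposed form:)

for x := range pullx {
    fmt.Println(x)
    if x >= 5 {
        break // the for range statement would call pullx(false) here
    }
}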
The initial discussion remarks that any push function can be automatically transformed into a pair of a next pull function and a stop cleanup function. We can continue this to get a pull function of the new form:
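A minimal sketch of that continuation, with next and stop as the assumed names for the pair produced from a push function:

// combine wraps a next/stop pair into the proposed func(bool) (T, bool) form:
// calling it with true pulls the next value; calling it with false cleans up.
func combine[T any](next func() (T, bool), stop func()) func(bool) (T, bool) {
    return func(more bool) (T, bool) {
        if !more {
            stop()
            var zero T
            return zero, false
        }
        return next()
    }
}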
and this new function can be used as the object of a for range without any need for an explicit call to stop.
Some questions around this:
When a call to pullx(true) returns (whatever, false), should the for range loop call pullx(false)? I tend to think not, that pullx should do the cleanup when it returns false. But I don't really care either way.
If the loop body panics, should the for range loop call pullx(false)? If so, precisely when? What happens if both the loop body and pullx(false) panic? My initial feeling: yes, immediately on loop termination and before executing any deferred functions in the function where the loop occurs. And I haven't a clue what to do about the double panic.
If for range accepts this version of a pull function, then there is no need for it to accept the original (no parameter) version of a pull function. Would we want to accept both, or just the new version?
for x, ok := pullx(true); ok; x, ok = pullx(true) {
fmt.Println(x)
if x >= 5 {
break
}
}
pullx(false)
to be sure that the cleanup will always be called, and not just if the user breaks from the loop. Or is the assumption here that pullx will also do cleanup when the values are exhausted - though wouldn't that make the implementation more complicated?
I think it would be fine either way. If we chose to allow pull(bool) functions in for range, the spec would have to be clear about whether for range calls pull(false) after pull(true) returns false, and writers of pull functions would adjust appropriately.
I think it would be more interesting to ask what happens if the loop body panics. Presumably that should call pullx(false), which means that should probably get deferred by the range (answering the question above). But that would be the first time (I think) a language feature would implicitly defer something.
I think it would be more interesting to ask what happens if the loop body panics. Presumably that should call pullx(false), which means that should probably get deferred by the range (answering the question above). But that would be the first time (I think) a language feature would implicitly defer something.
It would be sort of like a defer, yes. But maybe not exactly. Consider
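(The example was elided in this rendering; presumably something of this shape, where the loop body defers eleven calls before breaking:)

for x := range pullx { // say pullx yields 0, 1, 2, …
    defer fmt.Println(x) // deferred by the loop body; 11 of these pile up
    if x >= 10 {
        break
    }
}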
Should the call to pullx(false) occur before or after the 11 calls to functions deferred by the loop body?
Also consider that if the implementation of for range uses defer to call pullx(false), then that call won't happen until the containing function exits. But in the common case where the loop body does not panic but does break, pullx(false) should be called immediately after the loop terminates, and before any other code in the containing function runs.
To me, all of this seems like a decent argument that the control flow of pullx is not as straightforward as it may seem.
Yes, I am now thinking that push functions would be a better choice. They also don't necessarily have simple control flow, but are sometimes easier to write.
The ergonomics of push-based iterators seem nice, but I'm concerned it has a lot of corner cases to think about:
What happens if the iterator continues to call the yield function even after it returns false?
What happens if the iterator holds onto the yield function and calls it after returning?
What happens if the iterator calls the yield function on another goroutine?
What order are deferred calls invoked in? We're logically interleaving execution of code from two different functions, so it's possible we interleave defer statements too. E.g., suppose a recursive iterator like Tree[K,V].All contained a defer statement, as did the for loop that invoked it.
I expect these questions don't directly matter to most users, but I think they're relevant to the compiler for how it desugars control flow statements. In turn, this is indirectly relevant to users because it could affect performance.
I think a lot of misuse (e.g., questions 1 and 2) could be cheaply caught by simply poisoning the closure's PC field after we don't expect it to be called any further.
What's unsavory to me is that there's no obviously-good ordering on when the deferred calls happen.
I disagree. I think there is one obvious ordering: the calls deferred by the push function occur when the push function returns — which is after the caller finishes executing the last iteration of the loop and before the caller executes the first statement outside of the loop. (That is: the deferred calls occur when execution leaves the for … range statement in the caller.)
I think my last paragraph in #56413 (reply in thread) was confused. The deferred calls aren't in some kind of global LIFO order — the deferred calls in each function are in LIFO order, and each function executes its deferred calls when the function returns (or halts via panic or Goexit).
I think there is one obvious ordering: the calls deferred by the push function occur when the push function returns
I agree that's an obvious ordering, yes. It's the same one @DeedleFake suggested, for example.
I'm saying it's not obviously good: it means deferred calls no longer happen in strict LIFO order with respect to their corresponding defer statements.
I periodically see tracing code written like defer f()() where f() pushes something onto a stack, and then the returned function is responsible for popping it off. This idiom becomes error-prone if we abandon LIFO ordering.
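For instance, the tracing idiom mentioned above usually looks something like this sketch (names are illustrative):

// trace pushes an entry marker now and returns the function that pops it.
func trace(name string) func() {
    fmt.Println("enter", name)
    return func() { fmt.Println("exit", name) }
}

func doWork() {
    // trace runs immediately; the returned func runs when doWork returns.
    // Correct nesting relies on deferred calls running in strict LIFO order.
    defer trace("doWork")()
    // ...
}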
The deferred calls aren't in some kind of global LIFO order
When you say "global LIFO order," I hear an ordering across all goroutines within a process. I'm not suggesting that exists either.
But today we do maintain a strictly LIFO, per-goroutine stack of deferred calls: each defer statement pushes a call onto the goroutine's defer stack, and panic and return are responsible for popping calls off the stack as necessary.
The proposal here implies relaxing the "strictly LIFO" part of that. We can certainly do that, but I think it should be taken very seriously. defer/panic are already very subtle, and the implementation today is quite complex and fragile.
Ian points out the iterators could actually operate under the hood using two goroutines, which would cleanly address the implementation concerns around deferred calls. But it wouldn't have any performance advantages, since the API is synchronous anyway. So that seems like it would be pure overhead to me.
But as I also pointed out, I question whether users actually intentionally write defer statements inside for loops, intending for the calls to queue until function return. And if they don't, we can just disallow them in the presence of push-based iterators, which avoids the whole issue. We can always relax that restriction in the future if use cases present themselves.
The defer statements called within a loop will appear in a well-defined order. The defer statements called by a push function will appear in a well-defined order. Nobody is saying otherwise. The only question is whether there is any required ordering between the defer statements called within a loop and the defer statements called by a push function. I am suggesting that for that latter case only there is no required order, just as there is no required order in the goroutine example I wrote two paragraphs up.
The point is that there is only one order in which the deferred function calls can be executed that satisfies both existing language semantics and reasonable rules around iterating over push functions (as outlined in #56413 (reply in thread)).
Earlier, I argued for a sort-of converse of this, that if you thought the order was not determined, then you must intend to change existing language semantics. That might have been confusing; I apologize for that.
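The example program under discussion was elided in this rendering. A plausible reconstruction, using the proposed range-over-push syntax and illustrative names (the "iter", "loop", "end", and "middle" messages match the ones discussed below), is:

func iter(yield func(int) bool) {
    for i := 0; i < 3; i++ {
        defer fmt.Println("iter", i) // runs when iter itself returns
        if !yield(i) {
            return
        }
    }
}

func main() {
    for x := range iter { // proposed: range over a push function
        defer fmt.Println("loop", x) // deferred in main's frame
    }
    defer fmt.Println("end")
    fmt.Println("middle")
}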
Note that the defer statements from the push function and loop body are interleaved, even though the deferred function calls will not be (as we will see).
The function calls deferred inside the push function iter occur in LIFO order, and they occur when iter returns, which must be before fmt.Println("middle") is executed. So we must have this sequence (possibly interleaved with other deferred function calls, so far as we know at this point in the argument):
And we see that function calls deferred in the push function and the loop body cannot be interleaved, even in the absence of an explicit rule against interleaving.
On Wed, Jan 18, 2023 at 2:54 PM Ian Lance Taylor ***@***.***> wrote:
Actually, thinking about this more, I'm not sure that the exact order in which the defer statements are executed should be precisely defined. One possible implementation of for/range over a push function is to start a new goroutine and have the push function send the values over a channel (with a second channel used to exit the goroutine on loop termination if necessary). I don't think we want to rule out that implementation a priori. In that case the interleaving of the "iter" and "loop" messages would not be fully specified.
If I understand correctly what you're suggesting, then I must disagree. The "iter" messages must appear before the "end" message, because they are deferred by the push function, which terminates before main defers the end message. And the "loop" messages must appear after the "end" message, because they are deferred earlier and by the same function (main).

To have a new form of iteration change this would be endlessly confusing for programmers. And I imagine it would lead to many data races in programs with no visible goroutines. Also, a future change in implementation from "no extra goroutine" to "a secondary goroutine" would change program results and introduce data races.

No, an implementation using another goroutine would still have to guarantee this order, in my opinion. I know almost nothing of the compiler internals, but I imagine this would be doable but possibly quite difficult.

Or am I misunderstanding something?
Note: this is out of sequence because it was sent via e-mail rather than added to the discussion thread.
I agree that the "iter" and "loop" messages must appear before the "end" message. What I said was that the interleaving of the "iter" and "loop" messages could, perhaps, be unspecified. That is, while the "iter" messages must appear in the obvious order, and the "loop" messages must appear in the
CFEA
obvious order, it's unspecified whether the "loop 0" appears before or after "iter 0", etc.
Note: this is out of sequence because it was sent via e-mail rather than added to the discussion thread.
Mea culpa. I forgot that replying by e-mail has that undesirable side effect. I'll try to remember in future.
I agree that the "iter" and "loop" messages must appear before the "end" message.
Sorry if I wasn't clear. My opinion is the "iter" messages must appear before the "end" message, and the "loop" messages must appear after the "end" message. So the "iter" messages are separated from the "loop" messages by the "end" message, and no intermixing is possible. Not because of implementation details, but because of language semantics that it would be too confusing to change.
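(The snippet being referred to was elided in this rendering; from the output below it presumably has this shape, where whatever stands for any of today's range-able values:)

defer fmt.Println("start")
for x := range whatever {
    defer fmt.Println(x)
}
defer fmt.Println("end")
fmt.Println("middle")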
Currently, no matter what the type of whatever, if x takes the values 0, 1, and 2 in that order, then this must print
middle
end
2
1
0
start
If I understand you correctly, you are suggesting that in the one special case that whatever is a push function, the output could also be, for example,
2
middle
1
end
0
start
or
middle
end
start
2
1
0
or many other possibilities.
I find this a startling departure from the current state of affairs.
Sorry for misunderstanding. But your concern is not what I'm suggesting. I agree that whether or not you use a push function the order of the defer statements in your example is unchanged.
What I am saying is that if the push function itself uses defer statements, then the order in which those defer statements, the ones in the push function, run, compared to the order in which the defer statements executed during the loop are run, is unspecified.
I really like how this proposal provides a unified syntax (range) for both internal (push) and external (pull) iterators.
But, as someone who reads (reviews) much more code than I write, my only (non-blocking) concerns are about the complexified mental model around a for range loop because of the explosion of the possible types and underlying hidden complexity and cost.
So far when I see the following loop:
for a, b := range X {
    ...
}
I only have to determine if X is an array, a slice, a string or a map. As array is similar to slice, and range over string is quite rare in business code and quickly identified by the context, the question is usually between two alternatives: slice/array or map. I can usually answer that question from the func scope around the loop.
However, by introducing push/pull range iterators, the number of possible types will explode. And even more, the cost of each iteration style will be much more varied: a user-defined iterator might have bugs or performance issues that I don't expect from built-in iterators. The risk of hidden panics will also explode (so far, there is no panic when iterating over a nil slice or nil map). My existing review tooling (git diff, GitLab merge requests viewed in the browser) that doesn't provide type information inline will become insufficient if I can't easily determine how the iterator is constructed.
That is a case where this added syntactic sugar will make it easier to write more concise Go, but will increase the mental load on human readers.
Range over plain integers (for i := range 5) as suggested as a later step would make it even worse: in for e := range arg.Elements there is a huge difference in block behavior if Elements is an int vs a []string.
database/sql.Rows is mentioned as an iterator example, but I think it isn't one that will benefit from this proposal. Errors may happen while iterating or inside an iteration callback, and this proposal doesn't handle that case.
A bit off-topic as this isn't about the push/pull proposal, but since I'm mentioning database/sql.Rows I wanted to mention some experiments I did over the years around simplifying iteration over it.
As a heavy user of database/sql.Rows I have written my own external iterator around it with the following signature:
// QueryRows calls QueryContext and loops over rows calling scanRow. If scanRow
// fails, the error it returns is wrapped in a RowError.
func QueryRows(
    ctx context.Context,
    db interface{ QueryContext(context.Context, string, ...any) (*sql.Rows, error) },
    query string,
    args []any,
    scanRow func(*sql.Rows) error,
) error
However I almost never use it myself because:
it would increase mental load for readers: need to know that utility function in addition to database/sql.Rows methods
the increased syntax complexity: a closure argument vs a loop block (this range proposal would help in many other iteration cases, but not for sql.Rows where we have to handle runtime errors)
the increased runtime cost of function call (small, but it adds to the readability costs as a drawback)
the biggest complexity when iterating over sql.Rows is the call to rows.Scan where you have to pass pointers to target variables (forgetting & is a common beginner mistake), and that wrapper was still not encapsulating it.
I have gone further with a more general sql.Rows iterator in my package github.com/dolmen-go/sqlfunc (see ForEach and Query) going towards encapsulating the Rows.Scan call but its heavy use of reflect makes it perform badly. I had started some work on a code generator (to move type introspection to a go:generate phase in order to avoid use of reflect.Value.Call), but I paused this in 2021 while waiting for generics.
I'm proposing a different API, which would not require the extra error check. row.Scan was not a typo for rows.Scan. It's a new type that represents a single row. It's unlikely that such an API change will happen however because it would be somewhat redundant with current API. Maybe if there's ever a database/sql/v2.
For Range over ints, would using the same syntax as for slicing subscripts be more "Go-like"? E.g. for i := range [:n] {...}
We could use other start-points, e.g. for i := range [2:len(a)],
or require an explicit break, e.g. for i := range [:].
There is no standard way to iterate over a sequence of values in Go. For lack of any convention, we have ended up with a wide variety of approaches. Each implementation has done what made the most sense in that context, but decisions made in isolation have resulted in confusion for users.
In the standard library alone, we have archive/tar.Reader.Next, bufio.Reader.ReadByte, bufio.Scanner.Scan, container/ring.Ring.Do, database/sql.Rows, expvar.Do, flag.Visit, go/token.FileSet.Iterate, path/filepath.Walk, runtime.Frames.Next, and sync.Map.Range, hardly any of which agree on the exact details of iteration. Even the functions that agree on the signature don’t always agree about the semantics. For example, most iteration functions that return (T, bool) follow the usual Go convention of having the bool indicate whether the T is valid. In contrast, the bool returned from runtime.Frames.Next indicates whether the next call will return something valid.
When you want to iterate over something, you first have to learn how the specific code you are calling handles iteration. This lack of uniformity hinders Go’s goal of making it easy to move around in a large code base. People often mention as a strength that all Go code looks about the same. That’s simply not true for code with custom iteration.
We should converge on a standard way to handle iteration in Go, and one way to incentivize that is to support it directly in range syntax. Specifically, the idea is to allow range over function values of certain types. If any kind of code providing iteration implements such a function, then users can write the same kind of range loop they use for slices and maps and stop worrying about whether they are using a bespoke iteration API correctly.
This GitHub Discussion is about this idea of allowing range over function values. This is obviously related to the iterator discussion (#54245), but one aim of this discussion is to separate out just the idea of a language change for customized range behavior, which should probably be done independently of an iterator library. A library for iterators can then be built using and augmenting the range change, not being the cause of it.
To date, range's behavior has depended only on the type of its argument, not methods the argument has, nor any other details of the argument. Range currently handles slice, (pointer to) array, map, chan, and string arguments. We can extend range to support user-defined behavior by adding certain forms of func arguments.
There are two natural kinds of func arguments we might want to support in range: push functions and pull functions (definitions below). These kinds of funcs are duals of each other, and while push functions are more suited to range loops, both are useful in different contexts.
This post suggests that for loops allow range over both push functions and pull functions. The end of the post also suggests range over int.
The rest of this post explains all this in more detail.
Push functions
A push function is a function with a type of one of these forms:
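(The listing of forms was elided in this rendering; from the description that follows, the pseudo-syntax is presumably:)

func(yield func(...) bool)
func(yield func(...) bool) bool

where ... stands for zero, one, or two parameters.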
That is, a push function takes a single argument, here named yield, although that exact name is not a requirement. The yield argument is itself a function taking N arguments (0 ≤ N ≤ 2) (denoted by ... in the pseudo-syntax above) and returning a single bool. The push function itself must return nothing at all or else a single bool. The optional bool allows the push function to indicate whether it stopped early, which can be useful when composing push functions; when called using range syntax, the compiled code would ignore the result.

The push function enumerates a sequence of values by calling yield repeatedly. The bool result from yield indicates whether to keep yielding operations (true means continue running, false means stop). Each call to yield runs the range loop body once and then returns. When there are no more values to pass to yield, or if yield returns false, the push function returns.
In short, a push function pushes a sequence of values into the yield function.
For example, here is a method to traverse a binary tree:
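(The code was elided in this rendering. A sketch consistent with the signature described just below; the Tree type details are illustrative:)

type Tree[K, V any] struct {
    key         K
    value       V
    left, right *Tree[K, V]
}

// All calls yield for every key/value pair in the tree, in order,
// stopping early (and reporting false) if yield returns false.
func (t *Tree[K, V]) All(yield func(K, V) bool) bool {
    return t == nil ||
        t.left.All(yield) && yield(t.key, t.value) && t.right.All(yield)
}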
The method value t.All is a push function: it has signature func(func(K, V) bool) bool. With that method, one can write today:
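(Elided here; presumably a direct call such as:)

f := func(k K, v V) bool {
    fmt.Println(k, v)
    return true
}
t.All(f)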
(In this usage, the caller doesn’t care about the boolean result from t.All, only the fact that it calls f on every key-value pair.)
Adding support for push functions to range would allow writing this equivalent code:
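(Presumably:)

for k, v := range t.All {
    fmt.Println(k, v)
}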
In fact, the Go compiler would effectively rewrite the second form into the first form, turning the loop body into a synthesized function to pass to t.All. However, that rewrite would also preserve the “on the page” semantics of code inside the loop like
break, continue, defer, goto, and return, so that all those constructs would execute the same as in a range over a slice or map.

If you are worried about the subtle variable scoping difference, consider the change discussed in #56010 a prerequisite of adding func support to range.
Note that the results of the push function (if any) are discarded when using the range form. Most often a push function will return nothing at all, or else a bool indicating whether the loop stopped early, as the All method does to make recursion easier.
A method x.All(f), which may become a common pattern, has two different, equally valid interpretations. One is that f is a yield function and All passes all the tree's contents to f. The other is that f is a condition function and All reports whether the condition is true for all the contents of the tree, stopping the traversal once it determines the result.
Pull functions
A pull function is a function with a type of the form
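(Elided; from the description below, presumably, in the same pseudo-syntax:)

func() (..., bool)

where ... stands for zero, one, or two values.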
That is, a pull function takes no arguments and returns the next set of N values (0 ≤ N ≤ 2) from the sequence. Each valid set of values comes with a final true bool result. When there are no more values, the pull function returns arbitrary values and a false bool.
A pull function must maintain internal state, so that repeated calls return successive values.
In short, a pull function lets the caller pull successive elements from the sequence, one at a time.
For example, here is a method that returns a pull function to traverse a linked list:
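(Elided. A sketch consistent with the surrounding description; the List type is illustrative:)

type List[V any] struct {
    val  V
    next *List[V]
}

// Iter returns a pull function that walks the list starting at l.
func (l *List[V]) Iter() func() (V, bool) {
    cur := l
    return func() (V, bool) {
        if cur == nil {
            var zero V
            return zero, false
        }
        v := cur.val
        cur = cur.next
        return v, true
    }
}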
The method value l.Iter is not a pull function, but it returns one. With that method, one can write today:
Adding support for pull functions to range would allow writing this equivalent code:
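(Presumably:)

for v := range l.Iter() {
    fmt.Println(v)
}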
In fact, the Go compiler would effectively rewrite the second form into the first form. Again, consider the scope change in #56010 a prerequisite.
If some iterator-like value had a Next method that returned (value, bool), we could write:
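(Presumably:)

for v := range it.Next {
    fmt.Println(v)
}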
Note that range over pull functions has been proposed by itself as #43557, and the discussion also considered push functions (for example, #43557 (comment)). Both can be appropriate at different times.
Duality of push and pull functions
Any push function can be converted into a pull function and vice versa.
Converting a pull function into a push function is a few lines of code:
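(The code was elided here. A sketch of what those few lines plausibly look like, with assumed names:)

// Push adapts a pull function into a push function.
func Push[V any](pull func() (V, bool)) func(yield func(V) bool) bool {
    return func(yield func(V) bool) bool {
        for v, ok := pull(); ok; v, ok = pull() {
            if !yield(v) {
                return false
            }
        }
        return true
    }
}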
Converting a push function into a pull function is more involved. Because the push function has its own state maintained in its stack (like in the binary tree traversal), that code must run in a separate goroutine in order to give it a stack that persists across calls to the next function. The full code is in this playground snippet.
It can be arranged that the separate goroutine executes with its own stack but not actually running in parallel with the caller. With a bit of smarts in the compiler and runtime, but no changes to the Go language or any of its semantics, that lack of parallelism allows the separate goroutine to be optimized into a coroutine, so that switches between the caller and the push function are fairly cheap. The details of the optimization are beyond the scope of this discussion but are posted in the “Appendix” of #54245.
The signature for converting a push function to a pull function is
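(Elided; presumably something like:)

func Pull[V any](push func(yield func(V) bool) bool) (next func() (V, bool), stop func())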
The conversion must return two functions: the pull function next and a cleanup function stop, which shuts down the goroutine.

Although push and pull functions are duals, they have important differences. Push functions are easier to write and somewhat more powerful to invoke, because they can store state on the stack and can automatically clean up when the traversal is over. That cleanup is made explicit by the stop callback when converting to the pull form.

For example, the binary tree traversal above was made very easy by being able to use recursion in its implementation. A direct implementation of a pull form would need to maintain its own explicit stack instead, like:
That implementation is much harder to reason about and probably contains a bug.
As another example of the power of push functions and automatic cleanup, consider this function that allows ranging over the lines from a file:
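(The code was elided in this rendering. A sketch consistent with the description that follows; it assumes imports of bufio and os, and error handling is simplified:)

// Lines returns a push function that yields the lines of the named file.
func Lines(name string) func(yield func(string) bool) bool {
    return func(yield func(string) bool) bool {
        f, err := os.Open(name)
        if err != nil {
            return true
        }
        defer f.Close() // cleanup happens automatically when the traversal ends
        scan := bufio.NewScanner(f)
        for scan.Scan() {
            if !yield(scan.Text()) {
                return false
            }
        }
        return true
    }
}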
This could be used as:
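(Presumably, with a hypothetical file name:)

for line := range Lines("input.txt") {
    fmt.Println(line)
}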
Note that the implementation of Lines can use defer to clean up automatically when the loop is done. An implementation using a pull function would need a separate stop function to close the file.
A push function usually represents an entire sequence of values, so that it can be called multiple times to traverse the sequence multiple times. It can usually also be called simultaneously from different goroutines if they both want to traverse the sequence, without any synchronization. In contrast, a pull function always represents a specific point in one traversal of the sequence. It can be advanced to the end of the sequence, but then it can't be reused. Goroutines cannot share a pull function without synchronization, but a pull function can be used from multiple call sites in a single goroutine, such as a lexer pulling bytes from an input source.
In terms of concepts in other languages, a push function can be thought of as representing an entire collection. The implementation of the push function maintains iterator state implicitly on its stack, so that multiple uses of the push function use separate instances of the iterator state. In contrast, a pull function can be thought of as representing an iterator, not an entire collection.
Push and pull functions represent different ways of interacting with data, and one way may be more appropriate than the other depending on the data. For example, many programs process the lines in a file in a single loop, so a push function is appropriate for lines in a file. In contrast, it is difficult to imagine any programs that would process the bytes in a file with a single loop (except maybe wc), while many process bytes in a file incrementally from many call sites (again, lexers are an example), so a pull function is more appropriate for bytes in a file.
Because both forms are appropriate in different contexts, range loops should support functions of both types. Note that there is no overlap between the two function kinds: push functions always have one argument, while pull functions always have no arguments.
Alternatives
An alternative would be to extend range by recognizing special methods. For example if range knew to call a .Range method, then we could define (*Tree).Range and then use
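(Presumably:)

for k, v := range t {
    fmt.Println(k, v)
}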
instead of
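(Presumably:)

for k, v := range t.All {
    fmt.Println(k, v)
}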
One aesthetic reason not to do this is that range today uses types to make the decision, and it seems cleaner to continue to do that. In fact, there is nothing in the language today that calls specially defined methods. (The closest to that is the definition of the error interface, but no language construct calls the Error method.) Aesthetic reasons aside, though, there are two practical problems with a method-based decision.
The first problem with a method-based decision is that only a single method can implement the behavior. Using functions, other methods can be called instead simply by naming them. For example we might define t.AllReverse that enumerates the tree in reverse order, and then a loop can use
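(Presumably:)

for k, v := range t.AllReverse {
    fmt.Println(k, v)
}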
Similarly, an iterator that defines Next might also define Prev, allowing
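(Presumably:)

for x := range it.Prev {
    fmt.Println(x)
}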
The second problem with a method-based decision is that it can conflict with the type-based decision. For example if the loop calls the Range method, what happens in a range over a channel value that also has a Range method? Is it treated like other channels, ignoring the Range method? It would seem that must be the case, for backwards compatibility. But then it's confusing that the Range method doesn't win.
Continuing the type-based decision instead of introducing a new method-based decision rule avoids these problems.
Range over ints
One common problem for developers not coming from the C family of languages is puzzling through the Go idiom
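(Elided; presumably the classic three-clause loop:)

for i := 0; i < n; i++ {
    ...
}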
When you stop to explain it, that’s a lot of machinery to say “count to n”.
One common use case that people have mentioned for user-defined range behaviors is to have a standard function to simplify that pattern, like:
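(Elided; a sketch of such a function, with Count as an assumed name:)

func Count(n int) func(yield func(int) bool) bool {
    return func(yield func(int) bool) bool {
        for i := 0; i < n; i++ {
            if !yield(i) {
                return false
            }
        }
        return true
    }
}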
used as:
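(Presumably:)

for i := range Count(n) {
    fmt.Println(i)
}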
If this will become the new idiom for counting to n, it's unclear where the count function would be defined. Some package that essentially every program imports?
Counting from 0 to n is so incredibly common that it could merit a predefined function, but at that point we’re talking about a language change. And if we’re talking about a language change, it makes sense to continue to extend range in a type-based way, namely by ranging over ints.
Adding support for ints to range would allow writing this code:
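(Presumably:)

for i := range n {
    fmt.Println(i)
}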
instead of:
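(Presumably:)

for i := 0; i < n; i++ {
    fmt.Println(i)
}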
For former C, C++, and Java programmers, the idea of not writing the 3-clause for loop may seem foreign. It did to me at first too. But if we adopt this change, the range form would quickly become idiomatic, and the 3-clause loop would seem as archaic as ending statements with semicolons.
Discussion
What do people think about this idea?
Should we stop at push functions and not allow pull functions in range?
Should we add range over int too?
At first, I'd like to note that a general idea of something iterator-related is a welcome addition to the language.
When I started with Go (coming from a C# background), the differences between the ways of iteration were quite confusing. And they still are! C# has this notion that everything that is an IEnumerable<T> can be accessed and manipulated with LINQ. However, LINQ is a beast itself, and introducing something like that is definitely not suitable for the goals of the Go programming language. And I would even argue that it's not needed.
The concept of pull and push functions is clear. Incorporating this even further into the language, e.g. by defining a Range() method as considered in the alternatives, would decrease readability. Developers would need to know about this concept, because it definitely hides something, so I would consider this a no-go. Explicit, readable code is the preferred way.
As for the range over ints proposal: Python has something similar, so this new pattern could improve adoptability among Python developers. I, for one, don't have a strong opinion for or against it, as I'm already too used to the C way.
Edit: made some minor grammar corrections, simply because English is not my first language.
-
I actually don't entirely agree with that. Yes it is a new concept, but it keeps the idea that a function is a function and has computational weight.
When doing a for x := range x.Range() you should be able to know that it's a function call and that the range will produce values from it. This seems to fit more with the Go theme than having predefined interfaces that an object must implement to get range functionality. It really doesn't seem to be hiding anything; the cards are out in the open. In other words, you can read this as range (produce values) from x.Range(), calling it every iteration. The only implicit part is the function signature, which shouldn't be foreign to anyone who is familiar with a HandlerFunc or other similar APIs in Go.
So this actually seems pretty explicit and actually far more readable, as there would be one way of iterating through objects vs. consulting the documentation on the specific iterator semantics.
-
I'm really interested in what the transform for push functions that allows flow control statements would be. This would effectively add a form of non-local return to Go, which other languages use to make these sorts of internal iterators feel nice.
-
Yes, I believe the yield function will panic if it is called after the loop is done or from the wrong goroutine.
-
This is an unfortunate limitation, do we need it? I don't think we want to allow racy calls to yield, but I can imagine push functions that e.g., start worker goroutines to walk a data structure and call yield on all elements. Provided there is synchronization around yield calls, it feels like that should be fine.
This could always be worked around by having workers send values back to the original goroutine, which calls yield, but this feels like an awkward requirement in the language. I can't think of other functions that must be called from the correct goroutine (t.FailNow() is the closest I can think of), so this seems odd.
That said, I'm not sure how to reconcile this with what should happen if the loop body panics. yield just returns false, I suppose? (This would imply that the yield implementation would use some internal communication mechanism to make the loop body run in the original goroutine.)
Edit: this doesn't make sense, as the original goroutine is likely blocked in something like sync.WaitGroup.Wait.
-
Indeed, a panic or a call to defer is the main reason the goroutine limitation exists. I doubt it will be much of a problem in practice. We can also always lift it later.
-
Ah great thanks for all that info, that makes sense!
-
@rsc
What do we expect the push function to do if the yield function panics, either for the reason above or because of a call to panic within the loop body? I imagine we expect it to stop and not call the yield function any more; is that right?
If the push function uses defer to recover from the panic and call the yield function, it seems there is potential for an infinite loop of panics. Perhaps the yield function should check for this and, after some number of panics, do something else? Maybe exit the goroutine or exit the process?
-
A few assorted thoughts.
Their lack of range iteration is, I think, the biggest point of friction that makes them feel like second-class collection types. So I'm cautiously optimistic that this idea would solve that problem.
-
A few assorted replies.
Whether for i := range n decreases code readability will depend on how quickly everyone moves to the new syntax. Regardless, it won't be 10X slower. It will be exactly the same speed.
Thanks for the comments!
-
I actually like the for i := range n syntax. I've made errors numerous times in Go writing that line. Coming from the Python world, where we are habituated to writing for i in range(n), this should be a welcome change.
-
@rsc For your reference, I've found this article by Robert Nystrom (of craftinginterpreters.com fame) on internal and external iteration useful: https://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/.
tl;dr: External iteration is when the user's code "calls" the iteration, as with iterator types and for loops. Internal iteration is when the iteration "calls" the user's code, as with forEach-like functions in JavaScript, Ruby et al.
-
The push function case has a weird quality that I think is novel. The yield() function the compiler passes to it is a function that you can't write in Go, because it's a function which, when called, can execute a defer in a caller's context. I'm mildly afraid of that, not least because I have often wanted the ability to write "defer-but-in-parent" and also I would be absolutely miserable if anyone else (including "me three months ago") had access to it.
I don't think we could entirely dispose of the three-clause loop, but I do agree that I'd be fine with not needing it in the "count to n" case.
Anyway, as a person who's repeatedly wanted to request iterator support in the language, I will say that I like this a lot, and at least so far, this feels like something that I'd use and not hate, which is pretty high praise for programming languages.
-
FWIW, I think of the yield magic as being more about optimizing away a goroutine than about running things in the caller's stack. (https://go.dev/play/p/bB1-_JempqG)
-
Thanks for writing this up @rsc, and for providing a clear mental model around push and pull based iteration! I really like the direction this is going.
One thing that seemed a bit nuanced in the description was the push/pull distinction apparently requiring different parentheses.
This is likely an artifact of your example having Iter() return a function instead of an intermediate value (as in the discussion), because I think that returning an intermediate value clears up the nuance (despite being more verbose).
I also think the decision to pass a function to range instead of passing a value that has a given method is a good one; although it adds a bit of syntactic noise, it makes it very clear how the feature works, and (although you don't call this out) it allows people who are navigating a new code base to click on a method to see where it's implemented, as they would for a function call.
The biggest concern I can think of is not really a concern with the proposed changes to range itself, but with how it would interact with the rest of the language. In particular, if so many things are allowed, there's no way to specify that my function takes "something it can pass to range". This may not be a problem in practice (the examples I can think of are fairly mundane), but it might be irritating to have to write ~6 implementations of the same thing for various different push and pull function signatures.
This could be solved with some syntax in interface definitions, for example. That is similar to the operator-based approach that was decided against for type parameters (in favor of the named-types approach), so there may be issues there that led to that decision that I don't know about. It also has the downside that you lose information: once you have an Aller, you cannot call its All() method directly, because you don't know what its signature actually is.
Alternatively it could be solved by heavily restricting the proposal so that range only accepts functions with one signature (probably (func (k K, v V) bool) bool). Although it seems reasonable to require All to always return a bool (just as Close() always returns an error), I'm not sure how reasonable it is to require two callback parameters - implementors could always pass nil as a second value, but that seems a bit meh. This would also mean that pull iterators are not directly supported, and possibly the push function is made available so people can convert between the two. (I do think it would be reasonable to support i := range n if n is an integer type, even if there was no way to pass "either an integer or a function with the right syntax".)
A third option is to split the difference: allow some number of types (more than one and fewer than six), so that if you want to write code that takes something that can be passed to range, you only need a couple of different copies.
In any case it would be nice to be able to write such a function and pass anything that could be passed to range to it (but probably not a deal breaker if you can't).
-
Totally agree.
I noticed this too and I'm a little conflicted about it.
On the one hand, I don't want to have to remember whether any given loop needs the parentheses or not. Sounds like a great source of frustration while coding.
On the other hand, it could be nice to have a visual indicator of whether a loop is over a push-type (repeatable) or pull-type (consumable) value. (Attempting to reuse a spent iterator is a mistake that I still occasionally make in Python.)
Parentheses would probably not be my first choice for that visual indicator. A different keyword, maybe. Or perhaps a naming convention would be sufficient.
That said, I guess this confusion already exists today - channels are consumable, but maps and slices are reusable - so maybe it's too late to do anything about it.
-
range t.Iter() would require Next() to be a magic method name, so I think that was an oversight. If you assume magic methods are forbidden, ranging over an iterable must be range t.Iter().Next.
-
I believe t.Iter() returns a func - the name is immaterial - only the signature needs to match.
-
t.All could return a func as well. I think whether or not Iter() returning a func is more plausible depends on a) whether it was written before this proposal is implemented (it would most likely return a mundane iterator type), and b) whether we have an iterator library (it will likely return a canonical iterator type). In the space between implementing this proposal and us getting an iterator library, I could see an argument that returning a pull func from Iter() is easier in some cases.
Either way, whether you have to put a call expression into range doesn't actually depend on whether or not you use push or pull. It depends on what the type of the range expression is. That's, FTR, the same as today - a range expression can be a function call, or it can not be.
-
What is the intent for using these as iterators? I know that the discussion here splits, but to me, if these are not usable to write an iterator library, I don't really see the point for a relatively invasive language change.
From what I can tell, there is no realistic way to write a function which takes "either a push or a pull function". There isn't even a way to write one which can take a push function, due to that having 6 (?) different forms. I mean, you can write a type-constraint for "it has to be any of these" and use a type-switch, but that isn't exactly ergonomic.
So all I could think of is iterator compositions taking a form they need and returning a form they find convenient, with the user being expected to use the appropriate glue code to transform them back and forth. Especially given that some of these need a separate stop function, that sounds like a pain to manage.
So while I can totally see how this would enable us to iterate over user-defined collections (and I think it does that reasonably well, though I find the dangers of push persisting yield icky), I can't really see how this addresses the goal of "a standard way to do iteration".
AIUI one of the goals is to provide the language change needed to then do #54245. But #54245 really only needs pull functions to be rangeable, doesn't it?
-
@Merovius Right, this was my point. You can't abstract over varying numbers of values in func signatures.
There are also the variants where the push iterator func itself returns a bool. That doubles the cases.
I don't think I equated the two with the totality of what I wrote.
I was thinking that for push iterators, the loop body would just be the body of the callback, but now I see that return and goto wouldn't work that way. So the range loop is basically only working with pull iterators anyway, after converting push to pull. Makes sense.
I haven't seen any counterarguments from "we" indicating any problems with the arguments so far, but if there's just no interest from the core Go team, then there's no point in continuing to talk about it. I'll end my remarks about it here.
-
You have seen them. You disagree with them. That's fine, but I'd personally be far more inclined to converse, if that's a distinction you could consistently internalize and reflect.
-
In addition to a solution with channels, here's another solution with a higher-order CountN. Because there are no defer, goto, labels, etc., this is plausible to run in Go.
It's kind of funny that what should be the loop body appears in a function here, with an empty loop body. I think that's a good demonstration of why the minimally magical language change would be sensible: just allow writing the loop body where it should reasonably appear, while preserving the semantics of defer, goto, break, continue, panic (or anything I'm forgetting).
-
@Merovius No, I haven't, and please don't tell me what I've seen.
To be clear, this is regarding the discussion about including generator functions.
Many of the points I made in response to you and @ianlancetaylor remain unaddressed. Here are some of them:
Many of the points made in response to me were subjective and vague. Here are some of them:
I'm not going to explain here how logic works, but suffice it to say that good premises and conclusions in arguments are falsifiable, and "it does not seem Go-like to me" is not falsifiable. There's no way to argue against it. The only things you can really say in response to statements like that are "I agree that you say you have that feeling" or "I have the opposite feeling." There's no "meat" on those bones to sink your teeth into.
I addressed all of the objective points that were made in response to me. We seemed to get sidetracked on how generators work, and I'm still unclear on whether my attempt to clarify how they work had any kind of effect, since what I wrote wasn't acknowledged.
I'd prefer to leave it there. If we must, let's agree to disagree.
Edit: I should add that I interpreted the "we" to mean the core Go team, not including @Merovius, and by arguments, I meant unaddressed arguments.
-
@willfaught I'm sorry it seems like we're ignoring your points. I personally don't find it productive to reply with a simple "I disagree". It doesn't seem to lead to useful conversations.
That said:
I made the point that a builtin yield function would lead to unexpected flow of control. For some reason you call that point "subjective and vague." I don't think it is. I think it is objective and clear.
You responded by saying, I think, that we should treat
panic not as an exception but as a path to follow.
I disagree.
Given that disagreement, I don't find it necessary to keep responding to every other argument on this topic. At some point we have to be able to draw a line.
I don't know if we are going to adopt this proposal (for range over pull and push functions) or not. I'm in favor of it but I can live without it. But even if we don't adopt this proposal, I'm really pretty sure that we aren't going to adopt a new yield builtin function. So my interest in discussing that topic is naturally somewhat limited. I'm sorry if this seems harsh. I'm sure it does seem subjective and vague. That's OK with me: some aspects of language design are subjective and vague. I'm just trying to state my views clearly and honestly.
-
I like all of this proposal but it's not clear to me how control flow statements would be implemented within the body of a loop ranging over a push iterator.
Break and continue are easy (return false and return true, respectively), but AFAIK the only way to implement return, defer, and goto without significant runtime changes (e.g. non-local return/jump) is with something like the sketch below. Though I am not confident that my
defer fn() translation would behave exactly the same as a non-local defer.
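A rough sketch of this kind of lowering, as an illustration only (not the actual compiler transform; #47707 has the real discussion, and the sum example and its early-return flag below are illustrative):

```go
package main

import "fmt"

// sum shows how a range-over-push loop body containing break, continue,
// and return might be lowered onto the yield callback, with return
// simulated by a flag that is checked after the push function comes back.
func sum(seq func(yield func(int) bool)) int {
	total := 0
	earlyReturn := false
	seq(func(v int) bool {
		if v < 0 {
			return false // break: stop the iteration
		}
		if v == 0 {
			return true // continue: skip to the next value
		}
		if v > 100 {
			earlyReturn = true
			return false // unwind so the outer return can happen
		}
		total += v
		return true
	})
	if earlyReturn {
		return -1
	}
	return total
}

func main() {
	seq := func(yield func(int) bool) {
		for _, v := range []int{3, 0, 4, -1, 9} {
			if !yield(v) {
				return
			}
		}
	}
	fmt.Println(sum(seq))
}
```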
-
Sure, something like that. #47707 has a bunch of discussion about that. I'm trying to keep this discussion at a higher-level, but I'm confident it can be implemented.
-
My 5 cents:
I like the fact this avoids the introduction of specially named methods. It's good that Go avoids this.
The "pull" range idiom is similar to what I envision as a typical iterator in most languages, the only difference being the iterator here is a function, each call advancing the iterator, rather than an implementation of some interface with a similar function that advances the iterator. In the end, they are similar. Does this addition improve the language? I'm not so sure, mostly because there's not a whole lot of difference between the code being replaced and the replacement, so I'm not so sure the small savings in code justifies the language addition.
The "push" idiom is a little more difficult to picture in one's mind, it takes a little bit more mental effort to traverse from the pull function to its use as part of a range loop. Once again, I'm not so sure the small savings in code justifies the language addition.
The "range over int" I find less useful, because it is restricted to a range from 0 up to n, and I find my range loops are sometimes descending, sometimes starting from 1, and all sorts of other combinations. So I don't see it as all that valuable. It would be a bit more valuable if it looked like for i := range m..n { }, and then you'd have a few more options with m > n, m = 1, etc.
If people were to start writing
or
then that would be worse than what exists today.
So, overall, I'm skeptical of this proposal. For me, I'd probably be happier reading and writing the original code rather than these new 'range' equivalents.
For me, the main objective of adding iterators to the language is to provide common types shared by many. Code using or producing iterators written by different people would be automatically compliant, because they're both standardizing on the same common library types. Without that, additional code is being written to translate one iterator type to the other.
So, from this proposal I suppose the pull and push functions suggest that you might define standardized iterators to be:
And then perhaps people might decide to standardize on these two iterator data types, but frankly, they are probably not what I would choose to standardize on (although that would be a whole new discussion).
-
I'm not going to try to convince you to change your mind, but I do want to point out that this reply is focusing on the language change by itself, not engaging with the point at the start of the post, namely that there is a tower of babel of iterators and that supporting canonical ones in range will both encourage implementers to use a standard pattern and make usages cleaner.
I agree that in these trimmed-down examples the differences do not appear large, although with more complex expressions the linguistic benefit is greater. But focusing on the linguistic benefit ignores the ecosystem benefit of a way to standardize what an iterator interface looks like.
-
I agree with this comment, I do think that standardization around a common pattern is beneficial, to avoid the tower of babel. In fact, that's what I meant by my comment "For me, the main objective of adding iterators to the language is to provide common types shared by many".
I agree 100% that the primary benefit is to "standardize what an iterator interface looks like".
I did focus largely on the integration with range in my comment. So, I see your point in this reply. I think that you are right that a proposal like this would likely push most of the past and future iterators towards either the push or pull pattern proposed here. If that is the primary goal, it would likely do that, in my opinion. People would most likely want to support any new "range" functionality. Although, to be sure, you'd probably want to make the "range" functionality as attractive as possible.
Probably, most would gravitate towards the "pull" pattern.
-
You expect most developers would choose to implement Next() (T, bool) instead of Range(yield func(T) bool)? IMO the latter is far more intuitive, especially for complex structures such as a tree. Next() (T, bool) is easy to implement for queue-like values such as channels and random-access values such as slices, but implementing Next() (T, bool) for a map or a tree is significantly more complex. Implementing Range(yield func(T) bool) is trivial for most iterable values.
-
I think the code search below by @rsc provides some evidence that most people in most situations gravitate towards pull functions.
Even though callbacks are sometimes the better choice to make code cleaner, many developers never use them at all. Pull is simpler, you call something and get something back, and then you repeat, case closed. But it's true that it can require more work maintaining state inside the pull function. Maybe for push, some people have a harder time picturing a call stack and the flow of control in their minds, and maybe pull is easy enough in most cases that it is the preferred choice.
I do agree, traversing binary trees, or many other data structures, is often much cleaner and simpler with callbacks like the push pattern, and so push can sometimes be the better choice.
-
I expect the main reason for package developers preferring pull iterators is that they feel more natural for the consumer than passing a callback, IMO. However if this proposal is accepted, I expect package developers will shift to writing push iterators with the barrier (the consumer reasoning about a callback) gone.
-
I like this. One tiny nit, though... I would prefer to leave out the possibility of a push function returning a bool.
It doesn't save much code, as any function returning a bool can be trivially wrapped in a function that returns nothing. The tree example above could be rewritten:
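A minimal hypothetical stand-in for that rewrite (the Tree type and method names below are assumptions, not the post's original example): the bool-returning push function is kept private and a result-less wrapper is exposed for range.

```go
package main

import "fmt"

// Tree is a hypothetical stand-in, not the post's original example.
type Tree struct {
	Left, Right *Tree
	Value       int
}

// all is the bool-returning push function: it reports whether the walk
// ran to completion.
func (t *Tree) all(yield func(int) bool) bool {
	if t == nil {
		return true
	}
	return t.Left.all(yield) && yield(t.Value) && t.Right.all(yield)
}

// All wraps all so that the push function exposed to range returns
// nothing at all.
func (t *Tree) All(yield func(int) bool) {
	t.all(yield)
}

func main() {
	t := &Tree{Value: 2, Left: &Tree{Value: 1}, Right: &Tree{Value: 3}}
	// Calling the push function directly works today, with or without
	// the range change.
	t.All(func(v int) bool { fmt.Println(v); return true })
}
```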
I would prefer either to say that a push function must return nothing, or to say that a push function can have any return type(s) at all, including nothing, and for...range will ignore the returned values.
-
In the code snippet above, my intent was that one would continue to write for t := range x.All {. Under is just a support method for All, and is identical to your All method, just renamed.
-
The reason for allowing specifically bool is that it is the same result as in the yield callback. Perhaps it should be dropped though, so that the function must return no results.
If we allowed arbitrary return types, I suspect there would be too many false positives or misuse, such as a function that returns error being used with range and then the user not noticing the error.
Sorry for misreading Under vs All. I would be reluctant to establish a convention of calling the non-bool-returning push method All, since that's not the signature that Python and Rust's all has. We'd probably have to pick some other name.
-
Each would be consistent with ruby; though I prefer Range to make the correspondence with the builtin feature clear (in the case with no bool return).
I do like the idea of reducing the number of possibilities, as that will help with the goal of consistency.
-
In the context of a method on an iterable type accepting a predicate, I would expect All, Each, and Every to behave the same. In most languages, one of those is the idiom for "return true iff the predicate returns true for all/every/each element of the collection". IMO it would be more natural for Range to be a function that enumerates a range (subset) from the collection. Using All, Each, or Every as the iterator is natural and intuitive to me.
-
@rsc
To be clear, although I would prefer to drop the returning bool version, it's not terribly important to me one way or the other.
-
Would the 0-arity version of an iterator be allowed? What about in-line functions? for range func() bool { return true } {}?
-
I think the answer to both questions should be yes.
-
I really like the idea of getting some form of standardized enumeration/iteration in Go.
For my 2 cents, I'd like to start with as concise and explicit a TL;DR summary of push/pull functions as I've understood them: the yield function is called repeatedly by the push function, once for each item in the "collection".
I generally like this, as it makes for a set of very small yet simple and flexible methods of enumeration/iteration over a collection.
Suggestion
Next, and feel free to disagree here, I'd like to suggest alternative names for push/pull functions:
Enumerator
My reasoning for "enumerator" is largely due to my history with Ruby, where any object can be made enumerable by simply defining an #each method that works very much like the push functions proposed here. (You should also include the Enumerable module to get #map, #inject, #select, etc., which all use #each under the hood.)
Personally at least, the word "push" feels suggestive of pushing values into the collection. Hence when reading the code examples, I realized push functions work very differently from the initial impression I got based on the name.
Iterator
As for "iterator", my reasoning is simply that it feels very similar to other types of iterator objects I've come across, which may have Next(), Prev(), and similar methods. Except it's not an iterator itself, it is a singular "iterator function" that simply iterates to the next item each time it's called, and nothing else.
Type safety?
The only thing I feel slightly uneasy about with these functions is that I don't see how the type system could be used to reliably ensure a function given to range is a push or pull function, and not simply something completely different that has a bool as its final return value, or a func arg with a bool as a final return value.
Range int
And finally, regarding range over ints: conceptually, the wording of something like range 7 feels a bit forced to me. 7 is itself not something with a range. Something like range 2..7 feels less forced, and is more flexible too. But I assume that requires changes to Go's syntax.
Though I personally feel fine about using the three-clause for loop on the rare occasion I need to loop N times. And that's despite my history with Ruby and its 7.times { |n| ... } and (2..7).each { |n| ... } stuff.
-
TBH, I wonder if Russ added the integer thing as a duck.
-
There are 19 instances in the standard library of methods that are pull functions:
Of these, two are arguably actual pull functions (reflect.Value.Recv and reflect.Value.TryRecv). The other 17 are accidental. One of the oldest (from Go 1) is regexp.Regexp.LiteralPrefix, which has signature:
So you could accidentally write
The loop would run either zero times or forever. This will be true of most "accidental" pull methods: if it's accidental, it probably returns the same thing every time you call it. Even the lightest testing seems likely to find this problem.
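As a concrete illustration of that failure mode (hedged: the accidental loop appears only in a comment because it relies on the proposed syntax):

```go
package main

import (
	"fmt"
	"regexp"
)

func main() {
	re := regexp.MustCompile(`abc.*`)

	// regexp.Regexp.LiteralPrefix has the signature
	//     func (re *Regexp) LiteralPrefix() (prefix string, complete bool)
	// so the method value happens to look like a pull function.
	fmt.Println(re.LiteralPrefix())

	// Under the proposal one could accidentally write
	//     for p := range re.LiteralPrefix { ... }
	// Because the method returns the same values on every call, the loop
	// would run forever when complete is true and zero times when it is
	// false -- the failure mode described above.
}
```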
There are 7 instances in the standard library of methods that are push functions:
The first five are really all the same instance (Eval) and are accidental. Disallowing the returned bool would disqualify them. The last two are true push functions.
Accidental push functions seem to me far less common than accidental pull functions: plenty of methods take no arguments and return (T, bool). Very few take a callback returning a bool.
-
range n is not a duck, although you are not the first person to ask that.
The 3-clause "count to n" really is a significant stumbling block for new Go programmers, and it is a remarkable number of tokens to explain, to do something so incredibly common. A quick scan looks like the majority of 3-clause for loops in the Go repo can use range instead:
The result holds up across projects:
Using range for the majority that do count from 0 to N would make the others stand out more as unusual in some way, which would be helpful when reading the code. Skimming the for3 files created by that script, I often noticed lines and thought "wait, what's wrong with my regexp? why is this here?" only to read more carefully and see that the line really isn't a 0 to N loop, in a way that I missed at first glance.
I do admit that range n seems very un-C-like, and that aspect surprises people. But I don't believe that means it is un-Go-like, any more than not using semicolons.
-
Good to see pull/push function signatures aren't that common.
I believe passing the wrong function to range would probably be pretty rare, but I do like the idea of push/pull functions by their nature explicitly being push/pull functions without the need to read accompanying documentation to make sure.
I'm not sure it's a good idea. But the only way I can think of to make their signature explicitly indicate they are pull/push functions is to swap out the final bool return type with a new custom "iterator bool" style type. If we had a regular type in an
iter package, for example, something like the sketch below:
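(A hedged reading of the idea; the names and generic shapes below are illustrative, not the commenter's actual code.)

```go
// Package iter sketches a dedicated "iterator bool" type; illustrative only.
package iter

// Bool is a distinct named type that marks a signature as an iterator.
type Bool bool

// Push is the single-value push-function shape under this idea.
type Push[V any] func(yield func(V) Bool) Bool

// Pull is the corresponding pull-function shape.
type Pull[V any] func() (V, Bool)
```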
And pull functions:
I'm not sure a
iterpackage really fits with what's proposed here though, I merely used it as a means of easily showing conceptually what I have mind.The end result though of something like the above, is that pull/push functions become distinct within the type system from other functions which have a final bool return value. And it also makes it very obvious to developers that it's a push/pull function by just looking at the function signature.
Beta Was this translation helpful? Give feedback.
-
I'm fine with the integer range. Doesn't seem like a big deal either way. It has precedent in Vue templates, which have <element v-for="i in n"> or <element v-for="i of n">. My main issue with it in Vue is that in and of iteration behave differently in JavaScript, but the same in Vue templates, which is confusing. This doesn't apply to Go though.
-
If you have var f func() (T, error, bool), are for v := range f and (less importantly) for range f valid? Or do they need to be for v, _ := range f and for _, _ := range f? Making them valid is more consistent with slices and maps, but possibly more error prone, since they allow accidentally omitting error checking.
-
That should be handled by go vet, just like defer reader.Close() or writer.Write(...) are.
-
The number of range variables would be required to match the number of pull results (minus the bool) or the number of push yield arguments. Only slice and map would allow dropping the _.
-
Why special case slice and map in that regard?
-
I think there is a definite improvement in discoverability and understandability of this feature if there is always a 1-1 correspondence between range variables and returns from the function. If I know little about Go and read Go code and see all those range statements and wonder what they do, it'll be hard enough as it is to map the dozen or so different forms of functions which can be used and what they mean. Throw into the mix that the number of range variables can also differ from the number of returns…
I would even go so far as to argue that allowing a different number of loop variables for maps and slices might have been a mistake. for i := range someSlice is still a source of confusion and bugs, when people assume that it yields values. If you'd always have to write for i, _ := range someSlice or for _, v := range someSlice, that would've been avoided, at the cost of a bit of verbosity.
The one exception I could see is for range x. Perhaps that should be allowed with any number of returns, as it looks sufficiently different.
-
range is already inconsistent, since chans only have a single-value variant of it. I don't think it's too weird for funcs to have their own rule for that, too.
-
I don't understand why it needs to be as complicated as this. Can't Go define the standard iteration interfaces that range will support in the stdlib, and give an order of precedence and/or choose the interface based on the declared range variables? All you need are the standard pull-type interfaces.
I "think" this is complicated because there is still a design concern about supporting general iteration over a map - requiring a push interface. I think this is easily addressed with built-ins, e.g. iter(somemap), that return one of the above declared interfaces. For non-builtin containers this is not an issue.
I don't see how "generators" align with the Go team's concerns over flow control (which seems to have been the major barrier to exceptions). "generators" (aka hidden threads or coroutines) are the magic that Go typically tries to avoid.
-
I'm not sure what scoped channels are exactly, but that sounds like a much bigger language change than push-style iterators.
-
I had envisioned scoped channels as simply tied to the creating routine. If that routine's references to the channel go out of scope, the channel is automatically closed - essentially an implicit defer.
But I don't think it's really necessary. I don't think this is a real problem in practice. These leaked channels/routines are easily detected and then fixed in the design. In any long-lived system the monitoring will detect the leaks, and the overhead of the routine is fairly minimal in the meantime.
-
As long as the range function is well behaved, the stack trace will be straightforward.
In either case, the stack trace is simple. If the iterator is a Map, the stack is range_body -> (*Map[K, V]).All -> run. If the iterator is a Tree, the stack is range_body -> (*Tree[K, V]).All ... -> run, with All repeated a number of times depending on the recursion depth. The only thing that's not obvious about that is that the range body is a function. And unless the iterator does something like run the iteration in a separate goroutine, the stack trace will be simple.
-
Don't start in the range body - start at the function calling range - and imagine the debugger trying to step through the elements and the range body.
Based on the sample 'unranging' implementation it doesn't appear trivial to go backwards. A pull iterator seems trivial. Maybe each of the generated lines can be labelled with the source line but how the stack aligns is difficult for me to grasp - but I am sure I am missing something.
In most languages when you "compile for debug" it removes a lot of the optimizations because the debugger can't deal with it. This seems to require a level of code generation that won't be easy to work with.
-
I would expect the debugger to step into All if I tell it "step into" when the current line is for v := range it.All. If I "step over" that line, I would expect it to transparently step through All until it calls yield or returns. If I "step into" the range statement during the middle of iterating, I would expect that to be equivalent to stepping out of the body function.
It's certainly quirky but it should work.
-
I like this idea. It solves a bunch of the confusion that arose trying to deal with an iterator interface, and it also leaves it open to potentially add new function signatures later if something useful comes up. It also feels more consistent with the way that the rest of Go works by not relying on methods for a language feature, though it's definitely a bit strange in its own right in a completely different way.
I thought a bit about how a general iteration package could be implemented around this, and I think the best option is to deal with push functions primarily. Everything else can be very easily and cheaply converted to them, so it seems like the most general, simplest form. Here are some examples, assuming that #49085 or something similar isn't adopted:
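(A hedged sketch of the kind of push-function composition being described, not the comment's own examples; FromSlice and Filter are illustrative helpers.)

```go
package main

import "fmt"

// Push is a single-value push function, one of the forms discussed above.
type Push[V any] func(yield func(V) bool)

// FromSlice converts a slice into a push function.
func FromSlice[V any](s []V) Push[V] {
	return func(yield func(V) bool) {
		for _, v := range s {
			if !yield(v) {
				return
			}
		}
	}
}

// Filter composes push functions: it returns a push function that yields
// only the values for which keep reports true.
func Filter[V any](seq Push[V], keep func(V) bool) Push[V] {
	return func(yield func(V) bool) {
		seq(func(v V) bool {
			if keep(v) {
				return yield(v)
			}
			return true
		})
	}
}

func main() {
	evens := Filter(FromSlice([]int{1, 2, 3, 4}), func(v int) bool { return v%2 == 0 })
	// Without the language change, the loop body is a callback.
	evens(func(v int) bool { fmt.Println(v); return true })
}
```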
And so on. The composition of the push functions is kind of interesting, but it's a bit confusing to look at, I think. It might be easier with something like #21498, but I'm not entirely sure.
-
"but it's a bit confusing to look at"
Yes, I think it is confusing to look at. It takes some concentration to follow. Returning from a function a function that takes a function argument, to be called by some calling function, is hard to follow without a lot of concentration.
I think the only way to make it easily readable is to do type definitions so we do not see all three functions at the same time.
-
We're off on a tangent here, but I disagree.
The primary purpose of a standard iterator type would be as glue: one thing produces values, another consumes values, and the iterator type connects the two things. As far as this goes, the iterator type could be based around either push or pull functions.
But an iterator type based on push functions can't do anything else, at least not in any reasonable way I can see.
An iterator type based on pull functions can have other abilities, if the underlying source of values permits those abilities. For example,
-
An "iterator based on push functions" does not make sense. Push functions provide iteration, but they are not iterators in the sense of an iterator being an explicit object representing the state of an in-progress iteration (which is the meaning of "iterator" in almost every other language). I wrote above in a different comment:
-
For anyone coming along later, the comment in question: #56413 (reply in thread)
-
I misread the comment I was reacting to as suggesting a form of iterator type. Sorry for the confusion.
"Push functions ... are not iterators". Yes. But given a push function, an iterator can be created from it. See
NewGen in #54245.
-
Probably a stupid question, but in most cases where iteration has no side effects (on itself?), can't a properly defined, closure-based yield function be used to create a generator/iterator/pull function?
Essentially extracting the iteration internal to the collection by iterating once and storing values in a slice or pushing them onto a channel etc?
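A minimal sketch of that approach, assuming a single-value push function, no side effects, and no concurrent modification (ToPull is a hypothetical helper, not part of any proposal):

```go
package main

import "fmt"

// ToPull materializes a push function into a slice and returns a pull
// function over it. This only works when the iteration has no side
// effects and the collection does not change, as the question notes.
func ToPull[V any](push func(yield func(V) bool)) func() (V, bool) {
	var buf []V
	push(func(v V) bool {
		buf = append(buf, v)
		return true
	})
	i := 0
	return func() (V, bool) {
		if i >= len(buf) {
			var zero V
			return zero, false
		}
		v := buf[i]
		i++
		return v, true
	}
}

func main() {
	push := func(yield func(int) bool) {
		for _, v := range []int{1, 2, 3} {
			if !yield(v) {
				return
			}
		}
	}
	next := ToPull(push)
	for v, ok := next(); ok; v, ok = next() {
		fmt.Println(v)
	}
}
```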
-
Ha that's one stupid question I guess. If the collection changes, the iterator would need to see the changes as well I guess. Nvm.
-
Your question is worth pondering - ISTM the really useful and interesting cases of for will be the ones where there's a lot going on behind the scenes. Persistent data structures would be a great example.
-
Just wanted to summarise my take on adding range F, F a push/pull func and range n, n an int, after digesting things further and some back and forth over a couple of sub-points in other threads.
First, thanks for setting up this discussion, it indeed addresses something missing in Go, a mechanism for custom iteration in an interesting way that extends the Go-like range specialisation over types to functions whose bodies roughly correspond to the body of a for loop, and over range n, n an int.
For range n, I think an approach which does not panic would be preferable. If it were the case, I'd be for it. If it were not, I'd lean toward not supporting it because it doesn't save much and the need to take into consideration possible panics would counterbalance the benefits.
I think custom iteration in general, that is by any means, should be taken slowly and with due diligence to, as the top doc says:
Custom iteration is often difficult to work with in other languages for this very reason. To me, one of Go's strengths is that there is not much custom iteration so that the loops all look the same. Support for a uniform mechanism for custom iteration would IMO make Go less uniform wherever it is over-applied or non convergent w.r.t. best practices. Finding a balance for where custom iteration helps vs hurts will be a long road, and guidance along the way would help.
For range F, the body of F corresponds roughly to the loop body via compiler translation of the loop body.
The 'roughly' part feels too rough to me: There are cases in which such a function would panic where it wouldn't in ordinary usage. The argument to push/pull is not really a full-fledged func, but it looks like it should be. The scoping differences as compared to range loops today are confusing, and are cited as a reason to take changing loop semantics as a prerequisite, even though the pre-declared for _ = range (not for _ := range) version has the original semantics. Should the loop semantics change, this would introduce non-uniformity in Go constructs where it is today uniform: := would no longer correspond to a single declaration within block boundaries {}. The for loop itself would behave more differently between the two versions, = and :=.
I don't think custom iteration really needs the rough edges above. Personally, I'd find something along the lines below much simpler.
[the code below was edited]
Thanks for reading and your consideration!
[before edit, the incorrect code was]:
even in interface form
-
I am slightly concerned about ambiguity when an "element" bool is returned (i.e. ambiguity of use, as called out by @rsc as "accidental iterators"). While these may not show up all that often in the stdlib, I wonder if they may show up more frequently in community code.
For example:
The above would appear to be a valid iterator despite not likely being designed with iteration in mind. If designed to be an iterator, it'd probably look more like:
Side note: in practice, such iterators may always return true for the more value.
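As a hedged illustration of the kind of signature at issue (hypothetical code, not the comment's elided example):

```go
// Package poller gives a hypothetical illustration of the "element bool"
// ambiguity: a method that was not designed as an iterator but whose
// signature matches the two-result pull shape (a bool element plus a
// final bool).
package poller

// Conn is a stand-in type for something that can be polled.
type Conn struct{ attempts int }

// TryAgain reports whether the last operation failed transiently (retry)
// and whether retrying is still allowed. To a range loop under the
// proposal this looks like a pull function yielding bool elements, even
// though iteration was never the intent.
func (c *Conn) TryAgain() (retry bool, allowed bool) {
	c.attempts++
	return c.attempts < 3, c.attempts < 5
}
```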
-
Note that for ok := range CheckHealth() would be a compile-time error (too many values), while for range CheckHealth() would be equivalent to for CheckHealth(). The latter does suggest that 0-ary pull functions may be redundant, but it's not obvious to me that that's true of 0-ary push functions.
-
iirc, the rationale for not including custom iterators in the language initially was to prevent hiding cost and side effects (thus decreasing the ability to reason about code).
Today, if I see for range, I know that, except for iteration on channels, each iteration will be non-blocking and will be exceptionally efficient (~constant cost for slices and maps, and bounded cost for strings). I know also that even with channels, the iteration will have no side effects. There are many classes of bug I might encounter where the cause simply can't be the loop iteration, but with this proposal, that would no longer be the case.
I wonder if the loss of these reasoning guarantees is truly outweighed by the convenience gained through this proposal.
This concern would be mitigated if we had distinct syntax of some kind, such as
for x := func iterator or for x := range @iterator. I'm not suggesting a particular syntax, but would like to see us consider the idea of a variant syntax.
This may also ease integration with existing tooling, since I didn't see any behavior around omitted variables in the original proposal (i.e. for i := range iterator when iterator returns (int, string, bool)), and if we're forced to write for i, _ := range iterator, there will be some tooling, for some time, that will likely complain that the use of the blank identifier is unneeded.
-
I would instead look at it as: the iteration behavior for built-in types was simple and fast. That would remain true with iterators.
The sending goroutine can mutate shared state or perform other side effects between sends.
-
@willfaught
That's true, but it's a weaker property: the minimum overhead remains fairly low, but the maximum overhead becomes unbounded. Further, where before the CPU overhead for a next-element (or channel receive) is consistent and negligible, with this proposal the overhead can vary per iteration.
Consider: at some hypothetical future time, there's stable code which uses a custom iterator, which has long been assumed by readers to use a builtin collection (a plausible misidentification case, since there's no syntactical difference in the proposal). Then debugging can become an issue if that iterator, which generally has very tight performance per iteration, begins tripping on an edge case that unusually blocks for a long time.
My concern is that, based on that misassumption, the thing being iterated over may be one of the last parts of the code that is inspected to find the issue (because it's assumed the non-call expression being looped over couldn't possibly be the cause). Not all programmers review or keep up with language changes, and an identical-syntax change like this could end up resulting in one of those post-incident blog posts ("How Go magic iteration caused company X to have a 16 hour outage"). That hypothetical scenario could be avoided with visibly distinct syntax to signify custom iteration.
That's true, though it should be quite atypical, and contrary to "share memory by communicating" (if the calling goroutine wanted to share mutable state, then channels are not likely the appropriate mechanism).
In the general case, another goroutine (running in "parallel") could be mutating, without synchronization, a slice or map being iterated.
I specifically meant that loops themselves (specifically the evaluation of range expression) cannot cause side-effects today without a visible call: a visible call (with parens) sticks out as "something special may explicitly happen here, but only prior to iteration." With this proposal, even without parens, it's the case that "something special may implicitly happen here on each iteration."
-
@extemporalgenome I don't buy the scenario you are painting as a reason. Sure, it could happen, but I can conjure up similar scenarios for all kinds of code. For example, what if the code is `it := myIter(); for { x, ok := it.Next(); if !ok { break }; use(x) }`, the reviewer checks `myIter`'s code, sees that it just abstracts over a slice, and approves. And at some point, another engineer changes it to iterate over a channel, after all "that's just an implementation detail of `myIter`". And then, at some point, some edge case… If we always assume the worst case scenario that could happen under a language change, we will never change the language. It's not a practical approach.
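Spelled out, the loop in that scenario (using the hypothetical `myIter` and `use` names from the example) is the familiar hand-rolled pull pattern:

```go
it := myIter() // reviewed once as "just abstracts over a slice"
for {
	x, ok := it.Next() // later quietly changed to receive from a channel
	if !ok {
		break
	}
	use(x)
}
```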
I'm not saying I'm not a little bit worried about the potential hidden extra cost. But I'm less worried about this than I'd be about, say, appropriating Python's `in` operator, which is usually assumed to be constant time but is often linear time.
-
But the point is that there is a call there. However much you trust `it.Next()` not to change behavior, there's still a clear syntactic boundary between core language behavior and user-defined code, and as such `it.Next()` very clearly indicates that arbitrary side effects or blocking may occur.

Granted, a programmer can switch an iteration over map keys to be an iteration over a channel without any other changes, and that would support your point even without the function wrapping. That is a risk inherent in the language, but not necessarily one we should carry forward to other use cases.
Aside: as relatively infrequently as it is used (in part because select statements are often needed), I do wish channel iteration had a slightly different syntax, as it doesn't have the same properties as slice and map iteration (it can block, and it can be unbounded). As such, personally, I do not treat channel iteration as compelling precedent for extending the existing `range` syntax to cover custom iterators.

I'm not suggesting no change, just adding a warning sign to things that can be dangerous. For example, if the syntax were changed slightly to be any of the following:
(or even just dropping the `range` keyword alongside any of the above)

As long as identical syntax cannot be used for both builtin collection iteration and custom iteration, there would be sufficient visual distinctiveness, without much typing cost, to make the code (and potential traps) much easier to reason about.
-
I understood your point. I just don't find it compelling. If I can construct a similarly asinine scenario with the same consequences in a universe without syntactic differentiation, then it seems obvious that syntactic differentiation isn't really the issue with your example.
-
I believe there's too much magic in the variety of function signatures that will be accepted by the compiler. I'm thinking about what it would be like to teach this to new Go programmers, and I suspect there'll be a mystical aspect to this which isn't present elsewhere in the language (aside from unintended design consequences, like loop iterator bugs).
A new programmer may learn that they can iterate over any function which returns `(T1, bool)` or `(T1, T2, bool)`. Can they then iterate over a function which returns `(T1, T2, T3, bool)`? Why not? That's surprising (to a new Go programmer)! They also learn that they can iterate over integers! It sounds like you can iterate over almost anything. They might infer that they can iterate over `(T, int)`, where the int indicates the number of items remaining, since, orthogonally, iterating over functions and iterating over integers could plausibly seem combinable.

At this time, I'd favor a variation on #54245 that does not allow signatures to vary in anything but type (i.e. accept `Next() (T, bool)` but not `Next() (T1, T2, bool)`). If two-clause assignment is important, then where index/key-value pairing is applicable, the element returned from Next would be considered an index/key, and a `Get(K) V` extension method would be defined to permit `for k, v :=` assignments, just as `Stop()` was proposed as an extension method. Even without extension methods, I believe there's more value (following the introduction of generics) in a single precise [generic] method signature being accepted, rather than a family of function [and potentially method] signatures.
If supporting functions, and not just methods, is critical, I'd be more comfortable with a single signature for push iterators, and a single signature for pull iterators.
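As an illustration of that "one signature each" idea (my own sketch of what the two shapes could be, not wording from the proposal):

```go
// Pull: call repeatedly; the bool reports whether a value was produced.
type Pull[V any] func() (V, bool)

// Push: calls yield once per value; stops early if yield returns false.
type Push[V any] func(yield func(V) bool)
```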
I think it's also just fine to encourage modeling iterators after `bufio.Scanner`, as that introduced a clean, predictable, and well-understood style and semantics.
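For reference, the `bufio.Scanner` style in current Go looks like this (standard-library API, shown only as a reminder of the pattern being praised):

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	sc := bufio.NewScanner(strings.NewReader("a\nb\nc\n"))
	for sc.Scan() { // advances; returns false at EOF or on error
		fmt.Println(sc.Text()) // the current token (a line, by default)
	}
	if err := sc.Err(); err != nil { // nil on a clean EOF
		fmt.Fprintln(os.Stderr, "scan error:", err)
	}
}
```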
-
Yes, I think some people have proposed to use a defined bool type instead as a signal. It also may help since the semantics of this boolean are a bit specific to iteration status.
And I am sympathetic to your view on push functions.
Using range on them doesn't seem to bring much but an alternate call syntax unless I'm mistaken.
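A defined bool type for that signal could be as small as this (a sketch with invented names; nothing here comes from the proposal):

```go
// Continue reports whether the push function should keep yielding values.
type Continue bool

// Count is a push function over 0..n-1 whose yield returns the named type.
func Count(n int) func(yield func(int) Continue) {
	return func(yield func(int) Continue) {
		for i := 0; i < n; i++ {
			if !yield(i) {
				return
			}
		}
	}
}
```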
-
I agree that there are far too many function signatures allowed here. As a general rule, Go prefers to explicitly convert to a new type, or wrap one type with another, in order to get something to have the features someone wants. I think all of the function signatures except for one- and two-value push functions should be removed, and then a new package should be added with functions that convert from the other signatures, e.g. `funcs.FromPull(somePullFunc)`. Push functions are the most general and add some minor functionality that is not currently available in Go otherwise, so I think they're the most important ones to add directly, but the rest seem unnecessary to me.
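Such a conversion helper might look roughly like this (the `funcs.FromPull` name is the hypothetical one from the comment; the exact signatures are my assumption):

```go
package funcs

// FromPull adapts a pull function (call repeatedly until ok is false) into a
// single-value push function, so that only push functions would need to be
// accepted by range directly.
func FromPull[V any](pull func() (V, bool)) func(yield func(V) bool) {
	return func(yield func(V) bool) {
		for {
			v, ok := pull()
			if !ok {
				return
			}
			if !yield(v) {
				return
			}
		}
	}
}
```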
-
@DeedleFake I feel like we're missing something (or being too hasty) if we say that push iterators are universal (or "the most general"). Having the iterator own the iteration loop can cause some awkwardness in a number of cases:
Certainly the above are all solvable/avoidable merely by not using sugared loops (or by not requiring a FromPull wrapping), but it does suggest there are usability issues with push iterators (they're not universally usable), and if we make them universal, people will tend to favor writing push iterators even if pull iterators would have been simpler or more appropriate.
Given that the most magical parts of the proposal are around push iterators (pull iterators don't need special defer/return behavior or the implicit transformation of a block into an anonymous function), the resultant language may be cleaner if we only solve for pull iterators to start with, and keep push iterators using explicit callbacks while we consider the impact of pull iteration in the wild.
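For concreteness, the transformation being referred to is roughly the following rewrite (my sketch of the desugaring; the proposal may differ in details such as how return and goto are handled):

```go
// What you would write:
//
//	for x := range pushFunc {
//		fmt.Println(x)
//		if x > 10 {
//			break
//		}
//	}
//
// Roughly what the compiler would arrange to run instead: the loop body
// becomes an anonymous callback, and break becomes "return false".
func desugared(pushFunc func(func(int) bool)) {
	pushFunc(func(x int) bool {
		fmt.Println(x)
		if x > 10 {
			return false // break
		}
		return true // continue with the next iteration
	})
}
```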
If, for example, Go considered introducing concise lambdas with typeless parameters (i.e. only really usable for inline callbacks), that could solve push iterators, and other callback cases, as well as this proposal, albeit arguably with less magic.
-
All three of the problems that you outlined are usability concerns from the caller's side. Because of that, you've convinced me that push functions should not be the default.
I still think it makes sense to only support one type and require explicit conversions for the rest, but since the conversion happens on the caller's end, the form that it is converted to should not be one that removes power from the caller. Therefore, I think it makes more sense for pull functions to be the default after all.
-
I disagree; what sucks about that isn't just the syntax for the closure, but that break, continue, goto, and return don't do what they should do.
-
My current worry about adding pull functions as range arguments is that it's a backdoor way to add coroutines to Go. I feel like coroutines should either be first class (have a `yield` statement like Python, and maybe a different keyword in the declaration, like `func F(In) (stream Out) { /**/ }`), or they should be an implementation detail, like the iter.NewGenerator function and runtime channel operations in the old proposal. With this proposal, there's a whole new kind of procedure… but it only works if you use it in a range statement. In a way, that's more radical than "there's a new optimization, but it only applies in certain cases when the compiler is sure it's safe."

I worry that it's an avenue for possible abuse, and people are going to do "clever" things, like make "Twisted Go" (a la Twisted Python) that simulates an async system with pull functions in order to avoid the Go runtime scheduler, for whatever reason. ISTM something as important as a new mechanism that lets you suspend a function without using goroutines shouldn't be tied to the range statement.
-
@willfaught You are addressing the clarification of what my question meant. Do you have an answer to the question itself?
-
As I understand your question and the proposal, to get the coroutine behavior of suspending and resuming with panic propagation across boundaries, you have to do `range pf`, because if you call the push function directly with a callback that panics, the stack trace will look like outerfunc > Iter, but if you do the panic in a range loop (`for range myContainer.iter { panic("boom") }`), the stack trace will look like outerfunc alone.

If you convert the push func to a pull func with some library construct, then the stack trace will of course just have outerfunc, but that's because Iter will be off running in its own goroutine.
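The direct-call case mentioned above would be something like this (my sketch, reusing the `myContainer.iter` name):

```go
// Calling the push function directly: the panic unwinds through the
// iterator, so the iterator shows up in the stack trace.
myContainer.iter(func(x int) bool {
	panic("boom")
})
```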
So, if you're in a range call and AFAICT only if you're in a range call, you can suspend Iter without spawning a goroutine and yielding to the scheduler.
Does that answer your question?
-
You seemed to be responding to #56413 (reply in thread), and asked a straightforward question about what it would look like, so that's what I addressed.
I agree that there's ultimately no difference. It's just an implementation detail. For all the user knows, the compiler actually will use a channel and real goroutine when converting from push to pull.
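That channel-and-goroutine version is easy to sketch (my illustration of the "implementation detail" point, not code from any proposal; a real version would also need a way to stop early without leaking the goroutine):

```go
// pullFromPush runs the push function in its own goroutine and exposes its
// values through a channel-backed pull function.
func pullFromPush[V any](push func(yield func(V) bool)) func() (V, bool) {
	ch := make(chan V)
	go func() {
		defer close(ch)
		push(func(v V) bool {
			ch <- v
			return true
		})
	}()
	return func() (V, bool) {
		v, ok := <-ch
		return v, ok
	}
}
```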
-
It is my understanding that the only reason we'd need to change this is to make converting from push functions into pull functions more efficient (i.e. exactly the optimization described in the Appendix to #54245 - and which, AIUI, you consider the preferable alternative). In particular, you say:
But in terms of these optimizations, this new design does not actually differ from the old design. The old design called "pull functions" `iter.Iter`, called "push functions" generators, and suggested naming the conversion from generators into iterators `iter.NewGenerator`. And it suggested applying the optimization to that case. None of that differs in the slightest from what we are discussing here. At least as far as I understand it.

It is not my understanding that the compiler should allow conversions from push to pull functions or vice versa, or that it should do them automatically, but that the compiler should implement range over pull by straightforward iteration code, range over push by translating the loop body into an opaque `func`, and that an iteration library could convert between the two as glue.

Supporting push functions in `range` does require some magic to transform a loop body into an opaque `func` value, but that has nothing to do with coroutines.

I'm not sure. Your original statement was:
I still don't understand this statement in relationship to the idea of a builtin `yield` statement/function. ISTM that such a coroutine would also only work in a `range` statement. Or, at least, it wouldn't work in any more contexts than push/pull functions do. And the explanations so far don't seem to really contradict that. So I'm still a bit confused about what you meant.

Were you in fact referring to the "magic" translation of loop bodies into opaque `func` values? If so, I'd understand a bit better where you are coming from. Though I'd still be a bit confused about the criticism, because it seems fairly self-evident that this translation only happens for `range` statements.
-
@carlmjohnson Your Javascript MDN example is exactly the sort of magical code that the Go language has so far managed to avoid.
To each his own, but I sure hope Go does not go that route.
-
In my opinion, yield is a complicated enough concept to cause a lot of bad, incomprehensible code to appear, and this suggestion provides only syntax sugar for writing something that is already more than possible in the language. I believe this goes against the rule of "one problem, one solution."

Please, let Go stay boring.
-
So much this.
What happened between 2018, when Rob Pike said that adding more features would make the language bigger but less different, and now?

I'm afraid that the current push to add more features is just going to turn the language into yet another flavour of generic programming language.
-
Pardon me if this has already been discussed, but it occurs to me that push functions could be replaced by a simple variant of pull functions. I'm not sure whether this would be better than push functions (that's surely a subjective matter where opinions will vary), but it seems to me a quite reasonable alternative.
Let's consider just the case of a loop yielding one value at a time: `for x := range something`, where `x` has type `T`. The case of two values at a time isn't different in any significant way. In the initial discussion, a pull function for this case has signature `func() (T, bool)`. We could also consider pull functions with signature `func(bool) (T, bool)`.
The intent here is that when called with `true`, the pull function acts as the previous pull function, returning either (next value, `true`) or (arbitrary, `false`). When called with `false`, the pull function performs any cleanup necessary for the end of the loop. A `for range` statement would call the pull function with `false` when the loop is terminated prematurely.

Given a function `pullx` with signature `func(bool) (int, bool)`, we could write a `for range` loop over it, and the compiler would translate it into something similar to the sketch below.
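(The following is my reconstruction of that loop and its translation; the exact desugaring is an assumption.)

```go
// What you would write:
for x := range pullx {
	fmt.Println(x)
}

// Roughly what the compiler would emit:
for {
	x, ok := pullx(true)
	if !ok {
		break
	}
	fmt.Println(x)
}
// On any premature exit from the loop (break, return, goto, panic), the
// compiler would additionally arrange to call pullx(false) for cleanup.
```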
The initial discussion remarks that any push function can be automatically transformed into a pair of a `next` pull function and a `stop` cleanup function. We can continue this to get a pull function of the new form, as sketched below.
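(A sketch of that continuation; the generic wrapper and its name are mine.)

```go
// combine wraps a next/stop pair into the single pull(bool) form:
// pull(true) advances, pull(false) performs the cleanup.
func combine[V any](next func() (V, bool), stop func()) func(bool) (V, bool) {
	return func(more bool) (V, bool) {
		if !more {
			stop()
			var zero V
			return zero, false
		}
		return next()
	}
}
```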
This new function can then be used as the object of a `for range` without any need for an explicit call to `stop`.

Some questions around this:

- If `pullx(true)` returns (whatever, `false`), should the `for range` loop call `pullx(false)`? I tend to think not; `pullx` should do the cleanup when it returns `false`. But I don't really care either way.
- If the loop is terminated prematurely, should the `for range` loop call `pullx(false)`? If so, precisely when? What happens if both the loop body and `pullx(false)` panic? My initial feeling: yes, immediately on loop termination and before executing any deferred functions in the function where the loop occurs. And I haven't a clue what to do about the double panic.
- If `for range` accepts this version of a pull function, then there is no need for it to accept the original (no-parameter) version of a pull function. Would we want to accept both, or just the new version?
-
I think it would be fine either way. If we chose to allow `pull(bool)` functions in `for range`, the spec would have to be clear about whether `for range` calls `pull(false)` after `pull(true)` returns `false`, and writers of pull functions would adjust appropriately.
-
I think it would be more interesting to ask what happens if the loop body panics. Presumably that should call pullx(false). Which means that call should probably get deferred by the range (answering the question above). But that would be the first time (I think) a language feature would implicitly defer something.
-
It would be sort of like a defer, yes. But maybe not exactly. Consider:
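(A sketch supplied for concreteness: assume `pullx` yields 0, 1, 2, … and `cleanup` is a placeholder.)

```go
for x := range pullx {
	defer cleanup(x) // one deferred call per iteration
	if x == 10 {
		break // the body has now run 11 times; pullx(false) must still be called
	}
}
```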
Should the call to `pullx(false)` occur before or after the 11 calls to functions deferred by the loop body?

Also consider that if the implementation of `for range` uses `defer` to call `pullx(false)`, then that call won't happen until the containing function exits. But in the common case where the loop body does not panic but does `break`, `pullx(false)` should be called immediately after the loop terminates, and before any other code in the containing function runs.
-
To me, all of this seems like a decent argument that the control flow of `pullx` is not as straightforward as it may seem.
-
Yes, I am now thinking that push functions would be a better choice. They also don't necessarily have simple control flow, but are sometimes easier to write.
-
The ergonomics of push-based iterators seem nice, but I'm concerned it has a lot of corner cases to think about:

- `defer` statements, too. E.g., suppose a recursive iterator like `Tree[K,V].All` contained a defer statement, as did the for loop that invoked it (a sketch follows below).

I expect these questions don't directly matter to most users, but I think they're relevant to the compiler for how it desugars control flow statements. In turn, this is indirectly relevant to users because it could affect performance.
I think a lot of misuse (e.g., questions 1 and 2) could be cheaply caught by simply poisoning the closure's PC field after we don't expect it to be called any further.
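A sketch of the `Tree[K,V].All` situation being described (the `Tree[K,V].All` name comes from the comment; the type, the cleanup helper, and the body are my own illustration):

```go
type Tree[K, V any] struct {
	Left, Right *Tree[K, V]
	Key         K
	Value       V
}

func releaseNode[K, V any](t *Tree[K, V]) { /* per-node cleanup, for illustration */ }

// All is a recursive push iterator over the tree, in order.
func (t *Tree[K, V]) All(yield func(K, V) bool) bool {
	if t == nil {
		return true
	}
	defer releaseNode(t) // deferred work inside the iterator itself
	return t.Left.All(yield) && yield(t.Key, t.Value) && t.Right.All(yield)
}

// ...and the invoking loop defers as well:
//
//	for k, v := range tree.All {
//		defer log.Println("visited", k, v)
//	}
```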
-
@mdempsky
I disagree. I think there is one obvious ordering: the calls deferred by the push function occur when the push function returns — which is after the caller finishes executing the last iteration of the loop and before the caller executes the first statement outside of the loop. (That is: the deferred calls occur when execution leaves the
`for … range` statement in the caller.)
-
I think my last paragraph in #56413 (reply in thread) was confused. The deferred calls aren't in some kind of global LIFO order — the deferred calls in each function are in LIFO order, and each function executes its deferred calls when the function returns (or halts via panic or
`Goexit`).
-
I agree that's an obvious ordering, yes. It's the same one @DeedleFake suggested, for example.
I'm saying it's not obviously good: it means deferred calls no longer happen in strict LIFO order with respect to their corresponding `defer` statements.

I periodically see tracing code written like `defer f()()`, where `f()` pushes something onto a stack, and then the returned function is responsible for popping it off. This idiom becomes error-prone if we abandon LIFO ordering.
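For anyone unfamiliar with that idiom, a minimal self-contained example:

```go
package main

import "fmt"

// trace records an "enter" event immediately and returns a function that
// records the matching "exit" event.
func trace(name string) func() {
	fmt.Println("enter", name)
	return func() { fmt.Println("exit", name) }
}

func work() {
	defer trace("work")() // trace("work") runs now; its result runs when work returns
	fmt.Println("working")
}

func main() {
	work()
}
```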
-
When you say "global LIFO order," I hear an ordering across all goroutines within a process. I'm not suggesting that exists either.
But today we do maintain a strictly LIFO, per-goroutine stack of deferred calls: each `defer` statement pushes a call onto the goroutine's defer stack, and `panic` and `return` are responsible for popping calls off the stack as necessary.

The proposal here implies relaxing the "strictly LIFO" part of that. We can certainly do that, but I think it should be taken very seriously. `defer`/`panic` are already very subtle, and the implementation today is quite complex and fragile.

Ian points out the iterators could actually operate under the hood using two goroutines, which would cleanly address the implementation concerns around deferred calls. But it wouldn't have any performance advantages, since the API is synchronous anyway. So that seems like it would be pure overhead to me.
But as I also pointed out, I question whether users actually intentionally write `defer` statements inside `for` loops, intending for the calls to queue until function return. And if they don't, we can just disallow them in the presence of push-based iterators, which avoids the whole issue. We can always relax that restriction in the future if use cases present themselves.
-
From #56413 (reply in thread)
The point is that there is only one order in which the deferred function calls can be executed that satisfies both existing language semantics and reasonable rules around iterating over push functions (as outlined in #56413 (reply in thread)).
Earlier, I argued for a sort-of converse of this, that if you thought the order was not determined, then you must intend to change existing language semantics. That might have been confusing; I apologize for that.
To take a concrete example, slightly modified from #56413 (reply in thread):
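(The example itself is my reconstruction of the shape described below, assuming `iter` yields the values 0, 1, 2.)

```go
package main

import "fmt"

func iter(yield func(int) bool) {
	for i := 0; i < 3; i++ {
		defer fmt.Println("iter", i) // deferred inside the push function
		if !yield(i) {
			return
		}
	}
}

func main() {
	for x := range iter {
		defer fmt.Println("loop", x) // deferred inside the loop body
	}
	fmt.Println("middle")
}
```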
Here the defer statements (not the deferred function calls) must be executed in an interleaved order: `iter`'s defer statement for its first value, then the loop body's defer statement for that value, then `iter`'s defer statement for its next value, and so on. Note that the defer statements from the push function and the loop body are interleaved, even though the deferred function calls will not be (as we will see).

The function calls deferred inside the push function `iter` occur in LIFO order, and they occur when `iter` returns, which must be before `fmt.Println("middle")` is executed. So all of the "iter" messages must appear, in LIFO order, before "middle" (possibly interleaved with other deferred function calls, so far as we know at this point in the argument).

But by existing language semantics, the function calls deferred within `main` must occur, in LIFO order, when `main` returns, which is after "middle" is printed.

Since `fmt.Println("middle")` occurs at the end of one subsequence and the beginning of the other, there is only one way they can be combined: all of the "iter" messages, then "middle", then all of the "loop" messages. And we see that function calls deferred in the push function and the loop body cannot be interleaved, even in the absence of an explicit rule against interleaving.
-
Note: this is out of sequence because it was sent via e-mail rather than added to the discussion thread.
I agree that the "iter" and "loop" messages must appear before the "end" message. What I said was that the interleaving of the "iter" and "loop" messages could, perhaps, be unspecified. That is, while the "iter" messages must appear in the obvious order, and the "loop" messages must appear in the obvious order, it's unspecified whether the "loop 0" appears before or after "iter 0", etc.
-
Mea culpa. I forgot that replying by e-mail has that undesirable side effect. I'll try to remember in future.
Sorry if I wasn't clear. My opinion is the "iter" messages must appear before the "end" message, and the "loop" messages must appear after the "end" message. So the "iter" messages are separated from the "loop" messages by the "end" message, and no intermixing is possible. Not because of implementation details, but because of language semantics that it would be too confusing to change.
Consider this snippet (with no "iter" messages):
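(My reconstruction of such a snippet, with `whatever` as the placeholder used below and "end" printed right after the loop:)

```go
for x := range whatever {
	defer fmt.Println("loop", x)
}
fmt.Println("end")
```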
Currently, no matter what the type of
whatever, ifxtakes the values 0, 1, and 2 in that order, then this must printIf I understand you correctly, you are suggesting that in the one special case that
whateveris a push function, the output could also be, for example,or
or many other possibilities.
I find this a startling departure from the current state of affairs.
-
Sorry for misunderstanding. But your concern is not what I'm suggesting. I agree that whether or not you use a push function the order of the defer statements in your example is unchanged.
What I am saying is that if the push function itself uses defer statements, then the order in which those deferred calls (the ones in the push function) run, compared to the order in which the functions deferred during the loop are run, is unspecified.
-
I really like how this proposal provides a unified syntax (
range) for both internal (push) and external (pull) iterators.But, as someone who reads (reviews) much more code than I write, my only (non-blocking) concerns are about the complexified mental model around a
for rangeloop because of the explosion of the possible types and underlying hidden complexity and cost.So far when I see the following loop:
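(A placeholder loop of the kind being described, with `X` standing in for whatever is being ranged over:)

```go
for k, v := range X {
	// ...
}
```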
I only have to determine if X is an
array, aslice, astringor amap. Asarrayis similar tosliceand range overstringis quite rare in business code and quickly identified by the context, the question is usually more between 2 alternatives:slice/arrayormap. I can usually reply to that question using the func scope around the loop.However by introducing push/pull range iterators, the number of possible types will explode. And even more, the cost of each iteration style will be much more varied: a user-defined iterator might have some bugs or performance issues that I don't expect from built-in iterators. The risk of hidden panics will also explode (so far, no panic on iterating on a nil slice or nil map). My existing review tooling (
git diff, GitLab Merge requests viewed in browser) that doesn't provides type information inline will become insufficient if I can't easily determine the iterator construction.That is a case where this added syntactic sugar will ease write more concise Go, but increase mental load of human readers.
Range over plain integers (
`for i := range 5`) as suggested as a later step would make it even worse: in `for e := range arg.Elements` there is a huge difference in the block's behavior depending on whether `Elements` is an `int` or a `[]string`.
-
`database/sql.Rows` is mentioned as an iterator example, but I think that isn't one that will benefit from this proposal. Errors may happen while iterating or inside an iteration callback, and this proposal doesn't handle that case.
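For reference, the current pull-style pattern looks like this (standard `database/sql` usage; `db` is assumed to be an open `*sql.DB` and the query is illustrative). The `rows.Err()` check after the loop is the part a plain range loop has no obvious place for:

```go
rows, err := db.Query("SELECT id, name FROM users")
if err != nil {
	return err
}
defer rows.Close()
for rows.Next() {
	var id int
	var name string
	if err := rows.Scan(&id, &name); err != nil {
		return err
	}
	// use id and name
}
if err := rows.Err(); err != nil { // errors encountered during iteration
	return err
}
```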
-
A bit off-topic as this isn't about the push/pull proposal, but as I'm mentionning
database/sql.RowsI wanted to mention some experiments I did over the years around simplifying iterating over it.As a heavy user of
database/sql.RowsI have written my own external iterator around it with the following signature:However I almost never use it myself because:
- … the `database/sql.Rows` methods
- … `sql.Rows`, where we have to handle runtime errors)
- … `sql.Rows` is the call to `rows.Scan`, where you have to pass pointers to target variables (forgetting `&` is a common beginner mistake), and that wrapper was still not encapsulating it.
sql.Rowsiterator in my packagegithub.com/dolmen-go/sqlfunc(see ForEach and Query) going towards encapsulating theRows.Scancall but its heavy use ofreflectmakes it perform badly. I had started some work on a code generator (to move type introspection to ago:generatephase in order to avoid use ofreflect.Value.Call), but I paused this in 2021 while waiting for generics.Beta Was this translation helpful? Give feedback.
-
That could look something like this, right?
Admittedly not a big improvement over the current
`for rows.Next()`, but at least more uniform. You have to remember the error check either way.

If we adjust the interface somewhat, could that help?
I don't know, but I like that the proposal gives us this option. Now you can build whatever adapter you like.
-
-
@carlmjohnson This is more verbose and less efficient than:
And you forgot to check for
`rows.Err()` after the loop, and this is necessary in both versions.
-
I'm proposing a different API, which would not require the extra error check.
`row.Scan` was not a typo for `rows.Scan`. It's a new type that represents a single row. It's unlikely that such an API change will happen, however, because it would be somewhat redundant with the current API. Maybe if there's ever a database/sql/v2.
-
For Range over ints, would using the same syntax as for slicing subscripts be more "Go-like"? E.g.
`for i := range [:n] {...}`. We could use other start points, e.g.
`for i := range [2:len(a)]`, or require an explicit
`break`, e.g. `for i := range [:]`.