Skip to content

Conversation

@Ivorforce
Copy link
Member

@Ivorforce Ivorforce commented Oct 22, 2025

Adds zip_shortest and enumerate iteration helpers.
If they're needed, I'll take this PR out of draft.

Note that I'm not 100% sure we actually want this, due to its trade-offs.

How to use

constexpr float a[5] = { 1, 2, 3, 4, 5 };
constexpr int b[5] = { 2, 3, 4, 5, 6 };

for (auto [ai, bi] : zip_shortest<float, int>(a, b)) {
    print_line("a[i]: ", ai, ", b[i]: ", bi);
}
constexpr float a[5] = { 1, 2, 3, 4, 5 };

for (auto [idx, ai] : enumerate<size_t, float>(a)) {
    print_line("a[", idx, "]: ", ai);
}

Trade-off

There are some downsides to the implementation:

  • It's fairly complicated "C++ magic". This can be hard to understand and may impact compile time.
  • C++ structured binding does not support explicit types. It uses auto.

That being said, there are upsides:

  • Both enumerate and zip work very similarly as in other languages (e.g. python, Rust), and should be instantly familiar.
  • Use of these helpers encourages using iterators, which is often more efficient than index based iteration.

Notes

Part of the implementation adds std::get support to our Tuple. This would also allow code like this:

Tuple<int, bool, float> tuple;

// Direct access
int a = tuple.get<0>();
int b = tuple.get<1>();
float b = tuple.get<2>();

// std-like access
int a = std::get<0>(tuple);
int b = std::get<1>(tuple);
float b = std::get<2>(tuple);

// Destructure
auto [a, b, c] = tuple;

We may wish to merge this separately, although there is no explicit reason to do it (and destructuring uses auto anyway).

@AThousandShips
Copy link
Member

AThousandShips commented Oct 22, 2025

I'd argue that it depending on auto is a very good reason not to do this, it's awful for readability and increases cognitive load evaluating small segments of code

@Ivorforce
Copy link
Member Author

Ivorforce commented Oct 22, 2025

I'd argue that it depending on auto is a very good reason not to do this, it's awful for readability and increases cognitive load evaluating small segments of code

I don't have as hard of a dislike against auto as other maintainers, but given our discouragement of it, I might agree.

It would also be possible to use explicit Tuple instead. That would just require removing the destructuring operator, but ends up with fairly complex types:

const float floats[5];
for (Tuple<size_t, const float &> tuple : enumerate(floats)) {
    // ...
}

If we don't want auto, I may prefer #111492. But unfortunately, it would be incompatible with zip.

It's a real shame that C++ destructuring does not support explicit types!

@AThousandShips
Copy link
Member

(Separated to a followed comment)

My issue with auto is that it removes local type specificity, which makes understanding code locally hard:

Take for example this:

// In some header file, the grandparent class of the class we're using.
Vector<Foo> my_foos;

// In the source itself, with no direct idea of what `my_foos` contains, without searching up the hierarchy.
for (auto [idx, val] : enumerate(my_foos)) {
    // How to know what `val` is if I'm just reading this code?
}

Contrasted with having the type explicit, where you don't have to go hunting for the type to know what it does

@Ivorforce
Copy link
Member Author

My issue with auto is that it removes local type specificity, which makes understanding code locally hard:
[... example ...]
Contrasted with having the type explicit, where you don't have to go hunting for the type to know what it does

Note that modern IDEs help a lot with this:

SCR-20251022-nbpy

But yea, it wouldn't help with GitHub's code viewer, or for people that don't use an IDE.

@AThousandShips
Copy link
Member

AThousandShips commented Oct 22, 2025

This preference and concern comes in quite a big part from doing review where being able to tell what's going on is important to be able to evaluate what's going on in a bit of code, and a lot of just looking at the code to understand it would be done just looking at the code listing on GitHub I'd say

But where explicit typing can be achieved using Tuple for example tat would be very useful, but I think trying to maximize "locality of understanding" is an important goal to make the codebase accessible, especially since a lot of code has quite deep inheritance, this also isn't limited to local variables or class variables but also cases like:

for (auto [idx, var] : get_something_from_great_grandparent(foo)) {
   // ...
}

Edit: And to be clear I used to use auto a lot in my own projects, because it's so convenient, and while I think we should stick to a pretty absolute rule on it generally in review to not get bogged down in discussions about where to draw the line, I do think it would be acceptable in cases where the return value is obvious, like auto *node_3d = Object::cast_to<Node3D>(obj);, but for those cases it saves little in the way of typing, so it's not worth it IMO

@Repiteo
Copy link
Contributor

Repiteo commented Oct 22, 2025

Hmmm… There might be a way to sidestep auto (or at least, its disadvantages). We could potentially take advantage of class template deduction. This is a paradigm we don't use currently (though we probably could), which allows templates to infer their typing from the arguments. In theory, we could leverage the opposite effect: making a deduction result in an invalid template.

I played around with this a bit locally; here's a rough example of what I mean:

template <typename T, bool EXPLICIT = true>
class ForceExplicit {
	T _value;

public:
	ForceExplicit(T p_value) :
			_value(p_value) {
		static_assert(EXPLICIT, "Template must be explicitly defined");
	}
};

template <typename T>
ForceExplicit(T) -> ForceExplicit<T, false>; // User-defined deduction to invalidate implicit construction.

void test_explicit() {
	// auto test_implicit = ForceExplicit(0); // Fails compilation, static assertion triggered.
	auto test_explicit = ForceExplicit<int>(0); // Will compile as expected.
}

The hope is that this would allow us to have our cake and eat it too, where we can use this neat syntax that forces auto but without the ambiguity that would come with it:

// Hypothetical implementation
Vector<Foo> my_foos;

// for (auto [idx, val] : enumerate(my_foos)) {} // Fails compilation, static assertion triggered.

for (auto [idx, val] : enumerate<size_t, Foo&>(my_foos) {} // Will compile as expected.

@Ivorforce
Copy link
Member Author

Ivorforce commented Oct 22, 2025

Your suggestion has the additional advantage that types can be coerced on return (like in regular for loops):

const int ints[5];

// Will all work
for (auto [idx, val] : enumerate<int, int&>(ints) {}
for (auto [idx, val] : enumerate<int, const int&>(ints) {}
for (auto [idx, val] : enumerate<int, int>(ints) {}

I like it!

@AThousandShips
Copy link
Member

AThousandShips commented Oct 22, 2025

I'd say that does a lot to reduce the issue with type clarity, though I'd say it still creates an ambiguity in enforcing no-auto that makes review and using the feature harder, though I still have the same concerns over hiding implementation raised in the other PR

@AThousandShips
Copy link
Member

AThousandShips commented Oct 22, 2025

I'm also concerned that zip is a bit obscure in its behavior that can easily trip people up, it is less obvious that it means the same as:

for (int i = 0; i < MIN(a.size(), b.size()), ++i) {
    av = a[i];
    bv = b[i];
    // ...
}

The hidden implementation details might throw people off as it's far less clear what it does when iterating differently sized containers, by hiding the implementation details

I'd also be interested in knowing what the potential impact of using templates so much is on compilation performance and binary size, the templates should be inlined but that's a potential hazard, and especially the compilation time of working out the templates and substituting them

Also reiterating my concern from the other PR that this should come as a solution to existing code needs, would be good to have some cases where this would be used in the current codebase to evaluate the need and demand for it

@Repiteo
Copy link
Contributor

Repiteo commented Oct 22, 2025

I think the concern of hiding implementation details is valid, though I would argue that could be applied to the majority of core templates. We could make it a requirement that PRs adding core templates need an associated documentation PR as well. That way these ambiguous cases would have something concrete for us to point towards, with the niches they fill having associated examples and "best practices"

@Ivorforce
Copy link
Member Author

Ivorforce commented Oct 22, 2025

I've pushed a change that includes Repiteo's suggestion.

I also share ATS's concern that zip's behavior with different sized iterators. I've renamed the function to zip_shortest, which is aligned with Rust's zip_longest.
Another solution would have been to DEV_ASSERT that they're the same length, but that would seem a bit extreme to me?

I'd also be interested in knowing what the potential impact of using templates so much is on compilation performance

Yea, this will definitely be harder to compile than regular iterators. However, the cost most likely comes when it's used, so it would likely be confined to a few dozen cpp files (assuming we don't zip in header files). I think the practical impact will be small, but it might be worth testing.

and binary size, the templates should be inlined but that's a potential hazard

gcc and clang are very good, so I expect they will optimize this perfectly. MSVC will probably be "good enough" (as always).

@Ivorforce Ivorforce changed the title Add zip and enumerate iteration helpers. Add zip_shortest and enumerate iteration helpers. Oct 22, 2025
@AThousandShips
Copy link
Member

The difference is that we don't really have any other templates like this, we have containers that are complex data types, but this is a simple wrapper or helper that has very little obvious meaning in the specifics IMO

I think the practical impact will be small, but it might be worth testing.

I think practically any performance overhead for compiling for helper types is too much, especially since the readability is arguably worsened at least personally

If they come at a cost, and have to be used sparingly, that feels like it defeats the purpose, especially without good examples of where to use things

@Ivorforce
Copy link
Member Author

Ivorforce commented Oct 22, 2025

The difference is that we don't really have any other templates like this, we have containers that are complex data types, but this is a simple wrapper or helper that has very little obvious meaning in the specifics IMO

We have a few helpers of varying complexity in core headers. Some examples include typedefs.h, variant_internal.h, type_info.h and method_ptrcall.h. There's probably more that I missed.

But yea, this would be the first one for iteration, which is already quite nebulous in C++ imo.

I think practically any performance overhead for compiling for helper types is too much, especially since the readability is arguably worsened at least personally

Readability is definitely a matter of perspective. I'd argue that it's easier to read if you accept it as a blackbox implementation (like language features), or if you are quite familiar with it already.
But I'd agree it's harder to understand if you want to know the specifics, but haven't dissected the implementation yet.

If they come at a cost, and have to be used sparingly, that feels like it defeats the purpose, especially without good examples of where to use things

If we were to introduce them, I would fully encourage their use in all cases they can be used. I don't think there's a need to use them "sparingly".

@AThousandShips
Copy link
Member

I based that on the following which seemed to be saying that the cost would be negligible in practice because they were used sparingly:

However, the cost most likely comes when it's used, so it would likely be confined to a few dozen cpp files

But I'd say that indeed it would be good to:

  • Evaluate that potential compile time impact of using these extensively
  • Get some tangible examples of where these would be used, to ensure there is established cases where to use them, especially for zip_shortest, to start from solving an existing problem rather than adding convenience helpers for potential future use, having them in the codebase would also greatly improve the ease to learn to use them with practical examples to reference

@Ivorforce
Copy link
Member Author

Makes sense, what I meant was "I don't think it will be useful very often" rather than "I don't think we should use them very often".

But I'd say that indeed it would be good to:

  • Evaluate that potential compile time impact of using these extensively
  • Get some tangible examples of where these would be used, to ensure there is established cases where to use them, especially for zip_shortest, to start from solving an existing problem rather than adding convenience helpers for potential future use, having them in the codebase would also greatly improve the ease to learn to use them with practical examples to reference

Fully agreed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants