TypeDesc Rust Bindings #4643

Draft: wants to merge 16 commits into base: main

scott-wilson
Contributor

Description

This is my initial stab at creating an (official?) wrapper for OIIO in Rust. The current goal is to tackle one module/class at a time and get it to 100% (sys crate, API crate, tests, and docs) before moving onto the next.

Tests

Testing is in progress, but the tests should only cover whether the wrapper is doing what it is supposed to. If the C++ side has an error, it will not be considered an error in the wrapper.

Checklist:

  • I have read the contribution guidelines.
  • I have updated the documentation, if applicable. (Check if there is no
    need to update the documentation, for example if this is a bug fix that
    doesn't change the API.)
  • I have ensured that the change is tested somewhere in the testsuite
    (adding new test cases if necessary).
  • If I added or modified a C++ API call, I have also amended the
    corresponding Python bindings (and if altering ImageBufAlgo functions, also
    exposed the new functionality as oiiotool options).
  • My code follows the prevailing code style of this project. If I haven't
    already run clang-format before submitting, I definitely will look at the CI
    test that runs clang-format and fix anything that it highlights as being
    nonconforming.

@virtualritz

virtualritz commented Feb 19, 2025

I think a Rust binding ideally feels like it isn't a binding but an API of any well-executed crate (pure Rust).

If a user is confronted with the oddities (from a Rust perspective) that the shortcomings of the original language dictate because they're forwarded in the wrapper, you failed on the developer UX part of writing a wrapper.

In that sense, the TypeDesc Rust API in the PR does not feel very Rust to me. Please take this with a grain of salt: I'm quite opinionated about this as I've been writing production Rust code full time for over four years now. 😉
I care about this so much that I spent a measurable amount of my time smoothing out APIs of dependencies I directly interact with and opening PRs for these.
Because many Rust people care about this, large parts of the actively maintained and used Rust ecosystem feel 'uniform' (or even 'wholesome', to misquote John Carmack who spoke about the language itself, when he used that term, I believe).
But keeping it that way is an uphill battle as people coming to Rust from other languages often don't have a feeling for what this entails (yet).

What I did in my binding:

  • None variants in the C++ enums were stripped and expressed by wrapping the whole sum-type on the Rust side in an Option. I.e. there should not be a BaseType::None; instead, TypeDesc::base_type should be Option<BaseType>.

    #[repr(C)]
    pub struct TypeDesc {
        pub base_type: Option<BaseType>,
        pub aggregate: Aggregate,
        pub vec_semantics: Option<VecSemantics>,
        pub array_len: Option<ArrayLen>,
    }

    I also collapsed BASETYPE::UNKNOWN into Option<BaseType>::None.

    The reason being that, while there is a difference, it seemed to me that handling both cases comes down to the same thing: if you don't know the width of the BaseType, how do you do anything meaningful with the data?
    Or: is this edge case important enough to warrant keeping an Unknown variant? I guess @lgritz has to comment. :)
    If Unknown is important it can be added later without breaking actual downstream code as BaseType is marked as #[non_exhaustive].

  • C++ type names are mapped to Rust type names. Because C++ type names are just confusing to people with no C++ background (i.e. sticking to them not only makes the API look like C++ code expressed in Rust, it is plain rude IMHO as you force people, especially non-native speakers, to learn yet another set of acronyms).

    #[non_exhaustive]
    #[repr(u8)]
    pub enum BaseType {
        U8 = 2,
        I8 = 3,
        U16 = 4,
        I16 = 5,
        U32 = 6,
        I32 = 7,
        U64 = 8,
        I64 = 9,
        F16 = 10,
        F32 = 11,
        F64 = 12,
        String = 13,
        Ref = 14,
        UstringHash = 15,
    }
    

    Otherwise where do you stop? Does the binding also ship with stuff like

    #[allow(non_camel_case_types)]
    type float = f32;

    And then use float in any public signature, instead of f32?

    That should sound as ridiculous as it does but it should make the above rationale about renaming type name variants to their Rust equivalents easier to follow.

    On that note, the variants in your binding violate the Rust API Guidelines. I.e. UInt8 (wrong) vs. Uint8, UStringHash (wrong) vs. UstringHash. Curiously there is Uint16, which has the correct capitalization. 😁

    P.S.: You may have noted that Ptr was renamed to Ref. I believe the API should not expose any way to interact with raw pointers. I.e. the pointer type on the FFI side needs to be wrapped into a reference with a known lifetime on the Rust side, and a lifetime-tracked reference on the Rust side needs to be converted into a pointer at the FFI boundary such that the binding code upholds the lifetime guarantees in the way it exposes the pointer to the FFI.

  • I also did small changes to variant names to reflect common practices in Rust land.

    Almost all linear algebra crates use MatN/MatrixN where N means N×N. I.e. nalgebra::Matrix3, glam::Mat4, mint::ColumnMatrix2.

    Only when dimensions vary is the 2nd dimension expressed as in nalgebra::Matrix3x2.

    As such the resp. types in Aggregate are called Matrix3 and Matrix4 in my binding, not Matrix33 and Matrix44.

  • VECSEMANTICS::NOXFORM & VECSEMANTICS::NOSEMANTICS were mapped to Option<VecSemantics>::None (see above).
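
To make the first bullet concrete, here is a minimal sketch of how a binding could collapse the C++ sentinels into an Option at the FFI boundary. The helper name base_type_from_raw is hypothetical, only a subset of the variants is shown, and the assumption that UNKNOWN and NONE occupy the raw values 0 and 1 is inferred from the enum above starting at 2.

```rust
/// Subset of the proposed `BaseType`; discriminants follow the listing above.
#[non_exhaustive]
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BaseType {
    U8 = 2,
    U16 = 4,
    F16 = 10,
    F32 = 11,
}

/// Hypothetical helper: convert the raw C++ `basetype` byte into the Rust type.
/// Assumption: 0 (UNKNOWN), 1 (NONE) and unrecognised values all become `None`.
pub fn base_type_from_raw(raw: u8) -> Option<BaseType> {
    match raw {
        2 => Some(BaseType::U8),
        4 => Some(BaseType::U16),
        10 => Some(BaseType::F16),
        11 => Some(BaseType::F32),
        _ => None,
    }
}

fn main() {
    assert_eq!(base_type_from_raw(0), None);
    assert_eq!(base_type_from_raw(1), None);
    assert_eq!(base_type_from_raw(11), Some(BaseType::F32));
    // Going back to the raw value is a plain cast.
    assert_eq!(BaseType::U16 as u8, 4);
}
```

Callers then handle the "no type" case with the usual Option tools instead of matching on a sentinel variant.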

@scott-wilson
Contributor Author

Fair points. I think the hard part at the moment is that the Rust API should match the C++ API as much as possible (there's a change that I'm going to revert in my code where I've changed what would be an i32 to a usize in some interfaces). If I were to make those changes, then that would have to be with the blessing of the OIIO team.

@lgritz
Collaborator

lgritz commented Feb 19, 2025

I 100% understand where you're coming from and admire the sense of purity you're striving for. There are even things about TypeDesc I wish I'd done differently in C++. But I think that redesigning every bit of our APIs is not the mission this time around. For stage 1, we just want to wrap what we have in as direct a way as we can while still being acceptable Rust. Stage 2 might be a rethink, with the 20/20 hindsight of experience using it (including the C++ side getting more experience with Rust and bringing good ideas back to the C++ side).

Otherwise where do you stop? Does the binding also ship with stuff like

#[allow(non_camel_case_types)]
type float = f32;

And then use float in any public signature, instead of f32?

No, of course not. The names of the language's types are what they are.

The names of our enum values aren't chosen to be identical to type names in C++ (for example, there are no C++ language types called UINT8 or HALF), and the names we used for Python didn't change (it's still called STRING for our Python APIs, we didn't call it STR to match Python's str type), so I don't see why we should totally overhaul them for Rust. I do see value in having consistent names (when possible) across the language bindings, as well as consistent with the command line utilities that have no exposed language at all (e.g., oiiotool in.tif -d uint16 -o out.tif).

As for TypeDesc in particular, again I am not necessarily criticizing the theoretically good ideas you have for a (maybe better) abstract design, but I am weighing them against some concrete design values at this time:

  • I think there's value -- for now -- in having as much of a direct pairing of names and underlying representations as possible between the different language bindings.
  • I anticipate a future where we not only have Rust bindings, but in fact where we are a bit fluid about which parts of OIIO are Rust-wrapping-C++ and which are C++-wrapping-Rust. So some of these fundamental types that we pass around a lot might make the jump from Rust to C++ and back quite frequently, and I'd like that to be low overhead. Sure, there will be some cases where we have to fully disassemble the parts of a C++ class and reassemble them as Rust, or vice versa. But TypeDesc seems like one of those places where it's possible to have bit-for-bit parity between the languages for the data representation (i.e. NO overhead to pass it from one to the other), and only the class methods need to be different.
  • The design of TypeDesc included the desire to keep it specifically as requiring 64 bits. What's the actual size of your proposed TypeDesc?
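
The bit-for-bit parity described above can be sketched with a #[repr(C)] struct. The name RawTypeDesc and the exact field names and order are assumptions for illustration (four one-byte fields followed by a 32-bit array length); they are not taken from the PR.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical mirror of the C++ data members. Field names and order are an
/// assumption made for this sketch.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RawTypeDesc {
    pub basetype: u8,
    pub aggregate: u8,
    pub vecsemantics: u8,
    pub reserved: u8,
    pub arraylen: i32,
}

fn main() {
    // Four bytes of tags plus one 32-bit integer: 64 bits in total.
    assert_eq!(size_of::<RawTypeDesc>(), 8);
    assert_eq!(align_of::<RawTypeDesc>(), 4);
}
```

With #[repr(C)] the field order and padding follow the C rules, so a value of this shape could cross the language boundary without conversion.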

@virtualritz

virtualritz commented Feb 20, 2025

I 100% understand where you're coming from and admire the sense of purity you're striving for. There are even things about TypeDesc I wish I'd done differently in C++. But I think that redesigning every bit of our APIs is not the mission this time around. For stage 1, we just want to wrap what we have in as direct a way as we can while still being acceptable Rust. Stage 2 might be a rethink, with the 20/20 hindsight of experience using it (including the C++ side getting more experience with Rust and bringing good ideas back to the C++ side).

This is a mistake, IMHO.

Apart from deeming the approach wrong, it's also more work, meaning more people are needed or stuff will just take longer. I'm urging you to check the progress the ASWF Rust wrappers have made in the last 2--3 years. More or less none. Who will do this stuff? This bit may be the pink elephant in the room, even though it's OT here. It was what I was trying to convey in the thread in Slack and may have failed at.

No, of course not. The names of the language's types are what they are.

So are the uppercase enum variant names. They are capitalized names of the POD types and used as such throughout the entire ecosystem of crates when a POD type has to be conveyed through an enum variant. That's what I use in BaseType. I.e. I did not come up with this. This is a Rust standard.

The names of our enum values aren't chosen to be identical to type names in C++ (for example, there are no C++ language types called UINT8 or HALF), and the names we used for Python didn't change (it's still called STRING for our Python APIs, we didn't call it STR to match Python's str type), so I don't see why we should totally overhaul them for Rust. I do see value in having consistent names (when possible) across the language bindings, as well as consistent with the command line utilities that have no exposed language at all (e.g., oiiotool in.tif -d uint16 -o out.tif).

I don't think you understand how uniform the Rust ecosystem is and how much people adhering to one set of common names and naming conventions is part of that.
What I do not understand is how the fact that C++ doesn't have good enforced standards for POD type names is a reason to then force the ones you choose to be the same in Rust.

I think adoption has higher value than consistency across languages. And adopters in Rust land, outside of the tiny, narrow field of VFX, will not be people who care about matching API in Python/C++. Because these are two of the most shunned languages by Rust people. If you ignore that you're just saying you care more about consistency than adoption.

What you get if you go down this road is just not what a "Rust binding" is in Rust land from my experience of using such bindings for years now.
It will be a weird frankenbinding (not in the cool way of frankenrender, mind you 😉) between a low level -sys FFI binding (i.e. the auto-generated part) and a safe binding that tries to match C++ as closely as possible, disregarding completely how Rust is so much better than C++ even for a simple struct like this.

  • I think there's value -- for now -- in having as much of a direct pairing of names and underlying representations as possible between the different language bindings.

I disagree. I think it's moot to write a Rust binding then. See also below re. the ArrayLen type.

For representations it should be "as necessary" and for names it should be "as possible". Which means best practices and official guidelines for the target language take precedence.

  • I anticipate a future where we not only have Rust bindings, but in fact where we are a bit fluid about which parts of OIIO are Rust-wrapping-C++ and which are C++-wrapping-Rust. So some of these fundamental types that we pass around a lot might make the jump from Rust to C++ and back quite frequently, and I'd like that to be low overhead. Sure, there will be some cases where we have to fully disassemble the parts of a C++ class and reassemble them as Rust, or vice versa. But TypeDesc seems like one of those places where it's possible to have bit-for-bit parity between the languages for the data representation (i.e. NO overhead to pass it from one to the other), and only the class methods need to be different.

I think given the time typical OIIO operations take the from()/into() conversions for such a type should never trump decisions about developer UX. They won't show in any statistics. This is btw exactly what we're seeing with the stats on the use of our binding.

But unfortunately, that's not what you're saying here.

  • The design of TypeDesc included the desire to keep it specifically as requiring 64 bits. What's the actual size of your proposed TypeDesc?

96bit. The reason is the last part, ArrayLen which is defined like so:

const I32_MAX: usize = i32::MAX as _;

#[derive(Copy, Clone, Debug, Default, Eq, PartialEq, Hash)]
#[repr(i32)]
pub enum ArrayLen {
    Specific(Refinement<u32, ClosedInterval<1, I32_MAX>>),
    #[default]
    Unspecific = -1,
}

This ensures you can only store unsigned values between 1 and i32::MAX in the Specific variant; the Unspecific one is self-explanatory.
I.e. the compiler and Rust's type system guarantees that the API is used correctly.

Those possibilities are exactly the reason you use Rust. If you replicate the C++ API that requires you to take care yourself to not store negative array lengths or make sure you use the right magic number: why make the effort to write a Rust wrapper at all?

I.e. to hark back to what you said above that "there is value" in this: I don't see it.

Lastly, all this also means I won't be able to contribute to the Rust wrapper, which is a pity.

I have very little time to work on the OIIO stuff and my mandate is that the people I work with can use it, without hand-holding and with foot guns removed, where possible.
They're all Rust devs, zero of them are C++ or Python devs, they have never used OIIO and they do not have a strong VFX background.

I.e. they're exactly the kind of people representative of the biggest group of users of an OIIO wrapper in Rust land, in the future. 😉

@lgritz
Collaborator

lgritz commented Feb 20, 2025

Maybe I don't understand the implementation of Option. How much extra space is an Option<T> versus a T?

Just picking the arraylen part to focus on, is there a reason why the following won't work (excuse me if my syntax is wrong, but I think you'll know what I mean):

#[derive(Copy, Clone, Debug, Default, Eq, PartialEq, Hash)]
#[repr(i32)]
pub enum ArrayLen {
    NotArray = 0,                                           // <-------------- my addition
    Specific(Refinement<u32, ClosedInterval<1, I32_MAX>>),
    #[default]
    Unspecific = -1,
}

which as far as I understand is bit-for-bit the same as the arraylen field of the C++ struct,
and then have the Rust TypeDesc contain an ArrayLen instead of an Option<ArrayLen>?

@scott-wilson
Contributor Author

scott-wilson commented Feb 20, 2025

Maybe I don't understand the implementation of Option. How much extra space is an Option<T> versus a T?

I can comment on the Option<T> type. This depends on the type being wrapped. For example, Option<u8> would be 2 bytes (one for the u8, and the other for the Option enum variants, since Rust uses tagged unions behind the scenes). But if you have a type such as Option<NonNull<T>>, then it'll only be the size of a pointer, because Rust knows that NonNull can never be null, so it can use the 0 value as Option::None and every other value as Option::Some (don't know how, though).
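
These size claims can be checked directly with std only. This is a small sketch; the Option<u8> size reflects what current rustc does rather than a documented layout guarantee, while the pointer case is documented.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

fn main() {
    // `u8` uses all 256 bit patterns, so `Option<u8>` needs a separate tag byte.
    assert_eq!(size_of::<Option<u8>>(), 2);
    // `NonNull<T>` is never null, so `None` can reuse the null bit pattern.
    assert_eq!(size_of::<Option<NonNull<u8>>>(), size_of::<*mut u8>());
    // References are never null either.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
}
```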

@virtualritz

virtualritz commented Feb 21, 2025

Maybe I don't understand the implementation of Option. How much extra space is an Option<T> versus a T?

As Scott wrote, it depends. In general, wrapping an enum in an Option doesn't change the size if the compiler sees it can stash the variant in unused bits.

Consider:

use std::mem::size_of;

#[repr(u8)]
enum MyEnum {
    Foo,
    Bar,
    Baz,
}

fn main() {
    // The unused discriminant values leave room to encode `None`.
    assert_eq!(size_of::<MyEnum>(), size_of::<Option<MyEnum>>());
}

Because of that, in the case at hand, the TypeDesc struct in my binding w/o ArrayLen is actually just 32 bits, despite the Options:

#[repr(C)]
pub struct TypeDescTest {
    pub base_type: Option<BaseType>,
    pub aggregate: Aggregate,
    pub vec_semantics: Option<VecSemantics>,
    pub _reserved: u8,
    // No ArrayLen.
}

assert_eq!(size_of::<TypeDescTest>(), 4);

Just picking the arraylen part to focus on, is there a reason why the following won't work (excuse me if my syntax is wrong, but I think you'll know what I mean):

#[derive(Copy, Clone, Debug, Default, Eq, PartialEq, Hash)]
#[repr(i32)]
pub enum ArrayLen {
    NotArray = 0,                                           // <-------------- my addition
    Specific(Refinement<u32, ClosedInterval<1, I32_MAX>>),
    #[default]
    Unspecific = -1,
}

which as far as I understand is bit-for-bit the same as the arraylen field of the C++ struct, and then have the Rust TypeDesc contain an ArrayLen instead of an Option<ArrayLen>?

Just to clarify again: the size increase does not come from the Option-wrapping in the case at hand.

To answer the question: because adding a None (or NotArray) variant wouldn't be Rust any more. It would be C/C++ and you'd just express it in Rust syntax. Option is part of the language so to speak.

Dozens of patterns that are common in Rust will not work with ArrayLen because they would need to be re-implemented.

For example:

let my_vec = my_type_desc.array_len
    .and_then(|array_len| {
         // Do something applying to arrays.
    })
    .ok_or_else(|| {
         // Do something else when it's not an array.
    })?;

That is Rust. Rust may look familiar to C/C++ when you are a beginner but it looks quite a bit different once you start writing idiomatic code. Above is a completely made-up example but just check the methods available on Option.

Whenever you have something that can error, you wrap it in a Result and whenever you have something that can be invalid/null/doesn't apply/etc., you wrap it in an Option. This is canonical Rust. This also means anything that is made to work with Option (or Result) will then work with that (wrapped) type.

Adding a None (or NoArray) variant is simply not Rust. It's C/C++. It will result in code that looks and feels like C/C++. The above snippet will just be a bunch of ifs. Procedural code that, after some months of Rust, which is very functional and hence concise, will be difficult on the eyes.
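
A runnable version of the idea, as a minimal sketch: TypeDesc here is a simplified, hypothetical stand-in with a plain Option<u32> for the array length, and element_count is an invented helper, not part of any binding.

```rust
/// Hypothetical, simplified stand-in for the struct under discussion.
pub struct TypeDesc {
    pub array_len: Option<u32>,
}

/// Returns the element count for array types, or an error message otherwise.
pub fn element_count(desc: &TypeDesc) -> Result<usize, String> {
    desc.array_len
        .map(|len| len as usize)
        .ok_or_else(|| "not an array".to_string())
}

fn main() {
    let array = TypeDesc { array_len: Some(3) };
    let scalar = TypeDesc { array_len: None };

    assert_eq!(element_count(&array), Ok(3));
    assert_eq!(element_count(&scalar), Err("not an array".to_string()));
    // Defaults compose without an explicit `if`.
    assert_eq!(scalar.array_len.unwrap_or(1), 1);
}
```

With a NotArray variant inside the enum, each of these calls would have to be replaced by a hand-written match.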

In the case at hand the size increase comes from the fact that the Specific variant stores a 32bit value. The compiler can't understand that some bits are not used.

Rust enums have a hidden discriminant which is the number of the variant (see Scott's comment -- they're essentially tagged unions). That will take at least another 8bit and gets padded to 32bit for alignment. So the whole ArrayLen enum is 64bit.
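
The effect of the hidden discriminant can be observed with std types alone. In this sketch PlainLen and NicheLen are hypothetical stand-ins (the refined type is replaced by u32 and NonZeroU32), and the sizes are what current rustc produces for default-representation enums, not a language guarantee.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// The payload uses every bit pattern, so a separate tag is required.
#[allow(dead_code)]
enum PlainLen {
    Specific(u32),
    Unspecific,
}

/// The payload can never be 0, so the compiler can encode `Unspecific` as 0.
#[allow(dead_code)]
enum NicheLen {
    Specific(NonZeroU32),
    Unspecific,
}

fn main() {
    // 4 bytes of payload plus a tag, padded to the 4-byte alignment.
    assert_eq!(size_of::<PlainLen>(), 8);
    // The unused zero value of the payload doubles as the tag.
    assert_eq!(size_of::<NicheLen>(), 4);
}
```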

To cram it into 32bit, the Specific variant could be made 16bit only wide but then we would not map to the C++ API well any more (array lengths of 65k max seem like a footgun in waiting too):

use refined::{boundable::unsigned::ClosedInterval, Refinement};
use std::mem::size_of;

const U16_MAX: usize = u16::MAX as _;
const I32_MAX: usize = i32::MAX as _;

pub enum ArrayLenSmall {
    Specific(Refinement<u16, ClosedInterval<1, U16_MAX>>),
    #[default]
    Unspecific = -1,
}

pub enum ArrayLen {
    Specific(Refinement<u32, ClosedInterval<1, I32_MAX>>),
    #[default]
    Unspecific = -1,
}

assert_eq!(size_of::<Option<ArrayLenSmall>>(), 4);
assert_eq!(size_of::<Option<ArrayLen>>(), 8);

TL;DR: None variants instead of wrapping in an Option will make any Rust dev cringe. Not only does it 'feel' wrong. You also can't write canonical code any more unless you implement all the methods of Option on your enum yourself. And that would still not be the same because the enum still isn't wrapped. So stuff like unwrap_or_else() can't be re-implemented (or let's say you could implement it but it would be a lie because there isn't anything to unwrap).

@lgritz
Collaborator

lgritz commented Feb 21, 2025

Leaning into using Option, I think that maybe we should just eliminate the basetype NONE/UNDEFINED enum value entirely. When we don't know or don't want to specify a type, we don't want ANY of these fields, so it seems to me that the right way to use Option is that some functions or data might have an Option<TypeDesc> or Option<BaseType>, rather than have the basetype field within TypeDesc be an Option. Any time you don't have a basetype, you shouldn't have any of the typedesc.

I'm on the fence about VecSemantics (which I would like to rename Semantics, since it has long since grown beyond just describing the transformation semantics of a vector). As a general design, I can see the merit of an Option, but also I'm not yet convinced that it's a bad thing to use an enum value. I dunno. If Scott also agrees with you, then I guess I'm ok with it for now and we'll see how it goes.

So, to summarize where I'm at on all the issues:

  • basetype: I prefer no Option in the struct, but let's eliminate the NONE/UNDEFINED from the enum to make the option unnecessary.
  • vecsemantics: rename as Semantics (lose the "Vec"), and I have no strong opinion on whether the struct holds an option. I wouldn't, but if you and Scott both agree that it's more idiomatic Rust and it doesn't add to the size of the struct, I'll go along with it.
  • arraylen: this one really bothers me, but if Scott agrees with you that it's much better to have this be an Option in the struct, I'm willing to begrudgingly go along with it.
  • Optional type: I'm in favor of liberally using Option<TypeDesc> elsewhere in the code in places where we want to indicate that there is no type or no preference. I think this is better than having an Option<BaseType> inside the struct.

On the wholesale renaming of the BaseType enum tags: hard no from me on that one. They don't match "C++" in their other uses, so they don't need to be converted to match Rust names here. I understand that if Rust was the only/first binding we had, maybe we would have chosen names that matched Rust nomenclature (and in that alternate universe, I would similarly not let the C++ bindings deviate from the established names). On this matter, I put more weight on having uniform nomenclature between C++, Python, Rust, command line utils like oiiotool, and concepts we talk about in the docs, even if it makes any or all of the individual language bindings slightly less familiar.

Separate issue: I have sometimes wished that on the C++ side, I had made the data fields private and added access methods. Does idiomatic Rust prefer accessors or direct struct member use by client code? If the former, I sure don't mind doing it here, and it's another way to allow us to make the API a bit more insulated from any back-and-forth about how we represent the data underneath.

@scott-wilson
Contributor Author

Just going to quickly summarize in my own words with my thoughts.

  • The sys crate should still have the APIs that match what's happening on the C++ side a bit more, since that's sort of the translation layer.
  • BaseType: Replace the BaseType::NONE/BaseType::UNDEFINED with Option<BaseType> in all inputs and outputs where it makes sense to say "This may or may not have a defined base type". That seems fine to me.
  • Drop the Vec from VecSemantics. Works for me.
  • arraylen: I'm honestly confused on this one. I think that if we want to have the arraylen be guaranteed to be 0 - i32::MAX, then it would probably make more sense to go the direction of something like https://doc.rust-lang.org/std/num/type.NonZeroI32.html. This would probably be useful in other parts of the API such as getting/setting widths, heights and other int like things. Or, does -1 have a special meaning as well?

Also, side note/question that I realize that I forgot to mention. I've dropped fields such as BaseType::Int because (if my C++ knowledge is accurate), this is an alias of BaseType::Int32. Rust doesn't allow this, so I went with whatever gave more information to the developer. If this is an issue, then I can see what the options are. Also, I swapped from BaseType::INT32 to BaseType::Int32 because otherwise I'd get warnings from the compiler and clippy. My rough philosophy with the Rust side of the API is "Follow what the C++ API looks like, unless I can't, get warnings in the compiler/clippy, or should use Result<T, E>".
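
If an alias such as Int is ever wanted on the Rust side, one possible option is an associated constant, since two variants cannot share a discriminant. This is only a sketch; the constant name INT and the reduced variant list are assumptions for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum BaseType {
    Int16 = 5,
    Int32 = 7,
}

impl BaseType {
    /// Hypothetical alias for `Int32`, expressed as an associated constant
    /// rather than a second variant with the same discriminant.
    pub const INT: BaseType = BaseType::Int32;
}

fn main() {
    assert_eq!(BaseType::INT, BaseType::Int32);
    assert_eq!(BaseType::INT as u8, 7);
    assert_ne!(BaseType::INT, BaseType::Int16);
}
```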

As for your question about getters/setters vs directly accessing the attributes, I've heard arguments for both sides. If I recall, the main negative for the getters/setters is I've heard that it can mess with the borrow checker (take this with a grain of salt, my memory on this is fuzzy at best). Personally, I prefer having all of my attributes private, and having specific getters and setters that control how the user of the API accesses my data. But, there are some cases where it makes sense to have direct access to the attributes, such as with the options struct that @virtualritz mentioned when you have optional arguments for a function.

@lgritz
Collaborator

lgritz commented Feb 21, 2025

arraylen: I'm honestly confused on this one. I think that if we want to have the arraylen be guaranteed to be 0 - i32::MAX, then it would probably make more sense to go the direction of something like https://doc.rust-lang.org/std/num/type.NonZeroI32.html.

Aha, the docs say

This enables some memory layout optimization. For example, Option<NonZeroI32> is the same size as i32:

use std::mem::size_of;
assert_eq!(size_of::<Option<core::num::NonZeroI32>>(), size_of::<i32>());

So presumably, under the covers, it's using the 0 value to indicate that the option is None?

You know what? I'd be ok with that as the solution to the arraylen issue. It satisfies my inclination to keep the underlying representation the same size and layout, even if the language expression is different between Rust and C++.
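
As a sketch of that layout argument: a struct with four one-byte fields and an Option<NonZeroI32> stays at 64 bits. TypeDescSketch and its raw u8 fields are hypothetical placeholders, not the PR's types.

```rust
use std::mem::size_of;
use std::num::NonZeroI32;

/// Hypothetical layout sketch: four one-byte fields plus the array length.
#[repr(C)]
pub struct TypeDescSketch {
    pub base_type: u8,
    pub aggregate: u8,
    pub semantics: u8,
    pub reserved: u8,
    pub array_len: Option<NonZeroI32>,
}

fn main() {
    assert_eq!(size_of::<Option<NonZeroI32>>(), size_of::<i32>());
    assert_eq!(size_of::<TypeDescSketch>(), 8);

    // A raw length of 0 maps to `None`, i.e. "not an array".
    let scalar = TypeDescSketch {
        base_type: 11,
        aggregate: 1,
        semantics: 0,
        reserved: 0,
        array_len: NonZeroI32::new(0),
    };
    assert!(scalar.array_len.is_none());
}
```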

This would probably be useful in other parts of the API such as getting/setting widths, heights and other int like things. Or, does -1 have a special meaning as well?

We never have negative image dimensions. I think it's probably ok to use unsigned for that rather than signed.

Also, side note/question that I realize that I forgot to mention. I've dropped fields such as BaseType::Int because (if my C++ knowledge is accurate), this is an alias of BaseType::Int32.

Correct, and that's fine. I prefer the names that include the size even on the C++ side.

@lgritz
Collaborator

lgritz commented Feb 21, 2025

Or, does -1 have a special meaning as well?

For arraylen, on the C++ side, we use -1 to signify "it's an array, but we don't know the length". I'm not 100% sure that this is used in OIIO anywhere, but we certainly use it in OSL, which also uses OIIO's TypeDesc.

@virtualritz

virtualritz commented Feb 22, 2025

arraylen: I'm honestly confused on this one. I think that if we want to have the arraylen be guaranteed to be 0 - i32::MAX, then it would probably make more sense to go the direction of something like https://doc.rust-lang.org/std/num/type.NonZeroI32.html. [...]

I'm not sure how that will help for array_len. The range of NonZeroI32 is i32::MIN to i32::MAX, excluding 0.
So you can store any negative 32 bit integer in there (including -1, which has special meaning, i.e. you could even create the other enum variant by doing this, very bad).

The ArrayLen enum I proposed above upholds all invariants of the C++ API. NonZeroI32 would still allow you to create a bunch of meaningless values on the Rust side, i.e. NonZeroI32::new(-42).is_some() == true. What does array_len == -42 convey?

That's precisely why I used refined.
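
The objection can be reproduced with std alone; this small check shows which values NonZeroI32 accepts and rejects.

```rust
use std::num::NonZeroI32;

fn main() {
    // Only zero is rejected.
    assert!(NonZeroI32::new(0).is_none());
    // Negative values, including the -1 sentinel, are all accepted.
    assert!(NonZeroI32::new(-1).is_some());
    assert!(NonZeroI32::new(-42).is_some());
    assert!(NonZeroI32::new(i32::MIN).is_some());
}
```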

@virtualritz

Separate issue: I have sometimes wished that on the C++ side, I had made the data fields private and added access methods. Does idiomatic Rust prefer accessors or direct struct member use by client code? If the former, I sure don't mind doing it here, and it's another way to allow us to make the API a bit more insulated from any back-and-forth about how we represent the data underneath.

I would use private struct fields if a user can easily create invalid combinations of public fields. And they could be mixed too, ofc.
Same reasoning: developer UX always should come first. The setters would make sure invalid combinations can't be created.

If there are no invalid combinations, go with public struct fields. The code setting the struct is commonly easier on the eyes than when using setters.

The middle way is a builder pattern. Then you need to decide if the struct allows building itself or if there is a separate builder struct whose build() method produces a TypeDesc.
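
A minimal sketch of the separate-builder variant, with private fields and read-only accessors. All names here (TypeDescBuilder, the reduced enums, the accessor methods) are hypothetical and chosen only for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BaseType {
    U8,
    F32,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Aggregate {
    Scalar,
    Vec3,
}

/// Fields are private, so values can only be created through the builder.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TypeDesc {
    base_type: BaseType,
    aggregate: Aggregate,
    array_len: Option<u32>,
}

impl TypeDesc {
    pub fn builder(base_type: BaseType) -> TypeDescBuilder {
        TypeDescBuilder { base_type, aggregate: Aggregate::Scalar, array_len: None }
    }
    pub fn base_type(&self) -> BaseType { self.base_type }
    pub fn aggregate(&self) -> Aggregate { self.aggregate }
    pub fn array_len(&self) -> Option<u32> { self.array_len }
}

pub struct TypeDescBuilder {
    base_type: BaseType,
    aggregate: Aggregate,
    array_len: Option<u32>,
}

impl TypeDescBuilder {
    pub fn aggregate(mut self, aggregate: Aggregate) -> Self {
        self.aggregate = aggregate;
        self
    }
    pub fn array_len(mut self, len: u32) -> Self {
        self.array_len = Some(len);
        self
    }
    /// A real binding could validate field combinations here.
    pub fn build(self) -> TypeDesc {
        TypeDesc {
            base_type: self.base_type,
            aggregate: self.aggregate,
            array_len: self.array_len,
        }
    }
}

fn main() {
    let desc = TypeDesc::builder(BaseType::F32)
        .aggregate(Aggregate::Vec3)
        .array_len(4)
        .build();
    assert_eq!(desc.aggregate(), Aggregate::Vec3);
    assert_eq!(desc.array_len(), Some(4));

    let plain = TypeDesc::builder(BaseType::U8).build();
    assert_eq!(plain.aggregate(), Aggregate::Scalar);
    assert_eq!(plain.array_len(), None);
}
```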

@scott-wilson
Contributor Author

Sorry, to be specific, I wasn't saying that we should use NonZeroI32 specifically, but that it would make sense to have a type like it for the array length. The API might look something like this:

pub struct ArrayLength(i32);

let array_len = ArrayLength::new(1)?;  // Returns an error if the number is less than 1 (or less than 0, whichever we prefer).
let undefined_array_len = ArrayLength::undefined();  // This will return ArrayLength(-1)

But honestly, I need to give this more thought. But, at the end of the day, if we make the interfaces like the below, then this should be a bit moot.

pub fn new<A: TryInto<ArrayLength>>(base_type: BaseType, agg: Aggregate, semantics: VecSemantics, arraylength: A) -> Result<Self, OiioError> {...}

@virtualritz

This would probably be useful in other parts of the API such as getting/setting widths, heights and other int like things. Or, does -1 have a special meaning as well?

We never have negative image dimensions. I think it's probably ok to use unsigned for that rather than signed.

You would absolutely do that on the Rust side anyway. Incl. use of NonZeroU... types when the value should exclude 0.
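
For image dimensions this could look roughly like the following sketch. ImageSize and its methods are hypothetical names, not part of the PR.

```rust
use std::num::NonZeroU32;

/// Hypothetical image-size type: both dimensions are unsigned and non-zero.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ImageSize {
    pub width: NonZeroU32,
    pub height: NonZeroU32,
}

impl ImageSize {
    /// Returns `None` if either dimension is zero.
    pub fn new(width: u32, height: u32) -> Option<Self> {
        Some(Self {
            width: NonZeroU32::new(width)?,
            height: NonZeroU32::new(height)?,
        })
    }

    pub fn pixel_count(&self) -> u64 {
        u64::from(self.width.get()) * u64::from(self.height.get())
    }
}

fn main() {
    assert!(ImageSize::new(0, 10).is_none());
    let hd = ImageSize::new(1920, 1080).unwrap();
    assert_eq!(hd.pixel_count(), 2_073_600);
}
```

Negative sizes cannot be expressed at all, and zero is rejected at construction.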

It's one of the forever WTFs in C/C++ land, the use of int for stuff that by definition (i.e. API use error) can never be negative.

@virtualritz

Sorry, to be specific, I wasn't saying that we should use NonZeroI32 specifically, but that it would make sense to have a type like it for the array length. The API might look something like this:

Yeah, and that type is what goes into the Specific variant. You can give it a name (which I omitted in my sample code for brevity):

const I32_MAX: usize = i32::MAX as _;

/// This *is* your type for the array length:
pub type Len = Refinement<u32, ClosedInterval<1, I32_MAX>>;

#[derive(Copy, Clone, Debug, Default, Eq, PartialEq, Hash)]
#[repr(i32)]
pub enum ArrayLen {
    Specific(Len),
    #[default]
    Unspecific = -1,
}

pub struct ArrayLength(i32);

let array_len = ArrayLength::new(1)?;  // This will return a result if the number is less than 1 or 0, whichever we prefer.
let undefined_array_len = ArrayLength::undefined();  // This will return ArrayLength(-1)

Ouch. I think I'm giving up now. Apart from the fact that this should be TryFrom and not new(): these things are not the same. Your version is just not Rust. Why have an undefined() function when you can have an enum variant expressing the same thing? What crate ever uses an API like this???

This is not only about making this feel natural to a Rust developer. While the refine crate is young, the approach I chose for my ArrayLen will also allow for some static optimizations very soon (pending an issue, see e.g. here).
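For readers who want to try the enum-plus-`TryFrom` shape without extra dependencies, here is a std-only sketch. It swaps the refine crate's `Refinement` type for `std::num::NonZeroU32`, which is a different technique: it excludes 0 but does not enforce the `i32::MAX` upper bound in the type. `InvalidArrayLen` and the treatment of -1 as the "unspecific" sentinel are assumptions taken from the discussion above.

```rust
use std::convert::TryFrom;
use std::num::NonZeroU32;

#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, Hash)]
pub enum ArrayLen {
    /// A concrete length of at least 1.
    Specific(NonZeroU32),
    /// No length given; corresponds to -1 on the C++ side (assumed).
    #[default]
    Unspecific,
}

/// Hypothetical error carrying the rejected raw value.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct InvalidArrayLen(pub i32);

impl TryFrom<i32> for ArrayLen {
    type Error = InvalidArrayLen;

    fn try_from(raw: i32) -> Result<Self, Self::Error> {
        match raw {
            -1 => Ok(ArrayLen::Unspecific),
            // Negative values fail the u32 conversion, 0 fails NonZeroU32::new.
            n => u32::try_from(n)
                .ok()
                .and_then(NonZeroU32::new)
                .map(ArrayLen::Specific)
                .ok_or(InvalidArrayLen(n)),
        }
    }
}

impl From<ArrayLen> for i32 {
    fn from(len: ArrayLen) -> i32 {
        match len {
            // Saturates for values above i32::MAX; a bounded refinement type
            // would make that case unrepresentable instead.
            ArrayLen::Specific(n) => i32::try_from(n.get()).unwrap_or(i32::MAX),
            ArrayLen::Unspecific => -1,
        }
    }
}
```

Callers can then `match` on the two variants instead of comparing against a magic number, and the raw `i32` only appears at the FFI boundary.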

@virtualritz

On the wholesale renaming of the BaseType enum tags: hard no from me on that one. They don't match "C++" in their other uses, so they don't need to be converted to match Rust names here.

@lgritz I don't understand the rationale here. The users of the API are Rust developers. They know these tags by heart. They do not know the ones on the C++ side.

@lgritz
Collaborator

lgritz commented Feb 24, 2025

@lgritz I don't understand the rationale here. The users of the API are Rust developers. They know these tags by heart. They do not know the ones on the C++ side.

There are at least three distinct sets of people I am considering here:

  1. Experienced Rust developers new to OIIO, who will only use the Rust APIs, and can be blissfully unaware of existing OIIO terminology or the other language bindings. They are probably best served, in some sense, by starting with a fresh slate and not caring about anything that came before their arrival on the scene.
  2. Current users of OIIO who may or may not have much Rust experience (some will, some won't) and would like to start developing new software in Rust, but are currently hamstrung by all the usual foundational libraries they use in VFX not having Rust bindings. These people will probably have to develop both C++ and Rust based apps for the foreseeable future. They are probably fine with having any working Rust bindings at all, and would probably prefer at first to make maximal use of what they already know about OIIO nomenclature and APIs.
  3. The current and future maintainers of OIIO, who would appreciate the simplification of common nomenclature to use (and document) across all the tools and bindings. They are best served by simple nomenclature and documentation that are as consistent as possible across the array of OIIO facets and uses.

While I admire your aspirations for what we could eventually be for the Rust ecosystem, I think of group 1 as only a minor constituency for the next couple of years. I think and hope they will become more important in the future, but today they don't even know we exist.
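One possible way to serve several of these groups at once, offered only as a sketch and not as something either side has proposed: keep Rust-style variant names, attach the C++ spellings as `#[doc(alias)]` search terms, and expose them as associated constants. The variant subset and the `cpp_name` helper are assumptions for illustration.

```rust
/// Trimmed-down, hypothetical version of the BaseType enum.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum BaseType {
    #[doc(alias = "UNKNOWN")]
    Unknown,
    #[doc(alias = "UINT8")]
    UInt8,
    #[doc(alias = "HALF")]
    Half,
    #[doc(alias = "FLOAT")]
    Float,
}

impl BaseType {
    // C++-style spellings as associated constants, so code written by
    // existing OIIO users can keep the names it already knows.
    pub const UNKNOWN: BaseType = BaseType::Unknown;
    pub const UINT8: BaseType = BaseType::UInt8;
    pub const HALF: BaseType = BaseType::Half;
    pub const FLOAT: BaseType = BaseType::Float;

    /// The tag name as spelled on the C++ side.
    pub fn cpp_name(self) -> &'static str {
        match self {
            BaseType::Unknown => "UNKNOWN",
            BaseType::UInt8 => "UINT8",
            BaseType::Half => "HALF",
            BaseType::Float => "FLOAT",
        }
    }
}
```

The `doc(alias)` attributes make a rustdoc search for `UINT8` land on `BaseType::UInt8`, at the cost of maintaining two spellings per tag.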

@virtualritz

I think this is a lost cause for me. I don't know how to say this better: I don't think there is enough Rust experience here, especially understanding of the type system and how big a role unified naming plays in the accessibility of the language's ecosystem, to make these calls.
And I mean that really in a kind and caring way.
The short version is that I can't help with these aspects because of the lengthy discussions that therefore ensue.
As I said before: I am very limited in the time I can contribute to this. The only group I get paid to care about when doing any OIIO wrapper stuff is the people in group 1.
Making people in groups 2 & 3 feel fluffy is a non-goal for me. And I believe it's an inroad to getting a bad wrapper.
If you treat the Rust binding like "just another binding" you will get just that.
What you won't get is a Rust binding.
👋

@scott-wilson
Contributor Author

Currently blocked by dtolnay/cxx#1519 - It looks like some of the functions are passing the BaseType as the wrong value (some incredibly large number instead of a more reasonable value) on the C++ side. This might be a bug in CXX or not.

@lgritz
Collaborator

lgritz commented May 19, 2025

Just as a tip: I don't find "merging" to be very useful in a long-lived feature development branch. Personally, I prefer to "git rebase main" periodically so that the feature branch commits stay simple, linear, and on top of the current main, rather than ending up with a complex tangle of multiple branch-to-branch merges from different points in the past.

- Remove some test code
- Disable Rust building on `ICC` and `ICX`

Signed-off-by: Scott Wilson <[email protected]>
…b dirs via environment variables

Signed-off-by: Scott Wilson <[email protected]>