Skip to content

refactor: Remove dyn Any usage in BufferElem #1672

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Gelbpunkt
Copy link
Member

This gets rid of the Box<dyn Any> in BufferElem and makes it a thin wrapper around Vec<u8>.

@Gelbpunkt
Copy link
Member Author

Seems like this depends on #1671, I didn't compile check it when splitting up my branch.

@Gelbpunkt Gelbpunkt marked this pull request as draft April 7, 2025 14:54
@mkroening mkroening self-assigned this Apr 7, 2025
@mkroening mkroening self-requested a review April 7, 2025 15:05
where
T: Any,
{
pub fn pop_front_deserialize<T>(&mut self) -> Option<Box<T, DeviceAlloc>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are giving up some type safety here. As long as the alignment matches, we can cast any buffer into a box of any type (or a vector in the case of pop_front_vec), right?

Copy link
Member Author

@Gelbpunkt Gelbpunkt Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We didn't really have any "type safety" before. The behaviour change is that previously, it was e.g. possible to have [BufferElem::Sized, BufferElem::Vector] (pseudo-code) and then calling pop_front_vector would return None, while now it won't complain and instead happily return you a Vec<u8>. That isn't necessarily wrong and all conversions here are safe. Deserializing into the required type by the caller is left to the caller, we only ensure that the deserialization is sound.

If you had the order wrong before, e.g. receive_packet would return None, which is arguably wrong since it would not really detect that there was an issue in the code. Now, it lets you treat the buffers as what they are (buffers), and you cannot really "mishandle" it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the downcasting operation would check the variant (and the type ID in the case of the Sized variant) of the box during downcasting and would error if a dynamic box of incorrect type was provided (The vectors are a special case as Rust did not [and probably still does not] allow us to handle Box<[u8]>s as Box<dyn Any>).

I think it is better to catch early when we are using a buffer of incorrect type. For example, we could forget to pop the header of a read operation and try to pop a vector from it directly. Currently, this will result in an error as the read buffer should be of the Vector variant but the first buffer is BufferElement::Sized. With the proposed changes, a vector also containing the VIRTIO header will be returned. If that buffer is passed on to the network stack, for example, the error will happen there as what should only be the frame will contain the VIRTIO header content that is not expected and cannot be handled by the network stack. Errors in the networking stack are particularly annoying as transmission errors are expected and result in repeated attempts instead of outright errors.

Another scenario would be the queue returning the incorrect transfer to the caller because of a buffer ID mix-up. The type ID check can allow us to catch early when we are given the result of an open operation, for example, when we were expecting a write result. Because the header type IDs are different, the downcast would return None.

Of course, such hypotheticals depend on us making mistakes in our code but we do make mistakes and early error catching precautions make debugging much easier.

T: IntoBytes + Immutable,
{
fn from(value: T) -> Self {
Self(value.as_bytes().to_vec_in(DeviceAlloc))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces a copy that we were previously able to avoid.

@cagatay-y
Copy link
Contributor

Is there a particular reason why we want to get rid of the Box<dyn Any> usage?

@Gelbpunkt
Copy link
Member Author

Is there a particular reason why we want to get rid of the Box<dyn Any> usage?

My motivation for this is twofold. First, a Vec<u8> more closely represents what this actually is: a (partially initialized) buffer that later gets deserialized into e.g. a header of some kind. It makes the code easier to reason about, and viewing things as buffers rather than "heap allocations of any type" makes even more sense if we start using zerocopy more, since that operates on buffers. Second, dyn Trait is an antipattern. Calling methods on dyn Trait objects requires a virtual function table lookup and usually they prevent compiler optimizations. While that shouldn't matter too much here, I think dynamic typing should be avoid where possible, so if we can get rid of it, we should.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants