Skip to content

Conversation

@LukasKalbertodt
Copy link
Member

This discussion PR is for everything that is not covered by more specialized PRs.

To add new comments, go into the "files changed" tab and comment on individual lines or sections of lines. Or you can comment on existing discussion threads here. If you feel like you want to start a discussion about a broader topic (than something mainly referencing a few lines), consider open a new discussion here.

At most, it stores a path relative to the configured `storage.dir`, but potentially in an even more implicit way.


(¹) File system = local file system, or NFS, or S3 storage or potentially others.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are so bold, maybe make S3 more like the default location 🙈?

Do we still differentiate between asset manager and delivery for locations? If we mix the two, this can be difficult for configuring serving files. If we just put everything in one storage.dir and expose that to the webserver, how to check what files should not be allowed to be served (e.g. source files).?


## Promised properties

This data model promises certain properties about certain fields/data, for example: "there is a non-empty title", "this is an array of strings" or "the duration matches the duration all tracks".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the duration matches the duration all tracks"

I strongly disagree with that statement. Not all tracks in a "media package / events / whatever this is now called" should be required to have the same duration. In practice, not only is this strictly not true (e.g. stopping could be executed at slightly different times resulting in a difference of milliseconds to seconds), but we have use cases in Opencast where various tracks of different lengths are uploaded (e.g. concat partially recorded videos, video conference recordings, intro- outro if you don't use themes, ...). I don't see a problem with the way this is done right now where every track has its own duration. For Tobira, you will instrument the player with a specific track. Then just use the duration of that track.


Ideally, there shouldn't be a separate `search` endpoint anyway, but rather have the search feature be part of the external event API.
As an API user, I don't care what indices or data structures Opencast uses to give me the data.
And now that we use ElasticSearch/OpenSearch, there is no reason why there are nodes that couldn't perform that search.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The external API actually uses OpenSearch and you can also search, but it's different from the search endpoint.

In general, I agree. But I also see valid use cases for having (internal) APIs that surface data differently e.g. for Tobira or the Admin UI. However, this should probably be an exception if there is an actual reason.

- `NonBlankAsciiString`: A `NonBlankString` that is also restricted to only using ASCII characters.
- `Label`: a `NonBlankAsciiString` that only consists of letters, numbers or `-._~!*:@,;`. This means a label is URL-safe except for use in the domain part.<sup>(2?)</sup>
- `ID`: a `Label` that cannot be changed after being created.
- `Username`: TODO define rules for usernames
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


:::danger[Open questions]

- (1?) Java famously has no/bad support for unsigned integers. Decide how to deal with that: do we just give up one bit or do we require proper unsigned usage via `Integer.*Unsigned` methods? Either way: these values must never be negative!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use validation annotations? So yes give up one bit.

- Arrays as arrays
- Tuples as arrays
- `Map<string, string>` is serialized as object
- `DateTime`: as ISO 8601-compatible formatted string. The ISO standard actually allows a number of different formats by ommitting parts of the string. Opencast shall format all date times as either `YYYY-MM-DDTHH:mm:ss.sssZ` or `YYYY-MM-DDTHH:mm:ssZ`, i.e. only the sub-second part is optional. The parts on this format string are best described in [the ECMAScript specification](https://tc39.es/ecma262/multipage/numbers-and-dates.html#sec-date-time-string-format) (which again, is a subset of ISO 8601). Only thing of note: `Z` could either be literal `Z` or a timezone offset like `+02`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why allow variability? Maybe always require sub-seconds and UTC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mh I guess the subsecond thing is fair, the only reasons for the possible omission I can come up with could also be used to argue for the possible omission of seconds.

The timezone is different tho: yes, when you just want to communicate an instant, always UTC is a good idea. But sometimes you want to communicate the timezone as well, so that the frontend could for example show "this lecture starts at 10am local time (which is 2pm your time)". For that to work, the frontend need to know what the local time is. Of course we can argue whether that is ever important, but to me it feels semantically correct to store stuff like bibliographical date with timezone. Sure, technical timestamps like updated/modified: always UTC, attaching the server timezone makes no sense.

This was referenced Apr 1, 2025
Comment on lines +77 to +79
Ideally, there shouldn't be a separate `search` endpoint anyway, but rather have the search feature be part of the external event API.
As an API user, I don't care what indices or data structures Opencast uses to give me the data.
And now that we use ElasticSearch/OpenSearch, there is no reason why there are nodes that couldn't perform that search.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KatrinIhler said:

Don't mind merging /search & /api in general (but will break a loot of integrations :(), but

  • search is read-only & can theoretically run on multiple nodes, the external API currently isn't and can't

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhhh. We need to talk about the technical details (where I lack lots of knowledge), but I would certainly prefer if the API is designed just with "nice API" in mind and not leak internal implementation details like services.

Comment on lines +14 to +15
- `read`: generally, gives read access to an entity
- `write`: generally, gives write access to an entity (changing or deleting it)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KatrinIhler said:

Imo we could even say that write needs/implies read, as we do in a couple of places already

ACLs control access to Opencast entities.
An ACL is simply a list of `role` + `action` pairs.
An entry gives users that have that particular `role` the permission to perform the specified `action` on the entity.
Both, `role` and `action` have the [type `Label`](./types).<sup>(1?)</sup>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KatrinIhler wrote:

Imo roles need to allow the same characters as username unless we do matching of role and user in a different way altogether. Sanitation is the devil.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants