Expose desktop capturer #725
base: main
Conversation
I will give this a closer look and test it this week. Big thanks for adding a new example program!
EDIT: Never mind, I checked out the wrong branch.
How do I build the example? I don't see any documentation on how this build system works, and libwebrtc-sys's build.rs is quite complicated.
That build_linux.sh script fails, and I'm puzzled by what it's doing. It seems to be downloading a bunch of stuff, including a Debian image or something?? And building Abseil... does libwebrtc depend on Abseil? Then it fails to find a header from glibc.
I think I need to rebase #648 to get this to build... this build system needs some work.
After 3 days of yak shaving, I got libwebrtc to build locally with #730. I had to make a few changes to this branch to get it to build, so I made a PR for your fork: gethopp#2. Now I am wondering what LIVEKIT_URL to use for development. Is there a test server I can use? Or a tool for running a test server locally?
@Be-ing you can either
I tested and this works on KDE Plasma Wayland! 🎉 I will do more extensive testing on different desktops, Wayland and X11, tomorrow.
@Be-ing thanks for testing it. I haven't found time during the week to check X11; I will do it over the weekend.
Force-pushed from 21c54ef to c60d7be.
Could you rebase this branch on the main branch to incorporate #731, add the new example to the workspace root Cargo.toml, and set
I opened another PR for your fork: gethopp#3
I've tested this on: and confirm both desktop and window capture work on all of them.
examples/screensharing/src/main.rs
Outdated
let source_type = if args.capture_window {
    DesktopCaptureSourceType::WINDOW
} else {
    DesktopCaptureSourceType::SCREEN
};
let mut options = DesktopCapturerOptions::new(source_type);
Oh I like this API better, nice idea. It's a bit odd that the WINDOW and SCREEN enum variants are all caps though.
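For illustration, idiomatic Rust would usually spell these variants in CamelCase, roughly like this (a hypothetical renaming, not the current API):

// Hypothetical sketch; the current API uses WINDOW/SCREEN.
pub enum DesktopCaptureSourceType {
    Window,
    Screen,
}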
webrtc-sys/build/src/lib.rs
Outdated
if cfg!(target_os = "linux") {
    files.push(desktop_capture_path);
} else if desktop_capture_path.exists() {
    files.push(desktop_capture_path);
}
The conditions of this if-else block don't make much sense to me. Shouldn't this just check if the path exists regardless of target_os?
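Something like this is what I had in mind (a sketch of the simplification, not tested):

// Push the generated file whenever it exists, regardless of the target OS.
if desktop_capture_path.exists() {
    files.push(desktop_capture_path);
}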
I just got lazy here. I will update the scripts for every platform.
examples/screensharing/src/main.rs
Outdated
let stream_width = 1920;
let stream_height = 1080;
let buffer_source =
    NativeVideoSource::new(VideoResolution { width: stream_width, height: stream_height });
let track = LocalVideoTrack::create_video_track(
    "screen_share",
    RtcVideoSource::Native(buffer_source.clone()),
);
I'm confused about what the right thing to do is here for the resolution. Unless I'm missing something, libwebrtc doesn't expose an API to get the resolution of a DesktopCapturer::Source; you can only get the resolution of the captured screen/window from the DesktopFrame passed to the callback after the stream has started. Nor is there a cross-platform way to change the resolution of the VideoTrackSource after it is created. There is a method on DesktopCapturer, void UpdateResolution(uint32_t width, uint32_t height), but it is guarded by defined(WEBRTC_USE_PIPEWIRE) || defined(WEBRTC_USE_X11).
Maybe someone from LiveKit has some idea about this?
Digging deeper into the NativeVideoSource code, it seems the requirement for the video resolution is for the scaling performed by VideoTrackSource::InternalSource::on_captured_frame. I was wondering if this was a requirement imposed by libwebrtc somehow, but in that case, it wouldn't make sense that libwebrtc doesn't provide an API to get the resolution before starting the stream. If I understand that correctly, then I think we can add a method to NativeVideoSource to change the VideoResolution after the NativeVideoSource is created and call that in the callback. Or better yet, we could add a method to VideoTrackSource that takes a DesktopFrame and takes care of this automatically.
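Roughly what I am imagining, as a sketch only (update_resolution is the proposed method and does not exist yet; the frame accessors are assumed for illustration):

// Inside the capture callback, once a DesktopFrame reports its real size:
let width = frame.width() as u32;   // assumed accessor
let height = frame.height() as u32; // assumed accessor
buffer_source.update_resolution(VideoResolution { width, height }); // proposed, not-yet-existing API
// ...then convert the frame to I420 and feed it to the source as the example already does.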
I don't understand what the problem is here, TBH.
You should know the resolution at which you want to share your stream (of course, you need to change the dimensions to match the aspect ratio of the source to avoid distortion), which most of the time will be different from the capture resolution.
On platforms that let you choose your source without a system picker, you should be able to get the source's dimensions before starting to capture. When you are using the system picker and can't know which source the user selected, you can simply start publishing your track after capturing has started.
IMO the only improvement that could be made here is to change VideoTrackSource::InternalSource::on_captured_frame to scale the buffer to match resolution_ (preserving the aspect ratio).
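For the aspect-ratio part, the target dimensions could be computed with something like this (a self-contained sketch for illustration, not code from this PR):

// Fit (src_w, src_h) inside (max_w, max_h) while preserving the aspect ratio.
fn fit_within(src_w: u32, src_h: u32, max_w: u32, max_h: u32) -> (u32, u32) {
    let scale = f64::min(max_w as f64 / src_w as f64, max_h as f64 / src_h as f64);
    // Design choice for this sketch: never upscale, only shrink.
    let scale = scale.min(1.0);
    (
        ((src_w as f64 * scale).round() as u32).max(1),
        ((src_h as f64 * scale).round() as u32).max(1),
    )
}

// e.g. fit_within(2880, 1800, 1920, 1080) == (1728, 1080)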
You should know with which resolution you want to share your stream
How can you know that without any idea what the resolution of the source is? There's no way to know in advance if you're capturing a 200 x 200 window or a 7680 x 4320 screen. Up/down scaling either of these to some arbitrary assumed size is not going to look good.
For context, I started looking into this because the current screen capturing code in Zed assumes the resolution is known in advance and sets the stream dimensions based on that. That assumption is true for the current capture mechanisms (the scap crate and macOS APIs), but not for libwebrtc.
Digging more into the code, I'm getting more confused. I don't understand the point of calling AdaptedVideoTrackSource::AdaptFrame in VideoTrackSource::InternalSource::on_captured_frame because the AdaptedVideoTrackSource's VideoAdapter is never really configured with a VideoSinkWants... there's just this code with a default constructed VideoSinkWants:
void VideoTrack::add_sink(const std::shared_ptr<NativeVideoSink>& sink) const {
  webrtc::MutexLock lock(&mutex_);
  track()->AddOrUpdateSink(sink.get(),
                           webrtc::VideoSinkWants());  // TODO(theomonnom): Expose
                                                       // VideoSinkWants to Rust?
  sinks_.push_back(sink);
}

When the AdaptFrame call does change the resolution, it always outputs a square resolution (height and width equal) regardless of the input resolution.
Another confusing bit of code I found is this in libwebrtc::NativeVideoSource::new:
livekit_runtime::spawn({
    let source = source.clone();
    let i420 = I420Buffer::new(resolution.width, resolution.height);
    async move {
        let mut interval = interval(Duration::from_millis(100)); // 10 fps
        loop {
            interval.tick().await;
            let inner = source.inner.lock();
            if inner.captured_frames > 0 {
                break;
            }
            let mut builder = vf_sys::ffi::new_video_frame_builder();
            builder.pin_mut().set_rotation(VideoRotation::VideoRotation0);
            builder.pin_mut().set_video_frame_buffer(i420.as_ref().sys_handle());
            let now = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
            builder.pin_mut().set_timestamp_us(now.as_micros() as i64);
            source.sys_handle.on_captured_frame(&builder.pin_mut().build());
        }
    }
});

With desktop capturing, we're calling on_captured_frame when the DesktopCapturer's callback is invoked, so it doesn't make sense to me to have livekit_runtime running this loop. And I definitely want more than 10 FPS for screen captures. I also don't understand why livekit_runtime is a dependency of the libwebrtc crate.
I found that code because I tried setting the VideoResolution for NativeVideoSource::new to 0,0 to test this code in VideoTrackSource::InternalSource::on_captured_frame:
if (resolution_.height == 0 || resolution_.width == 0) {
  resolution_ = VideoResolution{static_cast<uint32_t>(buffer->width()),
                                static_cast<uint32_t>(buffer->height())};
}

However, I have to comment out the call to livekit_runtime::spawn above because I420Buffer::new asserts that the width and height aren't 0, so this code I pasted from VideoTrackSource::InternalSource::on_captured_frame is effectively unusable.
@theomonnom as the author of much of this code, can you explain what's going on here? What is really the purpose of the VideoResolution passed to NativeVideoSource::new?
Ah I see, the NativeVideoSource's resolution is read in livekit::LocalParticipant::publish_track. So I think what is needed is a new API to change the resolution of an existing track after it has been published.
When you are using the system picker and you can't know which source the user selected you can simply start publishing your track after capturing has started.
After stepping away and rereading this, yeah I think that could work. I'll give it a try.
An API for updating the resolution after capturing has started is still needed because windows can be resized during capturing, though I don't think that needs to be done immediately for this to be usable. Of course a screen isn't going to be resized during capturing.
I have refactored the example to publish the track after the resolution is known from the first callback: gethopp#4. This makes a clearly noticeable difference in text legibility when sharing my 3840 x 2160 screen without downscaling it to the arbitrary 1920 x 1080 resolution that was used before. Sharing a window and resizing it works well enough. A window that is resized to smaller than its initial size looks fine; a window that is resized to bigger than its initial size looks like it gets downscaled to the initial size.
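Stripped down, the pattern looks roughly like this (a sketch with assumed channel plumbing; the real change is in gethopp#4):

// Defer creating and publishing the track until the first frame reveals its size.
let (dims_tx, dims_rx) = std::sync::mpsc::channel::<(u32, u32)>();

// Inside the DesktopCapturer callback, on the first successful frame (illustrative):
//     let _ = dims_tx.send((frame.width() as u32, frame.height() as u32));

// Back on the main task, block until the real dimensions are known:
let (width, height) = dims_rx.recv().expect("capture never produced a frame");
let buffer_source = NativeVideoSource::new(VideoResolution { width, height });
let track = LocalVideoTrack::create_video_track(
    "screen_share",
    RtcVideoSource::Native(buffer_source.clone()),
);
// ...publish `track` now, instead of guessing 1920x1080 up front.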
examples/screensharing/src/main.rs
Outdated
let (y, u, v) = capture_buffer.data_mut();
yuv_helper::argb_to_i420(data, stride, y, s_y, u, s_u, v, s_v, width, height);

let scaled_buffer = capture_buffer.scale(stream_width as i32, stream_height as i32);
This scaling is redundant. NativeVideoSource::capture_frame will scale it automatically.
Yes, you are right; I had missed the VideoTrackSource::InternalSource::on_captured_frame implementation. Thanks!
Actually, I had another look and I think the scaling is needed after all. When the NativeVideoSource is set to 1080p and the frame dimensions are 2880x1800, adapted_width/height become 2880x1800, not 1080p.
There are some merge conflicts now, I think from #753.
I am finally using this downstream in a PR for Zed. It would help move that along to get this merged soon. @iparaskev could you resolve the merge conflicts and review gethopp#4? If you don't have time to continue working on this anytime soon, I could open a new PR and you could close this one.
@Be-ing I was busy the previous weeks, but I have some time this weekend to update the PR. Also, thanks for the PR for my fork; I will have a look and keep what I think makes sense to keep. Feel free to leave comments here so the maintainers can also chip in.
Note that #784 was just merged yesterday to get the next release to build libwebrtc with the required gn arguments regardless of when this PR gets merged. |
In particular:
- Adds support for the generic capturer.
- Exposes the desktop_capturer module only on macOS, Linux, and Windows.
- Updates the example.
Force-pushed from a97df01 to 0a1bcb0.
I have only tested it on macOS; I will check Windows and Ubuntu tomorrow. Also, about the track dimensions for windows that have been resized: if I remember correctly, LiveKit allows this in other SDKs, so we should be able to add it in this SDK as well.
Yes, in the code generated from the protobuf, there is code for it, but that generated code is never called in the Rust APIs.
Can you explain why you didn't merge some code from gethopp#4? gethopp@b3ef1e0 is helpful so the callback can have its own mutable state without needing to share state with another thread and thus having to wrap everything in
Cool, works fine on macOS.
lgtm!
I missed this; I will change it. Because I couldn't see the difference from the current approach without the extra context you provided here. Choosing the source before starting to capture shouldn't make a difference in when you create the callback; the callback should be generic enough to handle changes in dimensions, both for windows and screens (the resolution of a screen can change while it is being shared). Also, my personal preference is to fully initialise an object on creation when possible (it usually protects you from bugs). I will add it.
Because IMO the examples should be as minimal as possible, in order to show how to use the API. The fewer lines of code, the better.
xianshijing-lk
left a comment
Great work, I have some comments.
 public:
  explicit DesktopCapturer(std::unique_ptr<webrtc::DesktopCapturer> capturer)
      : capturer(std::move(capturer)),
        callback(rust::Box<DesktopCapturerCallbackWrapper>::from_raw(nullptr)) {};
I think this is concerning. Is it OK to use nullptr with rust::Box::from_raw(), which expects a pointer that was allocated in Rust and handed to C++?
@ladvoc for a suggestion here.
Alternatively, we can change callback to an optional in C++:

std::optional<rust::Box<DesktopCapturerCallbackWrapper>> callback;

Then we can remove this code:

callback(rust::Box<DesktopCapturerCallbackWrapper>::from_raw(nullptr))
The Rust API guarantees that the callback member is set to a real pointer before it is used by requiring it as a parameter for DesktopCapturer::start.
I got the usage of it.
My suggestion is to remove the need for doing callback(rust::Box<DesktopCapturerCallbackWrapper>::from_raw(nullptr)).
One way is to use std::optional<rust::Box<DesktopCapturerCallbackWrapper>> callback; and then set the callback in the Start() function.
Alternatively, we make sure the callback is passed in the constructor, like:

DesktopCapturer(std::unique_ptr<webrtc::DesktopCapturer> capturer,
                rust::Box<DesktopCapturerCallbackWrapper> callback)
    : capturer(std::move(capturer)),
      callback(std::move(callback)) {}

void start() {
  capturer->Start(this);
}

On the Rust side:

fn new_desktop_capturer(
    options: DesktopCapturerOptions,
    callback: Box<DesktopCapturerCallbackWrapper>,
) -> UniquePtr<DesktopCapturer>;

fn start(self: Pin<&mut DesktopCapturer>);
Using a std::optional would work, sure.
The callback was originally passed in the constructor. I changed that because it is inconvenient for the Rust application. A real application likely needs some UI to let the user pick from the list of available CaptureSources. This UI code would instantiate a DesktopCapturer. At that point there is no need to provide a callback. Calling DesktopCapturer::start is likely done in a different part of the application's code from the UI.
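The flow I have in mind is roughly the following (a sketch only; get_source_list and select_source mirror libwebrtc's GetSourceList/SelectSource and, like the constructor shown here, may not match the final Rust API exactly):

// 1. UI layer: create the capturer and enumerate sources; no callback is needed yet.
let mut capturer = DesktopCapturer::new(DesktopCapturerOptions::new(
    DesktopCaptureSourceType::SCREEN,
));
let sources = capturer.get_source_list();
// ...show `sources` to the user and let them pick one...
capturer.select_source(&sources[0]);

// 2. Elsewhere in the application, when capturing should begin,
//    the callback is supplied to start().
capturer.start(Box::new(my_callback)); // my_callback: some DesktopCapturerCallback impl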
webrtc-sys/src/desktop_capturer.cpp
Outdated
#endif /* defined(WEBRTC_MAC) && !defined(WEBRTC_IOS) */
#ifdef _WIN64
  if (options.window_capturer) {
    webrtc_options.set_allow_wgc_screen_capturer(true);
This looks weird: it calls set_allow_wgc_screen_capturer when window_capturer is true and set_allow_wgc_window_capturer otherwise. I think it should be the opposite.
Good catch. Actually, this doesn't compile. It was a leftover from a previous revision, and I didn't get the chance to test the latest change on Windows.
  webrtc_options.set_allow_pipewire(true);
#endif /* WEBRTC_USE_PIPEWIRE */

  webrtc_options.set_prefer_cursor_embedded(options.include_cursor);
If you know what set_prefer_cursor_embedded means, and why it gets mapped to include_cursor here, please add a comment. I don't really understand this setting from the names :)
Done. See here for reference.
webrtc-sys/src/desktop_capturer.cpp
Outdated
void DesktopCapturer::OnCaptureResult(
    webrtc::DesktopCapturer::Result result,
    std::unique_ptr<webrtc::DesktopFrame> frame) {
  CaptureResult ret_result = CaptureResult::Success;
Add a sanity check at the top of the function:

if (!callback) {
  return;
}
webrtc-sys/src/desktop_capturer.rs
Outdated
impl_thread_safety!(ffi::DesktopCapturer, Send + Sync);

pub trait DesktopCapturerCallback: Send {
    fn on_capture_result(&mut self, result: CaptureResult, frame: UniquePtr<DesktopFrame>);
Note that DesktopFrame can be null; should we use frame: Option<UniquePtr<DesktopFrame>> instead?
        result: sys_dc::ffi::CaptureResult,
        frame: UniquePtr<sys_dc::ffi::DesktopFrame>,
    ) {
        (self.callback)(capture_result_from_sys(result), DesktopFrame::new(frame));
@ladvoc for a suggestion here.
As errors can happen, which will result in a nullptr frame, we should handle that case better. I wonder if we should do something like:

let result = capture_result_from_sys(result);
let frame = if frame.is_null() {
    None
} else {
    Some(DesktopFrame::new(frame))
};
(self.callback)(result, frame);
+1. There is currently UB when a slice is formed with DesktopFrame::data() when the pointer is null. We can either do a simple null check here or in DesktopFrame::new(); I am OK with either.
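If the check went into the wrapper instead, one possible shape would be the following (a sketch; the accessor and field names here are assumptions, not the actual crate API):

// Hypothetical: have data() return None rather than forming a slice from a null pointer.
pub fn data(&self) -> Option<&[u8]> {
    let ptr = self.sys_handle.data();
    if ptr.is_null() {
        return None;
    }
    let len = (self.stride() * self.height()) as usize;
    // SAFETY: ptr is non-null and, per libwebrtc's DesktopFrame contract,
    // valid for stride() * height() bytes.
    Some(unsafe { std::slice::from_raw_parts(ptr, len) })
}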
How about instead of passing an Option to the Rust callback, we pass a single argument of Result<DesktopFrame, CaptureError>?
I liked the Result approach and went with it.
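For reference, the callback trait after that change looks roughly like this (a sketch; the exact error type name may differ in the final revision):

pub trait DesktopCapturerCallback: Send {
    fn on_capture_result(&mut self, result: Result<DesktopFrame, CaptureError>);
}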
ladvoc
left a comment
Overall looks great, just a few small comments
}

impl DesktopFrame {
    pub fn new(sys_handle: UniquePtr<sys_dc::ffi::DesktopFrame>) -> Self {
Does this ever get constructed outside the libwebrtc crate? If not, it should be marked pub(crate).
Good catch.
Thanks for the comments, everyone. I pushed a new revision with the changes (I still haven't tested this on Windows).
I opened #789 for the Linux CI failure.
May as well make that part of this PR? Then we can confirm that it does actually fix CI.
I haven't checked how they have configured their CI. It could use either this branch or main for the runs. If it is this branch, I will include it.
I'm not aware of a GitHub setting to use GH Actions workflows from a different branch from the one being tested. If that exists, it isn't the default. Modifying the files in .github/workflows should modify the CI runs on this branch.
Implements screen sharing by exposing libwebrtc's DesktopCapturer. A few notes:
- GetSourceList to actually get the available displays without the system picker. I could port my patch to your libwebrtc fork if you want. For selecting windows, GetSourceList seems to work.

Tested on:

This is related to #92 and zed-industries/zed#28754.