This won't change anything for 60Hz displays, but users of higher
refresh rate displays will benefit.
For example, using Nvidia EGLStreams on a 240Hz monitor (refresh
interval ~4.1ms), the maximum render time allowed before dropping to
120Hz is now 3.6ms, whereas it was previously 2.1ms.
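Spelled out against the ~4.1ms refresh interval, using the rounded
figures above:

    before: 4.1ms interval - 2.1ms max render time = ~2.0ms reserved
    after:  4.1ms interval - 3.6ms max render time = ~0.5ms reserved

i.e. the slack reserved out of each refresh interval shrinks from about
2ms to about 0.5ms.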
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/2158>
I'm not sure how to properly update the damage or redraw clip here; at
least this works correctly when under a constantly redrawing window,
which is OK for debugging purposes.
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1762>
Max render time shows how early the frame clock needs to be dispatched
to make it to the predicted next presentation time. Before this commit,
it was set to the refresh interval minus 2 ms. This means Mutter would
always start compositing 14.7 ms before a display refresh on a 60 Hz
screen, or 4.9 ms before a display refresh on a 144 Hz screen. However,
Mutter frequently does not need that much time to finish compositing
and submit the buffer to KMS:
    max render time
    /-------------\
---|---------------|---------------|---> presentations
    D----S          D--S
D - frame clock dispatch
S - buffer submission
This commit aims to automatically compute a shorter max render time to
make Mutter start compositing as late as possible (but still making it
in time for the presentation):
         max render time
             /-----\
---|---------------|---------------|---> presentations
             D----S            D--S
Why is this better? First of all, Mutter gets application contents to
draw at the time compositing starts. If a new application buffer
arrives after compositing has started, but before the next
presentation, it won't make it on screen:
---|---------------|---------------|---> presentations
    D----S          D--S
      A-------------X----------->
                   ^ doesn't make it for this presentation
A - application buffer commit
X - application buffer sampled by Mutter
Here the application committed just a few ms too late and didn't make
it on screen until the next presentation. If compositing starts later
in the frame cycle, applications can commit buffers closer to the
presentation. These buffers will be more up to date, thereby reducing
input latency.
---|---------------|---------------|---> presentations
             D----S            D--S
        A----X---->
                   ^ made it!
Moreover, applications are recommended to render their frames on frame
callbacks, which Mutter sends right after compositing is done. Since
this commit delays the compositing, it also reduces the latency for
applications drawing on frame callbacks. Compare:
---|---------------|---------------|---> presentations
    D----S          D--S
         F--A-------X----------->
            \______________________/
                    latency
---|---------------|---------------|---> presentations
             D----S            D--S
                    F--A-------X---->
                       \___________/
                        less latency
F - frame callback received, application starts rendering
So how do we actually estimate max render time? We want it to be as low
as possible, but still large enough so as not to miss any frames by
accident:
         max render time
             /-----\
---|---------------|---------------|---> presentations
             D------S------------->
                    oops, took a little too long
For a successful presentation, the frame needs to be submitted to KMS
and the GPU work must be completed before the vblank. This deadline can
be computed by subtracting the vblank duration (calculated from the
display mode) from the predicted next presentation time.
We don't know how long compositing will take, and we also don't know how
long the GPU work will take, since clients can submit buffers with
unfinished GPU work. So we measure and estimate these values.
The frame clock dispatch can be split into two phases:
1. From the start of the dispatch until all GPU commands have been
   submitted (but not necessarily finished), i.e. until the call to
   eglSwapBuffers().
2. From eglSwapBuffers() until the buffer has been submitted to KMS and
   the GPU work has completed. These happen in parallel, and we want
   the later of the two to be done before the vblank.
We measure these three durations and store them for the last 16 frames.
The estimate for each duration is the maximum of its last 16 values.
Usually even taking just the previous frame's durations as the
estimates works well enough, but I found that screen capturing with OBS
Studio increases duration variability enough to cause frequent missed
frames with that method. Taking the maximum over the last 16 frames
smooths out this variability.
The durations are naturally quite variable and the estimates aren't
perfect. To take this into account, an additional constant 2 ms is added
to the max render time.
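Putting this together, the computation is roughly the following sketch
(the struct and identifier names here are illustrative, not the exact
ones used in the code):

    #include <glib.h>
    #include <stdint.h>

    #define ESTIMATE_QUEUE_LENGTH 16

    typedef struct
    {
      /* Phase 1: dispatch start -> eglSwapBuffers(). */
      int64_t dispatch_to_swap_us[ESTIMATE_QUEUE_LENGTH];
      /* Phase 2, in parallel: eglSwapBuffers() -> GPU work done, and
       * eglSwapBuffers() -> buffer submitted to KMS. */
      int64_t swap_to_rendering_done_us[ESTIMATE_QUEUE_LENGTH];
      int64_t swap_to_flip_us[ESTIMATE_QUEUE_LENGTH];
      int64_t vblank_duration_us;
    } FrameTimings;

    /* The estimate for each duration is the maximum over the last 16
     * recorded values. */
    static int64_t
    estimate_max_us (const int64_t durations_us[ESTIMATE_QUEUE_LENGTH])
    {
      int64_t max_us = 0;
      int i;

      for (i = 0; i < ESTIMATE_QUEUE_LENGTH; i++)
        max_us = MAX (max_us, durations_us[i]);

      return max_us;
    }

    static int64_t
    compute_max_render_time_us (const FrameTimings *timings)
    {
      int64_t swap_to_done_us;

      /* Of the two parallel phase 2 durations, only the later one
       * matters for the deadline. */
      swap_to_done_us =
        MAX (estimate_max_us (timings->swap_to_rendering_done_us),
             estimate_max_us (timings->swap_to_flip_us));

      return estimate_max_us (timings->dispatch_to_swap_us) +
             swap_to_done_us +
             timings->vblank_duration_us +
             2000; /* the constant 2 ms for estimation error */
    }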
How does it perform in practice? On my desktop with 144 Hz monitors I
get a max render time of 4–5 ms instead of the default 4.9 ms (I had
1 ms manually configured in sway) and on my laptop with a 60 Hz screen I
get a max render time of 4.8–5.5 ms instead of the default 14.7 ms (I
had 5–6 ms manually configured in sway). Weston [1] went with a 7 ms
default.
The main downside is that if there's a sudden heavy batch of work in
the compositing, which would have made it in the default 14.7 ms but
doesn't make it in the reduced 6 ms, there is a delayed frame that
would otherwise not be there. Arguably, this happens rarely enough to
be a good trade-off for the reduced latency. One possible solution is a
"next frame is expected to be heavy" hint that manually increases the
max render time for the next frame. This would avoid the single dropped
frame at the start of complex animations.
[1]: https://www.collabora.com/about-us/blog/2015/02/12/weston-repaint-scheduling/
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1762>
Before this commit, the next presentation time could end up either
behind now_us or ahead of now_us, depending on how presentation times
happened to be aligned relative to integer multiples of
refresh_interval_us. It's not clear whether this was originally
intended, because even if the next presentation time ends up behind
now_us, it is moved ahead by a while loop later in this function.
Even though this difference in behavior didn't really matter, it made
reasoning about the subsequent branches more complex, and it could
potentially have introduced bugs if the logic were modified. So this
commit makes sure the next presentation time is always ahead of now_us.
It also adds a comment with a graph detailing the computations, and
adjusts the variable names to drop unfortunate terminology mistakes.
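In sketch form, the guaranteed-ahead prediction looks like this
(illustrative; the actual code is structured differently):

    #include <stdint.h>

    /* Sketch: predict the next presentation time as the last
     * presentation time plus however many whole refresh intervals are
     * needed to land ahead of now_us. Assumes
     * now_us >= last_presentation_time_us. */
    static int64_t
    compute_next_presentation_time_us (int64_t last_presentation_time_us,
                                       int64_t refresh_interval_us,
                                       int64_t now_us)
    {
      int64_t intervals_ahead;

      intervals_ahead =
        (now_us - last_presentation_time_us) / refresh_interval_us + 1;

      return last_presentation_time_us +
             intervals_ahead * refresh_interval_us;
    }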
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1707>
The last presentation time is mainly used to make sure the predicted
presentation time is aligned with display refreshes. Even if it goes
back in time, there is no issue, as the next presentation time takes
the current time into account. The synthetic presentation time is not
exactly aligned with display refreshes, so using it would only result
in inconsistent animations.
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1707>
In contrast to notify_presented(), notify_ready() also returns the
state machine to the idle state, but without providing new frame
information, as no frame was actually presented.
This will happen, for example, when the simple KMS impl backend does a
cursor movement, which triggers a symbolic "page flip" reply in order
to emulate atomic KMS behavior. When this happens, we should just try
to reschedule again.
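In sketch form (simplified; the real function also validates the state
it is called in):

    /* Sketch: unlike notify_presented(), there is no new frame info to
     * record; just return to the idle state and reschedule if an update
     * was requested in the meantime. */
    void
    clutter_frame_clock_notify_ready (ClutterFrameClock *frame_clock)
    {
      frame_clock->state = CLUTTER_FRAME_CLOCK_STATE_IDLE;
      maybe_reschedule_update (frame_clock);
    }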
Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1488>
The frame clock owner should be able to explicitly destroy (i.e. make
defunct) a frame clock, e.g. when a stage view is destroyed. This is so
that other objects can keep a reference to it without it being left
around even after it has stopped being usable.
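A sketch of the shape of such an API, assuming the usual GObject
run-dispose pattern (as with clutter_actor_destroy()):

    /* Sketch: dispose the frame clock so it becomes defunct while any
     * remaining references stay valid, then drop the owner's
     * reference. */
    void
    clutter_frame_clock_destroy (ClutterFrameClock *frame_clock)
    {
      g_object_run_dispose (G_OBJECT (frame_clock));
      g_object_unref (frame_clock);
    }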
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1285
A frame clock dispatch doesn't necessarily result in a frame being
drawn, meaning we'll end up in the idle state. However, it may be the
case that something still requires another frame, and will in that case
have requested one to be scheduled. In order not to deadlock, try to
reschedule directly after dispatching if an update was requested and we
ended up in the idle state.
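Sketched (simplified from the actual state handling):

    /* Sketch: called at the end of a dispatch that left the clock idle;
     * if an update was requested while dispatching, schedule it right
     * away, since the request has already come and gone. */
    static void
    maybe_reschedule_update (ClutterFrameClock *frame_clock)
    {
      if (frame_clock->state == CLUTTER_FRAME_CLOCK_STATE_IDLE &&
          frame_clock->pending_reschedule)
        {
          frame_clock->pending_reschedule = FALSE;
          clutter_frame_clock_schedule_update (frame_clock);
        }
    }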
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1285
We had time unit conversion helpers (e.g. us2ms(), ns2us(), etc.) in
multiple places. Clean that up by moving them all to a common file.
That file is clutter-private.h, as it's accessible from both clutter/
and src/.
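The helpers themselves are one-liners along these lines (the exact set
may differ):

    static inline int64_t
    us2ms (int64_t us)
    {
      return us / 1000;
    }

    static inline int64_t
    ns2us (int64_t ns)
    {
      return ns / 1000;
    }

    static inline int64_t
    ms2us (int64_t ms)
    {
      return ms * 1000;
    }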
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1285
Add API to add ClutterTimeline objects to, and remove them from, the
frame clock. Just as with the legacy master clock, having a timeline
added to the frame clock causes the frame clock to continuously
reschedule updates until the timeline is removed.
ClutterTimeline is adapted so that it can be driven by a
ClutterFrameClock. This is done by adding a 'frame-clock' property; if
it is set, the timeline will add and remove itself to and from the
frame clock instead of the master clock.
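Usage then looks roughly like this ('frame-clock' is the new property;
the rest is existing ClutterTimeline API):

    /* Sketch: a timeline driven by a specific frame clock instead of
     * the master clock. */
    ClutterTimeline *timeline;

    timeline = g_object_new (CLUTTER_TYPE_TIMELINE,
                             "frame-clock", frame_clock,
                             "duration", 250, /* ms */
                             NULL);
    clutter_timeline_start (timeline);
    /* ...the frame clock keeps rescheduling updates... */
    clutter_timeline_stop (timeline);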
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1285
The timestamp comes from the GSource, meaning it's a more accurate
representation of when the frame started being dispatched than
sampling the current time in some callback would be.
Currently unused.
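Sketched (names illustrative):

    /* Sketch: take the dispatch timestamp from the GSource itself
     * instead of sampling g_get_monotonic_time () in a later
     * callback. */
    static gboolean
    frame_clock_source_dispatch (GSource     *source,
                                 GSourceFunc  callback,
                                 gpointer     user_data)
    {
      ClutterFrameClock *frame_clock = user_data;
      int64_t time_us = g_source_get_time (source);

      clutter_frame_clock_dispatch (frame_clock, time_us);

      return G_SOURCE_CONTINUE;
    }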
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1285
In certain scenarios, the frame clock needs to handle present feedback
long before the assumed presentation time happens. To avoid scheduling
the next frame too soon, avoid scheduling one if we were presented
within half a frame interval of the last expected presentation time.
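As a sketch, under one reading of the above (illustrative only):

    /* Sketch: treat feedback whose presentation time lands within half
     * a refresh interval of the previously expected presentation time
     * as belonging to that refresh cycle, and skip scheduling another
     * update for it. */
    static gboolean
    presented_near_last_expected_time (int64_t presentation_time_us,
                                       int64_t expected_time_us,
                                       int64_t refresh_interval_us)
    {
      return ABS (presentation_time_us - expected_time_us) <
             refresh_interval_us / 2;
    }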
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1285
This adds a frame clock that is so far unused, apart from by tests. It
simply reschedules at a given refresh rate, based on presentation time
feedback. The aim is for it to be used with a single frame listener
(stage views) that will notify it when a frame has been presented. It
does not aim to handle multiple frame listeners; instead, it's assumed
that different frame listeners will use their own frame clocks.
Also add a test that verifies that the basic functionality works.
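The intended usage pattern is roughly the following (a sketch; the
listener vtable shape and constructor signature are approximations of
the description above):

    /* Sketch: one listener per frame clock. The clock invokes frame ()
     * to have a frame drawn; the listener later reports presentation
     * feedback back to the clock. */
    static ClutterFrameResult
    handle_frame (ClutterFrameClock *frame_clock,
                  int64_t            frame_count,
                  int64_t            time_us,
                  gpointer           user_data)
    {
      /* Draw and submit here; presentation feedback is reported back
       * later, e.g. via clutter_frame_clock_notify_presented (). */
      return CLUTTER_FRAME_RESULT_PENDING_PRESENTED;
    }

    static const ClutterFrameListenerIface listener_iface = {
      .frame = handle_frame,
    };

    static ClutterFrameClock *
    create_view_frame_clock (gpointer user_data)
    {
      /* One frame clock per frame listener (e.g. per stage view). */
      return clutter_frame_clock_new (60.0f, &listener_iface, user_data);
    }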
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1285