Fix occurrences of `MEDIA_TIME_NOT_FOUND` on initial fallback #1758
Merged
peaBerberian merged 1 commit into `dev` on Oct 23, 2025
We've recently seen `MEDIA_TIME_NOT_FOUND` errors in a Canal+ application.

It's one of the errors in our API that should never actually be sent to an application, unless there's an RxPlayer bug somewhere (it is documented as such in our API documentation).

The issue
---------

It turns out they were encountering a very specific race condition after a chain of events:

1. The application relied on a multi-threaded RxPlayer and played an encrypted content with some non-decipherable qualities.

   They also met all the conditions required to enable our cache of `MediaKeySession` (which allows reusing already-loaded decryption keys).

2. They then loaded another content, with a completely different media position that does not map to anything in the previous content (it shouldn't matter, but it will be important later).

3. The application switched back to the first content, without calling `stop` in between (which is OK and even more performant).

In that situation, the following happens just after the third step:

4. We start to initialize everything for that last content.

   Since recently (#1607), that initialization step includes the initial polling of media metrics such as the position, the playbackRate etc. Before that work, this polling was done in a later step.

5. We stop the previous content. We do this after initializing the next content on purpose.

   "Stopping" a content (setting `mediaElement.src = ""`, typically) is a synchronous/blocking operation in JS that can actually take a lot of time - like hundreds of ms on lower-end devices.

   To improve performance, we thus "initialize" the next content before stopping the previous one, as the former includes some parallelizable operations (network requests, `postMessage` to our Worker etc.) that can still take place while the "stop operation" is blocking the JS main thread.

6. The RxPlayer core (running in a WebWorker here) initializes everything, fetches the Manifest etc.

   Here the content is already known and its decryption keys are still "cached" locally, so we directly know that some qualities in the Manifest are not decipherable and decide to "fallback" from the higher qualities.

7. The fallback mechanism reads the last polled media metrics. If we reach that logic too fast, we actually read the initially-polled metrics (3 steps earlier).

   This should be OK, but because we last polled the media metrics _BEFORE_ stopping the previous content **AND** no metrics have been emitted since then, the playback position in those metrics is actually the last position reached in the previous content.

8. The RxPlayer checks that wanted position, sees that it doesn't make sense in the current content, and triggers the `MEDIA_TIME_NOT_FOUND` error.

The fixes
---------

The crux of this issue is both that:

1. we poll media metrics before stopping the previous content and,

2. we do not emit new metrics immediately when the initial position to seek to is known

The fix I ended up choosing is very far from being the most straightforward one though :D. It's a solution I already PoCed with my [preload work](#1646) which I found elegant in terms of code architecture.

Basically the `PlaybackObserver` (the class doing the polling) can now be "headless" initially: without a media element. In that case, default media metrics are considered (position `0`, paused etc.)

When the media element is considered "ready", it can be attached to the `PlaybackObserver`, in which case "real" polling will be performed. This ensures that no polling of media metrics linked to a previous content is going on.

It wasn't done with this issue in mind, but I found that it also made sense there: before stopping the previous content, we want to start our metrics-polling module (the `PlaybackObserver`) but are not yet ready to attach the media element as it is still technically playing the previous content. Once the previous content is stopped, we can link the media element to the `PlaybackObserver` to enable actual polling.

Moreover, I now also emit those metrics right when we know what the initial position to seek to will be (as those metrics include this data point as a "wanted position"). This makes sure that the RxPlayer core always has the most up-to-date information.

Lastly, the already-merged #1755 fix should also have fixed that issue - as the initial position would have been known at some point by the RxPlayer Core anyway.

This makes it far from a hotfix (there are a lot of lines) but I like this solution.
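
To make the "headless" idea more concrete, here is a minimal sketch of it. It is not the code of this PR: the class name `HeadlessPlaybackObserver`, its methods (`getReference`, `listen`, `setWantedPosition`, `attachMediaElement`) and the `MediaLike` interface are all hypothetical, and the default `playbackRate` of `1` is an assumption. `MediaLike` stands in for the few `HTMLMediaElement` properties needed, so that the sketch can run without a DOM.

```typescript
// Hypothetical names throughout: this is NOT the RxPlayer's actual API.

/** Assumed minimal subset of HTMLMediaElement used by this sketch. */
interface MediaLike {
  currentTime: number;
  paused: boolean;
  playbackRate: number;
}

interface Observation {
  position: number;
  paused: boolean;
  playbackRate: number;
  /** Initial position we intend to seek to, once it is known. */
  wantedPosition: number | null;
}

type Listener = (obs: Observation) => void;

class HeadlessPlaybackObserver {
  private media: MediaLike | null = null;
  private wantedPosition: number | null = null;
  private listeners: Listener[] = [];

  /** Default metrics while headless, real metrics once attached. */
  getReference(): Observation {
    if (this.media === null) {
      // Headless: never read anything from a media element that may
      // still be playing the previous content.
      // (playbackRate 1 is an assumed default.)
      return {
        position: 0,
        paused: true,
        playbackRate: 1,
        wantedPosition: this.wantedPosition,
      };
    }
    return {
      position: this.media.currentTime,
      paused: this.media.paused,
      playbackRate: this.media.playbackRate,
      wantedPosition: this.wantedPosition,
    };
  }

  listen(cb: Listener): void {
    this.listeners.push(cb);
  }

  /** Emit right away once the initial position to seek to is known. */
  setWantedPosition(position: number): void {
    this.wantedPosition = position;
    this.emit();
  }

  /** To call only once the previous content has been stopped. */
  attachMediaElement(media: MediaLike): void {
    this.media = media;
    this.emit();
  }

  private emit(): void {
    const obs = this.getReference();
    for (const cb of this.listeners) {
      cb(obs);
    }
  }
}

// The media element is still at position 1234 of the previous content.
const media: MediaLike = { currentTime: 1234, paused: false, playbackRate: 1 };
const observer = new HeadlessPlaybackObserver();
const seen: Observation[] = [];
observer.listen((obs) => seen.push(obs));

// Initialization happens before the stop: the stale position is not visible.
const beforeStop = observer.getReference();
observer.setWantedPosition(10);

// Simulated "stop" of the previous content, then attachment.
media.currentTime = 0;
media.paused = true;
observer.attachMediaElement(media);
```

In this sketch, anything reading the metrics before the stop sees position `0` together with the wanted position, never the `1234` left over from the previous content, which is the property that avoids the scenario described in steps 7 and 8.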
✅ Automated performance checks have passed. Performance tests (1st run): no significant change in performance.
Florent-Bouisset approved these changes on Oct 23, 2025