X-Git-Url: https://git.carlh.net/gitweb/?a=blobdiff_plain;f=doc%2Fdesign%2Fdecoder_structures.tex;h=9b9be33cdd4ea0e6e552b8241a8770c623b15db4;hb=6d686ea45f5cd01a0d11f92a903ac77779ad8562;hp=1f7ae0ec48dcef5cb9ff6f5a7338e10639c95dde;hpb=e12d6f79ca85a71d30abbd3c167a0747b2ae5bb1;p=dcpomatic.git diff --git a/doc/design/decoder_structures.tex b/doc/design/decoder_structures.tex index 1f7ae0ec4..9b9be33cd 100644 --- a/doc/design/decoder_structures.tex +++ b/doc/design/decoder_structures.tex @@ -1,4 +1,6 @@ \documentclass{article} +\usepackage[usenames]{xcolor} +\usepackage{listings} \title{Decoder structures} \author{} \date{} @@ -12,7 +14,8 @@ hides a decode-some-and-see-what-comes-out approach. With most decoders it is quick, easy and reliable to get a particular piece of content from a particular timecode. This applies to the DCP, -DCP subtitle, Sndfile and Image decoders. With FFmpeg, however, this is not easy. +DCP subtitle, Image and Video MXF decoders. With FFmpeg, however, +this is not easy. This suggests that it would make more sense to keep the decode-and-see-what-comes-out code within the FFmpeg decoder and not @@ -20,7 +23,7 @@ use it anywhere else. However resampling screws this up, as it means all audio requires decode-and-see. I don't think you can't resample in neat blocks as -there are fractional samples other complications. You can't postpone +there are fractional samples and other complications. You can't postpone resampling to the end of the player since different audio may be coming in at different rates. @@ -28,6 +31,10 @@ This suggests that decode-and-see is a better match, even if it feels a bit ridiculous when most of the decoders have slightly clunky seek and pass methods. +Having said that: the only other decoder which produces audio is now +the DCP one, and maybe that never needs to be resampled. + + \section{Multiple streams} Another thing unique to FFmpeg is multiple audio streams, possibly at @@ -50,4 +57,159 @@ and the merging of all audio content inside the player. These disadvantages suggest that the first approach is better. +One might think that the logical conclusion is to take streams all the +way back to the player and resample them there, but the resampling +must occur on the other side of the get-stuff-at-time API. + + +\section{Going back} + +Thinking about this again in October 2016 it feels like the +get-stuff-at-this-time API is causing problems. It especially seems +to be a bad fit for previewing audio. The API is nice for callers, +but there is a lot of dancing around behind it to make it work, and it +seems that it is more `flexible' than necessary; all callers ever do +is seek or run. + +Hence there is a temptation to go back to see-what-comes-out. + +There are two operations: make DCP and preview. Make DCP seems to be + +\lstset{language=C++} +\lstset{basicstyle=\footnotesize\ttfamily, + breaklines=true, + keywordstyle=\color{blue}\ttfamily, + stringstyle=\color{red}\ttfamily, + commentstyle=\color{olive}\ttfamily} + +\begin{lstlisting} + while (!done) { + done = player->pass(); + // pass() causes things to appear which are + // sent to encoders / disk + } +\end{lstlisting} + +And preview seems to be + +\begin{lstlisting} + // Thread 1 + while (!done) { + done = player->pass(); + // pass() causes things to appear which are buffered + sleep_until_buffers_empty(); + } + + // Thread 2 + while (true) { + get_video_and_audio_from_buffers(); + push_to_output(); + sleep(); + } +\end{lstlisting} + +\texttt{Player::pass} must call \texttt{pass()} on its decoders. They +will emit stuff which \texttt{Player} must adjust (mixing sound etc.). +Player then emits the `final cut', which must have properties like no +gaps in video/audio. + +Maybe you could have a parent class for simpler get-stuff-at-this-time +decoders to give them \texttt{pass()} / \texttt{seek()}. + +One problem I remember is which decoder to \texttt{pass()} at any given time: +it must be the one with the earliest last output, presumably. +Resampling also looks fiddly in the v1 code. + + +\section{Having a go} + +\begin{lstlisting} + class Decoder { + virtual void pass() = 0; + virtual void seek(ContentTime time, bool accurate) = 0; + + signal Video; + signal Audio; + signal TextSubtitle; + }; +\end{lstlisting} + +or perhaps + +\begin{lstlisting} + class Decoder { + virtual void pass() = 0; + virtual void seek(ContentTime time, bool accurate) = 0; + + shared_ptr video; + shared_ptr audio; + shared_ptr subtitle; + }; + + class VideoDecoder { + signals2 Data; + }; +\end{lstlisting} + +Questions: +\begin{itemize} +\item Video / audio frame or \texttt{ContentTime}? +\item Can all the subtitle period notation code go? +\end{itemize} + +\subsection{Steps} + +\begin{itemize} +\item Add signals to \texttt{Player}. + \begin{itemize} + \item \texttt{signal), DCPTime)> Video;} + \item \texttt{signal, DCPTime)> Audio;} + \item \texttt{signal Subtitle;} + \end{itemize} + \item Remove \texttt{get()}-based loops and replace with \texttt{pass()} and signal connections. + \item Remove \texttt{get()} and \texttt{seek()} from decoder parts; add emission signals. + \item Put \texttt{AudioMerger} back. + \item Remove \texttt{during} stuff from \texttt{SubtitleDecoder} and decoder classes that use it. + \item Rename \texttt{give} methods to \texttt{emit}. + \item Remove \texttt{get} methods from \texttt{Player}; replace with \texttt{pass()} and \texttt{seek()}. +\end{itemize} + + +\section{Summary of work done in \texttt{back-to-pass}} + +The diff between \texttt{back-to-pass} and \texttt{master} as at 21/2/2017 can be summarised as: + +\begin{enumerate} +\item Remove \texttt{AudioDecoderStream}; no more need to buffer, and resampling is done in \texttt{Player}. +\item \texttt{AudioDecoder} is simple; basically counting frames. +\item All subtitles-during stuff is gone; no need to know what happens in a particular period as we just wait and see. +\item Pass reason stuff gone; not sure what it was for but seems to have been a contortion related to trying to find specific stuff. + \item \texttt{Player::pass} back, obviously. + \item \texttt{Player::get\_video}, \texttt{get\_audio} and + \texttt{get\_subtitle} more-or-less become \texttt{Player}'s + handlers for emissions from decoders; lots of buffering crap gone + in the process. + \item Add \texttt{Decoder::position} stuff so that we know what to \texttt{pass()} in \texttt{Player}. + \item Add \texttt{AudioMerger}; necessary as audio arrives at the + \texttt{Player} from different streams at different times. The + \texttt{AudioMerger} just accepts data, mixes and spits it out + again. +\item \texttt{AudioMerger} made aware of periods with no content to + allow referenced reels; adds a fair amount of complexity. Without + this the referenced reel gaps are silence-padded which confuses + things later on as our VF DCP gets audio data that it does not need. +\item Obvious consumer changes: what was a loop over the playlist + length and calls to \texttt{get()} is now calls to \texttt{pass()}. + \item Maybe-seek stuff gone. + \item Some small \texttt{const}-correctness bits. +\end{enumerate} + +Obvious things to do: + +\begin{enumerate} +\item Ensure AudioMerger is being tested. +\item Ensure hardest-case in video / audio is being tested. +\item Look at symmetry of video/audio paths / APIs. +\end{enumerate} + \end{document}