% doc/design/resampling.tex

\documentclass{article}
\usepackage{amsmath}
\begin{document}

This note describes the audio resampling that we need to do.  Content
video is at $C_V$ frames per second and content audio is at $C_A$
samples per second.

\section{Easy case 1}

$C_V$ and $C_A$ are both DCI rates, e.g.\ $C_V = 24$, $C_A = 48\times{}10^3$.

\medskip
\textbf{Nothing to do.}

\section{Easy case 2}

$C_V$ is a DCI rate but $C_A$ is not, e.g.\ $C_V = 24$, $C_A = 44.1\times{}10^3$.

\medskip
\textbf{Resample $C_A$ to the DCI rate.}

\section{Hard case 1}
\label{sec:hard1}

$C_V$ is not a DCI rate but $C_A$ is, e.g.\ $C_V = 25$, $C_A =
48\times{}10^3$.  We will run the video at a nearby DCI rate $F_V$,
meaning that it will run faster or slower than it should.  We resample
the audio to $C_V C_A / F_V$ and mark it as $C_A$ so that it, too,
runs faster or slower by the corresponding factor.

For example, if $C_V = 25$, $F_V = 24$ and $C_A = 48\times{}10^3$, we
resample the audio to $25 \times 48\times{}10^3 / 24 = 50\times{}10^3$.

\medskip
\textbf{Resample $C_A$ to $C_V C_A / F_V$.}

\section{Hard case 2}

Neither $C_V$ nor $C_A$ is a DCI rate, e.g.\ $C_V = 25$, $C_A =
44.1\times{}10^3$.  We will run the video at a nearby DCI rate $F_V$,
meaning that it will run faster or slower than it should.  Conceptually,
we first resample the audio to a DCI rate $F_A$, then proceed as in
Section~\ref{sec:hard1} above with $F_A$ in place of $C_A$; the two
steps combine into a single resample.

\medskip
\textbf{Resample $C_A$ to $C_V F_A / F_V$.}
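
As a worked instance of this case, take $C_V = 25$ and
$C_A = 44.1\times{}10^3$ as in the example above, and assume DCP
targets of $F_V = 24$ and $F_A = 48\times{}10^3$ (these target values
are an illustrative assumption, not something fixed by the case
itself).  Then
\begin{align*}
\frac{C_V F_A}{F_V} &= \frac{25 \times 48\times{}10^3}{24} = 50\times{}10^3
\end{align*}
so the $44.1\times{}10^3$ audio would be resampled to $50\times{}10^3$
and marked as $48\times{}10^3$.
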

\section{The general case}

Given a DCP running at $F_V$ and $F_A$ and a piece of content at $C_V$
and $C_A$, resample the audio to $R_A$ where
\begin{align*}
R_A &= \frac{C_V F_A}{F_V}
\end{align*}

Once this is done, consider 1 second's worth of content samples ($C_A$
samples).  We have turned them into $R_A$ samples which would still
last 1 second if played at $R_A$ samples per second.  These samples
are instead played back at $F_A$ samples per second, so they last
$R_A / F_A$ seconds.  Hence there is a scaling between content time
and DCP time of $R_A / F_A$, i.e.\ $C_V / F_V$.
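
To make the scaling concrete, here is a worked instance assuming
$C_V = 25$, $F_V = 24$ and $F_A = 48\times{}10^3$ (illustrative values
only):
\begin{align*}
R_A &= \frac{25 \times 48\times{}10^3}{24} = 50\times{}10^3 \\
\frac{R_A}{F_A} &= \frac{50\times{}10^3}{48\times{}10^3} = \frac{25}{24} \approx 1.0417
\end{align*}
so 60 minutes of content would last $60 \times 25 / 24 = 62.5$ minutes
in the DCP.  When $C_V = F_V$ the formula gives $R_A = F_A$, which
matches the two easy cases.
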

\section{Another explanation}

Say we have some content at a video rate $C_V$ and we want to
run it at DCP video rate $F_V$.  It's always the video rates that
decide what to do, since we don't have an equivalent to audio
resampling in the video domain.

We can just mark the video as $F_V$ and it will run $F_V / C_V$ times
as fast as it did.  Let's call this factor $S = F_V / C_V$.

An equivalent for audio would be to take the content audio at a rate
$C_A$ and mark it as $C_A S$.  Then the same audio frames will be run
more quickly (or more slowly, if $S < 1$), just as the same video
frames are.  The audio would be in sync with the video since it has
been sped up by the same amount.

In practice we can't do this, in general, as the only allowed DCP
audio rates are 48\,kHz and 96\,kHz.  Instead, we'll resample to some new
rate $P$ and mark it as $Q$ where $Q / P = S$.  Resampling does not
change the sound, just how many samples are being used to describe it,
so this is equivalent to marking the original audio (before
resampling) as $C_A S$.

Then we set $Q = F_A$ (e.g.\ 48\,kHz) so that $P = F_A / S$, or
$P = C_V F_A / F_V$.
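
As a quick check of the algebra, assume $C_V = 25$, $F_V = 24$ and
$Q = F_A = 48\times{}10^3$ (illustrative values only):
\begin{align*}
S &= \frac{F_V}{C_V} = \frac{24}{25} = 0.96 \\
P &= \frac{Q}{S} = \frac{48\times{}10^3}{0.96} = 50\times{}10^3
\end{align*}
which agrees with $C_V F_A / F_V = 25 \times 48\times{}10^3 / 24$.
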

Note that the original sampling rate $C_A$ of the audio content does
not appear in this expression, so it is irrelevant to the choice of
$P$.  Also, skipping or doubling of video frames is analogous
to audio resampling: the data are the same, just represented with more
or fewer samples.

\end{document}