SHARE
TWEET

Untitled

a guest Jan 12th, 2019 77 Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
  1. Dear Amiga mod playback software author,
  2.  
  3. you are receiving this note because your software is implementing playback
  4. capabilities for an important format, the Amiga protracker 1.x-3.x mod format.
  5.  
  6. As you probably are aware, most of these modules were composed on the Amiga
  7. computer for playback on Amiga system. The Amiga hardware playback system is
  8. unlike the PC environment, and needs some special care if authentic-seeming
  9. Amiga mod playback is desired.
  10.  
  11. Typical playback software for the various sample tracker formats is architected
  12. roughly like this:
  13.  
  14.     [ Player routine ] --> [ Mixer and resampler ] --> Output
  15.  
  16. The player routine component produces information about the samples, their
  17. sampling frequencies and volume levels (and filtering, panning, etc. depending
  18. on format) to the mixer and resampler component, which uses the player
  19. routine's instructions to generate the final output by resampling the sample
  20. stream from memory and mixing it to physical output channels.  I'm going to
  21. focus exclusively to the Mixer and resampler component here.
  22.  
  23. A typical implementation of the resampler uses linear interpolation, or some
  24. higher-order polynomial interpolation, splines, or even sinc-based
  25. "theoretically correct" resampling with some 8-tap FIR or more. Unfortunately,
  26. all of them miss the mark as far as Amiga is concerned. For typical mod player,
  27. the most accurate emulation of Amiga output occurs when *no* interpolation is
  28. used at all. However, there seems to be nothing special about the mixing --
  29. simply adding the resampled channels together suffices.
  30.  
  31. Why the Amiga is different?
  32. ---------------------------
  33.  
  34. To understand what goes on with the Amiga sound chip, Paula, is to understand that
  35. Paula does no interpolation of any kind. Paula's output is strictly a pulse wave,
  36. produced on 3546895 Hz frequency, which is the Paula clock rate for PAL systems.
  37.  
  38. The Amiga period values, typically valued somewhere between 120 and 800, count Paula
  39. ticks. For instance, a period value of 400 means that Paula waits 400 ticks holding
  40. the output constant, and then changes to the next sample.
  41.  
  42. To simulate this perfectly, we would produce an output sample stream at that
  43. 3.55 MHz rate, one sample per tick, and then simply hold the output constant
  44. while counting down to zero from the period value, then switch to next sample,
  45. reset period clock back to original value, and so on.  This stream would be
  46. very accurate representation of the Paula's output.
  47.  
  48. Aliasing in Paula output
  49. ------------------------
  50.  
  51. It is also worth it to understand that playing back a sample from Paula
  52. replaces the original waveform with its smoothly undulating shapes with nothing
  53. but hard-edged pulses. This will have an impact to the sound, of course, and is
  54. called aliasing distortion. It's called aliasing, because copies of the
  55. original sound's frequency spectrum repeat throughout the entire audible part
  56. of audio spectrum and beyond, theoretically up to infinity.
  57.  
  58. This may surprise those who believe that because Amiga's maximum sampling rate
  59. is some 28867 Hz, the maximum frequency produced by Amiga is threrefore limited
  60. to 14.5 kHz or so. However, this is merely the highest frequency that may be
  61. produced without aliasing.
  62.  
  63. In addition, some misguided people have tried to run their mod players at 28867
  64. Hz mixing frequency based on some kind of confused idea of what the mixing rate
  65. means for Amiga. This won't work either.
  66.  
  67. The analog filters -- more complications
  68. ----------------------------------------
  69.  
  70. After leaving the chip, the sound enters a 6 dB/oct resistor+capacitor (RC)
  71. low-pass filter tuned at approximately 5 kHz. I will call this component "fixed
  72. filter" in this document. The purpose of this filter is probably to remove some
  73. of that aliasing which is inherent in the pulse synthesis. Additionally, it is
  74. possible to engage a low-pass 12 dB/oct Butterworth filter tuned at
  75. approximately 3.2 kHz by turning the Amiga power LED brighter with a special
  76. protracker command.  I will call this component "LED filter".
  77.  
  78. These filters should probably be understood as reconstruction filters, because
  79. their purpose is to attempt to "reconstruct" the original smoothly undulating
  80. waveform after a pulse-based version of it has been generated by some D/A
  81. converter.  However, proper reconstruction filters should probably feature
  82. steeper cut-offs than just 6 or 12 dB, and they should be dynamic; their tuning
  83. should follow the sampling rate, as it is the sampling rate that decides where
  84. the first aliased frequency will occur. (As an example, some Atari ST models
  85. have a DAC chip which is fed through DMA at any of the four possible sampling
  86. rates, and the analog reconstruction filter's cutoff also is chosen based on
  87. the sampling rate.)
  88.  
  89. Regardless, this is the hand we have been dealt with, and in order to
  90. accurately simulate Amiga 500 audio playback, we thus need to emulate these two
  91. filters in addition of producing the right kind of output stream from Paula
  92. emulation.
  93.  
  94. No interpolation -- use resampling!
  95. -----------------------------------
  96.  
  97. To properly resample audio, we need to achieve two things, often done simultaneously:
  98.  
  99. * remove all components in the audio stream above some selected frequency, usually
  100.   about 20 kHz.
  101. * decimate the signal by discarding samples until we obtain output at the correct rate.
  102.   For 44.1 kHz output we need to discard approximately 79 of every 80 samples.
  103.  
  104. Decimation can be safely done only after the lowpass step has been completed.
  105. If decimation is done without lowpass filtering, some extra aliasing shall be
  106. introduced to the remaining audio. What happens is that all frequencies in the
  107. spectrum now alias in the audible range, making the ultrasonic sounds produced by
  108. the pulse-wave synthesis to appear as additional noise in the mod playback.
  109.  
  110. Fortunately, the previous paragraph contains the answer to the problem. It is
  111. possible to make simulated Paula's output consist of "bandlimited steps", or
  112. BLEPs as they are called. The output is done by simulating Paula at 3.55 MHz
  113. clock rate and replacing the hard edges of Paula's pulses by BLEPs, which are
  114. carefully constructed in such a fashion to not contain those difficult frequencies
  115. above 21 kHz or so.
  116.  
  117. Because the input stream does not contain frequencies near or above nyquist
  118. frequency of the final sampling rate, all we have to do is to build an abstract
  119. version of Paula's output signal that we can evalute to real sample whenever it
  120. is needed. The abstract version consists two parts: the output level of
  121. "genuine" Paula with hard-edged pulses, and the states of various BLEPs which
  122. which are then used to correct the "hard edges" of Paula's waveform to the
  123. smooth curves of the BLEPs instead.
  124.  
  125. The mixing of one blep thus has the following procedure, roughly:
  126.  
  127. * adjust the output level of Paula to the new value.
  128. * start a new blep, and scale it in such a fashion that its amplitude is the difference
  129.   between new output level and the old level.
  130.  
  131. If we ask the synthesis function to produce a sample at this moment, it would still
  132. report the old level. But as we clock Paula forward, and ask again, the blep gradually
  133. changes the output level from the old value to the new value. At some point we are done
  134. with the BLEP, and then the output is simply constant at the new output level.
  135.  
  136. This leaves the filters, but they are easy. Because the output is done by
  137. mixing bleps, we can implement the analog filters in the bleps themselves, by
  138. filtering them with digital models of the analog filters to produce versions of
  139. BLEPs that have treble components reduced to match Amiga 500's operation with
  140. LED filter on and off. This may seem somewhat miraculous, but this is how it
  141. works.
  142.  
  143. The implementation
  144. ------------------
  145.  
  146. We need to compute the blep tables. Firstly we need a function that contains
  147. all the frequencies in the world. One such function is called the delta
  148. function. We define it as follows:
  149.  
  150.     delta(x) = 0 if x != 0
  151.     delta(x) = 1 if x = 0.
  152.  
  153. If we low-pass filter the delta, it morphs into a sinc function. sinc function
  154. is a special one, it has all the frequencies from 0 to some given cutoff value.
  155. The cutoff is determined by the rate the sinc is oscillating in.
  156.  
  157.     f(x, rate) = sin(x * pi / rate) / (x * pi / rate) if x != 0
  158.     f(0, rate) = 1 if x = 0
  159.  
  160. With rate = 1.0 the sinc does no filtering at all, and becomes a delta function
  161. in effect. If we plot the sinc function by asking the value of sinc at each
  162. integer value, it is zero everywhere except 1 at 0. To do no filtering at all
  163. is the same as having the cutoff at exactly half of the sampling rate. If the
  164. sinc is to remove everything above, say, 21 kHz, rate = 3.55e6 / 21e3 / 2.
  165.  
  166. An ideal sinc filter extends from negative infinity to the positive infinity.
  167. Because a practical implementation is needed, the sinc function must be
  168. truncated at some point.  The truncation affects the width of the stopband of
  169. the sinc function -- short functions begin filtering earlier and are not as
  170. steeply filtering as long filters.  I've found that 2048 point sinc functions
  171. are sufficient to perform filtering for Paula, so we'd sample the sinc function
  172. from f(-1024, rate) to f(1023, rate) and thus yield 2048 values.
  173.  
  174. Because 2048 point function is quite short, there is an issue at the edges,
  175. where the sinc is still oscillating. Some windowing function is needed to
  176. smooth the sinc properly at edges to zero. The Hann window, or the raised
  177. cosine window is the simplest. It is defined as hann(x, length) = 0.5 - 0.5 *
  178. cos(x / length * 2 * pi). For this, the length would be 2048, and the values of
  179. x would go from 0 to 2047.
  180.  
  181. The result should look like a symmetric "water-drop" with few small side lobes.
  182. If you have some plotting software, you can see the sinc function at this
  183. point. gnuplot suffices, you can type something like:
  184.  
  185. set xrange [-1024:1023]
  186. plot sin(x * pi / 84) / (x * pi / 84) * (0.5 - 0.5 * cos((x + 1024) / 2048 * 2 * pi))
  187.  
  188. The sinc table is final mixing frequency independent
  189. -----------------------------------------------------
  190.  
  191. The only things that go into calculating the sinc table are the Paula clock
  192. frequency and the frequency of the cutoff. As long as calls to obtaining output
  193. occur at rates greater or equal to 44.1 kHz, the table works. With rates below
  194. about 20 kHz, the blep table itself will produce aliasing.
  195.  
  196. The function is sampled at Paula clock rate, which is constant; it therefore
  197. also has a fixed duration in seconds. For 2048 points, it's 2048 paula ticks or
  198. roughly 0.6 milliseconds. From this it follows that at 44.1 kHz sampling
  199. frequency, about 25 samples of the table make it into the output.
  200.  
  201. The table must however be longer than merely 25 samples because depending on
  202. the phase where the blep starts, ie. paula pulse occurs, it's always different
  203. values of the table that need to be chosen, so all of them become used at some
  204. point. (However, in truth the table does not need to be that precise, it could
  205. well be sampled at a lower frequency without compromising output quality
  206. noticeably. Perhaps only about 16 different phase values are needed, but now it
  207. has 80 (as many phases as there are clocks between samples).)
  208.  
  209. Filtering the sinc
  210. ------------------
  211.  
  212. The minimum-phase reconstruction is implemented in cepstral domain in the
  213. included program.  My knowledge of the operations involved is too limited to
  214. explain why or how it works, but after we do this, the filter shape changes in
  215. such a way that as much as possible of the filter occurs "early on" and the
  216. filter "settles" for longer period after an initial large transition. So,
  217. before minimum phase reconstruction, the filter is a symmetric waterdrop-like
  218. wave, with pre- and postringing at the peak, but after the reconstruction the
  219. peak moves to the start of filter, and the filter displays decaying oscillation
  220. for most of its duration.  The upshot of the reconstruction is that it allows
  221. our IIR filters the longest period to settle after the initial excitation by
  222. the large part of the peak, thus allowing us to truncate them earlier.
  223.  
  224. There has been some talk that minimum-phase reconstructions are in general
  225. better audio filters than linear-phase filters (which are symmetric by the
  226. midpoint), because they do not display the preringing that is perceptible to
  227. ear if it occurs for too long. The increased postringing is not an issue, as it
  228. often occurs in nature, and thus the ear "expects" it.  However, the issue is
  229. made complicated by the fact that phase relationships are by definition
  230. modified by minimum-phase reconstructions. At any case, this choice makes no
  231. audible difference in this case, because in both cases the ringing is tuned to
  232. occur at frequencies above 20 kHz, and the phase delays are almost linear
  233. across the entire audible band.
  234.  
  235. The fixed filter
  236. ----------------
  237.  
  238. We should apply the IIR filters to the sinc function before proceeding to build the
  239. band-limited step.  The 6 dB/oct filter is an IIR filter that is computed
  240. according to the following formula:
  241.  
  242.     output(n) = b0 * input(n) + (1 - b0) * output(n - 1)
  243.  
  244. that is, the output is a linear combination of input and the value of output in the
  245. last iteration.
  246.  
  247. The b0 determines the cutoff value and is computed as follows:
  248.  
  249.     omega = 2 * pi * 5000 / sampling_rate
  250.     b0 = 1 / (1 + 1 / omega)
  251.  
  252. As a reminder, the cutoff frequency would be 5.0 kHz and sampling rate is 3.55
  253. MHz because we will operate the blep that is "sampled" at 3.55 MHz. The rest of
  254. the problem here is just applying the previous formula to filtering the BLEP.
  255.  
  256. Amiga 1200 current leakage simulation
  257. -------------------------------------
  258.  
  259. It seems that Amiga 1200 output response attenuated by some 0.2 - 0.3 dB near
  260. 20 kHz.  This would seem to be due to the LED filter even though it isn't
  261. enabled, because this effect does not manifest when turning the LED filter on.
  262. It is difficult to be sure, though, because of limited resolution and the
  263. extreme attenuation, noise floor gets very close to signal. Regardless, the
  264. effect can be modelled with a 32 kHz lowpass RC filter.
  265.  
  266. A word of caution
  267. -----------------
  268.  
  269. To correctly apply the filter, we first need to extend the sinc we computed by
  270. imagining that it is preceded by, say, 10 000 constant values of the first
  271. value of the filter, and these are run through the filter to ensure that the
  272. filter is properly settled for the base level of the blep. This is especially
  273. important if the first value of the sinc is not zero, or if the filter is
  274. reused between computations for any reason.
  275.  
  276. Additionally, to validate that the length of the blep is sufficient, it would
  277. be helpful to examine how much the IIR filter oscillates at the end of the
  278. table by computing the value of
  279.  
  280.     filter(last_output_value) - last_output_value
  281.  
  282. If the filter has not yet damped, the difference will not be zero, and depending of
  283. the oscillation magnitude, artifacts could be introduced.
  284.  
  285. The LED filter
  286. --------------
  287.  
  288. If no LED filtering emulation is desired, the next BLEP can be left uncomputed.
  289. To simulate the LED filter, we clone the previous sinc function for applying
  290. the LED filter on.  The model for LED filter requires application of a biquad
  291. filter that simulates the 2nd order Butterwoth lowpass filter.
  292.  
  293. The formula for biquad is as follows:
  294.  
  295.     output(n) = b0 * input(n) + b1 * input(n-1) + b2 * input(n-2)
  296.         - a1 * output(n - 1) - a2 * output(n - 2)
  297.  
  298. That is, two samples of past input history and two samples of past output history and
  299. the current sample are required. The coefficients are derived using bilinear transform
  300. in the Python source; feel free to look at the gory details for computing b0, b1, b2,
  301. a1 and a2 from there.
  302.  
  303. So, anyway, we should go and filter our copy of the previous sinc function
  304. with the biquad formula, and then proceed to figuring out how to convert the
  305. filtered sincs to bandlimited steps.
  306.  
  307. The step-function
  308. -----------------
  309.  
  310. Step is our representation of Paula's pulses. We could define it as follows:
  311.  
  312.     f(x) = 0 if x < 0
  313.     f(x) = 1 if x >= 0
  314.  
  315. The band-limited step
  316. ---------------------
  317.  
  318. Convolution in DSP is the application of a filter in the time domain. It is
  319. basically like computing a dot product between two vectors. The other "vector"
  320. is the tabularised version of the convolution function, and the other is the
  321. sample data around the point we wish to compute convolution at. We would be
  322. talking about 2048-dimensional vector here, if vectors were otherwise an
  323. appropriate metaphor.
  324.  
  325. In order to bandlimit a step, we can calculate a representation of the step
  326. sample stream.  We start with the entire sample buffer at zero and calculate
  327. the convolution. It is all zero as expected.
  328.  
  329. Then we introduce the edge, or the first sample of the step into the sample
  330. buffer. The convolution's value is now equal to the first value in the
  331. convolution function. As more samples are introduced, the value of convolution
  332. is the sum of the values in the convolution function over the area of the step
  333. function. Thus, the bandlimited step is in effect just a numeric integral of
  334. the convolution function.
  335.  
  336. We should take care to scale the step accordingly. If you are dealing with
  337. 16-bit sample data, you would probably want to scale the step in such a fashion
  338. that it starts at, say, 0 and ends at 65536. Or, as I often prefer, it starts
  339. at -65536 and ends at 0. This is just to make the blep evaluation code simple.
  340. The downside with starting with negative number is that the sign takes 1 byte
  341. to encode in the resultant C source. For the example program, the blep
  342. resolution is 17 bits and the blep runs from a positive value to zero.
  343.  
  344. Sample code in C (but for mono output)
  345. --------------------------------------
  346.  
  347. /* 131072 to 0, 2048 entries */
  348. #define BLEP_SCALE 17
  349. #define BLEP_SIZE 2048
  350.  
  351. /* the step function */
  352. int32_t sine_integral[BLEP_SIZE] = { precomputed table };
  353.  
  354. /* the instantenous value of Paula output */
  355. static int16_t global_output_level;
  356.  
  357. /* count of simultaneous bleps to keep track of */
  358. static unsigned int active_bleps;
  359.  
  360. /* the structure that holds data of bleps */
  361. typedef struct {
  362.     int16_t level;
  363.     int16_t age;
  364. } blep_state_t;
  365.  
  366. /* place to keep our bleps in. MAX_BLEPS should be
  367.    defined as a BLEP_SIZE / MINIMUM_EVENT_INTERVAL.
  368.    For Paula, minimum event interval could be even 1, but it makes sense to
  369.    limit it to some higher value such as 16. */
  370. static blep_state_t    blepstate[MAX_BLEPS];
  371.  
  372. /* return output simulated as series of bleps */
  373. int16_t
  374. output_sample(int filter_table) {
  375.     int i;
  376.     int32_t output;
  377.  
  378.     output = global_output_level << BLEP_SCALE;
  379.     for (i = 0; i < active_bleps; i += 1)
  380.         output -= winsinc_integral[filter_table][blepstate[i].age] * blepstate[i].level;
  381.     output >>= BLEP_SCALE;
  382.    
  383.     if (output < -32768)
  384.         output = -32768;
  385.     else if (output > 32767)
  386.         output = 32767;
  387.    
  388.     return output;
  389. }
  390.  
  391. In the code where new Paula samples get added to the system, you would so
  392. something like this:
  393.  
  394. function input_sample(int16_t sample) {
  395.     if (sample != global_output_level) {
  396.         /* Start a new blep: level is the difference, age (or phase) is 0 clocks. */
  397.         if (active_bleps > MAX_BLEPS - 1) {
  398.             fprintf(stderr, "warning: active blep list truncated!\n");
  399.             active_bleps = MAX_BLEPS - 1;
  400.         }
  401.         /* Make room for new blep */
  402.         memmove(&blepstate[1], &blepstate[0], sizeof(blepstate[0]) * active_bleps);
  403.         /* Update state to account for the new blep */
  404.         active_bleps += 1;
  405.         blepstate[0].age = 0;
  406.         blepstate[0].level = sample - global_output_level;
  407.         global_output_level = sample;
  408.     }
  409. }
  410.  
  411. And while no events occur, you would call the clock() function to advance the bleps.
  412.  
  413. void clock(unsigned int cycles) {
  414.     for (i = 0; i < active_bleps; i += 1) {
  415.         blepstate[i].age += cycles;
  416.         if (blepstate[i].age >= BLEP_SIZE) {
  417.             active_bleps = i;
  418.             break;
  419.         }
  420.     }
  421. }
  422.  
  423. This code is not very optimized in that it constantly keeps blepstate as a
  424. sorted list. It is possible to implement it more efficiently, for instance by
  425. keeping occurence times instead of ages, but that complicates the output sample
  426. function.  The right optimization of the code might depend on whether periods
  427. are small or large compared to sampling intervals. For Paula, periods are
  428. typically much larger than sampling intervals, so code like this might be
  429. better tradeoff. For modern computers, it takes quite little time, so I leave
  430. the optimizing for people who are that way inclined. Typical length of the blep
  431. list can be short, only 5-6 entries if the sample rates are low or occur
  432. conveniently otherwise. At the maximum, Paula is constantly producing pulse
  433. waves but it's probably sensible to limit maximum period to 124 unless Paula
  434. DMA fetch starvation is simulated.
  435.  
  436. A typical sequence of events would look like this. Assuming that we sample output
  437. at exactly 80 paula clock ticks apart, and that playback period is, say 140, we'd do
  438. calls like this:
  439.  
  440.     input_sample(samplebuf[0]);
  441.     clock(80);
  442.     *output++ = output_sample();
  443.     clock(60);
  444.     input_sample(samplebuf[1]);
  445.     clock(20);
  446.     *output++ = output_sample();
  447.     clock(80);
  448.     *output++ = output_sample();
  449.     clock(40);
  450.     input_Sample(samplebuf[2]);
  451.     ...
  452.  
  453. Results and analysis
  454. --------------------
  455.  
  456. I have hi-fi recordings of Amiga 500 and 1200 playing Sainahi Circles with both
  457. LED filter on and off. This analysis was largerly based on the sampling work by
  458. pyksy on #amigaexotic at IRCnet. Big thanks to pyksy for his work! In addition,
  459. Melkor (kohina board) sent me a hi-fi sampling of his Amiga 4000 playing Sainahi
  460. Circles.
  461.  
  462. The results are summarised as number of png pictures that display the simulated
  463. and real responses side-by-side and their residual (difference) spectrums. The
  464. units are in Hz and dB.  There is a constant offset that follows from the fact
  465. that the hi-fi samplings used 32-bit floating point samples, whose value range
  466. is different from the integer-coded PCM data, yielding the largest part of the
  467. difference. The rest is explained by the different recording levels used
  468. between setups.
  469.  
  470. The files called a{1200,500,4000}_{on,off}.png shows the simulated response of
  471. UADE and real Amiga superimposed. comb{1200,500,4000}_{on,off}.png show the
  472. difference between two in dB units. These are the same pictures as before, but
  473. with the spectra substracted, allowing for more precise evaluation of the
  474. differences between simulated and real output.
  475.  
  476. We can see that UADE and Amiga produce same features closely, thus essentially
  477. validating the model used for output synthesis.
  478.  
  479. We can observe a good cutoff starting around 20 kHz and quickly extinguishing
  480. response by approximately 23 kHz. This should yield satisfactory results for
  481. all sampling frequencies above and including 44.1 kHz. However, I personally
  482. recommend 48 kHz as default output frequency because it works without
  483. resampling on majority of cards, while 44.1 kHz output often may need a
  484. resampling step on either software or hardware.
  485.  
  486. For LED off cases, the residual spectrum is largerly just noise. There is,
  487. however a small 0.3 dB boost around 0 to 3 kHz. This can be seen on Amiga 500,
  488. but not on Amiga 1200. In both Amiga 500 and Amiga 1200, the LED filter also
  489. displayed this boost, and it was compensated for by adjusting the Butterworth
  490. filter with a small resonance term.
  491.  
  492. Despite adjustments, the LED filter case is not perfect. The residual spectrum
  493. is roughly flat to 3 kHz, thanks to manually fitting the filter, but after that
  494. there is some extra attenuation which is then compensated by something else on
  495. the analog filter.  For Amiga 500 case, the simulation hits noise floor and can
  496. not yield anything meaningful above 15 kHz, but those measurements also show
  497. the same general pattern.
  498.  
  499. Overall, I think the model yields accuracy to about 0.5 dB for all cases, which
  500. should be sufficient for all practical purposes. Especially the Amiga 1200 case
  501. seems well modelled.
  502.  
  503. Author
  504. ------
  505.  
  506. Antti S. Lankila <alankila@bel.fi>
  507.  
  508. feel free to send feedback, comments and spam.
RAW Paste Data
We use cookies for various purposes including analytics. By continuing to use Pastebin, you agree to our use of cookies as described in the Cookies Policy. OK, I Understand
 
Top