
Also, if we talk about the volume parameter.

The human ear's dynamic range is about 120 dB, which includes about 20-30 dB of pain. With 127 bits, we can map that with 1 dB resolution.

16 bit audio ("CD quality") only has a 90 dB dynamic range.

We would almost never want a single instrument to have a 90 dB dynamic range, but if we did, MIDI values could logarithmically encode it with a better than 1 dB per step resolution.
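As a sketch of that claim, here is a hypothetical linear-in-dB mapping of MIDI velocity over a 90 dB range (the range and mapping are illustration values, not anything from the MIDI spec):

```python
DYN_RANGE_DB = 90.0  # hypothetical instrument dynamic range

def velocity_to_gain_db(velocity):
    # Map MIDI velocity 1..127 linearly in dB: 127 -> 0 dB, 1 -> -90 dB.
    # Equal dB per step means the amplitude scaling is logarithmic.
    return -DYN_RANGE_DB * (127 - velocity) / 126.0

STEP_DB = DYN_RANGE_DB / 126.0  # ~0.71 dB per step, i.e. better than 1 dB
```

With 126 intervals between the quietest and loudest playable velocity, each step is about 0.71 dB, comfortably under the ~1 dB just-noticeable difference the parent comment alludes to.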

In a mix, any instrument that is reduced by more than about 20 dB will disappear.

When synthesized music (e.g. electronic drumming) lacks dynamics, it is not because of the encoding of the raw volume parameter. It's due to other factors, like poor synth patches. Poor synth patches use a small number of samples, like say a snare drum being hit in three different ways, and they stretch that over the dynamic range with some naive scaling. A real drum doesn't work that way; it makes a different sound for each intensity with which it is struck. You need samples of it being played at a myriad of volume levels, sorted by intensity and mapped to the intensity range.
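The sample-mapping idea can be sketched roughly like this (the layer thresholds and file names are made up for illustration; real samplers use many more layers):

```python
import bisect

# Hypothetical velocity layers for a snare: (upper velocity bound, sample file).
LAYERS = [(40, "snare_soft.wav"), (90, "snare_medium.wav"), (127, "snare_hard.wav")]

def sample_for_velocity(velocity):
    # Pick the first layer whose upper bound covers this velocity,
    # so each intensity range triggers a genuinely different recording.
    bounds = [b for b, _ in LAYERS]
    i = bisect.bisect_left(bounds, velocity)
    return LAYERS[min(i, len(LAYERS) - 1)][1]
```

Naive scaling would instead play `snare_medium.wav` at every velocity and just turn the gain down, which is exactly the flatness the parent comment describes.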

Some instruments don't even change intensity that much when they are played louder; a lot of the perception of loudness comes from changing harmonic content. If you fake it with one sample that is just volume-adjusted, it will not sound right.

Synthesizers have tricks to help with this, like low-pass filters that respond to velocity: hit the key harder, and more high frequencies go through. That's one tool in the box for creating a more dynamic sound from scratch.
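A minimal sketch of that trick, assuming an exponential cutoff curve and arbitrary endpoint frequencies (real synths expose these as patch parameters):

```python
import math

def cutoff_hz(velocity, lo=200.0, hi=8000.0):
    # Exponential sweep from lo to hi: harder hits open the filter more.
    # The 200 Hz / 8 kHz endpoints are illustration values.
    return lo * (hi / lo) ** (velocity / 127.0)

def one_pole_lowpass(samples, cutoff, sample_rate=44100.0):
    # Simple one-pole IIR low-pass; a higher cutoff lets more highs through.
    a = math.exp(-2.0 * math.pi * cutoff / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out
```

Feeding the same oscillator through `one_pole_lowpass(samples, cutoff_hz(velocity))` makes loud notes brighter, which mimics the harmonic-content change of a real instrument.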



The perceptibility of the 128 steps depends on the parameter they influence. If the MIDI parameter controls e.g. some form of pitch, 128 values won't get you very far before the steps become perceptible.

It also has to do with the pressure range of MIDI controllers: 128 steps are few if you have to distribute them between "barely touching" and "hammering on it with full force". When you play a real instrument, you will notice that the range between the quietest and loudest sound you can manage is usually huge. For MIDI controllers this range is rather limited, so higher resolution is a good development.


You can distribute the values non-linearly, though; the difference between smallest and largest value might be big, but I don't think I could hit a pad or key in 128 different ways. Some controller software does offer a selection of velocity profiles.
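One such velocity profile could be a simple power-law curve like the sketch below (the curve shape and the `gamma` parameter are hypothetical; commercial controller software ships its own preset curves):

```python
def apply_curve(velocity, gamma=2.0):
    # Hypothetical power-law velocity curve: gamma > 1 softens light hits,
    # gamma < 1 makes the pad feel more sensitive. Endpoints are preserved.
    x = velocity / 127.0
    return round(127 * x ** gamma)
```

The point is that the 128 output values stay the same; only how the player's physical force is distributed across them changes.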


It depends on the instrument. 128 positions is quite enough for a piano. Of the other instruments, drums are the ones I know best, and there a single parameter is just not enough: the result depends on velocity, the position where the stick hits the drum head, and, for non-round-tip sticks, the stick angle. For cymbals, in addition to hitting different parts of the cymbal, the stick tip, shaft and shoulder give different sounds. And so on... There's a good reason why loops sampled from acoustic drums are used even though drum synths exist.


You do not need MIDI 2.0 to solve any of those problems.


No, and I don't expect MIDI 2.0 to help. It was just a response to the idea that a single parameter would be enough but 7 bits wouldn't.


If you want high-resolution pitch, then you basically disagree with the whole concept of MIDI. The concept of MIDI is that notes are symbols. MIDI tells an instrument to play A4, not to play a note with a 440 Hz fundamental. MIDI doesn't care how that instrument is tuned. A4 could come out as 430 Hz.

That said, MIDI supports microtonal effects like pitch bending. Pitch bend messages use 14 bits.
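For reference, a pitch bend message packs its 14-bit value into two 7-bit data bytes following the status byte. A sketch of the byte layout (channel and value ranges per MIDI 1.0):

```python
def pitch_bend_message(channel, value):
    """Build a MIDI 1.0 pitch bend message.

    value: 0..16383, where 8192 means no bend (14-bit resolution).
    """
    assert 0 <= channel <= 15 and 0 <= value <= 16383
    lsb = value & 0x7F         # low 7 bits in the first data byte
    msb = (value >> 7) & 0x7F  # high 7 bits in the second data byte
    return bytes([0xE0 | channel, lsb, msb])
```

So a centered bend wheel on channel 1 produces the three bytes `E0 00 40`.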


You get 7 bits with MIDI, not 127 bits. Also, no one wants to encode audio samples using MIDI; the last time someone (ab)used a volume control for digital sample playback was on the Commodore 64.


That's not true ... http://www.4front-tech.com/pguide/midi/midi8.html describes the standard.

Those of us who were using MIDI in 1990 fondly remember it taking too much time, being not well supported, and generally not working well. "No one wants to" is true, but 30 years ago, many people did want to.


That's just a typo; 128 values.


> 16 bit audio ("CD quality") only has a 90 dB dynamic range.

That's a persistent myth. The channel noise floor at the frequencies of interest of 44.1/48 kHz, 16-bit audio is below -100 dB thanks to combined noise shaping and dithering.

While this means 16-bit audio is generally sufficient for music, it leaves little room for error; mastering has to be excellent. That's why everyone is recording in 24 bit; it allows you to patch up errors later.
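The dithering point can be demonstrated numerically. Below is a toy sketch (TPDF dither in LSB units, a made-up 997 Hz test tone at 0.4 LSB peak): without dither a sub-LSB signal quantizes to silence, while with dither it survives as signal plus noise.

```python
import math
import random

random.seed(0)

def quantize(x, dither=False):
    # x is in units of one LSB; round to the nearest integer code.
    if dither:
        # TPDF dither: sum of two uniform(-0.5, 0.5) LSB sources.
        x = x + random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)
    return round(x)

# A sine at 0.4 LSB peak -- below the undithered quantizer's threshold.
signal = [0.4 * math.sin(2 * math.pi * 997 * n / 44100) for n in range(2000)]

undithered = [quantize(s) for s in signal]          # all zeros: signal lost
dithered = [quantize(s, dither=True) for s in signal]  # nonzero: signal smeared
                                                        # into noise, recoverable
                                                        # on average
```

This is why the effective noise floor of a dithered 16-bit channel can sit below the naive quantization limit: content under 1 LSB is encoded statistically rather than truncated away.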


It's not a "persistent myth" - it's absolutely correct.

The "persistent myth" is that signal-to-noise ratio and dynamic range are somehow identical. They aren't.

It's also a persistent myth that the unaltered quantisation noise spectrum is basically white noise. It isn't, except as a poor approximation.

In fact it's very spiky - mathematically it's literally a function related to the sample rate. Some frequencies produce more audible quantisation artefacts than others. This is audible on very good hardware, and it contributes to both harmonic and intermodulation distortion on cheaper hardware.

Dither and noise shaping distract from the effect in a subjectively pleasing way, but technically they're a cheap fix - like blurring a jpeg and pretending this somehow magically removes all of the compression artefacts. The result may be fine for Instagram, but not for commercial photography.

The bottom line is that 24-bit sampling fixes these issues because they simply become irrelevant. The SNR limits are defined by the analog limitations of the converters, and all of the quantisation artefacts remain below audibility.


> The "persistent myth" is that signal-to-noise ratio and dynamic range are somehow identical. They aren't.

Obviously :)

The SNR will stay at around ~96 dB for 16 bit audio. If that noise were white, then that would naturally limit the dynamic range to essentially the same number. But no one said that the quantization noise of the channel has to be white, indeed, the whole point of dithering and noise shaping is to de-correlate the quantization error from the signal and make the noise spectrum anything but white. Hence the dynamic range can be increased.
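The numbers being traded here follow from the standard formulas (20·log10 of the code space for raw dynamic range, and the classic 6.02·N + 1.76 dB full-scale-sine SNR):

```python
import math

BITS = 16

# Dynamic range of the raw 16-bit code space: 20*log10(2^16) ~ 96.3 dB.
dynamic_range_db = 20 * math.log10(2 ** BITS)

# Classic full-scale-sine SNR with uniform (white) quantization noise:
# 6.02*N + 1.76 dB ~ 98.1 dB for 16 bits.
sine_snr_db = 6.02 * BITS + 1.76
```

Both figures assume the quantization noise is spread evenly; shaping that noise out of the audible band is exactly how the perceived dynamic range gets pushed past them.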

> It's not a "persistent myth" - it's absolutely correct.

No. The SNR is 96 dB; the dynamic range for the relevant frequency band is greater.



