Sound is transmitted through waves and the pitch, volume, and timbre of the sound are all determined by the structure of the wave.
- 1 Wave Terminology
- 2 Waveform
- 3 Noise
- 4 Wave Creation
- 5 Envelopes
- 6 Filters
- 7 Hard Sync
The structure of a sound wave can be represented with a two-dimensional graph like the one to the right. The crest is the highest point of the wave and the trough is the lowest. Amplitude (often represented with an 'a') is the distance from the center line to either the crest or the trough. Wavelength (often represented with the Greek letter lambda, 'λ') is the distance from one crest to the next, from one trough to the next, or from one crossing of the center line to the next crossing in the same direction.
Amplitude corresponds to the volume of a sound wave. A low amplitude yields a quieter sound, while a high amplitude yields a louder sound. The human ear perceives volume on a roughly logarithmic scale. As a common rule of thumb, for a sound to be perceived as twice as loud, it must carry about ten times the acoustic power. Because of this, amplitude is usually measured on a logarithmic scale such as decibels (dB).
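The logarithmic scale normally used for volume is the decibel. The sketch below shows the standard conversions from a ratio to decibels; the function names are hypothetical and chosen only for illustration.

```python
import math


def power_ratio_to_db(ratio: float) -> float:
    """Convert a ratio of two acoustic powers to decibels."""
    return 10.0 * math.log10(ratio)


def amplitude_ratio_to_db(ratio: float) -> float:
    """Convert a ratio of two amplitudes to decibels.

    Power grows with the square of amplitude, hence the factor of 20.
    """
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    # Ten times the power is a 10 dB increase.
    print(power_ratio_to_db(10.0))
    # Ten times the amplitude is a 20 dB increase.
    print(amplitude_ratio_to_db(10.0))
```

Equal steps in decibels correspond to equal ratios, not equal differences, which is what makes the scale a reasonable match for how loudness is perceived.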
The pitch (or frequency) of a sound rises as the wavelength shrinks and falls as the wavelength expands. Thus, a high-pitched sound has short, tightly packed wavelengths, while a low-pitched sound has long, spread-out wavelengths. Due to the structure of the human ear, as sound waves change in pitch, we perceive them as also changing in volume. Generally, a low-pitched sound will seem quieter than a high-pitched sound with an identical amplitude. However, as the pitch continues to rise, the perceived volume eventually falls as it reaches the upper boundary of our high-frequency hearing.
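The inverse relationship between wavelength and pitch follows from frequency = speed / wavelength. The sketch below assumes sound travelling through air at about 343 m/s (roughly room temperature); that constant and the function name are assumptions made for illustration.

```python
# Assumption: speed of sound in dry air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_PER_S = 343.0


def frequency_hz(wavelength_m: float,
                 speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Return the frequency of a wave with the given wavelength."""
    return speed_m_per_s / wavelength_m


if __name__ == "__main__":
    # Halving the wavelength doubles the frequency (one octave up).
    for wavelength in (1.0, 0.5, 0.25):
        print(wavelength, "m ->", frequency_hz(wavelength), "Hz")
```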
The timbre (sometimes called the color or texture) of the sound wave is based on the shape of the wave, or waveform. Common shapes can be seen below.
Square waves are waveforms shaped like squares. They produce a very artificial-sounding "beep" tone, but are easily generated. By definition, a square wave spends equal time at its crest and at its trough. If the two times are unequal, giving a rectangular waveform, the wave is described as a pulse wave, which has a different timbre. Most early 8-bit video game sound chips could generate pulse waves of varying rectangular shapes, including square waves.
A pulse wave is a simple waveform very similar in shape to a square wave, but with a non-symmetrical (or rectangular) waveform. Pulse waves usually have programmable duty cycles (or pulse widths), which means the amount of activity per period (the percentage of a single period during which the wave is high) is customizable. The diagram to the right represents a 75% duty cycle (3/4 up, 1/4 down), while a 25% duty cycle would be the reverse (1/4 up, 3/4 down), and a 50% duty cycle creates a square wave (1/2 up, 1/2 down). Changing the duty cycle modifies the wave's timbre, but doesn't modify the pitch or volume. Pulse waves still produce classic "beep" sounds like square waves, only with a slightly different timbre for each duty cycle.
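The duty cycle can be illustrated with a minimal software sketch. It is not how any particular sound chip works internally; the function name and parameters are hypothetical.

```python
def pulse_wave(frequency_hz, duty_cycle, sample_rate, num_samples,
               amplitude=1.0):
    """Generate a pulse wave as a list of samples.

    duty_cycle is the fraction of each period spent high (0.5 = square).
    """
    samples = []
    for n in range(num_samples):
        # Position within the current period, from 0.0 up to (not including) 1.0.
        phase = (n * frequency_hz / sample_rate) % 1.0
        samples.append(amplitude if phase < duty_cycle else -amplitude)
    return samples


if __name__ == "__main__":
    # One period at 8 samples per period: 75% duty gives 6 high, 2 low.
    print(pulse_wave(1.0, 0.75, 8, 8))
    # 50% duty gives a square wave: 4 high, 4 low.
    print(pulse_wave(1.0, 0.50, 8, 8))
```

Only the `duty_cycle` argument changes between the two calls, so the period (pitch) and peak amplitude (volume) stay the same while the shape differs.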
Triangle waves are shaped like triangles and produce a deep tone similar to the classic "boop" sound of a sine wave. They were often used instead of sine waves in early synthesizers because they are much easier to produce, but the advent of low-cost programmable waves made them obsolete. An example of triangle waves can be heard in the bass line of most Nintendo Entertainment System games, such as the Title BGM from Metroid (NES).
The APU of the Nintendo Entertainment System featured a dedicated triangle wave channel.
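As a rough illustration of why a triangle wave is easy to produce, the sketch below builds one from nothing more than a repeating ramp and an absolute value. It is an idealized, continuous-valued triangle and does not model the NES channel's actual output; the function name is hypothetical.

```python
def triangle_wave(frequency_hz, sample_rate, num_samples, amplitude=1.0):
    """Generate a triangle wave that starts at zero and rises first."""
    samples = []
    for n in range(num_samples):
        phase = (n * frequency_hz / sample_rate) % 1.0
        # Shift by a quarter period so the wave starts at the center line,
        # then fold the ramp around its midpoint to get straight slopes.
        folded = abs(((phase + 0.25) % 1.0) - 0.5)
        samples.append(amplitude * (1.0 - 4.0 * folded))
    return samples


if __name__ == "__main__":
    # Four samples per period: center, crest, center, trough.
    print(triangle_wave(1.0, 4, 4))
```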
Sawtooth waves are waveforms with a shape similar to the teeth on a saw. Compared to a square wave, they produce a rougher "behp" sound. Though sawtooth waves are easily generated, they were far less common on early audio chips, making them rare in early video game music. This was probably because their rough sound made them less conducive to melodic music.
Konami's VRC VI chip had a dedicated sawtooth wave channel.
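A sawtooth is about the simplest waveform to compute, since the output is just the position within the current period rescaled to the output range. The sketch below is a generic illustration, not a model of the VRC VI channel; the function name is hypothetical.

```python
def sawtooth_wave(frequency_hz, sample_rate, num_samples, amplitude=1.0):
    """Generate a rising sawtooth wave as a list of samples."""
    samples = []
    for n in range(num_samples):
        # Position within the current period, from 0.0 up to (not including) 1.0.
        phase = (n * frequency_hz / sample_rate) % 1.0
        # Ramp linearly from -amplitude up toward +amplitude, then snap back.
        samples.append(amplitude * (2.0 * phase - 1.0))
    return samples


if __name__ == "__main__":
    # Two periods at four samples per period.
    print(sawtooth_wave(1.0, 4, 8))
```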
Sine waves are waveforms with a regular curve. When left unmodified, a sine wave produces a very artificial-sounding "boop" tone. They are fairly easily generated, though not as easily as a triangle wave, which was more prominent in the 8-bit era. Sine waves didn't become popular in video game music until the early 1990s, when chips appeared that could easily modify them using FM Synthesis.
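For comparison with the simpler shapes, a sine wave requires evaluating a trigonometric function (or reading a precomputed table) for every sample. The following minimal sketch uses Python's standard `math.sin`; the function name and parameters are hypothetical.

```python
import math


def sine_wave(frequency_hz, sample_rate, num_samples, amplitude=1.0):
    """Generate a sine wave as a list of samples."""
    return [
        amplitude * math.sin(2.0 * math.pi * frequency_hz * n / sample_rate)
        for n in range(num_samples)
    ]


if __name__ == "__main__":
    # One period at four samples per period: center, crest, center, trough
    # (up to floating-point rounding).
    print(sine_wave(1.0, 4, 4))
```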
As synthesizers increased in complexity, it became possible to program complex waveforms. Complex waves can take the shape of any of the previous waveforms, like the sawtooth or triangle, as well as nearly any other arbitrary shape, even those that are non-symmetrical. This was very helpful for replicating the sound of real-life instruments, because their waveforms do not take such simplistic shapes.
The sound chips in the Famicom Disk System and Game Boy each featured a single programmable wave channel which could yield complex waveforms, but complex waveforms only became truly popular with the advent of FM Synthesis, pioneered by Yamaha with its FM Operator line, including the OPL2 and OPL3.
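A programmable wave channel can be thought of as a small table holding one period of an arbitrary shape, read repeatedly at a speed set by the desired pitch. The sketch below assumes a table of 32 four-bit values (0 to 15), similar in size to the Game Boy's wave memory, but it is a generic illustration rather than an emulation of either chip; all names are hypothetical.

```python
def play_wavetable(table, frequency_hz, sample_rate, num_samples):
    """Read one stored period repeatedly at the requested frequency."""
    out = []
    for n in range(num_samples):
        # Position within the current period, from 0.0 up to (not including) 1.0.
        phase = (n * frequency_hz / sample_rate) % 1.0
        index = int(phase * len(table)) % len(table)
        out.append(table[index])
    return out


if __name__ == "__main__":
    # An arbitrary, non-symmetrical shape: a fast rise and a slow stepped fall.
    table = [0, 5, 10, 15] + [15 - (i // 2) for i in range(28)]
    assert len(table) == 32 and all(0 <= v <= 15 for v in table)
    # Played at double speed, every second table entry is skipped.
    print(play_wavetable(table, 2.0, 32, 16))
```

Changing the table contents changes the timbre, while changing `frequency_hz` changes only the pitch.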
Noise is a term for random sound waves, which are usually undesirable, but which can be harnessed for musical purposes. Many early audio chips would have a dedicated noise channel that would be used in short bursts to simulate percussion. There are various types or "colors" of noise.
White noise is essentially random sound with no discernible pattern whatsoever. When plotted on a chart of discrete samples, no correlation can be seen between a point and any subsequent points. This gives the sound of white noise its trademark "hiss."
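In software, uncorrelated samples like these can be sketched by drawing each one independently from a random number generator. This is only an illustration of the idea; the function name and the `seed` parameter are hypothetical.

```python
import random


def white_noise(num_samples, seed=None):
    """Generate samples in [-1.0, 1.0], each drawn independently."""
    rng = random.Random(seed)  # a seed makes the output repeatable
    return [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]


if __name__ == "__main__":
    # Each value is unrelated to the one before it.
    print(white_noise(8, seed=1))
```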
Pink noise is also random, but unlike white noise, each sample is correlated with the previous one. When plotted, the graph jumps around, but never too far from the previous sample. Because of this, pink noise has a more dampened "hiss" than white noise.
Brown noise (also called Brownian noise or red noise) is like pink noise, but each sample is tied much more closely to the previous one, simulating Brownian motion. When plotted on a small scale, it doesn't even appear that random, but over time the changes accumulate to form a pattern-free randomization. This dampens the noise even further, creating more of a "drone" or "roar" than a "hiss."
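The accumulation described above can be sketched as a random walk: each sample is the previous sample plus a small random step. The step size, the clamping to the range -1.0 to 1.0, and the function name are assumptions made for this illustration.

```python
import random


def brown_noise(num_samples, step=0.05, seed=None):
    """Generate a random walk clamped to [-1.0, 1.0]."""
    rng = random.Random(seed)
    value = 0.0
    out = []
    for _ in range(num_samples):
        # Move a small random distance from the previous sample.
        value += rng.uniform(-step, step)
        # Keep the walk inside the valid output range.
        value = max(-1.0, min(1.0, value))
        out.append(value)
    return out


if __name__ == "__main__":
    # Neighbouring values stay close together, unlike white noise.
    print(brown_noise(8, seed=1))
```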
Noise also comes in blue, violet and gray, but these are less common in video game audio.
Electronic waves must be generated before they can be introduced into the air to be heard by our ears. There are several methods of generating waves.
Envelopes change pitch, duty cycle, cutoff frequency and volume over time. They make a sound more dynamic and less stiff.
The most common volume envelope is ADSR (attack, decay, sustain, release). When a note begins, it rises, either from mute or from the most recent volume, to full volume over the attack time. From there, it falls to the sustain level over the decay time. Once the note ends, it falls to mute over the release time. The rise and fall can be linear or exponential.
The SID, OPL2 and S-SMP sound chips and the SF2 format feature ADSR. The AY-3-8910 sound chip and the square wave and noise channels in the NES and Game Boy platforms feature fewer phases, but offer programmable volumes, allowing sound programmers to build complex volume envelopes.
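The ADSR phases can be sketched as a function that returns the volume level at a given time after the note starts. This minimal version assumes linear segments and a note that always starts from mute; it does not reproduce the behavior of any of the chips named above, and the function and parameter names are hypothetical.

```python
def adsr_level(t, attack, decay, sustain, release, note_length):
    """Return the envelope level (0.0 to 1.0) at t seconds after note start.

    attack, decay and release are durations in seconds; sustain is a level.
    """
    def held_level(time_held):
        if time_held < attack:
            # Attack: rise from mute to full volume.
            return time_held / attack
        if time_held < attack + decay:
            # Decay: fall from full volume to the sustain level.
            return 1.0 - (1.0 - sustain) * (time_held - attack) / decay
        # Sustain: hold until the note ends.
        return sustain

    if t < 0.0:
        return 0.0
    if t < note_length:
        return held_level(t)
    # Release: fall to mute from whatever level the note ended at.
    elapsed = t - note_length
    if elapsed >= release:
        return 0.0
    return held_level(note_length) * (1.0 - elapsed / release)


if __name__ == "__main__":
    for t in (0.0, 0.05, 0.1, 0.2, 0.5, 1.2, 2.0):
        print(t, adsr_level(t, attack=0.1, decay=0.2, sustain=0.5,
                            release=0.4, note_length=1.0))
```

Multiplying each sample of a waveform by the level at that sample's time applies the envelope to the sound.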
Low-pass, band-pass, band-reject, high-pass, etc.
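In hard sync, one wave (the master) forces another wave (the slave) to restart its cycle every time the master completes one of its own. The result repeats at the master's pitch, with a timbre that changes as the slave's frequency is raised above it.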
The SID features hard sync.
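Hard sync can be sketched with two phase counters, where the slave's counter is reset to zero whenever the master's counter wraps around. The example below uses a sawtooth as the slave's shape and is a generic illustration only; it does not model the SID's circuitry, and all names are hypothetical.

```python
def hard_sync_saw(master_hz, slave_hz, sample_rate, num_samples):
    """Generate a sawtooth slave wave that is hard-synced to a master."""
    master_phase = 0.0
    slave_phase = 0.0
    out = []
    for _ in range(num_samples):
        # Output the slave's sawtooth at its current position.
        out.append(2.0 * slave_phase - 1.0)
        # Advance both counters by one sample.
        master_phase += master_hz / sample_rate
        slave_phase = (slave_phase + slave_hz / sample_rate) % 1.0
        if master_phase >= 1.0:
            # The master finished a cycle: force the slave to restart.
            master_phase -= 1.0
            slave_phase = 0.0
    return out


if __name__ == "__main__":
    # The slave runs 2.5 times faster than the master, but the output
    # still repeats once per master period (every 8 samples here).
    wave = hard_sync_saw(master_hz=1.0, slave_hz=2.5, sample_rate=8,
                         num_samples=16)
    print(wave[:8])
    print(wave[8:])
```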