Chapter 2
Tonal Balance and Equalization

Tonal balance describes the relative energy of frequencies present in an audio signal. Also known as spectral balance, it is largely responsible for the perceived timbre of an audio signal. From single instruments and voices, to stems and full mixes, tonal balance is paramount for audio engineers, whether they are recording engineers or loudspeaker designers. Overall tonal balance can make a mix sound full or thin, even or uneven, warm or harsh, dark or bright, clear or muddy. The tonal balance of individual instruments in a mix impacts the overall balance. In other words, tonal balance is key to great-sounding audio. Likewise, a flat frequency response from loudspeakers and headphones is an ideal that many design engineers work toward. Manufacturers of high-end loudspeakers, headphones, and car audio systems continue to tune their systems by ear to achieve an optimal tonal balance.

Our ability to hear and control tonal balance is crucial to becoming outstanding audio engineers. The good news is that we can develop these critical listening skills through focused listening and practice. There is no magic to the process; it takes consistent effort, but the work pays off. Ideally, we can hear if or how our equipment alters the tonal balance and spectral content of our audio. If we know what each device or plug-in adds or removes, we can make the most of our tools. The more familiar we are with our tools, the more we can draw on them for specific purposes in our projects. In loudspeaker and headphone development, the process can move much more quickly if the team conducting the listening tests can discuss tonal balance in terms of specific frequencies in hertz. In this chapter we will discuss methods for analyzing tonal balance by ear and learn how to relate timbre perception to controllable audio parameters.

2.1 Tonal Balance: Descriptors

How might we describe the tonal balance of a mix, a track, or a loudspeaker? It could be “bright” if high-frequency energy is prominent, or “boomy” if low frequencies dominate. The problem is that descriptors such as these are imprecise and often inconsistent from person to person. Furthermore, such descriptors are unrelated to effects processor parameters.

On consumer sound systems we control treble and bass with basic tone controls, but as audio engineers we require more selective control. Parametric equalizers are the natural choice for most of us, but graphic equalizers are also useful. Historically, telephone companies designed equalizers to correct uneven frequency responses in telephone transmission lines. Their goal was to make the frequency response flat, where all frequencies have equal energy, thus the term equalization.

Because equalizers are the primary tools for altering the spectral balance of sound, engineers often describe tonal balance using equalization parameters:

  • frequency in hertz
  • Q (or, inversely, bandwidth in hertz)
  • amount of boost or cut in decibels

2.2 The Power of EQ

A fully parametric equalizer is an exceptionally powerful tool. In fact, I believe that a parametric equalizer, with fully independent, sweepable controls, is an audio engineer’s most valuable device. An equalizer’s simplicity belies its power.

How do we fully realize the power of an EQ? There is no single correct way to use an equalizer. Each situation is unique and many factors influence our use of EQ: individual instruments/voices or sound sources, the mix, the recording environment, the musicians’ performances, and the microphones used. Only after we can hear where problems exist can we apply appropriate corrective measures with an EQ. That is why technical ear training is so important—so that we can quickly pinpoint problems and correct them.

If we want to use EQ to correct tonal balance problems, how do we identify those problems? Does it make sense to aim for a flat tonal balance in a recording or mix? Tonal balance is different from frequency response: the former characterizes a signal, and the latter refers to a device. Many manufacturers measure and publish the frequency responses of their devices. Unless a device’s frequency response is flat, it will impart its own “equalization curve” onto a signal in some constant way. A music signal’s spectral content, on the other hand, will vary according to its harmonies, melodies, transients, and unpitched sounds. Some test signals, such as white noise, have energy spread evenly across the spectrum. We might say a music signal’s tonal balance is flat if its frequency range is represented appropriately, but what does “appropriately” mean? Does it mean that a recording should sound identical to its acoustic version? Is that possible or even desirable?

Classical music recording engineers often try to recreate an ideal live concert experience in their recordings. In other genres, engineers create mixes irrespective of the live performance experience. A pair of audience microphones at a pop or rock concert captures the sound waves at that point in the room, but I would argue that such a recording is less engaging than one made with close microphones. Think about online concert videos recorded on a phone. The main drawback, besides possible clipping, is an overabundance of reflected and reverberant sound in the audience. The sound is washed out and unfocused due to the strength of reflected sound. Audience microphones alone do not give an “appropriate” recording of an amplified concert, even though the recording might be acoustically correct.

Professional engineers adjust equalization and spectral balance to best suit the situation they are working in. For instance, a jazz kick drum will likely have a different spectral balance than a heavy metal kick drum. Experienced recording engineers understand and can identify specific timbral differences between these two examples. Well-developed listening skills help us determine the equalization or spectral balance for a given recording situation. When approaching a recording project, we should be familiar with existing recordings of comparable music, film, or game audio and know each project’s timbral goals.

2.3 Human RTAs for Mixing?

Real-time spectral analyzers (RTAs) provide some indication of the frequency content and balance of an audio signal (see Figure 2.1). Although it is tempting to use an RTA to determine equalization settings for creative audio work, it is generally not as effective as using our ears. In contrast to subjective impressions, frequency spectrum measurements visualize and quantify audio signals. An RTA measures the energy in each frequency band, often using a mathematical operation such as the fast Fourier transform (FFT). An FFT provides a “frequency domain” representation of a signal, in contrast to a “time domain” representation. Digital audio workstations (DAWs) display tracks as time domain waveforms, whereas many DAW equalizer plug-ins provide real-time frequency domain representations (such as Figure 2.1). Real-time analyzers provide snapshots of frequency content at some predefined interval, such as every 1024, 2048, or 4096 samples. As an example, Figure 2.2 shows time domain and frequency domain representations of a 1 kHz sine tone.
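The relationship between the two representations can be sketched in a few lines of Python (NumPy assumed; the 48 kHz sample rate, Hann window, and 4096-sample frame are arbitrary illustrative choices, not fixed by the text):

```python
import numpy as np

fs = 48000   # assumed sample rate in Hz
f0 = 1000    # sine frequency in Hz
N = 4096     # analysis frame length in samples

# Time domain: one analysis frame of a 1 kHz sine tone
t = np.arange(N) / fs
x = np.sin(2 * np.pi * f0 * t)

# Frequency domain: magnitude spectrum of the windowed frame via the FFT
spectrum = np.abs(np.fft.rfft(x * np.hanning(N)))
freqs = np.fft.rfftfreq(N, d=1 / fs)

# The strongest bin lands within one bin width (fs/N, about 11.7 Hz) of 1 kHz
peak_freq = freqs[np.argmax(spectrum)]
```

An RTA effectively repeats this computation on each successive frame and redraws the display.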

We might say that experienced audio professionals are human spectral analyzers because they identify and characterize tonal balance to a high degree of accuracy and work to correct it. Engineers monitor the spectral balance of individual microphone signals as well as the overall spectral balance of multiple, combined microphone signals at each stage in a recording project.

Figure 2.1 The frequency content of a mix at one moment in time. The lowest peak sits at approximately 65 Hz, which corresponds to a pitch of C, with subsequent peaks at higher harmonics.

Audio engineers rarely rely on RTA power spectrum measurements for mix decisions; they use their ears. Rather than an RTA, they use alternate monitoring systems and familiar reference tracks to increase objectivity in the recording and mix process. RTA spectral measurements are ineffective for creative audio work for three reasons:

  1. Music signals fluctuate in frequency and amplitude, which makes frequency spectrum displays difficult to read and interpret with any degree of accuracy or subtlety.
  2. A snapshot of the frequency spectrum provides a static visual display that can be easily analyzed, but because it is a snapshot the time frame is too narrow to be useful.
  3. Taken to the opposite extreme, if we average the spectrum over several minutes, the fluctuations slow down but problems are obscured or averaged out.

Because of these visual display problems, RTAs are unreliable for EQ decisions. Aurally obvious EQ changes are largely indiscernible on a fluctuating spectral display. More subtle EQ changes are impossible to see. Besides that, we do not know what the spectral plot “should” look like because there is no objective reference for a recording we have just made.

Figure 2.2 The time domain (top) and frequency domain (bottom) representations of a 1 kHz sine tone. Note the x-axis labels on the two plots.

Matters are complicated further by the trade-off between time resolution and frequency resolution. If we update the display more often (increase time resolution), the frequency resolution decreases (we see less detail across frequencies). If we increase the frequency resolution, the analyzer smears transients because the time resolution is reduced. Thus, physical measures are largely inappropriate for EQ decisions, and we must rely on our ears.
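The trade-off follows directly from the FFT’s arithmetic: bin spacing is the sample rate divided by the frame length, so a shorter (faster-updating) frame necessarily gives coarser frequency detail. A small sketch, assuming a 48 kHz sample rate:

```python
fs = 48000  # assumed sample rate in Hz

# (bin spacing in Hz, frame duration in ms) for the common analysis sizes
resolution = {N: (fs / N, 1000 * N / fs) for N in (1024, 2048, 4096)}

for N, (bin_hz, frame_ms) in resolution.items():
    print(f"N={N}: {bin_hz:.2f} Hz per bin, {frame_ms:.1f} ms per frame")
```

A 1024-sample frame updates roughly every 21 ms but lumps everything within about 47 Hz into one bin, which is far too coarse to resolve, say, neighboring bass notes.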

Live sound engineers, on the other hand, who are tuning a sound system for a live music performance will often use real-time spectral analyzers. The difference is that they have a reference, which is often pink noise or a recording, and the analyzer compares the spectrum of the original audio signal (a known, objective reference) to the output of the loudspeakers. The goal in this situation is a bit different from what it is for recording and mixing, because a live sound engineer is adjusting the frequency response of a sound system so that the input reference and the system output spectral balances are as similar as possible.

2.4 Shaping Tonal Balance

In addition to the equalizer, we can use other methods to control spectral balance, such as microphone selection and placement. There are also indirect factors that influence our spectral balance choices, such as monitors (headphones and loudspeakers) and listening environment acoustics. In this section we discuss methods for altering the spectral balance directly and indirectly.

Indirect Factors Affecting Spectral Balance

Before we can even begin recording, mixing, mastering, or other production work, we need to be aware of the indirect factors that affect our work. These are easy to overlook because it is tempting to think we can solve all of our production issues with just the right plug-in. Because there is no direct connection between our recorded audio signals and the auditory processing center of the brain (at least not yet), we must remember that audio signals are altered in the transmission path between our recorders and our brains.

Three main factors influence our perception of the spectral balance of an audio signal in our studio control room:

  • studio monitors/loudspeakers and headphones
  • room acoustics
  • sound levels

I refer to these aspects as indirect because, although they can alter the tonal balance of our work by a significant amount, we are not adjusting their characteristics in a direct way like we do when we equalize a track, for example. We purchase a set of studio monitors or headphones and we work in a room that has certain acoustical characteristics that remain constant across a range of projects until we change acoustical treatments or purchase a new set of loudspeakers. Without accuracy in our monitoring system—loudspeakers and room acoustics combined—we cannot know whether our equalizer and other processing choices are valid or not.

Figure 2.3 illustrates the path of an audio signal from electrical to acoustical energy, highlighting three of the main modifiers of spectral balance that affect every choice we make in spectral balance processing.

Figure 2.3 The signal path showing the transmission of an audio signal as an electrical signal to a loudspeaker where it is converted to an acoustic signal, modified by a listening room, and finally received by the ear and processed by the auditory system. Each stage highlights factors that influence a signal’s spectral balance—both physical (loudspeaker and room) and perceptual (auditory system)—through the path.

Studio Monitors and Loudspeakers

Studio monitors and loudspeakers are like windows through which we perceive and make decisions about audio signals. Monitors do not have a direct effect on the spectral balance of a recording, but they do affect it indirectly because our equalization choices depend on what we hear from our monitors. Each type and model of monitor and loudspeaker offers a unique frequency response, and because we rely on monitors to judge spectral balance, monitor selection affects these judgments. If we monitor through loudspeakers with a weak low-frequency response, we may boost the low frequencies in the recorded audio signal.

A common way to add some objectivity to the mixing process is to audition on a second or possibly third set of speakers and headphones. Experienced engineers check their mixes on several different sets of loudspeakers and headphones to get a more accurate impression of the spectral balance. Each loudspeaker model is going to give a slightly different impression of a mix, and by listening to a variety of monitors we can find the best compromise. Even very inexpensive loudspeakers are useful as alternates to a main set of speakers, so that we might hear our work through speakers that the average listener might have, and therefore make mix decisions based on what we hear from them. Each monitoring system tells us something different about the sound quality and mix balance. One loudspeaker model may give the impression that the reverberation is too loud, whereas another may sound like there is not enough bass. We search for a compromise so that the final mix will sound optimal on many other systems as well. Engineers often say that a mix “translates” well if it remains relatively consistent across various types and sizes of loudspeakers. One mark of a well-made recording is that it will translate well on a wide range of sound reproduction systems, from mini-systems to large-scale loudspeaker systems.

Beyond the inherent frequency response of a loudspeaker, almost all active loudspeakers include built-in user-adjustable filters—such as high- and low-frequency shelving filters—that can compensate for such things as low-frequency buildup when monitors are placed close to a wall. So any decisions made about spectral balance will be influenced by the cumulative effect of a speaker’s inherent frequency response added to any filtering applied by the user.

Real-time analyzers and frequency response measurement tools can provide some indication of the frequency response of a loudspeaker within a room. One important point to keep in mind is that unless frequency response is being measured in an anechoic chamber, the measured response results from a combination of both the loudspeaker and the room resonances and reflections. Some loudspeaker models offer control of their frequency responses, ranging from limited controls (e.g., high- and low-frequency trims) to extensive parametric equalizers. Other than for small adjustments, it is not always helpful to use a parametric equalizer to fix loudspeaker-room frequency response problems. Equalizers cannot correct frequency response anomalies resulting from room modes and reflections. These acoustic conditions occur because two or more waves arrive at different times and either cancel or add. Frequency response notches created by room modes or comb filtering cannot be brought back to flat. There is little or no energy to boost at frequency notches because destructive interference has removed energy at those points. Equalization relies on having energy present at a given frequency so that it can amplify that frequency. If no signal exists, there is nothing to amplify.

The other reason equalization is ineffective in correcting room responses is that each point in a room will have a different frequency response. If we can correct for one location, another location nearby could be made worse. As we will discuss below, frequency resonances in a room are prominent in some locations and less so in others.

A further complication is that identical frequency response curves can have different time responses. So even if we correct for frequency response anomalies, there may still be time response differences. Let’s look at an example of this situation. Figure 2.4 shows the frequency response and impulse response of a digital audio workstation mixer channel at unity gain with no processing applied. The impulse response shows a perfect impulse, as short as possible and at maximum amplitude. The result is a flat frequency response, as we would expect from a perfect impulse.
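We can verify numerically that a perfect impulse has a flat spectrum, and that a pure delay (like the loopback delay in Figure 2.4) changes only phase, not magnitude. A NumPy sketch; the 37-sample delay below is an arbitrary stand-in, not the actual loopback time:

```python
import numpy as np

N = 512
impulse = np.zeros(N)
impulse[0] = 1.0   # a "perfect" impulse: a single full-scale sample

# Every frequency bin has exactly the same magnitude: a flat spectrum
flat = np.abs(np.fft.rfft(impulse))

# A pure delay (an arbitrary 37 samples here) changes only the phase,
# so the magnitude spectrum remains perfectly flat
delayed = np.zeros(N)
delayed[37] = 1.0
still_flat = np.abs(np.fft.rfft(delayed))
```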

Figure 2.4 A perfect impulse (top, time domain representation) contains all frequencies at equal level (bottom, frequency domain representation). The 30-ms delay of the impulse represents the digital output-input loopback time on the audio interface and software input-output buffer. We can still have a precisely flat frequency response even if the impulse is not perfect (see Figure 2.5).

To illustrate a change in time response that maintains a flat frequency response, we will consider an all-pass filter. Without getting into a technical discussion, we can simply say that an all-pass filter passes all frequencies equally but alters the phase differently for different frequencies. Sometimes called a phase rotator, an all-pass filter can be used to reduce the level of peak transients because it smears the energy over time. We may not hear the effect of a single all-pass filter on our audio. On the other hand, banks of all-pass filters are often used in parametric digital reverberation algorithms (in contrast to convolution reverberation) to simulate the natural decay of sound in a room, as we will discuss in Chapter 3. In reverberation algorithms, all-pass filter parameters are set to produce an obvious effect.

Figure 2.5 shows the impulse response of an all-pass filter. In this case it is the digital audio workstation REAPER’s ReaEQ all-pass filter. Note that the frequency response of the filter is completely flat, as with Figure 2.4, but the time response is different.

Figure 2.5 The impulse response (top, time domain representation) of an all-pass filter (in this case REAPER’s ReaEQ) and the resulting frequency response (bottom, frequency domain representation). Note the perfectly flat frequency response at 0.0 dB as in Figure 2.4 but a different impulse response.

Things get slightly more interesting when we mix a signal with an all-pass filtered version of itself. In REAPER, you can set the mix ratio of the all-pass filter to 50%. Figure 2.6 shows the result of a signal mixed with an all-passed version of itself. For this particular measurement I set the center frequency (Fc) to 1 kHz. As we can see from the plot, there is a deep notch at 1 kHz. The notch occurs because the all-pass filtered version has a gradually increasing phase shift that reaches a maximum of 180° at 1 kHz, and then a gradually decreasing phase shift as frequency rises above 1 kHz. When the original signal (0° phase shift at 1 kHz) mixes with the all-pass filtered version (180° phase shift at 1 kHz), the result is complete cancellation at that frequency. Further from the center frequency, less cancellation occurs because the phase shift approaches 0°.
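This notch behavior can be sketched with a second-order all-pass biquad. The coefficients follow the widely used Robert Bristow-Johnson “Audio EQ Cookbook” form; the 48 kHz sample rate and Q value are assumed, and the 50% mix is modeled as a simple average of the dry and filtered paths (an assumption about how REAPER implements its mix control):

```python
import numpy as np
from scipy.signal import freqz

fs, f0, q = 48000, 1000.0, 0.707   # assumed sample rate, Fc, and Q

# Second-order all-pass biquad (RBJ "Audio EQ Cookbook" form): its
# magnitude is flat, but its phase passes through -180 degrees at f0
w0 = 2 * np.pi * f0 / fs
alpha = np.sin(w0) / (2 * q)
b = np.array([1 - alpha, -2 * np.cos(w0), 1 + alpha])
a = np.array([1 + alpha, -2 * np.cos(w0), 1 - alpha])

eval_freqs = np.array([250.0, 1000.0, 4000.0])
_, h_ap = freqz(b, a, worN=eval_freqs, fs=fs)

# Modeling the 50% mix as an average of dry and all-passed paths:
# opposite phases at f0 cancel, leaving a notch there
h_mix = 0.5 * (1.0 + h_ap)
mix_mag = np.abs(h_mix)   # near 1 away from f0, near 0 at f0
```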

Figure 2.6 The frequency response of a signal mixed with an all-pass filtered version of itself. The result is a notch in the signal at the center frequency of the filter.

Another example of the all-pass filter effect comes from loudspeaker crossovers. A crossover is simply a low-pass filter for the woofer and a high-pass filter for the tweeter that combine acoustically to give a flat frequency response. Depending on the filter order and type, the phase at the crossover frequency could be shifted −90° for the low-pass filter and +90° for the high-pass filter. The resulting phase difference between the two filters at the crossover frequency is 180° (i.e., 90° − (−90°) = 180°). Another example of an all-pass filter is the phaser effect, which uses multiple all-pass filters with modulated center frequencies.
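The crossover arithmetic can be checked with a hypothetical second-order Butterworth pair (SciPy assumed; the 2 kHz crossover frequency is an arbitrary example value):

```python
import numpy as np
from scipy.signal import butter, freqz

fs, fc = 48000, 2000.0   # assumed sample rate and crossover frequency

# Second-order Butterworth low-pass (woofer) and high-pass (tweeter)
b_lp, a_lp = butter(2, fc, btype="low", fs=fs)
b_hp, a_hp = butter(2, fc, btype="high", fs=fs)

_, h_lp = freqz(b_lp, a_lp, worN=[fc], fs=fs)
_, h_hp = freqz(b_hp, a_hp, worN=[fc], fs=fs)

# Phase of each filter at the crossover frequency, in degrees
phase_lp = np.degrees(np.angle(h_lp[0]))   # about -90
phase_hp = np.degrees(np.angle(h_hp[0]))   # about +90
phase_diff = (phase_hp - phase_lp) % 360   # 90 - (-90) = 180
```

The 180° difference is why designs of this type typically invert one driver’s electrical polarity so the two outputs sum rather than cancel at the crossover frequency.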

Control Room and Listening Room Acoustics

Our control room dimensions, volume, and surface treatments also have a direct effect on the audio we hear. International professional societies and standards organizations, such as the International Telecommunication Union (ITU) headquartered in Switzerland, have published recommendations on listening room acoustics and characteristics. One of their recommendations, ITU-R BS.1116 (ITU-R, 1997), defines a number of physical and acoustical parameters that can be applied to listening rooms such that listening tests may be somewhat comparable from one room to another. Some may be inclined to think that an anechoic room, free of room modes and reflections, would be ideal for listening because the room will essentially be “invisible” acoustically. But a reflection-free room does not represent the acoustic conditions of a typical room.

Sound originating from loudspeakers propagates into a room, reflects off objects and walls, and combines with the sound propagating directly to the listener. Sound radiates mainly from the front of a loudspeaker, especially for high frequencies, but most loudspeakers become more omnidirectional at low frequencies. The primarily low-frequency sound that is radiated from the back and sides of a loudspeaker is reflected back into the listening position by any solid wall that may be nearby. As it turns out, walls with significant mass, such as concrete or multiple layers of drywall, reflect low frequencies better than walls made from a single layer of drywall.

Regardless of the environment in which we are listening to reproduced sound, we hear not only the loudspeakers but also the room. Loudspeakers and listening environments act as filters, altering the sound we hear. Room modes depend on a room’s dimensions and influence the spectral balance of what we hear from loudspeakers in a room. Room modes are mostly problematic in the low-frequency range, typically below around 300 Hz. Fundamental resonant frequencies that occur in one dimension (axial modes) have wavelengths that are two times the distance between parallel walls. Splaying or angling walls does not reduce room modes; instead the resonant frequencies are based on the average distance between opposing walls.
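The axial-mode arithmetic (fundamental wavelength equal to twice the wall spacing, so f_n = n · c / 2L) can be sketched for a hypothetical room; the dimensions below are invented for illustration:

```python
C = 343.0  # speed of sound in m/s at roughly 20 degrees C

def axial_modes(length_m, count=4):
    """First `count` axial-mode frequencies (Hz) for one room dimension:
    f_n = n * c / (2 * L), since the fundamental wavelength is 2 * L."""
    return [n * C / (2 * length_m) for n in range(1, count + 1)]

# A hypothetical 5.0 m x 4.0 m x 2.7 m control room:
modes = {dim: axial_modes(L) for dim, L in
         [("length", 5.0), ("width", 4.0), ("height", 2.7)]}
```

For this example room, every listed mode falls below 300 Hz, consistent with the observation that room modes are mostly a low-frequency problem.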

Because the amplitudes of room resonances vary according to location, it is important to walk around and listen at different locations within a room. The listening position may have a standing wave node at a particular frequency. A node is a point in a room where cancellation occurs for a given frequency. Different frequencies will have nodes in different locations. If we are not aware of nodes in our mixing room, we may boost a missing frequency with an equalizer, only to realize when listening somewhere else in the room that the boost is too much.

To get a new perspective on a mix, many engineers take a moment to listen from an adjacent room (with adjoining door open). We might hear balance problems that are not apparent when we listen directly in front of our monitors. We are probably not going to make subtle mix decisions listening at a distance, but it is useful to listen to levels and balances of vocals, lead instruments, and low end (bass) especially. It is also useful to identify if any balances seem to change when we listen indirectly.

Sound Levels and Tonal Balance

Our perception of tonal balance also depends on listening levels, that is, how loud we listen. In 1933 Fletcher and Munson published results of their study of human loudness perception across the audio frequency range at different listening levels. These equal-loudness curves show that the human hearing system has wide variations in frequency response, and the response changes with loudness. In general, we are less sensitive to frequencies at the high and low ends of the spectrum, but very sensitive to mid-spectrum frequencies (from about 1000 to 4000 Hz). This insensitivity to low and high frequencies changes when we turn up the sound level, and our ears become more sensitive to high and low frequencies relative to mid-frequencies.

What practical implication does this have for recording and mixing? If we monitor at a high sound level—such as 100 dB average sound pressure level (SPL)—and then turn the level down much lower—to 60 dB SPL, for example—bass frequencies will be much less prominent in the mix. Mix engineers often check their mixes at different monitoring levels to find the best tonal balance compromise. Furthermore, very small level differences, even just 1 dB, can make clear differences in the sound quality attributes of a mix. Since there is no standard listening level for music, we do not know how loud listeners will hear our recordings. Movie soundtracks for theatrical releases are mixed on systems calibrated to a specific sound playback level. For music, it is helpful to compare our mix to commercial recordings to judge tonal balances at different listening levels.

Equalization

We can use equalizers to reduce particularly strong resonant frequencies and to accentuate or highlight characteristics of an instrument or mix. We remove particularly strong resonances if they mask other frequency components and prevent us from hearing the truest sound of an instrument. There is a significant amount of art in the use of equalization, whether for tuning a loudspeaker system or shaping a mix, and we rely on what we hear to make decisions about the application of EQ. The precise choice of frequency, gain, and Q is critical to the successful use of equalization, and the ear is the final judge of the appropriateness of an equalizer setting.

There are different types of equalizers and filters, such as high-pass filters, low-pass filters, band-pass filters, graphic equalizers, and parametric equalizers, offering various levels of control over spectral balance. Filters remove a range or band of frequencies above or below a defined cutoff frequency. Equalizers, on the other hand, apply various amounts of boost or attenuation at selected frequencies. The next section briefly describes the most common types of filters and equalizers that we use in shaping the tonal balance of our work.

Filters: Low-Pass and High-Pass

High-pass and low-pass filters remove frequencies above or below a defined cutoff frequency. Usually the only adjustable parameter is the cutoff frequency, although some models do offer the ability to control the filter’s slope, that is, how quickly the output drops off beyond the cutoff frequency. Figures 2.7 and 2.8 show frequency response curves for low-pass and high-pass filters, respectively. In practice, high-pass filters are generally used more often than low-pass filters. We use high-pass filters to remove low-frequency rumble from a signal, while making sure the cutoff frequency is set below the music signal’s lowest frequency.
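A rumble-removal high-pass can be sketched with SciPy; the 40 Hz cutoff, fourth-order Butterworth slope, and test-signal frequencies are assumed choices for illustration, not universal recommendations:

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 48000      # assumed sample rate
cutoff = 40.0   # Hz, chosen below the lowest musical content we want to keep

# Fourth-order Butterworth high-pass (24 dB/octave slope, one common choice)
sos = butter(4, cutoff, btype="high", fs=fs, output="sos")

# Test signal: 10 Hz "rumble" plus a 100 Hz tone standing in for the music
t = np.arange(fs) / fs
rumble = np.sin(2 * np.pi * 10 * t)
tone = np.sin(2 * np.pi * 100 * t)
filtered = sosfilt(sos, rumble + tone)

# Steady-state RMS: the 10 Hz rumble is attenuated by roughly 48 dB,
# while the 100 Hz tone passes essentially untouched
rms_out = np.sqrt(np.mean(filtered[fs // 2:] ** 2))   # close to the tone alone
```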

Graphic Equalizers

Figure 2.7 The frequency response of a low-pass filter set to 1000 Hz at three different slopes.

Graphic equalizers control only the amount of boost or cut for a given set of frequencies, usually with vertical sliders. The frequencies available for adjustment are typically based on the International Organization for Standardization (ISO) center frequencies, such as the octave frequencies 31.5 Hz, 63 Hz, 125 Hz, 250 Hz, 500 Hz, 1000 Hz, 2000 Hz, 4000 Hz, 8000 Hz, and 16,000 Hz. Some graphic equalizers have more frequency bands, such as third-octave or twelfth-octave frequencies. A graphic equalizer’s designer usually predetermines the Q, such that the user cannot adjust it. Some models have proportional Q that widens for small boosts/cuts and narrows for large boosts/cuts. The graphic equalizer gets its name from the fact that the vertical sliders form the shape of the equalization curve from low frequencies on the left to high frequencies on the right.

Figure 2.8 The frequency response of a high-pass filter set to 1000 Hz at three different slopes.

Advantage:
  • Many possible frequencies or bands can be adjusted at once with a single unit or plug-in; for example, an octave-band EQ has 10 adjustable frequencies (31.5 to 16,000 Hz).
Disadvantages:
  • Only the preselected frequencies can be adjusted.
  • Q is usually not adjustable.
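The ten octave-band centers can be generated by doubling from the 1 kHz reference; a small sketch (note that the ISO nominal labels round 31.25 Hz to 31.5 Hz and 62.5 Hz to 63 Hz):

```python
# Octave-band center frequencies generated by doubling from 1 kHz.
# The nominal ISO labels round the lowest two values (31.5 Hz, 63 Hz).
centers = [1000 * 2.0 ** n for n in range(-5, 5)]
```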

Parametric Equalizers

Parametric equalizer is a term originally coined by Grammy Award–winning engineer, producer, and equipment designer George Massenburg in his 1972 Audio Engineering Society convention paper (Massenburg, 1972). A parametric equalizer allows completely independent, sweep-tunable control of three parameters per frequency band: center frequency (Fc), Q, and the amount of boost or cut at that frequency.

Q is inversely related to the bandwidth of the boost or cut and is defined specifically as:

Q = center frequency ÷ bandwidth

Bandwidth is simply the difference between two frequencies, a higher frequency minus a lower frequency: f2 − f1. As it turns out, there are different ways of determining what those frequencies are, and thus there are different definitions of Q. The classic definition takes the two frequencies, f1 and f2, as the points where the frequency response is 3 dB down from the maximum boost at the center frequency (Fc), or 3 dB up from the maximum cut. Figure 2.9 shows the frequency response of a parametric equalizer with a boost of 15 dB at a center frequency of 1000 Hz and a Q of 2, using the classic definition of bandwidth.
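The arithmetic can be checked against the numbers in Figure 2.9 with a couple of one-line helper functions (hypothetical names, written here just for illustration):

```python
def q_factor(fc, f1, f2):
    """Q = center frequency / bandwidth, with bandwidth = f2 - f1."""
    return fc / (f2 - f1)

def bandwidth(fc, q):
    """Bandwidth in Hz recovered from a center frequency and Q."""
    return fc / q

# Figure 2.9's example: Fc = 1000 Hz with band edges at 781 Hz and 1281 Hz
q = q_factor(1000.0, 781.0, 1281.0)   # bandwidth 500 Hz -> Q = 2.0
```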

Another definition of Q, proposed by digital signal processing engineer Robert Bristow-Johnson (1994), takes the two frequencies, f1 and f2, as the points where the frequency response reaches the gain midpoint—half the boost or cut in decibels—regardless of the gain level (Figure 2.10).

Given the same equalizer settings in Figures 2.9 and 2.10 (i.e., 1 kHz, +15 dB, Q = 2.0), we get different frequency responses from these two equalizers because the definition of bandwidth is different.

Figure 2.9 The frequency response of a parametric equalizer with a boost of 15 dB at Fc = 1000 Hz and Q = 2.0. In this case bandwidth uses the classic definition, 3 dB down from the peak (at 1000 Hz), so f1 = 781 Hz and f2 = 1281 Hz, giving a bandwidth of 500 Hz and a Q of 2.0 (= Fc/bw = 1000/500).

Figure 2.10 The frequency response of a parametric equalizer with a boost of 15 dB at Fc = 1000 Hz and Q = 2.0 (= Fc/bw = 1000/500). The bandwidth is 500 Hz, with f1 = 781 Hz and f2 = 1281 Hz, but because the band edges are defined at the midpoint gain, the curve is narrower than in Figure 2.9 even though the Q is the same.

Equalizers can also have symmetrical or asymmetrical boost and cut responses. Figure 2.11 shows a symmetrical boost and cut overlaid on one plot. If we add these two curves or pass the audio through the boost followed by the cut, there is no change in the spectral balance. In other words, we can undo a boost with a cut by the same amount at the same center frequency.
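We can demonstrate this reciprocal behavior with a standard peaking biquad design (the one from Robert Bristow-Johnson's Audio EQ Cookbook; other equalizer designs may not cancel exactly, as discussed with asymmetrical equalizers below). Cascading a +15 dB boost and a −15 dB cut at the same center frequency and Q yields a flat response at every frequency:

```python
import cmath
import math

def peaking_coeffs(fs, fc, q, gain_db):
    """Biquad peaking EQ coefficients (RBJ Audio EQ Cookbook design)."""
    a = 10 ** (gain_db / 40.0)
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * a, -2 * math.cos(w0), 1 - alpha * a)   # numerator
    aa = (1 + alpha / a, -2 * math.cos(w0), 1 - alpha / a)  # denominator
    return b, aa

def magnitude(coeffs, fs, f):
    """Magnitude response of a biquad at frequency f."""
    b, a = coeffs
    z = cmath.exp(-2j * math.pi * f / fs)
    num = b[0] + b[1] * z + b[2] * z * z
    den = a[0] + a[1] * z + a[2] * z * z
    return abs(num / den)

fs = 48000.0
boost = peaking_coeffs(fs, 1000.0, 2.0, +15.0)
cut = peaking_coeffs(fs, 1000.0, 2.0, -15.0)

# The cascade of a symmetrical boost and cut is flat at every frequency.
for f in (100.0, 500.0, 1000.0, 2000.0, 10000.0):
    print(f, magnitude(boost, fs, f) * magnitude(cut, fs, f))  # all = 1.0
```

In this design the cut's numerator equals the boost's denominator and vice versa, so the two filters are exact inverses; that algebraic symmetry is precisely what a symmetrical boost/cut equalizer provides.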

Figure 2.11 The frequency response of a symmetrical boost/cut equalizer showing a boost of 15 dB overlaid with a cut of −15 dB at Fc = 1 kHz. Bandwidth is defined at the point 3 dB below the maximum (for a boost) or 3 dB above the minimum (for a cut).

Figure 2.12 The frequency response of an asymmetric boost/cut equalizer showing a boost of 15 dB and a separate notch cut at Fc = 1000 Hz and Q = 2.0, applied separately but overlaid on the same plot. The bandwidth is 500 Hz, with f1 = 781 Hz and f2 = 1281 Hz. Bandwidth is defined at the point 3 dB down from the peak (for the boost) or 3 dB below 0 dB (for the notch).

Figure 2.12 shows an asymmetrical boost and cut overlaid on one plot. Notice that the bandwidth definition differs between the boost and the cut in an asymmetrical equalizer. Because the cut (or notch) is narrow and deep at the center frequency, we measure the bandwidth at the points 3 dB below 0 dB. On this type of equalizer, if we sum a boost and a cut at the same center frequency and Q, we do not get a flat frequency response. In other words, we cannot undo a boost with a cut, or vice versa, as we can with a symmetrical boost/cut equalizer. Figure 2.13 shows a summed asymmetric boost and cut.

Among equalizer designs, we find options and constraints in the numerous analog and digital models. Some equalizers—usually analog units or digital emulations of analog units—use stepped center frequency selection. With those units we can select only the frequencies chosen by the manufacturer, rather than having continuously variable or sweepable frequency selection. Some models do not allow independent control of Q. Many analog designs limit the range of frequencies we can control in each band, presumably because of the constraints of analog electronics. One example might be a three-band EQ with a low band of 20–800 Hz, a mid band of 120–8000 Hz, and a high band of 400–20,000 Hz.

In contrast, basic digital parametric equalizers (such as those bundled with digital audio workstations) offer sweepable and fully independent control of center frequency, Q, and gain. Any fully sweepable parametric equalizer with independent controls can produce the same frequency response curves as one that does not have independent Q and frequency selection, but not vice versa. Why not choose an equalizer with the most control? If we are trying to tame a resonance at 240 Hz, and our equalizer is restricted to center frequencies of 200 Hz and 280 Hz without independent control of Q, that equalizer is not going to do the job. Once we have identified a resonant frequency, we need to be able to cut precisely at that frequency to make the most effective use of our equalizer without taking energy away from other areas.

Figure 2.13 If we apply an asymmetric boost and cut on the same track—that is, if we sum the asymmetrical boost/cut equalizer curves for a given center frequency (Fc = 1 kHz in this plot)—the frequency response is far from flat. The practical consequence is that we cannot undo a boost with a cut of the same amount at the same frequency, because the boost and cut curves are not symmetrical.

Although frequency and cut/boost use standard units (Hz and dB, respectively), we find variations in the definition of bandwidth across models beyond the two mentioned above (classic and Bristow-Johnson). The drawback is that two EQs with identical settings might produce different frequency response curves: one EQ might have a wider measured bandwidth than another even though the Q parameter reads the same on both. Because equalizers use different definitions of bandwidth, we cannot necessarily transfer parameter settings from one plug-in to another and get the same sonic result. Part of the discrepancy is due to differences in bandwidth definition (−3 dB vs. mid-gain points), but some equalizers also change Q proportionally with gain, and any Q–gain proportionality is unknown unless we measure it. Furthermore, if we train ourselves to hear one bandwidth definition, we will expect to hear those characteristics from every equalizer. Fortunately, center frequency and gain have standard values that mean the same thing on every equalizer. In practice, it is best to tune Q by ear rather than rely solely on the displayed value.

Shelving Filters

Shelving filters, sometimes confused with low-pass and high-pass filters, adjust a range of frequencies by an equal amount. Whereas high- and low-pass filters remove a range of frequencies, shelving filters boost or attenuate a range of frequencies by varying degrees. High-shelving filters apply a given amount of boost or cut equally to all frequencies above the cutoff frequency, whereas low-shelving filters apply a given amount of boost or cut equally to all frequencies below the cutoff frequency. In recording studio equipment and plug-ins, shelving filters are often found as a switchable option on the lowest and highest bands of a parametric equalizer. Some equalizer models also offer high- and low-pass filters in addition to shelving filters. Treble and bass controls on consumer audio systems use shelving filters. Figure 2.14 compares the frequency response of an attenuating high-shelving filter to that of a low-pass filter.
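The difference is easy to verify numerically. As a rough illustration—a first-order analog prototype chosen for simplicity, not any particular hardware design—a high shelf set to −12 dB levels off at the shelf gain above the corner frequency, while a low-pass filter keeps falling by 6 dB per octave:

```python
import math

FC = 1000.0              # corner frequency in Hz (hypothetical)
G = 10 ** (-12 / 20)     # high-shelf gain of -12 dB as a linear factor

def high_shelf_mag(f):
    """First-order high shelf: unity at low frequencies, gain G well above FC."""
    return abs(complex(1, G * f / FC) / complex(1, f / FC))

def low_pass_mag(f):
    """First-order low-pass: unity at low frequencies, falls 6 dB/octave above FC."""
    return abs(1 / complex(1, f / FC))

for f in (100.0, 1000.0, 4000.0, 16000.0):
    shelf_db = 20 * math.log10(high_shelf_mag(f))
    lp_db = 20 * math.log10(low_pass_mag(f))
    print(f"{f:>8.0f} Hz  shelf {shelf_db:6.1f} dB   low-pass {lp_db:6.1f} dB")
```

At 16 kHz the shelf has settled near −12 dB, while the low-pass response has already dropped below −24 dB and continues downward; that plateau versus continued roll-off is the distinction Figure 2.14 illustrates.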

Microphone Choice and Placement

Figure 2.14 A comparison of a high-frequency shelf filter and low-pass filter.

Microphones provide another important way to alter the tonal balance of acoustic sounds, because each microphone make and model has unique on- and off-axis frequency responses. Microphones are like lenses and filters on a camera: they affect not only the overall frequency content but also the perspective and clarity of the sound they pick up. Some microphone models offer a flat on-axis (0°) frequency response, whereas other models have prominent peaks and dips in their response. Engineers often choose microphones to complement the sound sources they record. We might choose a microphone with a peak from 2–4 kHz for a vocal, or a microphone with a dip at 300 Hz for a kick drum. Classical music recording engineers often rely solely on microphone choice and placement for spectral balance and use no equalizers at all.

Although I said earlier that an equalizer is perhaps the most powerful tool in an engineer’s toolbox, equalizers cannot solve every problem. Some tonal balance and reverberation issues can only be solved through microphone choice and placement. If we optimize microphone placement for the perfect snare drum sound, the hi-hat sound leaking into the snare drum microphone may be harsh. If we place a microphone too far away from a source, there is no way to make it sound closer with an equalizer. Any equalization we apply to a microphone signal will affect all sounds, on- and off-axis, direct and indirect. We need to recognize sound quality issues such as these before we record and try to fix them with microphone placement. Technical ear training gives us the critical listening skills to make judgments about sound quality as quickly as possible during each phase of a project and take appropriate corrective action.

During recording sessions, we place microphones next to sources and listen to the resulting audio signals. We might also compare microphones to decide which ones have the most appropriate sonic characteristics for a given situation. How do we decide which ones are most appropriate? Generally, it is a good idea to consider the characteristics of a musician’s instrument or voice, the recording space, and the timbral goals for the final mix.

A microphone’s physical orientation and location affect an audio signal’s tonal balance due to:

  • the microphone’s off-axis frequency response
  • the microphone’s proximity effect
  • the sound source’s radiation patterns
  • the acoustic environment and any constructive and destructive interference of direct and reflected sound energy at the microphone location

Off-axis response is a critical but easily overlooked aspect of recording. Off-axis sound arriving at a microphone usually originates from:

  • indirect or reflected sound
  • direct sound from nearby instruments or voices

Microphones generally do not have the same frequency response at all angles of sound incidence. Even omnidirectional microphones, considered to have the best (flattest) off-axis response, have frequency responses that vary with the angle of incidence. Simply changing a microphone’s orientation can alter a sound source’s spectral balance. Small-diaphragm (1/4-inch diameter) condenser microphones come close to a perfect omnidirectional polar pattern because the capsule itself is small and less likely to interfere with sound waves arriving off-axis.

Microphone manufacturers rarely provide off-axis frequency responses in a Cartesian graph format (i.e., a graph of frequency response [x-axis] versus magnitude [y-axis]) like they do for microphone on-axis responses. Still, many manufacturers include polar plot measurements at different frequencies. It is difficult to compare the on-axis (Cartesian plot) and off-axis frequency responses (polar plots) to determine the frequency response at a given angle. Some manufacturers simply claim that their microphones produce minimal off-axis coloration, without providing any measurements. We assume that a microphone with minimal off-axis coloration has a relatively flat frequency response or at least a smooth (no prominent peaks or dips) frequency response for sounds arriving off-axis.

Let’s explore an example of off-axis response in a practical situation. We know that a snare drum spot microphone, even if it is not omnidirectional, will also pick up other elements of the drum kit, such as the hi-hat. We say that the hi-hat sound bleeds or spills into the snare microphone. If we aim the microphone at the drumhead, the captured hi-hat timbre depends largely on the microphone’s off-axis response. Drum set overhead and hi-hat spot microphones will also capture the hi-hat sound. If we use all of these microphone signals in a mix, the hi-hat sound coming from the snare drum microphone will blend with these other sources of hi-hat sound. Any significant coloration of the hi-hat from the snare microphone will affect the overall sound of the hi-hat. Furthermore, any processing applied to the snare drum sound will also be applied to the hi-hat spill in the snare microphone. Taken a step further, because these microphones are at different distances from the hi-hat, the hi-hat sound will arrive at slightly different times at the three microphones, which could result in comb filtering if the signals are panned to the same place.
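The comb-filtering risk is easy to estimate. In this hypothetical sketch (the distances are invented for illustration), the hi-hat is 0.3 m from its spot microphone and 0.8 m from the snare microphone; the extra 0.5 m of travel delays the second arrival by about 1.5 ms, and summing the two equal-level signals produces cancellations at regularly spaced frequencies:

```python
SPEED_OF_SOUND = 343.0  # m/s at room temperature

# Hypothetical distances from the hi-hat to two microphones.
d_hihat_mic = 0.3   # meters to the hi-hat spot mic
d_snare_mic = 0.8   # meters to the snare mic (picking up hi-hat spill)

delay = (d_snare_mic - d_hihat_mic) / SPEED_OF_SOUND  # seconds
print(f"delay: {delay * 1000:.2f} ms")

# Summing two equal copies offset by `delay` cancels at odd multiples
# of 1 / (2 * delay): the notches of a comb filter.
notches = [(2 * k + 1) / (2 * delay) for k in range(4)]
print("first notches (Hz):", [round(f) for f in notches])
```

With these numbers the first notch lands around 343 Hz, with further notches every 686 Hz above it, right through the midrange. In practice the spill is lower in level than the direct signal, so the notches are shallower than full cancellation, but the coloration is still audible.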

When setting up and listening to microphones, listen to the timbral quality of indirect and adjacent sounds. If you do not like what you hear, try moving or reorienting a microphone, or change the polar pattern if you can, and listen some more. It is usually difficult, if not impossible, to correct the timbre of indirect sounds at the mixing stage, so we save time and effort by getting the sounds we want directly from the microphones by adjusting their angles of orientation and location. Technical ear training plays a significant role in helping us determine timbre anomalies quickly and accurately so that we can take corrective measures early in the process.

Microphones also alter the tonal balance based on their proximity to a sound source. Microphones with directional polar patterns—cardioid, hypercardioid, and bidirectional—boost low frequencies when a sound source is close (1 m or less), a phenomenon known as the proximity effect or bass tip-up. In recording sessions, listen for changes in low-frequency response if a singer moves while singing. We can also use the proximity effect to our advantage, for instance to boost low frequencies when close-miking a kick drum. Some microphones, especially those marketed specifically for kick drum applications, include frequency response graphs for different source distances.

Microphone location in relation to a musical instrument, voice, or instrument amplifier can have a direct effect on the resulting spectral balance because of the sound radiation patterns of sound sources. Tonal balance varies across the horizontal and vertical planes around a sound source. For example, sound emanating directly out of a trumpet bell will contain more high-frequency energy than sound to the side of the trumpet (assuming no reflective surface is directly in front of the instrument). We can affect a recorded trumpet’s timbre simply by changing a microphone’s location relative to the instrument. A trumpet bell aimed slightly above or below a microphone will result in a slightly darker sound (that is, it will contain less high-frequency energy) than when the trumpet is aimed directly at a microphone. For more detailed technical information about musical instruments’ sound radiation patterns, Jürgen Meyer’s book Acoustics and the Performance of Music (5th ed., 2009) is an excellent reference.

Polar patterns also help us focus on one source while attenuating others. Cardioid microphones are probably the most commonly used directional microphones, but bidirectional (figure-8) microphones are also highly effective in many situations and generally underutilized. To take advantage of a bidirectional polar pattern, place other sound sources so they project into the microphone’s 90° off-axis null. When a microphone rejects sounds arriving from 90° off-axis, there is more focus on the sound source at 0° or 180° (where there is no attenuation). Strategic placement of bidirectional microphones and sound sources can attenuate indirect and adjacent sources and give a cleaner pickup, as in Figure 2.15. No matter how good our parametric EQ skills are, some problems, such as spill or leakage, cannot be fixed with EQ and must be corrected through microphone placement. Ear training is essential so that we can catch such problems in the moment and fix them.

Figure 2.15 If we record acoustic bass with a bidirectional mic in the same room as saxophone, placing the sax 90° off-axis can help reduce the saxophone spill in the bass microphone.

Microphones are powerful timbral shaping devices that we can use to our advantage when we listen closely to them and take their characteristics into account. They are the first stage of timbral shaping in the recording process, but they do not automatically give us ideal sound no matter where we put them. We still need to listen closely with our ears at various locations around a sound source to help guide microphone placement. Listening around a sound source to find a good microphone position is most effective with one ear facing the sound source and the other ear blocked with your hand. Once we decide on microphone placement based on listening acoustically, we can position the microphone and fine-tune its location based on the sound we hear from it.

2.5 Getting Started with Practice

All of the “Technical Ear Trainer” software modules are available on the companion website: www.routledge.com/cw/corey.

It is critical for us as audio professionals to have a keen sense of tonal balance for individual tracks as well as full mixes. Precise use of EQ can help tracks blend and reduce masking of one sound by another. Reading and thinking about what we are hearing is important for developing critical listening skills, but it is just as important to listen. We need technical ear training to become expert critical listeners.

By using the technical ear training software practice and test module “Technical Ear Trainer—Parametric Equalization,” you can increase your accuracy and speed of recognition of equalization. The software allows you to practice identifying by ear randomly chosen EQ settings. Figure 2.16 shows a screenshot of the user interface, and I will describe the functionality of the software below.

The key to practicing with any of the software modules is to keep practice sessions short but regular—daily, or several times a week. In the early stages, 10- to 15-minute sessions are probably best to avoid fatigue. Because highly focused listening requires intense energy, practicing for more than an hour at a time typically becomes counterproductive and frustrating. Eventually, as you get used to this focused type of listening, you may want to increase the session length, but you may find that 45 to 60 minutes is the upper useful limit for a given session. Regular practice for shorter periods several times a week is much more productive than extended but infrequent sessions. This could turn into a significant time commitment, but even 5 minutes a day is likely more effective than cramming in a 2-hour session once a month.

Figure 2.16 A screenshot of the software user interface for the Technical Ear Trainer parametric equalization practice module.

Practice Types

Starting at the top left corner of the window just below the header, there is an option to select one of the four practice types: Matching, Matching Memory, Return to Flat, and Absolute Identification:

  • Matching. Working in Matching mode, the goal is to duplicate the equalization that has been applied by the software. This mode allows free switching between the “Question” and “Your Response” to determine if the chosen equalization matches the unknown equalization applied by the computer.
  • Matching Memory. This mode is similar to the Matching mode, with one main difference—once gain or frequency is changed, the “Question” is no longer available for auditioning. “Question” and “Bypass” are available to be auditioned freely before any changes to the equalizer are made. Matching Memory mode helps us match sounds by memory and can be considered moderately to very difficult depending on the other practice parameters that are chosen, such as number of bands, time limit, and frequency resolution.
  • Return to Flat. In this mode the goal is to reverse or cancel the randomly chosen equalization applied by the computer by selecting the correct frequency and applying equal but opposite gain. It is similar in difficulty to Matching but requires thinking in the opposite direction, since the goal is to remove the equalization and return the sound to its original spectral balance. For instance, if you hear a boost of 12 dB at 2000 Hz, the correct response is to apply a cut of −12 dB at 2000 Hz, returning the audio signal to its original state so that it sounds identical to the “Flat” option. Because the equalization used is reciprocal peak/dip, it is possible to completely cancel any frequency boost or cut by applying an equal but opposite cut or boost. Note that if you try these exercises outside the included software practice modules, not all parametric equalizers are reciprocal peak/dip, so not all will cancel a boost with an equal but opposite cut. This is not a deficiency but simply a difference in design.
  • Absolute Identification. This practice mode is the most difficult, and the goal is to identify the applied equalization without having the opportunity to listen to what is chosen as the correct response. Only “Bypass” (no equalization) and “Question” (the computer’s randomly chosen equalization) can be auditioned.

Frequency Resolution

You can choose from two frequency resolutions:

  • 1 Octave—the easier of the two options, with 9 possible frequencies
  • 1/3rd Octave—the more difficult, with 25 possible frequencies

The frequencies correspond to the International Organization for Standardization (ISO) frequencies common on commercially available graphic equalizers, as listed in Table 2.1. The software randomly chooses from among these frequencies to apply equalization to the audio signal. Exercises using third-octave frequency resolution are more difficult than those with octave frequencies because there are more frequencies to choose from and they are closer together. The list of third-octave frequencies includes the octave frequencies plus two frequencies between each pair of octave frequencies.

Table 2.1 The complete list of frequencies (in Hz), with octave frequencies marked by an asterisk.

63*    80     100    125*   160    200    250*   315    400
500*   630    800    1000*  1250   1600   2000*  2500   3150
4000*  5000   6300   8000*  10,000  12,500  16,000*

I recommend working with octave frequencies until you excel at correctly identifying all nine octave frequencies. In this initial stage, we try to develop a memory for the sound of each octave frequency. It takes time to develop this frequency memory, so do not expect it to happen overnight. Practice is essential to progress, and you will notice that regular practice does pay off.

Once you become confident with octave frequencies with a range of sound files, try some exercises with third-octave frequencies. By this point you should have developed a memory for each octave frequency’s sound, so that they are like “anchors” in the spectrum around which you can identify third-octave frequencies. One key strategy for identifying third-octave frequencies is to first identify the closest octave frequency. Then ask yourself if the frequency is an actual octave frequency, or if it is above it or below it.

Here are two specific octave frequencies (2000 Hz and 1000 Hz) with their neighboring third-octave frequencies:

  • 2500 Hz—upper neighbor
  • 2000 Hz—octave frequency anchor
  • 1600 Hz—lower neighbor

  • 1250 Hz—upper neighbor
  • 1000 Hz—octave frequency anchor
  • 800 Hz—lower neighbor

Number of Bands

You can choose the number of simultaneous frequencies that are affected in a given question: from one to three frequency bands. I recommend working with one frequency band until you start to get comfortable with octave frequencies. When working with more than one band at a time, it is more difficult to know what frequencies have been altered.

The best strategy with more than one band is to identify the most obvious frequency first and then compare your response to the equalizer question. If the frequency you choose matches one of the question frequencies, that particular frequency will become less noticeable when switching between the question and your response. The remaining frequency will be easier to identify because it is the only frequency that is different between the question and your response. The software can accept the frequencies in any order. When working with fewer than three frequency bands, only the left-most equalizer faders are active.

Gain Combination

The gain combination option refers to the possible gains (boost or cut) that can be applied to a given frequency. For each question, the software randomly chooses a boost or cut (if there is more than one possible gain) from the selected gain combination and applies it to a randomly chosen frequency. When there is only one possible gain, the gain jumps automatically to that value as soon as a frequency is chosen.

As you would expect, larger changes in gain (12 dB) are easier to hear than smaller changes in gain (3 dB). Boosts are typically easier to identify than cuts, and I recommend that you work with boosts until you become proficient with them. It is generally difficult to identify something that has been removed or reduced, but if you switch from a cut to flat, the frequency in question reappears almost as if it has been boosted above normal.

When we work with a gain combination of a boost and a cut—such as +/− 6 dB—we can sometimes confuse a low cut with a high boost and vice versa. The human auditory system is sensitive to relative changes in frequency response, which can make a cut in the low-frequency range sound like a boost in the high-frequency range.

Q

The Q is a static parameter for any exercise. The default setting of Q = 2 is the best starting point for all exercises. Higher Q values (narrower bandwidths) are more difficult to identify.

Frequency Range

We can limit the range of possible frequencies from the full range of 63 Hz to 16,000 Hz down to a range as small as three octaves. I encourage you to limit the frequency range in the beginning stages to only three frequencies in the midrange, such as 500 to 2000 Hz. Once you master those frequencies, you can expand the range one octave at a time.

After working up to the full range of frequencies, you may find some frequencies that still give you trouble. For instance, low frequencies (in the 63 Hz to 250 Hz range) are often more difficult to identify correctly when practicing with music recordings, especially with third-octave frequencies. This low-frequency range can pose problems because of a number of possible conditions. First, music recordings do not always contain consistent levels across the low-frequency range. Second, the sound reproduction system you use may not be capable of producing very low frequencies. Third, if your sound system reproduces low frequencies accurately, room modes (resonant frequencies within a room) may interfere with what you hear. Using headphones can eliminate room mode problems, but headphones and loudspeakers may not have a flat frequency response or may be weak in their low-frequency response. For recommendations on headphones and loudspeakers, see Section 1.4.

Sound Source

You can practice with either pink noise, which is generated internally by the software, or with any two-channel sound file in WAV format at a 44,100-Hz or 48,000-Hz sampling rate. Averaged over time, pink noise has equal power per octave, so its spectrum appears flat when analyzed in octave or third-octave bands. It also sounds balanced from low to high frequencies because the auditory system is sensitive to octave relationships between frequencies, which are logarithmic rather than linear differences. For example, the lowest octave of our hearing range is 20 to 40 Hz, a difference of only 20 Hz. The highest octave of ideal human hearing is 10,000 to 20,000 Hz, a difference of 10,000 Hz. The auditory system perceives both of these ranges as the same musical interval: one octave. In pink noise, both of these octave ranges—20 to 40 Hz and 10,000 to 20,000 Hz—contain the same power. In contrast, we might say that white noise has 20 units of power in the lowest octave and 10,000 units of power in the highest octave. That is why white noise sounds so much harsher and brighter than pink noise. Because pink noise has equal power per octave, we hear a change at one frequency as easily as a change at any other.
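A quick numerical sketch of the idea (using idealized power spectra rather than actual noise generation): with a pink, 1/f power spectrum, every octave band holds the same total power, whereas a white, flat spectrum doubles its power with each higher octave:

```python
import numpy as np

fs, n = 48000, 1 << 16
f = np.fft.rfftfreq(n, d=1.0 / fs)[1:]   # analysis frequencies, DC excluded

pink_psd = 1.0 / f            # pink noise: power density proportional to 1/f
white_psd = np.ones_like(f)   # white noise: equal power per hertz

def octave_band_power(psd, lo):
    """Total power in the octave band [lo, 2*lo)."""
    band = (f >= lo) & (f < 2 * lo)
    return psd[band].sum()

for lo in (125, 250, 500, 1000, 2000, 4000, 8000):
    print(f"{lo:>5}-{2 * lo:<5} Hz  pink {octave_band_power(pink_psd, lo):7.2f}"
          f"  white {octave_band_power(white_psd, lo):9.0f}")
```

The pink column is essentially constant from band to band, while the white column doubles each octave, which is why white noise sounds so strongly tilted toward the highs.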

In the sound source selection there is also an option to listen to the sound source in mono or stereo. If a loaded sound file contains only one channel of audio (as opposed to two), the audio signal is sent to the left output only. Pressing the mono button feeds the audio to both the left and right output channels.

It is best to start with pink noise when beginning any new exercises and subsequently practice with recordings of various instrumentation and genres. The greater the variety of sound recordings you use, the more you will be able to transfer critical listening skills to other situations.

Equalizer Selection

In the practice software, an audio signal (pink noise or audio file signal) is routed to three places:

  • straight through with no equalization (bypassed)
  • through the “Question” equalizer chosen by the computer
  • through the user equalizer (“Your Response”)

You can select which of these options to audition. The Bypass selection allows you to audition the original audio signal without any equalization applied. The selection labeled “Question” allows you to audition the equalization that has been randomly chosen by the software and applied to the audio signal. The selection labeled “Your Response” is the equalization applied by the user, according to the parameters shown in the user interface. See Figure 2.17, which shows a block diagram of the practice module.

Sound File Control

The Sound File Control section of the interface includes a waveform display of the audio signal. You can select an excerpt of the audio file by clicking and dragging on the waveform. Playback automatically repeats once it reaches the end of the file or the end of the selected section. Clicking once in the waveform selects from the click location to the end of the file.

Figure 2.17 A block diagram of the signal path for the Technical Ear Trainer practice module for parametric equalization.

Time Limit

In the recording studio or live sound venue, time is of the essence. As audio engineers, we must often make quick yet accurate decisions about sound quality and audio signal processing. To help prepare for these real-world situations, you can apply a time limit in the practice module so that you can practice equalization parameter identification with speed as well as accuracy.

Sometimes a time limit is useful in that it forces us to respond with our first impression rather than spend too much time thinking and rethinking. Novice recording engineers who have spent time with the practice module often report that overthinking a question results in mistakes and that their first impressions are often the most accurate.

Keyboard Shortcuts

The keyboard shortcuts included in the software are ideal for quickly indicating responses when using the timer. The tab key cycles through frequency bands in exercises with more than one band. The up/down arrows step through the octave frequencies. Alternatively, the number keys correspond to octave frequencies (0 = 20 Hz, 1 = 63 Hz, 2 = 125 Hz, 3 = 250 Hz, 4 = 500 Hz, 5 = 1000 Hz, 6 = 2000 Hz, 7 = 4000 Hz, 8 = 8000 Hz, and 9 = 16,000 Hz) and can be used to jump immediately to an octave frequency. The left/right arrows adjust the gain of the selected band in 3-dB increments. For exercises with only one gain option (e.g., +12 dB), the gain is set automatically when the frequency slider is moved from 20 Hz to any other frequency; returning the slider to 20 Hz resets the gain to 0 dB. For exercises with more than one gain option (e.g., +/− 12 dB), the gain stays at 0 dB until you adjust it; it does not change automatically when the frequency is changed. The keyboard shortcuts are listed below.

  • [space bar] toggles the Equalizer Selection depending on the Practice Type:
    ○ Matching: toggles between Question and Your Response
    ○ Matching Memory: toggles between Question and Your Response until a parameter is changed, at which point it toggles between Bypass and Your Response
    ○ Return to Flat: toggles between Your Response and Bypass
    ○ Absolute Identification: toggles between Question and Bypass

  • [enter] or [return] checks answer and moves to next question
  • [q] listen to Bypass
  • [w] listen to Question
  • [e] listen to Your Response
  • Numbers 1 to 9 correspond to octave frequencies of a selected band (e.g., 1 = 63 Hz, 2 = 125 Hz, 3 = 250 Hz, 4 = 500 Hz, 5 = 1000 Hz, 6 = 2000 Hz, 7 = 4000 Hz, 8 = 8000 Hz, 9 = 16,000 Hz)
  • Up/down arrows change the frequency of the selected band
  • Left/right arrows change the gain of the selected band
  • [tab] selects the frequency band to modify, if there is more than one band
  • [esc] turns audio off

2.6 Working with the EQ Practice Module

When you first open up the EQ practice module, select pink noise in the Monitor Selection, turn the audio on, and adjust the output level to a comfortable listening level. Make sure the Equalizer Selection is set to Your Response, and scroll through each octave frequency to get a feel for the sound of each frequency. Once you change the frequency, the gain will automatically jump to 12 dB; this is the default gain combination setting when you open the software module. Switch between Bypass (no equalization) and Your Response to compare the change in timbre that is created by a boost at each frequency. Spend some time initially just listening to various frequencies, alternating between flat and equalized. After you familiarize yourself with octave frequencies in pink noise, load in a sound file and do the same thing again with a music or speech recording.
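The boosts you are auditioning come from a parametric (peaking) equalizer. As a rough sketch of how such a boost can be implemented, here is the widely used "Audio EQ Cookbook" peaking filter by Robert Bristow-Johnson; the practice software's actual filter design is not specified, so treat this as one common realization rather than the module's own.

```python
import math
import cmath

def peaking_eq_coeffs(f0, gain_db, q, fs=48000.0):
    """RBJ 'Audio EQ Cookbook' peaking-filter coefficients
    (b0, b1, b2, a1, a2), normalized so that a0 = 1."""
    a = 10 ** (gain_db / 40.0)           # sqrt of the linear center gain
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b0 = 1 + alpha * a
    b1 = -2 * math.cos(w0)
    b2 = 1 - alpha * a
    a0 = 1 + alpha / a
    a1 = -2 * math.cos(w0)
    a2 = 1 - alpha / a
    return (b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0)

def gain_at(f, coeffs, fs=48000.0):
    """Magnitude response in dB at frequency f (zi plays the role of z^-1)."""
    b0, b1, b2, a1, a2 = coeffs
    zi = cmath.exp(-1j * 2 * math.pi * f / fs)
    h = (b0 + b1 * zi + b2 * zi * zi) / (1 + a1 * zi + a2 * zi * zi)
    return 20 * math.log10(abs(h))

c = peaking_eq_coeffs(1000.0, 12.0, 2.0)   # +12 dB boost at 1 kHz, Q = 2
print(round(gain_at(1000.0, c), 2))        # ~12.0 dB at the center frequency
print(round(gain_at(20000.0, c), 2))       # near 0 dB far from the boost
```

The filter leaves frequencies far from the center essentially untouched, which is why a +12-dB boost at one octave frequency changes the timbre without altering the rest of the spectrum.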

When you audition a sound file, make note of what instruments or components of instrument sounds are affected by each particular octave frequency. For instance, a boost at 125 Hz may bring out the low harmonics in a snare drum or bass. On the upper end of the spectrum, 8 kHz may bring out crisp cymbal harmonics. If you audition a Baroque ensemble recording, you may find that a boost at 16 kHz makes a harpsichord more prominent.

Boosts at specific frequencies can sometimes bring out individual instruments in a mix, a phenomenon that skilled mastering engineers use to subtly rebalance a mix. Each recording responds slightly differently to a given equalizer setting, even with comparable instrumentation, because the frequency content and spectral balance of each individual instrument differ from one recording to the next. This is one reason we must be attentive to what each individual recording requires, rather than simply relying on what has worked in previous recordings. Just because a cut at 250 Hz worked on one snare drum in one recording does not mean it will work on all snare drum recordings.

During recording and mixing, we may also second-guess our processing decisions based on what seems correct from a numerical point of view. Suppose we apply a cut of 20 dB at 300 Hz on an individual instrument. We may be tempted to think that 20 dB is too much, based on some notion of what seems reasonable: "I have never had to do this before and it seems like an extreme setting, so how can it be right?" In doing so we rely on logic rather than on what our ears tell us, and what we think is appropriate does not always coincide with what clearly sounds most appropriate. In the end, it does not matter how extreme a signal processing or mixing decision may appear as long as the sonic result suits the artistic vision we have for a project. As engineers, we directly shape the artistic impression of recorded music through choices such as balance and mix levels, timbre, dynamics, and spatial processing. Judgments about what is appropriate and suitable should be made by ear, not by the apparent reasonableness of the parameter values.

Vowel Sounds

A number of researchers have noted that associating specific vowel sounds with octave frequencies can help listeners identify frequencies because of the formant frequencies present in each vowel sound (Letowski, 1985; Miskiewicz, 1992; Opolko & Woszczyk, 1982; Quesnel, 2001; Quesnel & Woszczyk, 1994; Slawson, 1968). The following English vowel sounds roughly correspond to octave frequencies:

  • 250 Hz—[u] as in boot
  • 500 Hz—[o] as in tow
  • 1000 Hz—[a] as in father
  • 2000 Hz—[e] as in bet
  • 4000 Hz—[i] as in beet

Matching frequency resonances to specific vowel sounds can help with learning and memory of these particular frequencies. Instead of trying to think of a frequency number, you may find it useful to match tonal balance with a vowel sound. The vowel sound can then be linked to a specific octave frequency.

2.7 Recommended Recordings for Practice

Here are some commercially available recordings of various genres that are suitable for use as sound sources in the EQ software practice module. They represent examples of high-quality recordings that have good spectral balance across a wide frequency range. There are many other recordings that are appropriate, of course. Explore other recordings and see what works for you.

Compact disc–quality versions should be used (i.e., digital linear pulse-code modulation [PCM] 44.1 kHz, 16-bit AIFF or WAV) for all exercises, rather than data-reduced versions. Perceptually encoded versions (such as MP3, Windows Media Audio, Advanced Audio Coding, or Ogg Vorbis) should never be used for EQ exercises, even if they have been converted back to PCM. Once an audio file has been perceptually encoded, its quality has been degraded and cannot be recovered by converting back to linear PCM.

  • Anderson, Arild. (2004). “Straight” from The Triangle. ECM Records. (jazz)
  • Blanchard, Terence. (2001). “On the Sunny Side of the Street” from Let’s Get Lost. Sony. (jazz)
  • Brecker, Michael. (2007). “The Mean Time” from Pilgrimage. Heads Up International/Telarc/WA Records. (jazz)
  • Chapman, Tracy. (1988). “Fast Car” from Fast Car. Elektra. (pop/rock/folk)
  • Daft Punk. (2013). Random Access Memories. Columbia Records. (pop)
  • Earth, Wind & Fire. (1998). “September” from Greatest Hits. Sony. (pop)
  • The European Community Baroque Orchestra. (1991). “Concerto II—Presto” from 6 Concerti Grossi by Pieter Hellendaal. Channel Classics. (classical)
  • Florilegium & Pieter Wispelwey. (2006). Haydn: Cello Concertos Nos. 1 & 2, Symphony No. 104. Channel Classics. (classical)
  • Le Concert des Nations. (2002). “Marche pour la cérémonie” from soundtrack from the film Tous les matins du monde. Alia Vox Spain. (classical)
  • Massive Attack. (1998). “Teardrop” from Mezzanine. Virgin. (electronic)
  • McLachlan, Sarah (2003). “Time” from Afterglow. Arista Records. (pop)
  • The Police. (1983). “Every Breath You Take” from Synchronicity. A&M Records. (rock) Note: For my own EQ listening practice, I created an edited version of this track that includes just the instrumental introduction.
  • Raitt, Bonnie. (2003). “I Can’t Make You Love Me” from The Best of Bonnie Raitt. Capitol Records. (pop)
  • Randall, Jon. (2005). Walking Among the Living. Epic/Sony BMG Music Entertainment. (country/roots)
  • Steely Dan. (1977/1999). Aja [Original Recording Remastered]. MCA. (pop/rock)
  • Steely Dan. (2000). “Gaslighting Abbie” from Two Against Nature. Giant Records. (pop/rock)

A few artists are also making multitrack stems available for purchase or free download. Author and recording engineer Mike Senior hosts a site with a growing number of free multitrack downloads: www.cambridge-mt.com/ms-mtk.htm.

Apple’s GarageBand and Logic Pro also offer recordings of solo instruments that can be useful with the EQ practice and test software.

2.8 Recommended Sequence of Practice and Test

Although the EQ module allows you to select any exercise parameter combination for practice and testing, here is a possible progression from easy to more difficult:

Octave Frequencies

  1. Practice Type: Matching
    • Monitor Selection: Pink Noise
    • Frequency Resolution: Octave
    • Number of Bands: 1
    • Gain Combination: +12 dB
    • Q = 2
    • Frequency Range:
      1. 500 to 2000 Hz
      2. 63 to 250 Hz
      3. 4000 to 16,000 Hz
      4. 250 to 4000 Hz
      5. 125 to 8000 Hz
      6. 63 to 16,000 Hz

  2. Same as above except:
    • Monitor Selection: a variety of sound recordings of your choice

  3. Same as above except:
    • Practice Type: Absolute Identification
    • Monitor Selection: Pink Noise and a variety of sound recordings of your choice
    • Frequency Range: 63 to 16,000 Hz

  4. Same as above except:
    • Number of Bands: 2

  5. Same as above except:
    • Number of Bands: 3

  6. Same as above except:
    • Gain Combination: +12/− 12 dB

  7. Same as above except:
    • Gain Combination: +9 dB

  8. Same as above except:
    • Gain Combination: +9/− 9 dB

  9. Same as above except:
    • Gain Combination: +6 dB

  10. Same as above except:
    • Gain Combination: +6/− 6 dB

  11. Same as above except:
    • Gain Combination: +3 dB

  12. Same as above except:
    • Gain Combination: +3/− 3 dB

Third-Octave Frequencies

Progress through the sequence above but instead of working with octave frequencies, select “1/3rd Octave” from the Frequency Resolution drop-down menu.
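Third-octave centers follow the same logarithmic spacing as octaves, with three bands per doubling of frequency. The snippet below computes the exact base-2 centers around the 1-kHz reference; the module's own band list is not specified here and may use the rounded nominal labels (20, 25, 31.5, 40 Hz, and so on) instead of these exact values.

```python
# Exact base-2 third-octave center frequencies around the 1 kHz reference.
# The familiar labels (20, 25, 31.5, ... 16,000, 20,000 Hz) are rounded
# "nominal" values of these exact frequencies.
centers = [1000.0 * 2 ** (n / 3) for n in range(-17, 14)]

print(len(centers))          # 31 bands covering roughly 20 Hz to 20 kHz
print(round(centers[0], 1))  # ~19.7, labeled "20 Hz"
print(centers[17])           # 1000.0, the reference band
```

Three times as many bands in the same range is what makes third-octave identification noticeably harder than octave identification.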

Time Limit

To increase difficulty further, you may wish to use the timer to focus on speed.

Summary

Equalization is perhaps our most important tool as audio engineers. It is possible to learn how to identify boosts and cuts by ear through practice. The available software practice module can serve as an effective tool for progress in technical ear training and critical listening when used for regular and consistent practice.
