Words on Sights and Sounds – Relative balance

On the relative balance of sights vs. sounds on this blog


When I first started this blog in November 2011, I intuitively knew I’d be posting more images than audio files. I initially figured I’d sort it out as I went along, but for some reason quickly settled into a 5-day rotation: 4 images followed by 1 audio recording. I had quite a few of each already produced and edited, and knew that I’d be creating more for the project. I had no idea at the time if this was the right balance, but I just went with it.

As things progressed I quickly realized something I hadn’t really anticipated, despite having been actively involved in field recording since the late 1980s: it was proving much harder to get good (meaning technically good as well as aesthetically interesting) location audio recordings than good location photos. It was relatively easy to find and select interesting visual shapes, contours, textures, rhythms, etc., and to compose them into satisfying images. Sound recordings, on the other hand, were much harder: finding and selecting engaging aural shapes, contours, textures, and rhythms was hard enough; foregrounding them against other (even unwanted) sounds was harder still. After a few more months of work, I began to identify why this was, and a question following a talk I gave in Santa Fe, New Mexico in July 2012 prompted me to attempt an answer. As I experience it, the problem comes down to focus and isolation, and to three primary factors: the nature of sound vs. the nature of light; the current state of the soundscape vs. landscape; and the technology for phonography vs. that for photography.

The nature of sound and light

Though both deal with wave-like phenomena, sound and light each have their own peculiarities. Apropos the current topic, the most salient, in my experience, are issues of propagation and diffusion; in particular diffraction and refraction. Diffraction is the ‘bending’ (change of direction of travel) of a wave around an obstacle it encounters in its path (or through an opening in such an obstacle). Refraction is a similar bending due to changes in density (and impedance), and therefore velocity, as the wave passes from one medium to another or through regions of the same medium with differing densities (e.g. due to temperature differentials). How does this impact what you see and hear? Simply put, sound bends around things much more than light. Sound spreads out in ways that light does not – or, to be more precise, to degrees that light doesn’t. Sound is much harder to contain, in this sense. It spills, leaks, and spreads around. Atmospheric conditions on certain days can cause sound to travel much further than normal: it bends back (refracts) towards the earth’s surface as it travels, so more of the acoustic energy is redirected to the ground rather than diffusing into the sky, and the sound is audible over much greater distances. Sounds travel around corners (diffracting) with ease. These effects are frequency-related, diffraction especially so, and lower frequencies are more deeply affected (no pun intended): the longer the wavelength relative to an obstacle, the more readily the sound bends around it.
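To put rough numbers on why sound bends around everyday objects and light does not, here is a small Python sketch comparing wavelengths to the size of an obstacle. The constants are my own round-figure assumptions (sound at about 343 m/s in air at 20 °C, green light at about 550 nm, an obstacle one metre wide), and the function names are hypothetical. It is meant only as an illustration of scale, not a model of any real acoustic space.

```python
# Rough wavelength comparison; every constant here is an assumed round figure.
SPEED_OF_SOUND_M_S = 343.0         # dry air at roughly 20 degrees C (assumption)
GREEN_LIGHT_WAVELENGTH_M = 550e-9  # middle of the visible spectrum (assumption)


def wavelength_m(speed_m_s, frequency_hz):
    """Wavelength = propagation speed / frequency."""
    return speed_m_s / frequency_hz


def wavelength_to_obstacle_ratio(wavelength, obstacle_m=1.0):
    """How large the wave is compared with the thing in its way.

    Ratios near or above 1 mean the wave bends (diffracts) around the
    obstacle readily; ratios far below 1 mean it casts a sharp shadow.
    """
    return wavelength / obstacle_m


if __name__ == "__main__":
    for freq in (100.0, 1_000.0, 10_000.0):
        lam = wavelength_m(SPEED_OF_SOUND_M_S, freq)
        ratio = wavelength_to_obstacle_ratio(lam)
        print(f"sound {freq:>7.0f} Hz: wavelength {lam:.4f} m, ratio {ratio:.4f}")

    ratio = wavelength_to_obstacle_ratio(GREEN_LIGHT_WAVELENGTH_M)
    print(f"green light: wavelength {GREEN_LIGHT_WAVELENGTH_M:.2e} m, ratio {ratio:.2e}")
```

Under these assumptions a 100 Hz tone has a wavelength of about 3.4 m, larger than the obstacle itself, while visible light is millions of times smaller than it. That gap in scale is what lies behind the low end of a sound wrapping around walls that stop light cold.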

In terms of the soundscape, and location audio recording (sometimes referred to as ‘phonography’), this means that sounds are harder to isolate: they mix; they overlap, overlay, and interpenetrate; they mask and obscure; they bump into one another, jump the queue, and knock each other down. Light, generally speaking, is much more orderly, polite, and willing to line up and wait its turn. Pointing a mic in one direction doesn’t necessarily mean you won’t pick up sounds coming from another, and in fact that’s largely what you get (specialty mics, discussed below, notwithstanding), resulting in recordings with mixtures of sounds that emanate from all sorts of directions; on the other hand, point a lens in one direction, and you don’t typically end up with a photo of something off to the side or behind you; you get pretty much what you’re pointing at. Now, if what you are after, sonically speaking, is a general ambience, then this is not much of an issue – you’ll get that pretty easily. If, on the other hand, you are trying to foreground or isolate a specific sound – particularly a relatively quiet or distant sound – then you are likely to end up recording other sounds just as much as (if not more than) the one you’re after.

The soundscape and the landscape

Let’s face it, contemporary Western societies spend much more time, energy, and money on thinking about, considering, evaluating, planning, engineering, and designing the visual aspects of the built environment than they do the aural aspects. One need only consider the visually spectacular architecture of a building while listening to the sonic detritus of its HVAC system, or dine in a visually attractive restaurant with gymnasium acoustics, or marvel at the engineering feat of modern transportation networks while being inundated with automobile, truck, bus, aircraft, and motorcycle noise. Whether the motives be economic, social, cultural, or political; whether we can attribute it to expediency, notions of ‘progress’, ideas of tradeoffs and compromise, or simple ignorance; and whether we consider the effects in terms medical, psychiatric, affective, or merely aesthetic, there’s clearly an imbalance of attention paid to the visual and acoustic spheres of the built environment. Specific counter-examples notwithstanding (sonic branding, industrial sound design, entertainment, etc.), the truth of the matter is rather starkly apparent to anyone who pays attention.

The key thing to remember is that the built environment is not unique in this regard: the entire acoustic environment is, to varying degrees, impacted by the sounds of humans and their constructions. So where does this leave the photographer and phonographer? On predictably different terrain. Whether the primary interest is the natural world, the human-made world, or somewhere between, selecting out specific sights of interest, without interference, is considerably easier than selecting out specific sounds. I’m not saying photography is easier than phonography; I’m saying the contexts and settings are different, with one more easily able to isolate the subject of interest.

Technology limitations

While there have certainly been a number of developments in audio recording technology – and here the microphone is the primary subject – there’s still nothing comparable within the audio field to a good long lens with a wide aperture (with or without vibration reduction). There are, to be sure, a number of designs intended to increase the ‘reach’ of a mic (meaning more directional sensitivity with correspondingly less sensitivity to off-axis sounds), from super-cardioid and so-called ‘shotgun’ designs to parabolic reflectors and other more exotic ideas, but they all suffer to audible degrees from off-axis ‘coloration’ (uneven response across the audible frequency spectrum) due to phase issues or non-uniform directional characteristics. Certainly, long lenses can introduce various types of visual distortion, but current optical design and construction techniques have largely minimized these as practical limitations (assuming sufficient budget) – not so in the microphone world.

Conversely, most significant advancements in mic design in recent decades have been at the other end of the issue: more faithful and ‘realistic’ reproduction of ambience and spatial context through various surround-sound applications (Soundfield, 5.1 arrays, binaural, etc.). Rather than enabling greater reach or enhanced selectivity, the advances have addressed accuracy in capturing environmental context. Not that this is, in itself, a bad thing – it simply doesn’t address the issue I’ve been discussing in this post.
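For a sense of how modest the ‘reach’ of a directional mic is, the sketch below uses the textbook first-order polar pattern, sensitivity = a + (1 - a)·cos(θ), where θ is the angle away from the direction the mic is pointing. The pattern coefficients are the commonly quoted idealized values, and the function and variable names are hypothetical. Real microphones, shotguns and parabolic reflectors especially, deviate from this model and vary with frequency, which is exactly the coloration problem described above, so treat this as a best-case approximation.

```python
import math

# Idealized first-order pattern coefficients (textbook values, not
# measurements of any particular microphone).
PATTERNS = {
    "omnidirectional": 1.0,
    "cardioid": 0.5,
    "supercardioid": 0.37,
    "hypercardioid": 0.25,
    "figure-8": 0.0,
}


def sensitivity(a, angle_deg):
    """Relative sensitivity at angle_deg off-axis (1.0 means on-axis level)."""
    return a + (1.0 - a) * math.cos(math.radians(angle_deg))


def level_db(a, angle_deg):
    """Off-axis level in decibels relative to on-axis; -inf at a perfect null."""
    magnitude = abs(sensitivity(a, angle_deg))
    if magnitude < 1e-12:
        return float("-inf")
    return 20.0 * math.log10(magnitude)


if __name__ == "__main__":
    for name, a in PATTERNS.items():
        side = level_db(a, 90.0)
        rear = level_db(a, 180.0)
        print(f"{name:>15}: side {side:7.2f} dB, rear {rear:7.2f} dB")
```

Even in this idealized model, a cardioid mic attenuates a sound arriving from directly to the side by only about 6 dB. A lens, by comparison, records nothing at all from 90 degrees off its axis.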


These three factors have conspired to tip the balance in favor (quantitatively speaking) of sights over sounds on this blog. No doubt other factors are at play, as well. The issue of time – addressed in the previous post – also plays into it, in the sense that the time necessary to record, edit, and then listen to individual sound recordings is considerably greater than the time required to take, edit, and (arguably) look at individual photos.

At least, this has been my experience so far…

