Friday, September 6, 2013

some good things last forever

In the game Super Mario 64, there is, at one point, an infinite staircase.

I don't want to tell you exactly how long I tried to climb it.

When you try to climb it, fittingly, the game music switches to a song that seems to ascend in pitch infinitely!


I was ten years old, and it was completely amazing, but it was part and parcel of a completely amazing game.  It's easy for little touches like that to go unnoticed and underappreciated when you're completely spoiled for revolutionary awesomeness.

Looking back, I'm much better able to understand and admire the craft that went into it.  The music is actually what's known as a Shepard tone.  You can hear a Shepard-Risset glissando here:


I'll spare you the details of what's happening here musically/physically (though feel free to read the Wikipedia page to enlighten yourself!); what really interests me is this excerpt from the Wikipedia page:
"The acoustical illusion can be constructed by creating a series of overlapping ascending or descending scales. Similar to the Penrose stairs optical illusion (as in M. C. Escher's lithograph Ascending and Descending) or a barber's pole"
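For the curious, here is a minimal Python sketch of that "overlapping scales" construction.  Everything named in it (`shepard_components`, `shepard_sample`, the six-octave stack, the 27.5 Hz base, the raised-cosine loudness curve) is an assumption made for illustration; it is not how the game's soundtrack was actually produced.

```python
import math

def shepard_components(step, steps_per_octave=12, n_octaves=6, base_freq=27.5):
    """Return (frequency_hz, amplitude) pairs for one note of a Shepard scale.

    All parameter defaults are illustrative assumptions.
    """
    # Only the position within the octave matters, so the scale wraps around.
    frac = (step % steps_per_octave) / steps_per_octave
    components = []
    for octave in range(n_octaves):
        position = octave + frac  # height in octaves above base_freq
        freq = base_freq * 2 ** position
        # Raised-cosine envelope: silent at both ends of the stack,
        # loudest in the middle, so tones fade in at the bottom and out at the top.
        amp = 0.5 * (1 - math.cos(2 * math.pi * position / n_octaves))
        components.append((freq, amp))
    return components

def shepard_sample(step, t):
    """Value of the summed waveform for a given step at time t (in seconds)."""
    return sum(a * math.sin(2 * math.pi * f * t)
               for f, a in shepard_components(step))

# Print the loudest component for a few steps of the scale.
for step in (0, 1, 12):
    loudest = max(shepard_components(step), key=lambda c: c[1])
    print(step, round(loudest[0], 1))
```

Every component rises with each step, yet after twelve steps the stack of tones is exactly the one you started with, which is the audio equivalent of arriving back at the bottom of the staircase.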
I can't help but be excited when I come across synergies between the senses; the fact that different senses can trick our brains in the same way is evidence of the underlying unified interface parsing those senses.  At some level, our brains are translating all stimuli, no matter the source or sense, into a common data format.  Unlocking that format is the key to a perfect (or as perfect as possible) mind-machine interface, which is a necessary precursor to uploading our minds into a digital state and attaining some actual semblance of immortality.

Super Mario 64 is special because it managed to link two of those illusions together (although, admittedly, the staircase was actually infinite, and so not an illusion); there's an inherent beauty to any harmony between the visual and the auditory.  My theory is that this beauty exists because it frees our minds somewhat from having to do the work of reconciling jarringly different stimuli into a sensible pattern for us to understand.  Think of how uncomfortable it is to watch a movie where the sound is a split-second off from the video; understand that I mean the opposite of that, the pleasure induced when the sound and the image reflect and reinforce one another, rather than simply agreeing (the neutral state).

Are there other examples of this?  The first that comes to mind is Disney's Fantasia, and that certainly seems to have been their intent (especially with the latter part of Toccata and Fugue), but it's not like the music was created to reinforce the animation, since it was written decades or centuries earlier.  (That article is still totally worth reading, if only to learn how Disney totally revolutionized movie (and eventually home) theater audio.  Also, did you know it was the Philadelphia Orchestra that recorded the original score?)

While doing research for this post, I came across an article addressing the audio-visual synergies occurring in human speech.  Of course the visual is a component of speech, from the subtle:

I would have gone with Star Wars, but Ithorian communications take a lot longer to explain.

to the not-so-subtle:


So the process for speech (where you are watching the speaker) goes something like this:

1.  Light travels through space and is absorbed by the eyes, where the information is converted into signals which reach the brain as sight.
2.  Vibrations in air are detected by ear drums and converted into signals which reach the brain as sound.
3a.  The brain interprets the sight and makes an estimation of the speaker's mood, emphasis, and inflections (or subversions) of meaning (and, if properly trained, receives the motion of the speaker's lips as verbal information, too); at the same time...
3b.  The brain sifts through the incoming sound, identifying certain patterns as speech and matching them up with known languages, considers the overall tone and delivery of the speech, and makes an estimate of the intended meaning.
4.  The brain combines its estimates of the visual and auditory speech with its knowledge of the speaker's personality and history, the topic, and the intended audience, into a unified message, making decisions on any ambiguities.
5.  The brain reacts to its final understanding of the content of the message by taking into account the listener's entire personality, history, and mood.

That is an extremely simplified overview of the low-level process when just a few words are spoken by one person to another.  That's a lot of work for the brain to do just to hear somebody tell you she ate a pancake, and our brains hate doing work.  They take shortcuts whenever they can, and really like it when things are just apparent.

But this is all an aside; I think engaging the speech centers of our brains forces them to spin into too high a gear to appreciate all the low-level goodness inherent in the synergies I was looking for.  I expect that particular synergy can't really form from any sound but that which we identify as music; this may be because music seems to be, at least partially, interpreted by our minds as language.  And, most amazingly of all, it turns out that your interpretation of the direction of Shepard tones may be directly linked to the tonal qualities of your native language.

Ascending or descending? Your native tongue will decide.

What does this mean?  It just may be that we've been following this Sisyphean staircase in the wrong direction all along.  Stop for a minute, take stock, and see if you're really getting anywhere, or if it's all just music in your head.
