There is a certain language that we learn, whether subconsciously or through a more active process, that helps shape the way we listen to and interact with music. We grow accustomed to certain aspects or tendencies, and eventually, after enough listening, we come to expect things of the music. That interaction between anticipation and response shapes our listening experience whether or not we as individuals have developed a specific vocabulary to articulate it. Now, this can take place within a single piece: when we hear a new song for the first time, having listened to enough music of a similar style or genre, we come to expect certain harmonic behaviors, which the music then either thwarts or affirms. For example, a piece could begin in one key and all of a sudden, before the final cadence (the ending resolution), modulate to a new and unexpected key, defying your expectations. This can also take place over many performances of a single work of music: we listen to multiple performances and compare how the musicians interpret different sections and where they change things around. Imagine you go to listen to a piece with guitar, piano, and drum accompaniment, but the performers decide to play it on xylophone, double bass, and synth. Or it’s your birthday, and your family decides to sing “Happy Birthday” in a minor key. Those differences all come with the added weight of your denied expectations. There are many ways in which music can create anticipation. To actively listen to music is to become aware of those tendencies, recognize your expectations, and then notice or even try to place the exact point at which they were fulfilled or denied. But there are still other ways we can be lulled into complacency or surprised by music.
Recording technology offers a certain luxury when it comes to our listening experiences. We can go to Spotify, find our favorite song, and then loop it indefinitely. But those musical expectations still exist, albeit in a slightly altered way. Whether we like it or not, technology alters our listening experience by introducing new expectations: not necessarily the expectations of harmonic functionality or of performance practices, but rather the illusion of permanence in one particular recording of a piece, which causes us to expect all other recordings or live performances to sound like the one we have grown most accustomed to. I want to take this opportunity to point to an actual example of this occurrence. In 2020, Willow Smith and Tyler Cole released a song called “Meet Me at Our Spot,” a catchy little indie alternative piece featuring a duet between male and female voices with a very simple accompaniment of electric bass, electric guitar, and drums. Interestingly enough, a live recording caught the public eye, and soon it was all over social media, most notably as a heavily used TikTok audio. When fans sought the full live audio on streaming services for their listening pleasure and convenience, their hopes were dashed: they could find only the radio version. People were outraged, and eventually those streaming sites added the live audio recording.
So, is one recording better than the other? To find out, I listened to both multiple times over, starting with the radio version. These are some of the immediate differences I noticed. To start, the live recording has far more variation in dynamics and tempo, and more harmonization. Additionally, without the significant autotuning you hear on the radio, the rawness of the human voice gives the live performance a less uniform sound, especially in Willow’s voice. But it is that non-uniformity that allows her to sing more expressively; the autotune in the radio version weighs down, dulls, and mutes a lot of her expressiveness. Despite the “live performance” label, the live recording still has a decent amount of autotuning, most notably in the male voice, but nowhere near as much as the radio version. Willow also messes up some of the lines during the chorus in the live video: for instance, instead of singing the line “When we take a drive, maybe we can hit the 405,” she sings “When we catch a vibe, maybe we can hit the 405.” Admittedly, that’s pretty nitpicky, and it doesn’t actually affect the musicality of the song.
Now, the radio version is pretty similar, but there are some very noticeable differences. Some of the parts are switched, everything is very uniform (no voice cracks, no differences in timbre from range difficulties), and there is some extreme autotuning going on. Because of the autotune, the voices and instruments meld together, and they have vastly different timbres (the unique color or tone quality of a sound) from the live version. Unlike in the live version, there isn’t much distinction between individual voices or instruments, and the bass is a lot heavier or punchier; at certain parts it almost seems to drag. One last observation is that the female voice, which in the live version is clearly an alto, is mixed in a way that gets rid of a lot of its natural depth while simultaneously causing each pitch to sound a little flat. Overall, though, there is not a ton of difference, and, what’s more interesting, I would argue that the album version lines up with most of the stuff the radio plays anyway.
So why the uproar? Well, without saying whether I think one is better than the other, I do think technology’s influence runs deeper than people realize. Technology gave the public one recording of this song, to which audiences became accustomed. In this way, technology, not the actual music, gave people expectations which the other recording then denied. Most of us first heard the live version of the song on TikTok or some other social media platform, and that informed our listening of all other versions. We have become so accustomed to one recording of each song that we evaluate every other performance with the first one we heard in mind. We are privileged with such easy access that when we hear a different version, it seems wrong or even grotesque, when in actuality there’s nothing wrong with it. It is also interesting that with recording technology, we usually only encounter one version, and that becomes the song for us: static, unchanging. I read internet comments saying things like “Wow, it’s so surprising to hear a live version that’s better than the real version,” as if the “real” song were bound up only in that one specific performance, to be played over and over. Then we’re surprised that an actual performance could ever be better than the pristine, polished, autotuned radio recording.
Often, it’s reversed: we become so used to the autotune that any hint of a voice crack or slightly flat note makes us anxious as we watch the live performance. I know from experience that eventually our ears can become so accustomed to the autotune that we can barely handle anything live. Radio versions are generally the most played, or rather overplayed, and whatever is most accessible becomes the favorite recording. And unfortunately, pop music is generally recorded as very short, catchy pieces, with one version to be put on an album, the radio, or streaming services, unlike classical pieces, where we have access to the score to perform and there are many more recordings to choose from.
I am not suggesting that we throw out album recordings. There’s no way to get around the use of audio recording technology. In fact, it can be super useful for comparison or analysis. But it cannot be denied that it changes the way we hear and interact with music. Humans all have a deeper longing to engage with music in a kind of dance where there is intention, anticipation and response, and a sort of push and pull motion. Recorded music can certainly help facilitate this, but unfortunately it can also dampen that longing. There are two extremes: one that would throw out all technology and listen only to live performances, and one that would completely succumb to whatever recording it first discovered. Neither of those is a good alternative, because one is completely unfeasible while the other produces a completely passive and complacent listener. Instead, maybe we can find a solid middle ground by beginning to evaluate recordings with a more critical ear and learning to articulate what exactly it is that we like or dislike about different recordings.
April Smith is a senior studying Music and English.