Everything is music.
Of course, we’re used to treating dialog, sound effects, and music channels differently in every phase of the audio pipeline. But I suggest a different way of thinking, one that ultimately produces a better final mix.
If it comes out of the speaker — and it doesn’t matter whether it’s gunshots or glockenspiels — it’s all music and should be created and mixed down as such.
There’s a psychoacoustic basis for this approach to video game mixing. Your brain is hardwired to reject quieter signals that sit close in time or frequency to a louder signal.
Huh? What does that mean?
If you’re running your vacuum cleaner, you won’t hear your cell phone ring.
It’s not just that your vacuum cleaner is so much louder than your cell phone. It’s that your vacuum cleaner makes sounds at frequencies that mask your cell phone.
Here’s a practical demonstration of masking on YouTube:
You’ll hear a 1 kHz tone turned off and on in conjunction with some filtered noise that is masking the pure tone. If you listen very, very closely, you’ll hear the masked tone hidden way deep underneath the masker.
It’s not that the masked tone is so much quieter than the masker. It’s that their frequencies are close to one another.
And what’s the point of playing a sound that no one can hear?
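If you want to build a test signal like the one described above, here is a minimal sketch in Python using only the standard library. The sample rate, the levels, the 800 to 1200 Hz noise band, and the function names are all assumptions for illustration; they are not taken from the video, and the noise is only approximated as a sum of random-phase sine waves.

```python
import math
import random

SAMPLE_RATE = 16_000  # Hz; an assumed rate, comfortably above 2 x 1 kHz


def gated_tone(freq_hz, seconds, gate_hz=2.0, amplitude=0.1):
    """Pure tone that is switched on for the first half of every gate cycle."""
    samples = []
    for i in range(int(seconds * SAMPLE_RATE)):
        t = i / SAMPLE_RATE
        is_on = (t * gate_hz) % 1.0 < 0.5
        value = amplitude * math.sin(2 * math.pi * freq_hz * t) if is_on else 0.0
        samples.append(value)
    return samples


def narrowband_noise(low_hz, high_hz, seconds, partials=40, amplitude=0.5, seed=0):
    """Rough stand-in for filtered noise: random-phase sines inside one band."""
    rng = random.Random(seed)
    components = [
        (rng.uniform(low_hz, high_hz), rng.uniform(0.0, 2 * math.pi))
        for _ in range(partials)
    ]
    scale = amplitude / math.sqrt(partials)  # keeps overall level roughly constant
    return [
        scale * sum(math.sin(2 * math.pi * f * i / SAMPLE_RATE + p) for f, p in components)
        for i in range(int(seconds * SAMPLE_RATE))
    ]


def mix(a, b):
    """Sample-by-sample sum of two equally long signals."""
    return [x + y for x, y in zip(a, b)]


# Levels are arbitrary starting points, not tuned to match the video.
tone = gated_tone(1000.0, 1.0)                # the quiet 1 kHz "masked" tone
masker = narrowband_noise(800.0, 1200.0, 1.0)  # louder noise centered on the tone
demo = mix(tone, masker)
```

Moving the noise band away from 1 kHz (say, 3000 to 3400 Hz) while keeping the levels the same is a quick way to hear how much the frequency distance matters.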
Remember this when you mix audio: when it comes to telling sounds apart, your brain doesn’t care whether it’s musical notes or M40s in your ears.
All sounds are sums of frequencies over time. Compose your sound effects with just as much love and attention to frequency usage as you give your music. And remember that in the final mix, everything is music. That is, every sound source must have its own frequency range.
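One rough way to check whether two sources are fighting over the same frequency range is to compare how their energy is spread across a few bands. The sketch below is an assumption-laden illustration, not a perceptual masking model: the band edges, the `overlap` score, and the synthetic test tones are all hypothetical, and a real project would more likely use an FFT library on recorded assets.

```python
import math

SAMPLE_RATE = 8_000  # Hz, assumed
N = 800              # 0.1 s window, so probe frequencies 10 Hz apart line up exactly

# Hypothetical band edges in Hz: lows, mids, highs.
BANDS = [(100, 400), (400, 1600), (1600, 3600)]


def tone(freq_hz, amplitude=1.0):
    """Synthetic sine wave used as a stand-in for a real sound asset."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(N)]


def add(*signals):
    """Sample-by-sample sum of equally long signals."""
    return [sum(values) for values in zip(*signals)]


def power_at(signal, freq_hz):
    """Power of a single frequency component (one bin of a direct DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i, s in enumerate(signal))
    return (re * re + im * im) / (len(signal) ** 2)


def band_profile(signal, step_hz=10):
    """Fraction of the measured energy that falls in each band."""
    energies = [
        sum(power_at(signal, f) for f in range(low, high, step_hz))
        for low, high in BANDS
    ]
    total = sum(energies) or 1.0  # avoid dividing by zero on silence
    return [e / total for e in energies]


def overlap(a, b):
    """Shared share of band energy: near 0 means separate ranges, near 1 means the same."""
    return sum(min(x, y) for x, y in zip(band_profile(a), band_profile(b)))


rumble = add(tone(200), tone(300))   # low-frequency effect
bells = add(tone(2000), tone(2500))  # high-frequency melody
pad = tone(250)                      # sits right on top of the rumble

rumble_vs_bells = overlap(rumble, bells)
rumble_vs_pad = overlap(rumble, pad)
```

A low score for `rumble_vs_bells` and a high one for `rumble_vs_pad` suggests the pad, not the bells, is the source that needs to move or be carved out with EQ.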