How to write digital music

Let’s say you’re a human being and happen to enjoy that hip-hopping and bleep-blooping our species calls “music,” but are tired of sitting idly by and listening to the dumb stuff other people create. How do you get from wanting to write digital music to actually putting a song together?

You don’t really need music theory

Worried you can’t read sheet music? You’re in luck! It turns out band geeks and choir kids mostly wasted their time in high school. Digital composing has largely eschewed traditional music notation in favor of the digital piano roll, which shows the note pitch and duration much more intuitively because it’s not limited by the vertical or horizontal constraints of a piece of paper.

The greatest advantage music education provides is a developed ear for natural-sounding chord progressions, but even this can be learned without formal training by listening to and emulating music you like. Toby Fox picked up his music skills in high school playing music on the piano by ear.

The best bet for those lacking in musical intuition would be to study some of the basic relationship structures between notes: major and minor scales (plus harmonic minor, melodic minor, and pentatonic scales if you want bonus points, and modes if you want to gear into maximum overdrive), intervals, triads (major, minor, diminished, and augmented), and complex chords (7, M7, m7, 6, 9, 11, 13, sus4). Furthermore, prospective songwriters in pop and rock may benefit from learning standard chord progressions (like I-vi-IV-V) and the Roman numerals used to describe them.
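If the alphabet soup above looks scary, it may help to see that scales and chords are just patterns of semitone steps. Here is a minimal sketch in Python, not a proper theory library. The names (`scale`, `chord`, `progression`) are made up for illustration, and every note is spelled with sharps, so flat keys will come out spelled oddly.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Step patterns in semitones (2 = whole step, 1 = half step).
SCALE_STEPS = {
    "major": [2, 2, 1, 2, 2, 2, 1],
    "natural minor": [2, 1, 2, 2, 1, 2, 2],
    "harmonic minor": [2, 1, 2, 2, 1, 3, 1],
    "major pentatonic": [2, 2, 3, 2, 3],
}

# Chord shapes as semitone offsets from the root.
CHORD_INTERVALS = {
    "major": [0, 4, 7],
    "minor": [0, 3, 7],
    "diminished": [0, 3, 6],
    "augmented": [0, 4, 8],
    "7": [0, 4, 7, 10],
    "M7": [0, 4, 7, 11],
    "m7": [0, 3, 7, 10],
    "sus4": [0, 5, 7],
}


def scale(root, kind):
    """Walk the step pattern upward from the root (the last step returns to the root)."""
    idx = NOTE_NAMES.index(root)
    notes = [root]
    for step in SCALE_STEPS[kind][:-1]:
        idx = (idx + step) % 12
        notes.append(NOTE_NAMES[idx])
    return notes


def chord(root, kind):
    """Stack the chord's semitone offsets on top of the root."""
    idx = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(idx + offset) % 12] for offset in CHORD_INTERVALS[kind]]


def progression(key, numerals):
    """Triads for Roman numerals in a major key (upper case = major, lower case = minor)."""
    degrees = {"I": (0, "major"), "ii": (1, "minor"), "iii": (2, "minor"),
               "IV": (3, "major"), "V": (4, "major"), "vi": (5, "minor")}
    notes = scale(key, "major")
    return [chord(notes[degree], quality)
            for degree, quality in (degrees[n] for n in numerals)]


print(scale("C", "major"))
print(progression("C", ["I", "vi", "IV", "V"]))
```

Calling `progression("C", ["I", "vi", "IV", "V"])` gives the C major, A minor, F major, and G major triads, which is the I-vi-IV-V progression in the key of C.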

If you’re interested in writing ambient, atonal, or otherwise avant-garde music, throw all this out the window and just start slinging notes down, unless of course you’d like to use the twelve-tone technique to write methodical garbage.

You will probably need money, though

The type of software you need for digital music writing is called a DAW. More robust than a simple waveform editor, a DAW is a MIDI sequencer, VST host, and a digital recording studio rolled into one (hint: you will need at least the first two of those things). While there are some free DAWs on the market, such as Cakewalk, T7, Ohm Studio, and LMMS, most of the popular ones cost anywhere from $60 to $100 for the most basic license.

I, the one and only Plasterbrain, begrudgingly continue to use Acoustica Mixcraft, which is on the cheaper end at $89. Mixcraft is incredibly easy to use, but lacks the plugin support one would expect from a commercial DAW (haha iZotope Neutron Elements… RIP my wallet) and tends to shit the bed on CPU-heavy projects, so I avoid recommending it to people.

Instead, to find your perfect match, I encourage you to narrow down a list of DAWs based on the following criteria:

  • OS. There are a couple of options that are platform-exclusive. Logic Pro is a no-go if you’re on Linux or Windows; conversely, Cakewalk is Windows only.
  • Basic Goddamn Features. Make sure the version of the DAW you’re looking at has the features it’s supposed to have. I don’t even mean bells and whistles that will make your life easier — I mean basic features. Unlimited tracks, audio files/recording, and support for 64-bit plugins are just a few.
  • Price. FL Studio remains a very popular entry point for electronic musicians, but its cheapest worthwhile edition[1] is $199, which is perhaps one of the reasons everybody and their brother has pirated it. 🙃
  • Popularity. If you think you’ll need training wheels in the music-writing process, choose a popular DAW. Plugins are more likely to be compatible with a DAW that developers know is widely used. Furthermore, popular DAWs like FL Studio will have bigger communities you can rely on for tips, tutorials, and troubleshooting. Unless you’re choosing an option that’s open-source, you may also want to choose a popular DAW because it’s less likely to be abandoned[2] by its developers than Little Timmy’s Honkey-Tonk Indie DAW 2005 Edition.
  • Bundled Software. If you’re willing to spend money on music software from the get-go, opting for a DAW that comes with premium plugins may be a better investment in the long run. Take a look at what different DAWs offer, as the included software might suit some types of musician better than others (e.g. mastering tools vs pitch-correction vs premium synths).

Once you’ve narrowed it down, download some free trials and get cracking. Idiomatically speaking, of course. Unless you chose FL Studio, in which case you might actually be cracking. Ho ho!

Getting to know your music software

Now that you’ve found a nice DAW to settle down with in the quiet suburbs, in a house with a white picket fence, it’s time to learn about the things that go into your DAW to make music come out!

A project file in your DAW comprises various musical tracks, which either contain audio files or virtual instrument data.

The easiest way to make a song when you’re first starting out is to combine loops. If you’ve ever used GarageBand or Magix Music Maker, you’ve likely assembled a song out of loops.

Loops are short music files featuring one or more instruments which can be seamlessly repeated (looped). They are almost always royalty-free, and meant to be used in conjunction with other loops or your own music in the making of a song. For example, Modal Shanghai uses this EDM drum loop as the beat. I’m doing more of my own drums these days, I promise!

Aside from loops, unless you’re recording your voice or a live instrument, most of your tracks will probably be virtual instrument tracks, which use MIDI sequences instead of audio clips. In order to produce sound from MIDI data, you’ll need to assign the track containing your MIDI sequence one or more virtual instruments, in the form of a VST.

[Video: Cinco MIDI Organizer | Tim and Eric Awesome Show, Great Job! | Adult Swim]
Hell yeah, MIDIs!
The basic technology behind computer music, MIDI is the data describing the notes in your song and their various qualities. (In pop culture, the term “MIDI” often connotes the sounds of a default General MIDI synth, Casio keyboards, and the kind of music that used to autoplay on Web 1.0 pages.)

The general procedure goes like this. First, you make an instrument track. Second, you assign a virtual instrument to it. Virtual instruments, which are loaded into the DAW software as plugins, translate MIDI data into something resembling performance on a real life instrument. Third, you add clips to the track which you populate with MIDI data, either by playing notes on a computer-connected MIDI keyboard or dragging notes on screen using a good old-fashioned mouse. How you manage these three basic tasks depends on the interface of the specific DAW you’re using.

Most DAWs come packaged with a few virtual instruments of their own, but you’ll likely need to find new ones on the net to better customize your arsenal.

Understanding instrument formats

VST is the most popular audio plugin format. While some DAWs have their own proprietary formats (such as AU in Logic and GarageBand and AAX in Pro Tools), most rely almost exclusively on the use of VSTs. VST2 is the more widely supported version, though some plugins can also be installed as VST3. VST refers to both the instrument plugins and the audio effect plugins used to shape the resulting signal. VSTi is sometimes used to specifically refer to VST instruments.
Samples are high-quality recordings of an instrument’s or ensemble’s individual notes, curated for the purpose of producing a virtual instrument. Some kinds of virtual instruments, like a violin, may need several thousand samples to achieve a realistic sound, whereas a simple drum kit might use fewer than ten. Much of the work in developing high-end virtual instruments involves scripting the various ways in which samples are associated with keyboard notes.
A sampler is a kind of plugin that allows you to map samples to keyboard notes. In most cases, you’ll be loading instruments and soundfonts where this has already been done, though in the case of drum machines, much of the beauty comes from customization of the samples (called oneshots) that are used.
SoundFonts, commonly styled as the lower-case, genericized trademark “soundfonts,” are files which can be loaded by a SoundFont player to play a collection of samples. The traditional soundfont format is .sf2, but there’s also the open format .sfz as well as MuseScore’s very uncommon, ogg-based .sf3.

Because of the low barrier to entry of developing and using soundfonts, they are a popular virtual instrument format for hobbyists. Soundfonts are an ideal tool for f2p composers who aren’t overly concerned with ultra-realistic sound quality, or who want to emulate their favorite 16-bit chipsets. Nowadays, with the exception of a few instruments developed specifically for Plogue’s Sforzando, most soundfonts on the net are free, though they often materialize from the ether with no warranty, license, or definitive origin. The legality of writing with SoundFonts based on ostensibly copyrighted sound sets remains ambiguous, because nobody really cares.

Kontakt Instrument
Many high-end virtual instruments are either Kontakt libraries (.nicnt) or instruments (.nki), both of which can be used in Native Instruments’ Kontakt. Kontakt is the industry standard software sampler, and arguably the most commonly used engine for commercial virtual instruments due to its feature set and ease of use when compared to older technologies (soundfonts ahem).

Kontakt libraries are nearly always commercial and require serial number activation through the slightly less earth-shatteringly terrible Native Access (formerly the earth-shatteringly terrible NI Service Center), while instruments are often cheaper (or free) as the developers don’t have to pay Native Instruments a licensing fee. (There are also multis, .nkm, which are saved configurations of multiple instruments loaded together.)

Kontakt costs $399, but a free version exists called Kontakt Player which can be used to load libraries unrestricted and instruments/multis for 15 minutes at a time.
See also: Do I need Kontakt? (Patron-exclusive)

While many virtual instrument formats exist, the easiest way to find the instrument you’re looking for is by searching “[instrument] vst,” “[instrument] soundfont,” or “[instrument] kontakt,” with an emphasis on SoundFonts (and increasingly VSTs) if you’re in the noncommercial market.

Features of sample-based instruments

Free instrument listings usually aren’t loaded with random technical jargon, but if you’re thinking of spending money on your next pretend piano or viola, it’s a good idea to know what you’re getting.

A polyphonic instrument can sound two or more notes simultaneously (e.g., a guitar), whereas a monophonic instrument is limited to one at a time (e.g., the human voice). Many virtual instruments and most synths offer the option to toggle between a polyphonic mode, which allows you to play chords and harmonies, and a monophonic mode, which can be used to play legato and, in synths, often results in a much bigger sound.
Many instruments — particularly string instruments — can be played using a variety of techniques, e.g., short notes versus long notes or plucked strings vs strings played with a bow. These play styles are referred to as articulations. Here are some articulations you might encounter, each with their related articulations listed underneath from most to least common.

  • Sustain – Long notes, which usually rely on a looped sample so they can be held indefinitely.
    • Legato – Sustained notes played as one continuous, lyrical phrase. Vocal performers can achieve this effect by waiting to take a breath until the end of a phrase, while string players will play a phrase without lifting their bow off the strings.
      • Portamento – Sliding from one note to another. Guitar libraries might cut to the chase and refer to these as slides. On pianos, harps, and other instruments where note values are discrete, the equivalent effect is a glissando.
      • Hammer-on/pull-off – A note played on guitar by placing or removing a finger on an already vibrating string.
      • Slur – Two or more notes played on a string instrument without changing directions of the bow. Slurs are not often listed as separate articulations, but oftentimes instruments will allow you to trigger a slur with a keyswitch or CC.
  • Staccato – Short, detached notes.
    • Spiccato – A kind of staccato achieved on strings by bouncing the bow back and forth.
  • Pizzicato – Notes played by plucking, rather than bowing, a string instrument.[3]
    • Bartok pizz – A kind of pizzicato where the strings are plucked so hard they snap against the fingerboard.
  • Tremolo – A “trembling” sound produced on string instruments by moving the bow back and forth in very short motions, playing one note repeatedly. On guitars and other instruments for which physical tremolo normally isn’t possible, the same effect can be achieved with a pedal or plugin effect which rapidly oscillates the volume of a note.
    • Flutter tongue – A buzzing, trembling, or growling note produced on brass and woodwind when the player flutters their tongue. The characteristic of the sound varies depending on the instrument.
    • Double tonguing – A rapid succession of short notes produced by making a “d-g” or “t-k” sound on a woodwind instrument.
  • Harmonics – A high, squeaky, isolated overtone produced by lightly pressing down on the string. I’ve only seen this included in guitar libraries, though bowed instruments can produce them as well.
  • Trill – Going back and forth between two notes very fast. Usually instruments with this articulation will offer trills at different intervals, major/whole step and minor/half step.
    • Mordent – Similar to a trill, but it’s just one oscillation, used as an ornament at the start of a note.
  • Power chords – A perfect fourth or perfect fifth on the guitar. According to modern science, playing power chords is the #1 way to sound like a cool person.
    • Double stop – Two notes played on a string instrument at once by moving the bow across two strings simultaneously. Double stops are very uncommon in virtual instruments, and you’re more likely to find them on a cello or bass library than that of a viola or violin.
  • Marcato – Accented notes.
    • Tenuto – Notes held out to their full length and then some. Tenuto can also indicate an accent depending on the context.
    • Sforzando – Notes that are suddenly VERY LOUD. Similar to fortepiano, only fortepiano starts very loud and becomes very soft.

♫ Long and thin, legato is like spaghetti. ♫

♫ Short, and chopped in tiny pieces, staccato is like macaroni. ♫

We were forced to recite this chant in AP Music Theory. It is my life’s mission to find the person who wrote this song and kill them.

These playing styles are often lumped in with articulation in an instrument’s specifications, but they are in fact different “flavors” which can be combined with the articulations above to form many permutations of sound! … Colors is not actually an official term.

  • Muted – Notes produced by limiting the resonance of the instrument in some form, e.g. with the palm on the strings of a guitar or by shoving a metal cone into your trumpet.
    • (con) Sordino – Strings which are muted by placing a rubber clamp on the bridge.
  • Sul ponticello – Lit. “on the bridge,” meaning the bow is played on or very close to the bridge, resulting in a harsh, scratchy sound.
  • Sul tasto – Lit. “on the fingerboard,” meaning the bow is played on the fingerboard, resulting in a softer, more muted sound.
The layman meaning of “octave” can vary depending on the musical context, though in the case of digital instruments and the keyboard layout on which they are based, an octave is the inclusive range from one C note to the next B note above it. An 88-key piano includes seven full octaves, plus bits of octaves 0 and 8 at the beginning and end of its range, because the piano’s range starts on an A rather than a C and ends on a lone C. It’s not common to find octaves above 8 or 9, though a lot of keyboard interfaces will go as low as -4. To refer to a note in a particular octave, you append the octave number to it. For example, C4 is a bomb, er, what pianists call “middle C,” and black holes apparently resonate at a B♭-53.

A diagram showing where each numbered octave on the 88-key piano begins and ends. Only three keys are present from octave 0 and only one key is present from octave 8.
Piano Frequencies by AlwaysAngry, licensed under CC BY-SA 3.0.
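Since MIDI stores pitches as plain numbers, the octave naming scheme boils down to a bit of integer division. The sketch below assumes the common convention that MIDI note 60 is middle C and is labeled C4 (some DAWs call the same key C3, hence the `middle_c_octave` parameter), and that A4 is tuned to 440 Hz. The function names are my own.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_name(midi_number, middle_c_octave=4):
    """Name a MIDI note number, e.g. 60 -> 'C4' when middle C is octave 4."""
    # MIDI note 60 (middle C) sits in the sixth group of twelve (index 5).
    octave = midi_number // 12 - (5 - middle_c_octave)
    return f"{NOTE_NAMES[midi_number % 12]}{octave}"


def frequency(midi_number, a4_hz=440.0):
    """Equal-temperament frequency in Hz; MIDI note 69 is A4."""
    return a4_hz * 2 ** ((midi_number - 69) / 12)


# The 88-key piano spans MIDI notes 21 through 108.
print(note_name(21), note_name(60), note_name(108))
print(round(frequency(60), 2))
```

Under these assumptions the piano’s lowest key comes out as A0 and its highest as C8, which matches the partial octaves at either end of the keyboard.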
Libraries with a lot of built-in articulations and features often allow you to switch between desired play styles and other options using notes below the instrument’s range, usually in octaves -2 through 0. Keyswitching can be used in addition to or as an alternative for custom-set CCs in some cases.
Round Robin
A feature of virtual instruments wherein playing the same note more than once will cause the script to rotate through multiple samples of that articulation/pitch, in order to sound more realistic and less repetitive. A library with three samples for every pitch might boast x3 Round Robin. Neighbor-borrowing Round Robin indicates the sample for an adjacent pitch is tuned up or down to achieve the same effect.
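The rotation itself is simple enough to sketch in a few lines. This is only an illustration of the idea, not how any particular sampler engine does it; the class name and the sample file names are hypothetical.

```python
from itertools import cycle


class RoundRobinSampler:
    """Rotate through the available samples each time a pitch is retriggered."""

    def __init__(self, samples_by_pitch):
        # One endlessly repeating iterator per pitch.
        self._cycles = {pitch: cycle(files)
                        for pitch, files in samples_by_pitch.items()}

    def trigger(self, pitch):
        """Return the next sample for this pitch, wrapping around at the end."""
        return next(self._cycles[pitch])


# Hypothetical x3 Round Robin snare mapped to MIDI note 38.
sampler = RoundRobinSampler({38: ["snare_rr1.wav", "snare_rr2.wav", "snare_rr3.wav"]})
print([sampler.trigger(38) for _ in range(4)])
```

On the fourth hit the sampler wraps back around to the first sample, so no two consecutive hits use the same recording.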
Recordings of a clean electric guitar signal without amps, cabinets, or effects on.
Naked/dressed refers to whether an audio demo for an instrument features the instrument alone (naked) or “dressed” with accompaniment and effects. The page for Impact Soundworks’ Stroh Violin has demos of each kind.

Features of synths

In addition to understanding sample-based instruments, a lot of genres require at least a passing familiarity with the magic of synths. Though you can get by on the most cursory synth knowledge, a thorough understanding of synths will allow you to break out of your dependence on generic (and usually commercial) presets. Then you can spend your hard-earned money on more important things, like ice cream or Plasterbrain’s discography. I say all this as someone who uses 90% presets with some minor variations. I just want you to give me money.

A synth is a piece of hardware or software that programmatically generates sound based on simple generic waveforms. Unlike virtual instruments, a synth does not need samples to create sound.

  • Additive synth – Creates sound by combining sine waves. A popular example is u-he’s Zebra2.
  • Subtractive synth – Creates sound by subtracting frequencies from a basic waveform. A popular example is NI’s Massive.
  • FM synth – Creates sound by modulating the frequency of one wave with a second wave. A popular example is Yamaha’s DX7.
  • Wavetable synth – Creates sound by modulating any of a number of complex, arbitrary waveforms saved in a table. A popular example is Xfer’s Serum.
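To make two of these less abstract, here is a rough sketch of additive and FM synthesis in plain Python. It is a toy, nothing like what Zebra2 or a DX7 actually does under the hood. The function names and the 44.1 kHz sample rate are assumptions, and the output is just a list of floats you would still need to write to an audio file yourself.

```python
import math

SAMPLE_RATE = 44100  # assumed sample rate in Hz


def additive_saw(freq, n_harmonics, n_samples):
    """Additive synthesis: sum sine harmonics at 1/k amplitude to approximate a saw."""
    out = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        out.append(sum(math.sin(2 * math.pi * k * freq * t) / k
                       for k in range(1, n_harmonics + 1)))
    return out


def fm_tone(carrier_hz, modulator_hz, index, n_samples):
    """FM synthesis: one sine wave wobbles the phase of another."""
    out = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        wobble = index * math.sin(2 * math.pi * modulator_hz * t)
        out.append(math.sin(2 * math.pi * carrier_hz * t + wobble))
    return out


saw_ish = additive_saw(220.0, 20, SAMPLE_RATE // 100)  # 10 ms of audio
bell_ish = fm_tone(440.0, 220.0, 2.0, SAMPLE_RATE // 100)
print(len(saw_ish), len(bell_ish))
```

Adding more harmonics makes the additive saw sharper, while raising `index` on the FM tone adds more sidebands and a harsher, more metallic character. With `index` at 0 the FM tone collapses back into a plain sine wave.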
An oscillator is a thing that goes back and forth, basically. In synths, the term “oscillator” by itself refers to the signal generator which creates a periodic sonic waveform. Synth waveforms[5] are far more basic than the waves created by actual instruments. Read on as I attempt to describe their various types:

  • Pulse wave – A transient wave with a flat crest and trough, representing an “on” state (amplitude of 1) and an “off” state (amplitude of 0). Pulse waves are described by their duty cycle, which is the ratio of time spent in the “on” state to the entire period of the wave. For example, a pulse wave with a 12.5% duty cycle is “on” for 1/8th of the total period. The four types of pulse wave you’ll find are 12.5%, 50%, and 25%/75% (which sound almost identical).
    • Square wave – A pulse wave with a 50% duty cycle.
  • Saw wave – A wave with sharp, sawtooth crests shaped like a right triangle. Saws are the bread and butter of Hi-NRG, Eurobeat, EDM, and Para Para. Usually, the intense, face-blasting saw leads you hear in these genres are referred to as supersaws or hypersaws.
  • Triangle wave – A wave with sharp, triangular crests which lack the vertical descent of a sawtooth. Triangle waves make great chiptune basses.
  • Sine wave – A wave shaped like a sinusoid. Take a math class sometime, nerd! Sine waves have a softer sound that you might hear in hip-hop.
  • Noise – A wave with randomly modulating amplitude that lacks musical quality. Outside of a synth context, noise signals are described by different colors (most commonly white, pink, or brown) depending on the relationship between their amplitude (dB) and frequency (Hz). In the case of the NES 2A03 sound channels, noise comprises a repeating sequence of 93 or 32767 random bits. Matt Montag, who created the Nintendo VST, compares them to “a square wave with a continuously-varying random pulse width.” 32767 is the highest signed 16-bit value, and 93… uh… I’m not sure why they picked 93. The robotic-sounding intro of Metal Crusher is made entirely of sounds using 93-bit noise.
  • Wavetable synths also use a selection of arbitrary waveforms. Sample-based synths, like iZotope’s Iris 2, generate sound based on audio samples. Chiptune emulators may also have a DPCM/PCM channel, which can be used to play back samples, e.g. for drumkits.
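The basic waveforms in this list are easy to describe as functions of phase, where phase counts cycles (0.0 is the start of a cycle, 1.0 is the start of the next one). This is a naive sketch: the function names are mine, the pulse follows the 1/0 convention described above while the other waves swing between -1 and 1, and nothing here is band-limited, so played back at high pitches these would alias in ways a real synth tries to avoid.

```python
import math
import random


def pulse(phase, duty=0.5):
    """'On' (1.0) for the first `duty` fraction of each cycle, otherwise 'off' (0.0)."""
    return 1.0 if (phase % 1.0) < duty else 0.0


def saw(phase):
    """Ramp from -1 up to 1 across the cycle, then drop straight back down."""
    return 2.0 * (phase % 1.0) - 1.0


def triangle(phase):
    """Rise from -1 to 1 over the first half of the cycle, fall back over the second."""
    return 1.0 - 4.0 * abs((phase % 1.0) - 0.5)


def sine(phase):
    return math.sin(2.0 * math.pi * phase)


def noise(_phase):
    """White-ish noise: a fresh random value every sample, ignoring phase entirely."""
    return random.uniform(-1.0, 1.0)


def render(wave, freq, n_samples, sample_rate=44100):
    """Sample a waveform function at the given frequency."""
    return [wave(freq * n / sample_rate) for n in range(n_samples)]


square_cycle = render(pulse, 1.0, 8, sample_rate=8)
eighth_cycle = render(lambda p: pulse(p, duty=0.125), 1.0, 8, sample_rate=8)
print(square_cycle)
print(eighth_cycle)
```

Rendering one cycle at eight samples per cycle makes the duty cycle visible: the square wave is on for four of the eight samples, and the 12.5% pulse is on for just one.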
An LFO is a signal at 20 Hz or less which is used to modulate a parameter of a synth sound (like volume or pitch) in a rhythmic fashion. LFOs can be combined with other LFOs and envelopes for even more drastically modulated effects.
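As a rough illustration, here is an LFO modulating volume, which gives you a tremolo effect. This is a sketch under my own assumptions: a sine-shaped LFO, a `depth` parameter from 0 to 1, and made-up function names. Real synths let you pick the LFO shape and route it to nearly any parameter.

```python
import math


def lfo(rate_hz, t):
    """A sine LFO: swings between -1 and 1, `rate_hz` times per second."""
    return math.sin(2.0 * math.pi * rate_hz * t)


def tremolo(samples, rate_hz=5.0, depth=0.5, sample_rate=44100):
    """Let the LFO pull the volume down by up to `depth` (0 = no effect, 1 = full)."""
    out = []
    for n, sample in enumerate(samples):
        # Rescale the LFO from [-1, 1] to [0, 1], then use it to duck the gain.
        amount = 0.5 * (1.0 + lfo(rate_hz, n / sample_rate))
        out.append(sample * (1.0 - depth * amount))
    return out


steady = [1.0] * 44100          # one second of a constant signal
wobbly = tremolo(steady, rate_hz=5.0, depth=0.5)
print(min(wobbly), max(wobbly))
```

Swapping the multiplication for a pitch offset would turn the same LFO into a vibrato instead.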
ADSR Envelope
An envelope is basically a graph describing a single note, where the x-axis is time and the y-axis is a single parameter, most commonly volume/amplitude. The x-axis of an ADSR envelope is divided into four sections: attack, decay, sustain, and release. You can modify the sound of a note by adjusting the length of these sections or the y-axis value within them.

  • Attack – The start of the note. A long attack will cause the note to fade in rather than sounding immediately.
  • Decay – The transition between attack and sustain.
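  • Sustain – The level the note settles at after the decay, held for as long as the key is pressed. Unlike the other three sections, sustain is usually set as a level rather than a length of time. A sustain of zero will let the note die out even if the key is still being held.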
  • Release – The part after the key is released.

Virtual instruments often have a volume envelope that can be adjusted with ADSR knobs. Synths are more likely to have multiple visual envelopes (for amplitude, filter, pitch, etc.) which you can manually adjust on either axis.
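Here is one way an amplitude envelope might be computed, as a sketch with straight-line segments. The assumptions are mine: attack, decay, and release are times in seconds, sustain is a level between 0 and 1 that is held while the key is down (which is how most synth envelopes treat it), and the function name is made up. Real envelopes often use curved segments instead.

```python
def adsr_level(t, note_length, attack=0.01, decay=0.1, sustain=0.7, release=0.2):
    """Envelope level (0 to 1) at time t, for a key held down for note_length seconds."""

    def level_while_held(time):
        if time < attack:
            return time / attack                      # attack: fade in from 0 to 1
        if time < attack + decay:
            progress = (time - attack) / decay
            return 1.0 - (1.0 - sustain) * progress   # decay: fall from 1 to sustain
        return sustain                                # sustain: hold steady

    if t < 0.0:
        return 0.0
    if t < note_length:
        return level_while_held(t)

    # Release: fade from wherever the envelope was at key-up down to silence.
    elapsed = t - note_length
    if elapsed >= release:
        return 0.0
    return level_while_held(note_length) * (1.0 - elapsed / release)


for moment in (0.0, 0.005, 0.01, 0.5, 1.1, 2.0):
    print(moment, adsr_level(moment, note_length=1.0))
```

Multiplying each sample of a raw oscillator by `adsr_level` at that sample’s timestamp shapes it into a note with a beginning, middle, and end.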

Filters can be used to block certain frequency ranges.

  • HPF – Blocks off sound below a given cutoff frequency, allowing only the higher frequencies to pass.
  • LPF – Blocks off sound above a given cutoff frequency, allowing only the lower frequencies to pass.
  • BPF – Blocks off sound outside of a given frequency range, allowing only that “band” on the frequency spectrum to pass.
  • APF – Allows all frequencies to pass, but modifies the phase of certain frequencies. APFs are mostly used in phaser plugins, but you may occasionally see one on your synth, too.
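For the curious, the simplest possible low-pass and high-pass filters can be sketched in a few lines. This is a one-pole design, which is far gentler than the resonant filters on a typical synth, and the function names are my own. The high-pass here is just “the input minus its low-passed version,” which is one common way to build it.

```python
import math


def low_pass(samples, cutoff_hz, sample_rate=44100):
    """One-pole LPF: each output drifts toward the input, smoothing out fast changes."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


def high_pass(samples, cutoff_hz, sample_rate=44100):
    """One-pole HPF: whatever is left after subtracting the low-passed signal."""
    lows = low_pass(samples, cutoff_hz, sample_rate)
    return [x - low for x, low in zip(samples, lows)]


# A signal flipping sign every sample is the highest frequency a digital system can hold.
fastest = [1.0 if n % 2 == 0 else -1.0 for n in range(2000)]
print(max(abs(v) for v in low_pass(fastest, 100.0)))
```

Feeding the fastest possible wiggle through a 100 Hz low-pass squashes it to a tiny fraction of its original level, while a constant (0 Hz) signal passes through the low-pass untouched and gets removed by the high-pass.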

Mastering MIDI

There’s more to MIDI than just note pitches and durations. In order to get the most out of the instrument features described above, and especially to imitate the nuances of live musical performance, it’s important to understand the more ancillary aspects of MIDI.

MIDI data values are represented as numbers from 0 to 127. That is 7 bits’ worth, and 127 also happens to be the largest signed 8-bit value.

CCs are used to assign MIDI notes certain performance data that aren’t covered by the other 6 Channel Voice Messages. For example, CC 11 controls “expression” and CC 2 is the breath controller. There are 128 CC numbers in all (0 through 127). Many instruments and synths will allow you to assign certain parameters to CCs, though certain ranges are reserved for internal use. Here’s a list of all the default CC assignments. Support for various CCs varies by instrument.
Note-on velocity is a MIDI value which determines how hard the note is struck. Depending on the instrument, velocity may only affect the volume of the notes, or it may completely change the sound of the attack/transient. Note-off velocity affects how the note is released, but it’s not nearly as commonly used.
Mod Wheel
CC1 is modulation, represented by the mod wheel on hardware controllers. By default, modulation usually controls oscillation of the current note, which can be used to create a vibrato effect. However, it’s not uncommon for instruments and synths especially to have their own purpose for the mod wheel or allow you to assign to it the parameter of your choosing.
Pitch wheel/pitch bend
Pitch bend is a separate control from the CCs. Its default value (0) starts in the middle, and can be tuned up or down to modify the note’s pitch, usually to create a portamento effect. The pitch wheel is situated next to the mod wheel on hardware controllers. Unlike the mod wheel, which moves up and down freely, the physical pitch wheel is designed to snap back to its default position on release.
Aftertouch is also not a CC. The term refers to MIDI data sent based on the pressure applied to a key after it has been pressed down.[7]
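Under the hood, each of these is a tiny message of two or three bytes: a status byte saying what kind of message it is and which channel it’s on, followed by one or two data bytes from 0 to 127. The sketch below builds the raw bytes by hand using the status values from the MIDI 1.0 spec. The function names are my own, channels are numbered 0 through 15 here, and your DAW normally handles all of this for you.

```python
def _data(value):
    """Data bytes must fit in 7 bits (0 to 127)."""
    if not 0 <= value <= 127:
        raise ValueError(f"MIDI data byte out of range: {value}")
    return value


def note_on(channel, note, velocity):
    return bytes([0x90 | channel, _data(note), _data(velocity)])


def control_change(channel, controller, value):
    return bytes([0xB0 | channel, _data(controller), _data(value)])


def channel_aftertouch(channel, pressure):
    return bytes([0xD0 | channel, _data(pressure)])


def pitch_bend(channel, amount):
    """amount runs from -8192 to 8191, with 0 meaning the wheel is centered."""
    raw = amount + 8192                      # shift into the 14-bit range 0..16383
    if not 0 <= raw <= 16383:
        raise ValueError(f"pitch bend out of range: {amount}")
    return bytes([0xE0 | channel, raw & 0x7F, raw >> 7])  # low 7 bits, then high 7 bits


print(note_on(0, 60, 100).hex())        # middle C, struck fairly hard
print(control_change(0, 1, 127).hex())  # mod wheel (CC1) all the way up
print(pitch_bend(0, 0).hex())           # pitch wheel at rest
```

Pitch bend gets two data bytes instead of one, which is why it has much finer resolution than a CC and why it lives outside the CC numbering.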

Mastering the mastering

These are words to know when you’re interested in mixing, mastering, or adding effects to your music. Most of the effects mentioned here are much more complex than I make them sound, so they’re worth looking into further if you want to become an audio mastering master. I am not an audio mastering master, in case that’s not extremely obvious.

Wet/dry signal
A wet signal refers to one that is physically aroused. Ho ho! Really, it’s one that has had effects applied to it. A dry signal is the original sound without effects applied. Many effect plugins will allow you to adjust the volume of the wet and dry signal. By setting a low volume on the wet signal, you can make the effect more subtle.
The panning of an audio signal represents its horizontal position in a stereo image. A track could be panned hard left (only audible from the left speaker), hard right (only audible from the right speaker), center (equally audible from both speakers), or anywhere in between. Along with volume, panning is one of the basic parameters of an audio track.
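Both wet/dry mixing and panning come down to multiplying samples by a volume and adding them up. Here is a minimal sketch with made-up function names. The panning uses a “constant power” curve, which is one common choice (your DAW may use a different pan law), where position -1 is hard left, 0 is center, and 1 is hard right.

```python
import math


def mix_wet_dry(dry, wet, wet_level=0.5, dry_level=1.0):
    """Blend the original and processed signals sample by sample."""
    return [dry_level * d + wet_level * w for d, w in zip(dry, wet)]


def pan(sample, position):
    """Split a mono sample into (left, right) using a constant-power pan curve."""
    angle = (position + 1.0) * math.pi / 4.0   # maps -1..1 onto 0..90 degrees
    return sample * math.cos(angle), sample * math.sin(angle)


print(pan(1.0, -1.0))  # hard left
print(pan(1.0, 0.0))   # center
print(pan(1.0, 1.0))   # hard right
```

The point of the constant-power curve is that a centered sound doesn’t get louder than a hard-panned one: each side plays at roughly 71% instead of 100%.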
Equalizers are used to cut or boost frequency ranges, or bands. For example, you might cut around 60-100 Hz if your bass sounds too muddy. You can find many online guides on how to tweak EQ to get the most out of a particular instrument.
A compressor reduces a sound’s dynamic range by lowering the loudest points and amplifying the softest points. A compressor is usually applied to single tracks.

  • De-esser – A kind of compressor used to remove sibilance from vocals.
The limiter is the compressor’s cooler cousin. They are essentially the same, but a limiter uses a much higher gain reduction ratio and is intended to be used on an entire mix or submix to control the volume.
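The core of both effects is a curve that maps input loudness to output loudness. The sketch below shows only that static curve, working in decibels. Real compressors also smooth the gain changes over time with attack and release settings, which is left out here. The function name and default settings are assumptions for illustration.

```python
def compress_db(level_db, threshold_db=-18.0, ratio=4.0, makeup_db=0.0):
    """Static compressor curve: squash whatever pokes above the threshold.

    With a 4:1 ratio, every 4 dB over the threshold comes out as 1 dB over.
    Makeup gain then raises everything, which is what brings up the quiet parts.
    """
    if level_db > threshold_db:
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db


# Compressor on a single track: 12 dB over the threshold becomes 3 dB over.
print(compress_db(-6.0))

# Limiter on a full mix: an enormous ratio pins loud peaks at the threshold.
print(compress_db(0.0, threshold_db=-1.0, ratio=float("inf")))
```

Seen this way, a limiter really is just a compressor with the ratio cranked so high that nothing gets past the threshold.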
An exciter “brightens” sound quality through harmonic distortion, EQ, and phase modulation.
A phaser uses all-pass filters to combine the original signal with a phase-shifted version of itself. Here’s what a phaser sounds like on guitar.

  • Flanger – A phaser where the wet signal is delayed by a very short amount before being recombined. The result is a phase-distorted, jet-like sound.
  • Chorus – A flanger with a longer delay time (usually greater than 20ms). The result is a more subtle, shimmering sound than can be achieved with a flanger. Unless you’re working in creative sound design or using a lot of electric guitars, you’ll be more likely to use a chorus than a regular phaser or flanger. For example, to turn a regular Jhin into a Project Jhin, you just add a chorus effect to his voice lines. True story.
Reverb is the sound produced when a dry signal interacts with surfaces in an acoustic space. In digital music, reverb plug-ins can be used to emulate real life spaces using impulse responses (IR).
Delay is essentially an echo, an effect achieved by repeating a dry signal at a given rate, for a given stretch of time, at increasingly lower volumes. Delay plugins will often allow you to tweak the panning of the wet signal for a stereo “ping-pong” effect.
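A basic feedback delay is short enough to sketch in full. In this version each repeat is the previous one scaled by `feedback`, so the echoes die away over time. The function name and parameters are my own, the delay time is given in samples rather than milliseconds, and the stereo ping-pong part is left out.

```python
def delay(dry, delay_samples, feedback=0.5, tail_repeats=4):
    """Add echoes of the dry signal, each one `feedback` times as loud as the last."""
    total = len(dry) + delay_samples * tail_repeats   # leave room for the echo tail
    padded = list(dry) + [0.0] * (total - len(dry))
    wet = [0.0] * total
    for i in range(delay_samples, total):
        # Each echo is the sound from one delay ago (dry plus earlier echoes), turned down.
        wet[i] = feedback * (padded[i - delay_samples] + wet[i - delay_samples])
    return [d + w for d, w in zip(padded, wet)]


# A single click, echoed every 3 samples.
print(delay([1.0], delay_samples=3))
```

At a 44.1 kHz sample rate, a quarter-second echo would be a `delay_samples` of about 11025. Keep `feedback` below 1, or the echoes get louder instead of quieter.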

There are numerous other, though less common, effects you might be interested in, especially if you plan on replicating classic sounds by [insert favorite late 20th century band here]: arpeggiators, ring modulators, transient shapers, noise gates, tape and tube saturation, distortion, bit crushers, virtual amps and cabinets, tube screamers, wah-wah pedals, gating and side-chaining plugins, harmonizers, vocoders, pitch-correction software, octave reverb, tape stops… the list goes on.

For everyday mixing, achieving a basic grasp of reverb, compression/limiting, and EQ should be your biggest priority. Using panning, EQ, and side-chaining to give your tracks a more flattering stereo image is also a good skill to have.

  1. The $99 entry-level “Fruity” Edition does not include the ability to use audio files. Yikes.
  2. Related, Mixcraft is about a year behind its major update schedule, hasn’t released any regular patches in months, and runs perpetual 50% off sales for its flagship product. I am sweating bullets, y’all.
  3. The term for using a bow on a string instrument is arco, but you won’t see it used with virtual instruments. I merely included it to show how smart I am. ,’:)
  4. See also this overview of synth concepts from YalaOrg
  5. See also Zoë Blade’s summary of periodic waveforms.
  6. The terms “continuous controller” and “control change” are often used interchangeably. Official MIDI specifications now seem to prefer the latter.
  7. Traditionally, aftertouch is the distance a piano key is depressed after initial contact is made between the jack toe and the let-off button (nos. 18 and 19 on this diagram), or the difference between the key’s initial contact and its full range of motion.