With the advent of computer audio cards that can record and play back digital audio, computer software has appeared which turns a computer into a device that can not only record and play back that digital audio, but also present that data in a way that makes it easy for musicians to view and edit it. The data can be graphically displayed upon a computer monitor, and can be manipulated with the mouse, for example. Furthermore, computer software can also perform its own manipulations of that data, yielding effects such as delay, transposition, chorus, compression, etc, sometimes even in real-time (ie, while the audio data is playing back). The wide range of computer audio products also means that a computer digital audio system can be tailored to many budgets. And because computers are typically more easily upgradable than dedicated digital audio units (for example, by adding a second hard drive to accommodate more digital audio tracks), and can do other things besides digital audio work, they are often ultimately more versatile and cost effective than dedicated digital audio units. In short, with good audio hardware and software, computers make very good digital audio workstations.

Examples of software that supports both digital audio recording and MIDI are CakeWalk's Sonar, Steinberg's Cubase, Rosegarden (freeware), etc. Examples of programs that specialize in digital audio recording (and may have a more powerful and easier-to-use feature set for digital audio editing than the sequencers) are Adobe's Audition, Sound Forge, SAW, Samplitude, Ardour (freeware), etc.


Digital Audio Recording

A typical computer system works with digital audio in the following way. First, to record digital audio, you need a card with Analog to Digital Converter (ADC) circuitry on it. This ADC is attached to the Line In (and Mic In) jack of your audio card, and converts the incoming analog audio to a digital signal that computer software can store on your hard drive, visually display on the computer's monitor, mathematically manipulate in order to add effects or process the sound, etc. (When I say "incoming analog audio", I'm referring to whatever you're pumping into the Line In or Mic In of your sound card, for example, the output from a mixing console, or the audio output of an electronic instrument, or the sound of some acoustic instrument or voice being fed through a microphone plugged into the sound card's Mic In, etc). While the incoming analog audio is being recorded, the ADC is creating many, many digital values in its conversion to a digital audio representation of what is being recorded. Think of it as analogous to a cassette recorder. While you're recording some analog audio to a cassette tape, the tape is constantly passing over the record head. So the longer the passage of music you record, the more cassette tape you use to record that analog audio signal onto the tape. So too with the conversion to digital audio. The longer the passage of music you record (ie, digitize), the more digital values are created, and these values must be stored for later playback.
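To make the "longer passage, more values" idea concrete, here is a minimal sketch of what digitizing amounts to. It is not how any particular sound card works. The function name digitize, the 440 Hz test tone, and the choice of a 44.1 kHz rate with 16-bit values are all assumptions picked for illustration.

```python
import math

def digitize(analog, milliseconds, sample_rate=44100, bits=16):
    """Sample analog(t) (range -1.0 to 1.0) at regular intervals, rounding each to an integer."""
    full_scale = 2 ** (bits - 1) - 1            # 32767 for 16-bit values
    count = sample_rate * milliseconds // 1000  # number of values for this duration
    return [round(analog(i / sample_rate) * full_scale) for i in range(count)]

def tone(t):
    """A stand-in for the incoming analog signal: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)

short_take = digitize(tone, 10)    # 10 ms of "analog" input
longer_take = digitize(tone, 20)   # twice as long, so twice as many values

print(len(short_take), len(longer_take))
```

Doubling the recording time doubles the number of values, just as it would double the amount of cassette tape used.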

Where are these digital values stored? As your sound card creates each value, that data is passed to the card's software driver which then passes that data to the software program managing the recording process. Such software might be Cubase recording a digital audio track, or Windows Sound Recorder, or any other program capable of initiating and managing the recording of digital audio. Although the program may temporarily accumulate those digital audio values in the computer's RAM, those values will eventually have to be stored upon some fixed medium for permanence. That medium is your computer's hard drive. (For this reason, sometimes people refer to the process of recording digital audio to a hard drive as "Hard Disk Recording". Henceforth, I will abbreviate "hard drive" as HD). Usually, how this works is that the program accumulates a "block" of data in RAM (while the sound card is digitizing the incoming analog audio), for example 4,000 data values, and then writes these 4,000 values into a file on your HD. (It's a lot more efficient to write 4,000 values at once to your hard drive, than it is to write those 4,000 values one after the other separately). All of these 4,000 values go into one file. Usually, the format for how the data is arranged within this file follows the WAVE file format. (But there are other formats that may also be used by programs to store digital audio values. For example, AIFF is often used on the Macintosh. AU format is used on Sun computers. MP3 is a popular format nowadays because it compresses the data's size. Etc. Any of these formats could be used on any computer, but WAVE is considered the standard on a Windows-based PC). Now, if the recording process is still going on, the software program will collect 4,000 more values in RAM, and then write them out to the same WAVE file on the HD (without erasing the previously stored 4,000 values -- ie, the values accumulate within the WAVE file, so that it now has 8,000 values in it). This process continues until the musician halts the recording process. (If you let recording continue long enough, it will eventually fill up your HD with digital audio values in one, big WAVE file). At that point, the WAVE file is complete, and contains all of the digital audio values representing the analog audio recorded.

So, digital audio recorded by most PC programs is predominantly stored in a WAVE file on your HD, and that file is created while the digital audio is being created/recorded.
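The block-by-block recording loop described above can be sketched roughly as follows, using Python's standard wave module. This is only an illustration under stated assumptions: fake_adc_block is a made-up stand-in for the sound card's driver, and the file is written to memory (io.BytesIO) rather than to a real HD so the sketch leaves nothing behind.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100   # values per second (assumed)
BLOCK_SIZE = 4000     # values gathered in RAM before each write

def fake_adc_block(start, count):
    """Hypothetical driver stand-in: returns `count` 16-bit values of a 440 Hz sine."""
    return [int(12000 * math.sin(2 * math.pi * 440 * (start + i) / SAMPLE_RATE))
            for i in range(count)]

def record(dest, total_blocks):
    """Write `total_blocks` blocks into one mono 16-bit WAVE file, one block per write."""
    with wave.open(dest, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)           # 2 bytes = 16 bits per value
        wav.setframerate(SAMPLE_RATE)
        for block in range(total_blocks):
            values = fake_adc_block(block * BLOCK_SIZE, BLOCK_SIZE)
            # Each write appends to the file; earlier blocks are not erased.
            wav.writeframes(struct.pack("<%dh" % len(values), *values))
    return total_blocks * BLOCK_SIZE

wave_file = io.BytesIO()
values_written = record(wave_file, total_blocks=2)
print(values_written)   # 8000
```

A real program would stop when the musician halts recording instead of after a fixed number of blocks.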


Digital Audio Playback

In order to subsequently play back this digital audio (ie, WAVE file), you need a card with Digital To Analog Converter (DAC) circuitry on it. Needless to say, most sound cards have both an ADC and a DAC so that the card can both record and play digital audio. This DAC is attached to the Line Out jack of your audio card, and converts the digital audio values back into the original analog audio (that was initially recorded during the recording process). This analog audio can then be routed to a mixer, or speakers, or headphones so that you can hear the recreation of what was originally recorded. You need software to manage the playback of digital audio, and not surprisingly, the same program that was used to manage the recording process usually can also manage the playback. For example, Cubase can play back the digital audio track that it recorded. The playback process is almost an exact reverse of the recording process. The program reads a block of digital audio data from the WAVE file on the HD. For example, Cubase may read the first 4,000 data values. (It's more efficient to read 4,000 values off of the HD at once, than it is to read those 4,000 values one after the other separately). Then, the program passes each one of these values to the card's driver which feeds it to the card's DAC. The program then reads the next 4,000 values from the WAVE file, and plays those back as described. In other words, the sound card is recreating the original analog audio while the program is reading the digital audio values off of the HD and passing them back to the card. This continues until the program has played all of the values in the WAVE file (or until the musician interrupts the playback).
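Here is a rough sketch of that playback loop, again using Python's standard wave module. The send_to_driver callback is a hypothetical placeholder for handing values to the card's driver (here it just collects them in a list), and make_test_wave only exists to give the sketch a small in-memory file to read.

```python
import io
import struct
import wave

BLOCK_SIZE = 4000   # values read from the file at a time

def make_test_wave(num_values):
    """Build a small, silent mono 16-bit WAVE file in memory to play back."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(44100)
        wav.writeframes(struct.pack("<%dh" % num_values, *([0] * num_values)))
    buf.seek(0)
    return buf

def play(source, send_to_driver):
    """Read the file block by block, handing each block of values to the callback."""
    blocks = 0
    with wave.open(source, "rb") as wav:
        while True:
            data = wav.readframes(BLOCK_SIZE)
            if not data:              # no more values: playback is finished
                break
            count = len(data) // 2    # 2 bytes per 16-bit mono value
            send_to_driver(struct.unpack("<%dh" % count, data))
            blocks += 1
    return blocks

received = []
blocks_played = play(make_test_wave(10000), received.extend)
print(blocks_played, len(received))   # 3 10000
```

The last block is shorter than the others (2,000 values) because the file does not divide evenly into 4,000-value blocks.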


Data processing during playback

Some programs can optionally perform some mathematical manipulation of the digital audio values immediately before the data is passed to the card's driver (ie, during playback). Such manipulations may be to add effects such as reverb, chorus, delay, etc. Programs that do such realtime processing often allow you to use add-on software called "plug-ins". For example, you may have a choice of reverb software to use. There are different formats for plug-ins, but most Windows audio programs support the format called VST.
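As an illustration of this kind of manipulation, here is a minimal sketch of a single-echo delay applied to each block just before it would be passed to the driver. The Delay class is made up for this example and has nothing to do with the VST format. A real effect would be far more elaborate.

```python
from collections import deque

class Delay:
    """Adds one echo: out[n] = in[n] + mix * in[n - delay_samples]."""

    def __init__(self, delay_samples, mix=0.5):
        # delay_samples is assumed to be at least 1.
        self.mix = mix
        # Past input values, pre-filled with silence. It persists between
        # blocks, so the echo carries over from one block to the next.
        self.history = deque([0] * delay_samples, maxlen=delay_samples)

    def process(self, block):
        out = []
        for value in block:
            delayed = self.history[0]     # the input from delay_samples ago
            self.history.append(value)    # the oldest entry drops off
            mixed = int(value + self.mix * delayed)
            out.append(max(-32768, min(32767, mixed)))  # stay within 16 bits
        return out

echo = Delay(delay_samples=2)
first = echo.process([1000, 0])    # [1000, 0]
second = echo.process([0, 0])      # [500, 0], the echo of the earlier 1000
print(first, second)
```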

Some programs can also process the digital audio to do valuable things such as "time-stretching" and "pitch shift". These are often too computationally complex for today's computers to do during playback, so this processing usually has to be applied before playback, and may be "destructive" in that it permanently alters the digital audio data.

Time-stretching: When you play a one-shot (ie, non-looped) waveform, it lasts only so long (ie, in terms of time). For example, maybe you've got digital audio tracks of a piece of music whose duration is 2 minutes. Sometimes, people need to adjust the length of time over which the waveform plays. For example, maybe the producer of a movie says "I want this piece of music to last exactly 2 minutes and 3 seconds in order to fit with this filmed scene I have which is this long". Well, if you had a MIDI track, you'd just slow the tempo a little in order to make the music last that extra 3 seconds. The music would sound exactly the same, but at a slightly slower tempo. OK, so how do you "slow down" the digital audio tracks? Well, you could reduce the playback rate a little. But, if you've ever used a sampler, you'll have noticed that when you play a waveform at a rate different from the one at which it was recorded, this changes the character of the waveform itself. You don't just hear a different tempo, you hear different pitch, vibrato, tremolo, tone, etc. So, time-stretching was devised to take a waveform, analyze it, and change its length without (hopefully) changing its characteristics. The net result is that you get the same effect that you had by changing the tempo of the MIDI track; ie, merely a change in tempo/duration rather than a change in the characteristics of the waveform. (Nevertheless, there is always some potential for a change in timbre when time-stretching algorithms are applied to a waveform).
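To see why simply reducing the playback rate isn't good enough, here is a small worked calculation for the producer's request above. It assumes the audio was recorded at 44.1 kHz, and it shows only the side effect of naive rate reduction, not a time-stretching algorithm.

```python
import math

original_seconds = 120.0   # the 2-minute piece
target_seconds = 123.0     # the length the producer asked for
recorded_rate = 44100      # values per second when recorded (assumed)

# Playing the same values more slowly makes them last longer...
playback_rate = recorded_rate * original_seconds / target_seconds

# ...but every pitch drops by the same ratio. 100 cents = 1 semitone.
pitch_change_cents = 1200 * math.log2(playback_rate / recorded_rate)

print(round(playback_rate), round(pitch_change_cents, 1))
```

The rate drops to roughly 43,024 values per second, and the music goes flat by roughly 43 cents, which is close to half a semitone. That is the change time-stretching tries to avoid.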

Pitch shift: This changes the note's pitch without altering the playback rate (which would alter other characteristics of the waveform). So, if you sampled the middle C note of a piano at 44 kHz, but when you play it back at 44 kHz, you really want to hear a D note, you could apply pitch shift to it to create a waveform whose "root" (ie, the pitch you get when you play back at the same sample rate as when recorded) is D. If you take a musical performance with "rhythms" in it, and apply pitch shift, you should get a different pitch, but retain the same rhythms (ie, tempo).
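For a sense of the numbers involved in the C to D example, here is a short calculation. It assumes equal temperament, where each semitone multiplies the frequency by the twelfth root of 2, and where D is 2 semitones above C. The function name semitone_ratio is just for illustration.

```python
def semitone_ratio(semitones):
    """Frequency ratio for a shift of the given number of semitones (equal temperament)."""
    return 2.0 ** (semitones / 12.0)

c_to_d = semitone_ratio(2)

# If you got the D by simply playing the values faster instead of using
# pitch shift, the note would also get shorter by the same ratio.
naive_duration_factor = 1.0 / c_to_d

print(round(c_to_d, 4), round(naive_duration_factor, 3))
```

So the pitch has to rise by a factor of about 1.1225. Done by changing the playback rate, the note would last only about 89% as long. Pitch shift aims for the same rise in pitch while keeping the original length.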


Virtual Tracks

Although most sound cards have only 2 discrete digital audio channels (ie, stereo digital audio capabilities), many programs such as Cubase support recording and playing many more tracks of digital audio. And yet, Cubase seemingly plays more than 2 digital audio tracks using such a card. How is this done? It is accomplished through the use of "virtual tracks". What this means is that the program allows you to record as many digital audio tracks as you like, mono and/or stereo. The only limitation is your HD space, plus how many tracks can be read off of your HD at the required rate of playback. (You can record only 2 mono tracks, or 1 stereo track at one time, due to the card's limit of only 2 channels. But, you can do as many iterations of the recording process as you wish to yield many more than 2 tracks. It's on playback that the concept of virtual tracks works). For example, you could record 4 mono tracks, plus 3 stereo tracks (if you have a fast enough HD). You can set the individual panning for each of the mono tracks. Then, upon playback (ie, when Cubase reads the data for all of those digital audio tracks from their WAVE files), Cubase itself mathematically mixes all of the digital audio tracks into one stereo digital audio mix, and outputs this mix to the sound card's stereo DAC. In other words, Cubase itself functions as a sort of "digital mixer", mixing many mono and stereo tracks together (during playback) into one stereo digital audio track. So, using any sound card with a stereo digital audio DAC, you can actually record as many tracks as you like. Cubase virtualizes the card so that it appears to have many digital audio tracks, mono and/or stereo. Most pro programs that work with digital audio support the concept of "virtual tracks".
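The "digital mixer" idea can be sketched in a few lines. This is not Cubase's actual code. It handles only mono tracks, and it assumes a simple linear pan where 0.0 is hard left and 1.0 is hard right. The name mix_to_stereo is made up for the example.

```python
def mix_to_stereo(tracks):
    """Mix mono tracks into one stereo pair.

    tracks: list of (values, pan) pairs, with pan from 0.0 (left) to 1.0 (right).
    Returns (left, right) lists as long as the longest track.
    """
    length = max(len(values) for values, _ in tracks)
    left = [0.0] * length
    right = [0.0] * length
    for values, pan in tracks:
        for i, value in enumerate(values):
            left[i] += value * (1.0 - pan)   # share sent to the left channel
            right[i] += value * pan          # share sent to the right channel
    return left, right

virtual_tracks = [
    ([1000, 2000], 0.0),   # panned hard left
    ([400, 400], 0.5),     # panned center
    ([100], 1.0),          # panned hard right, and shorter than the others
]
left, right = mix_to_stereo(virtual_tracks)
print(left, right)   # [1200.0, 2200.0] [300.0, 200.0]
```

However many virtual tracks go in, only one left and one right stream come out, which is all a stereo DAC needs.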

There are a few drawbacks to this scheme though. First, the more tracks that you record, the more that you have to back off on the individual volume of each track. Why? Because when all of the tracks are mathematically summed together by Cubase, if at any point the sum exceeds a 16-bit value, you'll get clipping. The more tracks that you sum, the more likely you are to get clipping -- unless you back off on the individual track volumes more with each added track. It doesn't matter if your card has 24-bit DACs. Cubase itself performs the sum into a 16-bit mix, so there is an inherent 16-bit limitation to Cubase's output (or any other program that is limited to 16-bit digital audio -- some digital audio programs offer greater than 16-bit resolution for their internal mixing, such as 24-bit or 32-bit. These programs don't require you to back off on the individual wave volumes so much. If you're using a lot of virtual tracks, use a program with at least 24-bit resolution for its internal mixing). So the more tracks you record, the less dynamic range you get for each individual track as you turn down the individual volume to avoid clipping the mix. I doubt that you'd want to mix more than 4 mono virtual tracks to one card, if your audio program does only 16-bit mixing. On the plus side, if you get more digital audio cards, you can then split up virtual tracks among them since many audio programs can use more than 1 card simultaneously. Of course, you could get one card with multiple digital audio channels, which usually support better than 16-bit DACs and perhaps better throughput (ie, lower latency).
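Here is a small sketch of the clipping problem, assuming a program that sums into a 16-bit mix. The function name sum_16bit and the track values are made up for illustration.

```python
INT16_MAX = 32767
INT16_MIN = -32768

def sum_16bit(tracks, gain=1.0):
    """Sum tracks value by value, scale by gain, clamp to 16 bits, count clipped values."""
    mix, clipped = [], 0
    for values in zip(*tracks):
        # Scaling the sum is the same as turning every track down by `gain`.
        total = int(sum(values) * gain)
        if total > INT16_MAX or total < INT16_MIN:
            clipped += 1
            total = max(INT16_MIN, min(INT16_MAX, total))
        mix.append(total)
    return mix, clipped

four_tracks = [[20000, -20000, 1000]] * 4   # four identical, fairly loud tracks

loud_mix, loud_clipped = sum_16bit(four_tracks)             # full volume
safe_mix, safe_clipped = sum_16bit(four_tracks, gain=0.25)  # each backed off to 1/4

print(loud_mix, loud_clipped)   # [32767, -32768, 4000] 2
print(safe_mix, safe_clipped)   # [20000, -20000, 1000] 0
```

Backing each of the four tracks off to one quarter avoids the clipping, but it costs each track about 12 dB (2 bits) of its dynamic range, which is the tradeoff described above.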


Hard Drive requirements

With digital audio tracks recorded to a HD, usually the limiting factor in how many virtual tracks can be simultaneously played, and how well the audio plays (ie, no "dropouts" or odd noise bursts), is the speed at which data can be read from or written to your HD. A slow HD (more than 12ms access time) will restrict how many virtual tracks can be used. Also, if using a mechanical head drive, you need a HD that has no problems with thermal recalibration (ie, lengthy adjustments the HD may make during reading/writing that can cause a delay in accessing the drive). These delays may cause a program to fail to feed a continuous stream of digital audio values to/from the sound card, and you'll then hear "glitches" in the audio. Newer HD designs have tended to minimize this problem, and a Solid State Drive (SSD), which has no mechanical head, eliminates it. Select a good HD for digital audio.

Furthermore, since digital audio requires a constant stream of values throughout the recording process (unlike with MIDI), you need a large HD if you want to record long digital audio tracks. You'll use up about 5 MB of HD space for every minute of a 16-bit mono digital audio track recorded at 44.1 kHz (and about 10 MB for a stereo track). In other words, recording a CD-quality digital stereo track that is 5 minutes long will use up about 50 MB of your HD.
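The arithmetic behind those figures is simple enough to check yourself. The function name bytes_per_minute is just for illustration. The figures count raw audio values only and ignore the small header a WAVE file adds.

```python
def bytes_per_minute(sample_rate=44100, bits=16, channels=1):
    """Raw storage needed for one minute of uncompressed digital audio."""
    bytes_per_value = bits // 8
    return sample_rate * bytes_per_value * channels * 60

mono_minute = bytes_per_minute()                          # 5,292,000 bytes
stereo_minute = bytes_per_minute(channels=2)              # 10,584,000 bytes
stereo_five_minutes = 5 * bytes_per_minute(channels=2)    # 52,920,000 bytes

print(mono_minute, stereo_minute, stereo_five_minutes)
```

That works out to a little over 5 million bytes per mono minute, which is where the round figures above come from.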


Audio card requirements

Of course, this is not to say that the quality of your sound card isn't important too. If you've got a card with a cheap DAC or ADC, you're going to get noisy digital audio. (ie, The result will typically sound like a "graininess" to the audio or even "tape hiss". On the other hand, problems with the HD usually manifest in horrid distortion or weird noises such as pops and clicks). To get nicely recorded digital audio, and as many simultaneous virtual tracks as possible, you'll want both a fast, large HD with efficient controller I/O, and a card with a good DAC and ADC.

Unless you've got a card that has digital I/O so that you can run a digital connection right to a DAT deck, for example, you'll also want a card that has a clean (ie, low total harmonic distortion, or THD) and quiet (ie, high signal-to-noise ratio) audio output stage.

For more information about digital audio cards themselves, see Digital Audio Cards.