Many of you will be happy to know that TapeDeck 1.3 is out today. It’s an exciting release, because it adds the much-requested ability to record lossless audio.

Now you pro audio folks can record audio in the highest quality, and drag your tapes straight from TapeDeck into GarageBand with no loss in fidelity. And, if you’re really nutty about your audio quality (and have the hardware to back it up), you can unlock TapeDeck’s recording quality in the preferences so you can record beyond 44.1kHz!

Check out http://tapedeckapp.com to grab the latest release, and see the updated site design (inside the drawer). I also put some nice little touches into the UI for this release, because it needed some love. :)

I’ve made a pretty big change in TapeDeck 1.3, which now records raw quality audio in HQ mode (using the Apple Lossless format), and at any sample rate (provided you unlock it in the preferences).

If this really interests you, or you just want to try the newest TapeDeck betas before anyone else, please apply for membership at http://groups.google.com/group/tapedeck-beta.

Oh, and if you’re interested in UI design, we’ve added some little UI tweaks that we’d love to get feedback on. You’re welcome to apply as well, so you can get a sneak peek (and gripe about pixels).

I get asked about drawing waveforms from time to time. Over the years, I came to realize that this is a black art of sorts, and it requires a combination of some audio and drawing know-how on the Mac to get it right.

But first, a little story.

Once upon a time I used to write audio software for BeOS while I was in university. As almost every audio software author eventually does, I came to a point where I needed to render audio waveforms to the screen. I hacked up a straightforward drawing algorithm, and it worked well.

When I started working on a follow-on project, I decided to re-use the algorithm I wrote for the first application, but it didn’t work so well. The trouble is, when I originally wrote that algorithm, the audio clips in question were all very tiny—less than 2s. Now I was dealing with much longer clips (up to a few minutes, in practice), and the algorithm didn’t scale well at all.

Around this time, I interviewed with Sonic Foundry, with the hopes of joining the Vegas team. During my interview, I asked, “How do you guys draw waveforms on-screen for large audio clips, and so quickly!?”

“That’s proprietary information, sorry.”

At the time, I figured the guys were just avoiding a long, drawn-out response. After all, I had coded this up myself (mine just wasn’t as fast), so it couldn’t be that difficult, right? Unfortunately, I got similar responses from other people I asked afterwards.

Regardless of whether you’re new to audio, or you’ve been doing it for a while, you are aware that there aren’t too many books on the topic. Furthermore, you probably aren’t going to find too much in the way of detailed algorithms, or even pseudocode, to help you out.

I’m starting to realize that the reason is two-fold.

First off, there really aren’t a lot of people out there who need to draw audio waveforms (or large data sets, for that matter) to screen. Second, it’s really not all that hard once you think about it for a while.

Overview

Drawing waveforms boils down to a few major stages: acquisition, reduction, storage, and drawing.

For each of the stages, you have many implementation options, and you’ll choose the simplest one that’ll serve your application. I don’t know what your application is, so I’ll use Capo as the main example for this post, and throw around some hypothetical situations where necessary.

Early on, you have to set some priorities: Speed, Accuracy, and Display Quality. The order of those priorities will help you decide how to build your drawing algorithm, down to the individual stages.

In Capo, I wanted to make Display Quality the top priority, followed by Speed, and then Accuracy. Because Capo would never be used to do sample-precise edits, I could throw away a whole lot of data, and then make the waveform look as good as possible in a short time frame.

If I were writing an audio editor, my priorities might be Accuracy, followed by Speed, and then Display Quality. For a sequencer (like GarageBand), I’d choose Speed, Display Quality, then Accuracy, because you’re only viewing the audio at a high level, and it’s part of a larger group of parts. Make sense?

Once you have an idea of what you need, you will have a clear picture of how to proceed.

Acquisition

This is almost worth a post of its own. I like using the ExtAudioFile{Open,Seek,Read,Close} API set from AudioToolbox.framework to open various audio file formats, but you may choose a combo of AudioFile+AudioConverter (ExtAudioFile wraps these for you), or QuickTime’s APIs, or whatever else floats your boat.

Your decision of API to get the source data is entirely up to your application. You can’t extract movie audio with (Ext)AudioFile APIs, for instance, so they might not help much when writing a video editing UI. Alternatively, you may have your own proprietary format, or record short samples into memory, etc.

Given the above, I’m going to assume you’re working with a list of floating-point values representing the audio, because that’ll be helpful later on. Using ExtAudioFile, or an AudioConverter, make sure that your host format is set for floats, and you should be good.

When you’re pulling data from a file, keep in mind that it’s not going to be very quick, even on an SSD, thanks to format conversions. I’d advise doing all this work in an auxiliary thread, no matter how you get your audio, because it’ll keep your application responsive.

In Capo’s case, there is a separate thread that walks the entire audio file, doing the acquisition, reduction, and storage steps all at once. Because Display Quality and Speed were high on the priority list, the drawing step is done only when needed.

Reduction

Audio contains tons of delicious data. Unfortunately, when accuracy isn’t the top priority, it’s far too much data to be shown on the screen. With 44,100 samples/second, a second of audio would span ~17 30″ Cinema Displays (2,560 pixels wide each) if you displayed one sample value per horizontal pixel.

If accuracy is your top priority, you’re still going to be throwing lots of data away most of the time, except when your user wants to maintain a 1:1 sample:pixel ratio (or, in some cases, I’ve seen a sample take up more than 1 pixel, for very fine editing). If you’re writing an editor, or some other application that needs high-detail access to the source data, you will have to re-run the reduction step as the user changes the zoom level. When the user wants to see 1:1 samples:pixels, you won’t throw anything away. When the user wishes to see 200:1 samples:pixels, you’ll throw away 199 samples for every pixel you’re displaying.

In the case of Capo, I chose to create an overview data set for the ‘maximum zoom’ level, and keep that on the heap (a 5 minute song should take ~1MB RAM). In my case, I chose a maximum resolution of 50 samples per pixel, and created a data set from that. As the user zooms out, I then sample the overview data set to get the lower-resolution versions of the data. Accuracy isn’t great, but it’s pretty fast.
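To put rough numbers on that, here is a small sketch in plain C. It is not Capo's actual code: the function names are made up for illustration, and it assumes a mono 44.1kHz source and an overview that stores one non-negative float per 50 source samples. A 5 minute song is 13,230,000 samples, which reduces to 264,600 overview points, or about 1MB of floats. The second function shows one way to "sample the overview" when zooming out, by keeping the largest overview value that falls under each output pixel.

```c
#include <assert.h>
#include <stddef.h>

/* Number of overview points needed for `frame_count` source samples when
 * each point summarizes `samples_per_point` samples (rounded up). */
size_t overview_point_count(size_t frame_count, size_t samples_per_point)
{
    assert(samples_per_point > 0);
    return (frame_count + samples_per_point - 1) / samples_per_point;
}

/* Build a zoomed-out view from the overview: each of the `out_count`
 * output values is the largest overview value in its slice.
 * Assumes the overview holds non-negative magnitudes. */
void zoom_out_overview(const float *overview, size_t overview_count,
                       float *out, size_t out_count)
{
    assert(out_count > 0 && out_count <= overview_count);
    for (size_t p = 0; p < out_count; p++) {
        /* Integer math gives every output value at least one overview point. */
        size_t start = p * overview_count / out_count;
        size_t end = (p + 1) * overview_count / out_count;
        float peak = overview[start];
        for (size_t i = start + 1; i < end; i++) {
            if (overview[i] > peak) peak = overview[i];
        }
        out[p] = peak;
    }
}
```

Taking the largest value per slice (rather than picking one at random) is a choice made for this sketch; it trades a little accuracy for peaks that stay put as the zoom level changes.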

Now, when I talk about “throwing away”, or “sampling” the data set, I’m not simply discarding data. In some cases, randomly choosing samples to include in the final output will work just fine. However, you may encounter some pretty annoying artifacts (missing transients, jumping peaks, etc) when you change zoom levels or resize the display. If Display Quality is low on your list—who cares?

If you do care, you have a few options. Within each “bin” of the original audio, you can take a min/max pair, just the maximum magnitude, or an average. I have found the maximum magnitude to work well for the majority of cases. Here’s an example of what I do in Capo (in pseudocode, of sorts):

// source_audio contains the raw sample data (source_count samples)
// overview_waveform will be filled with the 'sampled' waveform data
// N is the 'bin size' determined by the current zoom level
for ( i = 0; i < source_count; i += N ) {
    // the last bin may hold fewer than N samples
    n = min( N, source_count - i )
    overview_waveform[i/N] = take_max_value_of( &(source_audio[i]), n )
}
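For anyone who wants something compilable, the loop above could be fleshed out in plain C roughly like this. It is only a sketch: reduce_max_magnitude is a made-up name, the samples are assumed to be floats in [-1,1], and "max value" is read as the largest absolute value in each bin.

```c
#include <assert.h>
#include <stddef.h>

/* Reduce `count` samples to one maximum-magnitude value per bin of
 * `bin_size` samples. Returns the number of values written to `out`,
 * which must have room for (count + bin_size - 1) / bin_size floats. */
size_t reduce_max_magnitude(const float *samples, size_t count,
                            size_t bin_size, float *out)
{
    assert(bin_size > 0);
    size_t written = 0;
    for (size_t start = 0; start < count; start += bin_size) {
        size_t end = start + bin_size;
        if (end > count) end = count;   /* the last bin may be short */
        float peak = 0.0f;
        for (size_t i = start; i < end; i++) {
            /* Magnitude is the absolute value of the sample. */
            float magnitude = samples[i] < 0.0f ? -samples[i] : samples[i];
            if (magnitude > peak) peak = magnitude;
        }
        out[written++] = peak;
    }
    return written;
}
```

Swapping the inner loop for a running min/max pair or an average gives the other reduction options mentioned earlier, at the cost of storing two values per bin in the min/max case.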

Once you have your reduced data set, then you can put it on the screen.

Display

Here's where you have the most leeway in your implementation. I use the Quartz API to do my drawing. I prefer the family of C CoreGraphics CG* calls, because they're portable to CoreAnimation/iPhone coding, the most feature-rich, and generally quicker than their Cocoa equivalents. I won't get into any alternatives here (e.g. OpenGL), to keep it simple.

If we stick with the Capo example, then we've chosen to use the maximum magnitude data to draw our waveform. By doing so, we can exploit the fact that the waveform is going to be symmetric about the X axis, and only create one half of the final waveform path using some CGAffineTransform magic.

In the past, developers would create waveforms in pixel buffers using a series of vertical lines to represent the magnitudes of the samples. I like to call this the "traditional waveform drawing". It's still used quite a bit today, and in some cases it works great (especially when showing very small waveforms, and pixels are scarce like in a multitrack audio editor).

Traditional Waveform
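As a point of comparison, the traditional approach can be sketched in a few lines of plain C. This is an illustration under assumptions, not code from any shipping app: the function name is invented, the buffer is 8-bit grayscale in row-major order, and there is exactly one magnitude in [0,1] per pixel column.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Draw one centered vertical line per magnitude into an 8-bit grayscale
 * buffer of `width` x `height` pixels (row-major, 0 = background). */
void draw_traditional_waveform(const float *magnitudes, size_t width,
                               size_t height, unsigned char *pixels)
{
    assert(height >= 2);
    memset(pixels, 0, width * height);
    size_t center = height / 2;
    for (size_t x = 0; x < width; x++) {
        float m = magnitudes[x];
        if (m < 0.0f) m = 0.0f;   /* clamp to the expected [0,1] range */
        if (m > 1.0f) m = 1.0f;
        /* How far the line extends above and below the center row. */
        size_t reach = (size_t)(m * (float)center);
        size_t top = center - reach;
        size_t bottom = center + reach;
        if (bottom >= height) bottom = height - 1;
        for (size_t y = top; y <= bottom; y++) {
            pixels[y * width + x] = 255;
        }
    }
}
```

Every pixel here is either fully on or fully off, which is exactly why the edge of a waveform drawn this way looks jagged next to an anti-aliased path.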

I personally prefer to utilize Quartz paths so that I get some nice anti-aliasing to the waveform edge. Because Capo features the waveform so prominently in the display, I wanted to ensure I got top-notch output. Quartz paths gave me that guarantee.

To build the half-path, we'll also be exploiting the fact that both CoreAudio and Quartz represent points using floating-point values. Sadly, this code is slightly less awesome in 64-bit mode, since CGFloats become doubles, and you have to convert the single-precision audio floats over to double-precision pixels. Luckily there are quick routines for that conversion in Accelerate.framework (A whole 'nother blog post, I know...).

- (CGPathRef)giveMeAPath
{
    // Assume mAudioPoints is a float* with your audio points
    // (with {sampleIndex,value} pairs), and mAudioPointCount
    // contains the # of points in the buffer.
    // The cast is only valid while CGFloat is a float (32-bit mode).
    // The caller owns the returned path and must CGPathRelease() it.

    CGMutablePathRef path = CGPathCreateMutable();
    CGPathAddLines( path, NULL, (const CGPoint *)mAudioPoints, mAudioPointCount ); // magic!
    return path;
}
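In 64-bit mode the float buffer has to be widened before it can be handed over as points. The sketch below shows the idea in portable C with a plain loop. PointD is a hypothetical stand-in for a 64-bit CGPoint, and widen_points is a made-up helper; on the Mac, Accelerate's vDSP_vspdp performs the same single-to-double conversion in bulk and would likely be the quicker choice.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a 64-bit CGPoint: two double-precision coordinates. */
typedef struct { double x; double y; } PointD;

/* Widen `point_count` single-precision {x, y} pairs into PointD values. */
void widen_points(const float *pairs, size_t point_count, PointD *out)
{
    assert(pairs != NULL && out != NULL);
    for (size_t i = 0; i < point_count; i++) {
        out[i].x = (double)pairs[2 * i];       /* sample index */
        out[i].y = (double)pairs[2 * i + 1];   /* magnitude */
    }
}
```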

Because magnitudes are represented in the range [0,1], and we're using Quartz, we can build a transform that'll scale the waveform path to fit inside half the height of the view, and then append another transform that'll translate/scale the path so it's flipped upside-down, and appears below the X axis line (which corresponds to a sample value of 0.0). Here's a zoomed in example of what I'm talking about.

Flipped Waveform

And here's some code to give you an idea of what's going on to create the whole path:

// Get the overview waveform data (taking into account the level of detail to
// create the reduced data set)
CGPathRef halfPath = [waveform giveMeAPath];

// Build the destination path
CGMutablePathRef path = CGPathCreateMutable();

// Transform to fit the waveform ([0,1] range) into the vertical space
// ([halfHeight,height] range)
double halfHeight = floor( NSHeight( self.bounds ) / 2.0 );
CGAffineTransform xf = CGAffineTransformIdentity;
xf = CGAffineTransformTranslate( xf, 0.0, halfHeight );
xf = CGAffineTransformScale( xf, 1.0, halfHeight );

// Add the transformed path to the destination path
CGPathAddPath( path, &xf, halfPath );

// Transform to fit the waveform ([0,1] range) into the vertical space
// ([0,halfHeight] range), flipping the Y axis
xf = CGAffineTransformIdentity;
xf = CGAffineTransformTranslate( xf, 0.0, halfHeight );
xf = CGAffineTransformScale( xf, 1.0, -halfHeight );

// Add the transformed path to the destination path
CGPathAddPath( path, &xf, halfPath );

CGPathRelease( halfPath ); // clean up!

// Now, path contains the full waveform path.
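If the transform math feels opaque, it may help to see what the two transforms do to a single magnitude. The plain C sketch below (with invented names, and no Quartz involved) computes the same mapping by hand: the first transform sends a value v in [0,1] to halfHeight + v * halfHeight, and the flipped one sends it to halfHeight - v * halfHeight.

```c
#include <assert.h>
#include <stddef.h>

/* The two y coordinates a single magnitude ends up at. */
typedef struct { double positive_y; double mirrored_y; } EdgePair;

/* Map a magnitude in [0,1] to both waveform edges, given half the
 * view height. The X axis line sits at y == half_height. */
EdgePair waveform_edges(double magnitude, double half_height)
{
    assert(magnitude >= 0.0 && magnitude <= 1.0);
    EdgePair edges;
    /* Scale by +half_height, then translate up by half_height. */
    edges.positive_y = half_height + magnitude * half_height;
    /* Scale by -half_height (the flip), then translate by half_height. */
    edges.mirrored_y = half_height - magnitude * half_height;
    return edges;
}
```

For a 200 pixel tall view, a magnitude of 0.5 lands at y = 150 on one side and y = 50 on the other, and a magnitude of 0.0 lands on the axis line at y = 100 in both cases.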

Once you have this path, you have a bunch of options for drawing it. For instance, you could fill the path with a solid color, turn the path into a mask and draw a gradient (that's how Capo does it), etc.

Keep in mind, though, that a complex path with lots of points can be slow to draw. Be certain that you don't include more data points in your path than there are horizontal pixels on the screen—they won't be visible, anyway. If necessary, draw in a separate thread to an image, or use CoreAnimation to ensure your drawing happens asynchronously.
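One simple way to honor that limit is to derive the bin size from the width of the view before running the reduction. Here is a minimal sketch in plain C, assuming you know the number of samples to show and the view's width in pixels; bin_size_for_width is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest bin size for which `sample_count` samples reduce to no more
 * than `pixel_width` points. Never returns less than 1. */
size_t bin_size_for_width(size_t sample_count, size_t pixel_width)
{
    assert(pixel_width > 0);
    size_t bin = (sample_count + pixel_width - 1) / pixel_width;  /* round up */
    return bin > 0 ? bin : 1;
}
```

For example, one second of 44.1kHz audio in a 2,560 pixel wide view gives a bin size of 18, which reduces to 2,450 points and so fits within the available pixels.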

Use Shark/Instruments to help you decide whether this needs to be done. It's complicated work, and tough code to get working correctly with very few drawing artifacts. You don't even want to know the crazy code I had to get working in TapeDeck to have chunks of the waveform paged onto the screen. (Well, you might, but that's proprietary information, sorry. ;))

In Conclusion

People have suggested to me in the past that Apple should step up and hand us an API that would give waveform-drawing facilities (and graphs, too!). I disagree, and if Apple were to ever do this, I'd probably never use it. There are simply far too many application-specific design decisions that go into creating a waveform display engine, and whatever Apple would offer would probably only cover a small handful of use cases.

Hopefully the above information can help you build a waveform algorithm that suits your application well. I think that by breaking the problem up into separate sub-problems, you can build a solution that'll work best for your needs.

If you haven’t already noticed, I really dig John Mayer’s guitar playing. I also think his olympic white signature strat looks pretty sweet.

For a long time, I really wanted to buy that strat. I thought that if it was built to Mayer’s specs, it’d probably play and sound great. Unfortunately, it’s pretty rare to see signature strats at my local music shops, so I could never really test it out. The idea crossed my mind to just order it online (as I did with my red strat), but not at that price point.

Since I got my koa strat set up by my local guitar guru a couple of months ago, my red strat didn’t get much play. Something about the maple fingerboard, and the finish on the back of the neck slowed down my playing. The koa strat’s satin finish neck was faster to move around on, and the “C shape” versus the “modern C shape” was more comfortable in my hands—the “modern C” felt too thin, and it seemed to get thinner as you moved to higher numbered frets. Since it’s a shame to let a guitar sit unplayed, off for sale it went.

In early 2009, Fender introduced the Road Worn series. Back in February, while waiting for my red strat to ship, I got to play one in my local shop for about 10 minutes before my lesson started. I thought this thing was remarkable—a totally beat-up guitar that felt so comfortable in my hands! How on earth!?

I played the sunburst model, and it was especially worn. It had terribly rusted out saddles, extremely browned fret markers, etc.—not particularly attractive. Also, a grand for a beat-up guitar—and “Made in Mexico”?! Yikes.

Well, that experience stuck with me, and with the idea of replacing the red strat in my mind, I passively started checking out guitars again (since lessons had me in the music store weekly). I played another sunburst Road Worn Strat at my regular music shop, but again was turned off by the over-rusted saddles (seems more likely I’d break a string), and the setup was pretty bad on it. However, the neck was still very fast, and it felt great in my hands.

I went to my other local shop (yeah, there are only really two here in town) to drop off the red strat for sale on consignment. While there, I found an olympic white Road Worn 60s Strat, and I was instantly taken by its appearance. I plugged it into a Hot Rod Deluxe, played a few of my favorite tunes, and was really digging it. It played very well, had only a few small setup issues that I could correct myself, and the relic job was perfect—the fret markers looked old yet still white, and the saddles were not at all rusty where it mattered most.

However, there was a small wrinkle. On the same wall of guitars hung another olympic white strat—the John Mayer signature model.

Crap.

“I’m going to pull this thing down, play it, and fall in love,” I thought. “I have to sell off some serious gear to make up the money that it’ll take to get this home.” I was GASsing pretty hard.

Thoughts of purchasing this strat swirled through my head as I “asked for assistance” and got someone to carefully retrieve the guitar off the wall for me. After months of searching—even going as far as visiting some guitar stores while visiting family in Colorado—I finally got to lay my hands on the guitar I was sure I wanted.

At least, until I played it.

The neck was too thick (Mayer’s strat has a “thick C shape”), the finish on the neck was just as ‘sticky’ in my hands as the red strat was and, in terms of all the strats I’ve played, it was really nothing special. When I played it plugged-in, I used the words of Mayer himself—“it sounds like a strat.”

I was stunned. And relieved.

It wasn’t all bad, though. I genuinely enjoyed the Dunlop 6105 “tall jumbo” frets, the combo of olympic white with the mint green pickguard, and I really liked the vintage look of the neck (truss rod adjustment at the body end), vintage tuners, etc. But all this stuff was on the Road Worn Strat, too! Of course, there are some notable exceptions: it’s beat to hell (read: extremely comfortable), has tex-mex pickups rather than big dippers (big whoop), a 7.25″ fingerboard radius, and it’s Made in Mexico. Oh yeah, it also costs less than half of what the signature series sells for.

I got the Road Worn Strat back into my hands again, just to make sure something hadn’t happened to me—extra sweaty palms, nervous playing, a change in the atmosphere. Nope, I simply didn’t like my “dream guitar” after all. I’d play a few licks on the Mayer strat, then change back to the Road Worn—no matter what I’d play, I preferred playing it on the Road Worn strat.

This all went down about 3 weeks ago, when I started the process of selling my red strat. Just this week I finally sold it (on my own, which worked out better in the end), and I grabbed the olympic white Road Worn 60s Strat that the store held for me (for much longer than they said they would). I also got it for a decent price (about $200 off the lowest price on the tag, and $100 less than what the other shop sold them for), and I ended up with the exact one I played and liked. Score!

I hope to get some videos up with the new strat after I get over this cold I’m fighting. It’s a real joy to play, especially since I threw some new strings on it, and polished up the frets with some 0000 steel wool.

If you’re anything like me, and you rely on Octave in your work, then not having a working copy can be very frustrating.

I stumbled on, not one, but two issues plaguing Octave in Snow Leopard.

First, it appears that installing Octave via MacPorts is busted: you’ll get stuck while building gcc45. There’s already an open MacPorts bug for this (see the huge list of open issues), so it’ll probably get solved soon. I don’t normally install Octave via MacPorts, but when I got stuck installing packages I tried this route.

Second, when you try installing packages from Octave Forge from within the pre-made Octave binary, the build fails with some messages about mismatched architectures. This is because GCC defaults to building for x86_64, but Octave is built as i386 for the time being.

I typically prefer to use the pre-made binaries. It’s the quickest route to getting Octave up and running on your system, and installing packages is quite painless. So, I decided to hack around until I could determine a quick solution to the problem. Here’s my solution:

  1. Open /Applications/Octave.app/Contents/Resources/bin/mkoctfile in a text editor.
  2. Change CFLAGS, FFLAGS, CPPFLAGS, CXXFLAGS, and LDFLAGS so that they all contain -arch i386. I put it at the beginning, so it looks like this:

    CFLAGS="-arch i386 -I${ROOT}/include[...]
    FFLAGS="-arch i386 -I${ROOT}/include[...]
    CPPFLAGS="-arch i386 -I${ROOT}/include[...]
    CXXFLAGS="-arch i386 -I${ROOT}/include[...]
    LDFLAGS="-arch i386 -L${ROOT}/lib[...]

I hope this helps someone. I’m pretty sure it’s not the correct way to solve the problem, but I just don’t have the time right now to figure out the proper solution, submit a patch, etc.

Update 20090924—Changed location of mkoctfile. On my main machine, it’s apparent that I still have some holdovers from when I was messing around with the macports version of Octave. While setting up my laptop with the Octave.app package, I found out where mkoctfile was supposed to live. Apologies for the error.