Mike Gorman


MIDI & Audio Sequencing with Java

The Java Sound API, first introduced in J2SE 1.3, includes the package javax.sound.midi, which contains everything you need to send messages to, and receive messages from, any MIDI device visible to your operating system.

The Java Sound Programmer Guide and the Java Sound Demo, both available for download from Sun, are excellent references that illustrate all the "nuts and bolts" of sending and receiving messages. This article provides a brief overview of working with the MIDI and sampled audio primitives of the Java Sound API, and then explores using those primitives to construct a basic multi-track MIDI/audio sequencer in Java.

Programming with MIDI
Basically, every message you send to or receive from a MIDI device is one or more bytes. The first byte is referred to as the "message" or "status" byte. This is essentially the "command" (e.g., note on, note off, change patch, set volume, etc.). The specific command you are sending or receiving will determine what the next byte or bytes are, if any. Having an online MIDI specification reference available will be indispensable.
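As a rough illustration of the status-byte layout described above, the sketch below builds a Note On message and splits its status byte into a command and a channel. The class MidiBytesSketch and its helper methods are invented for this example; only ShortMessage and its accessors come from javax.sound.midi. It assumes a channel voice message, where the low four bits of the status byte carry the zero-based channel.

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.ShortMessage;

// Hypothetical helper class, not part of the Java Sound API.
final class MidiBytesSketch {

    /** Builds a Note On for a zero-based channel (0-15), note and velocity (0-127). */
    static ShortMessage noteOn(int channel, int note, int velocity) {
        try {
            return new ShortMessage(ShortMessage.NOTE_ON, channel, note, velocity);
        } catch (InvalidMidiDataException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Splits the status byte of a channel voice message into command and channel. */
    static String describe(ShortMessage message) {
        int status = message.getStatus();   // first byte, e.g. 0x90
        int command = status & 0xF0;        // high nibble: the command
        int channel = status & 0x0F;        // low nibble: the zero-based channel
        return String.format("command=0x%02X channel=%d data1=%d data2=%d",
                command, channel, message.getData1(), message.getData2());
    }
}
```

Calling MidiBytesSketch.describe(MidiBytesSketch.noteOn(0, 60, 93)) yields "command=0x90 channel=0 data1=60 data2=93": one status byte followed by two data bytes (the note number and the velocity).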

The first step is to obtain a reference to the specific MidiDevice you want to talk to. The class javax.sound.midi.MidiSystem is your gateway to every MIDI device that Java Sound was able to detect on your system. You might display a list of available devices to users and let them choose which device they want to use (see Listing 1).
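A minimal sketch of such a device list follows. It is not the article's Listing 1; the class MidiDeviceLister and the label format are assumptions made for illustration. MidiSystem.getMidiDeviceInfo() and the MidiDevice.Info getters are the only Java Sound calls used.

```java
import java.util.ArrayList;
import java.util.List;
import javax.sound.midi.MidiDevice;
import javax.sound.midi.MidiSystem;

// Hypothetical helper class for presenting the detected devices to a user.
final class MidiDeviceLister {

    /** Formats one line of the device menu. */
    static String label(int index, String name, String vendor, String description) {
        return index + ": " + name + " (" + vendor + ") - " + description;
    }

    /** Returns one label per MIDI device that Java Sound reports. */
    static List<String> listDevices() {
        List<String> labels = new ArrayList<>();
        MidiDevice.Info[] infos = MidiSystem.getMidiDeviceInfo();
        for (int i = 0; i < infos.length; i++) {
            labels.add(label(i, infos[i].getName(), infos[i].getVendor(),
                    infos[i].getDescription()));
        }
        return labels;
    }
}
```

Once the user picks an index, MidiSystem.getMidiDevice(infos[index]) returns the device itself; that call can throw MidiUnavailableException, so a real application would need to handle it.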

Once you have a reference to your MidiDevice, you're ready to begin sending and receiving messages. The first step is to understand that there are Receivers and Transmitters. As you might suspect, Receivers receive MIDI messages and Transmitters transmit or are the source for MIDI messages.

Step one is finding out which MidiDevices the MidiSystem is reporting. The easiest way is to iterate through each one, try to play "middle C," and see what happens. You may find that some of the MidiDevices are configured as receive only, others are transmit only, and others might receive and transmit. In my setup, my MIDI interface reports two different MidiDevices for the same keyboard - one that is transmit only, the other that's receive only. So when I want to send a MIDI message, I send it to the MidiDevice that is receive only, and when I want to record MIDI messages that I trigger by playing the keyboard, I do it by listening to the MidiDevice that's configured to transmit only (see Listing 2).
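The probing step might look something like the sketch below. It is an illustration rather than the article's Listing 2: MiddleCProbe, capabilities and playMiddleC are invented names, and the velocity of 93 is an arbitrary choice. The sketch relies on the API convention that getMaxReceivers() and getMaxTransmitters() return 0 for "none" and -1 for "unlimited".

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.Receiver;
import javax.sound.midi.ShortMessage;

// Hypothetical helper class for finding out what a device can do.
final class MiddleCProbe {
    static final int MIDDLE_C = 60;

    /** Classifies a device from getMaxReceivers() and getMaxTransmitters(). */
    static String capabilities(int maxReceivers, int maxTransmitters) {
        boolean canReceive = maxReceivers != 0;    // -1 means unlimited
        boolean canTransmit = maxTransmitters != 0;
        if (canReceive && canTransmit) return "receive and transmit";
        if (canReceive) return "receive only";
        if (canTransmit) return "transmit only";
        return "no ports";
    }

    /** Sends Note On, holds the note, then sends Note Off on zero-based channel 0. */
    static void playMiddleC(Receiver receiver, long holdMillis) {
        try {
            receiver.send(new ShortMessage(ShortMessage.NOTE_ON, 0, MIDDLE_C, 93), -1);
            try {
                Thread.sleep(holdMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // still send Note Off below
            }
            receiver.send(new ShortMessage(ShortMessage.NOTE_OFF, 0, MIDDLE_C, 0), -1);
        } catch (InvalidMidiDataException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With real hardware you would open the device first (device.open()) and pass device.getReceiver() to playMiddleC; if you hear the note, you have found a device that accepts messages.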

Step two is to figure out which of the MidiDevices are transmitters. In this case, you'll need to create an implementation of the javax.sound.midi.Receiver interface (see Listing 3).
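A Receiver implementation for this purpose can be very small. The sketch below is an assumption about what such a class could look like (it is not the article's Listing 3): LoggingReceiver simply prints each message it is handed, which is enough to tell whether a device is transmitting.

```java
import javax.sound.midi.MidiMessage;
import javax.sound.midi.Receiver;
import javax.sound.midi.ShortMessage;

// Hypothetical Receiver that prints every message it receives.
final class LoggingReceiver implements Receiver {

    /** Turns a message and its timestamp (microseconds, or -1) into one log line. */
    static String format(MidiMessage message, long timeStamp) {
        if (message instanceof ShortMessage) {
            ShortMessage sm = (ShortMessage) message;
            return String.format("t=%d command=0x%02X channel=%d data1=%d data2=%d",
                    timeStamp, sm.getCommand(), sm.getChannel(),
                    sm.getData1(), sm.getData2());
        }
        // Sysex and meta messages: just report the status byte and length.
        return String.format("t=%d status=0x%02X length=%d",
                timeStamp, message.getStatus(), message.getLength());
    }

    @Override
    public void send(MidiMessage message, long timeStamp) {
        System.out.println(format(message, timeStamp));
    }

    @Override
    public void close() {
        // Nothing to release in this sketch.
    }
}
```

To try it against a device, open the device and call device.getTransmitter().setReceiver(new LoggingReceiver()); both calls are part of javax.sound.midi, and getTransmitter() can throw MidiUnavailableException.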

Next, figure out which of the available MidiDevices are acting as your keyboard's transmitter by trying them out one at a time. Obtain each MidiDevice's "transmitter," assign your receiver to it, play a few notes on your keyboard, and look for output in the console that indicates you've found the right "transmitter" device (see Listing 4).

Once you've figured out which MidiDevice is your receiver and which is your transmitter, and you know the basics of sending and receiving MidiMessages, you now have everything you need to create your own full-featured 16-track MIDI sequencer! Well, not quite...

Creating Your Own MIDI Sequencer
As I soon found out, there's a lot more to creating a sequencer than just knowing how to send and receive MIDI messages.

You may have noticed by now that the javax.sound.midi package already includes a class Sequencer. Unfortunately, this "built-in" sequencer is limited in its capabilities and is not extendable since it's an interface. But the biggest reason I was unable to use it is that it seems to be "hard wired" to use the internal Java synthesizer (Sun Bug ID 4783745). It also appears to have very bad timing problems (Sun Bug ID 4773012).

With the API it's easy to create your own sequencer. First we have to be able to record a single track and play it back with exactly the timing it was originally played in. Moreover, the performing artist (that's you) will want a four-bar metronome count-off before recording starts. To keep perfect time, the metronome then needs to continue until the user clicks stop. Of course, the metronome sounds should not be part of the performance when it's played back.

You can have the computer emit a "system beep" for your metronome, but I prefer to listen to the hi-hat of a drumkit on the keyboard. A lot of sequencers use MIDI channel 10 (i.e., track 10) as a drumkit, but you can choose any one you like. A tempo of 120 beats per minute (bpm) means 2 beats/second or 1 tick of your metronome every 500ms.

As you may have guessed, you'll need one thread playing the ticks of your metronome (on Channel 10) while you record any MidiMessages you receive (via your Receiver implementation) on Channel 1. (Note that I am referring to the channels/tracks from the musician's perspective. The channel references in the API are zero-based.) Listing 5 shows what your metronome thread might look like.
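One possible shape for such a metronome is sketched below. This is not the article's Listing 5. The class MetronomeSketch, the fixed beat count, and the use of note 42 (Closed Hi-Hat in the General MIDI percussion map) are assumptions; your keyboard's drum kit may map the hi-hat elsewhere. The sketch schedules each tick against an absolute target time so that small sleep overruns do not accumulate.

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.Receiver;
import javax.sound.midi.ShortMessage;

// Hypothetical metronome: plays a hi-hat tick once per beat on the drum channel.
final class MetronomeSketch implements Runnable {
    static final int DRUM_CHANNEL = 9;    // zero-based; "channel 10" to a musician
    static final int CLOSED_HI_HAT = 42;  // General MIDI percussion key (assumed)

    private final Receiver receiver;
    private final long millisPerBeat;
    private final int beats;

    MetronomeSketch(Receiver receiver, int bpm, int beats) {
        this.receiver = receiver;
        this.millisPerBeat = millisPerBeat(bpm);
        this.beats = beats;
    }

    /** 120 bpm gives 60,000 / 120 = 500 ms between ticks. */
    static long millisPerBeat(int bpm) {
        return 60_000L / bpm;
    }

    @Override
    public void run() {
        try {
            long nextTick = System.currentTimeMillis();
            for (int i = 0; i < beats && !Thread.currentThread().isInterrupted(); i++) {
                receiver.send(new ShortMessage(
                        ShortMessage.NOTE_ON, DRUM_CHANNEL, CLOSED_HI_HAT, 100), -1);
                receiver.send(new ShortMessage(
                        ShortMessage.NOTE_OFF, DRUM_CHANNEL, CLOSED_HI_HAT, 0), -1);
                nextTick += millisPerBeat;
                long wait = nextTick - System.currentTimeMillis();
                if (wait > 0) {
                    Thread.sleep(wait);
                }
            }
        } catch (InvalidMidiDataException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // stop ticking when asked to
        }
    }
}
```

A four-bar count-off in 4/4 time would be new Thread(new MetronomeSketch(receiver, 120, 16)).start(); a metronome that runs until the user clicks stop could pass a large beat count and interrupt the thread.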

Playing the metronome is simple enough, but you need to figure out a way to play it through just one time, then begin recording MidiMessages as they arrive at your receiver (discussed later). How you do that will be left for you to decide.

Recording MIDI
How exactly do you record MidiMessages? There are basically two strategies: you can try to take note of what time each message arrives, or you can use the included timestamp of each message. In either strategy, your implementation of the Receiver interface will create an ArrayList and add each MidiMessage it receives to the ArrayList. Of course, you'll need to make sure you record only the MidiMessages that arrive between the end of the four-bar metronome count-off and the moment the user clicks stop.

Your first strategy might be to use System.currentTimeMillis() to take note of the current system time (in ms) at which each MidiMessage arrives. You'll need to know this when you play back these messages. The general idea is to play back the messages using a thread that sleeps between messages according to the relative times at which they originally arrived. In my experience, the system clock was not reliable enough to deliver rock-solid timings during playback. You'll hear what I mean if you try this strategy and listen to the result.

The other strategy is to use the embedded timestamp that accompanies each MidiMessage. This timestamp is expressed in microseconds based on the time you first opened the MidiDevice. Unfortunately, by the time the four-bar metronome count-off ends, it's difficult to say when the first message should be played back. That is, you can't assume that the first message that arrives should be played back at time zero. Perhaps the musician's first note is played halfway through the first measure. Since the MidiDevice was opened long before your metronome began playing, it's difficult to determine from the timestamp alone how long your playback thread should wait before it sends the very first message. Of course, all messages after that are easy, since you can just calculate the time to wait between messages from the relative differences of the messages' timestamps.

The best solution I came up with was to just take note (by way of System.currentTimeMillis()) of when recording actually begins (that is, after the four-bar metronome count off), and then take note of when the first MidiMessage arrives. Then, during playback, the playback thread merely needs to wait the calculated delay time before playing back the first message. Thereafter, it can simply use the relative differences between the MidiMessage timestamps for all subsequent messages.
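The arithmetic behind this hybrid approach can be sketched as follows. PlaybackOffsets and offsetsMillis are invented names, and the sketch assumes the device actually supplies timestamps (some report -1 instead). Each offset is computed relative to the first timestamp rather than the previous one, so integer rounding does not accumulate over a long take.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: when should each recorded message be played back?
final class PlaybackOffsets {

    /**
     * @param recordingStartMillis System.currentTimeMillis() when the count-off ended
     * @param firstArrivalMillis   System.currentTimeMillis() when the first message arrived
     * @param timestampsMicros     device timestamps of the messages, in arrival order
     * @return for each message, milliseconds after the start of playback at which to send it
     */
    static List<Long> offsetsMillis(long recordingStartMillis, long firstArrivalMillis,
                                    long[] timestampsMicros) {
        List<Long> offsets = new ArrayList<>();
        // Delay before the very first message, measured with the system clock.
        long initialDelay = firstArrivalMillis - recordingStartMillis;
        for (long timestamp : timestampsMicros) {
            // Later messages are placed using device timestamps only.
            long sinceFirstMicros = timestamp - timestampsMicros[0];
            offsets.add(initialDelay + sinceFirstMicros / 1000);
        }
        return offsets;
    }
}
```

For example, if recording starts at system time 10,000 ms, the first note arrives at 11,000 ms, and the device timestamps are 5,000,000, 5,250,000 and 6,000,000 microseconds, the messages are due 1000, 1250 and 2000 ms into playback.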

It may surprise you to learn that what you think of as a chord (or several chords across multiple tracks) struck simultaneously is actually played back one note at a time, sent serially as a stream of MidiMessages, one at a time. You have to remember that the playback loop playing back the messages is so fast that the human ear will not be able to discern the difference between the original "three notes struck simultaneously" and "three notes played 1 ms apart."

You should now be able to record and play back a single MIDI track at 120 bpm. If, when it plays back, it sounds just like you played it, you're halfway there. The next step is to be able to record additional MIDI tracks while playing back previously recorded tracks.

Recording Multiple Tracks
You may have already begun to notice that, although you are receiving and recording the MIDI messages, it's hard to control what sound/voice/patch the keyboard is playing. This is why each of the 16 MIDI channels on the keyboard can have a different patch associated with it. Most keyboards allow you to change which MIDI channel they are transmitting on, and selecting a channel on the keyboard should also select that channel's patch.

The problem is that you don't want to constantly have to make sure your keyboard's selected channel matches the track you plan to record in your sequencer. If they're not in sync, you'll think you're recording track/channel 2, but the keyboard still has channel 1 selected. Although you may have the "channel 2 ArrayList" full of the MidiMessages you received, those messages have one of their bytes indicating that they are channel 1 messages, and so playback of those "channel 2 messages" results in playback on channel 1, playing channel 1's patch instead of channel 2's.

The solution seems tricky and not very efficient, but it works just fine. The trick is to first stop the keyboard's keys from triggering its internal sounds (in MIDI terms, turning Local Control off); the keyboard will continue to transmit MIDI messages as usual:

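// Controller 122 is Local Control; a value of 0 switches it off (sent on channel 1).
// setMessage can throw InvalidMidiDataException, so call it inside a try block.
ShortMessage msg = new ShortMessage();
msg.setMessage(ShortMessage.CONTROL_CHANGE, 122, 0);
_receiver.send(msg, -1);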

Next, "route" all incoming MIDI messages back to the keyboard, playing them on the track the user thinks she is recording. For example, you may receive all your MIDI messages with the "channel 1 byte" set. If the user thinks she is recording track 2, then for each MIDI message received, in addition to recording it (by storing it in track 2's message ArrayList), change the "channel byte" to 2 and retransmit it to the keyboard (see Listing 6).
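Changing the channel of a message might be done along these lines. The class ChannelRouter and its methods are hypothetical (this is not the article's Listing 6); the sketch copies the message rather than modifying it, and passes system messages through untouched since they carry no channel. Channels here are zero-based, so the musician's track 2 is channel 1.

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.ShortMessage;

// Hypothetical helper for moving a channel voice message to another channel.
final class ChannelRouter {

    /** Convenience factory that converts the checked exception. */
    static ShortMessage shortMessage(int command, int channel, int data1, int data2) {
        try {
            return new ShortMessage(command, channel, data1, data2);
        } catch (InvalidMidiDataException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Returns a copy of the message on the given zero-based channel (0-15). */
    static ShortMessage rechannel(ShortMessage original, int channel) {
        if (original.getCommand() >= 0xF0) {
            return original;  // system messages have no channel to change
        }
        return shortMessage(original.getCommand(), channel,
                original.getData1(), original.getData2());
    }
}
```

Inside your Receiver, you could then store the rechanneled copy in the current track's ArrayList and also send it back to the keyboard's receiver. Storing the rechanneled copy is a design choice of this sketch; it means playback later lands on the intended channel without further conversion.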

Playing Back Multiple Tracks
Assuming you have several different tracks of MIDI data recorded, it's time to play them back. Your first approach might be to use a separate thread for each track (channel). While this is an intuitive programming model, you'll quickly find that although each track (thread) plays back in perfect time relative to itself, it's difficult to keep it perfectly in sync with the other tracks. If your tracks are short and you plan to loop them, you could use thread synchronization to make sure all tracks "sync up" with each other at the end of each iteration. However, you will soon find your clean sequencer code is getting cluttered up with complex thread synchronization all over the place, and it becomes harder and harder to manage and still achieve "rock solid" timing.

What I found to be easier to manage and virtually guaranteed to stay "in time" was to collect all MidiMessages, regardless of track (channel), put them into a single ArrayList, sort them all based on their timestamp, and then play them all back using a single playback thread.
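A sketch of the single-list approach is shown below. MergedPlayback and TimedMessage are invented for this illustration, and the sketch assumes every message already carries an offset in microseconds from the start of the song (raw device timestamps from separate recording passes would first have to be normalized to a common origin).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.sound.midi.MidiMessage;
import javax.sound.midi.Receiver;

// Hypothetical single-threaded playback of several merged tracks.
final class MergedPlayback {

    /** A message plus its offset from the start of the song, in microseconds. */
    static final class TimedMessage {
        final long offsetMicros;
        final MidiMessage message;

        TimedMessage(long offsetMicros, MidiMessage message) {
            this.offsetMicros = offsetMicros;
            this.message = message;
        }
    }

    /** Combines all tracks into one list ordered by offset. */
    static List<TimedMessage> merge(List<List<TimedMessage>> tracks) {
        List<TimedMessage> all = new ArrayList<>();
        for (List<TimedMessage> track : tracks) {
            all.addAll(track);
        }
        // List.sort is stable, so messages with equal offsets keep their track order.
        all.sort(Comparator.comparingLong((TimedMessage tm) -> tm.offsetMicros));
        return all;
    }

    /** Sends each message when its offset from the start of playback has elapsed. */
    static void play(List<TimedMessage> merged, Receiver out) {
        long startNanos = System.nanoTime();
        for (TimedMessage tm : merged) {
            long waitNanos = tm.offsetMicros * 1_000L - (System.nanoTime() - startNanos);
            if (waitNanos > 0) {
                try {
                    Thread.sleep(waitNanos / 1_000_000L, (int) (waitNanos % 1_000_000L));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;  // the user clicked stop
                }
            }
            out.send(tm.message, -1);
        }
    }
}
```

Because each wait is measured against the time playback started, a late wake-up on one message does not push back every message after it.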

Adding Digital Audio
By now you should have a good instrumental recorded using multiple MIDI tracks, but you'll add more interest to your song by laying down a vocal track or two. Luckily, the Java Sound API includes the javax.sound.sampled package dedicated to recording and playing back digital audio.

Recording Audio
Ultimately, any recorded digital audio comes down to samples. A sample is a measurement at a point in time of what you might picture as the audio "waveform." The standard CD sampling rate is to take 44,100 measurements, or samples, each second. Each sample may be 8 bits, 16 bits, or more. There are a variety of sample formats in use today, and the Java Sound API supports about everything you'll encounter. Some useful constants for recording CD quality sound are:

AudioFormat.Encoding encoding = AudioFormat.Encoding.PCM_SIGNED;
int rate = 44100;
int sampleSize = 16;
int channels = 1;
boolean bigEndian = true;

An AudioFormat object will be needed later:

AudioFormat format = new AudioFormat(
encoding, rate, sampleSize, channels,
(sampleSize / 8) * channels, rate, bigEndian);

Before you can begin recording, however, you'll need to obtain a TargetDataLine. The Java Sound API models its sampling API in terms of "lines." A line may be a microphone input, a previously recorded sample, the computer's "line out" or speaker, or any type of "input" or "output." To facilitate the playback of multiple samples at the same time, the interface Mixer is provided, which is itself a type of line. Lines may have controls that parallel what you'd find in a real mixer - gain, pan, volume, reverb, equalization, etc.

Like the MidiDevices returned from the MidiSystem, the class AudioSystem serves as your gateway into finding out and obtaining whatever Lines and Controls are installed and available to you. In general, the first step to recording an audio track is to obtain a TargetDataLine suitable for recording audio in the format requested, in this case an AudioFormat that is a single 16-bit channel recording 44,100 samples/second (see Listing 7).
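Obtaining the line could look roughly like this. It is a sketch rather than the article's Listing 7: CaptureLineSketch and its methods are invented names, and returning null when no suitable line exists is just one way to report the failure. The format matches the constants given earlier (signed 16-bit PCM, mono, 44,100 Hz, big-endian).

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.TargetDataLine;

// Hypothetical helper for opening a recording line.
final class CaptureLineSketch {

    /** Signed 16-bit PCM, one channel, 44,100 samples per second, big-endian. */
    static AudioFormat cdQualityMono() {
        return new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                44100f, 16, 1, 2, 44100f, true);
    }

    /** Describes the kind of line we are asking the AudioSystem for. */
    static DataLine.Info captureInfo(AudioFormat format) {
        return new DataLine.Info(TargetDataLine.class, format);
    }

    /** Returns an opened capture line, or null if none is supported or available. */
    static TargetDataLine openCaptureLine(AudioFormat format) {
        DataLine.Info info = captureInfo(format);
        if (!AudioSystem.isLineSupported(info)) {
            return null;  // no installed mixer can record in this format
        }
        try {
            TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
            line.open(format);
            return line;
        } catch (LineUnavailableException e) {
            return null;  // the line exists but is in use elsewhere
        }
    }
}
```

After a successful openCaptureLine call, line.start() begins capturing; the samples then wait in the line's internal buffer until you read them.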

As you may have suspected, you'll need a separate thread to capture the incoming sample data. Using the TargetDataLine and OutputStream created previously, you'll want to create a loop that reads a chunk of bytes at a time from the TargetDataLine, writing them out to the OutputStream until there's nothing left to read or until the user clicks stop (see Listing 8).
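The capture loop itself can be sketched as a plain stream copy. This is an illustration, not the article's Listing 8: CaptureLoop and copyUntilStopped are invented names, and reading through an InputStream (for instance new AudioInputStream(line), a constructor the API provides for TargetDataLine) instead of calling line.read directly is a choice made here so the loop can be exercised without audio hardware.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical capture loop: copies sample bytes until the data ends or stop is set.
final class CaptureLoop {

    /**
     * @param in        source of sample bytes, e.g. new AudioInputStream(targetDataLine)
     * @param out       destination, e.g. a ByteArrayOutputStream or FileOutputStream
     * @param stop      set to true (from the UI thread) when the user clicks stop
     * @param chunkSize bytes per read; should be a multiple of the frame size
     * @return total number of bytes copied
     */
    static long copyUntilStopped(InputStream in, OutputStream out,
                                 AtomicBoolean stop, int chunkSize) {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        try {
            int n;
            while (!stop.get() && (n = in.read(buffer, 0, buffer.length)) > 0) {
                out.write(buffer, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

Run it on its own thread, and have the stop button call stop.set(true). Since a read on a live line blocks until data arrives, the loop notices the flag only after the current chunk completes, so smaller chunks stop more promptly.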

At this point, your ByteArrayOutputStream contains a ton of bytes. A typical 3:30 song comes to about 9.3 million samples, which at 16 bits per sample is roughly 18.5MB for just a single mono track! FileOutputStream might be a better choice if you're going to be recording lengthy samples and memory becomes scarce. Of course, recording the sample is just half of the story. Now we have to play it back.
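To estimate memory needs for other lengths and formats, multiply the duration by the frame rate and the frame size. The helper below is a made-up illustration of that arithmetic; it relies only on AudioFormat's getFrameRate() and getFrameSize().

```java
import javax.sound.sampled.AudioFormat;

// Hypothetical helper: how many bytes does a recording of a given length need?
final class SampleSizeSketch {

    /** bytes = seconds x frames per second x bytes per frame */
    static long bytesFor(long seconds, AudioFormat format) {
        long frames = (long) (seconds * (double) format.getFrameRate());
        return frames * format.getFrameSize();
    }
}
```

For 3:30 (210 seconds) of 16-bit mono at 44,100 Hz, this gives 210 x 44,100 = 9,261,000 frames of 2 bytes each, or 18,522,000 bytes. A stereo recording doubles the frame size and therefore the total.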

Playing Back Audio
Playing back a previously recorded audio track is essentially the reverse of recording it. That is, the sample's bytes, originally stored in an OutputStream, are written out to a SourceDataLine one chunk at a time until there's nothing left or until the user clicks stop.

To read the bytes a chunk at a time, we'll need an InputStream. The Java Sound API provides the class AudioInputStream that has several convenience methods for working with samples. Again, we'll need to refer to the same AudioFormat that the sample was originally recorded in. In our case, we'll assume we're dealing with a completely in-memory sample, expressed as an array of bytes (see Listing 9).

Note that AudioInputStream's mark method is used to mark the beginning of the sample, while the reset method is used to "rewind" the sample to the beginning.
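Wrapping an in-memory sample and rewinding it might look like the following sketch. InMemorySample and its methods are invented for illustration (this is not the article's Listing 9). It relies on ByteArrayInputStream supporting mark and reset, which AudioInputStream passes through to the underlying stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;

// Hypothetical helper around an in-memory sample.
final class InMemorySample {

    /** Wraps the bytes and marks the start so the sample can be rewound later. */
    static AudioInputStream open(byte[] samples, AudioFormat format) {
        long frames = samples.length / format.getFrameSize();
        AudioInputStream in = new AudioInputStream(
                new ByteArrayInputStream(samples), format, frames);
        in.mark(samples.length);  // remember the beginning of the sample
        return in;
    }

    /** Reads up to chunkSize bytes (a multiple of the frame size); empty at the end. */
    static byte[] readChunk(AudioInputStream in, int chunkSize) {
        try {
            byte[] buffer = new byte[chunkSize];
            int n = in.read(buffer, 0, buffer.length);
            return n <= 0 ? new byte[0] : Arrays.copyOf(buffer, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns to the position marked in open(). */
    static void rewind(AudioInputStream in) {
        try {
            in.reset();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A playback thread would call readChunk in a loop and hand each chunk to the output line; calling rewind before the next playback starts the sample from the top again.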

As has been the case, we'll need a separate thread to play back the sample. We'll use the AudioInputStream set up above to read sample bytes from it, a chunk at a time, writing them out to a SourceDataLine. Just as we obtained our TargetDataLine from the AudioSystem, we'll obtain a SourceDataLine suitable for playing back a sample in our AudioFormat through inquiry (see Listing 10).

Since we have a SourceDataLine that can handle our AudioFormat, we can start a thread to write out the sample bytes to it (see Listing 11).

Now that you have your audio track playing back, we're almost done!

Putting It All Together
At this point we have the main ingredients for a basic multi-track MIDI sequencer that can also record and play back audio. Although we can play back multiple tracks of MIDI using just one thread, it's much more difficult to play back multiple samples with a single thread. For simplicity, we'll continue to use one thread for all MIDI data, but create a different thread for each audio sample.

The basic trick for integrating MIDI and one or more samples is to simply synchronize the start of the MIDI tracks thread with the audio track thread(s) using normal thread synchronization techniques.
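One common way to line up the start of several threads is a pair of latches, sketched below. The article does not say which synchronization technique it used, so treat CountDownLatch as one reasonable option rather than the author's method; SynchronizedStart and runTogether are invented names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical start gate: every player thread begins at (nearly) the same moment.
final class SynchronizedStart {

    /** Starts one thread per player, releases them together, and waits for all to finish. */
    static void runTogether(List<Runnable> players) {
        CountDownLatch ready = new CountDownLatch(players.size());
        CountDownLatch go = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();
        for (Runnable player : players) {
            Thread t = new Thread(() -> {
                ready.countDown();           // this thread is up and waiting
                try {
                    go.await();              // block until everyone is ready
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                player.run();                // e.g. MIDI playback or one audio sample
            });
            threads.add(t);
            t.start();
        }
        try {
            ready.await();                   // wait until all threads have started
            go.countDown();                  // open the gate for all of them at once
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            go.countDown();                  // do not leave player threads blocked
            Thread.currentThread().interrupt();
        }
    }
}
```

Here one Runnable would wrap the single MIDI playback loop and each of the others one audio sample. The latch only aligns the starting point; it does not correct any drift that builds up between the threads afterwards.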

Of course, real commercial MIDI/audio sequencers can do much more than record and play back multiple tracks. That's just the beginning. After all, a real sequencer can:

  • Play back what was recorded at one tempo at a different tempo
  • Import "instrument definitions" that specify the patch names mapped to patch numbers
  • Select each track's "patch" by searching the available patches by name
  • Provide a mixer with volume and pan sliders for each track
  • Record and play back volume changes from the mixer in real time
  • "Trigger" audio samples from the keyboard (a la a conventional sampler)
  • Quantize recorded MIDI data to the nearest 1/4 note, 1/8th note, 1/16th note, etc.

    I'm out of space, so for now, I'll have to leave that as an exercise for you, the reader. In the meantime, enjoy your new sequencer!


    Open Source MIDI and Audio Projects

  • Audio Development System: http://sourceforge.net/projects/adsystem
  • jMusic: http://sourceforge.net/projects/jmusic
  • Sound Grid: http://sourceforge.net/projects/soundgrid

    API References

  • Java Sound Programmer Guide: http://java.sun.com/j2se/1.4.1/docs/guide/sound/programmer_guide/contents.html
  • Java Sound Demo: http://java.sun.com/products/javamedia/sound/samples/JavaSoundDemo/

    MIDI Specification

  • Official MIDI Specification: www.midi.org
  • Online MIDI Specification (unofficial): www.borg.com/~jglatt/tech/midispec.htm


  • Bug ID 4773012: RFE: Implement a new stand-alone sequencer: http://developer.java.sun.com/developer/bugParade/bugs/4773012.html
  • Bug ID 4783745: Sequencer cannot access external MIDI devices: http://developer.java.sun.com/developer/bugParade/bugs/4783745.html

    Most Recent Comments
    John Nozum 12/23/04 12:17:58 AM EST

    I want to thank you VERY MUCH for helping me understand MIDI sequencer programming much better. Years ago, I did make a single-track MIDI sequencer for the Commodore 64/128 computers. I had very little difficulty, if any, in understanding very basic sequencer theory. However, I had an AWFUL time trying to figure out how a multi-track MIDI sequencer might work internally! I feel that you helped me a LOT in understanding how a multitrack MIDI sequencer works. I liked how you explained the idea of having separate arrays for each track versus one array list that contains all the MIDI data. Fortunately, I can also kick some butt when it comes to arrays. After all, I have programmed computers for around 22 years and worked with at least 7-8 languages.

    One of these years, I'm thinking about trying to make my own MIDI sequencer for the PC. I am thinking about this for two reasons. First of all, years ago I had a Casio CT-7000 keyboard, which has a hardware sequencer. Unfortunately this keyboard does NOT have MIDI. However, its hardware sequencer has a very unusual feature that I have not seen much of afterwards. If you make a mistake during recording, you can use the following procedure:

    1. While still in RECORD mode, hold down the REWIND
    button and go back a few measures.
    2. After letting up on the REWIND button, the sequencer
    will go into PLAY mode from that point forward.
    3. At some point prior to your mistake(s), start playing
    the keyboard. At this point, the sequencer will
    immediately go into RECORD mode, thus covering up
    your error(s).

    You can think of this as a "punch-in" that is triggered as soon as you hit a note. Please note that the sequencer remains in RECORD mode until you stop it or you run out of memory.

    The second reason why I might try to one day build my own multitrack sequencer is because I can't seem to find a PC sequencer that I liked near as well as Bars & Pipes Pro for the Commodore Amiga. This sequencer was a POWERHOUSE!

    Again, I thank you very much for helping me get more of an understanding on somewhat how a multitrack MIDI sequencer can work internally!

    From John Nozum

    Martin Cooper 04/23/04 02:40:31 AM EDT

    Nice article. I've done quite a bit of MIDI programming in C/C++ land in past lives, but am relatively new to doing this in Java land. This article was helpful.

    The main thing I found lacking was detail on timing. This is usually one of the trickier aspects of MIDI programming, and for good reason. There is a reference to the system clock being "not reliable enough", but later an implication that Thread.sleep() (which presumably uses the system clock) is the right way to time MIDI messages.

    What is the real answer here? If I have a list of (start + duration) messages ready to go, what is the best way to time the start and stop events?