.RMI - the RIFF-based MIDI file

Introduction

As the name implies, this is essentially just a standard MIDI file wrapped in a RIFF container. This allows a MIDI file to contain extra tags. But wait! MIDI files do specify meta events like copyright, text, track names... "Hmm... there's 'tagging' alright, but we want to tag these the same way we did for bitmaps and WAV files," say Microsoft and IBM. And so they did. The metadata contained within an RMI shows up when you check the file properties of one:

[Image: Windows 95 file properties box of a file named 'In the Hall of the Mountain King', showing the Midisoft logo image, copyright information, and other info such as Artist, Display Name and Subject.]

Since information on it is so sparse and fragmented across the different standards that are used, I decided to write it all up as one reference page, which I hope will help demystify this format a little bit. I won't be covering everything though. :P

Structure

In general

Simply put, it's just a regular RIFF file where everything is split into chunks, each in this format: [fourCC] [4-byte length] [data...]
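
To make the chunk layout a bit more concrete, here is a minimal Python sketch of walking a buffer chunk by chunk. The name iter_chunks is made up for this example, and it assumes the usual RIFF rule that an odd-sized chunk is followed by one pad byte that is not counted in the length field.

```python
import struct

def iter_chunks(buf, start=0, end=None):
    """Yield (fourcc, data) for each chunk found in buf[start:end].

    Hypothetical helper for illustration only.
    """
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        # 4-byte ASCII identifier followed by a 32-bit little-endian length.
        fourcc, size = struct.unpack_from("<4sI", buf, pos)
        yield fourcc, buf[pos + 8 : pos + 8 + size]
        # Assumption: chunks are word-aligned, so skip one pad byte after odd sizes.
        pos += 8 + size + (size & 1)

# Two made-up chunks: "abcd" (3 bytes + 1 pad byte) and "efgh" (2 bytes).
sample = (b"abcd" + struct.pack("<I", 3) + b"xyz" + b"\x00"
          + b"efgh" + struct.pack("<I", 2) + b"hi")
chunks = list(iter_chunks(sample))
```

For an actual .RMI file you would start at offset 12, right after the "RIFF" identifier, the size field and the "RMID" identifier.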

The official specs can be found here; this page summarizes them for quick reference.

Hex Data | Size | Contents
52 49 46 46 | 4 bytes | "RIFF" in ASCII
xx xx xx xx | 4 bytes | The size of the entire file minus the 8 needed for the RIFF header. This is encoded in little-endian. If your file is 12 kB = 12000 bytes = 0x2EE0, then you fill in 0x2ED8 = D8 2E 00 00 here.
52 4D 49 44 | 4 bytes | "RMID" in ASCII
64 61 74 61 | 4 bytes | "data" in ASCII
xx xx xx xx | 4 bytes | This is the size for the contents of the entire MIDI file that immediately follows this. This is encoded in little-endian.
xx xx xx xx xx ... | (variable) | The MIDI file itself - it's literally a copy and paste job (with padding to make the file have an even number of bytes)

The official RMID spec specifically calls for this exact order as the bare minimum for an .RMI file. So it's very easy to convert a .MID into an .RMI, and vice versa. As the purpose is to bolt additional standardized metadata onto a basic MIDI file, doing just that might be pointless, so let's look at the other chunks you might come across.
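
As a rough illustration of how easy that conversion is, here is a minimal Python sketch that wraps a .MID into a bare-minimum .RMI and unwraps it again. The function names mid_to_rmi and rmi_to_mid are made up for this example, and the unwrapping only handles the exact minimal layout from the table above (the "data" chunk coming first).

```python
import struct

def mid_to_rmi(mid: bytes) -> bytes:
    """Wrap raw MIDI file bytes in a minimal RIFF/RMID container."""
    data_chunk = b"data" + struct.pack("<I", len(mid)) + mid
    if len(mid) % 2:
        # Pad to an even number of bytes; the pad is not counted in the data size.
        data_chunk += b"\x00"
    body = b"RMID" + data_chunk
    # The RIFF size is the whole file minus the 8-byte RIFF header.
    return b"RIFF" + struct.pack("<I", len(body)) + body

def rmi_to_mid(rmi: bytes) -> bytes:
    """Extract the MIDI bytes, assuming "data" is the first chunk."""
    if rmi[0:4] != b"RIFF" or rmi[8:12] != b"RMID" or rmi[12:16] != b"data":
        raise ValueError("not a minimal RMID file")
    (size,) = struct.unpack_from("<I", rmi, 16)
    return rmi[20 : 20 + size]

# A tiny placeholder MIDI file: header chunk plus one track with an end-of-track event.
mid = (b"MThd" + struct.pack(">IHHH", 6, 0, 1, 96)
       + b"MTrk" + struct.pack(">I", 4) + b"\x00\xff\x2f\x00")
rmi = mid_to_rmi(mid)
```

A more robust reader would walk the chunks and look for "data" by name rather than trusting the offsets.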

DISP

According to Microsoft's documentation on the then-new RIFF chunks (titled "New Multimedia Data Types and Data Techniques"):

A DISP chunk contains easily rendered and displayable objects associated with an instance of a more complex object in a RIFF form (e.g. sound file, AVI movie)...

...The DISP chunk is especially beneficial when representing OLE data within an application. For example, when pasting a wave file into Excel, the creating application can use the DISP chunk to associate an icon and a text description to represent the embedded wave file. This text should be short so that it can be easily displayed in menu bars and under icons.

To summarize: in this case, DISP chunks are basically "attachments" to the file. How are they defined? Well:

Hex Data | Size | Contents
44 49 53 50 | 4 bytes | "DISP" in ASCII
xx xx xx xx | 4 bytes | The size of the chunk minus the 8 bytes for the header, and including the 4 bytes of the following type definition. This is encoded in little-endian.
xx 00 00 00 | 4 bytes | The type of the data, see the table below.
xx xx xx xx... | (variable) | The data itself.

The type is specifically defined to be Windows clipboard formats, which I'll save you the effort of looking through (for the curious, it's in WinUser.h, and also documented here):

Type | Contents
01 | Plain text (ANSI) [text]
02 | a BMP image [bmp]
03 | Metapicture file
04 | a SYLK spreadsheet
05 | a Data Interchange Format spreadsheet
06 | a TIFF image
07 | Plain text (OEM / codepage 437) [text]
08 | a DIB image, this is used as a "file icon" [dib]
09 | A color palette, apparently can be connected to the DIB image
0A | Pen extension data for Windows for Pen Computing (probably irrelevant here)
0B | a sound file inside a RIFF container [riff]
0C | a WAV file
0D | Plain text (Unicode) [text]
0E | an Enhanced Metafile image
0F | A zero-terminated list of files (Windows 95/NT 4 and up only)
10 | Language of the plain text (Windows 95/NT 4 and up only) [locale]
11 | a DIB image containing color space info (aka BITMAPV5, Windows 98/NT 5 and up only)
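
Here is a minimal Python sketch of reading the contents of a DISP chunk, given the bytes that follow its 8-byte header. The CF_* names are the clipboard format constants from WinUser.h; the function name parse_disp is made up for this example, and decoding "ANSI" text as cp1252 is an assumption (the real codepage depends on the system that wrote the file).

```python
import struct

# Clipboard format numbers as named in WinUser.h.
CF_NAMES = {
    1: "CF_TEXT", 2: "CF_BITMAP", 3: "CF_METAFILEPICT", 4: "CF_SYLK",
    5: "CF_DIF", 6: "CF_TIFF", 7: "CF_OEMTEXT", 8: "CF_DIB",
    9: "CF_PALETTE", 10: "CF_PENDATA", 11: "CF_RIFF", 12: "CF_WAVE",
    13: "CF_UNICODETEXT", 14: "CF_ENHMETAFILE", 15: "CF_HDROP",
    16: "CF_LOCALE", 17: "CF_DIBV5",
}

def parse_disp(data: bytes):
    """Return (type, payload); text types are decoded to str (hypothetical helper)."""
    (cf_type,) = struct.unpack_from("<I", data, 0)  # 32-bit little-endian type
    payload = data[4:]
    if cf_type == 1:
        # Assumption: ANSI text is cp1252; cut at the first null terminator.
        return cf_type, payload.split(b"\x00", 1)[0].decode("cp1252", errors="replace")
    if cf_type == 13:
        # Unicode text on Windows is UTF-16 little-endian.
        return cf_type, payload.decode("utf-16-le", errors="replace").split("\x00", 1)[0]
    # Everything else (DIB, palette, ...) is returned as raw bytes.
    return cf_type, payload

disp_text = parse_disp(struct.pack("<I", 1) + b"Mountain King\x00")
```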

INFO

This is a RIFF LIST chunk that essentially contains the tags for the RMI file, each tag being its own RIFF chunk in the usual format. Here's the header for this chunk:

Hex Data | Size | Contents
4C 49 53 54 | 4 bytes | "LIST" in ASCII
xx xx xx xx | 4 bytes | The size of the chunk minus the 8 bytes for the header, and including the 4 bytes of the following ASCII identifier. This is encoded in little-endian.
49 4E 46 4F | 4 bytes | "INFO" in ASCII
xx xx xx xx... | (variable) | The data itself, made up of several RIFF sub-chunks.

A selection of tags is listed below; a more complete list can be found here. The tags here are assumed to be written as plain, null-terminated text.

Tag Name | Contents
IART | Artist of the composition
ICOP | Copyright information
ICRD | Date created
INAM | Title of the composition
ISBJ | Description of the file / additional info
ICMT | Comments about the composition or the file
ICMS | Who or which entity commissioned this file
IGNR | The genre of the composition
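
Putting the header and the tags together, here is a minimal Python sketch that builds a LIST/INFO chunk from a dictionary and reads it back. The names make_chunk, make_info_list and parse_info_list are made up for this example, the cp1252 encoding is an assumption, and so is the one-byte padding after odd-sized sub-chunks.

```python
import struct

def make_chunk(fourcc: bytes, data: bytes) -> bytes:
    """[fourCC] [4-byte length] [data], plus a pad byte if the data length is odd."""
    chunk = fourcc + struct.pack("<I", len(data)) + data
    return chunk + (b"\x00" if len(data) % 2 else b"")

def make_info_list(tags: dict) -> bytes:
    body = b"INFO"
    for name, text in tags.items():
        # Each tag is its own sub-chunk holding null-terminated text.
        body += make_chunk(name.encode("ascii"), text.encode("cp1252") + b"\x00")
    return make_chunk(b"LIST", body)

def parse_info_list(chunk: bytes) -> dict:
    if chunk[0:4] != b"LIST" or chunk[8:12] != b"INFO":
        raise ValueError("not a LIST/INFO chunk")
    (size,) = struct.unpack_from("<I", chunk, 4)
    pos, end = 12, 8 + size
    tags = {}
    while pos + 8 <= end:
        name, n = struct.unpack_from("<4sI", chunk, pos)
        raw = chunk[pos + 8 : pos + 8 + n]
        tags[name.decode("ascii")] = raw.split(b"\x00", 1)[0].decode("cp1252")
        pos += 8 + n + (n & 1)  # skip the pad byte after odd-sized data
    return tags

# Placeholder tag values, purely for illustration.
example_tags = {"INAM": "In the Hall of the Mountain King", "IART": "Example Artist"}
info = make_info_list(example_tags)
```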

vers

Curiously, the specification also defines a version number chunk. This is basically the version number of the file itself. According to the specification, this chunk is to be placed right after the MIDI data and before the INFO chunks.

I haven't seen a file use this yet, but I thought this was interesting. Might as well write it here:

Hex Data | Size | Contents
76 65 72 73 | 4 bytes | "vers" in ASCII
08 00 00 00 | 4 bytes | The size of this chunk's data (8 bytes), in little-endian
xx xx ww ww zz zz yy yy | 8 bytes | For a version number like ww.xx.yy.zz, each number is encoded as a 16-bit little-endian number and stored in the order shown on the left.

Example: version 1.0.24.490 is encoded as
  • ww ww = 01 00
  • xx xx = 00 00
  • yy yy = 18 00
  • zz zz = EA 01
So the data will look like: 00 00 01 00 EA 01 18 00
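
The field order is easy to get wrong, so here is a minimal Python sketch that encodes and decodes the chunk following the layout above. The function names encode_vers and decode_vers are made up for this example.

```python
import struct

def encode_vers(ww: int, xx: int, yy: int, zz: int) -> bytes:
    """Build a complete "vers" chunk for version ww.xx.yy.zz."""
    # Stored order is xx, ww, zz, yy, each as a 16-bit little-endian number.
    data = struct.pack("<4H", xx, ww, zz, yy)
    return b"vers" + struct.pack("<I", len(data)) + data

def decode_vers(chunk: bytes):
    """Return (ww, xx, yy, zz) from a complete "vers" chunk."""
    if chunk[0:4] != b"vers":
        raise ValueError("not a vers chunk")
    xx, ww, zz, yy = struct.unpack_from("<4H", chunk, 8)
    return ww, xx, yy, zz

vers = encode_vers(1, 0, 24, 490)
```

With the example version 1.0.24.490, the last 8 bytes of vers come out as 00 00 01 00 EA 01 18 00, matching the worked example.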

DLS

Perhaps one of the coolest things I didn't know an RMI file could have is basically its own soundfont to go along with the sequence data. This makes MIDI files closer to something like a MOD or an XM, but not bound by module speed and row/frame boundaries.

As far as I know, DLS data is literally appended to the end of the RMID data. The only catch is that the size of the RIFF file itself (at offsets 4-7) needs to be updated to reflect the entire file.
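
Here is a minimal Python sketch of that append-and-fix-up step. It rests on the same "as far as I know" assumption: that the DLS file goes in whole, with its own "RIFF" header and "DLS " form type, after the existing RMID contents. The function name append_dls is made up for this example.

```python
import struct

def append_dls(rmi: bytes, dls: bytes) -> bytes:
    """Append a whole DLS file to an RMI file and update the outer RIFF size."""
    if rmi[0:4] != b"RIFF" or rmi[8:12] != b"RMID":
        raise ValueError("not an RMID file")
    if dls[0:4] != b"RIFF" or dls[8:12] != b"DLS ":
        raise ValueError("not a DLS file")
    # Keep the appended data word-aligned (defensive; an RMI should already be even).
    out = rmi + (b"\x00" if len(rmi) % 2 else b"") + dls
    # Rewrite offsets 4-7: total size minus the 8-byte RIFF header, little-endian.
    return out[0:4] + struct.pack("<I", len(out) - 8) + out[8:]

# Tiny placeholder inputs: a 2-byte "MIDI" payload and an empty DLS form.
fake_rmi = b"RIFF" + struct.pack("<I", 14) + b"RMID" + b"data" + struct.pack("<I", 2) + b"ab"
fake_dls = b"RIFF" + struct.pack("<I", 4) + b"DLS "
combined = append_dls(fake_rmi, fake_dls)
```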

Final words

It looks like this format was significant enough to be picked up by the MIDI Manufacturers Association as technical note RP-029 in 2000. But after a while, they might have realized that this format has its shortcomings (like the 4 GB limit, or that it's best used with one vendor, idk), and developed the Extensible Music Format (XMF) to replace it a year later.

XMF is apparently used in mobile settings, but the thing is, I've never even seen one of these files, whereas I've at least seen a few RMIs.