Introduction
As the name implies, this is essentially just a standard MIDI
file, wrapped in a RIFF container. This allows a MIDI file
to contain extra tags. But wait! MIDI files do specify
meta events like copyright, text, track names...
"Hmm... there's 'tagging' alright, but we want to tag these
the same way we did for bitmaps and WAV files," say Microsoft and IBM.
And so they did. The metadata contained within an RMI is what shows up
if you check the file properties of one.
Since information on it is so sparse and fragmented across the different standards that are used, I decided to write them up as one reference page, which I hope will help demystify this format a little bit. I won't be covering everything though. :P
Structure
In general
Simply put, it's just a regular RIFF file where everything is split into chunks in this format: [fourCC] [4-byte length] [data...]
The official specs can be found here; this page summarizes them for quick reference.
Hex Data | Size | Contents |
---|---|---|
52 49 46 46 | 4 bytes | "RIFF" in ASCII |
xx xx xx xx | 4 bytes | The size of the entire file minus the 8 bytes needed for the RIFF header. This is encoded in little-endian. If your file is 12 kB = 12000 bytes = 0x2EE0, then you fill in 0x2ED8 = D8 2E 00 00 here. |
52 4D 49 44 | 4 bytes | "RMID" in ASCII |
64 61 74 61 | 4 bytes | "data" in ASCII |
xx xx xx xx | 4 bytes | The size of the contents of the entire MIDI file that immediately follows this. This is encoded in little-endian. |
xx xx xx xx xx ... | (variable) | The MIDI file itself - it's literally a copy and paste job (with padding to make the file have an even number of bytes) |
The official RMID spec specifically calls for this exact order as the bare minimum for an .RMI file. So it's very easy to convert a .MID into an .RMI, and vice versa. As the purpose is to bolt additional standardized metadata onto a basic MIDI file, doing just that might be pointless, so let's look at the other chunks you might come across.
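Since the minimal layout is so small, wrapping and unwrapping can be sketched in a few lines of Python. This is only an illustration of the table above, not a reference implementation: the function names `midi_to_rmi` and `rmi_to_midi` are made up for this example, and the unwrapper assumes the `data` chunk comes right after `RMID`, as in the bare-minimum layout.

```python
import struct

def midi_to_rmi(midi: bytes) -> bytes:
    """Wrap raw MIDI bytes in the bare-minimum RIFF/RMID layout."""
    # "data" + little-endian size + the MIDI file, copied as-is
    data_chunk = b"data" + struct.pack("<I", len(midi)) + midi
    if len(midi) % 2:
        data_chunk += b"\x00"  # pad byte; not counted in the data size
    body = b"RMID" + data_chunk
    # RIFF size = everything after the 8-byte RIFF header
    return b"RIFF" + struct.pack("<I", len(body)) + body

def rmi_to_midi(rmi: bytes) -> bytes:
    """Pull the MIDI bytes back out (assumes 'data' directly follows 'RMID')."""
    if rmi[0:4] != b"RIFF" or rmi[8:12] != b"RMID" or rmi[12:16] != b"data":
        raise ValueError("not a bare-minimum RMI file")
    (size,) = struct.unpack("<I", rmi[16:20])
    return rmi[20:20 + size]
```

A real-world file may put other chunks before `data`, in which case you'd want to walk the chunks instead of relying on fixed offsets.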
DISP
According to Microsoft's documentation on the then-new RIFF chunks (titled "New Multimedia Data Types and Data Techniques"):
A DISP chunk contains easily rendered and displayable objects associated with an instance of a more complex object in a RIFF form (e.g. sound file, AVI movie)...
...The DISP chunk is especially beneficial when representing OLE data within an application. For example, when pasting a wave file into Excel, the creating application can use the DISP chunk to associate an icon and a text description to represent the embedded wave file. This text should be short so that it can be easily displayed in menu bars and under icons.
To summarize: in this case, they're basically "attachments" to the file. How are they defined? Well:
Hex Data | Size | Contents |
---|---|---|
44 49 53 50 | 4 bytes | "DISP" in ASCII |
xx xx xx xx | 4 bytes | The size of the chunk minus the 8 bytes for the header, and including the 4 bytes of the following type definition. This is encoded in little-endian. |
xx 00 00 00 | 4 bytes | The type of the data, see the table below. |
xx xx xx xx... | (Variable) | The data itself. |
The type is specifically defined to be Windows clipboard
formats, which I'll save you the effort of looking through
(for the curious, it's in WinUser.h, and also documented here):
Type | Contents |
---|---|
01 | Plain text (ANSI) [text] |
02 | a BMP image [bmp] |
03 | Metapicture file |
04 | a SYLK spreadsheet |
05 | a Data Interchange Format spreadsheet |
06 | a TIFF image |
07 | Plain text (OEM / codepage 437) [text] |
08 | a DIB image, this is used as a "file icon" [dib] |
09 | A color palette, apparently can be connected to the DIB image |
0A | Pen extension data for Windows for Pen Computing (probably irrelevant here) |
0B | a sound file inside a RIFF container [riff] |
0C | a WAV file |
0D | Plain text (Unicode) [text] |
0E | an Enhanced Metafile image |
0F | A zero-terminated list of files (Windows 95/NT 4 and up only) |
10 | Language of the plain text (Windows 95/NT 4 and up only) [locale] |
11 | a DIB image containing color space info (aka BITMAPV5, Windows 98/NT 5 and up only) |
- [dib] Converting a DIB to conventional formats is easy (this is ImageMagick's convert):
convert rippedImage.dib rippedImage.png
... as for vice versa, it seems bugged when I try it.
A workaround for now is to convert it to a BITMAPV3 image as described in the point immediately below, and then chop off the first 14 bytes (the BMP file header) from the start of the file.
- [bmp] I haven't actually looked into this, but you might be able to get away with using BITMAPV3:
convert myImage.png BMP3:myImage.bmp
- [riff] I'm not exactly sure what this means, yet. (assuming it's just a regular RIFF sound file...)
- [locale] I assume this would be two bytes (little endian?), with the specific values defined here...
- [text][1][2][3] Line breaks are CR+LF (0D 0A), and the text ends with a null byte (00).
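To make the DISP layout concrete, here is a rough Python sketch that builds a plain-text (type 01) DISP chunk. The name `build_disp_text` is hypothetical, and treating "ANSI" as codepage 1252 is my assumption; swap in whatever codepage fits your text. The trailing pad byte follows the usual RIFF rule of keeping chunks on even boundaries.

```python
import struct

CF_TEXT = 1  # type 01 in the table above: plain text (ANSI)

def build_disp_text(text: str) -> bytes:
    """Build a DISP chunk holding a short text description."""
    # Normalize line breaks to CR+LF and terminate with a null byte
    payload = "\r\n".join(text.splitlines()).encode("cp1252") + b"\x00"
    # The chunk size covers the 4-byte type plus the payload
    data = struct.pack("<I", CF_TEXT) + payload
    chunk = b"DISP" + struct.pack("<I", len(data)) + data
    if len(data) % 2:
        chunk += b"\x00"  # pad to an even length; not counted in the size
    return chunk
```

For example, `build_disp_text("Hi")` yields 16 bytes: the 8-byte header, the 4-byte type, `Hi` plus its null terminator, and one pad byte.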
INFO
This is a RIFF LIST chunk that essentially contains the tags for the RMI file, each tag being its own RIFF chunk in the usual format. Here's the header for this chunk:
Hex Data | Size | Contents |
---|---|---|
4C 49 53 54 | 4 bytes | "LIST" in ASCII |
xx xx xx xx | 4 bytes | The size of the chunk minus the 8 bytes for the header, and including the 4 bytes of the following ASCII identifier. This is encoded in little-endian. |
49 4E 46 4F | 4 bytes | "INFO" in ASCII |
xx xx xx xx... | (Variable) | The data itself, made up of several RIFF sub-chunks. |
A selection of tags is listed below; a more complete list can be found here. The tags here are assumed to be written as plain, null-terminated text.
Tag Name | Contents |
---|---|
IART | Artist of the composition |
ICOP | Copyright information |
ICRD | Date created |
INAM | Title of the composition |
ISBJ | Description of the file / additional info |
ICMT | Comments about the composition or the file |
ICMS | Who or which entity commissioned this file |
IGNR | The genre of the composition |
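Putting the header and the tags together, a LIST/INFO chunk could be assembled roughly like this. It's a sketch under a couple of assumptions: the helper name `build_info_list` is invented for this page, the text is encoded as plain ASCII, and odd-sized sub-chunks get the usual RIFF pad byte.

```python
import struct

def build_info_list(tags: dict[str, str]) -> bytes:
    """Build a LIST chunk of type INFO from {fourCC: text} pairs."""
    body = b"INFO"
    for fourcc, value in tags.items():
        if len(fourcc) != 4:
            raise ValueError("tag names must be exactly 4 characters")
        payload = value.encode("ascii") + b"\x00"  # null-terminated text
        body += fourcc.encode("ascii") + struct.pack("<I", len(payload)) + payload
        if len(payload) % 2:
            body += b"\x00"  # pad byte; not counted in the sub-chunk size
    # LIST size = the "INFO" identifier plus all sub-chunks (with padding)
    return b"LIST" + struct.pack("<I", len(body)) + body
```

Usage would look something like `build_info_list({"INAM": "My Song", "IART": "Me"})`, with the result placed after the `data` chunk inside the RIFF body.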
vers
The specification also defines a version number chunk, curiously. This is basically the version number of the file itself. According to the specification, this chunk is to be placed right after the MIDI data and before the INFO chunks.
I haven't seen a file use this yet, but I thought this was interesting. Might as well write it here:
Hex Data | Size | Contents |
---|---|---|
76 65 72 73 | 4 bytes | "vers" in ASCII |
08 00 00 00 | 4 bytes | The size of this chunk's data (8 bytes), in little endian |
xx xx ww ww zz zz yy yy | 8 bytes | Suppose a version number like ww.xx.yy.zz: each number is encoded as a 16-bit little-endian number and arranged in the order shown on the left. Example: version 1.0.24.490 is encoded as 00 00 01 00 EA 01 18 00. |
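The byte order is easy to get wrong by hand, so here's a small Python sketch of the encoding described in the table. `build_vers` is a name I made up; the only thing it relies on is the xx, ww, zz, yy ordering given above.

```python
import struct

def build_vers(ww: int, xx: int, yy: int, zz: int) -> bytes:
    """Build a 'vers' chunk for version ww.xx.yy.zz."""
    # Four 16-bit little-endian numbers, stored in the order xx, ww, zz, yy
    data = struct.pack("<4H", xx, ww, zz, yy)
    return b"vers" + struct.pack("<I", len(data)) + data

# Version 1.0.24.490: 24 = 0x0018 and 490 = 0x01EA
print(build_vers(1, 0, 24, 490)[8:].hex(" "))  # 00 00 01 00 ea 01 18 00
```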
DLS
Perhaps one of the coolest things I didn't know an RMI file could have: basically its own soundfont to go along with the sequence data. This makes MIDI files closer to something like a MOD or an XM, but not bound by module speed and row/frame boundaries.
As far as I know, DLS data is literally appended to the end of the RMID data. The difference is that the size of the RIFF file itself (at offsets 4-7) needs to be updated to reflect the entire file.
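Going by that description (and with the same "as far as I know" caveat), appending DLS data and patching the size field might look like the sketch below. `append_to_rmi` is a hypothetical helper, and it assumes the existing RMI already has an even length.

```python
import struct

def append_to_rmi(rmi: bytes, extra: bytes) -> bytes:
    """Append extra chunk data (e.g. a DLS file) and fix up the RIFF size."""
    if rmi[0:4] != b"RIFF" or rmi[8:12] != b"RMID":
        raise ValueError("not an RMI file")
    if len(extra) % 2:
        extra += b"\x00"  # keep the file on an even boundary
    out = rmi + extra
    # Offsets 4-7: size of the whole file minus the 8-byte RIFF header
    return out[:4] + struct.pack("<I", len(out) - 8) + out[8:]
```

The same helper would also work for tacking on a DISP or LIST chunk, since all it does is append bytes and rewrite the four size bytes.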
Final words
It looks like this format was significant enough for it to be picked up by the MIDI Manufacturers Association as technical note RP-029 in 2000. But after a while, they might have realized that this format has its shortcomings (like the 4GB limit, or that it's best used with one vendor, idk), and developed the Extensible Music Format (XMF) to replace it a year later.
XMF is apparently used in mobile settings, but the thing is I've never even seen one of these files, whereas I've at least seen a few RMIs.