Modding FMOD
A couple of years ago I was tooling around with a (still unreleased) mod for Halo 3: ODST. The Halo modding scene cops some flak for being derivative as a lot of mods are just asset ports between games in the series. And this one was no exception; I needed some sounds from Halo 3.
All the Way to the Bank
The MCC ports of both games run their audio through FMOD, a cross-platform audio middleware. The runtime side is what you'd expect: a mixer, effects, 3D positioning, and the usual. The interesting bits for modding are the offline parts. You feed FMOD a pile of source .wav files and out the other end come packed .fsb files, as in FMOD Sound Bank (not to be confused with a certain three-letter agency). In MCC that authoring pass happens inside tool, which is an absolute Swiss Army knife of a CLI:
.\tool.exe sounds-single-layer "data\sound\game_sfx\ui\shield_charge\charge\loop" sfx -bank:h3
# more imports, then rebuild:
.\tool.exe report-sounds "sound"
.\tool.exe export-fmod-banks "reports\reports_00\sounds_report_sizes.csv" pc sfx -bank:h3
Like a lot of tool's workflows, this one's somewhat fraught. Bank corruption is common enough that many a modder has a horror story:
Gashnor, replying to @Kashiiera on X:
> I would run one big import when I gathered all the sounds I'd need because it corrupted my banks so much. Ended up getting a decent amount in but yeah, not ideal.
>
> I get why Bungie designed this way for the Xbox 360, but it isn't super scalable
In Bungie's defence, this isn't their design. FMOD was bolted on when the games were ported to the Xbox One as part of the MCC. Bungo would never. Traditionally, data in Halo is stored in what are called tag files. Every single game asset is a tag, from warthogs to frag grenades, and yes, sounds too. The sound tags still carry the per-clip gameplay metadata (sound class, playback flags, pitch ranges, and priority) but not the audio itself, which now lives in FMOD banks.
This got me thinking: what if I could copy the tag files as-is and modify ODST's banks to include the necessary sub-sounds, all without going through tool? Avoiding tool would give me two benefits:
- it would sidestep any unnecessary re-encodes; and
- it would prevent any surprise bank corruption.
Fourth Floor: Headers, Names, Keys to Sub-sounds
So I set about researching all I could about the proprietary .fsb format. I came across python-fsb5 and the venerable vgmstream. Both projects were invaluable references to have for the binary format, but of course each is only concerned with extraction so I would still have to roll my own tool. I also came across FMOD's CEO casually posting the FSB header struct on a support forum (as you do):
struct FSB5_HEADER
;
Mapping that to Python is mostly an exercise in writing a struct.unpack format string:
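import struct

# f is the bank, opened in binary mode; the fixed header is 60 bytes
fsb5_header = f.read(60)
(
    id,
    sub_version,
    num_sub_sounds,
    header_chunk_size_bytes,
    names_chunk_size_bytes,
    data_chunk_size_bytes,
    data_format,
    data_format_version,
    mode,
    compatibility_hash,
    guid_data_1,
    guid_data_2,
    guid_data_3,
    guid_data_4,
# signed vs unsigned per field is an assumption; the sizes total 60 bytes
) = struct.unpack("<4sIiIIIIIIQIHH8s", fsb5_header)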
You'd be forgiven for thinking that string looks like hieroglyphics. Here's a quick overview covering what I used:
| Char | Meaning |
|---|---|
| < | little-endian byte order, standard sizes, no padding |
| 4s | four bytes |
| I | unsigned int (4 bytes) |
| i | int (4 bytes) |
| Q | unsigned long long (8 bytes) |
| H | unsigned short (2 bytes) |
| 8s | eight bytes |
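To make those format characters concrete, here is a small round trip over a synthetic 60-byte header. The format string is an assumption: it follows the field order of the tuple above and uses only the characters in the table, but the choice of signed versus unsigned for each 32-bit field is a guess, and every field value is made up.

```python
import struct

# Assumed layout: field order from the tuple above; signedness is a guess.
FSB5_HEADER = struct.Struct("<4sIiIIIIIIQIHH8s")

# "<" means standard sizes with no alignment padding, so the sizes just add up.
print(FSB5_HEADER.size)  # 60

# Build a fake header for a bank with two sub-sounds (all values invented).
blob = FSB5_HEADER.pack(
    b"FSB5",   # id
    1,         # sub_version
    2,         # num_sub_sounds
    16,        # header_chunk_size_bytes
    24,        # names_chunk_size_bytes
    64,        # data_chunk_size_bytes
    0, 0, 0,   # data_format, data_format_version, mode
    0,         # compatibility_hash
    0, 0, 0,   # guid_data_1, guid_data_2, guid_data_3
    bytes(8),  # guid_data_4
)

fields = FSB5_HEADER.unpack(blob)
print(fields[0], fields[2])  # b'FSB5' 2
```

Using a precompiled struct.Struct keeps the format string and its size in one place, which is convenient when the same layout is needed for both reading and writing.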
Immediately following that header are the sub-sound headers. Each one consists of a 64-bit integer, containing five aggressively bit-packed fields, optionally followed by a chain of extra fields:
| Bits | Field |
|---|---|
| 0 | Set if an extra field follows |
| 1-4 | Sample rate enum |
| 5 | Channel count minus 1 |
| 6-33 | Offset into the data chunk, in 16-byte units |
| 34-63 | Decoded sample count |
Each extra field is 32 bits laid out as follows, with data immediately following:
| Bits | Field |
|---|---|
| 0 | Set if another field follows |
| 1-24 | Data size in bytes |
| 25-31 | Data type |
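headers = []
data_offsets = []
for i in range(num_sub_sounds):
    # 8-byte base header, decoded as a 64-bit little-endian int
    headers.append(f.read(8))
    raw = int.from_bytes(headers[-1], "little")
    extra_field = raw & 1
    frequency = (raw >> 1) & 0xF
    channels = ((raw >> 5) & 1) + 1
    data_offset = ((raw >> 6) & 0xFFFFFFF) * 16
    samples = raw >> 34
    # remember the offset so the data chunk can be sliced up later
    data_offsets.append(data_offset)
    # walk any extra fields chained off this sub-sound
    while extra_field:
        field = f.read(4)
        raw = int.from_bytes(field, "little")
        extra_field = raw & 1
        data_size = (raw >> 1) & 0xFFFFFF
        data_type = raw >> 25
        # data isn't parsed, just kept alongside the header
        headers[-1] += field + f.read(data_size)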
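The base-header bit table translates directly into shifts and masks. As a sketch, the helpers below go in both directions, which is handy for checking a parser against itself. The names bits, unpack_base_header and pack_base_header are hypothetical, not FMOD's, and the values in the usage line are invented.

```python
def bits(raw: int, start: int, length: int) -> int:
    """Extract `length` bits from `raw`, starting at bit `start` (bit 0 = LSB)."""
    return (raw >> start) & ((1 << length) - 1)


def unpack_base_header(blob: bytes) -> dict:
    raw = int.from_bytes(blob[:8], "little")
    return {
        "extra_field": bits(raw, 0, 1),
        "frequency": bits(raw, 1, 4),          # sample rate enum, not Hz
        "channels": bits(raw, 5, 1) + 1,       # stored as count minus 1
        "data_offset": bits(raw, 6, 28) * 16,  # stored in 16-byte units
        "samples": bits(raw, 34, 30),
    }


def pack_base_header(extra_field, frequency, channels, data_offset, samples) -> bytes:
    if data_offset % 16:
        raise ValueError("data_offset must be a multiple of 16")
    raw = (
        extra_field
        | frequency << 1
        | (channels - 1) << 5
        | (data_offset // 16) << 6
        | samples << 34
    )
    return raw.to_bytes(8, "little")


header = pack_base_header(0, 9, 2, 4096, 48000)
print(unpack_base_header(header))
```

Packing and then unpacking should return the values that went in; if it doesn't, one of the shifts or masks disagrees with the table.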
Next up is the name chunk which, mercifully, is much simpler. First, an array of 32-bit offsets from the start of the chunk to the start of each name's string. Then, the strings themselves—null-terminated and tightly packed.
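names_chunk_start = f.tell()
name_offsets = []
names = []
for i in range(num_sub_sounds):
    name_offsets.append(int.from_bytes(f.read(4), "little"))
# pair each offset with the next so we know where each name ends
for a, b in zip(name_offsets, name_offsets[1:] + [names_chunk_size_bytes]):
    f.seek(names_chunk_start + a)
    # drop the null terminator (and any trailing padding) before decoding
    names.append(f.read(b - a).rstrip(b"\0").decode())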
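Going the other way, building a names chunk, is just as mechanical. A minimal sketch, assuming UTF-8 names and no padding after the last string (real banks may pad the chunk); build_names_chunk is a hypothetical helper and the names are invented:

```python
import struct


def build_names_chunk(names: list[str]) -> bytes:
    # Offset table first: one 32-bit little-endian offset per name,
    # measured from the start of the chunk.
    table_size = 4 * len(names)
    offsets, strings = [], b""
    for name in names:
        offsets.append(table_size + len(strings))
        strings += name.encode("utf-8") + b"\0"
    return struct.pack(f"<{len(names)}I", *offsets) + strings


chunk = build_names_chunk(["charge_loop", "charge_in"])
print(chunk[:8].hex())  # offsets 8 and 20 -> 0800000014000000
```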
And finally the data chunk. Encoded samples for each sub-sound aligned to 16 bytes, with the first on a 32-byte boundary. Not that we concern ourselves with any of that—we just grab everything leading up to the next sub-sound. As above, offsets are relative to the start of the chunk.
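# the data chunk starts right after the names chunk
f.seek(names_chunk_start + names_chunk_size_bytes)
data_chunk = f.read(data_chunk_size_bytes)
sub_sound_data = []
for start, end in zip(data_offsets, data_offsets[1:] + [data_chunk_size_bytes]):
    sub_sound_data.append(data_chunk[start:end])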
Having decomposed the existing bank's chunks into three separate arrays (headers, names, and sub_sound_data), I could trivially cherry-pick by name and construct a brand new .fsb file from scratch. Only one field eluded me…
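The cherry-picking step might look roughly like the sketch below. It assumes the three lists line up index for index, that each headers entry starts with the 8-byte base header, and that only the offset bits (6 to 33) need patching. cherry_pick and pad16 are hypothetical names and the sample data is invented.

```python
OFFSET_SHIFT = 6
OFFSET_MASK = (1 << 28) - 1  # bits 6-33 of the base header


def pad16(blob: bytes) -> bytes:
    """Pad with zero bytes up to the next multiple of 16."""
    return blob + bytes(-len(blob) % 16)


def cherry_pick(headers, names, sub_sound_data, wanted):
    new_headers, new_names, new_data = [], [], b""
    for header, name, data in zip(headers, names, sub_sound_data):
        if name not in wanted:
            continue
        # Point the base header at this sub-sound's new home in the data chunk.
        raw = int.from_bytes(header[:8], "little")
        raw &= ~(OFFSET_MASK << OFFSET_SHIFT)
        raw |= (len(new_data) // 16) << OFFSET_SHIFT
        new_headers.append(raw.to_bytes(8, "little") + header[8:])
        new_names.append(name)
        new_data += pad16(data)
    return new_headers, new_names, new_data


# Three fake sub-sounds whose base headers only carry an offset.
headers = [(off // 16 << OFFSET_SHIFT).to_bytes(8, "little") for off in (0, 16, 48)]
names = ["a", "b", "c"]
data = [b"A" * 16, b"B" * 32, b"C" * 20]

picked = cherry_pick(headers, names, data, {"a", "c"})
print(picked[1], len(picked[2]))  # ['a', 'c'] 48
```

Writing the file back out is then a matter of concatenating the new chunks behind a header whose count and chunk-size fields have been updated to match.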
Who Ordered the Mystery Hash
I recall trying a few things to get guid to fall out, but it didn't matter as ODST never checked it at runtime. I left it zeroed but it always bothered me. So continuing my run of using Claude to answer questions I've left unanswered, I decided to turn to the LLM once more to crack it. All it asked for was a copy of tool.exe (just the binary, not even a decompilation) and then it hit the jackpot:
fsb_header = guid = hashlib
The sub-sound header chunk first, then the FSB header with compatibilityHash and guid zeroed. Mystery solved.
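The ordering described above can be sketched as a helper that assembles the bytes to be hashed. The offsets assume the 60-byte layout implied by the field list earlier (nine 4-byte fields ahead of compatibilityHash, then 8 bytes of hash and 16 bytes of guid). guid_hash_input is a hypothetical name, and the sketch deliberately stops short of choosing a digest: it only builds the input.

```python
HASH_OFFSET = 36  # compatibilityHash: 8 bytes, after nine 4-byte fields (assumed)
HEADER_SIZE = 60  # guid occupies the final 16 bytes, 44..59 (assumed)


def guid_hash_input(fsb_header: bytes, sub_sound_headers: bytes) -> bytes:
    if len(fsb_header) != HEADER_SIZE:
        raise ValueError("expected a 60-byte FSB5 header")
    zeroed = bytearray(fsb_header)
    # compatibilityHash and guid sit back to back, so one slice clears both.
    zeroed[HASH_OFFSET:HEADER_SIZE] = bytes(HEADER_SIZE - HASH_OFFSET)
    # Sub-sound header chunk first, then the zeroed FSB header.
    return sub_sound_headers + bytes(zeroed)


# Invented inputs: a header of counting bytes and a fake 8-byte header chunk.
payload = guid_hash_input(bytes(range(60)), b"\xaa" * 8)
print(len(payload))  # 68
```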
