The Sims™ Technical Aspects

String Table Resource Format


The following information is not based on any proprietary knowledge or restricted documentation—it was entirely derived from observation, experiment, and public information, thus it may be inaccurate or incomplete.

Originally analyzed by Dave Baum. Documented by Greg Noel. Analysis of formats FEFF and FCFF by Greg Noel.

The string table format is a general way of storing strings. It is used in several resource types, and each resource type uses the strings for a different purpose.

Actually, there are several different formats for the table; the specific format used is determined by the first two bytes of the resource, which we'll term the format code.

String table - format 0
Offset Size Value
0 2 Number of strings (N)
2 var String 1
var var String 2
    . . .
var var String N

If the format code, when read as a big-endian number, is non-negative, then the format code is the number of strings in the table. We'll term this format 0.

The rest of the resource consists of N Pascal-style counted strings (one-byte length followed by data). That is, the length (L) is in the first byte and the next L bytes are the string. Note that a terminating null is not included in Pascal-style strings, so the string "foo" would be stored as four bytes (03 66 6F 6F).
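As a minimal sketch of the layout just described, the following Python reads a format 0 table. The function name parse_format0 is ours, not anything from the game, and the count is read as big-endian purely because the text above describes it that way; treat that as an assumption to check against real files.

```python
import struct

def parse_format0(data):
    """Read a format 0 table: a 2-byte count, then N Pascal-style strings."""
    # Assumption: the count is a signed big-endian value, per the description above.
    (count,) = struct.unpack_from(">h", data, 0)
    if count < 0:
        raise ValueError("negative format code: not a format 0 table")
    pos = 2
    strings = []
    for _ in range(count):
        length = data[pos]                      # one-byte length (L)
        pos += 1
        strings.append(data[pos:pos + length])  # the next L bytes, no terminator
        pos += length
    return strings

# One string, "foo", stored as 03 66 6F 6F.
table = bytes([0x00, 0x01, 0x03, 0x66, 0x6F, 0x6F])
print(parse_format0(table))  # [b'foo']
```

The strings are returned as raw bytes because the character encoding is not covered here.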

If the format code is negative, it identifies the interpretation of the resource. We'll use the value in hexadecimal (still big-endian) to describe the layouts below.
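The test on the format code can be sketched in a few lines of Python. The helper name format_code and its string return values are hypothetical conveniences for illustration; only the sign test and the hexadecimal naming come from the description above.

```python
import struct

def format_code(data):
    """Classify a resource by its first two bytes.

    Returns "0" for a non-negative code (which is then the string count),
    otherwise the two bytes in hexadecimal, e.g. "FDFF".
    """
    (code,) = struct.unpack_from(">h", data, 0)  # signed, big-endian
    if code >= 0:
        return "0"
    return data[:2].hex().upper()

print(format_code(b"\x00\x05"))  # 0
print(format_code(b"\xfd\xff"))  # FDFF
```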

String table - format FFFF
Offset Size Value
0 2 FF FF
2 2 Number of strings (N)
4 var String 1
var var String 2
    . . .
var var String N

The FFFF format allows the use of longer strings than format 0. The string is null-terminated so that it can be as long as desired.

The number of strings is a little-endian count of strings present.

Each string entry is a single null-terminated string. Thus, the string "foo" would be stored as four bytes (66 6F 6F 00).
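A reader for this layout might look like the sketch below. The name parse_ffff is ours, and the code assumes every string really is terminated; an unterminated string makes bytes.index raise ValueError.

```python
import struct

def parse_ffff(data):
    """Read an FFFF table: FF FF, a little-endian count, then N C-style strings."""
    if data[:2] != b"\xff\xff":
        raise ValueError("not an FFFF string table")
    (count,) = struct.unpack_from("<H", data, 2)
    pos = 4
    strings = []
    for _ in range(count):
        end = data.index(b"\x00", pos)  # locate the terminating null
        strings.append(data[pos:end])
        pos = end + 1                   # skip past the null
    return strings

# One string, "foo", stored as 66 6F 6F 00.
print(parse_ffff(b"\xff\xff\x01\x00" + bytes.fromhex("666F6F00")))  # [b'foo']
```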

String table - format FEFF
Offset Size Value
0 2 FE FF
2 2 Number of entries (N)
4 var String pair 1
var var String pair 2
    . . .
var var String pair N

The FEFF format has a pair of null-terminated strings instead of the single null-terminated string in the FFFF format.

The number of entries is a little-endian count of string pairs present.

Usually, the first string of the pair is the data and the second string is a comment. (As far as is known, comments aren't ever used by the game.) In many cases the comment string is empty, so it looks like the main string is terminated with two NULLs.

For example, the string "foo" with the comment "bar" would be stored as eight bytes (66 6F 6F 00 62 61 72 00). The string "foo" without a comment would be stored as five bytes (66 6F 6F 00 00).
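The two examples above can be checked with a short Python sketch. The helper names read_cstring and parse_feff are hypothetical, and the reading of the second string as a comment follows the usual case described above rather than anything the format itself guarantees.

```python
import struct

def read_cstring(data, pos):
    """Return (string, next position) for a null-terminated string at pos."""
    end = data.index(b"\x00", pos)
    return data[pos:end], end + 1

def parse_feff(data):
    """Read an FEFF table as a list of (string, comment) pairs."""
    if data[:2] != b"\xfe\xff":
        raise ValueError("not an FEFF string table")
    (count,) = struct.unpack_from("<H", data, 2)
    pos = 4
    pairs = []
    for _ in range(count):
        text, pos = read_cstring(data, pos)
        comment, pos = read_cstring(data, pos)
        pairs.append((text, comment))
    return pairs

# "foo" with comment "bar", then "foo" with an empty comment.
body = bytes.fromhex("666F6F0062617200") + bytes.fromhex("666F6F0000")
print(parse_feff(b"\xfe\xff\x02\x00" + body))
# [(b'foo', b'bar'), (b'foo', b'')]
```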

String table - format FDFF
Offset Size Value
0 2 FD FF
2 2 Number of entries (N)
4 var Code and string pair 1
var var Code and string pair 2
    . . .
var var Code and string pair N
var N Extra data

The FDFF format has a pair of strings like the FEFF format, but it also includes a one-byte value usually used as a language code.

The number of entries is a little-endian count of entries present.

The strings are null-terminated, so, for example, the string "foo" with language code one (English) and without a comment would be stored as six bytes (01 66 6F 6F 00 00).

In addition, there is some extra data at the end. In all files observed, the extra data is simply N bytes that all contain the value 0xA3 (163). It's possible that this is a second table containing a third set of strings, which just happen to be always empty.
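A sketch of a reader for this layout follows. The name parse_fdff is ours. Since the meaning of the extra data is unknown, the sketch makes no attempt to interpret it and simply returns whatever bytes remain after the last entry.

```python
import struct

def read_cstring(data, pos):
    """Return (string, next position) for a null-terminated string at pos."""
    end = data.index(b"\x00", pos)
    return data[pos:end], end + 1

def parse_fdff(data):
    """Read an FDFF table.

    Returns (entries, extra): entries is a list of (code, string, comment)
    tuples, and extra is the uninterpreted trailing data.
    """
    if data[:2] != b"\xfd\xff":
        raise ValueError("not an FDFF string table")
    (count,) = struct.unpack_from("<H", data, 2)
    pos = 4
    entries = []
    for _ in range(count):
        code = data[pos]  # one-byte value, usually a language code
        pos += 1
        text, pos = read_cstring(data, pos)
        comment, pos = read_cstring(data, pos)
        entries.append((code, text, comment))
    return entries, data[pos:]

# "foo", language code one, no comment, followed by one byte of extra data.
sample = b"\xfd\xff\x01\x00" + bytes.fromhex("01666F6F0000") + b"\xa3"
print(parse_fdff(sample))  # ([(1, b'foo', b'')], b'\xa3')
```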

String table - format FCFF
Offset Size Value
0 2 FC FF
2 1 Number of sets (N)
3 var String set 1
var var String set 2
    . . .
var var String set N

The FCFF format is essentially a reaction to the short string limit in format 0 (Pascal-style counted strings are limited to a maximum of 255 bytes) and to the inherent slowness of null-terminated strings in the other formats (which effectively require two passes over the data). This format uses fast counted strings that can be up to 32K in length.

This format first appeared in The Sims Online™ and has not been seen in any expansion pack.

The number of sets is invariably twenty, one for each supported language. Even strings that aren't translated use a value of twenty; the first string set is filled in and the remaining nineteen string sets are empty.

String set
Offset Size Value
0 2 Number of entries (N)
2 var String entry 1
var var String entry 2
    . . .
var var String entry N

The number of entries is a little-endian count of entries present. Note that this field is unaligned in this format, which is so unusual that it may imply a failure in the analysis; fields in game files are almost never unaligned.

A string entry consists of a code byte and two strings. As with format FDFF, the code byte is usually used as a language code. For some reason, the code is one lower than the corresponding code in the other formats; a code of 02 is for French, which is language three. The reason for this difference is unknown.

The string length is either one or two bytes and has a dynamic range of fifteen bits. If the string length is less than 128, the value is encoded in one byte. If the string length is greater than 127, the value is coded in two bytes with the 0x80 bit set on the first byte; the first byte has the low-order seven bits and the second byte has the high-order eight bits.
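The length encoding can be written out as a pair of small Python functions. The names decode_length and encode_length are hypothetical; the bit layout is taken directly from the paragraph above, and like the rest of this analysis it may not cover every case found in real files.

```python
def decode_length(data, pos):
    """Return (length, next position) for a one- or two-byte length at pos."""
    first = data[pos]
    if first < 0x80:
        return first, pos + 1              # lengths 0..127 fit in one byte
    low = first & 0x7F                     # low-order seven bits
    high = data[pos + 1]                   # high-order eight bits
    return (high << 7) | low, pos + 2

def encode_length(n):
    """Encode a length in the range 0..32767 as one or two bytes."""
    if not 0 <= n < 0x8000:
        raise ValueError("length does not fit in fifteen bits")
    if n < 0x80:
        return bytes([n])
    return bytes([0x80 | (n & 0x7F), n >> 7])

print(encode_length(3).hex())              # 03
print(encode_length(643).hex())            # 8305
print(decode_length(b"\x83\x05", 0))       # (643, 2)
```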

For example, the string "foo" with language code one (English) and without a comment would be stored as six bytes (00 03 66 6F 6F 00). And a 643-character (5*128 + 3) German string with no comment would be stored as 647 bytes (03 83 05 . . . 00 where the . . . would contain the string).
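Putting the pieces of the FCFF description together gives a reader along the following lines. This is only a sketch under the layout described above: the names read_counted and parse_fcff are ours, and the code byte is returned exactly as stored, without adding one to match the language numbering of the other formats.

```python
import struct

def read_counted(data, pos):
    """Return (string, next position) for a length-prefixed string at pos."""
    length = data[pos]
    pos += 1
    if length & 0x80:                      # two-byte length
        length = (length & 0x7F) | (data[pos] << 7)
        pos += 1
    return data[pos:pos + length], pos + length

def parse_fcff(data):
    """Read an FCFF table as a list of string sets.

    Each set is a list of (code, string, comment) tuples.
    """
    if data[:2] != b"\xfc\xff":
        raise ValueError("not an FCFF string table")
    set_count = data[2]                    # one byte, so the sets start at offset 3
    pos = 3
    sets = []
    for _ in range(set_count):
        (entry_count,) = struct.unpack_from("<H", data, pos)
        pos += 2
        entries = []
        for _ in range(entry_count):
            code = data[pos]
            pos += 1
            text, pos = read_counted(data, pos)
            comment, pos = read_counted(data, pos)
            entries.append((code, text, comment))
        sets.append(entries)
    return sets

# Two sets: the first holds "foo" with code 00 and no comment, the second is empty.
sample = b"\xfc\xff\x02" + b"\x01\x00" + bytes.fromhex("0003666F6F00") + b"\x00\x00"
print(parse_fcff(sample))  # [[(0, b'foo', b'')], []]
```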


Reminder: This information is not based on any proprietary knowledge or restricted documentation—it was entirely derived from observation, experiment, and public information, thus it may be inaccurate or incomplete.

Copyright © 2001-2008 Dave Baum and Greg Noel. All rights reserved.
The Sims™ is a trademark of Maxis and Electronic Arts.
This page was last modified Sunday, 10-Nov-2002 06:11:03 UTC.