Game++. String Interning
String interning (also called a "string pool") is an optimization technique where only a single copy of each unique string is stored in memory. It's one of the most useful optimizations in game engines, alongside SWAR, SIMD strings, immutable strings, StrHash, and Rope strings.
The Core Problem
When the same string literal (e.g., "black") appears multiple times in code, whether duplicate copies exist in RAM depends on the compiler and vendor. Clang (Sony) and GCC can merge identical literals so only one copy is loaded. However, the C++ standard does not guarantee this: whether identical literals share storage is unspecified, so merging is an optional compiler optimization rather than something code can rely on. On Xbox, for instance, two identical literals may have different addresses.
How Interning Breaks
The optimization is fragile:
- const char*: may share one copy (compiler-dependent)
- char[]: creates a separate RAM copy per variable
- std::string: the constructor copies the literal
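The three cases can be checked directly by comparing addresses. The following is a minimal sketch; every identifier in it (kPtrA, arrays_share_storage, and so on) is hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// All names below are hypothetical and exist only for illustration.
static const char* const kPtrA = "black";   // pointer to a literal
static const char* const kPtrB = "black";   // may or may not alias kPtrA
static const char kArrA[] = "black";        // array: its own copy of the bytes
static const char kArrB[] = "black";        // a second, distinct copy

// Unspecified result: true only if the compiler merged the two literals.
inline bool pointers_share_storage() { return kPtrA == kPtrB; }

// Always false: two distinct array objects never occupy the same address.
inline bool arrays_share_storage() { return &kArrA[0] == &kArrB[0]; }

// Always false: std::string copies the characters into its own buffer.
inline bool string_shares_storage() {
    const std::string s = "black";
    return s.data() == kPtrA;
}

inline void check_storage() {
    assert(!arrays_share_storage());
    assert(!string_shares_storage());
    assert(std::strcmp(kPtrA, kArrA) == 0);  // contents are equal regardless
}
```

Only pointers_share_storage() has a result that varies between toolchains, which is why it is left out of the assertions.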
In managed languages like C# or Java, "string interning happens at runtime" via the VM. In C++, there's no runtime to do this automatically — it only happens at compile time, or must be implemented manually.
String Comparison Pitfalls
Using == on char* compares addresses, not content. If the compiler merged the literals, it works by coincidence. If not, it silently fails. The author reports catching six bugs in 2024 from pointer-based string comparisons. The correct approach is strcmp(), which compares character by character.
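As a small illustration of the difference, the sketch below compares a runtime buffer with a literal both ways. The helper name same_text is an assumption made for this example, not something from the original code base.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: compares content, and tolerates null pointers.
inline bool same_text(const char* a, const char* b) {
    if (a == b) return true;        // same address, or both null
    if (!a || !b) return false;     // exactly one is null
    return std::strcmp(a, b) == 0;  // character-by-character comparison
}

inline void check_compare() {
    char buffer[] = "black";        // writable copy, distinct from any literal
    const char* literal = "black";
    assert(buffer != literal);            // address comparison: not the same object
    assert(same_text(buffer, literal));   // content comparison: equal
    assert(!same_text(buffer, "white"));
}
```

The address check inside same_text is only a fast path; correctness comes from the strcmp call.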
Memory Fragmentation
From experience with Unity, string data accounted for 3–6% of a debug build's memory footprint, with ~3% from fragmentation alone. Average string sizes of 14–40 bytes left many small unusable gaps. On a 1 GB iPhone 5S, this meant 30–60 MB of wasted "free memory" — significant enough to warrant optimization.
In release builds, actual string data can be stripped, leaving only hashes (4 bytes each). This reduces that 6% footprint to under 1%, and adds a layer of protection against resource tampering.
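One way to end up with only a 4-byte value per string is to hash literals at compile time. The sketch below uses 32-bit FNV-1a purely as a stand-in, since the text does not show the hash used for stripping; the names fnv1a_32 and kAimWalk are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit FNV-1a, used here as a stand-in hash (an assumption for this sketch).
// constexpr lets the compiler evaluate it for literals (requires C++14).
constexpr std::uint32_t fnv1a_32(const char* s) {
    std::uint32_t h = 2166136261u;            // FNV offset basis
    while (*s) {
        h ^= static_cast<std::uint8_t>(*s++); // mix in one byte
        h *= 16777619u;                       // FNV prime
    }
    return h;
}

// Hypothetical event ID: computed during compilation from the literal.
constexpr std::uint32_t kAimWalk = fnv1a_32("aim_walk");

static_assert(sizeof(kAimWalk) == 4, "an ID occupies four bytes");
static_assert(fnv1a_32("") == 2166136261u, "empty string yields the offset basis");
```

A 32-bit hash can collide, so a scheme like this would normally verify at build time that all known strings map to distinct IDs.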
The xstring Implementation
The solution is a lookup table mapping hash IDs to strings:
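#include <cstdint>
#include <string>
#include <EASTL/hash_map.h>

namespace utils {
    // Intern table: 32-bit hash ID -> owned string data.
    struct xstrings { eastl::hash_map< uint32_t, std::string > _interns; };
    namespace strings {
        uint32_t Intern( xstrings &strs, const char *str );     // register str, return its ID
        const char* GetString( xstrings &strs, uint32_t id );   // look up the text for an ID
        bool LoadFromDisk( xstrings &strs, const char *path );  // populate the table from a file
    }
}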
The xstring_value struct stores CRC, reference count, length, and the string data. The xstring class itself is essentially a POD int32_t, so all comparisons reduce to integer comparisons.
Full implementation: src/core/xstring.h
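To make the lookup-table idea concrete, here is a minimal sketch of how Intern and GetString could behave. It is not the implementation from xstring.h: it swaps eastl::hash_map for std::unordered_map so it compiles without EASTL, uses FNV-1a instead of CRC, omits reference counting and LoadFromDisk, and lives in a hypothetical namespace sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

namespace sketch {  // hypothetical namespace, not the author's code

struct xstrings {
    std::unordered_map<std::uint32_t, std::string> interns;  // ID -> text
};

// Stand-in hash (FNV-1a); the original stores a CRC instead.
inline std::uint32_t hash32(const char* s) {
    std::uint32_t h = 2166136261u;
    while (*s) { h ^= static_cast<std::uint8_t>(*s++); h *= 16777619u; }
    return h;
}

// Stores one copy of the text per ID. Hash collisions are NOT handled here.
inline std::uint32_t Intern(xstrings& strs, const char* str) {
    const std::uint32_t id = hash32(str);
    strs.interns.emplace(id, str);  // no-op when the ID is already present
    return id;
}

// Returns the stored text, or nullptr for an unknown ID.
inline const char* GetString(const xstrings& strs, std::uint32_t id) {
    const auto it = strs.interns.find(id);
    return it != strs.interns.end() ? it->second.c_str() : nullptr;
}

// Thin handle: comparing two strings becomes comparing two integers.
struct xstring {
    std::uint32_t id;
    friend bool operator==(xstring a, xstring b) { return a.id == b.id; }
    friend bool operator!=(xstring a, xstring b) { return a.id != b.id; }
};

inline void check_intern() {
    xstrings table;
    const xstring a{Intern(table, "aim_walk")};
    const std::string built = std::string("aim_") + "walk";  // runtime string
    const xstring b{Intern(table, built.c_str())};
    assert(a == b);                      // same text, same ID
    assert(table.interns.size() == 1);   // only one copy stored
}

}  // namespace sketch
```

Because the handle is a single integer, equality checks such as tt.tag == events().aim_walk cost one integer comparison regardless of string length.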
Real-World Impact
The author describes integrating this into Unity 4, where it was combined with other optimizations to run Sims Mobile on the iPhone 4S — a device the overseas team considered impossible to target. String interning alone provided "a performance boost of nearly 30% for animations."
Typical Usage Pattern
struct time_tag {
    float12 time;
    xstring tag;
};
void ai_unit::on_event(const time_tag &tt) {
    if (tt.tag == events().aim_walk) {
        ai_start_walk(tt);
    } else if (tt.tag == events().aim_turn) {
        ai_start_turn(tt);
    }
    ai_unit_base::on_event(tt);
}
Further Reading
- foonathan/string_id — C++ library for unique string identifiers via interning
- Understanding String Interning — detailed explanation of interning trade-offs
- libstringintern — C++ library for string interning and memory management