By 2014 the Unity engine had accumulated so many breaking changes and new features that Unity 5 was effectively a different engine. Many users, looking at the same familiar facade, barely noticed, but the changes touched every component of the engine, from the file system to the renderer. EA's St. Petersburg office kept its own branch of the main repository, lagging behind master by a month at most. I've already written about the various implementations and kinds of strings in game engines, but Unity had its own implementation, with both upsides and downsides, that was used in practically every subsystem. People got used to it and knew its weak spots, the bad use cases and the good practices. So when this system started being ripped out of the engine, a lot broke. Ordinary users jumped straight to the new version and saw only echoes of the storm, while those of us with direct access to the source caught a lot of fun bugs.
The engine implemented COW (copy-on-write) strings, which were both fashionable and convenient at the time. Fashionable, because Qt and GCC had their own implementations and were pushing them into the standard, which thankfully didn't happen. Convenient, because creating and copying such strings required almost no allocations.
The main difference from the usual implementation of this mechanism in Qt and GCC was partial sharing of the data: given two strings "abcde" and "abc", the second referenced the buffer of the first but carried its own length. When we profiled a level in Sims Mobile, there were about 3k string allocations at startup, and after that roughly one new string allocation every 40–50 frames, effectively once a second. Nearly all creations and copies of strings were absorbed by this system. To appreciate how good that was: a similar level on PC, in an internal tech demo on a fresh UE4, produced close to 200 allocations per frame on strings alone. Every frame! A not-so-fresh iPhone 5 simply keeled over trying to digest all that on Unreal.
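The partial sharing described above can be pictured with a small sketch. This is not the engine's code: `SharedPrefixString`, `prefix`, `str` and `owners` are hypothetical names, and the sketch assumes the shorter string is always a prefix of the longer one.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration type; the engine's real class is not shown here.
class SharedPrefixString
{
    std::shared_ptr<const std::string> buffer_;  // characters, shared between instances
    std::size_t length_;                         // how many of them this instance exposes

public:
    explicit SharedPrefixString(std::string text)
        : buffer_(std::make_shared<std::string>(std::move(text))),
          length_(buffer_->size())
    {}

    // A shorter string over the same buffer: copies a pointer and a length, no characters.
    SharedPrefixString prefix(std::size_t n) const
    {
        assert(n <= length_);
        SharedPrefixString result(*this);
        result.length_ = n;
        return result;
    }

    const char* data() const { return buffer_->data(); }
    std::size_t length() const { return length_; }
    long owners() const { return buffer_.use_count(); }

    // Materializes the visible characters; this copies, so it is for inspection only.
    std::string str() const { return buffer_->substr(0, length_); }
};
```

Taking `prefix(3)` of a string built from "abcde" gives a second instance whose `data()` pointer is the same as the first one's, with only the stored length differing.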
Why COW
The core idea of COW is to share one data buffer between different string instances and to copy it only when the data in a particular instance changes. The main cost of such an implementation is the extra indirection when accessing string values. Judging by the commit history, Unity had COW strings from the very first version. Rumor had it that Joachim Ante himself (the company's CTO) designed and wrote this class, and the engine's entire localization system for that matter. The first commits with the implementation really were dated 2006–2007, but they carried no authorship, so I'm passing the story on as I heard it.
Why it was removed
The reason was the start of the engine's rewrite to C++11, the partial migration of new code to std::string, and the serious mismatch that emerged between the design of std::string and the in-house COW implementation. As the standard library was used more and more in the engine, people in places started treating COW strings as if they were const char* and passing them around as raw data. In effect, you pulled a raw pointer out of a shared_ptr and worked with it while the smart pointer went on living its own life. When it would fall over was only a matter of a few frames.
A COW string has two possible states: exclusive ownership of the buffer, or shared use of the buffer with other COW strings. Assignment and copy operations can move it into the shared state and back. But before performing a "write" operation you have to make sure the string is in the owning state, and that transition leads to creating a new copy and copying the contents of the parent's data buffer into a new, exclusively used buffer.
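The two states and the transitions between them can be sketched roughly as follows. The names `CowText`, `detach`, `isShared` and `setChar` are hypothetical and chosen for this illustration only. The sketch relies on `shared_ptr::use_count()` and is meant for single-threaded use.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical illustration type, not the engine's class.
class CowText
{
    std::shared_ptr<std::string> buffer_;

    // Shared -> owning: clone the buffer so this instance is its only user.
    void detach()
    {
        if (buffer_.use_count() > 1)
            buffer_ = std::make_shared<std::string>(*buffer_);
    }

public:
    explicit CowText(const std::string& text)
        : buffer_(std::make_shared<std::string>(text))
    {}

    // Copying and assignment use the implicit operations: owning -> shared.
    bool isShared() const { return buffer_.use_count() > 1; }

    // "Read": works in either state and never copies.
    const std::string& read() const { return *buffer_; }

    // "Write": the string must be in the owning state first.
    void setChar(std::size_t i, char c)
    {
        assert(i < buffer_->size());
        detach();
        (*buffer_)[i] = c;
    }
};
```

Copying a `CowText` puts both instances into the shared state. The first `setChar` on either of them moves both back to exclusive ownership, because the other instance is then the only user of the old buffer.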
In a string intended for COW, any operation is either non-modifying ("read") or directly modifying ("write"). This makes it easy to determine whether the string needs to be moved into the owning state before performing the operation. In std::string, however, references, pointers and iterators to mutable content are handed out more freely, because every string is in a state of exclusive buffer ownership, to put it in COW-string terms. Even simply indexing values in a non-const string (s[i]) returns a reference that can be used to modify the string.
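To make the point about std::string concrete, here is a short sketch. The `static_assert` lines check the return types of `operator[]` for the default `std::string`. The helper `readThenOverwriteFirst` is a hypothetical name used only to show that the same indexing expression a reader would write also permits a write.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Indexing a non-const std::string yields a mutable reference...
static_assert(std::is_same<decltype(std::declval<std::string&>()[0]), char&>::value,
              "non-const operator[] returns char&");
// ...while indexing a const std::string yields a read-only one.
static_assert(std::is_same<decltype(std::declval<const std::string&>()[0]), const char&>::value,
              "const operator[] returns const char&");

// Hypothetical helper: looks like a read at the call site, yet it can write.
char readThenOverwriteFirst(std::string& s, char replacement)
{
    assert(!s.empty());
    char& first = s[0];    // the same expression a pure reader would use
    const char old = first;
    first = replacement;   // the reference silently allows modification
    return old;
}
```

A COW string cannot tell these two uses apart at the moment `s[0]` is evaluated, which is why it has to assume the worst.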
Therefore, for a non-const std::string every such operation can effectively be considered a "write" operation and must be treated as such in a COW implementation. The example below gives the basic code of the class that was used in the engine; I won't touch on the problems of initialization from literals. This code shows how assignment and copying were reduced to almost nothing:
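    #include <cstddef>
    #include <iostream>
    #include <memory>
    #include <vector>

    using namespace std;

    using C_str = const char*;
    using C_ref = const char&;
    using USize = size_t;   // assumed alias, not shown in the original listing

    // Assumed helper, not shown in the original listing: a named raw array type.
    template< USize n, class Item >
    using Raw_array_of_ = Item[n];

    namespace uengine
    {
        class UString
        {
            using Buffer = vector<char>;

            shared_ptr<Buffer> m_buffer;   // possibly shared with other strings
            USize m_length;

            // Moves the string into the owning state before a "write".
            void ensureIsOwning()
            {
                if( m_buffer.use_count() > 1 )
                {
                    m_buffer = make_shared<Buffer>( *m_buffer );
                }
            }

        public:
            // Hands out a raw pointer into the (possibly shared) buffer.
            C_str c_str() const
            {
                return m_buffer->data();
            }

            USize length() const
            {
                return m_length;
            }

            // "Read": never copies.
            C_ref operator[]( const USize i ) const
            {
                return (*m_buffer)[i];
            }

            // Counts as a "write" even if the caller only reads: may copy.
            char& operator[]( const USize i )
            {
                ensureIsOwning();
                return (*m_buffer)[i];
            }

            // Construction from a string literal; n includes the terminating '\0'.
            template< USize n >
            UString( Raw_array_of_<n, const char>& literal ):
                m_buffer( make_shared<Buffer>( literal, literal + n ) ),
                m_length( n - 1 )
            {}
        };
    }

    using uengine::UString;   // the examples refer to the class unqualified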
Here the default assignment operator is used, which simply copies m_buffer and m_length. Copy initialization works exactly the same way. Now let's look at an example of correct use of such strings:
    int main()
    {
        UString str = "Unreal the best engine ever!";
        C_str cstr = str.c_str();

        // contents of `str` are not modified.
        {
            const char first_char = str[0];
            auto ignore = first_char;
        }

        cout << cstr << endl;
    }
Execution build compiler returned: 0
Program returned: 0
Unreal the best engine ever!
The COW string is in the owning state, and initializing first_char simply copies the character's value, so all is well. But if a developer accidentally adds a logical copy of the string without changing its value, as happened all the time when working with std::string, the problems begin:
    int main()
    {
        UString str = "Unreal the best engine ever!";
        C_str cstr = str.c_str();

        // contents of `str` are not modified.
        {
            UString other = str;
            // .... some work
            const char first_char = str[0];
            auto ignore = first_char;
            // .... some work
        }

        cout << cstr << endl; //! Undefined behavior, cstr is dangling.
    }
Execution build compiler returned: 0
Program returned: 0
r({!4uCM&&V^Pt58>~:@|~jk0r/N|YRTM1Fg*&8q#VSyBv6D5/
Because str is in the shared state, the COW principle forces the str[0] operation to copy the shared buffer in order to move str into the owning state. Then, at the end of the block, the only remaining owner of the original buffer, the string other, is destroyed and takes the buffer with it, which leaves the cstr pointer dangling. This example is close to the real cases we caught by the dozen during the transition period. The strangest ones were when std::string and UString were mixed and part of the data stayed on the stack: for a while it was still accessible, and at some point it turned into garbage. In the end the editor would think for a bit, produce something like the screenshot below, and crash without dumps.
Godbolt (error example)
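For this particular pattern, the copy is triggered only because the non-const overload of operator[] is selected. Below is a minimal sketch of reading through a const reference instead. `MiniCow` and `readFirst` are hypothetical names, and this is a workaround I am illustrating here, not something the engine enforced.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the engine's string, reduced to what the bug needs.
class MiniCow
{
    std::shared_ptr<std::vector<char>> buffer_;

public:
    explicit MiniCow(const char* text)
        : buffer_(std::make_shared<std::vector<char>>())
    {
        assert(text != nullptr);
        for (const char* p = text; *p != '\0'; ++p)
            buffer_->push_back(*p);
        buffer_->push_back('\0');
    }

    const char* c_str() const { return buffer_->data(); }

    // Const access: a pure read, never copies.
    const char& operator[](std::size_t i) const { return (*buffer_)[i]; }

    // Non-const access: treated as a write, so it unshares first.
    char& operator[](std::size_t i)
    {
        if (buffer_.use_count() > 1)
            buffer_ = std::make_shared<std::vector<char>>(*buffer_);
        return (*buffer_)[i];
    }
};

// Reads through a const reference, so the const overload is selected.
char readFirst(const MiniCow& s) { return s[0]; }
```

While a second instance shares the buffer, `readFirst(str)` leaves `str.c_str()` unchanged, whereas a plain `str[0]` on the non-const object moves `str` to a fresh buffer. The sketch relies on every caller remembering to go through a const path, which is exactly the kind of discipline that broke down in practice.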
This was treated as programmer error and ignorance of the engine's basics, but in fact the type was simply badly designed, which is exactly what made it so easy to misuse. A proper fix for the cases above would have required moving into the owning state on any access to a string element. That means copying the string's data every time a reference, pointer or iterator is handed out, regardless of the string's constness. Attempts to do this in the engine made all the upsides of the mechanism disappear and left only the downsides: the need to maintain a class and a set of algorithms that were far from simple to implement, and the skill of working with that class carefully.
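Here is a rough sketch of what such a "safe" variant costs. `EagerCow` and its `copies` counter are hypothetical and exist only for this illustration. The `mutable` members are an assumption made so that even a const `c_str()` can unshare.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical "safe" variant: unshares before handing out any pointer.
class EagerCow
{
    mutable std::shared_ptr<std::string> buffer_;
    mutable int copies_ = 0;  // instrumentation for this sketch only

    void ensureIsOwning() const
    {
        if (buffer_.use_count() > 1)
        {
            buffer_ = std::make_shared<std::string>(*buffer_);
            ++copies_;
        }
    }

public:
    explicit EagerCow(const std::string& text)
        : buffer_(std::make_shared<std::string>(text))
    {}

    // Even const access pays for a full copy when the buffer is shared.
    const char* c_str() const
    {
        ensureIsOwning();
        return buffer_->c_str();
    }

    int copies() const { return copies_; }
};
```

The pointer returned by `c_str()` now survives the destruction of other instances, but every shared string that is merely looked at gets duplicated, so the savings on copies are gone.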
Somewhere after 4.3 and closer to 4.6 the tech leads admitted that the maintenance cost had become too high, and the remaining advantages too small, to keep supporting their own COW-string implementation in the engine. And by then string_view and a cheap small-string implementation had arrived in the main compilers anyway.
On threads
You can probably recall a fairly widespread misconception that COW strings worked poorly with threads, or were inefficient there: with this approach an ordinary string copy didn't produce an actual copy, so another thread could get free access to the data and change it independently of the main one.
To let string instances be used from different threads while still sharing a buffer, almost every access function, including simple indexing via [], would have to take a mutex.
The engine took a simpler route: the assignment operator checked the current thread's index, and if it didn't match, a new copy of the string was created. This caused some inconvenience, of course, but such cases were fairly rare, and I can't recall a single bug related to it.
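The thread check can be sketched along these lines. The engine's real code is not shown in this article, so everything here is an assumption: the name `ThreadAwareCow`, the use of `std::thread::id` in place of the engine's thread index, and the `copyOnThread` hook, which exists only so the behavior can be exercised without spawning a thread.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <thread>

// Hypothetical sketch; the engine's real check is not shown in the text.
class ThreadAwareCow
{
    std::shared_ptr<std::string> buffer_;
    std::thread::id owner_;  // thread on which this instance was created

    // Copies `other` as if the copy were being made on thread `current`.
    ThreadAwareCow(const ThreadAwareCow& other, std::thread::id current)
        : owner_(current)
    {
        if (other.owner_ == current)
            buffer_ = other.buffer_;                                   // same thread: share
        else
            buffer_ = std::make_shared<std::string>(*other.buffer_);  // other thread: deep copy
    }

public:
    explicit ThreadAwareCow(const std::string& text)
        : buffer_(std::make_shared<std::string>(text)),
          owner_(std::this_thread::get_id())
    {}

    ThreadAwareCow(const ThreadAwareCow& other)
        : ThreadAwareCow(other, std::this_thread::get_id())
    {}

    ThreadAwareCow& operator=(const ThreadAwareCow& other)
    {
        ThreadAwareCow copy(other);  // applies the same thread check
        buffer_ = copy.buffer_;
        owner_ = copy.owner_;
        return *this;
    }

    // Test hook for this sketch only: copy as if running on thread `current`.
    static ThreadAwareCow copyOnThread(const ThreadAwareCow& other, std::thread::id current)
    {
        return ThreadAwareCow(other, current);
    }

    bool sharesBufferWith(const ThreadAwareCow& other) const { return buffer_ == other.buffer_; }
    const std::string& read() const { return *buffer_; }
};
```

A copy made on the creating thread shares the buffer, while a copy made under any other thread id gets its own. The sketch only shows where the check sits. It does not by itself make concurrent access to one instance safe.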
Immutable strings
This data type showed itself best on immutable strings, such as string hashes, identifiers and keys, which made up the overwhelming majority in the engine's code. These are strings that aren't expected to undergo operations that change their data. They can still be assigned, but you can't directly change a string's data, for example replace "H" with "B" in the word "Hurry." The engine's COW strings supported amortized constant-time initialization from string literals, with a hash key used for comparison operations, and various constant-time substring operations, for instance when a string served as a key in a map. This was probably the biggest plus of such COW strings: no string-comparison operations when searching in an array or a map. With Unity 5, development began moving away from home-grown wheels and custom solutions, even where that led to lower performance and higher memory use, as with the standard library's containers. Now the engine relies on the standard library entirely.
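The hash-key idea for immutable strings can be illustrated with a small sketch. `HashedKey`, `HashedKeyHasher` and `KeyMap` are hypothetical names, and `std::hash` stands in for whatever hash the engine used. One deliberate difference from what is described above: on equal hashes this sketch still compares the text, so that it stays correct in the presence of collisions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical immutable key type: the hash is computed once, at construction.
class HashedKey
{
    std::string text_;
    std::size_t hash_;

public:
    explicit HashedKey(std::string text)
        : text_(std::move(text)),
          hash_(std::hash<std::string>()(text_))
    {}

    std::size_t hash() const { return hash_; }
    const std::string& text() const { return text_; }

    // Different hashes settle the comparison without touching the characters.
    // Assumption of this sketch: equal hashes still fall back to a text compare.
    friend bool operator==(const HashedKey& a, const HashedKey& b)
    {
        return a.hash_ == b.hash_ && a.text_ == b.text_;
    }
};

// Lets unordered containers reuse the cached hash instead of rehashing the text.
struct HashedKeyHasher
{
    std::size_t operator()(const HashedKey& key) const { return key.hash(); }
};

using KeyMap = std::unordered_map<HashedKey, int, HashedKeyHasher>;
```

With keys like these, a lookup in `KeyMap` hashes nothing at lookup time and compares characters only when two cached hashes happen to match.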
P.S. Since 2017 I no longer take part in the engine's development, but the adopted course toward unifying software solutions has hardly changed all that much.
Thanks for reading!