C++101 (Part 3) — dalerank

C++101 — a four-part catalog of C++ idioms and techniques: Part 1 · Part 2 · Part 3 · Part 4

Contents of this part

Calling Virtuals During Initialization
Construction Tracker
Attach by Initialization
Non-Virtual Interface (NVI)
Thread-Safe Interface
Acyclic Visitor Pattern (tricky)
Capability Query
Covariant Return Types
Virtual Friend Function
Fake Vtable
Algebraic Hierarchy
Polymorphic Exception
Polymorphic Value Types
Hierarchy Generation
Function Object (Functor)
Object Generator
Object Template
Iterator Pair
Generic Container
Erase-Remove
Clear-and-minimize
Shrink-to-fit
Safe bool
Type Safe Enum
Attorney-Client

Calling Virtuals During Initialization

This is a set of workarounds around one of the nastiest traps in C++: calling a virtual function from a constructor (or destructor) when it doesn't do what you expect. While the base class is being constructed, the object is still only "a base," the derived part isn't constructed yet, so a virtual call from the base's constructor goes to the base's implementation rather than the derived class's, even if you're creating an object of the derived type. In the destructor it's the mirror image, of course: the derived part is already destroyed, so the call goes to the base.

The language's reasoning here is that calling an override in a derived class whose fields aren't yet initialized (or are already destroyed) would mean operating on garbage. So the standard deliberately "demotes" the object's dynamic type down to the class currently being constructed, but for a programmer coming from Java or C#, where a virtual call from a constructor goes to the most-derived class (with its own, far worse problems), this is a surprise.
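To make the "demotion" visible, here is a minimal sketch. The names Base, Derived, who() and the log string are invented for the illustration; the only thing being shown is which override each call resolves to.

```cpp
#include <cassert>
#include <string>

// Hypothetical types, for illustration only.
struct Base {
    std::string log;                       // records which override actually ran
    Base() { log += who(); }               // virtual call inside the constructor
    virtual ~Base() = default;
    virtual const char* who() const { return "Base"; }
    void init() { log += who(); }          // the same call, after construction has finished
};

struct Derived : Base {
    const char* who() const override { return "Derived"; }
};

// Builds a Derived and returns what the two calls resolved to.
std::string trace_calls() {
    Derived d;            // Base() runs first: inside it the object is still "a Base"
    d.log += "/";
    d.init();             // by now the dynamic type is Derived
    assert(d.log == "Base/Derived");
    return d.log;
}
```

The constructor call lands in Base::who even though a Derived is being created, while the identical call from init() lands in Derived::who.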

There are several workarounds for cases where you really do need polymorphic behavior at creation time. The most common one, "correct" only in the sense that nothing better exists, is two-phase initialization: the constructor creates an "empty" object, and a separate init() method (itself non-virtual, but free to call virtuals) runs only after construction has finished, when the object has its final dynamic type.

The second option is to pass the desired behavior as a parameter or via a factory rather than relying on a virtual call from the constructor. Two-phase initialization on its own has a real cost: it breaks RAII (the object exists but isn't ready yet) and demands the discipline of "don't forget to call init," which is itself a source of new bugs. A factory that constructs and initializes in one call at least hides the second phase from the caller.

The problem and its workarounds were dissected in detail by Scott Meyers in Effective C++ (a dedicated item, "never call virtual functions during construction or destruction"), and this is one of those places where C++'s rules are strictly logical but counterintuitive, and knowing the idiom saves hours spent in the debugger.

struct Widget {
    Widget() { /* do NOT call virtual draw() here — it goes to Widget::draw */ }
    virtual void draw() { /* base */ }
    void init() { draw(); }   // safe: the object is already fully constructed
};
struct Button : Widget { void draw() override { /* button */ } };

// Two-phase initialization to get polymorphism at "creation" time:
auto b = std::make_unique<Button>();
b->init();   // now draw() goes to Button::draw
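One way to keep two-phase initialization from leaking to the caller is to hide both phases behind a factory. This is only a sketch under my own assumptions: create(), describe() and label() are hypothetical names, and the constructors are made non-public so that nobody can obtain a half-initialized object.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class Widget {
public:
    virtual ~Widget() = default;
    const std::string& label() const { return label_; }

    // The only way to obtain a widget: construct it, then run the polymorphic phase.
    template <class T, class... Args>
    static std::unique_ptr<T> create(Args&&... args) {
        std::unique_ptr<T> w(new T(std::forward<Args>(args)...));
        w->init();               // the dynamic type is already T here
        return w;
    }
protected:
    Widget() = default;          // not constructible directly from outside
    virtual std::string describe() const { return "widget"; }
private:
    void init() {
        assert(label_.empty());  // init runs exactly once per object
        label_ = describe();     // virtual call on a fully constructed object
    }
    std::string label_;
};

class Button : public Widget {
    friend class Widget;         // lets Widget::create reach the private constructor
    Button() = default;
    std::string describe() const override { return "button"; }
};
```

With this shape, Widget::create<Button>() is the whole creation story for the caller, and there is no separate init() to forget. The cost is the friend declaration (or some other access trick) in every subclass.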

In game development this trap has historically lived in hierarchies of game objects, UI widgets, and components, where the temptation to "configure myself polymorphically right at creation" is very strong. This is exactly why many engines introduce an explicit object lifecycle with methods like BeginPlay in Unreal or OnEnable/Start in Unity-like architectures, which is precisely explicit two-phase initialization, separating "constructed" from "ready to play," so that virtual calls happen over a fully built object of the correct type.

In other words, the industry has effectively baked the workarounds into engine architecture, and you almost never do heavy polymorphic initialization in a game object's constructor; instead you rely on a lifecycle callback that the engine will invoke later.

The reason it's done this way follows directly from the fact that virtual calls in a constructor aren't polymorphic, and trying to work around that head-on leads to bugs that show up only for certain types in the hierarchy.

Construction Tracker

The Construction Tracker solves the annoying problem where several members in a constructor's initialization list are constructed with a chance of throwing, and you need to know exactly which one failed in order to react correctly or at least report it intelligibly. The thing is, if a member's constructor throws, the already-constructed members will be correctly destroyed (the language guarantees that), but the constructor itself has no simple way of knowing which member broke, and that's sometimes needed for diagnostics or for correctly destroying the object.

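So you introduce a "tracker," a counter or phase indicator that advances as each member is successfully constructed. Since members are initialized in a fixed order (declaration order, not the order written in the list), the tracker's value at the moment of the exception tells you how far you got, that is, which member was being constructed when it threw. The tracker is usually an extra constructor parameter with a default value rather than a member: the handler of a constructor's function-try-block must not touch non-static members (that's undefined behavior), while the constructor's parameters are still alive there. You advance it through helper expressions in the initialization list (for example, functions that construct a value and increment the tracker along the way).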

The price you pay is "noise" in the constructor, and the tracker itself is also a source of confusion, so in most cases it's better to handle the possible exceptions inside the members' own constructors, or to move the risky initialization out of the list and into the constructor body, where it's easier to wrap in try/catch with clear context.

All of this was described back in the '90s and is closely tied to the function-try-block (a try block around the whole constructor, including the initialization list) and the C++ mechanism that lets you catch exceptions from the initialization list, but does not let you "repair" the object, only giving you a chance for diagnostics before rethrowing.

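class Pipeline {
    Stage a_, b_, c_;
    // Bumps the tracker once a stage value has been produced, then passes it through.
    template <class T> static T track(T&& v, int& phase) {
      ++phase; return std::forward<T>(v);
    }
public:
    // phase is a defaulted parameter, not a member: the handler below may read
    // parameters, but touching members there would be undefined behavior.
    explicit Pipeline(Config cfg, int phase = 0)
    try
        : a_(track(make_stage_a(cfg), phase)),   // phase -> 1
          b_(track(make_stage_b(cfg), phase)),   // phase -> 2
          c_(track(make_stage_c(cfg), phase))    // phase -> 3
    {}
    catch (const std::exception& e) {
        // phase counts the stages already produced, so the one that threw is phase + 1
        log("pipeline failed at stage %d: %s", phase + 1, e.what());
        throw;
    }
};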
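For something you can actually compile, here is a self-contained variant of the same idea. Everything in it is hypothetical scaffolding of mine: Stage throws on demand, failed_stage is an out-parameter so the result can be observed, and the tracker is a constructor parameter set with the comma operator just before each member's argument is evaluated.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stage type: construction fails when asked to.
struct Stage {
    explicit Stage(bool fail) { if (fail) throw std::runtime_error("load failed"); }
};

class Pipeline {
    Stage a_, b_, c_;
public:
    // failed_stage is written by the handler; phase is the tracker.
    Pipeline(bool fail_a, bool fail_b, bool fail_c, int& failed_stage, int phase = 0)
    try
        : a_((phase = 1, fail_a)),     // comma operator: set the tracker, then pass the argument
          b_((phase = 2, fail_b)),
          c_((phase = 3, fail_c))
    {
        failed_stage = 0;              // everything was built
    }
    catch (const std::exception&) {
        failed_stage = phase;          // parameters are still valid here, members are not
        // falling off the end of this handler rethrows the exception automatically
    }
};

// Returns the 1-based number of the stage that threw, or 0 on success.
int build(bool fail_a, bool fail_b, bool fail_c) {
    int failed = -1;
    try {
        Pipeline p(fail_a, fail_b, fail_c, failed);
    } catch (const std::exception&) {
        assert(failed >= 1 && failed <= 3);
    }
    return failed;
}
```

Note that the handler cannot swallow the exception: the object was never built, so the language rethrows for you, and all the tracker buys is the diagnostic.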

The construction tracker is more of an exotic curiosity, and you won't see it in most code, because game objects rarely have long lists of risky members, and subsystem initialization that can fail is usually made an explicit two-phase process with clear error handling at each step, rather than hidden in an initialization list.

But it can come in handy in loaders and tools, where a complex object is assembled from many parts, any of which may fail to load, and it's important to tell the editor's user not just "something broke" but "failed to load this specific material stage." Even there, though, people more often prefer explicit step-by-step assembly with checks over the magic of a tracker in the initialization list, so this is worth knowing about, but not worth using.

Attach by Initialization

Yet another mechanism, where an object "attaches" itself to some external system or registry right during its own initialization, usually through a global or static object whose constructor performs the registration. The whole point is to make something happen automatically, without an explicit call from main or from initialization code, where the mere fact of the object's existence (or the inclusion of the file that defines it) triggers the registration.

This is the flip side, the "useful" side, of the very same static initialization that was a source of misery in other idioms. Here we deliberately exploit the fact that global objects are constructed before main to perform registration, and a global registrar object, in its constructor, adds a type's factory to a registry, subscribes a handler to an event, registers a test, a plugin, or a command. The programmer only needs to declare such an object (often via a macro), and the "attachment" happens by itself.

The price, again, is the static initialization order with all its risks, and the registrar must touch only things that are guaranteed to be ready (usually a registry built via construct-on-first-use, so it definitely exists). Plus, when the code lives in a static library, the linker pulls in only the object files that something explicitly references, and a self-registering object may have no such references, so it silently fails to make it into the build along with its registration.

The mechanism took up residence in testing frameworks, factories, and plugin systems long before it was catalogued under this name.

// A macro declaring a self-registering global object:
#define REGISTER_COMPONENT(Type) \
    static bool s_reg_##Type = (ComponentRegistry::get().add(#Type, []{ return new Type; }), true)

// In a component's file, a single line is enough and it registers itself:
class HealthComponent : public Component { /* ... */ };
REGISTER_COMPONENT(HealthComponent);
// attachment during static initialization
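The macro above leans on a registry that is guaranteed to exist when the registrar runs. A minimal sketch of that other half, assuming a construct-on-first-use singleton; ComponentRegistry here is my own toy version with an add/create pair, not any particular engine's API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Component {
    virtual ~Component() = default;
    virtual std::string name() const = 0;
};

class ComponentRegistry {
public:
    using Factory = std::function<std::unique_ptr<Component>()>;

    // Construct-on-first-use: the instance is created the first time anyone asks,
    // so registrars running during static initialization always find it alive.
    static ComponentRegistry& get() {
        static ComponentRegistry instance;
        return instance;
    }
    bool add(const std::string& name, Factory f) {
        assert(f != nullptr);                       // an empty factory is a bug
        return factories_.emplace(name, std::move(f)).second;
    }
    std::unique_ptr<Component> create(const std::string& name) const {
        auto it = factories_.find(name);
        return it != factories_.end() ? it->second() : nullptr;
    }
private:
    std::map<std::string, Factory> factories_;
};

struct HealthComponent : Component {
    std::string name() const override { return "HealthComponent"; }
};

// The registrar: a namespace-scope object whose initializer performs the registration.
static const bool s_health_registered = ComponentRegistry::get().add(
    "HealthComponent", [] { return std::unique_ptr<Component>(new HealthComponent); });

// Gives the flag a use, so it isn't reported as an unused variable.
bool health_is_registered() { return s_health_registered; }
```

Nothing in the program calls add() explicitly for HealthComponent, yet by the time any code runs ComponentRegistry::get().create("HealthComponent"), the factory is already there.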

As I said above, this is an extremely popular technique for systems that need extensibility without a central list of "all types," like a factory of components and entities where each component registers itself, or type registration for reflection and serialization, or self-registering console commands, cvars, debug widgets, or editor plugins. It lets you add a new type by simply writing its file with a single registration line, without editing any shared registry, which is very valuable in large teams.

But this beauty has that same dark side that has burned many: the linker leaves out object files from static libraries that nothing references (or strips them as dead code), and the self-registration vanishes, often in only one build configuration. So engines that lean heavily on this technique are obliged to take measures against the linker stripping out self-registrars, and debugging a registration that went missing in release but worked beautifully in debug is one of those classic mysterious bugs that programmers burn unforgettable evenings on.

The idea is powerful and very much alive, but it requires understanding what happens between the compiler and the linker, otherwise it will let you down at the worst possible moment.

Non-Virtual Interface (NVI)

Non-Virtual Interface turns the familiar "public = virtual" layout on its head: ALL public methods of the base class are made non-virtual, and only the private (or protected) methods called by those public ones are virtual. From the outside, the class provides a stable non-virtual interface, while the customization points for subclasses are hidden inside. A subclass overrides the private virtual "hooks" but cannot change the public contract.

The point is to make the non-virtual public method the single entry point, where the base fully controls the "scaffolding" around the customizable behavior, for example precondition and postcondition checks, logging, timing, lock acquisition, error handling. The base guarantees that all this scaffolding always runs, no matter what the subclass overrides, because the subclass can't bypass the public method; it merely plugs its behavior into a virtual "slot" specifically left for it in the middle. This is the Gang of Four "template method" principle applied as a maximally strict rule.

The price is one more level of indirection (the public non-virtual calls the private virtual) and unfamiliarity for those who expect that you override the public methods themselves. Plus, a subclass sometimes wants to override the scaffolding too, and NVI deliberately forbids that, which in rare cases turns out to be too rigid, but the benefit of a single point of control and the subclass's inability to "forget" the mandatory scaffolding usually outweighs it.

The idiom was formulated and actively promoted by Herb Sutter as "prefer to make virtual functions private," and he showed that "public" and "virtual" are two different properties that needn't always be combined. After the 2000s this became one of the most influential rethinkings in the language and gave rise to a separate style of programming, and of how to "correctly" design polymorphic interfaces in C++.

class Renderer {
public:
    virtual ~Renderer() = default;       // polymorphic base: safe to delete through Renderer*
    void render(const Scene& s) {        // non-virtual: controls the scaffolding
        ScopedTimer t("render");         // a subclass can't bypass this scaffolding
        validate(s);
        do_render(s);                    // here is where the subclass plugs in its own
        present();
    }
private:
    virtual void do_render(const Scene&) = 0;   // customization point, private
    void validate(const Scene&);
    void present();
};

class VulkanRenderer : public Renderer {
    void do_render(const Scene& s) override { /* just the render itself, no scaffolding */ }
};

In games NVI maps wonderfully onto systems with a strict lifecycle and mandatory scaffolding around the steps: a system's or component's base class, in a non-virtual update(), measures time for the profiler, checks state, gathers statistics, while the subclass only implements a private do_update(). This guarantees that no subclass will "lose" the profiling or the checks, no matter how sloppily it's written, and the same shape shows up behind many "DoUpdate"-style hooks in engine code.

The same approach naturally applies to serialization, where a public save wraps a virtual serialize in writing the header and version, to event handling, to network replication. Everywhere there's a "mandatory around" and a "customizable inside," NVI gives a clean separation and protects invariants at the design level.
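The serialization case can be sketched in a few lines. The header format, the version() hook and the Transform class are all made up for the example; the point is only that save() owns the "mandatory around" and serialize() is the "customizable inside."

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Serializable {
public:
    virtual ~Serializable() = default;

    // Non-virtual entry point: header and trailer are always written, in this order.
    void save(std::ostream& out) const {
        assert(out.good());                     // precondition, checked once for every subclass
        out << "HDR v" << version() << '\n';    // scaffolding before
        serialize(out);                         // customization point
        out << "\nEND";                         // scaffolding after
    }
private:
    virtual int version() const { return 1; }
    virtual void serialize(std::ostream& out) const = 0;
};

class Transform : public Serializable {
    float x_ = 1, y_ = 2;
    int version() const override { return 2; }
    void serialize(std::ostream& out) const override { out << x_ << ' ' << y_; }
};

// Convenience wrapper so the result is easy to inspect.
std::string save_to_string(const Serializable& obj) {
    std::ostringstream out;
    obj.save(out);
    return out.str();
}
```

A subclass can change what goes between the header and the trailer, and even which version number is reported, but it has no way to skip either of them.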

Thread-Safe Interface

This is the same NVI, but specialized for multithreading, and it solves a specific woe: nested calls to one object's methods lead to an attempt to reacquire an already-held mutex, that is, to a deadlock (if the mutex is non-recursive; for std::mutex relocking on the same thread is formally undefined behavior and in practice a hang) or to hidden inefficiency (if it's recursive). If every public method acquires the mutex, and one public method calls another public method of the same object, the second acquisition of the same mutex is a deadlock out of nowhere.

So we split the methods into two layers: our public methods are non-virtual, and they're responsible for acquiring the mutex (and only for that, plus delegation), while all the real work happens in private methods that don't touch the mutex and assume the lock is already held by the caller. Then the public method acquires the mutex and calls the private implementation, and if one private implementation needs to invoke another's logic, it calls the private version directly, without reacquiring. This guarantees that the mutex is acquired exactly once on entry from the outside and never from the inside.

The price is usage discipline: the private methods can't be called from outside without acquiring the lock, and the public ones can't be called from the private ones (otherwise you get a double acquisition). This invariant has to be maintained by hand, and it's easy to violate, especially during refactoring, plus you're left with all the general complexity of locking, and this approach protects against only one specific class of deadlocks (recursive acquisition), but not against deadlocks between different objects, nor against the other joys of multithreading.

The pattern was described and named by Douglas Schmidt and his co-authors in Pattern-Oriented Software Architecture, Volume 2, and it lines up neatly with Herb Sutter's NVI: both follow from the principle "public non-virtual wraps private." Once there's a single controlled entry point, it's convenient to hang cross-cutting concerns on it, like locking, logging, or transactionality.

class ResourceCache {
    std::mutex mtx_;
    std::unordered_map<Id, Resource> map_;

    void insert_locked(Id id, Resource r) { map_[id] = std::move(r); }  // no acquisition!
    Resource* find_locked(Id id) { auto it = map_.find(id); return it != map_.end() ? &it->second : nullptr; }
public:
    void insert(Id id, Resource r) {                 // public: acquires the lock
        std::lock_guard<std::mutex> lk(mtx_);
        insert_locked(id, std::move(r));
    }
    Resource get_or_load(Id id) {                    // calls *_locked, not the public ones
        std::lock_guard<std::mutex> lk(mtx_);
        if (auto* r = find_locked(id)) return *r;
        Resource r = load(id);
        insert_locked(id, r);                        // no reacquisition
        return r;
    }
};
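The cache above depends on types that aren't shown, so here is a smaller self-contained sketch of the same two-layer split that can be run from several threads. HitCounter and run_threads are hypothetical names of mine; the *_locked suffix is just a naming convention, not something the language enforces.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

class HitCounter {
    mutable std::mutex mtx_;
    long total_ = 0;

    // Private layer: assumes mtx_ is already held, never locks.
    void add_locked(long n) { total_ += n; }
    long total_locked() const { return total_; }
public:
    // Public layer: lock once, then delegate.
    void add(long n) {
        std::lock_guard<std::mutex> lk(mtx_);
        add_locked(n);
    }
    long total() const {
        std::lock_guard<std::mutex> lk(mtx_);
        return total_locked();
    }
    // Composite operation: reuses the private pieces, so the mutex is taken only once.
    long add_and_get(long n) {
        std::lock_guard<std::mutex> lk(mtx_);
        add_locked(n);
        return total_locked();   // calling the public total() here would relock mtx_
    }
};

// Hammers one counter from several threads and returns the final total.
long run_threads(int threads, int per_thread) {
    HitCounter c;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&c, per_thread] {
            for (int i = 0; i < per_thread; ++i) c.add_and_get(1);
        });
    for (auto& th : pool) th.join();
    assert(c.total() == static_cast<long>(threads) * per_thread);
    return c.total();
}
```

If add_and_get() were written in terms of the public add() and total(), it would either take the lock twice in a row (losing atomicity between the two steps) or, if it held the lock itself, hang on the first nested call.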

The thread-safe interface is usually useful for services shared between threads, like resource caches accessed by both loading threads and game threads. Or task pools, or logging and telemetry systems writing from many threads. Everywhere an object has several public operations that may call each other and is protected by its own mutex, the split into "public with the lock" and "private without the lock" saves you from accidental recursive deadlocks.

That said, modern gamedev generally tries to minimize shared mutable state with locks, preferring designs that avoid locks altogether, such as handing out tasks over non-overlapping data, double-buffering state between threads, or message queues instead of shared objects.

In this paradigm there are fewer mutex-protected objects, and the need for a thread-safe interface drops, but where a shared protected object really is needed, this idiom is a long-proven way not to shoot yourself in the foot with a deadlock, and you need to know it if you're writing multithreaded engine code.

Acyclic Visitor Pattern (tricky)

To begin with, recall the ordinary "visitor" pattern, which lets you add new operations to a class hierarchy without changing the classes themselves, by moving the operation into a visitor object with a visit method. The problem is that the base visitor interface is obliged to know about all the concrete types in the hierarchy (one visit per type), and this creates a rigid dependency cycle: add a new type to the hierarchy, and you're obliged to add a visit to all visitors and to their common base interface.

The Acyclic Visitor lets you break this cycle: instead of one "fat" visitor interface that knows all types, you make an empty marker base visitor interface and separate small interfaces "I know how to visit this specific type." An element's accept method tries, via dynamic_cast, to figure out whether the incoming visitor implements the "visit me" interface, and if so it calls it, and if not it simply ignores it. This way the visitor doesn't have to know about all types, and a new type doesn't break existing visitors; they just don't implement the interface for it and skip it.

The price is a dynamic_cast on every accept, which is a runtime check, plus "skipping" unhandled types can mask errors (you forgot to handle a type or didn't notice). That is, the acyclic visitor gives you dependency decoupling at the cost of performance, and this is often a conscious trade-off in rarely executed code.

The mechanism was described by Robert Martin (the very same "Uncle Bob") in the late nineties in his article "Acyclic Visitor," precisely as a solution to the problem of cyclic dependencies and extension fragility in the classic GoF visitor. It's an attempt to preserve the power of the "visitor" pattern (adding operations without editing classes) while removing its main drawback: the need to touch all visitors when adding a type.

struct Visitor { virtual ~Visitor() = default; };   // empty marker

template <class T> struct VisitorFor {              // one interface per type
  virtual void visit(T&) = 0;
};

struct Node {
  virtual void accept(Visitor&) = 0;
};

struct Mesh : Node {
    void accept(Visitor& v) override {
        if (auto* mv = dynamic_cast<VisitorFor<Mesh>*>(&v))
          mv->visit(*this);  // the visitor knows how to handle Mesh: call it
        // otherwise skip silently, so adding Mesh doesn't break old visitors
    }
};

// A visitor implements only the types it cares about:
struct BoundsCollector : Visitor, VisitorFor<Mesh> {
    void visit(Mesh& m) override { /* collect the mesh's AABB */ }
};
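To see the "new type doesn't break old visitors" claim in action, here is a self-contained sketch. Light, MeshCounter, NodeCounter, dispatch and walk are hypothetical names added for the demonstration; the cross-cast inside dispatch is the same dynamic_cast as in accept above, just factored out.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Visitor { virtual ~Visitor() = default; };     // empty marker

template <class T> struct VisitorFor {                // one small interface per node type
    virtual void visit(T&) = 0;
};

struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor&) = 0;
};

// Shared accept logic: cross-cast to the per-type interface, skip if it's absent.
template <class Self> void dispatch(Self& self, Visitor& v) {
    if (auto* typed = dynamic_cast<VisitorFor<Self>*>(&v)) typed->visit(self);
}

struct Mesh  : Node { void accept(Visitor& v) override { dispatch(*this, v); } };
struct Light : Node { void accept(Visitor& v) override { dispatch(*this, v); } };  // added later

// Written before Light existed: knows only Mesh and compiles unchanged.
struct MeshCounter : Visitor, VisitorFor<Mesh> {
    int meshes = 0;
    void visit(Mesh&) override { ++meshes; }
};

// A newer visitor that handles both types.
struct NodeCounter : Visitor, VisitorFor<Mesh>, VisitorFor<Light> {
    int meshes = 0, lights = 0;
    void visit(Mesh&) override { ++meshes; }
    void visit(Light&) override { ++lights; }
};

std::vector<std::unique_ptr<Node>> make_scene() {
    std::vector<std::unique_ptr<Node>> scene;
    scene.emplace_back(new Mesh);
    scene.emplace_back(new Light);
    scene.emplace_back(new Mesh);
    return scene;
}

template <class V> void walk(const std::vector<std::unique_ptr<Node>>& scene, V& v) {
    for (const auto& n : scene) {
        assert(n != nullptr);
        n->accept(v);
    }
}
```

MeshCounter walks straight past the Light without noticing it. That is exactly the decoupling promised, and also exactly the "silently skipped type" risk mentioned above.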

In games the acyclic visitor lives in tools and editors, where processing heterogeneous trees (scenes, asset graphs, script ASTs, UI nodes) and extensibility matter more than perf, and the set of operations over the hierarchy keeps growing and is written by different people. The ability to add a new node type without rewriting all the existing handlers, and a new handler without touching the nodes, is very valuable in long-lived editor tools.

In the engine's runtime, visitors (especially acyclic ones, with their dynamic_cast) are used rarely, because RTTI is often disabled in engine builds (so dynamic_cast simply isn't available) or is too expensive for a hot loop, so people prefer a data-oriented traversal of dense arrays without dispatch by type. But the acyclic visitor is worth keeping in mind as a tool for offline and editor code with rich, extensible hierarchies, not as a way to traverse the game world inside a frame.

Capability Query

A mechanism to "ask an object whether it can do something" at runtime before asking it to actually do it. So, instead of forcing every object in the hierarchy to implement every conceivable interface (with a pile of empty stubs), you give the object only the capabilities that suit it, and the calling code at runtime queries: "you wouldn't happen to be IDamageable, would you?" and if so, it gets the corresponding interface and works with it, and if not, it simply leaves that capability alone.

Technically this is an ordinary dynamic_cast to the interface of interest: if the object implements that interface, the cast returns a valid pointer; if not, nullptr. Underneath, it's a runtime check of "does the object implement this contract." An alternative would be an explicit query_interface(id) method returning a pointer by interface identifier (as in COM), or a check via a system of flags/capabilities. The idea is the same in all variants: capabilities are optional and are queried dynamically.

The price, again, is runtime cost (dynamic_cast or its equivalent), and a philosophical question: how often you need to "query capabilities" is frequently a sign of bad design. If you're constantly asking objects "can you do this? and this?", maybe the capabilities should be moved into separate components that either exist or don't, rather than hidden in an inheritance hierarchy where you have to extract them with dynamic casts.

The idiom is catalogued under the name capability query by Stephen Dewhurst in C++ Common Knowledge, as one of the cases where dynamic_cast is appropriate. Its industrial-grade relative is Microsoft's COM with its QueryInterface: any COM object can be asked whether it supports a given interface, and it answers with the interface or a refusal. And today all of this lives throughout the COM model, on which, in particular, DirectX is built.

struct Entity { virtual ~Entity() = default; };
struct IDamageable { virtual void take_damage(int) = 0; };
struct IInteractable { virtual void interact() = 0; };

struct Barrel : Entity, IDamageable { void take_damage(int) override { /* explosion */ } };

void shoot(Entity& target, int dmg) {
    if (auto* d = dynamic_cast<IDamageable*>(&target))   // can it take damage?
        d->take_damage(dmg);                              // yes — deal it
    // no — a wall, a decoration: just ignore it
}
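The explicit query_interface variant mentioned earlier works without RTTI. The sketch below is a toy of my own, loosely in the spirit of COM but not its actual API: InterfaceId, query_interface and the query helper are hypothetical, and a real system would use GUIDs or hashed names and reference counting.

```cpp
#include <cassert>

// Hypothetical interface identifiers.
enum class InterfaceId { Damageable, Interactable };

struct IDamageable   { virtual void take_damage(int) = 0; };
struct IInteractable { virtual void interact() = 0; };

struct Entity {
    virtual ~Entity() = default;
    // Default: no optional capabilities at all.
    virtual void* query_interface(InterfaceId) { return nullptr; }
};

// Typed helper so callers never handle the void* themselves.
// The caller is trusted to pair I with the matching id.
template <class I> I* query(Entity& e, InterfaceId id) {
    return static_cast<I*>(e.query_interface(id));
}

struct Barrel : Entity, IDamageable {
    int hp = 10;
    void take_damage(int dmg) override {
        assert(dmg >= 0);
        hp -= dmg;
    }
    void* query_interface(InterfaceId id) override {
        // Convert to the interface pointer first, so the void* really points at IDamageable.
        if (id == InterfaceId::Damageable) return static_cast<IDamageable*>(this);
        return nullptr;
    }
};

struct Wall : Entity {};   // inherits "supports nothing"

// Returns true if the target could take the hit.
bool shoot(Entity& target, int dmg) {
    if (auto* d = query<IDamageable>(target, InterfaceId::Damageable)) {
        d->take_damage(dmg);
        return true;
    }
    return false;
}
```

The trade-off against dynamic_cast is that every class has to list its capabilities by hand, and a mismatch between the id and the template argument is not caught by the compiler.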

In games the capability query is often the only way to express "not all objects can do everything," that some entities can be damaged, others set on fire, others picked up, and a fourth kind talked to, and most objects support only some of these capabilities. Querying the interface before acting lets you write generic code ("a shot checks whether the target can be damaged") without requiring every rock and wall to implement take_damage as an empty stub.

But in modern development this very task is already solved not by inheritance with dynamic_cast, but by composition from components, and "can be damaged" becomes the presence of a health component on the entity, and then the check turns into a cheap component query rather than a runtime cast up the hierarchy.
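In component terms the same check might look like the sketch below. World, Health and find_health are invented for the illustration, and a single hash map stands in for whatever storage a real ECS would use.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using EntityId = std::uint32_t;

struct Health { int hp; };

// Hypothetical minimal storage: one table per component type.
struct World {
    std::unordered_map<EntityId, Health> health;   // only damageable entities appear here

    // "Can it take damage?" becomes "does it have a Health row?"
    Health* find_health(EntityId e) {
        auto it = health.find(e);
        return it != health.end() ? &it->second : nullptr;
    }
};

// Returns true if the target had a Health component to hit.
bool shoot(World& world, EntityId target, int dmg) {
    assert(dmg >= 0);
    if (Health* h = world.find_health(target)) {
        h->hp -= dmg;
        return true;
    }
    return false;          // a wall, a decoration: no component, nothing to do
}
```

There is no hierarchy and no cast here: the capability is data, and giving an entity the ability to be damaged is a matter of inserting a row.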

So the capability query in the form of dynamic_cast lives most vividly in more classic object-oriented engines and at the boundaries with COM-like APIs (DirectX again), while data-oriented ECS engines achieve the same effect through the presence or absence of components, which is both faster and more flexible. The principle itself, "ask before you demand," remains one of the fundamental ways of working with heterogeneous entities.

Covariant Return Types

This is a language feature that lets an overridden virtual function in a derived class return a more specific type than the same function in the base. If the base clone() returns Base*, then clone() in Derived can return Derived*, and this counts as a correct override rather than a different function. The return type is "covariant" if it changes in the same direction as the hierarchy, narrowing in the derived classes.

This is needed for type safety on the caller's side: without covariance, a virtual clone() would always return Base*, and calling it on an object you know for certain is Derived, you'd get a Base* and be forced to cast back to Derived*. With a covariant return, derivedPtr->clone() gives you Derived* right away with no casts needed, which removes a whole class of pointless conversions and makes virtual factories and clone pleasant to use.

The compiler pays here, and for us it's a "free" and safe language feature; the only limitation is that covariance works for pointers and references but not for return by value (which means you can't return a "more specific" type through a virtual call, because the size is different). So covariant return naturally pairs with returning pointers, which in the case of clone is usually what you want, and a slight rough edge appears when combining it with smart pointers, because unique_ptr<Derived> is not covariant to unique_ptr<Base> at the language level, and for them covariance has to be emulated by hand.
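One common way to emulate covariance for smart pointers is to keep the virtual function on raw pointers and wrap it in non-virtual functions that hand out unique_ptr. This is a sketch under that assumption; clone_impl is a name I picked, and note that Circle::clone hides Shape::clone rather than overriding it.

```cpp
#include <cassert>
#include <memory>

class Shape {
public:
    virtual ~Shape() = default;
    // Non-virtual wrapper: returns an owning pointer typed as Shape.
    std::unique_ptr<Shape> clone() const { return std::unique_ptr<Shape>(clone_impl()); }
private:
    // The virtual part still uses raw pointers, where language covariance works.
    virtual Shape* clone_impl() const = 0;
};

class Circle : public Shape {
public:
    explicit Circle(float r) : r_(r) { assert(r >= 0.0f); }
    float radius() const { return r_; }
    // Hides Shape::clone with a more specific return type; this is not an override.
    std::unique_ptr<Circle> clone() const { return std::unique_ptr<Circle>(clone_impl()); }
private:
    Circle* clone_impl() const override { return new Circle(*this); }   // covariant raw pointer
    float r_;
};
```

Called through a Circle, clone() yields unique_ptr<Circle>; called through a Shape reference, it yields unique_ptr<Shape> that still owns a Circle. The price is that every class in the hierarchy has to repeat the small non-virtual wrapper.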

Covariant return types weren't added to C++ all at once, and early compilers didn't support them, but the feature settled into the language around the mid-nineties, becoming part of the C++98 standard.

struct Shape {
    virtual ~Shape() = default;
    virtual Shape* clone() const = 0;
};

struct Circle : Shape {
    Circle* clone() const override { return new Circle(*this); }  // covariant: Circle*, not Shape*
    float radius() const { return r_; }
private:
    float r_ = 1.0f;   // initialized, so that copying in clone() never reads garbage
};

Circle c;
Circle* copy = c.clone();   // immediately Circle*, no cast — copy->radius() works

This is a handy little detail in game development that makes hierarchies of polymorphic objects with cloning and factories more pleasant to use. Everywhere there's a virtual clone, duplicate, create_instance (prototype systems, editor duplication, undo copies), covariance lets you work with the result as the concrete type wherever the concrete type is known, without littering the code with conversions.

There's no special "game-specific" angle to this feature, but since cloning and virtual factories show up regularly in games (especially in tools and more classic object-oriented architectures), the covariant return is simply one of those details whose knowledge distinguishes clean polymorphic code from code strewn with unnecessary static_cast. And the "pointers/references only" limitation is worth understanding, so you're not surprised why the trick doesn't work with unique_ptr directly.

Virtual Friend Function

An idea that resolves the contradiction that friend functions (friend) are neither inherited nor able to be virtual. Sometimes you want a friend free function to behave polymorphically, and the most common case is the output operator operator<< for a class hierarchy, which must be a free function (because the left operand is always a stream, not your object), yet must print the object according to its actual type, that is, polymorphically.

To do this, the friend free function is made non-virtual and unique (usually in the base class), and it delegates all the work to a virtual method of the class. That is, operator<< is a non-template, non-inheritable friend that internally calls obj.print(stream), while print is an ordinary virtual method overridden in subclasses. From the outside it looks like a polymorphic free function, but inside it's plain virtual dispatch hidden behind a thin free wrapper.

In essence this is a crutch, a workaround, not real "virtual friendship" (there's no such thing in the language and there likely never will be), and the trick is not to get confused about when virtuality comes from the method and when the free function merely forwards into it. There are also subtleties about which class to declare the friend function in so that ADL finds it, and about not breeding one such function in every subclass (only one is needed, in the base). But the result is the natural syntax stream << obj with polymorphic behavior.

The implementation mechanism was described long ago and belongs to the classic body of knowledge about the interaction of friendship, operator overloading, and polymorphism.

class Shape {
public:
    virtual ~Shape() = default;
    // free friend — one per hierarchy, delegates to the virtual print
    friend std::ostream& operator<<(std::ostream& os, const Shape& s) {
        s.print(os);            // polymorphism happens here
        return os;
    }
protected:
    virtual void print(std::ostream& os) const = 0;
};

struct Circle : Shape {
  void print(std::ostream& os) const override {
    os << "Circle";
  }
};

struct Square : Shape {
  void print(std::ostream& os) const override {
    os << "Square";
  }
};

Shape* s = new Circle;
std::cout << *s;   // prints "Circle" polymorphically through the free operator<<

In game development the technique surfaces in debug and tooling code, where you need to polymorphically print or serialize objects of a hierarchy through free functions and operators, for example dumping entity state to a log, or text-serializing polymorphic scene nodes. That is, everywhere you want to write log << *entity and get output by actual type, the virtual friend function is the standard solution.

In the hot path it's of course not used, because neither stream output nor virtual calls have any business there, so the idiom lives in diagnostics, tools, and text formats. Understanding that there is no "virtual friendship" in the language, only a free wrapper over a virtual method, spares you the attempts to declare a virtual friend directly and the bewilderment over why the compiler rejects it.

Fake Vtable

This is a manual implementation of what the compiler usually does for you when it creates the virtual method table. Instead of relying on the built-in virtuality mechanism, you set up your own structure of function pointers (that's the "table") and call the needed function through it via the pointer yourself. In essence it's polymorphism assembled by hand from function pointers, bypassing the language's logic of virtual classes.

You typically reach for this when you need control over the layout: you know exactly where the table lives, how much the object weighs, how it's serialized, and you can change the object's "type" at runtime by swapping the pointer to the table (which you can't do with a real vtable).

The second case is when you need to ensure portability of a binary format and compatibility with C, where there are no virtual functions at all, yet your library needs them. The price you pay is taking the bread out of compiler developers' mouths, when you take on everything they already do correctly and invisibly: the correctness of the tables, their initialization, the type safety of the calls, maintenance when methods are added.

This is hard, expensive, error-prone, and you lose all help from the compiler. So a fake vtable is always a tool of last resort, and in 99% of cases real virtual functions are simpler, faster, and safer.

Conceptually it has been known for ages, but understanding it, and with it what the compiler generates for ordinary virtual classes, is a decent education in itself. Manual tables of function pointers are an ancient systems-programming technique in C (driver interfaces in the Linux kernel, such as struct file_operations, are built this way), brought into C++ for cases where built-in virtuality, for some reason, doesn't fit.

struct AllocatorVTable {                 // the "table" by hand
    void* (*allocate)(void* self, std::size_t);
    void  (*free)(void* self, void*);
};

struct Allocator {
    const AllocatorVTable* vtable;       // pointer to the table, like a real vptr
    void* state;                         // opaque per-instance data, passed as "self"

    void* allocate(std::size_t n) { return vtable->allocate(state, n); }   // manual dispatch
    void  free(void* p)           { vtable->free(state, p); }
};

// The allocator's "type" can be swapped at runtime by redirecting the vtable
// which you can't do with real virtual functions
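To make the "swap the table at runtime" point concrete, here is a minimal sketch with two interchangeable backends. The names HeapState, ArenaState, heap_vtable, arena_vtable, and release are hypothetical and exist only for this illustration; alignment handling in the arena is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// The "table": one function pointer per operation.
struct AllocatorVTable {
    void* (*allocate)(void* self, std::size_t size);
    void  (*release)(void* self, void* ptr);
};

// Hypothetical backend 1: forwards to malloc/free and counts live blocks.
struct HeapState { int live = 0; };
void* heap_allocate(void* self, std::size_t size) {
    ++static_cast<HeapState*>(self)->live;
    return std::malloc(size);
}
void heap_release(void* self, void* ptr) {
    --static_cast<HeapState*>(self)->live;
    std::free(ptr);
}
const AllocatorVTable heap_vtable{heap_allocate, heap_release};

// Hypothetical backend 2: bump allocator over a fixed buffer; release is a no-op.
// Alignment of individual allocations is ignored for brevity.
struct ArenaState {
    alignas(std::max_align_t) unsigned char buffer[256];
    std::size_t used = 0;
};
void* arena_allocate(void* self, std::size_t size) {
    auto* arena = static_cast<ArenaState*>(self);
    if (arena->used + size > sizeof(arena->buffer)) return nullptr;
    void* p = arena->buffer + arena->used;
    arena->used += size;
    return p;
}
void arena_release(void*, void*) {}
const AllocatorVTable arena_vtable{arena_allocate, arena_release};

// The "object": a table pointer (playing the role of a vptr) plus opaque state.
struct Allocator {
    const AllocatorVTable* vtable;
    void* state;
    void* allocate(std::size_t n) { return vtable->allocate(state, n); }
    void  release(void* p)        { vtable->release(state, p); }
};
```

Reassigning vtable and state on a live Allocator switches it from the heap backend to the arena backend without touching any call site, which is exactly what a compiler-managed vptr doesn't let you do.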

In game development the fake vtable appears in the lowest-level, performance- and platform-sensitive layers, like memory allocator interfaces, backends of platform abstractions, and plugin ABIs that must be stable and C-compatible.

Another habitat is rendering, where people want polymorphism but with the ability to "hot-swap" behavior (for example, switching a system's implementation on the fly) or with a cache- and serialization-friendly layout. Even there it's rare and almost always a conscious trade of clarity for control. For the vast majority of code real virtual functions remain the most correct choice, and the fake vtable is worth knowing mainly as a picture of how virtuality is built from the inside.

Algebraic Hierarchy

A way to organize a hierarchy of numeric (algebraic) types so that the user works with a single "facade" type behind which several concrete representations are hidden. The classic example is a number that can be integer, rational, real, or complex, but the user operates on one type Number, while inside it dynamically holds one or another concrete representation and picks the optimal one depending on the value and the operations.

From the outside Number behaves like an ordinary value (it's copied, added, compared), while inside it holds a pointer to a polymorphic implementation (integer, fraction, float) and delegates operations to it. In operations between different representations, promotion to a common type happens (you added an integer to a fraction and got a fraction), as mathematical systems do. The pattern combines the handle/body and envelope/letter idioms with ordinary polymorphism, applied specifically to numeric towers.

The price is allocations and virtual calls on arithmetic, which is orders of magnitude slower than machine int and float, but for a computer-algebra system or an arbitrary-precision calculator this is acceptable, while for hot-path code it would already be a problem. Plus the logic of type promotion and preserving value semantics on top of a polymorphic implementation is often nontrivial to implement.

The roots of the idea go back to James Coplien's classic 1992 work, where he dissected "envelope/letter" and numeric towers as an example, but this is largely an academic and niche pattern, illustrating how to build a "smart numeric type" with a dynamic representation while preserving for the user the illusion of an ordinary value.

class Number;                                   // the facade, defined below

struct NumberImpl {                             // polymorphic representation
    virtual ~NumberImpl() = default;
    virtual Number add(std::shared_ptr<const NumberImpl>) const = 0; /* ... */
};

class Number {                                  // value facade
    std::shared_ptr<const NumberImpl> impl_;    // polymorphic representation inside
public:
    Number operator+(const Number& o) const {
        return impl_->add(o.impl_);             // type promotion inside
    }
};

struct IntImpl : NumberImpl {
    long long v; /* int+int=int, int+rational=rational ... */
};
struct RationalImpl : NumberImpl {
    long long num, den; /* ... */
};
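To make the promotion rule concrete, here is a self-contained sketch with just two representations. It is a simplification under stated assumptions: the facade decides the promotion itself by asking the implementation for a numerator and denominator (the accessors is_integer, num, den, and str are hypothetical), rather than using a virtual add, and denominators are assumed to be positive.

```cpp
#include <cassert>
#include <memory>
#include <numeric>   // std::gcd (C++17)
#include <string>
#include <utility>

// Hidden polymorphic representation (the "letter").
struct NumberImpl {
    virtual ~NumberImpl() = default;
    virtual bool is_integer() const = 0;
    virtual long long num() const = 0;   // numerator
    virtual long long den() const = 0;   // denominator, 1 for integers
    virtual std::string str() const = 0;
};

struct IntImpl final : NumberImpl {
    long long v;
    explicit IntImpl(long long value) : v(value) {}
    bool is_integer() const override { return true; }
    long long num() const override { return v; }
    long long den() const override { return 1; }
    std::string str() const override { return std::to_string(v); }
};

struct RationalImpl final : NumberImpl {
    long long n, d;                      // assumption: d > 0, fraction already reduced
    RationalImpl(long long nn, long long dd) : n(nn), d(dd) {}
    bool is_integer() const override { return false; }
    long long num() const override { return n; }
    long long den() const override { return d; }
    std::string str() const override { return std::to_string(n) + "/" + std::to_string(d); }
};

// The value facade (the "envelope"): copyable, addable, hides the representation.
class Number {
    std::shared_ptr<const NumberImpl> impl_;
    explicit Number(std::shared_ptr<const NumberImpl> p) : impl_(std::move(p)) {}

    // Reduces the fraction and picks the cheapest representation that fits.
    static Number normalized(long long n, long long d) {
        const long long g = std::gcd(n, d);
        n /= g;
        d /= g;
        if (d == 1) return Number(n);    // demote back to an integer
        return Number(std::make_shared<RationalImpl>(n, d));
    }
public:
    Number(long long v) : impl_(std::make_shared<IntImpl>(v)) {}
    static Number rational(long long n, long long d) { return normalized(n, d); }

    Number operator+(const Number& o) const {
        if (impl_->is_integer() && o.impl_->is_integer())
            return Number(impl_->num() + o.impl_->num());          // int + int = int
        return normalized(impl_->num() * o.impl_->den() + o.impl_->num() * impl_->den(),
                          impl_->den() * o.impl_->den());          // promote to rational
    }
    bool is_integer() const { return impl_->is_integer(); }
    std::string str() const { return impl_->str(); }
};
```

Every addition here allocates a new implementation object, which is the cost the text warns about; the user only ever sees Number.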

In games the algebraic hierarchy isn't used in its pure form, and frankly game math is built on exactly the opposite principle, where fixed concrete types (float, int, Vec3) are deliberately simplified further, all so the CPU and SIMD can churn through numbers at full speed, while a "smart number" with an allocation on every operation is an antipattern.

Where related ideas do surface is in tools and non-runtime systems, like expression editors or scripting languages and data systems, where a value can be "a number, a string, or a vector" (variant types). There a dynamic representation behind a single facade is appropriate, because performance isn't critical while flexibility matters. But in the engine core the algebraic hierarchy is more an example of how not to do it, and understanding why you shouldn't do it that way is useful as an illustration of data-oriented thinking.

Polymorphic Exception

This is a mechanism for correctly working with exceptions in a polymorphic hierarchy of exception classes, where you need to throw, catch, and if necessary store and rethrow an exception without losing its actual type.

The basic C++ rule is to catch exceptions by reference (catch (const std::exception&)), because catching by value slices the derived type down to the base (object slicing), losing all the specifics of the concrete exception. On the throwing side you throw by value, usually a temporary, and the exception mechanism itself creates the copy that makes the "flight" up the stack.

The difficulty arises when an exception needs to be not merely caught and handled on the spot, but stored, passed to another thread, or rethrown later. Then you face the task of "copy the caught exception while preserving its dynamic type," which is highly nontrivial: you caught a const std::exception& but don't know which subclass it actually is, so you can't copy it. The classic solution is a pair of virtual methods in the base exception class, raise() (throw myself) and clone() (copy myself), so that the exception can polymorphically reproduce and transport itself.

The price here is manually implementing clone/raise in every exception class, which is tedious and easily broken, and the errors show up at the most inopportune moment, for example already while handling another error higher up the stack.
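For reference, the pre-C++11 manual idiom might look roughly like this. It is a sketch only: GameError, ResourceError, capture, and load_texture are hypothetical names, and std::unique_ptr is used for brevity even though the idiom itself predates it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical hierarchy root: every exception can copy itself and throw itself.
struct GameError {
    std::string message;
    explicit GameError(std::string m) : message(std::move(m)) {}
    virtual ~GameError() = default;

    virtual std::unique_ptr<GameError> clone() const {
        return std::make_unique<GameError>(*this);
    }
    // Inside each override the static type of *this equals the dynamic type,
    // so "throw *this" does not slice.
    [[noreturn]] virtual void raise() const { throw *this; }
};

// Every derived class has to repeat both overrides by hand: the tedious part.
struct ResourceError : GameError {
    using GameError::GameError;
    std::unique_ptr<GameError> clone() const override {
        return std::make_unique<ResourceError>(*this);
    }
    [[noreturn]] void raise() const override { throw *this; }
};

// Stores a caught exception without knowing its concrete type.
std::unique_ptr<GameError> capture(void (*risky)()) {
    try { risky(); }
    catch (const GameError& e) { return e.clone(); }   // deep copy, exact type kept
    return nullptr;
}

void load_texture() { throw ResourceError("missing texture"); }
```

Forgetting one of the two overrides in a derived class compiles fine and silently degrades the stored exception to the base type, which is the kind of bug described above.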

C++11 largely solved the task out of the box, introducing std::exception_ptr and the functions std::current_exception() / std::rethrow_exception(), and now any exception can be captured into an exception_ptr (preserving its exact type), passed anywhere, including another thread, and rethrown later without any manual clone.

The idea with virtual raise/clone was described before C++11, and the arrival of exception_ptr in C++11 was in large part motivated precisely by the need to carry exceptions between threads in asynchronous code, which made the manual idiom unnecessary in most cases, though the understanding of the problem in general and of the need to catch by reference remained.

// The modern way to carry an exception while preserving its exact type:
std::exception_ptr captured;

void worker() {
    try { risky_load(); }
    catch (...) { captured = std::current_exception(); }   // captured any type
}

void main_thread() {
    worker();
    if (captured) {
        try { std::rethrow_exception(captured); }          // rethrown here
        catch (const ResourceError& e) { /* exact type preserved */ }
        catch (const std::exception& e) { /* catch by reference — no slicing */ }
    }
}

In games the attitude toward exceptions is, to put it mildly, cool, and a significant part of the industry doesn't use them at all, often building projects with -fno-exceptions so they don't get in the way of certain optimizations. So the core of many engines is built on error codes, std::optional/expected-like results, and assert, rather than on exceptions.

Still, polymorphic exceptions are appropriate in tools, editors, asset loaders, and other non-runtime code, where reliable error handling matters more than perf, and where exception_ptr nicely carries failures between threads of an asynchronous load.

Polymorphic Value Types

We've talked about exceptions, but the same problem is relevant for ordinary values too: how to combine polymorphic behavior (like objects in a hierarchy behind a pointer to the base) with value semantics (like int or Vec3: copied, placed into a container by value, requiring no manual lifetime management). Usually these two things are opposed, because polymorphism requires pointers and inheritance, which means manual ownership and the risk of slicing, while values copy easily but aren't polymorphic.

This is implemented with a wrapper that internally holds a pointer to a polymorphic object of the hierarchy but behaves like a value, and copying it makes a deep copy via a virtual clone, so the copy is an independent object of the correct dynamic type, with no slicing problems and no shared ownership, and destroying it correctly deletes the stored object. From the outside you get an ordinary value that you put into a std::vector, copy, pass around, while inside polymorphism and the exact type are preserved.

The price is that each copy of such a "value" almost always does an allocation and a virtual clone, which is more expensive than copying an ordinary value and unacceptable in the hot path or under intensive copying. Plus it requires the whole hierarchy to support clone, which partly kills the convenience.

The idea was actively promoted by Sean Parent in his talks on "value semantics and concept-based polymorphism," showing how to wrap polymorphism in values and rid the code of manual pointer management and of inheritance in the user-facing API. It was proposed for the standard as std::polymorphic_value and later reworked into std::polymorphic, targeting C++26, but it's still criticized, both by proponents of inheritance and by evangelists of value-oriented design.

// The wrapper behaves like a value but stores a polymorphic object:
template <class Base>
class PolyValue {
    std::unique_ptr<Base> p_;
public:
    template <class D> PolyValue(D d) : p_(std::make_unique<D>(std::move(d))) {}
    PolyValue(const PolyValue& o) : p_(o.p_->clone()) {}
    // copy = deep clone, no slicing
    Base* operator->() { return p_.get(); }
};

// You can put polymorphic objects into a vector BY VALUE
std::vector<PolyValue<Shape>> shapes;
shapes.push_back(Circle{});
shapes.push_back(Square{});
auto copy = shapes;   // a deep copy of each, types preserved
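A variation in the spirit of Sean Parent's talks removes the requirement that the user's hierarchy support clone: the wrapper generates the cloning layer itself, so the stored types need no common base at all. The sketch below is an assumption-laden illustration, not his exact code; AnyShape, Concept, Model, and the free function name are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Two unrelated hypothetical types: no common base class, no clone().
struct Circle { double r; };
struct Square { double side; };
std::string name(const Circle&) { return "Circle"; }
std::string name(const Square&) { return "Square"; }

class AnyShape {
    struct Concept {                              // internal interface, invisible to users
        virtual ~Concept() = default;
        virtual std::unique_ptr<Concept> clone() const = 0;
        virtual std::string name_() const = 0;
    };
    template <class T>
    struct Model final : Concept {                // generated per stored type
        T value;
        explicit Model(T v) : value(std::move(v)) {}
        std::unique_ptr<Concept> clone() const override {
            return std::make_unique<Model>(*this);
        }
        std::string name_() const override { return name(value); }
    };

    std::unique_ptr<Concept> self_;
public:
    template <class T>
    AnyShape(T value) : self_(std::make_unique<Model<T>>(std::move(value))) {}

    AnyShape(const AnyShape& o) : self_(o.self_->clone()) {}   // deep copy
    AnyShape(AnyShape&&) noexcept = default;
    AnyShape& operator=(AnyShape o) noexcept {                 // copy-and-swap style
        self_ = std::move(o.self_);
        return *this;
    }

    friend std::string name(const AnyShape& s) { return s.self_->name_(); }
};
```

The cost per copy is the same allocation plus virtual clone, but inheritance has moved out of the user-facing API and into the wrapper.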

In games polymorphic value types in pure form are rare due to the allocation, but the idea of "value semantics for polymorphic things" is very appealing for code where simplicity of ownership and safety matter, that is, editors again, data models and undo/redo systems (where deep copying of state is the very essence of undo), or descriptions of configurations and properties.

And this same idea has taken root well in indie games, as part of the same drift of the industry from "everything through inheritance and pointers" to value-oriented and data-oriented design and ECS patterns, but that's a separate conversation.

Hierarchy Generation

This is a separate technique in which an entire class hierarchy is generated automatically, by templates, from a list of types, instead of writing out each class by hand. You give a list of types to the engine (for example, TypeList<int, float, std::string>) and a "generator" that recursively builds a hierarchy from it, that is, a class inheriting from handlers for each type in the list, or a chain of classes, one level per type.

This is often needed to avoid manual duplication when creating families of related classes, and on this were built, in particular, generic functors, abstract factories, and variant-like storages, where you need "one of something per type in the set."

All of this is a heavy form of metaprogramming brain-fry, with all its companions in the form of monstrously long type names, unreadable errors, noticeable compilation slowdown with large type lists, and the general difficulty of debugging the generated hierarchy.

With the arrival of variadic templates (C++11), hand-rolled TypeLists and recursive generators have largely become obsolete, and now we have parameter packs that can be expanded into hierarchies and data structures directly, without the cumbersome logic of type lists.

The idiom in its canonical form was introduced and began to be preached by Alexandrescu in the early 2000s along with the whole infrastructure of TypeList and hierarchy generators in the Loki library, and it was perhaps one of the most impressive demonstrations of the power of template metaprogramming of that time. And it spilled over into modern equivalents with variadic templates, std::tuple (which is, in essence, that very hierarchy generation produced from a type pack), and std::variant.

// A linear hierarchy with one inheritance level per type in the pack
template <class... Ts> struct Handlers;
template <class T>
struct Handlers<T> {                                    // recursion base: the last type
    virtual ~Handlers() = default;
    virtual void handle(const T&) = 0;
};
template <class T, class... Rest>
struct Handlers<T, Rest...> : Handlers<Rest...> {
    virtual void handle(const T&) = 0;                  // handler for T
    using Handlers<Rest...>::handle;                    // plus everything below
};

// An event handler with one handle per event, generated automatically:
struct EventHandler : Handlers<KeyEvent, MouseEvent, ResizeEvent> {
    void handle(const KeyEvent&)    override { /* ... */ }
    void handle(const MouseEvent&)  override { /* ... */ }
    void handle(const ResizeEvent&) override { /* ... */ }
};
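With C++17 the same set of handlers can also be generated without recursion, by expanding the pack straight into the base-class list. This is a minimal sketch; Handler, Recorder, and the event structs are hypothetical names chosen for the demonstration.

```cpp
#include <cassert>
#include <string>

// One small interface per type.
template <class T>
struct Handler {
    virtual ~Handler() = default;
    virtual void handle(const T&) = 0;
};

// The whole hierarchy is generated by expanding the pack into the base list.
template <class... Ts>
struct Handlers : Handler<Ts>... {
    using Handler<Ts>::handle...;   // C++17: pack expansion in a using-declaration
};

// Hypothetical event types.
struct KeyEvent   { int key; };
struct MouseEvent { int x, y; };

struct Recorder final : Handlers<KeyEvent, MouseEvent> {
    std::string log;
    void handle(const KeyEvent& e) override {
        log += "key:" + std::to_string(e.key) + ";";
    }
    void handle(const MouseEvent& e) override {
        log += "mouse:" + std::to_string(e.x) + "," + std::to_string(e.y) + ";";
    }
};
```

The using-declaration matters: without it, a call to handle through Handlers<...> would be ambiguous, because name lookup would find handle in several unrelated bases.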

In games all this ancient stuff has long been rewritten with std::tuple and std::variant and variadic templates. As a distinct kind, it has crept into ECS engines, which use packs of component types to generate storages and systems from a list of components, and separately it lives in serialization, which expands a struct's fields by the list of their types. All of these are already implementations and descendants of this idea of hierarchy generation, just in newer and more convenient syntax.

Directly writing recursive hierarchy generators from TypeList in new code is almost never justified, because variadic templates do the same thing more simply and compile faster. Hierarchy generation is worth knowing first and foremost as the foundation of what std::tuple, std::variant, and similar pack-based types do under the hood. The principle is alive and well; it has merely swapped its toolkit from heavy type lists to lightweight parameter packs.

Function Object (Functor)

This is an object that pretends to be a function, because it has an overloaded operator(), but what distinguishes it from an ordinary function is the ability to hold state. In general, a functor is a class with fields (state) and a call operator, and each instance of it can be configured in its own way and remember something between calls or capture parameters at creation, that is, in essence it's "a function with memory."

It has two advantages over a function pointer. The first, as I already said, is state: a function pointer always calls the same function in the same way, whereas a functor can be parameterized (a comparator that remembers the sort direction, or a predicate that remembers a threshold). The second is inlining: when you pass a functor to a template algorithm, its type is known at compile time, and operator() is inlined right into the algorithm's body, whereas a call through a function pointer usually stays an indirect call that doesn't inline and hinders the optimizer.

There's almost no direct cost here. The one complaint is verbosity (you have to declare a class), which before C++11 was the main pain: to pass a custom comparator to sort you had to set up a separate class somewhere off to the side from where it's used. C++11 lambdas solved exactly this problem at the compiler level, wrapping it in syntactic sugar that the compiler unfolds into an unnamed functor with the captured variables as fields. That is, every lambda is a functor, just written concisely and in place.

The concept of functors in C++ took shape together with the STL of Stepanov and Lee in the early nineties, where functors (predicates, comparators, operations) were an integral part of generic algorithms, and they're precisely what gave the STL its reputation for "abstraction without overhead." The standard library always preferred functors over function pointers in template algorithms, and C++11 lambdas cemented functors as an everyday tool.

// A stateful functor: a comparator that remembers the reference point
struct DistanceFromCamera {
    Vec3 camera;
    bool operator()(const Object& a, const Object& b) const {
        return dist2(a.pos, camera) < dist2(b.pos, camera);
        // inlined into sort
    }
};
std::sort(objects.begin(), objects.end(), DistanceFromCamera{cam_pos});

// A lambda is the same functor, written concisely (the compiler generates the class itself):
std::sort(objects.begin(), objects.end(),
          [cam_pos](const Object& a, const Object& b){
            return dist2(a.pos,cam_pos) < dist2(b.pos,cam_pos);
          });

Functors and lambdas are ubiquitous, as comparators for sorting visible objects by distance or by material, predicates for filtering entities, event callbacks, small tasks for a job system, or operations passed into generic passes over data.

The difference between a lambda as a functor (the type is known statically, it's inlined, zero overhead) and a lambda wrapped in std::function (type erasure, an indirect call, a possible allocation) tells in the hot path, where std::function loses on speed. The difference isn't huge, but it exists, and understanding the mechanism helps you choose which one to pass where.
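The two ways of accepting a callable can be put side by side in a short sketch. The function names count_if_static and count_if_erased and the BelowThreshold functor are hypothetical; the sketch shows only the difference in the signatures, not a benchmark.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Template parameter: the callable's exact type is part of the instantiation,
// so the compiler sees the body of pred(x) and can inline it.
template <class Pred>
int count_if_static(const std::vector<int>& v, Pred pred) {
    int n = 0;
    for (int x : v)
        if (pred(x)) ++n;
    return n;
}

// std::function parameter: the callable is type-erased, each pred(x) goes
// through an indirect call, and large captures may require a heap allocation.
int count_if_erased(const std::vector<int>& v,
                    const std::function<bool(int)>& pred) {
    int n = 0;
    for (int x : v)
        if (pred(x)) ++n;
    return n;
}

// A stateful functor: remembers the threshold it was created with.
struct BelowThreshold {
    int threshold;
    bool operator()(int x) const { return x < threshold; }
};
```

Both functions accept the same functors and lambdas and return the same results; what changes is whether the callable's type survives until the call site inside the loop.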

Object Generator

This is a helper function that creates an object, deducing its template parameters from its arguments, so you don't have to write them out by hand. Before C++17, template classes couldn't deduce arguments from the constructor, and to create a std::pair<int, std::string> you had to either write out all the types (std::pair<int, std::string>{1, "a"}) or call std::make_pair(1, "a"), which deduces the types for you, and that very make_pair is the object generator, a factory function that exists solely for type deduction.

The point here is to offload type deduction onto the template function argument deduction mechanism, which always worked, unlike deduction for template classes, which didn't exist. A function make_X(args...) looks at the argument types and deduces the template parameters of X from them and returns a constructed X<...>, which removes duplication (no need to repeat types that are already visible from the arguments) and makes the code shorter and more resilient to type changes.

The price is separate functions for each template (make_pair, make_tuple, make_shared, make_unique, bind...), and this was clearly a workaround for a missing language feature. C++17 brought CTAD (class template argument deduction), deducing class template arguments straight from the constructor, after which many object generators became unnecessary, and now you can write std::pair{1, std::string{"a"}}, which deduces the types itself.

But historically most generators outlived CTAD, because they make for slightly more readable code and are less prone to accidental errors.

This idea is one of the fundamental ones in the STL and Boost: std::make_pair (Stepanov and company), which then grew into std::make_tuple, std::make_shared, boost::bind/std::bind, and remained a generic technique after C++17.

// An object generator: the types are deduced from the arguments, no need to write them by hand.
// (Named make_my_pair so an unqualified call can't collide with std::make_pair through ADL.)
template <class A, class B>
std::pair<A, B> make_my_pair(A a, B b) { return {std::move(a), std::move(b)}; }

auto p = make_my_pair(42, std::string{"hp"});   // deduced pair<int, string>

// C++17 CTAD removes the need for the generator in many cases:
std::pair p2{42, std::string{"hp"}};            // also pair<int, string>, without make_

Object generators in the form of make_unique/make_shared are an everyday tool for creating owning pointers, while make_pair/make_tuple do crop up now and then in generic code, although CTAD has displaced much of that. Engines often write their own generators for their template types in the form of make_handle, make_span, or factories of typed identifiers and wrappers.
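A hand-written generator is most useful when the template argument can't be spelled at all, such as the type of a lambda. Here is a sketch of that case; ScopeGuard and make_scope_guard are hypothetical engine-side names, not part of the standard library.

```cpp
#include <cassert>
#include <utility>

// Hypothetical template: runs a callable when the guard goes out of scope.
template <class F>
class ScopeGuard {
    F on_exit_;
    bool active_ = true;
public:
    explicit ScopeGuard(F f) : on_exit_(std::move(f)) {}
    ScopeGuard(ScopeGuard&& o) noexcept
        : on_exit_(std::move(o.on_exit_)), active_(o.active_) {
        o.active_ = false;               // the moved-from guard must not fire
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
    ~ScopeGuard() { if (active_) on_exit_(); }
};

// The object generator: F is deduced from the argument, so the caller never
// has to name the (unnameable) lambda type.
template <class F>
ScopeGuard<F> make_scope_guard(F f) {
    return ScopeGuard<F>(std::move(f));
}

// Returns how many times the cleanup ran after the inner scope ended.
int demo_guard() {
    int released = 0;
    {
        auto guard = make_scope_guard([&released] { ++released; });
        // ... work with the resource ...
    }
    return released;
}
```

With C++17 CTAD the call could be written as ScopeGuard guard{[&released] { ++released; }}; the generator remains the portable spelling for older standards.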

The value of make_shared lies not only in brevity: it combines the allocation of the object and of the control block into one (fewer cache misses, less work for the allocator). make_unique has no control block to merge, but before C++17 both functions also closed a leak in expressions like f(std::unique_ptr<T>(new T), g()), where an exception thrown between the new and the smart pointer's constructor could leak the object. So these generators are worth preferring over raw new not just for brevity.

Object Template

The Object Template (not to be confused with C++ templates) is a mechanism about serializable "template" objects that describe how to create and configure an object, separating the description from the instance itself. The idea is that you have description data (template/blueprint/archetype) that specifies the object's initial state and composition, and a factory that produces real objects from that description. The description itself is also data that can be loaded from a file, edited in a tool, or replicated.

And the point of the mechanism is to separate "what the object looks like and what it's made of" from "this specific live object." The description exists as editable data, not as code, and therefore can be changed without recompiling, stored in assets, versioned, reused. One "enemy template" produces hundreds of concrete enemies, but editing the template changes all future ones (and sometimes the existing ones), and this is already more of a data-driven approach, where the behavior and composition of objects are defined by data rather than hard-coded into classes.

The price is a mechanism for mapping description data onto real types and fields (essentially a form of reflection or type registration), and you have to settle the question of inheritance among the templates-descriptions themselves, where one template extends another, which complicates the whole system. And you have to figure out what to do with already-created objects when their template changes, but the benefit outweighs it, because flexibility and the ability to hand content creation to designers without a programmer is the best thing a programmer can do in an engine.

In gamedev object templates are one of the central architectural ideas, known under the names prefab, blueprint, archetype, data template, and its origin lies in data-driven design itself, which the industry developed so that designers could create and configure content without programmers, and so that iterations wouldn't require rebuilding the code.

// The "blueprint" description as data (loaded from a file, edited in a tool):
struct EntityTemplate {
    std::string name;
    float max_health = 100;
    std::vector<ComponentDesc> components;   // which components and with which parameters
};

// The factory produces live entities from the blueprint:
Entity spawn_from_template(const EntityTemplate& tmpl, World& world) {
    Entity e = world.create();
    for (const auto& c : tmpl.components)
        component_registry().build(c, e, world);   // mapping data onto real types
    return e;
}
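The "mapping data onto real types" step is usually a registry of builder functions keyed by a type name. Below is a deliberately tiny, self-contained sketch of that idea; every name in it (ComponentDesc's fields, ComponentRegistry, param_or, make_registry, the "Health" and "Mover" keys) is hypothetical, and the Entity here is a plain struct rather than an ECS handle.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Description data: what to build and with which parameters.
struct ComponentDesc {
    std::string type;                       // e.g. "Health"
    std::map<std::string, float> params;    // e.g. {"max", 250}
};

struct EntityTemplate {
    std::string name;
    std::vector<ComponentDesc> components;
};

// The "live" object, reduced to two fields for the example.
struct Entity {
    float health = 0;
    float speed = 0;
};

// Type registration: a builder function per component type name.
class ComponentRegistry {
    using Builder = std::function<void(const ComponentDesc&, Entity&)>;
    std::map<std::string, Builder> builders_;
public:
    void add(const std::string& type, Builder b) { builders_[type] = std::move(b); }

    // Returns false when the description names a type nobody registered.
    bool build(const ComponentDesc& desc, Entity& e) const {
        auto it = builders_.find(desc.type);
        if (it == builders_.end()) return false;
        it->second(desc, e);
        return true;
    }
};

float param_or(const ComponentDesc& d, const std::string& key, float fallback) {
    auto it = d.params.find(key);
    return it == d.params.end() ? fallback : it->second;
}

ComponentRegistry make_registry() {
    ComponentRegistry r;
    r.add("Health", [](const ComponentDesc& d, Entity& e) { e.health = param_or(d, "max", 100.0f); });
    r.add("Mover",  [](const ComponentDesc& d, Entity& e) { e.speed  = param_or(d, "speed", 1.0f); });
    return r;
}

// The factory: one description, any number of live entities.
Entity spawn_from_template(const EntityTemplate& tmpl, const ComponentRegistry& registry) {
    Entity e;
    for (const auto& c : tmpl.components)
        registry.build(c, e);
    return e;
}
```

In a real engine the descriptions would come from a file and the builders would be registered by the component types themselves; the lookup-by-name core stays the same.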

All of this has evolved into Unity's prefabs, Unreal's blueprints, archetypes in ECS engines, and enemies, items, effects, and even levels are now described by template data that designers create and edit in the editor, and the engine instantiates at runtime. This lets you create enormous volumes of content without a programmer and iterate without rebuilding the game.

Beneath this data-driven superstructure there almost always lies C++ infrastructure for reflection, type registration, and factories (often built on attach-by-initialization for self-registering components and on member detector/traits for field serialization). That is, the object template as a high-level idiom relies on a whole set of low-level C++ ideas higher up the list, without which it would be impossible, and this is a good example of how separate abstract template techniques add up into a coherent foundation, on which stands a quite tangible and production-important capability: handing content creation to designers, and giving programmers time to drink a cup or two of coffee.

Iterator Pair

This is a fundamental STL mechanism, one the library introduced for its own algorithms. A range of elements is specified not by a container nor by a pointer with a length, but by a pair of iterators, to the beginning and to "one-past-the-end." All the standard algorithms are built on top of (begin, end), and this pair fully describes the sequence to work on, knowing nothing about which container it lives in, or whether it lives in one at all.

This is how algorithms and containers were decoupled, because std::sort doesn't need to know whether it's sorting a vector, a piece of an array, or a range inside a deque, and a pair of iterators of the right category is enough for it.

On this idea rests the entire orthogonality of the STL: "M containers × N algorithms = M+N code instead of M×N," and algorithms are written once for iterators, while containers provide iterators, enabling any algorithm to work with any container.

The price is an unspoken agreement that the pair is valid, which the type system can't check: the compiler sees only that the two iterators have the same type. You can easily pass iterators from different containers or mix up the order, getting undefined behavior with no diagnostics whatsoever.

On top of that we got the problem of iterator invalidation when the container is modified, a separate eternal source of bugs. People have tried to solve this since the earliest days of the STL, and the effort eventually produced the concept of ranges, where a range is a single object (std::ranges::sort(v) instead of sort(v.begin(), v.end())). Inside it is still a pair of iterators, but from the outside it's a single entity that's harder to spoil and that can be lazily composed via views.

This idea about iterators is the very heart of Stepanov's STL: he deliberately built the whole library around iterators as a generalization of pointers, and the iterator pair will live with us as long as the STL itself lives. This is perhaps the most influential design decision in the history of the C++ standard library, defining what generic code looks like and will look like for decades to come.

// The algorithm works with a pair of iterators, knowing nothing about the container.
// (Named find_value so an unqualified call can't collide with std::find through ADL.)
template <class It, class T>
It find_value(It begin, It end, const T& value) {
    for (; begin != end; ++begin)
        if (*begin == value) return begin;
    return end;                            // "not found" = iterator to one-past-the-end
}

std::vector<int> hp = {100, 80, 0, 60};
auto dead = find_value(hp.begin(), hp.end(), 0);     // works with vector
int arr[] = {5, 3, 8};
auto it = find_value(arr, arr + 3, 8);               // and with a raw array, the same code

// C++20 ranges: the range as a single object
std::ranges::sort(hp);

It's worth noting separately that the iterator interface lets you apply algorithms to sub-ranges and to non-standard "containers" like chunks of a pool or views over someone else's memory, which comes up constantly in engines. C++20 ranges and std::span (a non-owning view of a range as a "pointer + length" pair) made this even more convenient and safer, and modern engine code increasingly uses span, which has caught on especially well as a way to pass "a view of an array" without binding to a specific container and without copying.

Generic Container

By this people mean a set of conventions a container follows in order to become a "good citizen" in the STL world, that is, so that standard algorithms and range-based for work with it, and so that it behaves predictably for everyone used to the standard containers. It's a body of rules about which nested types to provide (value_type, iterator, const_iterator, size_type), which methods (begin/end, size, empty), and what copy and swap semantics.

These are the rules of "good manners for a container," and if your container provides begin()/end(), it can already be used in a range-based for and in standard algorithms. If it adds the right nested typedefs, iterator_traits and generic code that asks about them work with it. If it has correct swap and value semantics, it plays nicely with containers of containers and copy-and-swap. Following these conventions makes your type interchangeable with the standard ones and embeddable into the existing ecosystem without special code.

The price is code, of which you have to write a great deal and by hand, and historically all the rules and conventions were informal (just "do as std::vector does"), and it's easy to miss something and end up with a container that almost works. Plus a full implementation of all the requirements (including allocator-awareness, correct iterator categories, exception safety) is a serious amount of code, and only C++20 formalized a significant part of these implicit conventions in the form of concepts (std::ranges::range, std::input_iterator, etc.).

These conventions emerged as the STL evolved and for decades lived as its spirit, in the form of the "container requirements" from the standard, and as the practice of "imitate the standard containers."

// The minimum for a custom container to play nicely with range-for and algorithms:
template <class T, std::size_t N>
class FixedVector {
    T data_[N];                           // requires T to be default-constructible
    std::size_t size_ = 0;
public:
    using value_type = T;                 // nested typedefs for generic code
    using size_type = std::size_t;
    using iterator = T*;
    using const_iterator = const T*;

    iterator begin() { return data_; }    // begin/end => range-for and algorithms work
    iterator end()   { return data_ + size_; }
    const_iterator begin() const { return data_; }   // const overloads for const containers
    const_iterator end()   const { return data_ + size_; }
    size_type size() const { return size_; }
    bool empty() const { return size_ == 0; }
};

FixedVector<Enemy, 64> enemies;
for (auto& e : enemies) { /* works */ }
std::sort(enemies.begin(), enemies.end());   // and algorithms too (given operator< for Enemy)

Most popular engine container libraries (EASTL by Electronic Arts being the best-known example) deliberately replicate the STL interface precisely for this compatibility, differing in implementation (allocators, growth, debug checks) but not in interface. This gives the best of both worlds: control over the memory and performance of your own containers plus compatibility with the entire ecosystem of generic code. So understanding "what makes a container a container" is a practical skill for a developer who cares about more than just academic knowledge of the standard library.

Erase-Remove

This is the idiom for removing elements from a sequence container, born from the non-obvious design of the std::remove algorithm. std::remove (and remove_if) removes nothing, because it works with a pair of iterators and has no access to the container itself to change its size. Instead, remove rearranges the elements and shifts the "wanted" elements to the front, overwriting the "unwanted" ones with them, and then returns an iterator to the new logical end, while the "tail" after it is left with unspecified (but valid) contents.

To actually delete the data you now need a second step, the container's erase method, which you pass the range from the new logical end to the old physical end, like v.erase(std::remove_if(v.begin(), v.end(), pred), v.end()). remove_if compacts the wanted elements and returns the boundary, while erase trims the tail. Hence the name.

The price of this mechanism is its two-step nature, which generations of programmers have stumbled over, and simply calling remove_if without erase is a classic mistake, after which the "removed" elements remain in the container (the size hasn't changed), merely having moved to the tail.
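The mistake and its fix are easy to see side by side. This is a minimal sketch; is_dead, size_after_remove_only, and remove_dead are hypothetical helper names, and plain ints stand in for enemies' hit points.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

bool is_dead(int hp) { return hp <= 0; }

// The mistake: remove_if alone compacts the survivors to the front
// but cannot shrink the vector.
std::size_t size_after_remove_only(std::vector<int> hp) {
    auto new_end = std::remove_if(hp.begin(), hp.end(), is_dead);
    (void)new_end;            // the returned boundary is ignored: that's the bug
    return hp.size();         // same size as before the call
}

// The fix: hand the boundary to erase, which trims the tail.
std::vector<int> remove_dead(std::vector<int> hp) {
    hp.erase(std::remove_if(hp.begin(), hp.end(), is_dead), hp.end());
    return hp;
}
```

After the buggy version, only the elements before the returned boundary are meaningful; the values in the tail are unspecified and should not be read as "the removed ones."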

The mechanism is unintuitive and requires you to remember why there are two calls here. Only C++20 finally gave us the direct functions std::erase and std::erase_if, which take a container and a predicate and do everything in one call. That is, after more than twenty years of erase-remove the language acknowledged that people just wanted to "remove elements by condition."

This is a direct consequence of the STL's design, where algorithms are deliberately decoupled from containers and therefore physically cannot change their size. Meyers devoted separate items in Effective STL to erase-remove, explaining both the idiom itself and why remove is built exactly this way, and those pages saved quite a few people from the "removal that doesn't remove" bug.

// Remove all dead enemies from a vector:
enemies.erase(
    std::remove_if(enemies.begin(), enemies.end(),
                   [](const Enemy& e){ return e.hp <= 0; }),   // compacts the living, returns the boundary
    enemies.end());                                            // erase trims the dead tail

// C++20 — the same thing in one call:
std::erase_if(enemies, [](const Enemy& e){ return e.hp <= 0; });

It's worth noting, though, that in the hot path engines often use the faster swap-and-pop technique if element order doesn't matter, where the element being removed is swapped with the last one and the vector is shortened by one, giving removal in constant time instead of a shift. Erase-remove, on the other hand, is irreplaceable when order matters or when many elements are removed at once by a condition, but in any case you have to understand that std::remove by itself removes nothing.
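Swap-and-pop itself fits in a few lines. This is a sketch rather than anyone's engine code: the name swap_and_pop and its signature are assumptions, and the caller is expected to pass a valid index.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Removes v[index] in constant time by swapping it with the last element
// and popping the back. Element order is NOT preserved.
// Precondition: index < v.size().
template <typename T>
void swap_and_pop(std::vector<T>& v, std::size_t index) {
    if (index != v.size() - 1) {
        using std::swap;
        swap(v[index], v.back());   // the victim moves to the end
    }
    v.pop_back();                   // shrink by one; nothing else is shifted
}
```

Removing index 1 from {1, 2, 3, 4} this way leaves {1, 4, 3}: the last element takes the freed slot, which is exactly why the technique only suits containers where order doesn't matter.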

Clear-and-minimize

This mechanism solves a specific problem: how to not merely clear a container of its elements, but also make it give back the memory it occupied (to the allocator; whether it then returns to the operating system depends on the allocator). vector::clear() destroys the elements but doesn't free the allocated capacity, so the vector is left with the same reserved memory, ready to be filled again without new allocations. Usually that's what you want, but sometimes you want precisely to return the memory, for example when the container ballooned to a peak and will never be that large again.

The classic pre-C++11 solution was "swap with an empty temporary container," std::vector<T>().swap(v), where an empty temporary vector is created (with zero capacity), it swaps contents with yours, and now your vector is empty and with zero capacity, while the temporary took the old large buffer and immediately dies at the end of the expression, freeing it. An elegant trick that exploits the fact that swap exchanges both the data and the capacity, and the temporary's destructor does the dirty work.

But the price is the non-obviousness of the technique itself (to the uninitiated vector<T>().swap(v) looks like pointless nonsense), and freeing the memory is not always a win: often keeping the capacity for reuse is more advantageous than returning the memory and then allocating it anew. C++11 finally gave us the more readable, if non-binding, shrink_to_fit(), and for a full clear with release, the combination clear() + shrink_to_fit(), which expresses the intent explicitly rather than through the swap trick.

The idiom was described by Scott Meyers in Effective STL, but it's often skipped. Many forget that a vector has two independent notions, size (how many elements) and capacity (how much memory is allocated), that standard operations affect them differently, and that managing capacity requires separate techniques.

std::vector<Particle> particles;
particles.resize(1'000'000);
// peak load: allocated memory for a million

// ... the surge is over, particles are no longer needed in such volume ...

particles.clear();
// size 0, but the memory for a million is STILL occupied

std::vector<Particle>().swap(particles);
// pre-C++11: this genuinely frees the memory

// or C++11:
particles.clear();
particles.shrink_to_fit();
// more readable: clear and ask to return the memory (the request is non-binding)

In games managing container capacity is still a separate concern, because the memory budget, especially on consoles, is tight, and peak surges (loading a level, a mass effect, temporary buffers) can balloon containers to sizes that are no longer needed and simply occupy memory. Clear-and-minimize lets you return that memory after the surge.

But in the hot path the opposite strategy applies, where capacity is precisely retained to reuse buffers between frames without repeated allocations. A particle vector or a list of visible objects is cleared via clear() (keeping the capacity) at the start of each frame and refilled, and this is a classic way to avoid allocations within a frame. So clear-and-minimize and "clear while retaining capacity" are two tools for two opposite situations.
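The two tools can sit side by side in one wrapper. The sketch below assumes a hypothetical FrameParticles class and Particle payload; neither comes from a real engine.

```cpp
#include <cstddef>
#include <vector>

struct Particle { float x, y, z; };   // hypothetical payload

// Hypothetical per-frame buffer: cleared every frame, shrunk only on request.
class FrameParticles {
    std::vector<Particle> items_;
public:
    // Hot path: size drops to 0, the allocated buffer is kept for reuse.
    void begin_frame() { items_.clear(); }

    void add(const Particle& p) { items_.push_back(p); }

    std::size_t size() const { return items_.size(); }
    std::size_t capacity() const { return items_.capacity(); }

    // The opposite tool, for after a one-off surge: clear-and-minimize
    // via the swap trick, so the old buffer dies with the temporary.
    void release_memory() { std::vector<Particle>().swap(items_); }
};
```

Calling begin_frame() every frame costs no allocations once the buffer has grown to its working size; release_memory() is the call to make when a level unloads or a mass effect is over.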

Shrink-to-fit

Logically, this idea (and since C++11 a built-in method too) generalizes the section above: it brings a container's capacity into line with its actual size, when you need to remove the "excess" reserved memory and leave exactly as much as is needed for the current elements. It's a relative of clear-and-minimize, but more general: it reduces capacity to the size not only for an empty container, but for any container whose capacity noticeably exceeds the number of elements.

Vectors have the property of growing with headroom (typically multiplying the capacity by 1.5 or 2 on overflow, to amortize the cost of growth), and after a series of additions and removals the capacity can turn out to be two or three times larger than actually needed. If such a container is going to live in this state for a long time, the excess memory is wasted, and Shrink-to-fit says "fit the capacity to the size," returning the surplus.

The price is that shrink_to_fit() is a non-binding request: the standard allows an implementation to ignore it and reduce nothing. In practice the major implementations honor it, but you can't rely on guaranteed release. Besides, reducing capacity is usually implemented through reallocation and moving all the elements into a new, precisely fitted buffer, that is, it's not a free operation but a full pass with an allocation, and abusing it in the hot path is a so-so idea. Before C++11 the same effect was achieved with the same swap trick as in clear-and-minimize, only swapping with a copy (std::vector<T>(v).swap(v)).

std::vector<RenderCommand> commands;
commands.reserve(10000);          // reserved with a large margin
build_commands(commands);         // but actually filled, say, 1200

// The frame is built, the command list will live until the end of the frame — return the excess:
commands.shrink_to_fit();         // capacity is ~fitted to 1200 (non-binding!)

// pre-C++11 equivalent:
std::vector<RenderCommand>(commands).swap(commands);

In games shrink-to-fit is applied selectively and deliberately, usually for long-lived containers that ballooned and whose excess memory is more advantageous to return than to hold. For example, containers filled during level loading with a margin and then "frozen" for the entire time the player is on the level, and after loading it's reasonable to shrink them, freeing memory for gameplay.

But in per-frame, reused containers shrink-to-fit is usually contraindicated, and there the excess capacity is a handy feature that spares you allocations on the next fill, and shrinking such a container would mean paying for a reallocation every frame. So the practical rule in engines is: shrink only what ballooned and won't grow again.
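One way to encode that rule is a small guard around the call. This is a sketch under stated assumptions: the helper name shrink_if_wasteful and the "at least half unused" threshold are invented here, not taken from any library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: asks for a shrink only when at least half of the
// reserved capacity is unused. Returns true if the request was made.
// shrink_to_fit() remains non-binding, so the resulting capacity is
// whatever the implementation decides.
template <typename T>
bool shrink_if_wasteful(std::vector<T>& v) {
    const std::size_t used = v.size();
    const std::size_t reserved = v.capacity();
    if (reserved == used || reserved / 2 < used) {
        return false;          // tight enough: a reallocation isn't worth it
    }
    v.shrink_to_fit();         // typically reallocates and moves all elements
    return true;
}
```

A natural place to call such a helper is once after level loading, on the containers that will stay frozen for the rest of the level, and never inside the per-frame loop.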

Safe bool

This is a solution to a tricky problem before C++11: how to give a class the ability to be used in a boolean context (if (obj), while (ptr)) without opening the door to dangerous implicit conversions in the process. The naive solution was to add an operator bool(), which generally works but has horrible side effects in the form of an implicit conversion to bool, and bool to int, so that nonsensical expressions like obj << 1, obj + 5, or comparing two unrelated smart pointers via conversion to bool suddenly start compiling, and the object "leaks" into arithmetic and comparisons where it has no business.
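To see the leak concretely, here is a small sketch with a hypothetical NaiveHandle type. The type trait and static_assert come from C++11 and are used only to make the check visible; the leak itself exists in every version of the language.

```cpp
#include <type_traits>

// Naive version: a plain (implicit) conversion to bool.
struct NaiveHandle {
    bool valid;
    operator bool() const { return valid; }
};

// Every expression below compiles, because the bool is promoted to int.
inline int leak_into_arithmetic(const NaiveHandle& h) { return h + 5; }
inline int leak_into_shift(const NaiveHandle& h)      { return h << 1; }
inline bool leak_into_comparison(const NaiveHandle& a, const NaiveHandle& b) {
    return a == b;   // compares the two converted bools, not the handles
}

static_assert(std::is_convertible<NaiveHandle, int>::value,
              "the naive handle silently converts to int");
```

None of these functions makes sense for a handle, yet the compiler accepts all three without complaint.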

Safe bool worked around this by returning not a bool but a pointer to a member function (or another type convertible to bool in a condition but not to int), and such a type is fit for a check in an if, because a pointer is compared against zero, but it doesn't participate in arithmetic and doesn't convert to a number, which cuts off the entire dangerous class of expressions. The implementation was generally cumbersome: you needed a private member type, returning a pointer to a dummy method, and all of it just so that if (obj) would work while int x = obj + 1 would not.

The price was the verbosity and non-obviousness of the solution, plus it still didn't close all the holes. Only C++11 finally solved the problem by introducing explicit conversion operators, like explicit operator bool() const. The keyword explicit means the conversion fires automatically in a boolean context (the condition of an if or while, the operands of !, &&, and ||) but does not fire implicitly anywhere else. In a word, exactly what safe bool achieved through a nightmarish workaround is now expressed in a single word.

The idea itself in its canonical form was described by Björn Karlsson and popularized by the Boost authors (where it was used in boost::shared_ptr and other smart pointers before C++11), with a detailed breakdown in articles of the period. The appearance of explicit conversion operators in C++11 was largely a matter of dragging their ideas into the standard, when the committee acknowledged that since everyone was cobbling together this idiom, the language needed a proper mechanism.

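// pre-C++11: safe bool via a pointer to member (cumbersome!)
class Handle {
    // A pointer-to-member type: testable in a condition, useless in arithmetic
    typedef void (Handle::*safe_bool_type)() const;
    void this_type_does_not_support_comparisons() const {}
    bool valid_;
public:
    Handle() : valid_(false) {}
    operator safe_bool_type() const {
        return valid_
                 ? &Handle::this_type_does_not_support_comparisons
                 : 0;   // null pointer to member (no nullptr or decltype before C++11)
    }
};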

// C++11 — the same thing, in one word:
class Handle2 {
    bool valid_ = false;
public:
    explicit operator bool() const { return valid_; }
    // works in if, but not in arithmetic
};

Handle2 h;
if (h) { /* OK */ }
// int x = h + 1;   // compile error — and thank goodness

In modern engine code this is simply explicit operator bool, and it's worth adding to all handles, optional wrappers, and result types, so they naturally work in an if while staying protected from accidental arithmetic silliness. Safe bool is interesting mainly as a historical lesson, where a whole cumbersome idiom existed only because the language lacked a single keyword, and its disappearance is a good example of how language evolution drags good ideas inside.

Type Safe Enum

Once a hand-written idiom, and since C++11 a language feature: a way to create enumerations that don't suffer from the type-safety holes of the classic C-style enum. The old unscoped enumerations had three woes: their names "spill" into the surrounding scope (two enums with a member Red conflict), they implicitly convert to int (which lets you add, compare, and confuse values of different enumerations), and their underlying type is not fixed (the implementation chooses it), which hinders forward declaration and size control.

Formerly this was cured by wrapping the enumeration in a class, and then the values became static constants or objects of the wrapper class, the scope was confined to the class, and implicit conversions to int were forbidden. The result was a "real" type-safe enum, where the values must be qualified by the type name, and different enumerations can't be mixed. The compiler itself caught attempts to use a value of one enum where another is expected.
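A minimal sketch of such a wrapper in the pre-C++11 style follows. The class name Color and its interface are illustrative assumptions; real libraries had many variations on this theme.

```cpp
// Hypothetical pre-C++11 style wrapper around a nested enum.
class Color {
public:
    enum Value { Red, Green, Blue };        // enumerators are scoped as Color::Red

    Color(Value v) : value_(v) {}           // constructible only from its own values
    Value value() const { return value_; }  // explicit way out, e.g. for a switch

    friend bool operator==(Color a, Color b) { return a.value_ == b.value_; }
    friend bool operator!=(Color a, Color b) { return a.value_ != b.value_; }

private:
    Value value_;                           // no conversion operator: Color is not an int
};
```

A Color object won't convert to int, and a function taking Color won't accept a value of another wrapped enum. The sketch also shows where the wrappers leaked: the raw enumerator Color::Red is still an unscoped enum value and still converts to int on its own, which is one reason the language-level enum class was needed.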

The price was verbosity, and you had to reproduce all of an enum's behavior by hand (down to use in a switch, as a template parameter, as a value), which was hard and cumbersome.

C++11 solved the problem by introducing enum class (scoped enumerations), and such names no longer pollute the scope (Color::Red), there's no implicit conversion to int (an explicit static_cast is needed), and you can specify the underlying type (enum class Flags : std::uint8_t), controlling the size and allowing forward declaration. This is exactly what the wrappers were after before, but now it's part of the language.

The enum class proposal for C++11 came in part from Stroustrup, and it became the canonical solution. Where you want arithmetic or bitwise operations, for which enum class requires either explicit casts or operator overloads, people still write helper wrappers and macros.
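The operator-overload route usually looks something like the sketch below. The enum RenderFlags, its members, and the has_flag helper are hypothetical names chosen for illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical flag set with a fixed one-byte underlying type.
enum class RenderFlags : std::uint8_t {
    None        = 0,
    CastShadow  = 1 << 0,
    Transparent = 1 << 1,
    Wireframe   = 1 << 2
};

// enum class has no built-in | or &, so they are spelled out once,
// with the casts hidden inside.
constexpr RenderFlags operator|(RenderFlags a, RenderFlags b) {
    using U = std::underlying_type<RenderFlags>::type;
    return static_cast<RenderFlags>(static_cast<U>(a) | static_cast<U>(b));
}

constexpr RenderFlags operator&(RenderFlags a, RenderFlags b) {
    using U = std::underlying_type<RenderFlags>::type;
    return static_cast<RenderFlags>(static_cast<U>(a) & static_cast<U>(b));
}

// True when every bit of `flag` is set in `set`.
constexpr bool has_flag(RenderFlags set, RenderFlags flag) {
    return (set & flag) == flag;
}
```

The casts live in one place, and call sites read as RenderFlags::CastShadow | RenderFlags::Wireframe while still refusing to mix with flags of any other enumeration.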

// C-style enum — leaky:
enum Color { Red, Green, Blue };        // Red leaks into the scope
enum Fruit { Apple, Banana, Red2 };     // name conflict if you call it Red
int x = Red + Blue;                     // compiles (when it shouldn't)

// C++11 enum class — type-safe:
enum class Team : std::uint8_t { Red, Blue, Neutral };   // size specified
Team t = Team::Red;                     // name qualified
// int y = t;                           // error: no implicit conversion
int y = static_cast<int>(t);            // explicit cast needed — the intent is visible

In games type-safe enumerations have long been an everyday thing and a great blessing, because game code is riddled with enumerations, like states (idle/walk/attack/dead), commands, factions, damage, or collisions. The C-style enum in a large codebase was a constant source of bugs, because mixed-up values of different enumerations compiled and masked errors, whereas enum class catches this at compile time. So type safe enum in the form of enum class became a "free" practice that costs nothing at runtime but noticeably shrinks a whole class of errors, and it's worth applying by default everywhere, except for cases of compatibility with an old C API.

Attorney-Client

This idea solves a problem inherent in the friend mechanism: when you declare a class or function a friend, you give it access to all the private members at once, and there's no way to say "be friends only with these three methods, and not with the rest," which violates encapsulation. And a friend often needs access to one or two details, but it gets the keys to the whole apartment.

The idea is that the "client" (your class with private members) is friends not directly with whoever needs access, but with an intermediary class, the "attorney." The attorney is a friend of the client and therefore sees everything private, but exposes outward only a narrow, carefully chosen set of static methods that open exactly those details a specific acquaintance needs. The acquaintance is now not your friend, but works through the attorney and gets access only to what the attorney decided to provide. This way a sweeping friendship is narrowed down to just the necessary acquaintance.

The price is an extra layer of boilerplate code, purely to configure access, which complicates understanding and is needed fairly rarely. Often the need for attorney-client is a sign that you've made the class too large and should reconsider its architecture (perhaps the private detail should be a separate entity). Plus it's still not real protection: the keys to the apartment are now with the attorney, and sufficiently motivated code can still bypass encapsulation. The idiom expresses intent but doesn't guarantee that nobody else gets in. This is a niche but elegant technique for libraries that need to open part of their internals to specific cooperating components without flinging them open to the whole world.

class Renderer {
    void set_internal_state(int);     // private detail
    friend class RendererAttorney;    // friends only with the attorney
};

// The attorney exposes EXACTLY ONE method outward, and only to trusted code
class RendererAttorney {
    static void set_state(Renderer& r, int s) { r.set_internal_state(s); }
    friend class DebugOverlay;        // here's who the attorney permits
};

// DebugOverlay gets access to
// one detail, without seeing the rest of Renderer's private parts
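For completeness, here is a compilable variant of the same sketch with the overlay actually going through the attorney. The internal_state_ field, the public getter, and DebugOverlay::force_state are assumptions added only so the effect can be observed.

```cpp
class Renderer {
    int internal_state_ = 0;
    void set_internal_state(int s) { internal_state_ = s; }   // private detail
    friend class RendererAttorney;                            // the only friend
public:
    int internal_state() const { return internal_state_; }    // hypothetical getter for the demo
};

class RendererAttorney {
    // Private: only the attorney's own friends may call this.
    static void set_state(Renderer& r, int s) { r.set_internal_state(s); }
    friend class DebugOverlay;
};

class DebugOverlay {
public:
    void force_state(Renderer& r, int s) const {
        RendererAttorney::set_state(r, s);   // OK: access is routed through the attorney
        // r.set_internal_state(s);          // error: DebugOverlay is not a friend of Renderer
    }
};
```

Friendship is not transitive in C++, which is what makes the scheme work: DebugOverlay is a friend of the attorney, the attorney is a friend of Renderer, but DebugOverlay itself sees nothing of Renderer beyond what the attorney forwards.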

In games attorney-client appears in specific places at the boundaries of engine subsystems, where, say, a serializer or a debug inspector needs to dig into a class's private guts, but only into strictly defined ones, not all of them. Instead of making the serializer an all-encompassing friend of every class (flinging open all the encapsulation just to save a couple of fields), the attorney opens up exactly the needed fields to it.

But in general, large engines more often solve this task through reflection systems that register serializable fields explicitly, or through a well-thought-out split into public/private interface, than through attorney-client. The idiom remains more of an elegant tool from the library author's arsenal, useful to know so you can recognize it in someone else's code and understand that what you're looking at is an attempt to make friendship granular. In everyday game code you'll see it rarely, and that's probably fine, and if you find yourself frequently doling out private access, it's usually simpler to reconsider who owns what.
