Don't panic, you've just landed in AA+ gamedev

Dec 3, 2025 · 10 min

I wrote this article about ten years ago, when I'd just joined the big studio EA SpB. I'd have forgotten about it, but recently, while going through old notes and drafts on an old HDD, I decided it's still relevant. Back then projects under a million LoC seemed like giants; that's probably still a lot, but now it's just the engine code. The essence hasn't changed, the numbers just got bigger.

I remember the day I first sat down at my desk in the office. Before that I'd worked on other projects, and a codebase of 100k lines including libraries, engine and logic seemed — well, pretty sizable. And here I downloaded the repo, opened the IDE, and it hung for about fifteen minutes indexing files. I stared at all this mess and thought: "Is this normal? Did they give me the crappiest junior PC? Did I already break something during onboarding? Are we all going to die?" No, everything was fine, I'd just encountered the industrial codebase of a big project for the first time.

At that moment the team was shipping a major Sims Mobile release that had slipped almost a month, and honestly there simply weren't resources for proper onboarding. The PM handed me a simple warm-up task, so to speak — to make the chairs in the home editor keep their position and size between launches, because, as you've probably guessed, they didn't, and ended up at the default spawn points. Sounds elementary, right? Save the coordinates to a config, read them on start, and the task is done. Except I had no idea where to look for the code that saves the config and the objects, what the class is called, where the config and the chairs themselves live, and whether there's even a system for such things at all or whether I'd have to write it from scratch. Small spoiler: there was no system, all objects in the home always spawned at the points where the designer placed them, i.e. there was no save for the home editor, but there was one for the game.
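
As a rough illustration of the "elementary" version of that task, here is a minimal sketch in Python rather than the engine's C++. Everything in it is an assumption made for the example: the Placement fields, the JSON file and the chair_01 style ids are hypothetical, and the real project had no such system at all.

```python
import json
import tempfile
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class Placement:
    """Position and uniform scale of one piece of furniture (hypothetical layout)."""
    x: float
    y: float
    scale: float = 1.0


def save_placements(path: Path, placements: dict[str, Placement]) -> None:
    """Write every object's placement to a JSON config file."""
    data = {object_id: asdict(p) for object_id, p in placements.items()}
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")


def load_placements(path: Path, defaults: dict[str, Placement]) -> dict[str, Placement]:
    """Start from the designer's spawn points and override them with saved values."""
    result = dict(defaults)
    if not path.exists():
        return result  # first launch: nothing has been saved yet
    saved = json.loads(path.read_text(encoding="utf-8"))
    for object_id, fields in saved.items():
        if object_id in result:  # ignore objects that no longer exist in the home
            result[object_id] = Placement(**fields)
    return result


if __name__ == "__main__":
    defaults = {"chair_01": Placement(0.0, 0.0), "chair_02": Placement(2.0, 0.0)}
    config = Path(tempfile.gettempdir()) / "home_editor_placements.json"
    save_placements(config, {"chair_01": Placement(3.5, 1.25, 1.5)})
    print(load_placements(config, defaults))
```

Falling back to the designer's defaults keeps the old behavior for any chair that has never been moved, which is the part that is easy to forget when the task "sounds elementary."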

At university they taught us algorithms very well and, surprisingly, for two whole years we wrote from scratch the dream OS of one particular professor, feeding his desire to become the new Torvalds rather than make those games of yours. At my previous job I had to write almost all the code from scratch, out of my head or at most from whatever the in-house algorithmist-mathematician brought in his beak. That's the specifics of developing security systems and various hydroacoustic underwater software; you can read about it in the notes of a sled cat. None of this prepared me to open a project of nearly two million lines without the orderlies watching.

These days it's the norm — I don't mean the orderlies, mind you. If you've joined a studio working on something that isn't a tiny indie project, the first thing you'll have to learn is to navigate this monster of inherited code, sketchy patterns and unique ideas aged for umpteen years, written by a great many people before you. It doesn't matter whether it's an in-house engine, a licensed one, or something assembled from open-source components, but modern sizes still frighten the tender worldview of a newcomer, and most likely the first weeks you'll feel as if you were dropped into an unfamiliar city without a map, acquaintances or means of communication — the street names seem familiar, the intersections look alike, but it's utterly unclear how to get to the library at three in the morning. The good news — it passes, or else you pass on; you just need to know how to approach the task properly.

The giant codebases of modern engines

Modern game engines are giant software systems, and the real volume of source code often turns out to be much larger than expected. Exact figures are rarely published officially, so you have to rely on external estimates, studies and fragmentary facts.

Estimates of the size of Unreal Engine's codebase vary enormously even though the repository is available. Different sources cite a ballpark of roughly 2 to 20+ MLoc, but that includes the engine itself, the editor, the tooling, plugins and auxiliary modules. The large spread is more likely explained by the closed nature of most of the module and plugin code, which makes up 70% of the engine's ecosystem (although the engine source itself is freely available, mind you). It's also hard to count correctly in projects with heavy code generation, templates and dependencies.

Despite its reputation as a "lightweight" engine, Unity is actually also huge. The bulk of the engine is closed (but there are leaked repos from 2015 and 2017, plus my personal experience working with version 5 of the engine), and the sources of the C# layer of the UnityEngine and UnityEditor modules contain several hundred KLoc. The C++ engine (runtime + backends + rendering + asset pipeline) is estimated at up to about 1 MLoc; the developers themselves posted somewhere that the pure C++ part of the core systems takes around 500 KLoc.

Godot is the most compact of the "big" engines here, but even it has crossed the 1 MLoc mark. A quick analysis of the repo gives a total volume of around 1 MLoc, including resources and scripts; pure C++ takes at least 500 KLoc. Godot's relative compactness comes not from better code and architecture but from a smaller number of supported platforms and from the absence of a 3D editor at the level of UE and of heavy systems like Blueprint.
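
For what a "quick analysis of the repo" can look like, here is a minimal sketch that counts non-blank lines per file extension. The extension list and the skipped directory names are assumptions for the example, not a convention of any particular engine, and a count like this says nothing about generated code or templates.

```python
from collections import Counter
from pathlib import Path

# Extensions treated as source code; adjust for the project at hand (assumption).
SOURCE_EXTENSIONS = {".c", ".cc", ".cpp", ".h", ".hpp", ".inl", ".cs"}
# Directories that often hold third-party or generated code (assumption).
SKIPPED_DIRS = {".git", "ThirdParty", "Intermediate", "build"}


def count_lines(root: Path) -> Counter:
    """Return non-blank line counts per file extension under root."""
    totals: Counter = Counter()
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in SOURCE_EXTENSIONS:
            continue
        # Skip the file if any component of its relative path is a skipped directory.
        if SKIPPED_DIRS.intersection(path.relative_to(root).parts):
            continue
        with path.open(encoding="utf-8", errors="ignore") as handle:
            totals[path.suffix.lower()] += sum(1 for line in handle if line.strip())
    return totals
```

Calling `count_lines(Path("path/to/repo")).most_common()` lists the extensions from largest to smallest. Whether third-party and generated directories are included or skipped is exactly the kind of choice that makes published estimates differ so much.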

Dagor is an industrial engine of significant age and deep specialization, actively developed for over 15 years. There are no direct public figures, but by the developers' internal estimates and the scale of the projects (War Thunder, Enlisted, CR) the codebase is ~2+ MLoc. A significant part of that is C++ optimized for various platforms, including consoles, PC and proprietary toolchains, plus a large amount of tooling and a custom programming language, daSlang.

4A Engine: the folks don't publicly disclose the size of the codebase, but judging by the scale of the renderer, tools and platform matrix, and some insider info, the volume is estimated at around 500 KLoc, plus an internal Level Editor. The engine is very compact compared to UE/Unity, but it's oriented toward a single game, so its architectural density and complexity are extremely high.

Unreal    ████████████████████████████████ 2-20 MLoc
Dagor     ████████░░░░░░░░░░░░░░░░░░░░░░░░ ~2+ MLoc
Unity     ████████░░░░░░░░░░░░░░░░░░░░░░░░ ~2 MLoc
Godot     ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░ ~1 MLoc
4A Engine ██░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ ~0.5 MLoc

All this code was written by people just like you

The first thing to understand: all this code was written by people just like you. Unfortunately that isn't always a compliment, because my colleagues' approaches to design and mine are radically different, and at review I see "hack, hack and into production" far more often than "let's sit down and think." They're smart (also not always a compliment, since smart people often write code the rest of us suffer from later) and experienced, but listen, you're no fool either, otherwise you wouldn't have been hired. This codebase took years to build, maybe more, but everyone started out as a newcomer in it, even the one who wrote the very first lines. There's no magic here. Well, no, I lied: there's a ton of magic in there, and the older the commit, the more magic there usually is, but even so it's just programming, like on your little pet projects.

And first you have to figure out how to even build this thing. Each of the 10+ game projects I took part in had its own build system (eamake, premake, xmake, meson, jam, cmake, UBT, ninja, make, fastbuild, msbuild), and I feel the list is far from complete. Every big codebase had its own dependency build policy and its own set of tambourines and sacred rituals for setting up the build environment. If you're lucky, the process is automated, but that's rare (I've been lucky only twice), and I've never once seen everything build on the first try without a fuss.

Ideally this was documented somewhere at some point, but that documentation is most likely outdated, and fixing it up while setting things up is usually among a new programmer's duties. If there's no such document at all, which is also common, you may be granted the honor of writing it and making an indelible impression on the lead. The build process is usually very homegrown and tailored to the specific studio, so it's hard to give universal advice. Depending on the company, support might have set everything up before you arrived at your workstation, but I prefer them to install only the base and give me the permissions to install the rest myself. This often saves a ton of time, because you end up in a familiar working environment rather than a plastic room where everything is in unfamiliar places.

If it doesn't build, reread the documentation, if there is any. It's unlikely there's a broken build sitting in the repository; if there were, you'd see seniors wandering around like freshly risen zombies in search of fresh brains. There's no shame in asking about build problems in the first month, since nobody expects an outsider to know all this sacred knowledge. Second point: people appreciate it when you try to figure things out yourself first, and occasional questions, even silly ones, are generally a good idea. "I have literally no idea what OpenEXR is or where to get it" will be received far better than the generic "it's broken." A specific question gets a specific answer faster and takes less time from whoever's answering.

Finding the chairs config in a codebase of a couple million lines

Congratulations! It finally built on your machine, and maybe it took only 40 minutes, or maybe three days and claimed your keyboard as a sacrifice, but you managed it, and now you're faced with the task of actually finding that chairs config and adding saving to it. Sounds simple, right? All that's left is to find that damn config and the system serving it in a codebase of a couple million lines. At this point an almost irresistible temptation arises to walk up to the nearest senior and ask: "Hey, where do we store the chairs?" At which point senior-Alexander, without looking up from the debugger (where he's currently figuring out why the little house glitches only on PlayStation, only on Thursdays, and only if the player is wearing a red hat), will instantly answer: "Ah, that's in Game/Play/Home/HomeInventory/KesloInventoryIndoor.cpp, line 2847, and no, I didn't forget the letter r. Just don't forget to register the handler in HomeManager, otherwise the system will work everywhere except that chair. Oh, and check you didn't break the placement of chairs outside the home, because there's KesloInventoryOutdoor.cpp."

Five seconds and you'll have the answer to a question whose search could have taken umpteen minutes. Magic? Telepathy? A photographic memory developed over years of crunch and reading crash logs at 3 a.m.? Spoiler — no, seniors don't remember everything by heart. It's physically impossible when you have 2+ million lines of code, 2K+ engine modules, 3K+ UI elements, 150k+ assets and 470 different systems that interact with 230 others. Add to this legacy code written by people who left the studio in 2013, and comments in three different languages — English, Russian, and my favorite: "what was I smoking when I wrote this in C++."

Seniors don't know the code by heart. They're just good at searching. And this is a skill that can and should be learned. I might be revealing some secret, but usually the senior tomatoes are busy putting out fires in prod, reviewing pull requests that intend to break half the game, optimizing code that's been lagging on consoles for the last 5 years, and explaining to designers that a feature "in a couple of hours" will actually take a month, or trying to understand why the physics suddenly started behaving... creatively. Helping you find the file with the chair config sits at the very bottom rung of today's priorities, but they're people too and not strangers to humanity, so you will get the answer to your question.

After two or three months of searching on your own, you'll find what you need in five minutes instead of twenty, you'll learn the project's architecture better than any documentation, you'll stop fearing the huge codebase, and you yourself will become that person with "encyclopedic" knowledge. And finally there's the magic, which is just experience. After six months of working with the project you'll simply think: "A chair... that's, surprisingly, the home inventory!" and immediately open Game/Play/Inventory/InventoryHome.cpp. It works in 95% of cases and takes 10 seconds, but requires 6+ months of working with the project.

Ask if you've searched 20+ minutes without success

Knowing how to search yourself doesn't mean "never ask questions." Ask if you've searched 20+ minutes without success — that's probably the boundary where a newcomer should ask for help; maybe the code is in a non-obvious place or uses strange naming you couldn't have guessed. Ask if the code clearly looks wrong and you suspect a bug. Ask when you need to make an architectural decision or extend existing functionality, and you definitely need to ask if the code is covered with comments like // TODO: Ask John about this and John left 3 years ago. And when you've searched everything and come to a senior with the question "I looked for that chair for 30 minutes, tried searching the whole project, looked in the fridge, checked the home classes, but found nothing. Where could FooBazChair be?", the senior will not only answer the question but maybe also give advice on how you should have searched. Maybe the code was called FooBazKreslo, not FooBazChair, for historical reasons.
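
Before going to the senior, it's cheap to try the search again with the spelling variants you can think of. Below is a minimal sketch of that idea; the variant list reuses the Chair, Kreslo and Keslo spellings from this article, while the function name and the default file suffixes are assumptions made for the example.

```python
import re
from pathlib import Path

# Spelling variants to try: the obvious English term plus transliterated or legacy names.
NAME_VARIANTS = ["Chair", "Kreslo", "Keslo"]


def find_name_variants(root: Path, variants: list[str], suffixes=(".cpp", ".h")):
    """Yield (path, line_number, line) for every source line mentioning any variant."""
    pattern = re.compile("|".join(re.escape(v) for v in variants), re.IGNORECASE)
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                yield path, number, line.strip()
```

Iterating over `find_name_variants(Path("Game"), NAME_VARIANTS)` gives you something concrete to bring along: "I tried these three spellings and found only these files" is a much better question than "where are the chairs?"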

One useful trick when looking for parts of code that relate to the editor, models, or non-system features is to search by strings visible to the user. The editor is oriented less toward professional programmers and more toward its users — designers, artists, sound engineers — and therefore its documentation often turns out to be in far better shape than what's available to us programmers, especially in licensed engines; and knowing how to use the editor and the game itself helps you better understand the corresponding code. Going back to the FooBazChair example, it often helped me to launch the editor, find those ill-fated chairs there, look at their properties or visible components, then go back into the engine code and look at the files containing those names. Maybe it's in a string table and it's a key-value pair, which means we then search by the key, which ultimately leads us to the right system.
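
The two-step search described above (visible text, then string-table key, then code) can be sketched as follows. The flat JSON string table, the key names in the usage note and the helper names are all hypothetical; real projects keep localization data in many different formats.

```python
import json
from pathlib import Path


def load_string_table(path: Path) -> dict[str, str]:
    """Load a flat key -> displayed text table (assumed to be plain JSON)."""
    return json.loads(path.read_text(encoding="utf-8"))


def keys_for_label(string_table: dict[str, str], label: str) -> list[str]:
    """Return string-table keys whose displayed text contains the label."""
    needle = label.casefold()
    return [key for key, text in string_table.items() if needle in text.casefold()]


def files_using_key(root: Path, key: str, suffixes=(".cpp", ".h")) -> list[Path]:
    """Return source files that mention the given string-table key."""
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            if key in path.read_text(encoding="utf-8", errors="ignore"):
                hits.append(path)
    return hits
```

With a table that maps, say, a hypothetical `UI_HOME_KRESLO_NAME` to "Cozy Chair", `keys_for_label(table, "chair")` returns that key, and `files_using_key(root, key)` then points at the code that displays it, even though the word "chair" never appears in that code.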

It's important to remember that the names the player sees don't always match the names in the code. In the early stages of development, functions and systems often get temporary internal names, and the official names appear later. So FooBazKreslo will haunt your nightmares for a long time, and FooBazChair will very rarely be used in the code, for historical reasons. By such subtle semantic nuances it's very easy to detect which part of the code was written by Eastern European folks and which wasn't — just as it's easy to detect Indian, Australian and Asian semantic styles. So if searching by the obvious keywords leads nowhere, that's no reason to rush off and refactor half the engine; maybe you're just missing part of the internal context (the magic), and this is a good moment to clarify the details with colleagues.

Version history is the most underrated tool for studying code

Version history still remains the most underrated tool for studying code. Git itself contains detailed changes for each file, which lets you understand when a particular function appeared, who wrote it, which files were changed together with it, when Masha got married and why John was fired. This is especially useful if you're doing a similar task: you can see which parts of the project were touched earlier and not miss possible dependencies. Another typical mistake of newcomers in large codebases, besides avoiding searching the commit history, is reinventing the wheel. Believe me, every possible kind of wheel with colored pedals and crooked handlebars has already been invented before you. But on getting a task that requires a standard mathematical or utility solution, for some reason we try to write our own implementation, not suspecting that such a function already exists in the project.
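
One way to see "which files were changed together" with the one you're about to touch is to parse the output of `git log --name-only`. The sketch below assumes a Git repository and uses an `@` prefix in the format string as a hypothetical marker for commit lines; the function names are made up for the example.

```python
import subprocess
from collections import Counter


def parse_name_only_log(log_text: str) -> list[list[str]]:
    """Split `git log --name-only --pretty=format:@%h` output into per-commit file lists."""
    commits: list[list[str]] = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("@"):
            commits.append([])  # a commit header starts a new file list
        elif line and commits:
            commits[-1].append(line)
    return commits


def co_changed_files(log_text: str, target: str) -> Counter:
    """Count how often each other file was committed together with target."""
    counts: Counter = Counter()
    for files in parse_name_only_log(log_text):
        if target in files:
            counts.update(f for f in files if f != target)
    return counts


def git_log_name_only(repo: str, limit: int = 500) -> str:
    """Run git and return the raw log text for the last `limit` commits."""
    command = ["git", "-C", repo, "log", f"--max-count={limit}",
               "--name-only", "--pretty=format:@%h"]
    return subprocess.run(command, capture_output=True, text=True, check=True).stdout
```

Calling `co_changed_files(git_log_name_only("."), "Game/Play/Home/HomeInventory/KesloInventoryIndoor.cpp").most_common(10)` would list the files most often committed alongside the chair code, which is a decent first guess at the dependencies you shouldn't miss.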

At best this comes up at review, if the team lead and a knowledgeable senior show up, and you'll have to redo the work; at worst the duplicate makes it into the main code and the project becomes more complex. So in large engines you should always assume that someone has already tackled and solved a similar task, and look for ready-made code first. Even in the case of the FooBazKreslo chair, it's easier to study how other objects, dressers for example, save their position than to write the logic from scratch. Sometimes it's even useful to step through these sections in the debugger to understand the mechanism.

Studio code styles are altogether a separate part of a project's culture. Every (every, Carl) developer has their own preferences, but in an existing codebase you must adhere to the already-adopted style. And this is clearly not the place to show off personal creative ambitions — whatever the system is, you need to learn it and follow it. Freedom to experiment remains for home projects, while in work code a uniform structure matters.

Finally, you have to remember that the source code will always be the ultimate truth. Documents go stale, the wiki lies, comments don't match reality, and colleagues' answers are sometimes based on memories rather than facts. If you need to understand exactly "what," "where" and "when," all of that exists only in the code. The only question you can't always derive from the program's text is "why the hell," and for the exact answer and the direction to move in you really will have to go and bow to the long-timers.

A tired senior after the 73rd 'where does it live...' question

P.S. A tired senior after the 73rd "where does it live..." question this week. If this article helped you — share it with your colleagues, especially with those who've just joined the project.