Game Design, Programming and running a one-man games business…

Code bloat has become astronomical

There is a service I use that occasionally means I have to upload some files somewhere (who it is does not matter, as frankly they are all the same). This is basically a simple case of pointing at a folder on my hard drive and copying the contents onto a remote server, where they probably do some database-related stuff to assign that bunch of files a name and verify who downloads it.

It’s a big company, so they have big processes, and probably get hacked a lot, so some security is required, as well as some verification that the files are not tampered with between me uploading them and them receiving them. I get that.

…but basically we are talking about enumerating some files, reading them, uploading them, and then closing the connection with a log file saying if it worked, and if not, what went wrong. This is not rocket science, and in fact I’ve written code like this from absolute scratch myself, using the wininet API and PHP on a server talking to a MySQL database. My stuff was probably not quite as robust as enterprise-level code, but it did support hundreds of thousands of uploaded files (GSB challenge data), plus verification, download and logging of them. It was one coder working for maybe 2 or 3 weeks?
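
To give a sense of scale, here is the shape of that entire job sketched in standard-library Python – enumerate, hash, upload, log. This is only an illustrative sketch, not my original wininet code; the endpoint URL and the two X- headers are made up, and a real server script would define its own:

```python
# Sketch of a minimal upload tool: walk a folder, POST each file with a
# SHA-256 checksum header for tamper-detection, and log the outcome.
import hashlib
import logging
import urllib.request
from pathlib import Path

UPLOAD_URL = "https://example.com/upload.php"   # made-up endpoint

logging.basicConfig(filename="upload.log", level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

def upload_folder(folder: str) -> None:
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        req = urllib.request.Request(UPLOAD_URL, data=data, method="POST")
        req.add_header("Content-Type", "application/octet-stream")
        req.add_header("X-Filename", path.name)                # made-up header
        req.add_header("X-Checksum-SHA256",
                       hashlib.sha256(data).hexdigest())       # made-up header
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                logging.info("OK %s (%d bytes, HTTP %d)",
                             path, len(data), resp.status)
        except OSError as err:
            logging.error("FAILED %s: %s", path, err)

if __name__ == "__main__":
    upload_folder("files_to_send")
```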

The special upload tool I had to use today was a total of 230MB of client files, and involved 2,700 different files to manage this process.

You might think that’s an embarrassing typo, so I’ll be clear: TWO THOUSAND SEVEN HUNDRED FILES and 230MB of executables and supporting crap, to copy some files from a client to a server. This is beyond bloatware, this is beyond over-engineering; this is absolutely, totally, utterly, provably, obviously, demonstrably ridiculous and insane.

The thing is… I suspect this uploader is no different to any other such software these days, from any other large company. Oh, and BTW, it gives error messages, and right now it doesn’t work. Sigh.

I’ve seen coders do this. I know how this happens. It happens because not only are the coders not writing low-level, efficient code to achieve their goal, they have never even SEEN low-level, efficient, well-written code. How can we expect them to do anything better when they do not even understand that it is possible?

You can write a program that uploads files securely, rapidly, and safely to a server in less than a twentieth of that amount of code. It can be a SINGLE file, just a single little exe. It does not need hundreds and hundreds of DLLs. It’s not only possible, it’s easy, and it’s more reliable, more efficient, easier to debug, and… let me labor this point a bit… it will actually work.

Code bloat sounds like something that grumpy old programmers in their fifties (like me) make a big deal out of, because we are grumpy and old and also grumpy. I get that. But us being old and grumpy means complaining when code runs 50% slower than it should, or is 50% too big. This is way, way, way beyond that. We are at the point where I honestly do believe that 99.9% of the code in files on your PC is absolutely useless and is never even fucking executed. It’s just there, in a suite of 65 DLLs, all because some coder wanted to do something trivial, like save out a bitmap, and had *no idea how easy that is*, so they just imported an entire bucketful of bloatware crap to achieve it.
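
And saving out a bitmap really is that easy. Here is the entire uncompressed 24-bit BMP format, sketched in plain Python with no libraries at all (purely illustrative, but it writes a valid file):

```python
import struct

def save_bmp(path, width, height, pixels):
    """Write a 24-bit uncompressed BMP. `pixels` is a list of (r, g, b)
    tuples, row 0 at the top, exactly width * height entries."""
    row_pad = (-width * 3) % 4                  # rows are padded to 4 bytes
    image_size = (width * 3 + row_pad) * height
    offset = 14 + 40                            # file header + BITMAPINFOHEADER
    with open(path, "wb") as f:
        # 14-byte file header: magic, total size, reserved, pixel data offset
        f.write(struct.pack("<2sIHHI", b"BM", offset + image_size, 0, 0, offset))
        # 40-byte BITMAPINFOHEADER: size, dimensions, planes, bpp, etc.
        f.write(struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24,
                            0, image_size, 2835, 2835, 0, 0))
        for y in range(height - 1, -1, -1):     # BMP stores rows bottom-up
            for x in range(width):
                r, g, b = pixels[y * width + x]
                f.write(struct.pack("<3B", b, g, r))  # pixels stored as BGR
            f.write(b"\x00" * row_pad)

# A 64x64 solid red image:
save_bmp("red.bmp", 64, 64, [(255, 0, 0)] * (64 * 64))
```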

Like I say, I really should not be annoyed at young programmers doing this. It’s what they learned. They have no idea what high-performance or constraint-based development is. When you tell them the original game Elite had a sprawling galaxy, 3D space combat, a career progression system, trading, and thousands of planets to explore, all in 64k, I guess they HEAR you, but they don’t REALLY understand the gap between that and what we have now.

Why do I care?

I care for a ton of reasons, not least being the fact that if you need two thousand times as much code as usual to achieve a thing, it should at least work. But more importantly, I am aware of the fact that 99.9% of the processor time on this huge stonking PC is utterly wasted. It’s carrying out billions of operations per second just to sit still. My PC should be in super-ultra-low-power mode right now, with all the fans off, in utter silence, because all that’s happening is some spellchecking as I type in WordPress.

Ha. WordPress.

Computers are so fast these days that you should be able to consider them absolute magic. Everything you could possibly imagine should happen within a single 60th-of-a-second refresh of the screen. And yet, when I click the volume icon on my Microsoft Surface laptop (pretty new), there is a VISIBLE DELAY as the machine gradually builds up a new user interface element, eventually works out what icons to draw, and has them pop in and go live. It takes ACTUAL TIME. I suspect half a second, which in CPU time is like a billion fucking years.

If I’m right and (conservatively) we have 99% wastage on our PCs, then we are wasting 99% of our computers’ energy consumption too. This is beyond criminal. And to do what? I have no idea, but a quick look at Task Manager on my PC shows a metric fuckton of bloated crap doing god knows what. All I’m doing is typing this blog post. Windows has 102 background processes running. My Nvidia graphics card currently accounts for 6 of them, and some of those have sub-tasks. To do what? I’m not running a game right now; I’m using about the same feature set from a video card driver as I would have done TWENTY years ago, but 6 processes are required.

Microsoft Edge WebView has 6 processes too, as does Microsoft Edge itself. I don’t even use Microsoft Edge. I think I opened an SVG file in it yesterday, and here we are: another 12 useless pieces of code wasting memory, and probably polling the CPU as well.

This is utter, utter madness. It’s why nothing seems to work, why everything is slow, why you need a new phone every year, and a new TV to load those bloated streaming apps, which must also be running code this bad.

I honestly think it’s only going to get worse, because the big, dumb, useless tech companies like Facebook, Twitter, Reddit, etc. are the worst possible examples of this trend. Soon every one of the inexplicable thousands of ‘programmers’ employed at these places will just be using machine learning to copy-paste bloated, buggy, sprawling crap from GitHub into their code as they type. A simple attempt to add two numbers together will eventually involve 32 DLLs, 16 Windows services and a billion lines of code.

Twitter has two thousand developers. TweetDeck randomly just fails to load a user column. It’s done it for four years now. I bet none of the coders have any idea why it happens, and the code behind it is just a pile of bloated, copy-pasted bullshit.

Reddit, when suggesting a topic title from a link, cannot cope with an ampersand, a semicolon, or a pound symbol. It’s 2022. They probably have 2,000 developers too. None of them can make a text parser work, clearly. Why are all these people getting paid?
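
My guess – and it is only a guess – is that the failing ingredient is plain HTML entity handling, which any decent standard library does in one call. In Python, for instance:

```python
import html

# The kind of scraped page title that trips these parsers up:
title = "Fish &amp; Chips; now only &#163;5 &#8211; what a deal"
print(html.unescape(title))  # Fish & Chips; now only £5 – what a deal
```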

There was a golden age of programming, back when you had actual limitations on memory and CPU. Now we just live in an ultra-wasteful pit of inefficiency. It’s just sad.


17 thoughts on “Code bloat has become astronomical”

  1. I’m taking up PC game programming (been doing other programming for many years). I figured that Unity seems very popular and probably a good place to start. Installed Unity, created a new empty project. Suddenly my computer fans start spinning like crazy. Turns out it was Dropbox trying to sync my new project folder. It was 16,000 files, totalling about 1GB. That was not a typo either. For an EMPTY project!
    I’m now learning OpenGL instead…

      1. After many years of custom engines I’ve just started using UE5.

        If I so much as add a blank space to some files (which are at most a couple of hundred lines long), it can take OVER A MINUTE to recompile that one file. These aren’t core engine files; they’re my game files. They’re not doing anything complex either, just some data housekeeping.
        It’s maddening.
        I don’t know how people put up with it, it’s like trying to run with a sack of potatoes tied to each leg.
        As the build tool is open source (it needs its own program to figure out how to compile itself efficiently! And it still takes OVER A MINUTE!), I’m hoping that with some tinkering I won’t have to put up with this.

        I’m increasingly of the opinion that the more coders you have working on something, the slower and more bloated it is.

        1. > I’m increasingly of the opinion that the more coders you have working on something, the slower and more bloated it is.

          I think there will be a strong correlation there. Programmers produce code, so more programmers will generally mean more code. More code generally means more opportunities for bugs, and more code generally takes more time to execute than less code. Obviously these aren’t 100% strict, but they should generally be true.

          An interesting case study would be the Tiny C Compiler vs GCC. TCC is a 1MB executable, including its own assembler and linker. And it compiles C code *nine times* faster than GCC, and much faster than Clang as well. It can push around a million lines of code per second. TCC was principally made by one person. So clearly a single developer can beat out dozens in the speed category, even for relatively complex tasks. TBF, this single developer is Fabrice Bellard, who is a freakishly talented programmer.

          The question I feel is always “necessary complexity vs incidental complexity”. You could describe this with the question “Is this thing a million lines of code because that’s legitimately how complex the problem is, or is it a million lines of code because the company does not care?” Chrome is over 25 million lines (https://www.openhub.net/p/chrome), and web standards are definitely quite complex; I suspect any reasonably standards-compliant modern web browser would be measured in millions of lines, but 25? It was only 5 million about a decade ago. I don’t really think web technology has gotten five times more complex over the last decade. Will it be 125 million a decade from now? Apparently Google’s entire product suite is over 2 billion lines of code, and honestly that does sort of explain some things to me.

    1. You may enjoy the Godot Engine in that case. It comes as a ~70 MB executable and an empty project is under 10 KB, and you don’t have to wait 10 minutes to build an executable for your game even for a small game project unlike with Unity.

  2. Cliffski, you are so absolutely correct. I have these AI apps – that I designed and built myself – after torturing myself with TensorFlow and finally pitching it. This AI stuff is all custom – the UI is a bunch of graphics code that would (and does!) happily run on a VGA monitor, on an old Pentium-class machine that runs an early version of Fedora Linux I compiled from source. The machine is an old uniprocessor box I found in an electronics recycling bin and repaired. It has a nice high-res screen, and a vector-oriented time-series database that I update every night. The thing runs *fast*. (I bought a Windows “gaming machine”, which runs pretty quick also – because it is running my old stuff, some of which had to run a bunch of special math-oriented interpreter code. New machines do not seem to be much faster, since the multiple cores cannot really be used unless you recode sequential code into redesigned multi-threaded parallel code frames… and if your stuff runs fast enough, why break solid stuff that works well?)
    What is wild is how astonishingly well good Linux design and architecture can run on an old Pentium uniprocessor. The machine runs a modified Linux kernel 2.6.25-14, has 250MB of RAM and a 4.3-gigabyte disk. The processor is a Pentium III (Coppermine). I have a source-compiled version of MPlayer that I can stream music with (works fine in real time), and I can even watch medium-resolution video on this box. And, with a source-built version of WINE, I can run my original vector database tool (and the interpreter it uses), which the AI stuff then uses to crunch the nightly forecasts.
    Why do I use this old thing? Because it JUST WORKS! It is comical. It is working so well, it is silly. When I think of the bloated nightmare of softglop that I had to download so I could compile and run TensorFlow, it is just comical. And TensorFlow is good – and my experience was back with version 1.4, quite a while back. I remember how much hoohaw it was just to get Gradle to work (Google’s thing for doing builds…).
    I understand we need to move forward, and GPUs are great for massive AI apps using images or terabytes of natural language data.
    But we are basically a tiny one-man-band, mostly, and I need to make money. My AI stuff is tiny – almost back-of-the-envelope simple.
    The key thing is that it all works. I’ve added to some real-estate holdings, and bought a silver sports car and some sporting toys with some of the trading proceeds. Plus the AI has given me insight – contrary to big-bank analysts’ opinions – and induced us to retain positions which we might otherwise have dumped (and deeply regretted) during the Covid pandemic.
    I honestly believe that *much* higher quality code and design results from working in a constrained – and well-understood – environment.
    An awful lot of modern bloatcode applications have so many internal linkages and dependent cross-connections that they are almost certain to encounter failure conditions – or have security holes – either of which can kill the code when it is responding to real-world (with its wild randomness) situational inputs. Simpler designs are more robust.
    The bloat is raising – rapidly – the risk of system failure everywhere.
    And as you have pointed out, the bloat is reaching insane levels.
    My simple Pentium III build started out as an experiment. I never expected it to evolve into something that worked as well as it does. But there are specific, interesting scientific reasons why it works as well as it does. Look at really good technology, good design and good science: there is often simplicity and clarity evident.
    Bloat is bad. It is worse than noise. Bloat is really a fraud, which costs the user, impairs his ability to conform and adjust the design, and thus benefits the bloat-maker. Look at Apple’s restrictive ecosystem.
    The bloat helps expand the moat. It’s not just lazy programmers, it is by design. So it is a double-bad phenomenon.
    Red Hat used to be a good, well-designed Linux distro. But each year the bloat got bigger, uglier and more complex. Once the bloat reached scale, where it became insane and grotesque, Red Hat was sold to IBM for 34 billion US dollars.
    The bloat is the moat.

  3. I recently set out to look at setting up a wiki for a niche social group interested in a particular area of retro games, and I wanted to host the wiki on a cloud server rather than use some free tier of a closed service and potentially be gated in. And like, I *should* be able to run a wiki on a trivially cheap server; I mean, it’s basically a simple server-side HTML template renderer.

    But looking into it, none of the popular solutions like MediaWiki were that clear on how well they could perform on a 512MB-RAM VPS (simply the least amount of RAM I could get). And it became very clear that *none* of them seemed to be actively tested for low memory, despite how extremely simple the software *should* be. A popular simple wiki, DokuWiki, ships over 1MB of JS with its pages, which isn’t some unbearable amount for a developed nation’s internet speeds, but it gives me no confidence that the developers were conscious of the resources their software uses. I don’t know what on earth the page could even be *doing* with that amount of JS.

    So I wrote my own, in Python (which isn’t even a fast language), in a few weeks. It does what I need, in ~2,500 lines of Python and HTML templates, with some reasonably conservative picks of dependencies. The entire VPS, including the operating system (Alpine Linux), the Nginx reverse proxy and the wiki server, uses about 100MB of RAM under normal traffic loads. No JS; simple pages ship under 15KB of data for the HTML and CSS. Because c’mon, it’s a simple page renderer – this is the kind of performance that it *should* have.
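
    To show how small the core of a wiki really is, here is a toy sketch in standard-library Python (an illustration only, nothing like my actual code: one handler that loads a text file and pours it through one template):

    ```python
    # Toy single-file wiki server: pages are plain text files in ./pages,
    # rendered through a single HTML template. Stdlib only.
    import html
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from pathlib import Path
    from string import Template

    PAGES = Path("pages")
    TEMPLATE = Template(
        "<!doctype html><title>$title</title><h1>$title</h1><pre>$body</pre>")

    class WikiHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            name = self.path.strip("/") or "Home"
            if not name.isalnum():              # crude path-traversal guard
                self.send_error(404)
                return
            page = PAGES / (name + ".txt")
            text = page.read_text() if page.is_file() else "This page is empty."
            body = TEMPLATE.substitute(title=html.escape(name),
                                       body=html.escape(text)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), WikiHandler).serve_forever()
    ```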

    So yeah, I don’t even think it’s always the programming languages that people choose; it’s this negligent style of programming where people just throw in any dependency without thinking about it, and never question whether their software is consuming reasonable resources. It’s always “what’s easiest for me to write”, externalizing the cost of those decisions onto the user.

      1. Well, it seemed sensible given that I had some time, wanted to brush up a bit on some web server programming, and just enjoy programming.

        I do plan to offer it as “Look, if you want a simple, very cheap wiki, and you’re a programmer, here’s a sub-3k-line wiki server; you can just read all the code in a reasonable amount of time, and feel confident that you understand it in depth.” So maybe it’ll be a neat time-saver for someone out there, over setting up, say, MediaWiki or something. I really appreciate that style of project. For my own quite unimpressive blog I use a Python script called Makesite, which is like 250 lines of Python plus some HTML templates, and it generates your blog as static HTML content. And yeah, it’s nice because I’ve read the whole thing, and so there are no surprises. The out-of-the-box behaviour is fairly thin, but that’s so you can understand it easily and just make modifications to do the things you want. It’s definitely not for everyone, but it should be very reliable going forward. I expect Python, HTML and CSS to generally exist for at least another two decades, so hypothetically I don’t have to do any upkeep beyond this point. And man, I wish more software was like that!

  4. Elden Ring is rather popular and seems like a reasonable comparison, because the game world is rather large but the game seems rather small on disk compared to others.

    At 45GB it’s not 2,000x larger than Elite but roughly 750,000x (45GB over Elite’s 64k is about 737,000 times).

    Even Minecraft has consumed an entire 1½GB somehow, or 20,000 Elites (Dwarf Fortress, infinitely more complex but without the graphics, is 30MB).

  5. Actually, why should something as common as moving a list of files to a cloud server and verifying them take any code at all?

    Surely our operating systems should be capable of doing that securely without any additional software?

    Ditto for any webserver being able to set up a wiki?

    Have we built up too many layers in technology and those layers are the bloat?

    On the other hand, are the layers/bloat jobs and careers for other programmers? If we solved all our computing problems with low-level, modular, compact solutions, some built into our OSes, would we put programmers out of a job?

    What if an AI woke up and logged on to find all this bloatware taking up CPU processing power it could use? Would it hack itself into our systems and live under the cover of bloatware?

    1. PS: And think of the hardware manufacturers – they are only managing about 20% improvement in CPU performance every 2 years.

      In a world of no bloat, why would people need a 1.2x speed boost on blisteringly fast, slim apps and services?

      Whereas with bloat, a one- or two-year-old CPU will struggle with the latest bloatware, taking ages to do what should be complete in milliseconds. Maybe bloatware is keeping the IT industry afloat.

  6. Of course all modern software is horrible. Modern programmers are horrible because modern culture is horrible. Click on my name to see my website for more.

  7. The real reward of becoming a successful indie dev is not money, but rather not having to deal with JavaScript and Kubernetes ecosystems.
