CliffyPDA Memory Use

Adam C. Clifton

25 Jul 2021

The CliffyPDA project is aiming to create a fast and usable operating system, with a selection of connected apps for modern services, within the constraints of teeny tiny low cost hardware. Using a Keyboard Featherwing, with its BlackBerry keyboard and screen, and a Huzzah32 for WiFi, Bluetooth, battery support and a CPU, I have more than enough to get something running. I just need to write ALL the codes :P.

Here is the Huzzah32 mounted on the back of the Keyboard Featherwing.

The Huzzah32 carries an ESP32 SoC (System on a Chip), which is designed for IoT (Internet of Things) uses, so it has a lot of useful features like low power sleep and the aforementioned connectivity options. It also has what I'd consider to be quite a beefy CPU, clocking in at 240MHz with 2(!) cores. The only disappointment for me is the teeny tiny amount of RAM: 520KB advertised, but once launched with all the overhead of drivers and libraries, we are down to about 270KB that we can use.

I'll note that there are other ESP32 boards that add an extra 4MB of RAM, but it is slower to access so I won't be using it for now. I may add support for it in the future to quickly cache things locally.

I'll also note that most of these memory measurements are approximations from old notes and logs but should be good enough for an idea of where we've come from and how memory use has changed.

For graphics drawing, we create a full buffer of the screen to draw on, which is then copied to the screen to be displayed. With 16 bit color (2 bytes per pixel) that takes up 2 x 320 x 240 = 153,600 bytes, or 150KB. That leaves us with about 120KB for everything else, so things are looking pretty dire. That's not even enough to store a fullscreen background image!
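The arithmetic can be written out as a tiny C sketch. The constant and function names here are made up for illustration and are not taken from the CliffyPDA source; the 270KB figure is the approximate usable RAM quoted earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Screen geometry from the post: 320x240 pixels at 16 bit color. */
enum { SCREEN_WIDTH = 320, SCREEN_HEIGHT = 240, BYTES_PER_PIXEL = 2 };

/* Bytes needed for a full-screen back buffer. */
size_t back_buffer_bytes(void) {
    return (size_t)SCREEN_WIDTH * SCREEN_HEIGHT * BYTES_PER_PIXEL;
}

/* RAM left over, in bytes, once the back buffer has been allocated. */
size_t ram_after_back_buffer(size_t usable_bytes) {
    assert(usable_bytes >= back_buffer_bytes());
    return usable_bytes - back_buffer_bytes();
}
```

Calling `ram_after_back_buffer(270 * 1024)` gives 122,880 bytes, which is the 120KB mentioned above.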

The first obvious thing we can do to fit more into RAM is to squish down our images as much as possible. One method is to use a color palette, so instead of storing a 16 bit RGB value per pixel, we store all the unique colors used in a table, then just store an 8 bit index into that table for each pixel. Assuming individual images don't use too many colors (no more than 256, the most an 8 bit index can address), this allllmost doubles the amount of image data we can store.
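Here is a minimal sketch of what a paletted image could look like in C. The `PalettedImage` struct and the function names are hypothetical, not CliffyPDA's actual image format, and the comments assume the 16 bit colors are RGB565, which the post does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical paletted image layout, for illustration only. */
typedef struct {
    uint16_t width;
    uint16_t height;
    uint16_t palette_count;   /* 1..256 entries */
    const uint16_t *palette;  /* 16 bit colors (RGB565 assumed) */
    const uint8_t *indices;   /* width * height palette indices */
} PalettedImage;

/* Look up the 16 bit color of one pixel via the palette. */
uint16_t paletted_pixel(const PalettedImage *img, uint16_t x, uint16_t y) {
    assert(x < img->width && y < img->height);
    uint8_t index = img->indices[(size_t)y * img->width + x];
    assert(index < img->palette_count);
    return img->palette[index];
}

/* Bytes for the paletted form: one byte per pixel plus the color table. */
size_t paletted_bytes(uint16_t width, uint16_t height, uint16_t palette_count) {
    return (size_t)width * height + (size_t)palette_count * sizeof(uint16_t);
}

/* Bytes for the same image stored as raw 16 bit pixels. */
size_t raw_bytes(uint16_t width, uint16_t height) {
    return (size_t)width * height * sizeof(uint16_t);
}
```

For a fullscreen 320x240 image with a full 256 color table, `paletted_bytes` gives 77,312 bytes against 153,600 bytes raw. The table is the reason the saving is only almost half.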

But we'll still need much more aggressive measures to claw back as much RAM as we can.

RAM free 120KB

Stop Using Arduino IDE

While the Arduino IDE is great for getting up and running quickly on external hardware, it's not suited to larger projects with many source files. Additionally, the ESP32 uses a bit of wrapper code and forced defaults in its Arduino setup, which causes extra overhead and memory use, whilst also restricting access to some features. So the first order of business was to start building directly with the ESP-IDF, a development framework direct from the ESP32 manufacturer, Espressif.

ESP-IDF has its own system for project files, but that was not much of a problem: I already generate the project files for other platforms (Visual Studio on Windows, Make on Linux, Xcode on Mac) from a script I created, so it was easy enough to add ESP-IDF as another output.

One problem with ditching Arduino is that I'm also ditching all the support libraries built around it, particularly things like drivers for the screen, keyboard and SD card. After many nights of confused poking, I put together a programming library to simplify access to the hardware, using the low level ESP-IDF libraries. That's available on GitHub if anyone else is foolish/brave enough to take the path I've chosen.

The only remaining problem at that point was the jlm-back-buffer graphics library I was using. It was built on top of the Adafruit GFX library, so it was very intertwined with that Arduino-specific code. I managed to hack together a temporary wrapper to simulate what I needed of the Adafruit library to get it functioning.

At the end of this, I've gained about 30KB of RAM, which sounds small but is a 25% improvement.

RAM free 150KB

Drop Full Screen Backbuffer

With most of our RAM being taken up by the back buffer, being able to work without it would save us a lot of space.

We could do something like a dirty rect, where we track what parts of the screen change, and only draw and upload those to the screen. This would greatly reduce the size of our screen buffer, but would add complexity to app code to manage what parts of the screen are now dirty. I prefer the app code being able to blindly draw everything on every frame, because it's simpler for app developers, and will be more compatible with hardware acceleration if I'm lucky enough to have that in the future.

The method I eventually came up with involves queueing up a series of graphics commands that will then be run to render the pixels of the screen. This is similar to how 3D acceleration works with things like OpenGL: you give commands to the graphics card to perform in its own time and send to the screen.

These command queues are then applied to a fraction of the screen at a time, in our case a 16x16 pixel square (a cell), before being uploaded to the screen, resulting in a comparatively tiny back buffer of 512 bytes (16 x 16 pixels at 2 bytes each). There is additional storage required for the command queues and caching, about 50KB in total, but it's still a gigantic memory saving.
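To make the idea concrete, here is a minimal sketch in C of rendering the screen one 16x16 square at a time from queued commands. It is not the CliffyPDA implementation: `FillRectCmd`, `render_cell`, `render_frame` and the `UploadCellFn` callback are all hypothetical names, the only command is a filled rectangle, and a single queue is replayed for every square to keep things short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CELL_SIZE = 16, CELLS_ACROSS = 320 / CELL_SIZE, CELLS_DOWN = 240 / CELL_SIZE };

/* One queued command. Only filled rectangles here; a real queue would
   also need text, images and so on. */
typedef struct {
    int16_t x, y, w, h;   /* screen coordinates */
    uint16_t color;       /* 16 bit color */
} FillRectCmd;

/* Hypothetical callback that sends one finished square to the display. */
typedef void (*UploadCellFn)(int cell_x, int cell_y, const uint16_t *pixels);

/* Replay every command into one square, clipping each to its bounds. */
void render_cell(const FillRectCmd *cmds, size_t count,
                 int cell_x, int cell_y, uint16_t *pixels) {
    assert(cell_x >= 0 && cell_x < CELLS_ACROSS);
    assert(cell_y >= 0 && cell_y < CELLS_DOWN);
    const int left = cell_x * CELL_SIZE, top = cell_y * CELL_SIZE;
    memset(pixels, 0, CELL_SIZE * CELL_SIZE * sizeof(uint16_t));
    for (size_t i = 0; i < count; i++) {
        int x0 = cmds[i].x > left ? cmds[i].x : left;
        int y0 = cmds[i].y > top ? cmds[i].y : top;
        int x1 = cmds[i].x + cmds[i].w;
        int y1 = cmds[i].y + cmds[i].h;
        if (x1 > left + CELL_SIZE) x1 = left + CELL_SIZE;
        if (y1 > top + CELL_SIZE) y1 = top + CELL_SIZE;
        /* Loops run zero times when the rectangle misses this square. */
        for (int y = y0; y < y1; y++)
            for (int x = x0; x < x1; x++)
                pixels[(y - top) * CELL_SIZE + (x - left)] = cmds[i].color;
    }
}

/* Render and upload the whole screen using one 512 byte buffer. */
void render_frame(const FillRectCmd *cmds, size_t count, UploadCellFn upload) {
    uint16_t pixels[CELL_SIZE * CELL_SIZE];
    for (int cy = 0; cy < CELLS_DOWN; cy++)
        for (int cx = 0; cx < CELLS_ACROSS; cx++) {
            render_cell(cmds, count, cx, cy, pixels);
            upload(cx, cy, pixels);
        }
}
```

App code only ever appends commands, so it can still blindly draw everything on every frame. The cost is that each command is replayed once per square it might touch, which is where the extra storage and caching come in.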

At this stage I was also able to drop jlm-back-buffer as there was no longer a back buffer to manage!

A fun side effect of this command queue based rendering is that we can know earlier if a cell is unchanged since the last frame. Instead of checking the pixels of the final output after all rendering is done, we can just compare the command queues, and if they are the same we can skip the rendering entirely. This is very dependent on what's happening on screen, but on the mostly static desktop view, the frame time has dropped from 11ms to 4ms. Doing less work like this can only help battery life.
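A rough sketch of that comparison in C follows. It assumes each cell keeps its own small fixed-size queue from the previous frame, which is my guess at one way to do it rather than a description of the real code; `DrawCmd`, `CellQueue` and the functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAX_CELL_COMMANDS = 8 };  /* arbitrary limit for this sketch */

typedef struct {
    int16_t x, y, w, h;
    uint16_t color;
} DrawCmd;

/* The commands that touch one cell during one frame. */
typedef struct {
    DrawCmd cmds[MAX_CELL_COMMANDS];
    size_t count;
} CellQueue;

void cell_queue_clear(CellQueue *q) {
    memset(q, 0, sizeof *q);
}

/* Append a command; returns false if the queue is already full. */
bool cell_queue_push(CellQueue *q, DrawCmd cmd) {
    if (q->count >= MAX_CELL_COMMANDS) return false;
    q->cmds[q->count++] = cmd;
    return true;
}

/* True if this frame's queue differs from last frame's, meaning the
   cell has to be rendered and uploaded again. */
bool cell_needs_redraw(const CellQueue *previous, const CellQueue *current) {
    assert(previous->count <= MAX_CELL_COMMANDS);
    assert(current->count <= MAX_CELL_COMMANDS);
    if (previous->count != current->count) return true;
    for (size_t i = 0; i < current->count; i++) {
        const DrawCmd *a = &previous->cmds[i], *b = &current->cmds[i];
        if (a->x != b->x || a->y != b->y || a->w != b->w ||
            a->h != b->h || a->color != b->color)
            return true;
    }
    return false;
}
```

The comparison touches a handful of small structs per cell instead of 256 pixels, and when it returns false both the rendering and the upload for that cell can be skipped.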

RAM free 250KB

With this much free RAM, as long as we are careful with how we use it, we should be able to do lots of cool things!
