Last week, I reported that I had started working on adding lighting to the dungeon walls and doors in The Dungeon Under My House, my second Freshly Squeezed Entertainment project.
Since then, I’ve been working on adding lighting to the floors and ceilings.
Sprint 2024-2: Pre-production and initialization
Planned and complete:
- Render dungeon based on light levels
Changing the brightness of the walls and doors was relatively easy: I just changed the tint when rendering each column in my raycasting code based on how dark the current cell is.
But I couldn’t do the same for the floors and ceilings, as I said last time:
While the walls draw columns of textures and can be tinted, the ceilings and floors draw individual pixels of textures, each of which represents the actual pixel color from the source texture as a Uint32 value.
Now, when I usually deal with color, I am dealing with RGBA values, with each of R, G, B, and A being a value from 0 to 255, or even using hexadecimal values if I want to match what I might see in HTML color codes. So 0xff would be equivalent to 255.
Drawing walls, the only color I deal with is my tint value, which is the dungeon light level’s color multiplied by its brightness percentage.
But drawing the ceiling, I need to directly modify the color value of the pixel in question, and I am still figuring out exactly how. The Uint32 value should correspond to RGBA-like values, and in this case, I believe it is ABGR8888 in SDL2’s representation.
I imagined that I would need to change the hex values that represented the pixel, and I didn’t know if the math was going to be complicated or straightforward.
But my colleague Joel Davis, who has helped me before with his insight and experience on this project, suggested that the way to handle it (and I later found posts online that say the same) is to first convert to RGBA, then change the color as usual, then convert back to the Uint32 hex value.
And wouldn’t you know, it worked as expected!
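For anyone curious what that round trip can look like, here is a minimal sketch, not the game's actual code. It assumes the pixel is packed as ABGR8888 (alpha in the highest byte and red in the lowest), and the function name applyBrightness and the 0.0 to 1.0 brightness parameter are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: darken one pixel by a brightness factor.
// Assumes ABGR8888 packing: alpha in bits 24-31, blue in 16-23,
// green in 8-15, and red in 0-7.
std::uint32_t applyBrightness(std::uint32_t pixel, float brightness)
{
    // Clamp so the scaled channels stay within 0..255.
    if (brightness < 0.0f) brightness = 0.0f;
    if (brightness > 1.0f) brightness = 1.0f;

    // Step 1: unpack the Uint32 into separate channels.
    const std::uint32_t a = (pixel >> 24) & 0xffu;
    const std::uint32_t b = (pixel >> 16) & 0xffu;
    const std::uint32_t g = (pixel >> 8) & 0xffu;
    const std::uint32_t r = pixel & 0xffu;

    // Step 2: change the color as usual. Alpha is left alone.
    const std::uint32_t r2 = static_cast<std::uint32_t>(r * brightness);
    const std::uint32_t g2 = static_cast<std::uint32_t>(g * brightness);
    const std::uint32_t b2 = static_cast<std::uint32_t>(b * brightness);

    // Step 3: pack the channels back into a single Uint32.
    return (a << 24) | (b2 << 16) | (g2 << 8) | r2;
}
```

A colored light could be handled the same way by scaling each channel with its own factor instead of one shared brightness value.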
I had some strange artifacts in terms of white lines at the edges of the walls, though. When I stopped drawing walls to see what was going on, I could see that my default brightness was applying anywhere that there wasn’t an actual cell to draw:
This seemed straightforward to solve: just change the default color to something darker.
But there were more pressing concerns. Namely, now that I was rendering the floor and ceiling with lighting data from the current cell, suddenly the game was rendering incredibly slowly, which was noticeable during transitions, such as turning or moving forward and backward:
It isn’t smooth at all.
I was struggling with how to optimize this code. According to my poor man’s profiling (interrupt the program in the debugger, and see what function/line it seems to be processing most often), the code was spending most of its time looking up the dungeon cell’s lighting data.
WARNING: it gets a bit technical again
Now, as a reminder, I’m doing raycasting to simulate a 1st-person 3D dungeon. My game’s viewport is 960×576 pixels, and while drawing walls is for the most part 960 columns of processing, drawing the ceilings and floors is more of a pixel by pixel situation.
Most raycasting tutorials have a viewport that mimics what computers were capable of back when raycasting was the only real technique for faking 3D aside from drawing sprites that just happened to look 3D. Those viewports were something on the order of 320×200 pixels. When you don’t have hardware acceleration and only a slow processor, having only 320×200 = 64,000 pixels to draw is quite fast, especially if it is just a tutorial and leaves a lot of extra things as homework for the developer.
960×576 pixels is over half a million pixels to process. That sounds like a lot, but without processing lighting, and hardcoding the texture used, it was quite fast before.
But now that I am handling lighting, and I eventually want different textures based on the dungeon cell being drawn, I can’t really see a way around processing them pixel by pixel.
My game isn’t meant to be a real-time one, as it is a turn-based party-based RPG, but I did have visions of using this same raycasting code in future games that might be real-time, and I was starting to worry that it wasn’t going to be possible.
I was also worried that this game was just going to feel slow and sluggish when players tried it.
So what was slowing things down? Well, my dungeon was being represented as a std::map.
For each pixel, a projection was made to figure out what cell was being drawn, then I would search the map for the lighting for that cell, and I would process it to modify the color so that the pixel would be darker or lighter.
The code that I used to find the lighting data originally looked like:
gameData.dungeonLighting()[cell]
This code worked for the walls just fine: the cell always existed in the map.
But this code did not work for the ceiling and floor because the projected cell would sometimes represent a tile very, very far away near the horizon, and that cell was unlikely to be defined in my dungeon.
So here’s the fun part: accessing a map with the [] operator using a key that doesn’t exist in the map will ADD AN ENTRY TO THE MAP!
I knew this fact about std::maps, but I was careless with my coding. It was functionally correct in that rendering was, in fact, happening as expected. It was just a very suboptimal way to go about it.
According to my debugging, my map went from about 30 entries to about 10,000+ entries!
Which meant for each of those 960×576 = 552,960 pixels, I was searching through potentially 10,000+ entries in a map for a cell to get lighting data from. That’s a lot of accidental extra computation! Whoops!
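To make that growth concrete, here is a small sketch using stand-in types. Cell, LightMap, and sizeAfterBracketLookups are hypothetical names and not the game's code; the only real behavior being shown is that std::map's operator[] inserts a default-constructed value whenever the key is missing.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical stand-ins: a cell is an (x, y) pair and the light
// level is a plain int.
using Cell = std::pair<int, int>;
using LightMap = std::map<Cell, int>;

// Reads every cell in an extent-by-extent square using operator[]
// and reports how many entries the map holds afterwards.
// The map is taken by value so the caller's copy is left untouched.
std::size_t sizeAfterBracketLookups(LightMap lights, int extent)
{
    for (int y = 0; y < extent; ++y)
    {
        for (int x = 0; x < extent; ++x)
        {
            // Looks like a read, but a missing key gets inserted
            // with a value-initialized int (0).
            const int level = lights[Cell(x, y)];
            (void)level;
        }
    }
    return lights.size();
}
```

Starting from a map with only two entries, reading a 10-by-10 region this way leaves 100 entries behind, even though nothing was ever deliberately added.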
So my first fix was to prevent additions to the map:
DungeonLight light;
std::map<Point, DungeonLight>::const_iterator iter = lightLevels.find(cell);
if (iter != lightLevels.end())
{
    light = iter->second;
}
else
{
    continue;
}
std::map’s find() doesn’t add entries to the map, and it returns the map’s end() iterator if it can’t find the entry.
Instead of needing to search through potentially 10,000+ entries, it was only 30ish entries, but that’s still a lot of computation time spent looking for data, especially if most of the cells being searched for are not in the map, so it is almost always worst-case scenario.
Even with std::map’s indexing/find operations running at logarithmic time (so, better than iterating through each entry), it is still too slow.
My initial attempts to solve this problem were to draw fewer pixels. I stopped drawing about 100 pixels before getting to the horizon, and I figured that I could always draw a single hazy fog sprite at the horizon to cover up the fact that nothing is getting drawn there. While it helped, it still felt jittery and slow. There is still too much computation happening, and I am still only drawing the ceilings and floors. What happens when I add back in the walls and doors, and when I start adding more things like other characters populating the dungeon?
Luckily, Joel Davis pointed me in the right direction. Why was I using the inherently slow map to represent my dungeon and lighting data when a 1-dimensional collection works much faster?
You know why, Joel (and anyone reading this who is also wondering)? Because until this project, I have never had to worry about how fast my rendering code was.
That isn’t to say that I was purposefully careless. I am not one of those people who say, “Eh, well, computers are fast enough today to not worry about it.” I think there are plenty of good reasons to still be respectful of someone else’s computer hardware and battery life, and I try to be efficient in general.
But this is the first project that required me to worry about trying to do too much in one frame of rendering. In most of my past projects, the bottlenecks in my code were not related to rendering at all, but to pathfinding or other calculation-heavy systems.
Anyway, what Davis suggested is typical in games. Instead of a map or a 2-dimensional collection to represent a 2-dimensional tile-based map, use a 1-dimensional array/vector. Some very simple and quick math gets you the index you need into that collection, and lookups are no longer logarithmic but constant time.
Now, I will need more space to store the dungeon. Instead of a map of 30-ish entries, I need to accommodate entries in my vector for cells that might not exist. But I think this trade-off of speed vs memory is fine, as the extra memory used isn’t too great anyway.
So I hacked in a quick std::vector of DungeonLighting, piggybacking on my previous lighting code to populate it, and when I went to draw the ceiling and floor, I calculated the index of the cell with:
index = cell.X() + cell.Y() * DUNGEON_DIMENSIONS;
Now, I can quickly check if the index is within the bounds of my vector. If it is, I then get the lighting data and process the pixel. Otherwise, I stop processing the pixel and move on to the next.
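Here is a minimal sketch of that kind of flat lookup. It is an illustration under assumptions rather than the game's code: the LightGrid class, its member names, and the single brightness field in this DungeonLight are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical lighting record; the real one likely holds more.
struct DungeonLight
{
    float brightness = 0.0f;
};

// Hypothetical wrapper around a 1-dimensional vector that stores a
// 2-dimensional grid row by row. Assumes width and height are positive.
class LightGrid
{
public:
    LightGrid(int width, int height)
        : m_width(width),
          m_height(height),
          m_lights(static_cast<std::size_t>(width) * static_cast<std::size_t>(height))
    {
    }

    // True only if both coordinates are inside the grid.
    bool contains(int x, int y) const
    {
        return x >= 0 && x < m_width && y >= 0 && y < m_height;
    }

    // Stores a light; cells outside the grid are ignored.
    void set(int x, int y, const DungeonLight& light)
    {
        if (contains(x, y))
        {
            m_lights[indexOf(x, y)] = light;
        }
    }

    // Constant-time lookup. Returns nullptr for cells outside the grid
    // so the caller can skip the pixel and move on.
    const DungeonLight* find(int x, int y) const
    {
        return contains(x, y) ? &m_lights[indexOf(x, y)] : nullptr;
    }

private:
    // Same math as above: index = x + y * width.
    std::size_t indexOf(int x, int y) const
    {
        return static_cast<std::size_t>(x)
             + static_cast<std::size_t>(y) * static_cast<std::size_t>(m_width);
    }

    int m_width;
    int m_height;
    std::vector<DungeonLight> m_lights;
};
```

One design choice in this sketch is to check x and y separately instead of only checking the final index. In an 8-wide grid, the cell (8, 0) is off the right edge, yet x + y * 8 gives index 8, which is in range and actually belongs to cell (0, 1). Checking each coordinate avoids quietly reading a neighboring row's data.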
Mathematically, this means that my viewport needs only 960×576 = 552,960 potential lookups, each taking constant time, to process the ceiling and floor combined, which is a substantial improvement!
Visibly, IT IS SO MUCH FASTER, even after I put the wall and door drawing code back in:
And I wasn’t done yet! There was still room for improvement in my dungeon rendering code in general. Drawing my walls and my doors still used the old map to get lighting data, so switching to the vector version there resulted in even more improvement, and in fact, I now believe that my future real-time projects are a real possibility!
There is still more optimization to do. Lighting data wasn’t the only thing I was looking up in a map, as the current texture to use for walls and eventually ceilings and floors is also something I searched for. Switching to a vector representation of the dungeon data should result in a much, much smoother experience for the player.
End of technical details
Anyway, I’ve been very happy to get the game to render the dungeon so much more quickly and efficiently. My previous optimization efforts prevented the rendering from occurring in the first place when it wasn’t needed, such as when the player is standing still. There’s no need to re-render the dungeon if it won’t change, after all, and I loved how putting my phone down meant that the battery life didn’t drain due to extra, unnecessary processing.
But now I can also be assured that the game will feel snappy AND avoid draining batteries even more.
I look forward to finishing up the optimization and finally solving the problem of the door that resists being rendered according to the cell’s lighting data. So far, my troubleshooting shows me that it is grabbing lighting data from a cell outside of the dungeon’s boundaries, so clearly my math is wrong somewhere when I determine which cell to use for lighting. But I haven’t figured out how it is wrong and what would be correct yet.
Thanks for reading!
—
Want to learn when I release The Dungeon Under My House, or about future Freshly Squeezed games I am creating? Sign up for the GBGames Curiosities newsletter, and download the full color Player’s Guides to my existing and future games for free!