
Question Regarding Tile Engines

Posted: Fri Jun 26, 2015 3:10 pm
by Skeith
Hello All,

This is my first post here, after stumbling across ES a day or two ago. I've been binge watching AiGD and am enjoying it a lot, in addition to the Twitch stream.

I'll get to the point: seeing the development of ES has made me curious about tile engines, so I've done a little research, but there's one question I don't quite have the answer to.

I understand the conventional approach of a 2D array of tile IDs. My question is this: using C++ and modern (4.X) OpenGL, how does one actually render tiles? Is it literally a case of making x * y quads (4 verts each), applying the relevant texture, and rendering them after a basic cull against an orthographic camera? Or have I missed something?

Would an approximate Tile class/struct look something like this?
struct Tile {
    VertexData
    TextureData
    TileID
};

Thanks in advance for any light you're able to shed on things. :)

Re: Question Regarding Tile Engines

Posted: Sat Jun 27, 2015 9:54 am
by dandymcgee
Skeith wrote: I understand the conventional approach of a 2D array of tile IDs. My question is this: using C++ and modern (4.X) OpenGL, how does one actually render tiles? Is it literally a case of making x * y quads (4 verts each), applying the relevant texture, and rendering them after a basic cull against an orthographic camera?

Yup, that's pretty much it. As with most things in programming, there's really no reason to over-complicate things unless there's a legitimate problem to be solved (e.g. performance, advanced features, portability, etc.).
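To make the "x * y quads plus a basic cull" idea concrete, here is a minimal CPU-side sketch. It assumes a row-major ID array, square tiles, and a single texture atlas laid out as a grid; every name in it (`TileMap`, `Camera`, `visibleTiles`, `buildQuads`) is made up for illustration and is not taken from ES or any library.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only.
struct Vertex { float x, y, u, v; };               // world position + atlas UV

struct TileMap {
    int width, height;                             // map size in tiles
    float tileSize;                                // world units per tile
    std::vector<std::uint16_t> ids;                // row-major, width * height entries
};

struct Camera { float left, top, viewW, viewH; };  // orthographic view rectangle

struct TileRange { int x0, y0, x1, y1; };          // half-open: [x0, x1) x [y0, y1)

// Basic cull: convert the camera rectangle to tile indices and clamp to the map.
TileRange visibleTiles(const TileMap& m, const Camera& c) {
    int x0 = static_cast<int>(std::floor(c.left / m.tileSize));
    int y0 = static_cast<int>(std::floor(c.top / m.tileSize));
    int x1 = static_cast<int>(std::ceil((c.left + c.viewW) / m.tileSize));
    int y1 = static_cast<int>(std::ceil((c.top + c.viewH) / m.tileSize));
    return { std::clamp(x0, 0, m.width), std::clamp(y0, 0, m.height),
             std::clamp(x1, 0, m.width), std::clamp(y1, 0, m.height) };
}

// Emit 4 vertices per visible tile, in the order
// top-left, top-right, bottom-right, bottom-left.
std::vector<Vertex> buildQuads(const TileMap& m, const TileRange& r,
                               int atlasCols, int atlasRows) {
    assert(m.ids.size() == static_cast<std::size_t>(m.width) * m.height);
    std::vector<Vertex> out;
    const float du = 1.0f / atlasCols, dv = 1.0f / atlasRows;
    for (int y = r.y0; y < r.y1; ++y) {
        for (int x = r.x0; x < r.x1; ++x) {
            const int id = m.ids[y * m.width + x];
            const float u = (id % atlasCols) * du;   // atlas cell origin
            const float v = (id / atlasCols) * dv;
            const float px = x * m.tileSize, py = y * m.tileSize, s = m.tileSize;
            out.push_back({px,     py,     u,      v});
            out.push_back({px + s, py,     u + du, v});
            out.push_back({px + s, py + s, u + du, v + dv});
            out.push_back({px,     py + s, u,      v + dv});
        }
    }
    return out;
}
```

Note that `GL_QUADS` is not available in the core profile, so 4-vertex quads like these would be drawn as two triangles each, typically through an index buffer.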

Re: Question Regarding Tile Engines

Posted: Wed Jul 01, 2015 8:27 pm
by Falco Girgis
That's the basics.

If I were you, and I were using the OpenGL core profile, I would store that giant-ass array of geometry as a VBO GPU-side, and drastically reduce the overhead of transferring vertices every frame.
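A rough sketch of what "store it GPU-side once" could look like. The index builder is plain C++ and assumes each quad's 4 vertices are stored consecutively; the upload itself is shown only as a comment because it needs a live GL context and a function loader. `buildQuadIndices`, `verts`, and `idx` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Two triangles per quad, assuming each quad's 4 vertices are consecutive and
// ordered top-left, top-right, bottom-right, bottom-left.
std::vector<std::uint32_t> buildQuadIndices(std::uint32_t quadCount) {
    assert(quadCount <= 0xFFFFFFFFu / 4u);  // vertex indices must fit in 32 bits
    std::vector<std::uint32_t> idx;
    idx.reserve(static_cast<std::size_t>(quadCount) * 6);
    for (std::uint32_t q = 0; q < quadCount; ++q) {
        const std::uint32_t b = q * 4;      // first vertex of this quad
        idx.insert(idx.end(), {b, b + 1, b + 2, b + 2, b + 3, b});
    }
    return idx;
}

// One-time upload sketch (with a VAO bound). `verts` is the hypothetical
// whole-map vertex vector, `idx` the result of buildQuadIndices():
//
//   GLuint vbo, ebo;
//   glGenBuffers(1, &vbo);
//   glBindBuffer(GL_ARRAY_BUFFER, vbo);
//   glBufferData(GL_ARRAY_BUFFER, verts.size() * sizeof(verts[0]),
//                verts.data(), GL_STATIC_DRAW);
//   glGenBuffers(1, &ebo);
//   glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo);
//   glBufferData(GL_ELEMENT_ARRAY_BUFFER, idx.size() * sizeof(idx[0]),
//                idx.data(), GL_STATIC_DRAW);
//
// After that, a frame only needs the draw call:
//
//   glDrawElements(GL_TRIANGLES, (GLsizei)idx.size(), GL_UNSIGNED_INT, nullptr);
```

`GL_STATIC_DRAW` is only a usage hint; it fits here on the assumption that the tile geometry rarely changes after load.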

Re: Question Regarding Tile Engines

Posted: Thu Jul 02, 2015 8:17 am
by K-Bal
Falco Girgis wrote:That's the basics.

If I were you, and I were using the OpenGL core profile, I would store that giant-ass array of geometry as a VBO GPU-side, and drastically reduce the overhead of transferring vertices every frame.


...and send only one vertex per tile and inflate it with a geometry shader to a full quad for even less transfer overhead. As dandymcgee pointed out, don't do this until you need the performance.
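As an illustration of the one-vertex-per-tile idea, here is a hedged sketch: a per-tile point record built on the CPU, plus a geometry shader (embedded as a string) that expands each point into a 4-vertex triangle strip. It assumes the vertex shader passes the tile's world-space top-left corner through unprojected and forwards the atlas UV origin in a block named `VS_OUT`. All names and uniforms here (`TilePoint`, `buildTilePoints`, `uProjection`, `uTileSize`, `uUvSize`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// One record per tile: world-space top-left corner + atlas UV origin.
struct TilePoint { float x, y, u, v; };

std::vector<TilePoint> buildTilePoints(const std::vector<std::uint16_t>& ids,
                                       int mapW, int mapH, float tileSize,
                                       int atlasCols, int atlasRows) {
    assert(ids.size() == static_cast<std::size_t>(mapW) * mapH);
    std::vector<TilePoint> pts;
    pts.reserve(ids.size());
    for (int y = 0; y < mapH; ++y) {
        for (int x = 0; x < mapW; ++x) {
            const int id = ids[y * mapW + x];
            pts.push_back({x * tileSize, y * tileSize,
                           (id % atlasCols) / float(atlasCols),
                           (id / atlasCols) / float(atlasRows)});
        }
    }
    return pts;
}

// Geometry shader sketch: expands each incoming point into one quad.
constexpr std::string_view kTileGeometryShader = R"glsl(
#version 330 core
layout(points) in;
layout(triangle_strip, max_vertices = 4) out;

in VS_OUT { vec2 uvOrigin; } gs_in[];  // forwarded by the vertex shader
out vec2 uv;

uniform mat4 uProjection;  // orthographic projection
uniform float uTileSize;   // world units per tile
uniform vec2 uUvSize;      // size of one tile in atlas UV space

void emitCorner(vec2 corner) {
    vec4 origin = gl_in[0].gl_Position;  // unprojected top-left corner
    gl_Position = uProjection * (origin + vec4(corner * uTileSize, 0.0, 0.0));
    uv = gs_in[0].uvOrigin + corner * uUvSize;
    EmitVertex();
}

void main() {
    emitCorner(vec2(0.0, 0.0));
    emitCorner(vec2(1.0, 0.0));
    emitCorner(vec2(0.0, 1.0));
    emitCorner(vec2(1.0, 1.0));
    EndPrimitive();
}
)glsl";
```

The points would be drawn with `glDrawArrays(GL_POINTS, ...)`. Each tile then costs 16 bytes of vertex data instead of 64 for a full quad of the same layout; whether that translates into a real speedup depends on the hardware and is worth measuring.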

Also, depending on the situation, I would aggregate common data for faster access. So instead of an array of "struct Tile", I would use separate arrays of vertices, tile IDs, and so on.

Re: Question Regarding Tile Engines

Posted: Fri Jul 03, 2015 11:22 am
by dandymcgee
K-Bal wrote:Also depending on the situation I would aggregate common data for faster access. So instead of an array of "struct Tile" I would use arrays of vertices, tile ids and so on.

Ah yeah, excellent point. There can be a lot of duplicate data in the pipe when you're dealing with tile engines.

Re: Question Regarding Tile Engines

Posted: Wed Aug 19, 2015 1:54 am
by Falco Girgis
K-Bal wrote:...and send only one vertex per tile and inflate it with geometry shader to a full quad for even less transfer overhead.
Sure hope this dude is not targeting a mobile device then... Geometry shaders only became core in GLES 3.2 (before that you need an extension such as EXT_geometry_shader), and we can't even use them because most mobile devices are still cock-blocked at GLES2.

K-Bal wrote:As dandymcgee pointed out, don't do this until you need the performance.
I guess I can't really argue against this philosophy, but if your targeted device is mobile, the chances are that you are IMMEDIATELY going to need the performance if your tile geometry is not stored GPU-side. Not to mention client-side "anything" storage is deprecated in modern GL ES... the dude doesn't have a choice for mobile.

K-Bal wrote:Also depending on the situation I would aggregate common data for faster access. So instead of an array of "struct Tile" I would use arrays of vertices, tile ids and so on.
I wouldn't. You can almost never guarantee that something like this is going to be an optimization when it comes to GPU architectures. You're going to be striding the SHIT out of the cache for every tile when its data is all over the place in disjoint arrays like that. A GPU is optimized for making the most out of every access into global memory, and organizing data as an array-of-structs rather than structure-of-arrays means that after the initial access, more of the data is immediately available without having to revisit global memory and striding the shit out of the cache... I can almost guarantee you this will actually be a performance hit unless you have a SHITLOAD of cache lines or a small data set.

Also, aggregating data like that is going to cause an additional layer of indirection if you need to use data from one array to index into another array to make anything useful in your shader. You're going to have better performance if it's preprocessed and is all a single, contiguous, interleaved set. Sure, he's going to have duplicate UV coordinates for vertices with the same tile IDs, and in our case, the VBO is WAAAY bigger, but it's going to be way faster due to its cache coherency.
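For reference, the two layouts being argued about can be sketched like this. `TileVertex` is the interleaved (array-of-structs) record, `TileVertexArrays` is the split (structure-of-arrays) form, and `interleave` is the preprocessing step that turns one into the other. The names are hypothetical, and the sketch shows the data layout only; it does not measure which one is faster.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structs: everything one vertex needs sits in one 16-byte record.
struct TileVertex {
    float x, y;  // position
    float u, v;  // atlas UV, duplicated for every vertex that uses it
};
static_assert(sizeof(TileVertex) == 4 * sizeof(float), "no padding expected");

// Structure-of-arrays: each attribute lives in its own array.
struct TileVertexArrays {
    std::vector<float> positions;  // x0, y0, x1, y1, ...
    std::vector<float> uvs;        // u0, v0, u1, v1, ...
};

// Preprocess the split arrays into a single contiguous, interleaved set.
std::vector<TileVertex> interleave(const TileVertexArrays& soa) {
    assert(soa.positions.size() == soa.uvs.size());
    assert(soa.positions.size() % 2 == 0);
    std::vector<TileVertex> out;
    out.reserve(soa.positions.size() / 2);
    for (std::size_t i = 0; i + 1 < soa.positions.size(); i += 2) {
        out.push_back({soa.positions[i], soa.positions[i + 1],
                       soa.uvs[i], soa.uvs[i + 1]});
    }
    return out;
}

// Attribute setup sketch for the interleaved buffer (needs a GL context):
//
//   glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, sizeof(TileVertex),
//                         (void*)offsetof(TileVertex, x));
//   glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, sizeof(TileVertex),
//                         (void*)offsetof(TileVertex, u));
```

With the interleaved layout, both attributes share one stride (`sizeof(TileVertex)`) and differ only in their offsets, so a vertex's position and UV are fetched from adjacent memory.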

Re: Question Regarding Tile Engines

Posted: Wed Aug 19, 2015 2:06 am
by Falco Girgis
K-Bal wrote:...and send only one vertex per tile and inflate it with geometry shader to a full quad for even less transfer overhead. As dandymcgee pointed out, don't do this until you need the performance.
I honestly haven't fucked with the geometry shaders enough to know if this would speed things up or not. It's a really good idea... I want to look into it.

Re: Question Regarding Tile Engines

Posted: Wed Aug 19, 2015 3:35 am
by K-Bal
You're probably right about the memory access of a structure-of-arrays on a GPU. I was thinking too much about cache coherency on a CPU.

Falco Girgis wrote:I honestly haven't fucked with the geometry shaders enough to know if this would speed things up or not. It's a really good idea... I want to look into it.


When I did this for my bachelor thesis it did speed things up; I just can't say by how much. That was five years ago and I did no scientific comparison. Here's a video of about 5 million particles simultaneously on screen on my 2010 computer, which had a mid-range graphics card at the time:



I also wrote a physics shader that generates anti-penetration forces on collision. Particle count: about 2 million simultaneously on screen: