Updated: Aug 15, 2020
This is a pure Babble from my inner self. It might not even make sense at some points. Warning: contains typos. I'll fix them later. Maybe.
UPDATE 15-08-2020: Please read my Knowledge Update on RDNA's Primitive Shaders!
The little Guy
Nobody paid much attention to Little Navi. Well, at least not as much as they did to his bigger Brother, Middle-sized Navi (Navi 10), which first debuted in the RX 5700 XT. People are obsessed with the high-performance, large-die, tier-pushing parts that usually cost a fortune too. And that's okay, I love those too. As a tech enthusiast, it's awesome to see the boundaries of computing get pushed forward with each new architecture.
However, we must remember the Little Guys. I am talking about GPUs like Polaris (10/11) and Navi 14, and before those, the likes of Pitcairn, the Immortal GPU. It is the smaller, 'entry-level' processors that enable low-cost access to the latest technology, and it is they that let everyone - regardless of budget - get their hands on it.
Now, you may already be aware that I hold Polaris in very high esteem, much like I do Ryzen, especially with how they have enabled people on low incomes to break down the barrier to computing and PC gaming. Indeed, I am a big proponent of high-value technology products because I feel they are the most progressive for the vast majority of the population.
That brings me to Navi 14.
But Sash, RX 5500 XT was kinda 'meh' on launch and the value wasn't even that great. You're an idiot, why are you typing this crap?
I am fully aware of that. But the purpose of this post is to talk about a GPU that often doesn't get a lot of attention, and well, I have one as my main GPU and I just wanted to talk about it. So you can just deal with that. CHUMP! That was a joke. :3
After my Polaris 30-based RX 590 burped and I decided to retire him permanently, I found myself using his somewhat ill-fated (value position) replacement, the RX 5500 XT, based on the tiny - and somewhat adorable - Navi 14 graphics processor.
Navi 14 is a bit different from the GPU that it 'replaces' in terms of performance level, and it is actually more of a successor to Polaris 10/20/30's little brother, Polaris 11/21 - which featured in the RX 460 and 560 cards. I actually touched on the subject of GPU succession when some dumb people threw their toys out of the pram complaining that RDNA (Navi) is crap because RX 5700 XT wasn't 'That much faster' than Vega 64.
I mean, RX 5700 XT's Navi 10 processor is designed to replace Vega 64, not succeed it. The job of succeeding the relatively fat (~500mm2) Vega 10 will be down to the so-called 'Big Navi' that everyone is hyped about for the next few months. Anyway, on the subject of Navi 14, this little guy is most likely intended to succeed the Polaris 11 processor - also known as 'Baffin' - which serves on the RX 460 graphics card.
So in effect, what we have seen with RDNA is a pretty huge jump in performance, so much so that AMD has been able to push an entire tier of GPU up a notch - much like Nvidia did with the GK104-based GTX 680 in 2012. Navi 14 (RX 5500 XT) is, for all intents and purposes, an RX 660 at the same performance level as an RX 580.
Technical details of Little Navi
We really have to dive into the details of this little chip and compare it to its predecessors to get an understanding of the sort of market this little processor was built for. The post I mentioned above on GPU succession has a nice little layout of specifications that you can read; it is relevant to this subject.
Anyway, Navi 14 is a very small processor with design choices that make the chip cheaper to manufacture and improve yields, maximising margins in a market that already has very low margins (entry-level). Despite these constraints, Navi 14 achieves a full performance tier gain over GCN, being able to provide RX 580-like performance with fewer stream processors (though, interestingly, more transistors - increasing clock speeds and upgrading internal caches eats into the transistor budget, along with new features such as the video engine and my belief that an RDNA WGP [2x CU] has significantly more transistors than two GCN CUs: additional Scalar unit, bigger caches, wow this bracket sentence is huge. It's 5.7b transistors for Polaris 10 and 6.4b for Navi 14; by the way, Navi 14 has 200m more transistors than even Hawaii [R9 290X]), half the memory interface width (GDDR6 memory helps), half the PCI Express lanes (4.0!) and only two primitive output units.
Bus Width to Graphics Memory
The memory interface on this GPU is only 128 bits wide. That is to say, it only needs four GDDR chips to fully populate the interface - as GDDR memory chips have up to 32-bit wide access granularity, or 16-bit when a card is configured in 'clamshell mode'. The very fact that the bus is so narrow is a big indicator that this GPU was built to be cost effective - less complex PCBs due to fewer traces and fewer memory chips. A smaller bus also uses less power, and the physical connections (PHYs) on the chip occupy less space: about half as much, as you might have guessed, as the 256-bit connection on Navi 10, but maybe a bit more than half of Polaris 10's, assuming we normalise for process density. That is because I believe GDDR6 PHYs are slightly larger on chip than GDDR5 ones.
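The chip-count arithmetic above is easy to sanity-check. Here's a minimal sketch, assuming only the per-chip widths quoted in this post (32-bit normally, 16-bit in clamshell mode); the function name `chips_on_bus` is just something I made up for illustration.

```python
def chips_on_bus(bus_width_bits, bits_per_chip=32):
    """Number of GDDR chips needed to fully populate a memory bus.

    bits_per_chip is 32 for a normal layout and 16 for 'clamshell mode',
    as described in the text above.
    """
    if bus_width_bits % bits_per_chip != 0:
        raise ValueError("bus width must be a multiple of the per-chip width")
    return bus_width_bits // bits_per_chip


navi14_chips = chips_on_bus(128)                 # 128-bit bus, 32-bit chips
navi14_clamshell = chips_on_bus(128, 16)         # same bus in clamshell mode
wide_bus_chips = chips_on_bus(256)               # a 256-bit bus (Navi 10 / Polaris 10)

print(navi14_chips, navi14_clamshell, wide_bus_chips)
```

Half the bus means half the chips (and half the traces to route) in a normal layout, which is where the PCB savings come from.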
This contrasts with the RX 580's 256-bit interface, and likely helps to offset the added cost of using GDDR6 instead of GDDR5. Obviously, I have to point out that the data rate of Navi 14's standard GDDR6 - 14 Gbps per pin - is significantly greater than the standard 8 Gbps of the GDDR5 on Polaris 10. Almost twice the signals per second means that, despite the bus being 50% of the width, the effective data bandwidth is almost the same - netting all those space savings in the process.
I said almost; Navi 14 with 14 Gbps GDDR6 over a 128-bit bus produces a theoretical peak raw memory bandwidth of 224 GB/s. Polaris 10, with its 256-bit interface and 8 Gbps GDDR5, produces 256 GB/s in raw bandwidth; 32 GB/s more than Little Navi.
However, this brings me to the little extra section tacked on under the Graphics Memory section.
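If you want to check those numbers yourself, peak raw bandwidth is just bus width times per-pin data rate, divided by 8 to go from bits to bytes. A quick sketch using the figures quoted above (the function name is my own, purely for illustration):

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits : total width of the memory interface in bits
    data_rate_gbps : per-pin data rate in Gbps
    """
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


navi14 = peak_bandwidth_gbs(128, 14)     # 128-bit GDDR6 at 14 Gbps
polaris10 = peak_bandwidth_gbs(256, 8)   # 256-bit GDDR5 at 8 Gbps

print(f"Navi 14:    {navi14:.0f} GB/s")
print(f"Polaris 10: {polaris10:.0f} GB/s")
print(f"Difference: {polaris10 - navi14:.0f} GB/s")
print(f"Navi 14 has {navi14 / polaris10:.1%} of Polaris 10's raw bandwidth")
```

So Little Navi gets 87.5% of the raw bandwidth from half the bus width. This is raw, theoretical peak only; it says nothing about how efficiently each chip uses it.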
Navi 14 has new tricks to improve bandwidth efficiency and I wish I had more Cache.
Since GCN3 (Tonga/Fiji), AMD has followed Nvidia and implemented a lossless compression technology on their GPUs that essentially tries to minimise the amount of raw colour data sent to memory by grouping similar bits of data together - essentially compressing them and reducing the overall number of bits sent to memory. This allows additional traffic to occupy that saved space - increasing effective bandwidth.
This technology is in its 2nd generation on Polaris; it will almost certainly have been upgraded to a 3rd generation on RDNA1 (Navi 10 & 14) GPUs. Since this compression technology is baked into the silicon logic, upgrades cannot be back-ported to older chips; only newer ones will feature the improvements and upgrades.
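To give a feel for why similar data compresses well, here is a toy sketch of lossless delta encoding on a tiny "tile" of 8-bit colour values. To be very clear: this is NOT AMD's actual algorithm (I don't know the details of what the hardware does); the tile size, the 4-bit header and all the function names are assumptions I made up purely to illustrate the idea.

```python
def compress_tile(pixels):
    """Toy delta encoding: store one base value plus small per-pixel deltas.

    Returns (base, bits_per_delta, deltas). Illustrative only, not the
    real hardware scheme.
    """
    base = min(pixels)
    deltas = [p - base for p in pixels]
    bits_per_delta = max(deltas).bit_length()  # bits needed for the largest delta
    return base, bits_per_delta, deltas


def decompress_tile(base, deltas):
    """Rebuild the original pixels exactly (the scheme is lossless)."""
    return [base + d for d in deltas]


def compressed_size_bits(pixels):
    """Size under the toy format: 8-bit base + 4-bit width header + deltas."""
    _, bits_per_delta, deltas = compress_tile(pixels)
    return 8 + 4 + bits_per_delta * len(deltas)


# Eight similar 8-bit values, e.g. a patch of nearly flat colour.
smooth_tile = [200, 201, 203, 200, 202, 201, 200, 203]

base, bits, deltas = compress_tile(smooth_tile)
raw_bits = 8 * len(smooth_tile)

print("raw size:       ", raw_bits, "bits")
print("compressed size:", compressed_size_bits(smooth_tile), "bits")
print("lossless:       ", decompress_tile(base, deltas) == smooth_tile)
```

In this toy, the flat-ish tile shrinks from 64 bits to 28, while a tile of wildly different values would not shrink at all (a real implementation would presumably just send that one uncompressed).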