AMD RX 9000 RDNA 5: Patent Unveils Chiplet Design for GPUs and Parallel Rendering Functionality

A new patent filed by AMD on November 23 reveals something significant and complements others we have already seen. Patents are not “law”, and many never lead to a product, but we believe the detail and content of this one matter: it most likely forms the basis for future RX 9000 graphics cards with the RDNA 5 architecture, built on chiplets and parallel rendering. So, how will these graphics cards work?

The way AMD plans to handle chiplets in GPUs is quite different from what it does in CPUs. There won’t be an I/O die (IOD) as such; the scheme, as currently presented, shows “complete” chiplets that are, above all, independent. This raises the question of how they will synchronize and how latency will be mitigated. We don’t know for sure, but we will offer some speculation on the subject.

AMD RX 9000 with RDNA 5 and its new chiplet patent: parallel rendering?

This could be the new concept to watch, and it may start being talked about from late 2025 or early 2026 onwards. Before diving in, here is AMD’s own general description of the patent:

The patent discloses systems, apparatuses, and methods for performing geometry work in parallel on multiple chips. A system includes a chiplet-based processor with multiple chiplets for performing parallel graphics work.

Instead of having a central distributor hand out work to the individual chiplets, each chiplet determines for itself the work it will perform. For example, during a draw call, each chiplet calculates which portions to fetch and process of the index buffer(s) corresponding to one or more graphics objects of the draw call.

Once the portions are calculated, each chiplet fetches the corresponding indices and processes them. The chiplets perform these tasks in parallel and independently of each other.

Once the index buffers have been processed, the chiplets perform one or more subsequent stages of the graphics rendering process in parallel.
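The self-assignment idea described above can be illustrated with a minimal sketch. The patent summary does not say how each chiplet picks its portion, so the even, contiguous split below is purely an assumption, as are the names `index_range_for_chiplet` and `process_draw_call` and the choice of a plain triangle list. The point is only that every chiplet can run the same calculation with its own ID and arrive at a non-overlapping slice without any central distributor.

```python
VERTS_PER_TRIANGLE = 3  # assumption: the draw call is a plain triangle list


def index_range_for_chiplet(chiplet_id: int, num_chiplets: int, index_count: int) -> range:
    """Return the slice of the index buffer that one chiplet assigns to itself.

    Hypothetical scheme: every chiplet evaluates this same pure function with
    its own id, so the slices never overlap and nobody has to hand out work.
    """
    if not 0 <= chiplet_id < num_chiplets:
        raise ValueError("chiplet_id out of range")
    triangles = index_count // VERTS_PER_TRIANGLE
    base, extra = divmod(triangles, num_chiplets)
    # The first `extra` chiplets take one additional triangle each.
    first = chiplet_id * base + min(chiplet_id, extra)
    count = base + (1 if chiplet_id < extra else 0)
    return range(first * VERTS_PER_TRIANGLE, (first + count) * VERTS_PER_TRIANGLE)


def process_draw_call(index_buffer, num_chiplets):
    """Simulate the chiplets one after another; real hardware would run them in parallel."""
    per_chiplet = []
    for chiplet_id in range(num_chiplets):
        own = index_range_for_chiplet(chiplet_id, num_chiplets, len(index_buffer))
        # Each chiplet fetches only the indices inside its own slice.
        per_chiplet.append([index_buffer[i] for i in own])
    return per_chiplet
```

Calling `process_draw_call(list(range(30)), 4)` splits ten triangles across four chiplets, with the first two chiplets taking three triangles each and the last two taking two. Keeping the split aligned to whole triangles is a design choice of this sketch so that no primitive straddles two chiplets.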

Many questions in the air

The patent is complex and, understandably, much information is missing, but it still reveals plenty of detail. For instance, it mentions configurations of up to 12 chiplets, among others. That number is surprising because it implies either simpler chiplets than initially expected or a much higher total shader count.

There are further open questions. AMD describes a complex assignment of tasks to the CPs (command processors), which the patent says will be handled jointly by firmware, the driver, and hardware. With the elimination of a typical IOD, this opens up the possibility of a new split in AMD’s drivers, in which they start from scratch with RDNA 5.

To summarize without getting lost in everything the patent names and leaves open, AMD distinguishes five basic block units, arranged in this order from outermost to innermost:

Chiplet -> CP -> Geometry Engine -> Shader Engine -> Rasterizer
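As a reading aid, the nesting above can be written down as a small data structure. This is only a sketch of the containment order given in the article: the class names are hypothetical, and the patent summary does not state how many units sit at each level, so one child per level is assumed here.

```python
from dataclasses import dataclass, field


@dataclass
class Rasterizer:
    name: str = "Rasterizer"


@dataclass
class ShaderEngine:
    # Assumption: one rasterizer per shader engine; the real count is not given.
    rasterizer: Rasterizer = field(default_factory=Rasterizer)


@dataclass
class GeometryEngine:
    shader_engine: ShaderEngine = field(default_factory=ShaderEngine)


@dataclass
class CommandProcessor:
    geometry_engine: GeometryEngine = field(default_factory=GeometryEngine)


@dataclass
class Chiplet:
    chiplet_id: int
    # Every chiplet owns its own CP, which is what makes it a complete GPU.
    cp: CommandProcessor = field(default_factory=CommandProcessor)


def build_gpu(num_chiplets: int) -> list:
    """Build a GPU as a flat list of independent chiplets."""
    return [Chiplet(chiplet_id=i) for i in range(num_chiplets)]
```

For example, `build_gpu(12)` models the largest configuration the article mentions, and no two chiplets in the result share a command processor.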

Interestingly, as mentioned, each chiplet is entirely independent, in effect a traditional monolithic GPU like those available today. Best of all, the chiplets are transparent to each other: according to the diagrams, they can share information and, it seems, even certain resources.

Independent but connected and sharing?

This is the same philosophy as the Ryzen CPUs, but with lower latency. AMD may eventually make a similar shift on the CPU side, moving elements into the chiplets to cut latency. According to the patent, each chiplet in an RX 9000 with RDNA 5 will be able to access memory independently. That suggests heavier use of PCIe 5.0 or 6.0 (not necessarily more bandwidth as such, which remains to be seen), because even though the chiplets are independent units, memory access must logically be shared.

This implies that with RX 9000 RDNA 5, the speed and latency of system RAM (DDR5) will be crucial to performance. Finally, a detail that may go unnoticed: the system, meaning the PC as a whole, sees the chiplets as a single unit, hence the dashed box in Figure 2.

This suggests that only the firmware understands the internal chiplet arrangement. In that figure the driver sits on the opposite side, grouped with the OS and the application under the CPU, with RAM alongside.

In summary, it appears to work much like the current approach: there are still three major blocks, CPU, RAM, and GPU. The difference is that the GPU is now built from chiplets and, more importantly, those chiplets are connected by what AMD calls the Communication Link, better known from its own patent as Crosslink.

AMD Crosslink, what exactly is it and how will it connect the chiplets?

This brings us to a patent that came to light on December 31, 2020. It is extensive, so we will try to simplify it, since AMD would be using it on PC for the first time in what we understand to be RDNA 5 and, specifically, for GPU chiplets.

First, it’s essential to understand that the L3 and its corresponding PHY will return to the die. In effect, according to the AMD Crosslink patent, the plan is to go back to the RX 6000 layout and divide its units into independent chiplets, as the MI300 does in its GPU arrangement.

Therefore, each RX 9000 chiplet with RDNA 5 will have the following units:

WGP

GFX

GDF (Graphic Data Fabric)

SDF (Scalable Data Fabric)

PHY Controller

With this arrangement inside each chiplet, all of them are connected to what AMD has called the HBX Passive Crosslink, which attaches at a point between the SDF and the L3.

Why there? Because this is the interconnection route between all the chiplets and, since data coming directly from the CPU goes to the SDF, placing the link at this point minimizes latency when the chiplets have to exchange data with each other.

AMD uses the Crosslink system in the MI300, but it requires a CPU there. Here there will obviously be no CPU, just chiplets, so we understand that AMD is, to an extent, setting Infinity Fanout Link aside for now. It would not be needed in this case unless AMD decides to move the L3 out of the chiplets again, which seems unlikely given what we have seen.

RDNA 5, the RX 9000, and this chiplet patent leave the door open: in 2 nm or 3 nm?

These are patents and, as we mentioned earlier, only a sketch of what may come. Given the correlation between the two patents, what was seen with the MI300 and its variants, and this new patent that is only a week and a half old, it is likely that we will see something very similar with RDNA 5.

Now, how many chiplets will there be? We need to look at the lithographic processes to get an idea of the density each chiplet can hold, and from there estimate the number of shaders. Going by TSMC’s roadmap up to 2026, we will be on 3 nm at that point, and N2X and N2P can only be expected well into 2026. Given how recent this latest chiplet patent is, RDNA 5 and the RX 9000 could arrive on either node.

That is, on timing alone they are closer to being built on N2 than on N3X, but this is still pure speculation and, for now, a mystery.

The post AMD RX 9000 RDNA 5: A patent reveals the chiplet design for its GPU and how parallel rendering works first appeared on El Chapuzas Informático.
