Yeah, I finally purchased the EVGA GeForce RTX 3090 K|NGP|N. This is a genuinely handpicked graphics card for overclocking, and the boost clock (core) is set to 1920 MHz. This is the highest of any GeForce RTX 3090, and the card boasts the highest performance (side by side with the GALAX HOF OC Lab Edition). As you know, semiconductors are in short supply worldwide today, and it was very difficult to obtain this top-of-the-line graphics card.
This graphics card is equipped with a 360mm AIO water cooler to cool the core and an air cooler to chill the other components. After trying some overclocking, I found that this cooling mechanism performs better than I imagined, and I think nothing needs to be modified for basic overclocking, except the vBIOS*.
*You should rewrite the vBIOS to break the power limit. The default power limit goes up to 450W, but a modified vBIOS raises it to 1000W. Modified vBIOS images, and instructions for flashing a vBIOS to NVIDIA graphics cards, are easy to find on some web sites.
In this article, I will write about HBM2, which has been used in high-end GPUs, especially the Radeon Vega series from AMD and the TITAN V from NVIDIA. HBM (High Bandwidth Memory) is a stacked memory that achieves a very wide memory bus.
Two ways to extend the memory bandwidth
There are two ways to extend the memory bandwidth. One is to raise the memory clock, as the evolution of GDDR shows. The other is to widen the bus, which is what stacked memory such as HBM aims at.
The memory clock and the memory bus width are related as shown below. Extending the memory bandwidth can be thought of as conveying more cargo to a location. If the cargo is carried by trucks, raising the memory clock corresponds to making the trucks drive faster, and widening the bus corresponds to adding lanes to the road. Both methods contribute to extending the memory bandwidth.
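To put numbers on the truck analogy, here is a minimal Python sketch. It assumes the usual formula for theoretical peak bandwidth (per-pin data rate times bus width, divided by 8 bits per byte). The function name is made up for illustration, and the figures are the publicly listed specs of each card, not measurements.

```python
def peak_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    data_rate_gbps: effective data rate per pin in Gbit/s (truck speed)
    bus_width_bits: memory bus width in bits (number of lanes)
    """
    return data_rate_gbps * bus_width_bits / 8  # 8 bits per byte


# (per-pin data rate in Gbps, bus width in bits), taken from public spec sheets
cards = {
    "GeForce RTX 3090 (GDDR6X)": (19.5, 384),
    "Radeon VII (HBM2)": (2.0, 4096),
}

for name, (rate, width) in cards.items():
    print(f"{name}: {peak_bandwidth_gb_s(rate, width):.0f} GB/s")
```

GDDR6X gets there with a few very fast lanes, while HBM2 gets there with many slow lanes.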
The Structure of the HBM
HBM has two main features. One is that multiple memory dies are stacked and connected by TSVs (Through-Silicon Vias), and the other is that a sub-board called a silicon interposer sits between the processor and the memory.
Some advantages of the HBM
With conventional memory, achieving a wide bus means placing many memory chips around the processor, which increases the physical distance between the memory and the processor, and as a result the operating voltage and the power consumption go up. Stacked memory, on the other hand, saves mounting area (see below) and solves these problems.
Also, a TSV has a short connection distance, so it has less resistance and is less likely to pick up noise. Thus, power consumption can be reduced, waveform deterioration and signal delay can be restrained, and high-speed operation can be achieved.
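As a rough illustration of the mounting-area point, the sketch below counts how many memory packages are needed to reach a given bus width. It assumes 32 bits per GDDR6 chip and 1024 bits per HBM2 stack, which are the standard interface widths. The function name is hypothetical, and real board layout constraints are ignored.

```python
import math

GDDR6_BITS_PER_CHIP = 32     # assumption: standard GDDR6 chip interface width
HBM2_BITS_PER_STACK = 1024   # assumption: standard HBM2 stack interface width


def packages_needed(bus_width_bits: int, bits_per_package: int) -> int:
    """Number of memory packages required to fill the given bus width."""
    return math.ceil(bus_width_bits / bits_per_package)


target = 4096  # bus width of the Radeon VII, in bits
print("GDDR6 chips:", packages_needed(target, GDDR6_BITS_PER_CHIP))
print("HBM2 stacks:", packages_needed(target, HBM2_BITS_PER_STACK))
```

A 4096-bit bus would take 128 separate GDDR6 chips spread around the processor, against only 4 HBM2 stacks sitting right next to it.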
The silicon interposer is a substrate made from silicon, and it can reduce the operating voltage and the power consumption because it keeps the wiring between the memory and the processor very short. In addition, silicon allows large amounts of wiring in tight spaces, so a very wide bus can be connected directly between the memory and the processor (without bundling signals). It also helps reduce the mounting area compared to wiring on an ordinary substrate.
The serious disadvantage of the HBM
However, HBM has a fatal disadvantage: high cost. This is inevitable as long as TSVs and the silicon interposer are used. Due to this problem, and the progress of GDDR, HBM2 is no longer used in consumer GPUs (AMD adopted HBM2 in the Vega series, but switched to GDDR6 in the following Navi series).
Probably the best and last consumer GPU equipped with HBM2 will be the Radeon VII (with 16GB VRAM). HBM2, with its wide bus, is attractive, but will it become a relic?