(GPU) NVIDIA GeForce RTX 2060 SUPER
(Profile updated as of 12th August 2019)
Here is my GPU profile for the RTX 2060 Super. I wanted RTX so I took the plunge with the second wave of Turing-based 20-series cards.
(Picture 1) The silicon die of TU106-410. It is surrounded by 8 x 1024 MB GDDR6 SDRAM chips, which together make up the 256-bit interface. The image is from TechPowerUp's review of the card.
(Picture 2) The architectural block-diagram for TU106-410. Note the disabled TPC with its two Streaming Multi-Processors.
(Picture 3) Actual silicon die shot of the TU106 GPU, taken using infrared imaging. The chip pictured is the TU106-200 silicon from the RTX 2060, but I have annotated the single disabled TPC, with its two SMs, that is laser cut on the 2060 Super. Image credit for the die shot goes to Fritzchens Fritz, and the information on the GPC structure on the die is from this highly useful tool.
Graphics Card Information
Graphics Card: NVIDIA GeForce RTX 2060 SUPER
Graphics Card Manufacturer: NVIDIA
Graphics Card Release Date: July 9th, 2019
Graphics Card MSRP: $399 USD
Graphics Processor Codename: TU106-410
Graphics Processor Manufacturer: NVIDIA
Graphics Processor Implementation: Cut die
Graphics Interface: PCI-E 16x Gen3
Architecture: Turing (TU10x)
Lithography Process: TSMC 12nmFFN FinFET
Approximate die size: 445mm²
SashleyCat's GPU die Size Rating: Large
Approximate Transistor Count: 10,800 Million
Approximate Transistor Density: 24 Million / Square Millimetre
Double-speed FP16 Shading: Yes (FP16x2 Facilitated by Tensor Cores)
Asynchronous Compute Capability: Full
DirectX Hardware Support: DX12.1 (FL 12_1)
Dedicated DXR Acceleration on chip: Yes (RTX)
Variable-rate Shading: Yes (Adaptive Shading)
Adv. Geometry shading: Yes (Mesh Shading)
Adv. Geometry shading (Programmable/DX12 Mesh Shaders): Yes
AI/ML Acceleration: Yes (Tensor Cores)
Advanced Memory Management: No
Integer and Float Shader Co-execution: Yes
Tile-based Renderer: Yes
GPU Computing Resources
GPU Substructures: 3 Graphics Processing Clusters, 17 Texture Processing Clusters (18 TPC Full chip)
Graphics Cores: 34 Streaming Multi-processors (36 Full Chip)
Graphics Cores per Substructure: 2 per TPC, 2 x GPC with 12, 1 x GPC with 10
Total Stream Processors (ALU/Shaders): 2176 (float/Int) (2304 Full Chip) *
Stream Processors per Graphics Core: 64 Float32, 64 INT32
Graphics Core SIMD Structure: 4 x 16 Float32, 4 x 16 INT32
Total Special Execution Units: 544 Special Function Units (576 Full Chip), 544 Load/Store Units (576 Full Chip), 272 Tensor Cores (288 Full Chip), 34 Ray Tracing Cores (36 Full Chip), 68 FP64 CUDA Cores (72 Full Chip)
Special Execution Units per Graphics Core: 16 Special Function Units, 16 Load/Store Units, 8 Tensor Cores, 2 FP64 CUDA Cores, 1 Ray Tracing Core
Total Texturing Units: 136 (144 Full Chip)
Texturing Units per Graphics Core: 4
Pixel Pipelines (ROPs): 64 (8 x ROP Partitions with 8 Pixels per clock)
Level 2 shared on-chip cache: 4096 KB
Geometry/Tessellation Processors: 17 (18 Full Chip)
Raster Engines: 3
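The totals in this section are per-SM unit counts multiplied by the number of active SMs. A minimal sketch of that arithmetic, assuming the per-SM figures and the 12/12/10 GPC layout listed in this profile; the names `PER_SM` and `totals` are purely illustrative:

```python
# Per-SM unit counts as listed in this profile (Turing TU106).
PER_SM = {
    "FP32 CUDA cores": 64,
    "INT32 cores": 64,
    "Special Function Units": 16,
    "Load/Store Units": 16,
    "Tensor Cores": 8,
    "FP64 CUDA cores": 2,
    "Ray Tracing Cores": 1,
    "Texture units": 4,
}

SMS_PER_TPC = 2
SMS_PER_GPC = [12, 12, 10]   # RTX 2060 Super: one TPC disabled in one GPC
FULL_CHIP_SMS = 36           # fully enabled TU106


def totals(sm_count):
    """Multiply each per-SM unit count by the number of SMs."""
    return {name: per_sm * sm_count for name, per_sm in PER_SM.items()}


active_sms = sum(SMS_PER_GPC)              # SMs enabled on this card
active_tpcs = active_sms // SMS_PER_TPC    # TPCs enabled on this card
card = totals(active_sms)
full_chip = totals(FULL_CHIP_SMS)

print(f"SMs: {active_sms}, TPCs: {active_tpcs}")
for name in PER_SM:
    print(f"{name}: {card[name]} ({full_chip[name]} full chip)")
```

With 34 SMs this gives 2176 FP32 CUDA cores, against 2304 on the full chip.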
GPU Memory Subsystem
Graphics Memory Type: GDDR6
Graphics Memory Standard Capacity: 8192 MB
Graphics Memory Composition: 8 x 1024 MB GDDR6 SDRAM Chips
Graphics Memory Access Granularity: 32-bit (4 bytes)
Graphics Memory Standard Clock Speed / Data Rate: 1750 MHz / 14000 MHz
Graphics Memory Full Interface Width: 256-bit (32 bytes per clock)
Graphics Memory Peak Memory Bandwidth: 448 GB/s
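The capacity, interface width and peak bandwidth all follow from the chip count and the data rate. A minimal sketch, assuming each of the 8 GDDR6 chips contributes an equal 32-bit slice of the 256-bit interface (256 / 8); the variable names are illustrative only:

```python
CHIPS = 8
CHIP_CAPACITY_MB = 1024
CHIP_INTERFACE_BITS = 32      # assumption: 256-bit bus split evenly over 8 chips
DATA_RATE_MTPS = 14_000       # effective data rate, million transfers per second

capacity_mb = CHIPS * CHIP_CAPACITY_MB
bus_width_bits = CHIPS * CHIP_INTERFACE_BITS

# Bytes moved per transfer across the whole bus, times transfers per second.
bytes_per_transfer = bus_width_bits // 8
bandwidth_gb_s = DATA_RATE_MTPS * bytes_per_transfer / 1000

print(f"Capacity:  {capacity_mb} MB")
print(f"Bus width: {bus_width_bits}-bit ({bytes_per_transfer} bytes per transfer)")
print(f"Bandwidth: {bandwidth_gb_s:.0f} GB/s")
```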
GPU Frequency and Peak performance
Graphics Engine Clock: 1650 MHz *
GPU Computing Power FP16: 14,361,600 Million operations per second with FMA
GPU Computing Power FP32: 7,180,800 Million operations per second with FMA
GPU Computing Power FP64: 224,400 Million operations per second with FMA
GPU Texturing Rate INT8: 224,400 Million texels per second
GPU Texturing Rate FP16: 224,400 Million texels per second
GPU Pixel Rate: 105,600 Million pixels per second
GPU Primitive Rate: 4,950 Million triangles per second *
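Peak figures of this kind are unit counts multiplied by the engine clock, with a factor of two wherever a fused multiply-add is counted as two operations. A minimal sketch of that arithmetic, assuming the 1650 MHz rated boost, 34 SMs and the per-SM unit counts from this profile. The double-rate FP16 and the one triangle per clock per Raster Engine are assumptions taken from the feature list and the primitive rate footnote, not NVIDIA-published rates, and `peak_rate_millions` is an illustrative name:

```python
CLOCK_MHZ = 1650   # NVIDIA-rated boost clock
SMS = 34           # active Streaming Multi-processors
ROPS = 64
RASTER_ENGINES = 3


def peak_rate_millions(units, clock_mhz, ops_per_clock=1):
    """Peak rate in millions per second: units x ops per clock x MHz."""
    return units * ops_per_clock * clock_mhz


fp32 = peak_rate_millions(SMS * 64, CLOCK_MHZ, ops_per_clock=2)  # FMA = 2 ops
fp16 = fp32 * 2                                                  # assumed double-rate FP16
fp64 = peak_rate_millions(SMS * 2, CLOCK_MHZ, ops_per_clock=2)   # 2 FP64 cores per SM
texel = peak_rate_millions(SMS * 4, CLOCK_MHZ)                   # 4 texture units per SM
pixel = peak_rate_millions(ROPS, CLOCK_MHZ)
triangles = peak_rate_millions(RASTER_ENGINES, CLOCK_MHZ)        # assumed 1 per clock each

for label, value in [("FP16", fp16), ("FP32", fp32), ("FP64", fp64),
                     ("Texel", texel), ("Pixel", pixel), ("Triangle", triangles)]:
    print(f"{label}: {value:,} Million per second")
```

Real cards usually run above the rated boost, so these are conservative paper numbers rather than measured throughput.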
GPU Thermal and Power
Standard Cooling Solution: Dual-Fan Axial cooler with Vapour chamber heatsink
Typical Board Power: 175 W
Maximum Board Power: Varies per design (210W standard)
Maximum Allowed Junction Temperature (TJ Max): 89°C
Graphics Card description
GeForce RTX 2060 Super launched in early July 2019 as a refresh of the 'mid-range' 20-series graphics cards, based on the TU10x Turing architecture. This card doesn't entirely replace the vanilla RTX 2060; instead, it occupies a price point just above that card and 100 USD under the original RTX 2070, while offering more or less the same performance as the 2070. A major advantage of the RTX 2060 Super is that it now features 8GB of video memory on a 256-bit memory interface, and with the same 14Gbps GDDR6 it has a lot more memory bandwidth than the vanilla 2060 to back it up. For all intents and purposes, the RTX 2060 Super is an RTX 2070 with a single TPC disabled and slightly more aggressive clock rates, resulting in roughly the same performance.
Being based on the TU10x Turing architecture, using the TU106 silicon, the RTX 2060 Super fully supports hardware-accelerated Ray Tracing through Microsoft's DXR or other APIs. In addition, the GPU contains "Tensor Cores" for acceleration of machine-learning and AI workloads. Like all Turing GPUs, the TU106 silicon represents a fairly significant change over the previous-generation 'Pascal' processors. Major changes include a switch from Instruction Level Parallelism and dual-issue warps to a thread-level parallelism design, and major overhauls to the streaming multi-processor with enhanced L1 cache performance and size. Turing-based GPUs also have dedicated pipelines for Integer shader code, so non-dependent Floating-point and Integer instructions can now execute concurrently. In games that utilise lots of mixed INT and FP instructions, this can result in a fairly significant increase in shading efficiency.
Turing GPUs are built on TSMC's 12nmFF 'N' process (FFN; the "N" designates that this process is optimised especially for Nvidia). Due to the increased transistor requirements of the additional hardware logic (INT pipes, Tensors, Ray Trace, and vastly increased caches), and the lack of any significant density improvement over TSMC's 16nmFF, the Turing chips of the 20 and 16-series are very large. For example, TU106, the 'smallest' Ray-Trace capable Turing GPU and a truly 'mid-range' part, is now almost as large as the flagship GP102 processor from the previous-generation GTX 1080 Ti and TITAN Xp video cards (445 vs 471mm²), with almost as many transistors (10.8 vs ~12 billion).
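The size comparison can be checked with a little arithmetic. A minimal sketch using only the approximate figures quoted in this profile (the ~12 billion transistor count for GP102 is the rounded value from the text, so the results are rough); the function name is illustrative:

```python
def density_millions_per_mm2(transistors_millions, die_area_mm2):
    """Average transistor density in millions of transistors per mm²."""
    return transistors_millions / die_area_mm2


# Approximate figures as quoted in the text above.
TU106 = {"transistors_millions": 10_800, "die_area_mm2": 445}
GP102 = {"transistors_millions": 12_000, "die_area_mm2": 471}  # ~12 billion, rounded

tu106_density = density_millions_per_mm2(**TU106)
gp102_density = density_millions_per_mm2(**GP102)
area_ratio = TU106["die_area_mm2"] / GP102["die_area_mm2"]
transistor_ratio = TU106["transistors_millions"] / GP102["transistors_millions"]

print(f"TU106 density: {tu106_density:.1f} M/mm²")
print(f"GP102 density: {gp102_density:.1f} M/mm²")
print(f"TU106 is {area_ratio:.0%} of GP102's area with {transistor_ratio:.0%} of its transistors")
```

On these numbers the two chips land at a very similar density, which is consistent with 12nm FFN offering little shrink over 16nmFF.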
The RTX 2060 Super holds the accolade of being NVIDIA's first "60" ('Mid-range') positioned graphics card with 8GB of video memory, but the price puts it more in the territory of the "70" series of previous generations.
Graphics Card approximate 3D Performance
Sashleycat gaming performance rating (2019): Great for 1440p maximum settings 60 FPS (1080p High settings with DXR), or 1080p maximum settings high refresh (no DXR)
GeForce RTX 2060 Super provides performance around the same as the original RTX 2070, putting it slightly ahead of the RX Vega 64 and GTX 1080. Performance is comparable to AMD's latest RX 5700 (non XT) video card. This results in great performance for a 1440p monitor, with maximum detail settings, in the latest titles in 2019. Unique to the 20-series, the RTX 2060 Super can use dedicated hardware blocks to accelerate Real Time Ray Tracing in video games. With this feature enabled, the card is pushed down a performance tier to 1080p, where it provides reasonable FPS.
Graphics Engine Clock
The NVIDIA-spec rated boost clock is listed. Actual gaming clocks will be higher due to GPU Boost. They vary with the power limit and cooling capacity of each design, but will likely be around 1900 MHz. As a result, it is almost impossible to say what each card will run at in gaming situations.
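To get a feel for how much the real-world clock matters, here is a minimal sketch comparing peak FP32 throughput at the rated boost against the roughly 1900 MHz estimate above. The 1900 MHz value is only that estimate, not a measured clock, and `fp32_gflops` is an illustrative name:

```python
SHADERS = 34 * 64   # active SMs x FP32 lanes per SM


def fp32_gflops(clock_mhz, shaders=SHADERS):
    """Peak FP32 GFLOPS, counting each FMA as two operations."""
    return shaders * 2 * clock_mhz / 1000


rated = fp32_gflops(1650)     # NVIDIA-rated boost clock
typical = fp32_gflops(1900)   # assumed typical gaming clock
uplift = typical / rated - 1

print(f"Rated boost:   {rated:,.1f} GFLOPS")
print(f"~1900 MHz:     {typical:,.1f} GFLOPS")
print(f"Difference:    {uplift:.1%}")
```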
GPU Primitive Rate
Raw triangle output based on my understanding of the Raster Engines. The PolyMorph engines attached to each TPC may have an effect on the total number of triangles rasterised.
Total Stream Processors (ALU/Shaders)
Only 32-bit precision CUDA cores are listed, and only advertised CUDA cores. You can see the SIMD structure for the full pipeline count in 32-bits.
This bit is for my personal opinion on this Graphics card / Graphics processor
Sashleycat's Awesomeness Rating: Cool and innovative technology. But still too expensive to be truly awesome.