DirectX 12 benchmarks for Nvidia’s fastest graphics card ever, the GeForce GTX 1080, have finally been released. Impressive entries for the GTX 1080 have made their way to the Ashes of The Singularity benchmarking database, and we’re going to share them with you.

| Graphics Card | GTX 980 | GTX Titan X | GTX 1070 | GTX 1080 |
| --- | --- | --- | --- | --- |
| Manufacturing Process | 28nm | 28nm | 16nm | 16nm |
| Transistors | 5.2 Billion | 8 Billion | 7.2 Billion | 7.2 Billion |
| CUDA Cores | 2048 | 3072 | TBA | 2560 |
| Memory Bus | 256-bit | 384-bit | 256-bit | 256-bit |
| Launch Date | September 2014 | March 2015 | 10th June 2016 | 27th May 2016 |
| Launch Price | $549 | $999 | $449 (Founder's Ed) | $699 (Founder's Ed) |

NVIDIA GeForce GTX 1080 DirectX 12 Performance Benchmarks Revealed

The “Founder’s” edition is just the name that Nvidia decided to give to the reference-design card, featuring the blower fan and the metallic shroud. Many gamers found this quite bizarre, as reference-design cards are usually the least sought after due to their higher noise output, higher temperatures and lower clock speeds. Nvidia’s decision to market the reference design as a premium option very likely stems from the initial limited availability of the card, so the $100 premium will act as an early adopter’s tax for the time being, until Nvidia’s board partners launch their own custom versions of the GTX 1080 at $599 later on.

Reviews for the GTX 1080 will go live on May 17th, 10 days before the Founder’s edition GTX 1080 is made available for purchase. The GTX 1070 Founder’s edition will launch June 10th for $449, with the board partner cards launching later for $379. The embargo date on GTX 1070 reviews hasn’t been revealed yet, but it will likely precede the June 10th date by at least a week, similar to the GTX 1080.
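As a quick sanity check on the pricing, the short Python sketch below works out the Founder’s edition premium for both cards from the prices quoted in this article. The dictionary layout and the function name are our own, chosen purely for illustration.

```python
# Prices in USD as quoted in this article (Founder's edition vs. board partner cards).
PRICES = {
    "GTX 1080": {"founders": 699, "partner": 599},
    "GTX 1070": {"founders": 449, "partner": 379},
}


def founders_premium(card):
    """Return (premium in USD, premium as a percentage of the partner price)."""
    p = PRICES[card]
    diff = p["founders"] - p["partner"]
    return diff, round(100 * diff / p["partner"], 1)


for card in PRICES:
    usd, pct = founders_premium(card)
    print(f"{card}: +${usd} ({pct}% over the partner price)")
```

In relative terms the premium is slightly steeper on the GTX 1070 than on the GTX 1080, even though the dollar figure is smaller.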

According to Nvidia, the GeForce GTX 1080 will be roughly 20% faster than the GTX 980 Ti, while the GTX 1070 will be slightly faster than reference GTX 980 Ti cards and on par with factory overclocked variants.

GeForce GTX 1080 DirectX 12 Performance Revealed

The entries include two resolutions, 2560×1440 and 1920×1080, both run at the “Crazy” graphics preset. At 1920×1080, the GTX 1080 was 13% faster than the GTX 980 Ti and 11% faster than the R9 Fury X. At 2560×1440 with the same preset, the GTX 1080 was 9% faster than the GTX 980 Ti and 11% faster than the R9 Fury X.
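For readers who want to reproduce this kind of comparison, here is a minimal Python sketch of the arithmetic behind “X% faster” figures. The function names are ours, and the code works only with the rounded percentages quoted above, not with raw frame rates from the benchmark database, so treat the results as rough.

```python
def percent_faster(fps_a, fps_b):
    """How much faster card A is than card B, in percent, given average frame rates."""
    return (fps_a / fps_b - 1.0) * 100.0


def implied_gap(pct_over_x, pct_over_y):
    """If card A is pct_over_x% faster than X and pct_over_y% faster than Y,
    return how much faster Y is than X, in percent (negative means Y is slower)."""
    return ((1.0 + pct_over_x / 100.0) / (1.0 + pct_over_y / 100.0) - 1.0) * 100.0


# Percentages quoted in the article: (vs GTX 980 Ti, vs R9 Fury X).
quoted = {"1920x1080": (13, 11), "2560x1440": (9, 11)}

for resolution, (over_980ti, over_furyx) in quoted.items():
    gap = implied_gap(over_980ti, over_furyx)
    print(f"{resolution}: Fury X vs 980 Ti implied gap = {gap:+.1f}%")
```

Taken at face value, the quoted figures put the R9 Fury X a little under 2% ahead of the GTX 980 Ti at 1920×1080 and a little under 2% behind it at 2560×1440. Because the inputs are rounded to whole percentages, gaps this small should not be over-read.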

DirectX 12 Ashes Of The Singularity – 1080p Crazy Preset

DirectX 12 Ashes Of The Singularity – 1440p Crazy Preset

These numbers aren’t as high as the ones touted by Nvidia during the press event over the weekend, but they’re respectable nonetheless. We also can’t draw any definitive conclusions just by looking at the card’s performance in one game, so keep your eyes peeled for those reviews on the 17th.

Nvidia Marketing Benchmarks – Take With A Grain Of Salt

The Pascal Building Block Of Every GTX 1080 And GTX 1070 Graphics Chip

The basic building block of every Pascal GPU is called the streaming multiprocessor or SM for short. The streaming multiprocessor is a graphics and compute engine that schedules and executes instructions on many threads simultaneously.

NVIDIA Pascal GP100 SM

Each Pascal streaming multiprocessor houses 64 FP32 CUDA cores, half that of a Maxwell SM. Within each Pascal streaming multiprocessor there are two 32 CUDA core partitions, two dispatch units and a brand new, smarter scheduler, in addition to an instruction buffer that’s twice the size of Maxwell’s per CUDA core. This gives each Pascal CUDA core access to twice the registers compared to Maxwell.

The end result is more performance per clock per CUDA core, lower power consumption and a higher overall clock speed. The updated hardware scheduler extends Pascal’s abilities to execute code asynchronously, which will no doubt have a positive impact on the architecture’s performance when it comes to DirectX 12 Async Compute.

If you want to read more about the Pascal architecture, check out our in-depth breakdown of the architecture versus its predecessors, Maxwell (GTX 900 series) and Kepler (GTX 600 and 700 series).

DirectX 12 Async Compute Still As Important As Ever – Can Nvidia Catch Up To AMD?

I dove deep into the Pascal architecture last month and explored the nitty-gritty details. One of the more important architectural changes that Nvidia has introduced with Pascal is the addition of a hardware scheduler, similar to what AMD did with the GCN architecture in 2011, starting with the HD 7000 series.

This new hardware scheduler will play a crucial role in allowing Pascal GPUs to perform better at executing tasks asynchronously, even though, according to what Nvidia has revealed in its Pascal whitepaper, it evidently still relies on pre-emption and context switching.

So while this scheduler doesn’t actually allow tasks to be executed asynchronously, it will still improve the performance of Pascal GPUs when it comes to executing code that’s written asynchronously. It’s sort of a hack to tide Pascal over until proper async compute is implemented in Nvidia’s future architectures.
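To make the distinction concrete, here is a deliberately simplified Python toy model contrasting the two approaches: running graphics and compute work back to back with a context-switch cost, versus letting compute work soak up execution units that graphics leaves idle. This is not how Pascal or GCN actually schedule work; every name and number below is a hypothetical chosen for illustration only.

```python
def serialized_ms(graphics_ms, compute_ms, switches, switch_cost_ms):
    """Frame time when graphics and compute take turns on the GPU.

    Each pre-emption or context switch adds a fixed overhead (a toy assumption).
    """
    return graphics_ms + compute_ms + switches * switch_cost_ms


def overlapped_ms(graphics_ms, compute_ms, graphics_utilization):
    """Frame time when compute work runs concurrently in otherwise idle units.

    graphics_utilization is the fraction of the GPU kept busy by graphics
    (0.0 to 1.0); the idle remainder absorbs compute work for free.
    """
    absorbed = (1.0 - graphics_utilization) * graphics_ms
    leftover = max(0.0, compute_ms - absorbed)
    return graphics_ms + leftover


# Hypothetical frame: 10 ms of graphics, 3 ms of compute.
serial = serialized_ms(10.0, 3.0, switches=2, switch_cost_ms=0.1)
overlap = overlapped_ms(10.0, 3.0, graphics_utilization=0.8)
print(f"serialized: {serial:.2f} ms, overlapped: {overlap:.2f} ms")
```

In this model, faster context switching shrinks the overhead term but never removes the cost of running the two workloads one after the other, which is one way to picture why a better scheduler alone might yield only modest gains.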

Async Compute has always been a controversial issue for Nvidia, largely because the company refused to talk about it for many months and promised a driver that would enable it on Maxwell that never came. The addition of an updated hardware scheduler in Pascal signals a change of heart for Nvidia. It represents a walk-back on some of the trade-offs that the company decided to make with Maxwell to achieve its power efficiency goals.

Many developers argued that these trade-offs were reasonably sound for DirectX 11 and traditional generic APIs, but less so for the new era of VR and low-level APIs such as DirectX 12 and Vulkan, where executing code asynchronously has proven to benefit latency and performance. Pascal should be better at Async Compute thanks to the new hardware scheduler, although no one knows yet exactly how much better. However, if these DirectX 12 benchmarks from Ashes of The Singularity, a game that makes plentiful use of DirectX 12 async compute, are any indication, then we’re likely only looking at minimal improvements with Pascal.