Weird slow NVMe read speeds

FlyingBear
Level 7
I'm having a serious issue with my build, and I can't pin it down. I'd appreciate any help/comments/ideas. The net is that NVMe read speeds are very slow.

ROG Zenith Extreme, 1950x, 4x8GB Corsair Vengeance 3600 (on QVL list), GTX 1080 Ti, EKWB loop.
2x1TB 970 Pro on DIMM2, 2TB 970 Evo under the cover. 2x4TB 860 Evo on SATA. The DIMM2 is cooled with a fan, and the NVMe drive temperatures are just fine, well under 50degC.

No RAID. No HPET. No Asus software. Clean Windows 10 1803 x64 install, and just the AMD chipset drivers installed. Windows is on one 970 Pro, and an empty NTFS partition on the other 970 Pro.

BIOS optimized defaults. Literally no changes at all to those defaults.

Atto and AS-SSD on all three NVMe drives shows 2.5GB/s write, but caps out at 1.8GB/s read. Weird, no? CrystalDiskMark shows respectable 2.5GB/s writes, 3.5GB/s reads. I don't understand why the benchmarks are so different. I suspect that Atto/AS-SSD are correct, because copying a 10GB file from any of the NVMe drives to any of the other two runs at 1.5GB/s. HWInfo confirms that all NVMe drives are PCIe GEN3 x4.
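
As a rough sanity check on the expected read figure: a PCIe Gen3 lane signals at 8 GT/s with 128b/130b encoding, so an x4 link tops out just under 4GB/s before any protocol overhead. The sketch below only does that arithmetic. The function name is made up for illustration, and packet/protocol overhead is ignored, so the real ceiling is somewhat lower than the number it prints.

```python
def pcie_gen3_link_gb_per_s(lanes: int = 4) -> float:
    """Raw payload ceiling of a PCIe Gen3 link in GB/s (decimal), ignoring protocol overhead."""
    gigatransfers_per_s = 8.0      # Gen3 signalling rate per lane
    encoding_efficiency = 128 / 130  # 128b/130b line encoding
    bits_per_byte = 8
    return gigatransfers_per_s * encoding_efficiency * lanes / bits_per_byte


ceiling = pcie_gen3_link_gb_per_s(4)
print(f"Gen3 x4 ceiling: {ceiling:.2f} GB/s")

# Read speeds quoted in the post, as a fraction of that ceiling.
for label, gb_per_s in [("Atto/AS-SSD read", 1.8), ("CrystalDiskMark read", 3.5)]:
    print(f"{label}: {gb_per_s} GB/s = {gb_per_s / ceiling:.0%} of the link")
```

So 3.5GB/s is close to what the link can carry, while 1.8GB/s leaves more than half of it unused. That points away from the link itself, which fits HWInfo reporting GEN3 x4.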

When I install the Samsung NVMe driver, there's a very slight improvement in benchmarks, largely on smaller transfers, as usual.

When I overclock the memory to 3200 (very stable), CPU to 4GHz (shockingly easy and very stable), Atto and AS-SSD improve to 2.5GB/s writes, 2.4GB/s reads. That's a pretty amazing difference, but the reads are still very slow compared to the 3.5GB/s that I'd expect.

When I RAID 0 the two 970 Pros, the benchmarks double, i.e. 5GB/s writes and 3.6GB/s reads.

All of this is the same with BIOS 0902 and 1003.

SOMETHING is throttling the NVMe reads, but my puny mind can't figure it out. Any ideas about what to try, or what could be causing this? THANKS!

Wobbler
Level 7
Currently at 3700MHz, with 64GB (4 DIMMs) at 3200MHz CL14 CR2, SMT ("Hyperthreading") disabled, and in NUMA mode (memory local mode).
Let's see. On my system, with non-empty drives and the latest drivers/W10, I need at least a queue depth of 3 and a 4MB transfer size to max* out my NVMe RAID 0 sequential speed (7000MB/s read, 4000MB/s write) in ATTO (Overlapped I/O). With a lower queue depth the read speed can be halved; that is the AMD bottom drivers and the NVMe RAID layer at work. An Intel drive in the same setup, also hampered by that RAID layer but using its own drivers, needs about 1MB/Q2 (64KB/Q3...) to max* out its transfer rate in ATTO. (It is the OS drive; manual installation of drivers was required for all NVMe drives, and a blue screen will happen during the driver installations.) There are some limitations in the AMD bottom drivers, at least when handling smaller reads/writes and the Intel drive.

CrystalDiskMark will max out one CPU core for the Q32T1 sequential test, so a higher overclock will help, and watching YouTube etc. at the same time can halve the result. ATTO seems to use 5 threads.

max* = the drive's maximum sequential transfer rate (x2 in RAID), which is more or less equal to the maximum transfer rates in online test results. The queue depth/transfer size needed to reach it is not the same, though: in the Intel drive's case the difference is about 1 queue depth or 896KB of transfer size when NVMe RAID is enabled.

There is also diskspd.exe https://gallery.technet.microsoft.com/DiskSpd-a-robust-storage-6cd2f223 with which you set all the test parameters manually and get more info.
Example (add -w100 for a 100% write test):
Diskspd.exe -b128K -d30 -L -h -o2 -t1 -c1G c:\io.d

Intel NVMe
CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 29.11%| 1.09%| 28.02%| 70.88%
1| 0.05%| 0.00%| 0.05%| 99.95%
2| 0.26%| 0.05%| 0.21%| 99.74%
3| 0.00%| 0.00%| 0.00%| 100.00%
4| 4.84%| 4.79%| 0.05%| 95.16%
5| 0.00%| 0.00%| 0.00%| 100.00%
6| 0.00%| 0.00%| 0.00%| 100.00%
7| 0.05%| 0.05%| 0.00%| 99.95%
8| 0.00%| 0.00%| 0.00%| 100.00%
9| 0.00%| 0.00%| 0.00%| 100.00%
10| 0.00%| 0.00%| 0.00%| 100.00%
11| 0.00%| 0.00%| 0.00%| 100.00%
12| 0.00%| 0.00%| 0.00%| 100.00%
13| 0.00%| 0.00%| 0.00%| 100.00%
14| 0.16%| 0.00%| 0.16%| 99.84%
15| 0.00%| 0.00%| 0.00%| 100.00%
-------------------------------------------
avg.| 2.15%| 0.37%| 1.78%| 97.84%
Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 80717021184 | 615822 | 2565.90 | 20527.22 | 0.097 | 0.011 | c:\io.d (1024MB)
-----------------------------------------------------------------------------------------------------
total: 80717021184 | 615822 | 2565.90 | 20527.22 | 0.097 | 0.011

%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.066 | N/A | 0.066
25th | 0.092 | N/A | 0.092
50th | 0.092 | N/A | 0.092
75th | 0.103 | N/A | 0.103
90th | 0.105 | N/A | 0.105
95th | 0.112 | N/A | 0.112
99th | 0.135 | N/A | 0.135
3-nines | 0.211 | N/A | 0.211
4-nines | 0.307 | N/A | 0.307
5-nines | 0.341 | N/A | 0.341
6-nines | 1.582 | N/A | 1.582
7-nines | 1.582 | N/A | 1.582
8-nines | 1.582 | N/A | 1.582
9-nines | 1.582 | N/A | 1.582
max | 1.582 | N/A | 1.582
--------------------------------------------

Diskspd.exe -b128K -d30 -L -h -o8 -t4 -si -c1G e:\io.d
NVMe Raid 0
CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 48.59%| 0.94%| 47.66%| 51.41%
1| 42.50%| 0.73%| 41.77%| 57.50%
2| 43.12%| 0.83%| 42.29%| 56.87%
3| 38.28%| 0.83%| 37.45%| 61.72%
4| 8.18%| 3.39%| 4.79%| 91.82%
5| 14.53%| 1.09%| 13.44%| 85.47%
6| 0.83%| 0.36%| 0.47%| 99.17%
7| 0.21%| 0.16%| 0.05%| 99.79%
8| 14.01%| 0.00%| 14.01%| 85.99%
9| 24.90%| 0.00%| 24.90%| 75.10%
10| 13.39%| 0.05%| 13.33%| 86.61%
11| 4.17%| 0.05%| 4.11%| 95.83%
12| 4.27%| 0.00%| 4.27%| 95.73%
13| 13.44%| 0.05%| 13.39%| 86.56%
14| 0.68%| 0.00%| 0.68%| 99.32%
15| 0.00%| 0.00%| 0.00%| 100.00%
-------------------------------------------
avg.| 16.94%| 0.53%| 16.41%| 83.06%

Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 53132787712 | 405371 | 1689.03 | 13512.25 | 0.592 | 0.029 | e:\io.d (1024MB)
1 | 53330706432 | 406881 | 1695.32 | 13562.58 | 0.590 | 0.028 | e:\io.d (1024MB)
2 | 53324546048 | 406834 | 1695.13 | 13561.01 | 0.590 | 0.028 | e:\io.d (1024MB)
3 | 53357838336 | 407088 | 1696.19 | 13569.48 | 0.589 | 0.027 | e:\io.d (1024MB)
-----------------------------------------------------------------------------------------------------
total: 213145878528 | 1626174 | 6775.67 | 54205.33 | 0.590 | 0.028

%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.302 | N/A | 0.302
25th | 0.579 | N/A | 0.579
50th | 0.589 | N/A | 0.589
75th | 0.599 | N/A | 0.599
90th | 0.612 | N/A | 0.612
95th | 0.624 | N/A | 0.624
99th | 0.668 | N/A | 0.668
3-nines | 0.752 | N/A | 0.752
4-nines | 1.396 | N/A | 1.396
5-nines | 2.421 | N/A | 2.421
6-nines | 2.494 | N/A | 2.494
7-nines | 2.516 | N/A | 2.516
8-nines | 2.516 | N/A | 2.516
9-nines | 2.516 | N/A | 2.516
max | 2.516 | N/A | 2.516
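
The two runs above are consistent with a simple queueing estimate (Little's law): throughput is roughly the number of I/Os in flight times the block size, divided by the average latency. A minimal sketch using only numbers from the output above; the helper name is hypothetical, and it assumes diskspd's MB/s column means MiB/s (which matches total bytes / 30 s / 1048576 for the first run).

```python
MIB = 1024 * 1024


def estimated_mib_per_s(block_bytes: int, outstanding_per_thread: int,
                        threads: int, avg_latency_ms: float) -> float:
    """Little's law estimate: (I/Os in flight * block size) / average latency."""
    in_flight = outstanding_per_thread * threads
    bytes_per_s = in_flight * block_bytes / (avg_latency_ms / 1000.0)
    return bytes_per_s / MIB


# Intel NVMe run: -b128K -o2 -t1, AvgLat 0.097 ms, reported 2565.90 MB/s
intel = estimated_mib_per_s(128 * 1024, 2, 1, 0.097)

# NVMe RAID 0 run: -b128K -o8 -t4, AvgLat 0.590 ms, reported 6775.67 MB/s
raid0 = estimated_mib_per_s(128 * 1024, 8, 4, 0.590)

print(f"Intel NVMe estimate:  {intel:.1f} MiB/s")
print(f"NVMe RAID 0 estimate: {raid0:.1f} MiB/s")
```

Both estimates land within about half a percent of the reported totals. In other words, at a given latency the throughput is set by how much I/O is kept in flight, which fits the observation that a low queue depth or a small transfer size leaves a lot of read speed on the table.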

Thank you for all of this detail. I've also found that NUMA improves performance. I'll experiment along the lines that you suggest. It's a pity that out-of-the-box performance is so poor, though.

https://youtu.be/_MUJBdd5GOA?t=1m49s

This guy claims that manually setting the PCIe link speed to Gen 3 improves speeds.

NorySS wrote:
https://youtu.be/_MUJBdd5GOA?t=1m49s

This guy claims that manually setting the PCIe link speed to Gen 3 improves speeds.


Thank you. I tried that, with no effect. What did work was overclocking memory to 3200 (using QVL memory rated at 3600). It raised the speeds in CrystalDiskMark to what I've seen in reviews. Still slower in Atto, but closer to what I'd expect.

FlyingBear wrote:
Thank you. I tried that, with no effect. What did work was overclocking memory to 3200 (using QVL memory rated at 3600). It raised the speeds in CrystalDiskMark to what I've seen in reviews. Still slower in Atto, but closer to what I'd expect.


Changing to NUMA mode, changing to GEN 3 per the video, and the latest BIOS have given me the speeds below, which have been my best so far (using 3 EVO 860 in RAID 0). The first image is after both NUMA and GEN 3, the second is after NUMA.

[Attached image 75243: benchmark screenshots]

Wow, those are great speeds. The best I have gotten was 7GB/s reads.

Overclocking the CPU also helps. I saw a difference between 4GHz and 4.15GHz.

NorySS wrote:
Wow, those are great speeds. The best I have gotten was 7GB/s reads.

Overclocking the CPU also helps. I saw a difference between 4GHz and 4.15GHz.


I am running at 4GHz and have my RAM at 3200. I have 3 NVMe drives; how many are you working with?

vsimone67 wrote:
I am running at 4GHz and have my RAM at 3200. I have 3 NVMe drives; how many are you working with?


3 also. Samsung 960 EVOs.