  1. #1
    jab383 (ROG Guru: Blue Belt)
    Join Date
    Feb 2014
    Reputation
    107
    Posts
    848

    DDR4-4400 vs 4133

    Recent posts have discussed the point of overclocking DRAM: what can be gained, what is the cost of going too far, and where is a reasonable sweet spot? In my competitive overclocking, I hadn’t considered those questions in enough detail to have an opinion, so I ran some tests with various DRAM profiles.

    Since the point is to compare DRAM settings, all else is held as consistent as possible. Hardware list and CPU settings are:

    Maximus X Apex motherboard
    Core i5 8600K CPU (because that’s what I had in the socket)
    5.2 GHz core clock, 4.9GHz cache clock at 1.37 Vcore, LLC 7, -1x AVX setback
    Chilled water cooling at 14-15C (to stay above the dew point)
    73C peak core temperature
    GT 1030 video card
    One kit 2x8GB G.Skill Trident Z DRAM F4-4400C19D-16GTXSW

    The tests use six manually tuned profiles, three each at 4400 and 4133. Note that these are tuned for the best performance I can get with these DRAM sticks, this CPU, this motherboard, at this temperature, etc. YMMV.

    The primary difference in these profiles is the criterion for stability. First, for 24/7 assured stability, MemtestHCI is the best choice. I only went for 100% coverage because all that tuning and testing was taking long enough without going longer. Profiles 1 and 4 are good enough that you would want those in your 24/7 gaming or productivity rig. I don’t want them on my testbench.

    For competitive benchmarking, I want a DRAM profile just stable enough for the benchmark being tested. The worst case of all benchmarks on HWBOT is y-Cruncher. The Pi 1 billion digit (1B) test exercises a lot of the DRAM and catches any errors. It also takes less than a minute on a six-thread CPU. I call the result semi-stable. In my experience, if a DRAM profile completes the Pi 1B test, it’s good enough for any other benchmark on HWBOT. Giving up 24/7 perfection gives gains in bandwidth and latency. You probably don’t want profiles 2 or 5 on your 24/7 rig, but I use them extensively, both for most 3D benchmarks and for the toughest 2D tests like y-Cruncher.

    The third level takes the ‘just stable enough’ approach another step. Profiles 3 and 6 are not stable, but they can do enough benchmarks that they are useful, at least to me.

    [Attached screenshots, in posted order: 8600K-4400-c17-mt.jpg, 8600K-4400-c17-yc.jpg, 8600K-4400-c17-ycer.jpg, 8600K-4133-c16-mt.jpg, 8600K-4133-c15-yc.jpg, 8600K-4133-c15-er.jpg]


    The AIDA64 benchmarks measure write bandwidth, read bandwidth and latency. Bandwidth depends on the number of read or write operations the combined DRAM and CPU can complete. Latency here is not to be confused with the CAS Latency primary setting. CAS Latency is only a part of the measured delay of a data access from the CPU’s view. Measured latency also includes the round trip over the motherboard traces and the I/O time in the DRAM and in the memory controller. Changes to core and cache clocks can affect this measurement, so they were held constant in these tests. Interpreting how these measured numbers affect performance is another issue.
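
    As a reference point for reading the bandwidth numbers, here is a rough sketch of the theoretical peak for each DDR rate. It assumes a dual-channel setup with a 64-bit (8-byte) bus per channel, which is my assumption for this 2x8GB configuration and not something measured in the tests. The function name is made up for illustration. Measured bandwidth will always come in below this ceiling.

```python
def peak_bandwidth_gbs(data_rate_mts: float, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes).

    data_rate_mts: DDR data rate in mega-transfers per second (e.g. 4400).
    channels:      number of memory channels (assumed 2 here).
    bus_bytes:     bytes moved per transfer per channel (assumed 8, i.e. 64-bit).
    """
    return data_rate_mts * 1e6 * bus_bytes * channels / 1e9


if __name__ == "__main__":
    fast = peak_bandwidth_gbs(4400)
    slow = peak_bandwidth_gbs(4133)
    print(f"DDR4-4400 peak: {fast:.1f} GB/s")
    print(f"DDR4-4133 peak: {slow:.1f} GB/s")
    # Relative gap between the two theoretical ceilings.
    print(f"Theoretical gap: {100 * (1 - slow / fast):.1f}%")
```

    This is only the ceiling set by the transfer rate. How close a profile gets to it depends on the timings, which is what the tuning is about.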

    I tested with y-Cruncher because it’s so tough on DRAM, is nearly memory bound when core clocks are over about 5.0GHz, and shows score variation with memory bandwidth and, to a lesser extent, with latency. y-Cruncher uses AVX instructions, burns a lot of power and gets your CPU hot. The CPU overclock needs to be very stable to pass y-Cruncher; otherwise CPU errors creep in and confuse the DRAM tuning.

    Geekbench 3 emphasizes ‘everyday’ tasks in an extensive array of workload tests. It gives single-core and multicore scores for CPU performance and gives an explicit memory performance score. The multicore score responds to memory bandwidth and the memory score responds to both bandwidth and latency.

    SuperPi 32M is an old, honored benchmark that uses a single thread to compute a lot of digits of Pi. This takes nearly 6 minutes per run at these clock rates. SuperPi is the next most stringent DRAM test after y-Cruncher and sets the stability bar for profiles 3 and 6. SuperPi scores respond more to DRAM latency and less to bandwidth.
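
    To summarize the three stability levels, here is a minimal sketch of how I sort a profile by the toughest test it passes. The class, the function and the tier labels "benchmark-only" and "unusable" are names I made up for illustration; the tests themselves (MemtestHCI to 100% coverage, y-Cruncher Pi 1B, SuperPi 32M) are the ones described above.

```python
from dataclasses import dataclass


@dataclass
class StabilityResult:
    """Pass/fail outcome of each stability test for one DRAM profile."""
    memtest_100pct: bool  # MemtestHCI reached 100% coverage with no errors
    ycruncher_1b: bool    # y-Cruncher Pi 1B completed
    superpi_32m: bool     # SuperPi 32M completed


def stability_tier(result: StabilityResult) -> str:
    """Return the highest stability tier the profile qualifies for."""
    if result.memtest_100pct:
        return "24/7 stable"       # profiles 1 and 4
    if result.ycruncher_1b:
        return "semi-stable"       # profiles 2 and 5
    if result.superpi_32m:
        return "benchmark-only"    # profiles 3 and 6 (hypothetical label)
    return "unusable"              # hypothetical label


if __name__ == "__main__":
    # A profile that fails MemtestHCI but completes y-Cruncher Pi 1B.
    print(stability_tier(StabilityResult(False, True, True)))
```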


    [Attached image: Score Spread.jpg]


    So, some results:

    Profile 1 – 24/7 stable at DDR4-4400
    I got 4400 CL17, but couldn’t get CL16; it wouldn’t POST even with elevated voltage. CL17 beats the CL19 of the XMP profile for this kit, so I didn’t try XMP at all. DRAM voltage proved to be a controlling factor. With any more than 1.45 volts, the Samsung B-die on those G.Skill sticks threw bit errors that were caught by MemtestHCI. With any less than 1.45 volts, some of the timings would need to be relaxed.

    Profile 2 – y-Cruncher semi-stable at 4400
    This profile was sitting around in the M10A BIOS from previous benchmarking. I revisited tuning only a little. The only change was a lower DRAM voltage, which improved stability. Compared to #1, this profile shows a relatively big step improvement in bandwidth and some improvement in latency as measured by AIDA64. The benchmark scores improved proportionally. You can see why I’d rather have this profile on the testbench than #1. Even so, note that the DRAM voltage is much higher than in profile 1. That’s the trick when tuning the bandwidth/stability tradeoff toward increased bandwidth.

    Profile 3 – not very stable, but it works sometimes at 4400
    This was also sitting around in BIOS and didn’t take much work, again mostly with DRAM voltage. The improvements are rather small since most of the secondary and tertiary settings are nearly the same as in profile 2. Nonetheless, benchmark scores are improved. This profile has the DRAM voltage set at the lowest point where all the benchmarks except y-Cruncher 1B will work. At that voltage, much to my surprise, y-Cruncher 1B runs about one time in four tries. That score gets an asterisk because it only happens sometimes.
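
    For a feel of what "one time in four" means on the testbench, here is a rough sketch of the chance of getting at least one completed run in a given number of tries. It assumes every try is independent with the same 25% pass rate, which is a simplification on my part (temperature drift alone could break it). The function name is made up for illustration.

```python
def chance_of_at_least_one_pass(pass_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent runs completes."""
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    # Complement of "every single attempt fails".
    return 1.0 - (1.0 - pass_rate) ** attempts


if __name__ == "__main__":
    for tries in (1, 4, 8, 12):
        chance = chance_of_at_least_one_pass(0.25, tries)
        print(f"{tries:2d} tries: {100 * chance:.0f}% chance of at least one completed run")
```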

    Lesson learned: Increase DRAM voltage to get tighter settings, but lower voltage is necessary for stability.

    Profile 4 – 24/7 Stable at DDR4-4133
    I’ll compare across the DDR rates at the same stability level. This profile has CL16. That’s faster than the CL17 in profile 1, right? Not so much. 4400 CL17 and 4133 CL16 are virtually identical when expressed in nanoseconds, and the AIDA64 latency measurements are also identical. AIDA64 does measure quite a difference in bandwidth due to the higher transfer rate at 4400. Oh, wait. The y-Cruncher and Geekbench 3 scores are better at 4133 in spite of 3-5% less bandwidth. All those tighter secondary and tertiary settings in profile 4 count for something. Perhaps AIDA64 doesn’t exercise them when measuring bandwidth. The difference in SuperPi time does show some effect of bandwidth.

    Profile 5 – y-Cruncher semi-stable at 4133
    Giving up 24/7 stability allows for some serious tightening of timings, more than at 4400. This one works at CL15, which is the same number of nanoseconds as 4400 CL16. I think it is the CPU’s memory controller that copes better with a tight CL at 4133. RCD/RP at 16 would POST, but would not run with the stated stability.
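
    The nanosecond comparisons in profiles 4 and 5 come from a simple conversion: the memory clock runs at half the DDR data rate, so one clock cycle lasts 2000 / (data rate in MT/s) nanoseconds, and CAS latency in nanoseconds is CL times that. A minimal sketch, with a function name made up for illustration:

```python
def cas_latency_ns(data_rate_mts: float, cl: int) -> float:
    """CAS latency in nanoseconds for a DDR data rate (MT/s) and a CL count.

    The memory clock is half the data rate, so one cycle is 2000 / data_rate ns.
    """
    return cl * 2000.0 / data_rate_mts


if __name__ == "__main__":
    for rate, cl in [(4400, 17), (4133, 16), (4400, 16), (4133, 15)]:
        print(f"DDR4-{rate} CL{cl}: {cas_latency_ns(rate, cl):.2f} ns")
    # DDR4-4400 CL17: 7.73 ns
    # DDR4-4133 CL16: 7.74 ns
    # DDR4-4400 CL16: 7.27 ns
    # DDR4-4133 CL15: 7.26 ns
```

    This covers only the CAS part of the delay. The latency AIDA64 reports is much larger because it includes everything else in the round trip.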

    We’re closer now to answering the basic questions: Profile 5 has equal or better benchmark scores than profile 2.

    Profile 6 – not stable, but useful for many benchmarks.
    There isn’t any DRAM voltage where profile 6 can run y-Cruncher 1B, so it can’t be called semi-stable. Read bandwidth is improved a lot over profile 5 by the two-step decrease in RCD/RP. Profile 6’s claim to fame is its latency, the best of the bunch, which shows up in latency-sensitive scores like SuperPi.

    The conclusion: Raja is right. Comparing 4400 and 4133 DRAM speeds at equal levels of stability, 4133 is the sweet spot.

    The clock rate, CAS latency, measured bandwidth, measured latency etc. are nice but don’t determine or reveal DRAM performance by themselves. On modern multithreaded benchmarks, including the ‘everyday’ workloads of Geekbench, the 24/7 stable 4133 profile beats the 4400. That isn’t really a surprise to me. I had thought the secondary and tertiary settings needed for stability at 4400 might even be worse than they are.

    A major difference is the ability of the CPU-DRAM combination to reach a lower CL at 4133, both in clock counts and in nanoseconds.

    The surprise comes with the less stable benchmarking profiles. The 4133 profiles can be tightened so much more that they have a considerable performance advantage in many benchmarks. That advantage doesn’t hold for all benchmarks, though. I’ll have to try profiles 2 and 5, or 3 and 6, on each benchmark before I run for submission to HWBOT.

    The 4133 profiles have another advantage. They are easier to tune thanks to Raja’s pre-packaged profile in the BIOS of recent ROG motherboards. That 4133, 1.40 volt profile ran MemtestHCI stable and made a great starting point for tightening. I recommend it to those who want to run a well-tuned DRAM profile, but aren’t ready to try manual tuning.

  2. #2
    Menthol (ROG Guru: Gold Belt)
    Join Date
    Jan 2012
    Reputation
    241
    Posts
    4,547

    Jab, thanks very much for sharing your test results

  3. #3
    Raja@ASUS (Tech Marketing Manager HQ)
    Join Date
    Apr 2011
    Reputation
    161
    Posts
    7,454

    Good post. From a 24-7 standpoint, it’s difficult to overcome the performance of 1T at DDR4-4133.

  4. #4
    ROG Junior Member
    Join Date
    Nov 2012
    Reputation
    10
    Posts
    4

    Just wanted to say thank you Jab383 for your hard work and providing your findings.

    I've had a hell of a time trying to dial in my G.Skill 4133 4 x 8 kit until recently, and now I can look towards further performance enhancements using this thread.
