    MCE explanations and others

    MCE

    I’m seeing quite a lot of misunderstanding about how MCE works, so I’m partly writing this to address it.
    It is not the taboo it has been made out to be. There are three options for it: Auto, Enabled and Disabled.

    Enabled merely maxes out the power and current limits so that users don’t have to do this manually.

    Disabled sets these limits to Intel’s defaults. Even when you customize ratios, these limits stay in place unless you adjust them manually.

    Auto means that the board is free to decide what limits are reasonable, competitive, reliable and logical. Factors such as thermals, performance, segment, competitors’ out-of-box performance and stability are taken into account. ‘Logical’ means that when you customize a ratio, all limits are raised to the max, on the assumption that you want to run that frequency and not be clipped by power.
    It is therefore totally redundant to disable MCE and then max out the power and current limits, since enabling MCE does exactly the same thing and no more. Really, just leave MCE at Auto if you plan on overclocking; it does you no harm.
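
    The three settings above can be summed up as a small decision table. This is only an illustrative sketch of the description in this post: the names `MCE`, `limit_policy` and the returned labels are made up for the example and do not correspond to any BIOS or firmware interface.

```python
from enum import Enum


class MCE(Enum):
    AUTO = "auto"
    ENABLED = "enabled"
    DISABLED = "disabled"


def limit_policy(mode: MCE, ratio_customized: bool) -> str:
    """Return a label for how power/current limits end up, per the post.

    The labels are hypothetical names used only for this sketch.
    """
    if mode is MCE.ENABLED:
        return "maxed"            # limits maxed out for the user
    if mode is MCE.DISABLED:
        return "intel_default"    # stays at Intel defaults unless changed by hand
    # Auto: the board decides, but a customized ratio raises everything to max
    return "maxed" if ratio_customized else "board_chosen"


for mode in MCE:
    for customized in (False, True):
        print(mode.value, customized, limit_policy(mode, customized))
```

    Read this way, Auto with a customized ratio and Enabled land on the same outcome, which is why disabling MCE and then maxing the limits by hand is redundant.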

    TVB

    As for the current emphasis by review sites on totally stock performance of the i9s: all the attention is on TDP, but that’s just a gnat compared to the camel swallowed. No site has actually talked about or examined the latest feature of the i9, Thermal Velocity Boost (TVB). Intel enables it by default, but I see that only Asus boards enable it at defaults. The other boards I tested have it disabled even at defaults.

    What it does is reduce the voltage guardband depending on core temperature. Traditionally, the voltage requested by the processor is always based on the worst-case scenario, TJMAX, meaning the voltage the processor thinks it needs for that frequency at 100°C. It is well known that the cooler the chip runs, the less voltage it needs, so TVB opportunistically reduces power and temperature. The behavior is quite linear, and I observed the following on several samples.

    TVB takes effect from 40x to 50x on the 9900K, 40x to 49x on the 9700K and 40x to 47x on the 9600K; put simply, from 40x up to the single-core boost ratio. The V/temp curve runs from 0°C to 100°C. For example, at 50x there is a 150 mV delta between 100°C and 0°C, meaning that for every 1°C drop below 100°C the requested VID drops by 1.5 mV. The reduction is smaller at 49x; the lower the ratio, the smaller the reduction, and below 40x you get no reduction. This is good for most people running stock. You can try this yourself: note the VID at idle, then unplug your water pump, let the core temperature rise slowly, write down the temperature/VID pairs, and you will see what I’m talking about.
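
    As a rough sketch of that arithmetic, assuming the curve is exactly linear and using the 150 mV figure quoted above for 50x (the function name and its parameters are made up for the example, and other ratios would need their own measured delta):

```python
def tvb_vid_reduction_mv(core_temp_c, delta_mv_0_to_100=150.0, tjmax_c=100.0):
    """Estimated VID reduction in mV relative to the request at 100 C.

    Assumes a perfectly linear V/temp curve between 0 C and 100 C.
    The 150 mV default is the 50x example from the post.
    """
    # Clamp to the 0..100 C range that the curve is described over.
    temp = max(0.0, min(tjmax_c, core_temp_c))
    slope_mv_per_c = delta_mv_0_to_100 / tjmax_c  # 1.5 mV per degree at 50x
    return (tjmax_c - temp) * slope_mv_per_c


for temp in (100, 80, 60, 30, 0):
    print(f"{temp:>3} C: about {tvb_vid_reduction_mv(temp):.1f} mV below the 100 C VID")
```

    Comparing your own logged temperature/VID pairs against this line is a quick way to check how linear the behavior is on your sample.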

    During OC, if you run adaptive mode voltage with this mechanism active, you need to change how you think about the ‘target adaptive voltage’: assume it is the voltage you get at 100°C, apply the reduction down to your lowest (usually ambient) temperature, and gauge from there what value needs to be set. So if you set 1.35 V, for example, you may get around 1.25 V when idling at 30°C. This confuses many people, so we disable TVB once you customize a ratio. That is not to say you cannot exploit this mechanism during OC, but you really need to find your idle Vmin (lowest stable voltage) first. The option is under CPU internal power management in the BIOS, and you can force it to Enabled during OC.
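
    Here is a minimal sketch of that change of perspective, again assuming a linear curve and the 150 mV (0°C to 100°C) delta observed at 50x. The helper names are hypothetical, and the real delta depends on the ratio and the sample:

```python
def vid_at_temp(target_v, temp_c, delta_v=0.150, tjmax_c=100.0):
    """Voltage you would expect at temp_c if target_v is what you get at 100 C."""
    temp = max(0.0, min(tjmax_c, temp_c))
    return target_v - delta_v * (tjmax_c - temp) / tjmax_c


def target_for_idle_vmin(idle_vmin_v, idle_temp_c, delta_v=0.150, tjmax_c=100.0):
    """Adaptive target to set so that the idle voltage does not fall below idle Vmin."""
    temp = max(0.0, min(tjmax_c, idle_temp_c))
    return idle_vmin_v + delta_v * (tjmax_c - temp) / tjmax_c


# A 1.35 V target at a 30 C idle works out to 1.245 V, i.e. the "maybe 1.25 V" above.
print(f"{vid_at_temp(1.35, 30):.3f} V")
# Working backwards: an idle Vmin of 1.245 V at 30 C needs a 1.35 V target.
print(f"{target_for_idle_vmin(1.245, 30):.3f} V")
```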
    For those who want to check or try this on other boards, simply download RW-Everything (http://rweverything.com/) and add CPU MSR 0x150.

    Access this register and set bit 63 to 1 and bits [39:32] to 18h:

    https://ibb.co/gUyvUf

    Bit 3 shows you if TVB is enabled or disabled (0=disabled). If TVB is disabled, simply flip the bit and use command 19 to write.

    https://ibb.co/jCEDFL

    https://ibb.co/muKDFL
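
    For anyone who finds the bit positions easier to follow as arithmetic, here is a small sketch that only builds and decodes the 64-bit values; it does not touch the hardware. It rests on a few assumptions that go beyond what is stated above: that ‘command 19’ means 19h (like 18h), that the data field sits in the low 32 bits, and that ‘flip the bit’ means setting bit 3. The function names are made up for the example.

```python
RUN_BIT = 1 << 63          # bit 63 set to 1, as described above
CMD_READ = 0x18            # value placed in bits [39:32] to read
CMD_WRITE = 0x19           # assumption: "command 19" is hexadecimal
TVB_BIT = 1 << 3           # bit 3: 0 = TVB disabled


def mailbox_value(command, data=0):
    """Compose the 64-bit value: bit 63, command in [39:32], data in [31:0]."""
    return RUN_BIT | ((command & 0xFF) << 32) | (data & 0xFFFFFFFF)


def tvb_enabled(readback):
    """True if bit 3 of the value read back is set."""
    return bool(readback & TVB_BIT)


def write_value_with_tvb_on(readback):
    """Keep the low 32 data bits, set bit 3, and switch to the write command."""
    data = (readback & 0xFFFFFFFF) | TVB_BIT
    return mailbox_value(CMD_WRITE, data)


print(hex(mailbox_value(CMD_READ)))       # 0x8000001800000000
print(tvb_enabled(0x0))                   # False
print(hex(write_value_with_tvb_on(0x0)))  # 0x8000001900000008
```

    Double-check the values against what the tool shows on your own board before writing anything back.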


    Then you can see what the default stock behavior is really like. This truly affects temperature, power consumption, boost frequencies when TDP is at default, and so on, so those who want to dig deeper into ‘stock performance’ really need to get this right.

    The other thing that affects ‘stock performance’ is the AC/DC loadline programmed into the processor. Boards should let the CPU know the actual loadline they are currently set to by writing the correct value. This doesn’t mean the board has to be honest about it, and with the generous guardband Intel is used to providing (perhaps not as generous any more, since they need to factor in stability after 10 years of heavy use, for example), it is not uncommon for boards to lie to the processor to get it to undervolt. You cannot really tell how much a board has lied to the processor, but at the same frequency and load, just by probing the inductor on different boards with a multimeter, you can see that more than one board is lying to the processor. Obviously the TVB setting should be the same across boards during the test, or else you get very skewed results, as explained above.

    Finally, VRM temp should not be the only factor when evaluating a VRM, much less a whole board. For OC, my opinion is that transient response is very important. Contrary to popular belief, you do not need expensive equipment to test transient response. You can use Cache OC or AVX offset to test this.

    If you have played with cache OC, you will have seen that it is very intolerant of any undershoot: you hard-lock or BSOD straight away. You can even test it at default. Since cache shares the same rail as core, set the core ratio to something really low like 40x. Set min and max cache ratio to 43x and set a manual voltage like 1.15 V. Run a heavy load like Prime95 non-AVX. Slowly reduce the voltage on the fly, 5 mV at a time. You will find the Vmin this way. Once you have found the Vmin under continuous load, stop Prime95. If it doesn’t hang, run it again, going back and forth between running and stopping. Even try booting straight from the BIOS at that Vmin. You will see that this Vmin needs a guardband for transient load changes, meaning you will need 5 mV or more on top. You will observe that bigger guardbands are needed at higher cache ratios. Obviously, the better the transient response, the smaller the guardband required.
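
    The step-down search itself is simple enough to write as a loop. This is only a sketch of the bookkeeping: in practice a failure is a hang or BSOD, so you do the steps by hand and note the last good value. `is_stable` is a hypothetical stand-in for "the load survived at this voltage", and the thresholds below are invented numbers, not measurements.

```python
def find_vmin_mv(start_mv, is_stable, step_mv=5, floor_mv=800):
    """Step down from start_mv in step_mv decrements.

    Returns the lowest voltage that still passed, or None if even the
    starting voltage failed. is_stable is a caller-supplied check.
    """
    last_good = None
    voltage = start_mv
    while voltage >= floor_mv and is_stable(voltage):
        last_good = voltage
        voltage -= step_mv
    return last_good


# Simulated chip: survives continuous load down to 1085 mV, but needs
# 1100 mV to survive repeatedly starting and stopping the load.
vmin_continuous = find_vmin_mv(1150, lambda mv: mv >= 1085)
vmin_start_stop = find_vmin_mv(1150, lambda mv: mv >= 1100)

print(vmin_continuous, vmin_start_stop)
print("transient guardband:", vmin_start_stop - vmin_continuous, "mV")
```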

    You can also observe transient response through AVX offset, or ratio change mechanisms in general. First, find the Vmin under a continuous heavy load like Prime95 26.6 non-AVX at, say, a 47x CPU ratio, using a manual mode voltage with AVX offset at 0.
    Next, set AVX offset to any value, such as 1 or 2, and run the same frequency and load at its Vmin. It will not last long.

    AVX offset and other ratio change mechanisms have always had this issue, whereby a bigger voltage guardband is needed.
    Here’s why: the ratio change takes place by putting the core PLLs to sleep and then waking them up at the new PLL frequency.
    The transient is very bad and violent when you run high loads, because the load goes from really high to almost nothing and back to high very quickly.

    Now, you may think you did not even run AVX. But a lot of background software, such as the .NET Framework, may run a few AVX instructions.
    Sometimes you can see the AVX offset kick in when you are not deliberately running AVX; it is usually very fast and you only see it in small pockets.
    The ratio therefore changes quickly, and Vmin goes up because the guardband requirement increases.

    The way to mitigate this is to use a steep LLC and higher vid. The transient will be better.

    You can trigger this guardband requirement by doing other things that change the ratio. For example, while running Prime95, keep lowering the short duration power limit and raising it again with XTU.
    The ratio will keep changing, and the system will eventually hang if your guardband is only just enough.
    Or just keep changing the ratio up and down.

    So use AVX offset with the extra guardband in mind. This is entirely the behavior of Intel’s processor. Again, you can obviously gauge the ‘responsiveness’ of a board by measuring the guardband it needs. For example, you can logically conclude that a board that requires a 150 mV guardband is less ‘agile’ than a board that requires 80 mV.
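
    Put as a tiny calculation, assuming you have measured both Vmin values per board as described above. The board labels and the Vmin figures are placeholders chosen to reproduce the 150 mV and 80 mV example, not real measurements:

```python
def guardband_mv(vmin_continuous_mv, vmin_with_ratio_changes_mv):
    """Extra voltage needed once ratio changes (e.g. AVX offset) are in play."""
    return vmin_with_ratio_changes_mv - vmin_continuous_mv


def rank_by_agility(measurements):
    """Sort boards from smallest to largest guardband (most agile first).

    measurements maps a board label to (vmin_continuous_mv, vmin_with_ratio_changes_mv).
    """
    guardbands = {name: guardband_mv(*pair) for name, pair in measurements.items()}
    return sorted(guardbands.items(), key=lambda item: item[1])


# Placeholder numbers only: same continuous Vmin, different guardband needs.
boards = {
    "board_a": (1200, 1350),  # needs 150 mV
    "board_b": (1200, 1280),  # needs 80 mV
}
for name, gb in rank_by_agility(boards):
    print(f"{name}: {gb} mV guardband")
```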
    Attached Thumbnails: 1a.jpg, 1b.jpg, 1c.jpg

    Last edited by Shamino; 11-15-2018 at 02:55 AM.
