Overclocking Tips - Part Two

Raja
Level 13
Overclocking Tips Part Two


Part one of this guide was posted over in the Rampage IV section of the forum a year ago. The central focus of that guide was to increase awareness of what goes on at the electrical level as we overclock a system and run out of headroom (using DRAM as an example). I had planned on adding more to it eventually – it makes sense to do so now. As the subject matter is related to and reliant on part one, we’ll copy over some of the key sections, editing them to accommodate new information, and then add a new section below. The new “module” covers impedance and how it can affect POST and general system stability. Enjoy!


A few Overclocking Technicalities Analogized


Why does a system become unstable when it is overclocked? There are numerous reasons, actually – more than one could cover in a single article. Many require an electrical engineering background to both write and understand. Electrical engineers we are not… Well, most of us (including me) are not anyway, so we’re going to try and keep things simple.

Fundamentally, the role of a processor is to calculate, write and read data. At the core level, this data is represented and moved around the system as 1’s and 0’s (binary patterns). Let’s look at a crude visual representation of how data is represented at the electrical level:

[Diagram: data signal alternating between high (VOH) and low (VOL) levels to transmit the pattern 101010]




The “wavy” line is the signal alternating between a high and low voltage to represent 1 and 0. In this brief example the data pattern being transmitted from the memory bus to the processor is 101010.

VOH (voltage output high) is the high voltage output level of the transmitter that represents a logic 1, while VOL is the low output voltage (voltage output low) representing a logic 0.

VREF is the reference voltage. The reference is typically set at the midpoint between VOL and VOH.

VDDQ (not shown, and known as DRAM Voltage on motherboards) is the voltage supply for VOH, VOL and the voltage from which VREF is derived.

Most signal stages have three states: VOH, VOL and “off” – these are known as tri-state transceivers. Typically, the “off” state sits at a level somewhat lower than VOL but above ground potential (or centered). The off state is required to prevent inadvertent data transmission. This standing voltage (bias) in the “off” state is held by a compensation network, which keeps the line in a steady state when it is not transmitting. Compensation is usually in the form of a resistor to ground, but can be something more elaborate if required. The reasons a certain level of bias is present, and why the line is not at ground potential in the “off” state, fall outside the scope of this article.

For this example, let us assume VOH is around 80% of VDDQ, VOL is 20% of VDDQ, while VREF is 50% of VDDQ. The signal swings between VOL and VOH to represent data as a 1 or 0, while a data strobe compares the signal against VREF. If the voltage is higher than VREF it is interpreted as a logic 1; if it is below VREF it is interpreted as a logic 0. Needless to say, the process of transferring the data and interpreting it accurately requires that the transmitter and the receiver device be in close timing sync. If there is a lack of synchronization between the transmitted signal and the strobe, the data could be read erroneously.
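
To make the sampling idea concrete, here is a minimal sketch in Python of a receiver deciding 1 or 0 by comparing the line voltage against VREF at each strobe instant. The VDDQ figure and the sampled voltages are assumed values for illustration only, reusing the 80%/20%/50% levels from the example above.

```python
# A minimal sketch of the sampling described above: the receiver compares the
# data-line voltage against VREF at each strobe instant to decide 1 or 0.
# VDDQ and the sampled voltages are assumed values for illustration only.

VDDQ = 1.5                 # bus supply voltage (assumed)
VOH  = 0.8 * VDDQ          # logic-1 output level (~80% of VDDQ in our example)
VOL  = 0.2 * VDDQ          # logic-0 output level (~20% of VDDQ)
VREF = 0.5 * VDDQ          # reference voltage at the midpoint

# Line voltage seen at six consecutive strobe instants. A well-aligned strobe
# samples near the middle of each bit, so the values sit close to VOH or VOL.
sampled_voltages = [1.18, 0.32, 1.21, 0.29, 1.16, 0.33]

# Anything above VREF is read as a 1, anything below as a 0.
bits = [1 if v > VREF else 0 for v in sampled_voltages]
print(bits)                # -> [1, 0, 1, 0, 1, 0], the 101010 pattern above
```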

In an ideal world, the signal waveform would be perfectly symmetrical as it transitions between high and low states, never crossing above VOH or below VOL. The keen-eyed among you will notice in the diagram above that the signal varies slightly from one transition to the next. That’s mostly because I’m crap at drawing, but in this case, fortunately, the rendition fits! The waveforms are non-symmetrical and have different levels of excursion past VOH or VOL (overshoot and undershoot). There are various reasons why these issues occur: power supply fluctuation, jitter, impedance issues and noise, just to name a few. We won’t go into all of the factors leading to these problems as many fall outside the scope of this article. However, we can break down what happens as a result of them by using a real-world analogy in a context we can relate to.

Raja
Level 13
Analogy Number One: Data Sampling and Conveyer Belts…



One could use the analogy of a conveyer belt and a strobe light to visualise the process of sampling data. Boxes are placed on a conveyer belt and a strobe light is used to highlight a barcode on the box to take a recording of the number printed on it. As the box moves past the flashing strobe light, the light turns on to look at the barcode. If the timing between the strobe and the box is not in sync, the strobe will flash to take a sample when the box is not in the ideal position – leading to a misread.

Suppose we want to speed up production (overclocked production); we increase the speed of the belt and the speed of the machine that places the boxes on the belt to do so. However, we soon hit a few snags. At high speed, the machine that places boxes onto the belt may not place them perfectly square, or there may be timing issues where the gap between the boxes varies. This eats into the available sampling margin for the strobe and there comes a point where the barcode on a box cannot be read accurately.

In effect, this is very similar to what happens as we overclock a system – waveform integrity starts to suffer and we lose margin to sample or transfer data. We can counter some of this by increasing or tuning voltages, altering memory timings and improving system cooling (to lower the temperature of critical areas).

Voltage changes to various portions of a processor can have different effects. Take CPU Vcore, for example: we know that it has to be increased as we raise the frequency of the processor cores to any appreciable degree. We are asking the processor to “switch” faster, and voltage provides the potential for that to happen. The electrical process behind this is a lot more complex than our statement, but in the real world the fact holds that CPU core overclocking is primarily reliant on increased voltage and, to various degrees (excuse the pun), on low processor temperature.

We can also apply the same rule to the processor’s memory controller or memory modules – although not quite to the same extent. Why? Well, the role of what voltage does to portions of the memory sub-systems can be more delicate; there are instances in which more voltage is not the answer…

Raja
Level 13
Overshoot, Ringback and Undershoot



Let’s roll back to our earlier “data” image:


[Diagram: the earlier data waveform, repeated for reference]



If one were to look at the waveform at increased resolution on a scope, it is likely that the portions of the waveform showing overshoot would look something like this:


[Scope captures: enlarged view showing VOH (logic 1) overshoot and ringback]


In signalling language, the term overshoot refers to a signal excursion beyond a defined level. In our images the defined level for a logic 1 (high level) is VOH. The images above show a signal that breaches VOH and then recovers, oscillating slightly (ringback), before returning to a logic 0. Similarly, undershoot refers to a signal that falls below an ideal level.

Overshoot and undershoot occur to a greater degree when the device responsible for generating the signals (transistor stages) is run outside comfortable operating limits. Transistors have an ideal voltage region in which they operate in a predictable manner; when we increase voltage and ask them to switch faster, we push them outside that region which can lead to unpredictable behavior. This is the reason why processors have stipulated supported operating speeds. The processor vendor spends a great deal of time evaluating the guaranteed and safe operating limits of their CPUs at frequencies, voltages and power consumption levels that ensure stability and acceptable lifespan.

The horizontal axis in these illustrations represents time. If the signal overshoots or undershoots the expected level (VOH or VOL), the time it takes to transition between high and low states is affected and this eats into the available timing budget because the sampling signal (strobe) may not be experiencing exactly the same problem.

If the time difference between the DQ signal and the strobe is too wide, a signal could be read as a logic 1 instead of a 0, or we may simply run out of timing budget for any kind of data transfer. The higher the operating frequency, the greater the chance of such issues. As a result, we have a smaller and smaller timing budget to play with as we overclock a system, until we eventually run out of overhead and things become unstable.
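
Here is a hedged illustration of that timing-margin point. The toy waveform, the skew values and the threshold below are all invented for the example; the only thing it is meant to show is how shifting the strobe relative to the data eventually lands samples near transitions, where bits get misread.

```python
# An illustrative model of losing timing margin: a crude 1-0-1-0 waveform with
# finite rise/fall times, sampled by a strobe shifted by a 'skew' amount.
# All names and numbers here are invented for the example.

import math

VREF = 0.75                # decision threshold (assumed)
BIT_TIME = 1.0             # one bit period, arbitrary time units

def data_line(t):
    """Toy 1-0-1-0 waveform: a high half-cycle, then held low."""
    phase = (t % (2 * BIT_TIME)) / (2 * BIT_TIME)
    return 0.3 + 0.9 * max(0.0, math.sin(2 * math.pi * phase))

def sample(skew):
    """Sample the centre of each bit, shifted by 'skew', and decide 1 or 0."""
    bits = []
    for n in range(6):
        t = n * BIT_TIME + 0.5 * BIT_TIME + skew
        bits.append(1 if data_line(t) > VREF else 0)
    return bits

print(sample(skew=0.0))    # aligned strobe -> [1, 0, 1, 0, 1, 0]
print(sample(skew=0.4))    # skewed strobe -> samples land near transitions, the ones are misread
```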

Raja
Level 13
Analogy Number Two: One Man, One Ball and a Wall….



There’s no simple way to explain the intricacies of signalling, as the scope of this subject is HUGE. In fact, there’s no single real-world analogy that can be used either, as there are too many factors that contribute to or detract from the integrity of a signal. So, we’ll aim to paint a small part of the picture with limited accuracy – probably enough to raise more questions than it answers, so please keep an open mind, as what follows is by no means absolute. You certainly don’t want to walk into a university lecture for EE students, waving your finger around claiming to understand the depths of signalling thanks to our wonderful analogies 😄

Ball game: One person throws a ball over a defined distance at a target painted on a wall which then bounces off the wall and must be caught in order to throw it again. As we’re relating this to overclocking, there are some caveats to the game:


i) In order to meet our timing budget, 10 throws at the target must be completed in 10 seconds. We can relate these 10 throws to frequency – obviously bearing in mind that computer systems operate at many millions of “throws” per second.


ii) Voltage supplies the potential to throw the ball. Higher voltage levels equate to a faster throw.


iii) If throwing voltage is excessive it affects throw accuracy and the ability to catch the ball when it rebounds.


iv) If throwing voltage is not sufficient it may not be possible to make enough throws in the designated time. It may also not be sufficient to reach the target accurately.


v) Failure to hit the target or close to the target results in the ball returning to the thrower in a non-ideal path, negatively affecting time between each cycle.



If we are fortunate enough to have an imagination that can visualise this game, we’ll understand that there are quite a few constraints and a growing potential for issues as we increase the number of throws that need to be made within a given timeframe. Overclocking any bus on a system is riddled with issues similar to this and many more.

In this instance, we’ve just begun to touch on the relationship between signal integrity and voltage. In the real world, if we apply too much voltage to signal stages, we risk signal integrity and possible damage to the transistors that swing the signal. However, we also need to supply enough voltage to ensure the signal slew rate (the rate at which the signal voltage changes) is sufficient to meet timing requirements. In other words, too much voltage can be harmful and induce instability. Similarly, a voltage level that is too low can also incite instability, although there’s less chance of processor damage if we stick closer to “stock” voltage levels.
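
As a back-of-the-envelope illustration of the slew rate point (every figure below is an assumed example value, not a specification): the faster the bus runs, the less time the driver has to swing the line between VOL and VOH, so the required slew rate climbs quickly.

```python
# Back-of-the-envelope slew rate requirement. Every figure here is an assumed
# example value, not a specification.

data_rate_mtps = 2400                    # e.g. a DDR3-2400 style bus (assumed)
bit_time_ns = 1000.0 / data_rate_mtps    # one unit interval ≈ 0.417 ns

VDDQ = 1.5
swing_v = (0.8 - 0.2) * VDDQ             # VOL -> VOH swing from our earlier example = 0.9 V

# Allow the edge roughly a third of the bit time, keeping the rest as sampling margin.
edge_time_ns = bit_time_ns / 3

required_slew = swing_v / edge_time_ns   # volts per nanosecond
print(f"bit time ≈ {bit_time_ns:.3f} ns, required slew ≈ {required_slew:.1f} V/ns")
```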

The settling time budget defines the maximum deviation, and the time allowed, for the signal to recover from overshoot. If the signal breaches the settling time limit, we are effectively outside reliable sync with the strobe and the system becomes unstable.
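
To visualise the settling idea, here is a small sketch using an invented damped-oscillation model of ringback: after overshooting VOH the signal rings, and it must fall back inside a tolerance band around VOH before the settling budget expires. The waveform model and every constant are assumptions chosen purely for illustration.

```python
# A sketch of the settling idea using an invented damped-oscillation model of
# ringback. All constants and the waveform model are assumptions.

import math

VOH = 1.2                   # target high level (assumed)
TOLERANCE = 0.05            # must settle to within +/- 50 mV of VOH (assumed)
SETTLING_BUDGET_NS = 0.25   # time allowed for the ringback to die down (assumed)

def ringback(t_ns, overshoot=0.2, freq_ghz=4.0, damping=8.0):
    """Damped oscillation around VOH following the rising edge."""
    return VOH + overshoot * math.exp(-damping * t_ns) * math.cos(2 * math.pi * freq_ghz * t_ns)

# Step through the budget in 1 ps increments and note every sample that is
# still outside the tolerance band.
times = [i * 0.001 for i in range(int(SETTLING_BUDGET_NS * 1000) + 1)]
violations = [t for t in times if abs(ringback(t) - VOH) > TOLERANCE]

if violations and violations[-1] >= SETTLING_BUDGET_NS:
    print("still ringing at the end of the budget -> risk of a misread")
else:
    last = violations[-1] if violations else 0.0
    print(f"settled by ~{last:.3f} ns, inside the {SETTLING_BUDGET_NS} ns budget")
```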

Raja
Level 13
Impedance


The latest processor and DRAM architectures employ automated impedance training procedures to ensure a close match between driver and receiver impedance. The interconnect impedance specification (motherboard traces) between driver and receiver stages is provided by the platform chipset vendor. Depending upon the motherboard vendor and the motherboard model, the specification values supplied by the chipset vendor are sometimes used as a guideline only – rather than strictly adhered to. The motherboard vendor may find situations in which using a different line impedance or trace layout helps with overclocking margin and system performance. As platforms become more integrated, this is an area in which a motherboard vendor spends a significant amount of development time. At least, any good vendor should spend time on these areas if they are charging more money for an overclocking-centric end product.

Again, the automated process of impedance matching is quite robust at stock and mild overclock frequencies. It’s when we get close to component and platform limits that we start experiencing drift and increased sensitivity to parameter adjustments.

A closely matched driver, line, and receiver impedance is important. When closely matched, energy can be transferred from the driver to the receiver in an efficient and predictable manner. As with resistance, impedance is measured in Ohms – the difference being that impedance is used when dealing with AC signaling, as capacitance and inductance need to be taken into account.
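
The standard way to quantify a mismatch is the reflection coefficient, (Zload - Zline) / (Zload + Zline): the fraction of the incident voltage wave bounced back at the boundary, which ultimately shows up as ringing. Here is a small sketch with example figures; the 34/40/60 Ohm values are purely illustrative.

```python
# A small sketch of why matching matters. When a travelling wave meets an
# impedance change, the standard reflection coefficient
#   gamma = (Z_load - Z_line) / (Z_load + Z_line)
# gives the fraction of the incident voltage reflected back toward the driver.
# The Ohm values below are example figures only.

def reflection_coefficient(z_line_ohms, z_load_ohms):
    """Fraction of the incident voltage wave reflected at the line/load boundary."""
    return (z_load_ohms - z_line_ohms) / (z_load_ohms + z_line_ohms)

z_line = 40.0   # characteristic impedance of the trace (example value)

for z_load in (40.0, 34.0, 60.0):
    gamma = reflection_coefficient(z_line, z_load)
    print(f"{z_line:.0f} Ohm line into {z_load:.0f} Ohm load -> {gamma * 100:+.1f}% reflected")
```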

If line impedance is not well matched, signal integrity will suffer – more overshoot, undershoot, and ringing effects manifest. By now, we should know that these things are bad for overclocking, as they eat into our all important timing budget:



[Diagrams: impedance mismatches give rise to signaling issues]



Due to the laws of physics, impedance is affected by temperature; the voltage swing of a driver may change with temperature, as does the line impedance of the traces, which are made of copper. To manage these changes, the system employs different types of POST training routines on cold power-on and warm reset to tune or re-tune driver impedance.
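
To get a feel for the temperature effect, copper’s resistance rises by roughly 0.39% per degree Celsius, so the line the system trained against on a cold boot is measurably different once the board has warmed up. A quick sketch with example numbers (the reference resistance and temperatures are assumptions):

```python
# Rough illustration of the temperature effect on a copper trace. Copper's
# resistance rises by roughly 0.39% per degree C. The reference resistance and
# temperatures below are example values only.

ALPHA_COPPER = 0.0039            # approximate temperature coefficient of copper, per °C

def resistance_at(r_ref_ohms, t_ref_c, t_c):
    """Resistance of a copper conductor at t_c, scaled from a reference point."""
    return r_ref_ohms * (1 + ALPHA_COPPER * (t_c - t_ref_c))

r_cold = 1.00                    # trace resistance at a 25 °C cold boot (example)
r_warm = resistance_at(r_cold, 25, 60)
print(f"25 °C: {r_cold:.3f} Ohm, 60 °C: {r_warm:.3f} Ohm "
      f"(+{(r_warm / r_cold - 1) * 100:.1f}%)")
```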

The success of this training process is reliant on a number of factors:

The quality of the interconnect: cheaper manufacturing processes, fewer PCB layers, and poor trace length matching all eat into overclocking margin sooner rather than later. Even if we make things as good as we can from an engineering standpoint, the platform itself may have limitations regarding flexibility for custom layouts and trace impedance values. It is possible, at times, to make things “too good”, because the memory controller, or even the platform microcode, does not have the requisite flexibility to cater for design enhancements.

During POST, the driver calibration routine will determine how many pull-up resistors are needed to match the required impedance levels of the DRAM ICs (drive strength). As with motherboards and chipsets, DRAM vendors also follow target specs for driver impedance. The usual range is between 34 and 40 Ohms, although there may be provision to support a slightly wider range. Remember, we are dealing with finite resources; the system does not have an infinite number of drivers to match impedance for every scenario, so the closest match has to be used. When a system is overclocked to its limit, the disparity between the temperature at the time calibration is performed and the temperature of the system once it is running applications in the operating system can be sufficient for the system to crash or fail during calibration (failed POST). This can also happen if system voltages are not sufficiently tuned for the applied overclock – the system isn't at absolute limits, it just hasn't been tuned well enough by the user to dial out instability.
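
Here is a deliberately simplified sketch of the “closest match from finite resources” point. The selectable settings list and the function are hypothetical, not the real training algorithm, but they show why a target that falls between the available driver settings always leaves some residual mismatch.

```python
# A deliberately simplified sketch of "closest match from finite resources".
# The selectable settings and target values are hypothetical; real calibration
# is far more involved, but the residual-mismatch idea is the same.

AVAILABLE_DRIVER_OHMS = [30, 34, 40, 48, 60]      # hypothetical selectable drive settings

def calibrate_drive_strength(target_ohms):
    """Pick the selectable impedance closest to the requested target."""
    return min(AVAILABLE_DRIVER_OHMS, key=lambda z: abs(z - target_ohms))

for target in (34, 37, 40):
    chosen = calibrate_drive_strength(target)
    print(f"target {target} Ohm -> driver set to {chosen} Ohm "
          f"(residual mismatch {abs(chosen - target)} Ohm)")
```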

There’s also some insight here into why certain DRAM IC types fall out of favor from one platform to the next. If the DRAM IC vendor chose to build their part to an impedance value or at voltages that fall at the edge of the design spec, it may not perform well on newer platforms, as the trend for Intel and AMD is always to reduce power consumption. These efficiency enhancements usually result in lower recommended/specified voltages, and the drivers within the processor will be optimized for performance around that level. That is not to say the memory kit will not work with the newer platform; it simply may not reach the same frequency or the same timing latencies as it did on the platform it was built for.

Back on topic: as we increase memory operating frequency and data rates, the ability of the output stages to drive signal levels sufficiently starts to suffer – and obviously, rail voltages can have a significant impact on this. Therein lie the reasons why we need to adjust voltages to facilitate overclocking or stability. Note we used the word “adjust”, and not “increase”. Increasing voltage works when there is sufficient overhead in the drivers to meet the required slew rate without suffering from overshoot and undershoot. In some cases, we may need to move the voltage to a different value rather than a higher value. This is more apparent when impedance matching is not “good”; in such cases we're likely to encounter stability issues when increasing or adjusting voltage levels.


This brings us back to using systematic and gradual overclocking methods: system voltages need to be tuned gradually. If voltage is increased to a high value without any form of evaluation, we may have pushed the system outside reliable signaling thresholds before we’ve truly evaluated what our parts can do (given more ideal operating parameters).

We often get requests here for overclocking guides when a new platform is released. When asked “why us/him/me?”, the reply is “you can tell me what the safe voltages are”, or “I don’t know how much voltage to use”. To be blunt, if this isn’t your first overclocking experience, taking such a stance is utter hogwash. Simply use a systematic and gradual approach, and that in itself will guide you. It’s how we formulate the guides ourselves; we do things gradually and work out where the platform shows signs of stability issues. Safe voltage levels for rails that are not related “directly” to signaling, such as Vcore, can be evaluated easily by finding out how well frequency scales in return for each small step in voltage. At the point where a disproportionately large jump in voltage is required, fall back to the earlier setting. Follow these simple guidelines and you’ll never be looking for someone to write a guide again.
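
Expressed as a loop, the gradual method might look something like the sketch below. The stability function and the step sizes, bump limit and voltage ceiling are hypothetical placeholders (you supply your own stress testing and limits); the point is purely the structure: small frequency steps, small voltage bumps, and falling back as soon as a step demands a disproportionate jump.

```python
# A hedged sketch of the gradual approach as a loop. 'is_stable' is a
# placeholder for your own stress testing; the step sizes, ceiling, and
# "disproportionate jump" threshold are invented example values.

def find_sweet_spot(is_stable, start_freq_mhz, start_vcore,
                    freq_step=100, v_step=0.01,
                    max_bump_per_step=0.03, v_ceiling=1.35):
    """Raise frequency in small steps, adding small voltage bumps as needed;
    fall back to the last good point when a step demands too large a jump."""
    freq, vcore = start_freq_mhz, start_vcore
    while True:
        next_freq, bump = freq + freq_step, 0.0
        # Add small increments until the new frequency passes, but abandon the
        # step if it needs more than max_bump_per_step or breaches the ceiling.
        while not is_stable(next_freq, vcore + bump):
            bump += v_step
            if bump > max_bump_per_step or vcore + bump > v_ceiling:
                return freq, vcore        # last known-good frequency/voltage
        freq, vcore = next_freq, vcore + bump

# Example run against a toy stability model (purely illustrative):
toy_model = lambda f, v: v >= 1.10 + (f - 4200) * 0.0002
print(find_sweet_spot(toy_model, start_freq_mhz=4200, start_vcore=1.10))
```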

That’s enough babble for now. It’s time to apply a real-world analogy to the effects of impedance...

Raja
Level 13
Analogy Number Three: The Theoretical Effects of Impedance on Signal Transfer



Until this point, our ball game signaling analogy assumed a limited number of obstacles. Things are about to change as we’re going to add the impact of impedance into the mix.


Previously, we assumed that the ball bounced back to the thrower immediately after each throw – we’re going to remove that assumption at this point. The ball is caught by the receiver and held until it is requested back, which is effectively what storing data in DRAM does. The incoming signal is read as either a high or low value (the ball's height when caught), and that value is stored until the memory controller issues a request to read it. Therefore, each data bit written to memory is a "new ball".


Let’s also assume that closely matched impedance between the output driver (the arm that throws the ball), the receiver (the arm that catches the ball at the target) and the transmission line (the motherboard traces or interconnect) implies that the wind speed is zero. That means the ball will not veer off path or slow down due to a sudden crosswind or headwind. The receiver is also loaded with the correct amount of tension to catch the ball without suffering from excessive settling or "bounceback". Things aren't this simple in the real world, as this is all bound by the laws of physics – but we’ll ignore the absolute reality of physics to keep things simple.


We also need to remove the human element from the ball game completely at this stage – the arms are better visualized as being mechanical:




  • DRAM voltage controls how fast and far the arm swings when making a throw.
  • If impedance is well matched, drive power to the arm is optimized to throw the ball with the proper amount of force – accounting for ball weight and external effects. Note, increasing drive strength does not increase the speed of the throw (slew rate). It merely loads the arm with the correct power to minimize the impact of ball weight and external forces that affect ball trajectory.
  • Impedance mismatch affects perceived ball weight and the impact of any cross or headwind.
  • Overshoot, undershoot and ringing result in a longer or shorter flight time, which has implications for our timing budget. In reality, the amount of undershoot, overshoot and ringing will vary with the data pattern being sent over the bus – which means that timing alignment cannot be perfect for all scenarios. Eventually we run into a situation where the system starts showing signs of conditional stability – stable in some applications (lighter loading), not all.


Note that impedance mismatch isn't the only factor that gives rise to overshoot, undershoot and ringing. It is just one of them – there are many others.

Now that we understand some of the implications, we should be able to piece together what happens to a system as we overclock it. The further we push, the more difficult it becomes to pass POST training routines, because numerous factors play a part in stability. The primary issue is that their impact can be random in nature - on an unstable system, overshoot, undershoot and ringing can vary on every clock cycle. As a result, it becomes impossible for the training mechanisms to find a best fit for all data patterns, rendering the system partially or totally unstable.

Keeping a firm grasp on context is also important - the issues discussed above should not serve as a deterrent to overclocking nor should they be cited as the cause to every DRAM stability issue. 🙂


Atars
Level 7
Keep going, Raja,
finally something interesting and serious for understanding overclocking!
Thanks