The signal paths in a processor are not all created equal: some have tighter timing than others. Early in the life of a new device, the vendor characterizes the silicon to find the slowest paths (the speed paths), adds an engineering margin, and bases the frequency and other specifications of the device on the result. Occasionally a speed path is missed in the initial analysis, and part specifications have to be adjusted after the fact (e.g. slightly higher voltage or reduced operating temperature) to guarantee proper device function at the specified parameters.
What is the purpose of the engineering margin? There are two major reasons. First, the silicon manufacturing process, which consists of numerous chemical and mechanical steps, has some variability. With feature sizes that can be measured in atom lengths, it is easy to be off by a few atom lengths in one direction or the other, so the timing of speed paths varies between individual parts of the same model. Second, semiconductor devices slow down as they age: wires may thin through electromigration, increasing resistance and slowing signal propagation, and transistor switching may slow down due to trapped charge. The hotter a part runs, the faster this aging occurs (see the Arrhenius equation). The engineering margin applied is therefore a function of anticipated manufacturing variability and the anticipated service life of the component, say five years for a processor.
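To give a feel for the temperature dependence: the Arrhenius model says the rate of the underlying failure mechanisms scales with exp(-Ea/kT). Below is a minimal Python sketch of the resulting acceleration factor; the activation energy of 0.7 eV is an assumed, illustrative value (real failure mechanisms each have their own, and vendors do not publish these numbers).

```python
import math

K_B = 8.617e-5  # Boltzmann constant, in eV/K

def aging_acceleration(t_ref_c, t_hot_c, e_a=0.7):
    """Arrhenius acceleration factor: how much faster aging proceeds
    at t_hot_c than at t_ref_c (both in degrees Celsius). e_a is an
    assumed activation energy in eV; real values are mechanism-specific."""
    t_ref = t_ref_c + 273.15  # convert to Kelvin
    t_hot = t_hot_c + 273.15
    return math.exp((e_a / K_B) * (1.0 / t_ref - 1.0 / t_hot))

# running a part at 85 C instead of 65 C, under these assumptions:
print(aging_acceleration(65, 85))  # ~3.8x faster aging
```

Under these assumed numbers, letting a part run 20 degrees hotter nearly quadruples the aging rate, which is why an overclock that also raises voltage and temperature eats through the margin that much faster.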
Overclocking eats into the frequency margin designed into the vendor specifications, and since it increases current draw and often temperature as well, it leads to faster aging of the semiconductor device. Because the speed paths are not publicly documented, any operation or sequence of operations may happen to exercise one and cause software to fail. The failure may be obvious (crash, blue screen, kernel panic) or very subtle, and may therefore go unnoticed for a long time.
Back when I was a poor student, I needed performance but could not afford top-of-the-line hardware, so I became an avid overclocker of CPUs. Of course I would “stress test” my overclocked parts to make sure everything worked properly despite running in excess of manufacturer specifications. Several years into this, I found strange deviations in a floating-point-intensive simulation code. It took me forever to track them down to an FSQRT instruction that occasionally delivered results with a few flipped bits. Further testing showed these failures to be operand-dependent, but clearly due to the overclocking. Ever since then, I have taken an extremely critical view of overclocking, whether by end users or in “factory overclocked” hardware. The aggravation of having an application fail in subtle ways that may go unnoticed for months is not worth it, IMHO. Other people take a different stance, observing that failures (both severe and subtle) are more likely to originate in software than in hardware.
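The kind of consistency check that eventually exposed my FSQRT problem is easy to sketch. IEEE-754 requires square root to be correctly rounded, and double precision carries enough extra bits that taking the square root of a float32 operand in float64 and rounding back reproduces the correctly rounded float32 result, so the two paths must match bit for bit on healthy hardware. The following Python/NumPy sketch is a hypothetical reconstruction, not my original test code:

```python
import numpy as np

def sqrt_selftest(n=1_000_000, seed=0):
    """Compare single-precision sqrt against a reference computed via
    double precision. Any mismatch points at flaky hardware (or a
    non-IEEE-compliant math library). Returns offending operands."""
    rng = np.random.default_rng(seed)
    # random positive float32 operands spanning a wide exponent range
    mant = rng.random(n, dtype=np.float32) + np.float32(1.0)
    scale = np.exp2(rng.integers(-60, 61, n).astype(np.float64))
    x = (mant * scale).astype(np.float32)
    hw  = np.sqrt(x)                                    # float32 path
    ref = np.sqrt(x.astype(np.float64)).astype(np.float32)
    return x[hw != ref]

bad = sqrt_selftest()
print("mismatches:", bad)  # empty on correctly functioning hardware
```

Because the failures I saw were operand-dependent and intermittent, a single pass proves little; a test like this needs to run over many operand sets, for a long time, at the intended overclock.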
To summarize: speed paths in processors are at minimum dependent on voltage, temperature, the age of the processor, instructions or instruction sequences, instruction operands, and noise from manufacturing tolerances. There is no surefire way for an end user to know when overclocking is completely safe, nor is it generally possible to incorporate knowledge about speed paths into a compiler’s code generation (e.g. tweaks to the manufacturing process over the lifetime of a chip can change them). In my experience, compiler code generation is oriented primarily towards performance under the “race-to-finish” model, as measured for example in clock cycles, and only secondarily towards reducing power consumption, a goal that was added over the past decade.