Stringent performance requirements magnify the performance degradation impact of Design-for-Testability (DfT) techniques. As more aggressive performance optimizations are being employed, resulting in high-performance designs with reduced logic depth, the impact of scan multiplexers is becoming even more magnified. In this work, we propose a scan cell transformation technique that transfers the scan multiplexer delay from the input of the flip-flop to its output, enabling the removal of the scan multiplexer delay off the critical paths. By inserting a few shadow flip-flops properly, the proposed transformation technique retains test development (test data, quality, etc.) and application (test time, power dissipation, etc.) intact, fully complying with the conventional design and test flow. Experimental results justify the efficacy of the proposed techniques in eliminating the performance penalty of scan quickly and cost-effectively, and thus in enhancing functional speed of integrated circuits.