Shakes Me Up: Inside Nehalem: Intel’s Future Processor and System

Sunday, October 14, 2012

Inside Nehalem: Intel’s Future Processor and System

http://www.realworldtech.com/nehalem/7/

L1D Cache?

Inclusive caches are forced by design to replicate data, which implies certain relationships between the sizes of the various levels of the cache. In the case of Nehalem, each core holds up to 64KB in its L1 caches (32KB instruction plus 32KB data) and 256KB in its L2 cache; any given line may or may not be present in both the L1 and the L2.

This means that 1-1.25MB of the 8MB L3 cache in Nehalem is filled with data that is also in other caches: the four 256KB L2 caches account for 1MB, and the four cores' 64KB of L1 add up to another 256KB when none of the L1 contents are also held in the L2. As a consequence, inclusive caches only really make sense where there is a fairly substantial size difference between the two levels. Nehalem has about an 8X difference between the sum of the four L2 caches and the L3, while Barcelona’s L3 cache is the same size as the total of its L2 caches.
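
The 1-1.25MB range and the 8X ratio follow directly from the sizes quoted above. The short C sketch below just redoes that arithmetic; the constant and function names are made up for illustration, and the only inputs are the figures from the text (four cores, 64KB of L1 and 256KB of L2 per core, 8MB of shared L3).

```c
#include <stdio.h>

/* Cache sizes in KB, taken from the figures quoted in the text. */
enum {
    CORES = 4,
    L1_KB = 64,   /* per core: L1 instruction + L1 data */
    L2_KB = 256,  /* per core */
    L3_KB = 8192  /* shared, inclusive */
};

/* Lower bound: every L1 line is also in that core's L2, so only the
 * L2 contents need a duplicate in the L3. */
unsigned min_duplicated_kb(void)
{
    return CORES * L2_KB;
}

/* Upper bound: no L1 line is also in the L2, so the L3 must hold a
 * copy of both the L1 and the L2 contents of every core. */
unsigned max_duplicated_kb(void)
{
    return CORES * (L1_KB + L2_KB);
}

/* Size ratio between the L3 and the sum of all L2 caches. */
unsigned l3_to_total_l2_ratio(void)
{
    return L3_KB / (CORES * L2_KB);
}

/* Print the three numbers discussed in the text. */
void print_inclusive_overhead(void)
{
    printf("duplicated in L3: %u-%u KB of %u KB\n",
           min_duplicated_kb(), max_duplicated_kb(), (unsigned)L3_KB);
    printf("L3 / total L2: %uX\n", l3_to_total_l2_ratio());
}
```

Calling print_inclusive_overhead() from a main function reports 1024-1280 KB duplicated and a ratio of 8X, matching the numbers in the paragraph above.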

Nehalem’s cache hierarchy has also been made more flexible by increasing support for unaligned accesses.

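As a result, an unaligned SSE load or store instruction that is used on data which is in fact aligned has the same latency as the aligned form of the instruction, so there is no particular reason to use the aligned SSE memory access instructions. Accesses that really do straddle a cache line boundary can still cost more.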
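
In practice this means code can use the unaligned load intrinsic everywhere and skip any alignment check. The sketch below assumes an x86 compiler that provides the SSE intrinsics header xmmintrin.h; the function name sum_floats_unaligned is hypothetical and the routine is only an illustration, not something taken from the article.

```c
#include <stddef.h>
#include <xmmintrin.h> /* SSE intrinsics: _mm_loadu_ps, _mm_add_ps, ... */

/* Sum n floats using only unaligned 16-byte loads (MOVUPS).
 * The pointer may have any alignment that is valid for float;
 * no 16-byte alignment check is needed before the vector loop. */
float sum_floats_unaligned(const float *data, size_t n)
{
    __m128 acc = _mm_setzero_ps();
    size_t i = 0;

    /* Process four floats per iteration with an unaligned load. */
    for (; i + 4 <= n; i += 4) {
        acc = _mm_add_ps(acc, _mm_loadu_ps(data + i));
    }

    /* Add the four partial sums held in the vector register. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float total = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    /* Handle the remaining 0-3 elements one at a time. */
    for (; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

The aligned intrinsic _mm_load_ps would fault if data were not 16-byte aligned, which is why older code often carried two loops or a peeling prologue. With the behaviour described above, the single unaligned loop pays no extra latency when the data happens to be aligned.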
