High Performance Memory Systems

Haldun Hadimioglu, David Kaeli, Jeffrey Kuskin, Ashwini Nanda, Josep Torrellas

Springer Science & Business Media, Oct 31, 2003 - 297 pages

The State of Memory Technology

Over the past decade there has been rapid growth in the speed of microprocessors. CPU speeds are approximately doubling every eighteen months, while main memory speed doubles about every ten years. The International Technology Roadmap for Semiconductors (ITRS) study suggests that memory will remain on its current growth path. The ITRS short- and long-term targets indicate continued scaling improvements at about the current rate by 2016. This translates to bit densities increasing at two times every two years until the introduction of 8 gigabit dynamic random access memory (DRAM) chips, after which densities will increase four times every five years. A similar growth pattern is forecast for other high-density chip areas and high-performance logic (e.g., microprocessors and application-specific integrated circuits (ASICs)). In the future, molecular devices, 64 gigabit DRAMs and 28 GHz clock signals are targeted. Although densities continue to grow, we still do not see significant advances that will improve memory speed. These trends have created a problem that has been labeled the Memory Wall or Memory Gap.
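As a rough illustration of the growth rates quoted in the description, the sketch below compounds the two doubling periods (eighteen months for CPUs, ten years for main memory) to show how quickly the gap widens. The function names and the assumption of smooth exponential growth are ours for illustration; they do not come from the book.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Speedup after `years`, assuming smooth doubling every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


def speed_gap(years: float,
              cpu_doubling_years: float = 1.5,    # "every eighteen months" (from the description)
              mem_doubling_years: float = 10.0    # "about every ten years" (from the description)
              ) -> float:
    """Ratio of CPU speedup to memory speedup after `years` (illustrative model only)."""
    return growth_factor(years, cpu_doubling_years) / growth_factor(years, mem_doubling_years)


if __name__ == "__main__":
    for years in (3, 6, 9, 12):
        print(f"after {years:2d} years: CPU x{growth_factor(years, 1.5):8.1f}, "
              f"memory x{growth_factor(years, 10.0):4.2f}, "
              f"gap x{speed_gap(years):8.1f}")
```

Under these assumed rates the relative gap itself doubles roughly every 1.8 years, which is the intuition behind the "Memory Wall" label.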



Contents

Introduction to High-Performance Memory Systems	5

1.2 Power-Aware, Reliable, and Reconfigurable Memory	6

1.3 Software-Based Memory Tuning	7

1.4 Architecture-Based Memory Tuning	9

1.5 Workload Considerations	11

Speculative Locks: Concurrent Execution of Critical Sections in Shared-Memory Multiprocessors	15

2.2 Speculative Locks	16

2.3 Evaluation	25

2.4 Conclusions	26

Acknowledgments	27

Dynamic Verification of Cache Coherence Protocols	29

3.2 Dynamic Verification of Cache Coherence	33

3.3 SMP Coherence Checker Correctness Coverage and Specificity	37

3.4 Coherence Checker Overhead	38

3.5 Related Work	42

3.6 Future Work	43

References	44

Timestamp-Based Selective Cache Allocation	47

4.2 Related Work	48

4.3 Evaluation Methodology	50

4.5 Selective Allocation	52

4.6 Experimental Results	57

4.7 Future Work	58

Power-Efficient Cache Coherence	67

5.2 Snoopy Coherence Protocols	68

5.3 Methodology	70

5.4 Directory Protocols	76

5.5 Simulation Results	79

5.6 Conclusion	81

References	82

Improving Power Efficiency with an Asymmetric Set-Associative Cache	83

6.2 Related Work	85

6.3 Methodology and Modeling	88

6.4 Asymmetric Set-Associative Cache	89

6.5 Results	94

6.6 Discussion and Future Work	96

6.7 Conclusions	98

References	99

Memory Issues in Hardware-Supported Software Safety	101

7.2 Historical Context	102

7.3 Motivating Applications	104

7.4 Architectural Mechanisms	108

7.5 Results	112

7.6 Conclusions	114

References	115

Reconfigurable Memory Module in the RAMP System for Stream Processing	117

8.2 RAMP Architecture	119

8.3 Cluster Architecture	121

8.4 Memory Module Architecture	123

8.6 Controller	127

8.7 Handshake Blocks	130

8.8 Scan Chain Register	131

8.10 Conclusion	133

Performance of Memory Expansion Technology (MXT)	139

9.2 Overview of MXT Hardware	141

9.3 The MXT Memory Management Software	143

9.4 Performance Evaluation	144

9.5 Related Work	152

9.6 Conclusions	155

Profile-Tuned Heap Access	157

10.2 Related Work	160

10.3 Algorithms	161

10.4 Results	165

References	166

Array Merging: A Technique for Improving Cache and TLB Behavior	169

11.2 Related Work	170

11.3 Basic Notions	172

11.4 Cache-conscious Merging	174

11.5 Case study	177

11.6 Experimental Results	179

11.7 Conclusions	182

Software Logging under Speculative Parallelization	185

12.2 Speculative Parallelization and Versioning	187

12.3 Speculation Protocol Used	190

12.4 Efficient Software Logging / 12.4.1 Log Operations	191

12.5 Evaluation Methodology / 12.5.1 Simulation Environment	193

12.6 Evaluation	195

12.8 Conclusion	197

An Analysis of Scalar Memory Accesses in Embedded and Multimedia Systems	203

13.2 Previous Work	204

13.3 Experimental Setup	205

13.4 Results	206

13.5 Conclusion and Future Work	213

References	214

Bandwidth-Based Prefetching for Constant-Stride Arrays	217

14.2 Previous Work	218

14.3 Off-Chip Bandwidth	219

14.4 Cache Conflicts	221

14.5 Algorithm Details	222

14.7 Conclusion	228

References	229

Performance Potential of Effective Address Prediction of Load Instructions	231

15.2 Effective Address Predictors	234

15.3 Evaluation Methodology	238

15.4 Results	242

15.5 Related Work	246

15.6 Conclusion and Future Work	247

References	249

Evaluating Novel Memory System Alternatives for Speculative Multithreaded Computer Systems	253

16.2 Background and Motivation	254

16.3 The Superthreaded Architecture Model	255

16.4 Methodology	256

16.5 Results	258

16.6 Conclusion	261

References	264

Evaluation of Large L3 Caches Using TPC-H Trace Samples	267

17.2 TPC-H Traces	268

17.3 Evaluation Methodology	271

17.4 Simulation Results	272

17.6 Conclusion	279

Acknowledgments	280

Exploiting Intelligent Memory for Database Workloads	283

18.2 Related Work	284

18.3 FlexRAM	285

18.4 FlexDB	286

18.5 Experimental Setup	289

18.6 Experimental Results	291

Author Index	297


Other editions

High Performance Memory Systems
Haldun Hadimioglu, David Kaeli, Jeffrey Kuskin, Ashwini Nanda, Josep Torrellas
Limited preview - 2011

High Performance Memory Systems
Haldun Hadimioglu, David Kaeli, Jeffrey Kuskin, Ashwini Nanda, Josep Torrellas
No preview available - 2012

Common terms and phrases

Address Predictor algorithm allocation applications array array merging asymmetric caches benchmarks byte cache block cache coherence cache conflicts cache line cache misses chapter checker checking cluster coherence misses coherence protocol compiler compression Computer Architecture configuration counter cycle data fetch DEAP direct-mapped effective address prediction entry evaluation execution Figure hardware hybrid IEEE implementation instructions International Symposium iteration L3 cache latency level-1 data cache logging lookup loop loop fusion memory accesses memory hierarchy memory module memory system merging miss rate Miss ratio multiprocessor optimization P.Array P.Mem parallel performance physical memory pipeline pointer power consumption prefetch Proceedings processor queries references request reuse scalar scan chain serial snooping set-associative cache shared level-1 shows simulation SPEC speculative locks speculative memory buffer speculative threads speedup structure Superscalar Superthreaded Table techniques thread units trace samples transaction workloads

Bibliographic information

Title	High Performance Memory Systems
Editors	Haldun Hadimioglu, David Kaeli, Jeffrey Kuskin, Ashwini Nanda, Josep Torrellas
Edition	illustrated
Publisher	Springer Science & Business Media, 2003
ISBN	038700310X, 9780387003108
Length	297 pages
