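About the Book
This book uses the ARMv8-A architecture to introduce the fundamentals of current hardware technology, assembly language, computer arithmetic, pipelining, memory hierarchies, and I/O. It focuses on the changes of the post-PC era, using examples and exercises to cover the recent emergence of mobile computing and cloud computing in detail. Updated content also includes tablets, cloud infrastructure, and the ARM (mobile computing devices) and x86 (cloud computing) architectures.
Computer Organization and Design: The Hardware/Software Interface (English Edition, Original 5th Edition, ARM Edition)
Contents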
●CHAPTERS
1 Computer Abstractions and Technology 2
1.1 Introduction 3
1.2 Eight Great Ideas in Computer Architecture 11
1.3 Below Your Program 13
1.4 Under the Covers 16
1.5 Technologies for Building Processors and Memory 24
1.6 Performance 28
1.7 The Power Wall 40
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors 43
1.9 Real Stuff: Benchmarking the Intel Core i7 46
1.10 Fallacies and Pitfalls 49
1.11 Concluding Remarks 52
1.12 Historical Perspective and Further Reading 54
1.13 Exercises 54
2 Instructions: Language of the Computer 60
2.1 Introduction 62
2.2 Operations of the Computer Hardware 63
2.3 Operands of the Computer Hardware 67
2.4 Signed and Unsigned Numbers 75
2.5 Representing Instructions in the Computer 82
2.6 Logical Operations 90
2.7 Instructions for Making Decisions 93
2.8 Supporting Procedures in Computer Hardware 100
2.9 Communicating with People 110
2.10 LEGv8 Addressing for Wide Immediates and Addresses 115
2.11 Parallelism and Instructions: Synchronization 125
2.12 Translating and Starting a Program 128
2.13 A C Sort Example to Put it All Together 137
2.14 Arrays versus Pointers 146
2.15 Advanced Material: Compiling C and Interpreting Java 150
2.16 Real Stuff: MIPS Instructions 150
2.17 Real Stuff: ARMv7 (32-bit) Instructions 152
2.18 Real Stuff: x86 Instructions 154
2.19 Real Stuff: The Rest of the ARMv8 Instruction Set 163
2.20 Fallacies and Pitfalls 169
2.21 Concluding Remarks 171
2.22 Historical Perspective and Further Reading 173
2.23 Exercises 174
3 Arithmetic for Computers 186
3.1 Introduction 188
3.2 Addition and Subtraction 188
3.3 Multiplication 191
3.4 Division 197
3.5 Floating Point 205
3.6 Parallelism and Computer Arithmetic: Subword Parallelism 230
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86 232
3.8 Real Stuff: The Rest of the ARMv8 Arithmetic Instructions 234
3.9 Going Faster: Subword Parallelism and Matrix Multiply 238
3.10 Fallacies and Pitfalls 242
3.11 Concluding Remarks 245
3.12 Historical Perspective and Further Reading 248
3.13 Exercises 249
4 The Processor 254
4.1 Introduction 256
4.2 Logic Design Conventions 260
4.3 Building a Datapath 263
4.4 A Simple Implementation Scheme 271
4.5 An Overview of Pipelining 283
4.6 Pipelined Datapath and Control 297
4.7 Data Hazards: Forwarding versus Stalling 316
4.8 Control Hazards 328
4.9 Exceptions 336
4.10 Parallelism via Instructions 342
4.11 Real Stuff: The ARM Cortex-A53 and Intel Core i7 Pipelines 355
4.12 Going Faster: Instruction-Level Parallelism and Matrix Multiply 363
4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations 366
4.14 Fallacies and Pitfalls 366
4.15 Concluding Remarks 367
4.16 Historical Perspective and Further Reading 368
4.17 Exercises 368
5 Large and Fast: Exploiting Memory Hierarchy 386
5.1 Introduction 388
5.2 Memory Technologies 392
5.3 The Basics of Caches 397
5.4 Measuring and Improving Cache Performance 412
5.5 Dependable Memory Hierarchy 432
5.6 Virtual Machines 438
5.7 Virtual Memory 441
5.8 A Common Framework for Memory Hierarchy 465
5.9 Using a Finite-State Machine to Control a Simple Cache 472
5.10 Parallelism and Memory Hierarchy: Cache Coherence 477
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks 481
5.12 Advanced Material: Implementing Cache Controllers 482
5.13 Real Stuff: The ARM Cortex-A53 and Intel Core i7 Memory Hierarchies 482
5.14 Real Stuff: The Rest of the ARMv8 System and Special Instructions 487
5.15 Going Faster: Cache Blocking and Matrix Multiply 488
5.16 Fallacies and Pitfalls 491
5.17 Concluding Remarks 496
5.18 Historical Perspective and Further Reading 497
5.19 Exercises 497
6 Parallel Processors from Client to Cloud 514
6.1 Introduction 516
6.2 The Difficulty of Creating Parallel Processing Programs 518
6.3 SISD, MIMD, SIMD, SPMD, and Vector 523
6.4 Hardware Multithreading 530
6.5 Multicore and Other Shared Memory Multiprocessors 533
6.6 Introduction to Graphics Processing Units 538
6.7 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors 545
6.8 Introduction to Multiprocessor Network Topologies 550
6.9 Communicating to the Outside World: Cluster Networking 553
6.10 Multiprocessor Benchmarks and Performance Models 554
6.11 Real Stuff: Benchmarking and Rooflines of the Intel Core i7 960 and the NVIDIA Tesla GPU 564
6.12 Going Faster: Multiple Processors and Matrix Multiply 569
6.13 Fallacies and Pitfalls 572
6.14 Concluding Remarks 574
6.15 Historical Perspective and Further Reading 577
6.16 Exercises 577
APPENDIX
A The Basics of Logic Design A-2
A.1 Introduction A-3
A.2 Gates, Truth Tables, and Logic Equations A-4
A.3 Combinational Logic A-9
A.4 Using a Hardware Description Language A-20
A.5 Constructing a Basic Arithmetic Logic Unit A-26
A.6 Faster Addition: Carry Lookahead A-37
A.7 Clocks A-47
A.8 Memory Elements: Flip-Flops, Latches, and Registers A-49
A.9 Memory Elements: SRAMs and DRAMs A-57
A.10 Finite-State Machines A-66
A.11 Timing Methodologies A-71
A.12 Field Programmable Devices A-77
A.13 Concluding Remarks A-78
A.14 Exercises A-79
Index I-1
ONLINE CONTENT
B Graphics and Computing GPUs B-2
B.1 Introduction B-3
B.2 GPU System Architectures B-7
B.3 Programming GPUs B-12
B.4 Multithreaded Multiprocessor Architecture B-25
B.5 Parallel Memory System B-36
B.6 Floating Point Arithmetic B-41
B.7 Real Stuff: The NVIDIA GeForce 8800 B-46
B.8 Real Stuff: Mapping Applications to GPUs B-55
B.9 Fallacies and Pitfalls B-72
B.10 Concluding Remarks B-76
B.11 Historical Perspective and Further Reading B-77
C Mapping Control to Hardware C-2
C.1 Introduction C-3
C.2 Implementing Combinational Control Units C-4
C.3 Implementing Finite-State Machine Control C-8
C.4 Implementing the Next-State Function with a Sequencer C-22
C.5 Translating a Microprogram to Hardware C-28
C.6 Concluding Remarks C-32
C.7 Exercises C-33
D A Survey of RISC Architectures for Desktop, Server, and Embedded Computers D-2
D.1 Introduction D-3
D.2 Addressing Modes and Instruction Formats D-5
D.3 Instructions: The MIPS Core Subset D-9
D.4 Instructions: Multimedia Extensions of the Desktop/Server RISCs D-16
D.5 Instructions: Digital Signal-Processing Extensions of the Embedded RISCs D-19
D.6 Instructions: Common Extensions to MIPS Core D-20
D.7 Instructions Unique to MIPS-64 D-25
D.8 Instructions Unique to Alpha D-27
D.9 Instructions Unique to SPARC v9 D-29
D.10 Instructions Unique to PowerPC D-32
D.11 Instructions Unique to PA-RISC 2.0 D-34
D.12 Instructions Unique to ARM D-36
D.13 Instructions Unique to Thumb D-38
D.14 Instructions Unique to SuperH D-39
D.15 Instructions Unique to M32R D-40
D.16 Instructions Unique to MIPS-16 D-40
D.17 Concluding Remarks D-43
Glossary G-1
Further Reading FR-1