1 Introduction
1.1 Research Background and Significance
1.1.1 Development Trends of Neural Networks
1.1.2 Requirements of NN Processors
1.1.3 Energy-Efficient NN Processors
1.2 Summary of the Research Work
1.2.1 Overall Framework of the Research Work
1.2.2 Main Contributions of This Book
1.3 Overall Structure of This Book
References

2 Basics and Research Status of Neural Network Processors
2.1 Basics of Neural Network Algorithms
2.2 Basics of Neural Network Processors
2.3 Research Status of Digital-Circuits-Based NN Processors
2.3.1 Data Reuse
2.3.2 Low-Bit Quantization
2.3.3 NN Model Compression and Sparsity
2.3.4 Summary of Digital-Circuits-Based NN Processors
2.4 Research Status of CIM NN Processors
2.4.1 CIM Principle
2.4.2 CIM Devices
2.4.3 CIM Circuits
2.4.4 CIM Macro
2.4.5 Summary of CIM NN Processors
2.5 Summary of This Chapter
References

3 Energy-Efficient NN Processor by Optimizing Data Reuse for Specific Convolutional Kernels
3.1 Introduction
3.2 Previous Data Reuse Methods and Their Constraints
3.3 The KOP3 Processor Optimized for Specific Convolutional Kernels
3.4 Processing Array Optimized for Specific Convolutional Kernels
3.5 Local Memory Cyclic Access Architecture and Scheduling Strategy
3.6 Module-Level Parallel Instruction Set and the Control Circuits
3.7 Experimental Results
3.8 Conclusion
References

4 Optimized Neural Network Processor Based on Frequency-Domain Compression Algorithm
4.1 Introduction
4.2 The Limitations of Irregular Sparse Optimization and the CirCNN Frequency-Domain Compression Algorithm
4.3 Frequency-Domain NN Processor STICKER-T
4.4 Global-Parallel Bit-Serial FFT Circuits
4.5 Frequency-Domain 2D Data-Reuse MAC Array
4.6 Small-Area Low-Power Block-Wise TRAM
4.7 Chip Measurement Results and Comparison
4.8 Summary of This Chapter
References

5 Digital Circuits and CIM Integrated NN Processor
5.1 Introduction
5.2 The Advantages of CIM Over Pure Digital Circuits
5.3 Design Challenges for System-Level CIM Chips
5.4 Sparse CIM Processor STICKER-IM
5.5 Structural Block-Wise Weight Sparsity and Dynamic Activation Sparsity
5.6 Flexible Mapping and Scheduling and Intra/Inter-Macro Data Reuse
5.7 Energy-Efficient CIM Macro with Dynamic ADC Power-Off
5.8 Chip Measurement Results and Comparison
5.9 Summary of This Chapter
References

6 A "Digital+CIM" Processor Supporting Large-Scale NN Models
6.1 Introduction
6.2 The Challenges of System-Level CIM Chips in Supporting Large-Scale NN Models
6.3 "Digital+CIM" NN Processor STICKER-IM
6.4 Set-Associative Block-Wise Sparse Zero-Skipping Circuits
6.5 Ping-Pong CIM and Weight Update Architecture
6.6 Ping-Pong CIM Macro with Dynamic ADC Precision
6.7 Chip Measurement Results and Comparison
6.8 Summary of This Chapter
References

7 Summary and Prospect
7.1 Summary of This Book
7.2 Prospect of This Book