- 3D Gaussian Splatting (1)
- 3D NAND Flash (1)
- 3D accelerator (1)
- 3D reconstruction (1)
- 3D spatial computing (1)
- 3D-Stacked DRAM (1)
- 3D-Stacked Memory (1)
- 3D-stacked DRAM (1)
- 3D-stacked-memory (1)
- 4-bit matrix multiplication (1)
- AESPA (1)
- AI (1)
- AI Accelerator (1)
- AI accelerator (2)
- AI processor (1)
- AR/VR (1)
- ARMv8-A (1)
- Accelerator (5)
- Accelerator-in-Memory (1)
- Activation Compression (1)
- Address Translation (1)
- Algorithm-Hardware Co-Design (1)
- Approximate Computing (1)
- Approximate Nearest Neighbor (1)
- Arbitration (1)
- Ascend NPU (1)
- Associative Processor (1)
- AsyncDIMM (1)
- Atomic Operations (1)
- Atomic Regions (1)
- Auto-Tuning (1)
- Autoscaling (1)
- BM1684X (1)
- Bandwidth Utilization (1)
- Bank-Level Parallelism (1)
- Big Data (1)
- Bit-Serial-SIMD (1)
- Bit-serial SIMD PUD (1)
- Bit-slice Architecture (1)
- Bitwise Operation (1)
- Bitwise Operations (1)
- Bitwise-Operations (1)
- Block Floating Point (1)
- Brain-Computer Interface (1)
- Branch Prediction (1)
- BreakHammer (1)
- Bulk Data Copy (1)
- Bulk bitwise operations (1)
- CARS (1)
- CGRA (1)
- CKKS-TFHE (1)
- CNN (1)
- CNN accelerator (2)
- CNN training accelerator (1)
- CNN/DNN Accelerator (1)
- CNN_accelerator (1)
- CPU-Optimization (1)
- CUDA VMM (1)
- CXL (6)
- Cache Coherence (1)
- Cacheline Locking (1)
- Cambricon-C (1)
- Chiplet (2)
- Cloud (1)
- Code Generation (1)
- Coherence (1)
- Collective-Communication (1)
- Command Processor (1)
- Comparator-based_Neural_Network (1)
- Compiler (1)
- Compiler Framework (1)
- Compiler Optimization (1)
- Composition-of-Experts (1)
- Compute-in-Memory (1)
- Computing-in-Memory (2)
- Concurrency Control (1)
- Consistency (1)
- Continuous Batching (1)
- Cost-Optimization (1)
- D-RaNGe (1)
- DDR DRAM (1)
- DDR5 (1)
- DIMM-Link (1)
- DIMM-NMP (1)
- DLRM (1)
- DMA Descriptor (1)
- DNN Accelerator (1)
- DNN compiler (1)
- DNN training (1)
- DNN-accelerator (1)
- DRAM (25)
- DRAM Cache (1)
- DRAM PIM (7)
- DRAM mapping (1)
- DRAM-Cache (1)
- DRAM-PIM (2)
- DRAM-Throughput (1)
- DRAM-based FPGA (1)
- DRAM↔PIM data transfer (1)
- DVFS (2)
- Data-Movement (1)
- Data-Parallel Processor (1)
- Datacenter accelerators (1)
- Dataflow (1)
- Dataflow Architecture (1)
- Die-Stacked-DRAM (1)
- Diffusion Model (1)
- Distributed Caching (1)
- Distributed Systems (1)
- Domain-Wall Memory (1)
- Domain-wall Logic (1)
- Dynamic Memory Management (1)
- Dynamic Scheduling (1)
- Dynamic Sparsity (1)
- ECC (1)
- Early Exit (1)
- Edge AI (1)
- Edge Deployment (1)
- Edge inference (2)
- Edge-Computing (1)
- Efficient Inference (1)
- Energy Efficiency (3)
- Error-Correcting-Code (1)
- Event-based HAR (1)
- Execution scheduling (1)
- FFT (1)
- FHE (1)
- FPGA (8)
- FaaS (1)
- Fibonacci-coding (1)
- Fine-grained Activation (1)
- Fine-grained-DRAM (1)
- FlashAttention (1)
- Floating-Point (1)
- Fully Homomorphic Encryption (1)
- GCN (1)
- GDMA (1)
- GEMM (1)
- GPGPU (1)
- GPGPU simulation (1)
- GPU (8)
- GPU Architecture (1)
- GPU Cluster (1)
- GPU Inference (1)
- GPU Memory (1)
- GPU Memory Management (1)
- GPU Optimization (1)
- GPU sharing (1)
- GPU synchronization (1)
- GhostMinion (1)
- Graph Analytics (1)
- Graph Computing (1)
- Graph Neural Network (1)
- Graph Neural Network Accelerator (1)
- Graph Processing (3)
- HBM (2)
- HBM2 (1)
- HLS (1)
- Halide (1)
- Hardware (1)
- Hardware Accelerator (6)
- Hardware Architecture (1)
- Hardware Transactional Memory (2)
- Hardware-Software Co-Design (1)
- Hardware-Software Co-design (1)
- Hardware/Software Co-Design (1)
- Hash Table (1)
- Hashed Page Table (1)
- Heterogeneous Architecture (2)
- Heterogeneous Computing (1)
- Heterogeneous Memory (1)
- Hierarchical Search (1)
- High-Level Synthesis (HLS) (1)
- Hybrid Accelerator Design (1)
- Hybrid Memory Cube (3)
- Hybrid-Memory-System (1)
- Hypergraph Neural Network (1)
- ILP (1)
- Image Processing (1)
- In-Cache Computing (1)
- In-DRAM Computing (1)
- In-Flash Computing (1)
- In-Flash-Processing (1)
- In-Memory Computing (3)
- In-Order Processor (2)
- In-Situ Accelerator (1)
- In-band ECC (1)
- In-situ-accelerator (1)
- Inter-DIMM Communication (1)
- Interconnection-Network (1)
- Job Scheduling (1)
- KV Cache (1)
- KV Cache Compression (2)
- KV cache (4)
- KV cache compression (8)
- KV cache eviction (1)
- KV-Cache Compression (1)
- KV-cache compression (2)
- KVCache compression (1)
- Kernel Fusion (1)
- LLM (5)
- LLM Acceleration (1)
- LLM Inference (9)
- LLM Training (1)
- LLM acceleration (1)
- LLM compression (1)
- LLM inference (8)
- LLM serving (1)
- LLM-Inference (1)
- LLM-inference (1)
- LRDIMM (2)
- LUT-NN (1)
- LUT_based_multiplication (1)
- Large Language Model (2)
- Large Language Models (1)
- Locality (1)
- Logarithmic Number System (1)
- Logic-PIM (1)
- Long-context LLM (1)
- Lookup Table (LUT) (1)
- Lookup Tables (1)
- Lookup-Table (1)
- Low-Bit Quantization (1)
- Low-Rank Approximation (1)
- Low-Rank Projection (1)
- Low-bit LLM (1)
- Low-power Memory (1)
- Low-rank Approximation (1)
- MAGIC logic (1)
- MCM (1)
- MICRO 2022 (1)
- MICRO 2024 (1)
- MIMD (1)
- MLLM (1)
- MPU-Sim (1)
- Machine Learning Inference (1)
- Mamba (2)
- MappingOptimization (1)
- Matrix Computation (1)
- Matrix Multiplication (1)
- Matrix-Multiplication (1)
- Matryoshka training (1)
- Max/Min Search (1)
- Memory (1)
- Memory Controller (1)
- Memory Expander (1)
- Memory Hierarchy (1)
- Memory Pooling (1)
- Memory Reliability (1)
- Memory Tiering (1)
- Memory management unit (1)
- Memory-Level Parallelism (1)
- Memory-Management-Unit (1)
- Memory-Wall (1)
- Memory-based Computing (1)
- Memory_Model (1)
- Memristive CIM (1)
- Memristor (1)
- Message Passing (1)
- Microarchitecture (2)
- Microarchitecture Security (1)
- MiniCPM-V (1)
- Mixed-Precision (1)
- Mixture-of-Experts (1)
- MosaicCPU (1)
- MosaicScheduler (1)
- Multi-Instance (1)
- Multi-chip (1)
- Multilingual (1)
- NAND-Flash (1)
- NAND-Flash-Controller (1)
- NDP (1)
- NPU (4)
- NTT (1)
- NUMA (2)
- NVM crossbar (1)
- NeRF (2)
- Near-DRAM Acceleration (1)
- Near-DRAM Processing (1)
- Near-Data Processing (4)
- Near-Data-Processing (1)
- Near-Memory Computing (1)
- Near-Memory Processing (2)
- Near-Memory-Processing (1)
- Near-bank (1)
- Near-bank computing (1)
- Network-on-Chip (1)
- Neu10 (1)
- NeuISA (1)
- NeurIPS 2024 (1)
- Neural Network Acceleration (1)
- Neural Network Accelerator (2)
- Neural rendering (1)
- Neuromorphic Computing (1)
- Neuromorphic Processor (1)
- Non-Volatile Memory (1)
- Non-blocking Miss Handling (1)
- Nonvolatile-Memory (1)
- Normalized Effective Rank (1)
- OCR (1)
- On-Device LLMs (1)
- On-device AI (1)
- OpenSSD (1)
- Operand Collector (1)
- PCIe (1)
- PCN Accelerator (1)
- PF-DRAM (1)
- PIM (11)
- PIM accelerator (1)
- PIM-enabled Instructions (1)
- PIVOT (1)
- PRAM (1)
- PVT Variation (1)
- Page Table (1)
- Page-Table-based (1)
- PagedAttention (1)
- Parallelism (2)
- Performance debugging (1)
- Point Cloud (1)
- Polymorphic ECC (1)
- Power Modeling (1)
- Power-Management (1)
- Power-Modeling (1)
- Prefetching (1)
- Processing-In-Memory (2)
- Processing-Using-DRAM (1)
- Processing-Using-Memory (1)
- Processing-in-DRAM (1)
- Processing-in-Memory (37)
- Processing-in-Memory (Concept) (1)
- Processing-in-memory (2)
- Processing-using-DRAM (2)
- Programmable Accelerator (1)
- PuM (1)
- PyPIM (1)
- Python Tensor Library (1)
- QoS (1)
- Quantization (8)
- Quantum Computing (1)
- RISC-V (3)
- RISC-V Vector (1)
- ROC (1)
- RRAM PIM (1)
- Racetrack Memory (2)
- Ray Tracing (2)
- ReRAM (3)
- ReRAM PIM (2)
- Real-Time Systems (1)
- Recommendation System (1)
- Reconfigurable logic (1)
- Reconfigurable-Dataflow-Unit (1)
- Register File (1)
- Register-based Addressing (1)
- Reliability (1)
- Resource Management (1)
- Resource Partitioning (1)
- Resource-constrained systems (1)
- Retention Time (1)
- RoPE (1)
- RowClone (2)
- RowHammer (2)
- Runahead (1)
- SLAM (2)
- SN40L (1)
- SNN-Accelerator (1)
- SOT-MRAM (1)
- SPASM (1)
- SRAM CIM (1)
- SRAM-CIM (2)
- SSD-Architecture (1)
- SSM (1)
- Scheduling (1)
- ScopeAdvice (1)
- Security (1)
- Server CPU (1)
- Serverless (1)
- Serverless Computing (1)
- Side-Channel Attack (1)
- Simulation (1)
- Simulator (2)
- Slice-level Sparsity (1)
- SoC (1)
- Software Prefetching (1)
- Software Transactional Memory (1)
- Software-Defined-Storage (1)
- SpMV (3)
- SpTRSV (1)
- SpaceA (1)
- Sparse Accelerator (1)
- Sparse Attention (1)
- Sparse Data Structures (1)
- Sparse Embedding Similarity (1)
- Sparse Matrix (2)
- Sparse Tensor Algebra (1)
- Sparse attention (1)
- Sparse-Matrix (1)
- Sparsity (1)
- Speculative Lock Elision (1)
- Speculative Value Forwarding (1)
- Spike-Driven Processing (1)
- Spiking Neural Network (2)
- Spiking Neural Networks (1)
- Spiking-Neural-Networks (1)
- Spintronic (1)
- SplitSync (1)
- Stage-Customization (1)
- State Space Model (1)
- Static Analysis (1)
- Stiefel Manifold (1)
- Stochastic Computing (1)
- Storage Optimization (1)
- Stream-based DRAM Cache (1)
- Subarray-Level Parallelism (1)
- Synchronization (2)
- Systems (1)
- Systolic Array (2)
- TAGE (1)
- TLB (2)
- TPU (1)
- TYR (1)
- Tags (1)
- Temporal Parallelism (1)
- Temporal Similarity (1)
- Tensor compiler (1)
- Ternary weight network (1)
- Tesseract (1)
- Thermal Management (1)
- Time-Domain Interface (1)
- Token Merging (1)
- Top-K SpMV (1)
- Transformer (2)
- Transformer Accelerator (1)
- Transformer Models (1)
- Tree Traversal (1)
- Triple-row activation (1)
- UPMEM (5)
- Unified Virtual Memory (1)
- VMM latency optimization (1)
- Value Prediction (1)
- Vandermonde (1)
- Variational Quantum Algorithm (1)
- Vector Quantization (3)
- Vector-Similarity-Search (1)
- Virtual Memory (2)
- Vision Transformer (1)
- Visual Encoding (1)
- Wear-Leveling (1)
- accelerator (2)
- active message (1)
- actor model (1)
- address translation (1)
- all-SRAM accelerator (1)
- approximate computing (1)
- associative accelerator (1)
- attention accelerator (1)
- bank-level parallelism (1)
- bit-pipelining (1)
- bit-serial architecture (1)
- branch prediction (2)
- cache hierarchy (1)
- cache indexing (1)
- cache side-channel (1)
- chiplet (1)
- cloud-platform (1)
- code deformation (1)
- collective communication (2)
- computation-in-memory (1)
- computer-architecture-simulation (1)
- computing-in-memory (1)
- content addressable memory (1)
- cross-point RAM (1)
- database accelerator (1)
- datacenter networking (1)
- debugging-tool (1)
- decoding speed (1)
- deep learning hardware (1)
- deep learning systems (1)
- die-stacked DRAM (1)
- differentiable KMeans (1)
- diffusion LLM (1)
- direct-attached accelerators (1)
- distributed machine learning (1)
- distributed on-chip memory (1)
- dynamic defects (1)
- eDRAM (1)
- eDRAM/eNVM Accelerator (1)
- einsum cascade (1)
- energy efficiency (1)
- entropy (1)
- formal verification (1)
- function calls (1)
- ghost arbitration (1)
- graph pattern mining (1)
- graph-analytics (1)
- hardware accelerator (1)
- hardware compression (1)
- hardware generation (1)
- hardware-accelerator (1)
- hardware-software co-design (1)
- helper threads (1)
- heterogeneous memory architecture (1)
- hot page management (1)
- hybrid fidelity (1)
- hypercube (1)
- image projection accelerator (1)
- in-DRAM PIM (1)
- in-memory computing (1)
- incremental SVD (1)
- inference optimization (1)
- input-stationary dataflow (1)
- inter-DIMM-broadcast (1)
- issue queue (1)
- kernel scheduling (1)
- large language models (2)
- last-level cache (1)
- latency-critical data center (1)
- leakage contracts (1)
- long-context LLM (4)
- long-sequence-modeling (1)
- low-precision quantization (1)
- low-rank approximation (4)
- low-rank attention (1)
- low-rank decomposition (1)
- low-rank projection (1)
- mMPU (1)
- memory bandwidth partitioning (1)
- memory management (1)
- memory-centric architecture (1)
- memory-security (1)
- memory-system (1)
- memory_maintenance (1)
- memristor (2)
- microarchitectural side channel (1)
- microarchitecture (4)
- mixed-precision (1)
- mixed-signal accelerator (1)
- multi-camera system (1)
- multi-model scheduling (1)
- near-data computing (1)
- near-data-processing (1)
- near-memory computing (1)
- near-memory-processing (1)
- network stack (1)
- network topology (1)
- network-on-chip (1)
- neuromorphic (1)
- normalized effective rank (1)
- on-device inference (1)
- out-of-order core (1)
- performance isolation (1)
- precomputation (1)
- processing-in-memory (1)
- processing-using-memory (2)
- python (2)
- quantization (1)
- quantum error correction (1)
- real-time-monitoring (1)
- register stack (1)
- reinforcement learning (1)
- resistive memory (1)
- resource allocation (1)
- secure prefetching (1)
- self-attention (1)
- sparse acceleration (1)
- sparse attention (1)
- sparse iterative solver (1)
- spatial accelerator (1)
- spatial architecture (1)
- spiking neural network (1)
- spiking neural networks (1)
- stall analysis (1)
- subarray-level parallelism (1)
- surface code (1)
- system optimization (1)
- task scheduling (1)
- throttling (1)
- throughput (1)
- token editing (1)
- uPIMulator (1)
- uv (2)
- virtual memory (2)
- virtualization (1)
- wakeup logic (1)
- weight clustering (1)
- compute-by-lookup (以查代算) (1)
- on-device LLM (端侧大模型) (1)
- unified memory (统一内存) (2)
- quantization (量化) (1)