- 3D Gaussian Splatting(1)
- 3D NAND Flash(1)
- 3D accelerator(1)
- 3D reconstruction(1)
- 3D spatial computing(1)
- 3D-Stacked DRAM(2)
- 3D-Stacked Memory(2)
- 4-bit matrix multiplication(1)
- AESPA(1)
- AI(1)
- AI Accelerator(3)
- AI processor(1)
- AR/VR(1)
- ARMv8-A(1)
- Accelerator(5)
- Accelerator-in-Memory(1)
- Activation Compression(1)
- Address Translation(1)
- Algorithm-Hardware Co-Design(1)
- Approximate Computing(1)
- Approximate Nearest Neighbor(1)
- Arbitration(1)
- Ascend NPU(1)
- Associative Processor(1)
- AsyncDIMM(1)
- Atomic Operations(1)
- Atomic Regions(1)
- Auto-Tuning(1)
- Autoscaling(1)
- BM1684X(1)
- Bandwidth Utilization(1)
- Bank-Level Parallelism(1)
- Big Data(1)
- Bit-Serial-SIMD(1)
- Bit-serial SIMD PUD(1)
- Bit-slice Architecture(1)
- Bitwise Operations(3)
- Block Floating Point(1)
- Brain-Computer Interface(1)
- Branch Prediction(1)
- BreakHammer(1)
- Bulk Data Copy(1)
- Bulk bitwise operations(1)
- CARS(1)
- CGRA(1)
- CKKS-TFHE(1)
- CNN(1)
- CNN accelerator(3)
- CNN training accelerator(1)
- CNN/DNN Accelerator(1)
- CPU-Optimization(1)
- CUDA VMM(1)
- CXL(6)
- Cache Coherence(1)
- Cacheline Locking(1)
- Cambricon-C(1)
- Chiplet(2)
- Cloud(1)
- Code Generation(1)
- Coherence(1)
- Collective-Communication(1)
- Command Processor(1)
- Comparator-based Neural Network(1)
- Compiler(1)
- Compiler Framework(1)
- Compiler Optimization(1)
- Composition-of-Experts(1)
- Compute-in-Memory(1)
- Computing-in-Memory(2)
- Concurrency Control(1)
- Consistency(1)
- Continuous Batching(1)
- Cost-Optimization(1)
- D-RaNGe(1)
- DDR DRAM(1)
- DDR5(1)
- DIMM-Link(1)
- DIMM-NMP(1)
- DLRM(1)
- DMA Descriptor(1)
- DNN Accelerator(2)
- DNN compiler(1)
- DNN training(1)
- DRAM(25)
- DRAM Cache(2)
- DRAM PIM(9)
- DRAM mapping(1)
- DRAM-Throughput(1)
- DRAM-based FPGA(1)
- DRAM↔PIM data transfer(1)
- DVFS(2)
- Data-Movement(1)
- Data-Parallel Processor(1)
- Datacenter accelerators(1)
- Dataflow(1)
- Dataflow Architecture(1)
- Die-Stacked-DRAM(1)
- Diffusion Model(1)
- Distributed Caching(1)
- Distributed Systems(1)
- Domain-Wall Memory(1)
- Domain-wall Logic(1)
- Dynamic Memory Management(1)
- Dynamic Scheduling(1)
- Dynamic Sparsity(1)
- ECC(1)
- Early Exit(1)
- Edge AI(1)
- Edge Deployment(1)
- Edge inference(2)
- Edge-Computing(1)
- Efficient Inference(1)
- Energy Efficiency(3)
- Error-Correcting-Code(1)
- Event-based HAR(1)
- Execution scheduling(1)
- FFT(1)
- FHE(1)
- FPGA(8)
- FaaS(1)
- Fibonacci-coding(1)
- Fine-grained Activation(1)
- Fine-grained-DRAM(1)
- FlashAttention(1)
- Floating-Point(1)
- Fully Homomorphic Encryption(1)
- GCN(1)
- GDMA(1)
- GEMM(1)
- GPGPU(1)
- GPGPU simulation(1)
- GPU(8)
- GPU Architecture(1)
- GPU Cluster(1)
- GPU Inference(1)
- GPU Memory(1)
- GPU Memory Management(1)
- GPU Optimization(1)
- GPU sharing(1)
- GPU synchronization(1)
- GhostMinion(1)
- Graph Analytics(1)
- Graph Computing(1)
- Graph Neural Network(1)
- Graph Neural Network Accelerator(1)
- Graph Processing(3)
- HBM(2)
- HBM2(1)
- HLS(1)
- Halide(1)
- Hardware(1)
- Hardware Accelerator(6)
- Hardware Architecture(1)
- Hardware Transactional Memory(2)
- Hardware-Software Co-Design(3)
- Hash Table(1)
- Hashed Page Table(1)
- Heterogeneous Architecture(2)
- Heterogeneous Computing(1)
- Heterogeneous Memory(1)
- Hierarchical Search(1)
- High-Level Synthesis (HLS)(1)
- Hybrid Accelerator Design(1)
- Hybrid Memory Cube(3)
- Hybrid-Memory-System(1)
- Hypergraph Neural Network(1)
- ILP(1)
- Image Processing(1)
- In-Cache Computing(1)
- In-DRAM Computing(1)
- In-Flash Computing(1)
- In-Flash-Processing(1)
- In-Memory Computing(3)
- In-Order Processor(2)
- In-Situ Accelerator(2)
- In-band ECC(1)
- Inter-DIMM Communication(1)
- Interconnection-Network(1)
- Job Scheduling(1)
- KV cache(5)
- KV cache compression(14)
- KV cache eviction(1)
- Kernel Fusion(1)
- LLM(5)
- LLM Acceleration(2)
- LLM Inference(19)
- LLM Training(1)
- LLM compression(1)
- LLM serving(1)
- LRDIMM(2)
- LUT-NN(1)
- LUT-based multiplication(1)
- Large Language Models(3)
- Locality(1)
- Logarithmic Number System(1)
- Logic-PIM(1)
- Long-context LLM(1)
- Lookup Table(3)
- Low-Bit Quantization(1)
- Low-Rank Approximation(2)
- Low-Rank Projection(1)
- Low-bit LLM(1)
- Low-power Memory(1)
- MAGIC logic(1)
- MCM(1)
- MICRO 2022(1)
- MICRO 2024(1)
- MIMD(1)
- MLLM(1)
- MPU-Sim(1)
- Machine Learning Inference(1)
- Mamba(2)
- Mapping Optimization(1)
- Matrix Computation(1)
- Matrix Multiplication(2)
- Matryoshka training(1)
- Max/Min Search(1)
- Memory(1)
- Memory Controller(1)
- Memory Expander(1)
- Memory Hierarchy(1)
- Memory Pooling(1)
- Memory Reliability(1)
- Memory Tiering(1)
- Memory management unit(2)
- Memory-Level Parallelism(1)
- Memory-Wall(1)
- Memory-based Computing(1)
- Memory Model(1)
- Memristive CIM(1)
- Memristor(1)
- Message Passing(1)
- Microarchitecture(2)
- Microarchitecture Security(1)
- MiniCPM-V(1)
- Mixed-Precision(1)
- Mixture-of-Experts(1)
- MosaicCPU(1)
- MosaicScheduler(1)
- Multi-Instance(1)
- Multi-chip(1)
- Multilingual(1)
- NAND-Flash(1)
- NAND-Flash-Controller(1)
- NDP(1)
- NPU(4)
- NTT(1)
- NUMA(2)
- NVM crossbar(1)
- NeRF(2)
- Near-DRAM Acceleration(1)
- Near-DRAM Processing(1)
- Near-Data Processing(5)
- Near-Memory Computing(1)
- Near-Memory Processing(3)
- Near-bank(1)
- Near-bank computing(1)
- Network-on-Chip(1)
- Neu10(1)
- NeuISA(1)
- NeurIPS 2024(1)
- Neural Network Acceleration(1)
- Neural Network Accelerator(2)
- Neural rendering(1)
- Neuromorphic Computing(1)
- Neuromorphic Processor(1)
- Non-Volatile Memory(2)
- Non-blocking Miss Handling(1)
- Normalized Effective Rank(1)
- OCR(1)
- On-Device LLMs(1)
- On-device AI(1)
- OpenSSD(1)
- Operand Collector(1)
- PCIe(1)
- PCN Accelerator(1)
- PF-DRAM(1)
- PIM(11)
- PIM accelerator(1)
- PIM-enabled Instructions(1)
- PIVOT(1)
- PRAM(1)
- PVT Variation(1)
- Page Table(1)
- Page-Table-based(1)
- PagedAttention(1)
- Parallelism(2)
- Performance debugging(1)
- Point Cloud(1)
- Polymorphic ECC(1)
- Power Modeling(2)
- Power-Management(1)
- Prefetching(1)
- Processing-in-DRAM(1)
- Processing-in-Memory(42)
- Processing-using-DRAM(3)
- Processing-using-Memory(1)
- Programmable Accelerator(1)
- PuM(1)
- PyPIM(1)
- Python Tensor Library(1)
- QoS(1)
- Quantization(8)
- Quantum Computing(1)
- RISC-V(3)
- RISC-V Vector(1)
- ROC(1)
- RRAM PIM(1)
- Racetrack Memory(2)
- Ray Tracing(2)
- ReRAM(3)
- ReRAM PIM(2)
- Real-Time Systems(1)
- Recommendation System(1)
- Reconfigurable logic(1)
- Reconfigurable-Dataflow-Unit(1)
- Register File(1)
- Register-based Addressing(1)
- Reliability(1)
- Resource Management(1)
- Resource Partitioning(1)
- Resource-constrained systems(1)
- Retention Time(1)
- RoPE(1)
- RowClone(2)
- RowHammer(2)
- Runahead(1)
- SLAM(2)
- SN40L(1)
- SNN-Accelerator(1)
- SOT-MRAM(1)
- SPASM(1)
- SRAM-CIM(3)
- SSD-Architecture(1)
- SSM(1)
- Scheduling(1)
- ScopeAdvice(1)
- Security(1)
- Server CPU(1)
- Serverless(1)
- Serverless Computing(1)
- Side-Channel Attack(1)
- Simulation(1)
- Simulator(2)
- Slice-level Sparsity(1)
- SoC(1)
- Software Prefetching(1)
- Software Transactional Memory(1)
- Software-Defined-Storage(1)
- SpMV(3)
- SpTRSV(1)
- SpaceA(1)
- Sparse Accelerator(1)
- Sparse Attention(2)
- Sparse Data Structures(1)
- Sparse Embedding Similarity(1)
- Sparse Matrix(3)
- Sparse Tensor Algebra(1)
- Sparsity(1)
- Speculative Lock Elision(1)
- Speculative Value Forwarding(1)
- Spike-Driven Processing(1)
- Spiking Neural Network(4)
- Spintronic(1)
- SplitSync(1)
- Stage-Customization(1)
- State Space Model(1)
- Static Analysis(1)
- Stiefel Manifold(1)
- Stochastic Computing(1)
- Storage Optimization(1)
- Stream-based DRAM Cache(1)
- Subarray-Level Parallelism(1)
- Synchronization(2)
- Systems(1)
- Systolic Array(2)
- TAGE(1)
- TLB(2)
- TPU(1)
- TYR(1)
- Tags(1)
- Temporal Parallelism(1)
- Temporal Similarity(1)
- Tensor compiler(1)
- Ternary weight network(1)
- Tesseract(1)
- Thermal Management(1)
- Time-Domain Interface(1)
- Token Merging(1)
- Top-K SpMV(1)
- Transformer(2)
- Transformer Accelerator(1)
- Transformer Models(1)
- Tree Traversal(1)
- Triple-row activation(1)
- UPMEM(5)
- Unified Virtual Memory(1)
- VMM latency optimization(1)
- Value Prediction(1)
- Vandermonde(1)
- Variational Quantum Algorithm(1)
- Vector Quantization(3)
- Vector-Similarity-Search(1)
- Virtual Memory(2)
- Vision Transformer(1)
- Visual Encoding(1)
- Wear-Leveling(1)
- accelerator(2)
- active message(1)
- actor model(1)
- address translation(1)
- all-SRAM accelerator(1)
- approximate computing(1)
- associative accelerator(1)
- attention accelerator(1)
- bank-level parallelism(1)
- bit-pipelining(1)
- bit-serial architecture(1)
- branch prediction(2)
- cache hierarchy(1)
- cache indexing(1)
- cache side-channel(1)
- chiplet(1)
- cloud-platform(1)
- code deformation(1)
- collective communication(2)
- computation-in-memory(1)
- computer-architecture-simulation(1)
- computing-in-memory(1)
- content addressable memory(1)
- cross-point RAM(1)
- database accelerator(1)
- datacenter networking(1)
- debugging-tool(1)
- decoding speed(1)
- deep learning hardware(1)
- deep learning systems(1)
- die-stacked DRAM(1)
- differentiable KMeans(1)
- diffusion LLM(1)
- direct-attached accelerators(1)
- distributed machine learning(1)
- distributed on-chip memory(1)
- dynamic defects(1)
- eDRAM(1)
- eDRAM/eNVM Accelerator(1)
- einsum cascade(1)
- energy efficiency(1)
- entropy(1)
- formal verification(1)
- function calls(1)
- ghost arbitration(1)
- graph pattern mining(1)
- graph-analytics(1)
- hardware accelerator(2)
- hardware compression(1)
- hardware generation(1)
- hardware-software co-design(1)
- helper threads(1)
- heterogeneous memory architecture(1)
- hot page management(1)
- hybrid fidelity(1)
- hypercube(1)
- image projection accelerator(1)
- in-DRAM PIM(1)
- in-memory computing(1)
- incremental SVD(1)
- inference optimization(1)
- input-stationary dataflow(1)
- inter-DIMM-broadcast(1)
- issue queue(1)
- kernel scheduling(1)
- large language models(2)
- last-level cache(1)
- latency-critical data center(1)
- leakage contracts(1)
- long-context LLM(4)
- long-sequence-modeling(1)
- low-precision quantization(1)
- low-rank approximation(4)
- low-rank attention(1)
- low-rank decomposition(1)
- low-rank projection(1)
- mMPU(1)
- memory bandwidth partitioning(1)
- memory management(1)
- memory-centric architecture(1)
- memory-security(1)
- memory-system(1)
- memory maintenance(1)
- memristor(2)
- microarchitectural side channel(1)
- microarchitecture(4)
- mixed-precision(1)
- mixed-signal accelerator(1)
- multi-camera system(1)
- multi-model scheduling(1)
- near-data computing(1)
- near-data-processing(1)
- near-memory computing(1)
- near-memory-processing(1)
- network stack(1)
- network topology(1)
- network-on-chip(1)
- neuromorphic(1)
- normalized effective rank(1)
- on-device inference(1)
- out-of-order core(1)
- performance isolation(1)
- precomputation(1)
- processing-in-memory(1)
- processing-using-memory(2)
- python(2)
- quantization(1)
- quantum error correction(1)
- real-time-monitoring(1)
- register stack(1)
- reinforcement learning(1)
- resistive memory(1)
- resource allocation(1)
- secure prefetching(1)
- self-attention(1)
- side-channel attack(1)
- sparse acceleration(1)
- sparse attention(1)
- sparse iterative solver(1)
- spatial accelerator(1)
- spatial architecture(1)
- spiking neural networks(2)
- stall analysis(1)
- subarray-level parallelism(1)
- surface code(1)
- system optimization(1)
- task scheduling(1)
- throttling(1)
- throughput(1)
- token editing(1)
- uPIMulator(1)
- uv(2)
- virtual memory(2)
- virtualization(1)
- wakeup logic(1)
- weight clustering(1)
- compute-by-lookup(1)
- on-device large model(1)
- unified memory(2)
- quantization(1)