Wang Zongwu
Article tags
With a bamboo staff and straw sandals, I travel lighter than on horseback;
in a straw cloak amid the misty rain, I take life as it comes. (Su Shi)
Welcome to Zongwu's Science Hub ✨
Residence: Shanghai
Age: 18
Contact Me
3D Gaussian Splatting
1
3D NAND Flash
1
3D accelerator
1
3D reconstruction
1
3D spatial computing
1
3D-Stacked DRAM
1
3D-Stacked Memory
1
3D-stacked DRAM
1
3D-stacked-memory
1
4-bit matrix multiplication
1
AESPA
1
AI
1
AI Accelerator
1
AI accelerator
2
AI processor
1
AR/VR
1
ARMv8-A
1
Accelerator
5
Accelerator-in-Memory
1
Activation Compression
1
Address Translation
1
Algorithm-Hardware Co-Design
1
Approximate Computing
1
Approximate Nearest Neighbor
1
Arbitration
1
Ascend NPU
1
Associative Processor
1
AsyncDIMM
1
Atomic Operations
1
Atomic Regions
1
Auto-Tuning
1
Autoscaling
1
BM1684X
1
Bandwidth Utilization
1
Bank-Level Parallelism
1
Big Data
1
Bit-Serial-SIMD
1
Bit-serial SIMD PUD
1
Bit-slice Architecture
1
Bitwise Operation
1
Bitwise Operations
1
Bitwise-Operations
1
Block Floating Point
1
Brain-Computer Interface
1
Branch Prediction
1
BreakHammer
1
Bulk Data Copy
1
Bulk bitwise operations
1
CARS
1
CGRA
1
CKKS-TFHE
1
CNN
1
CNN accelerator
2
CNN training accelerator
1
CNN/DNN Accelerator
1
CNN_accelerator
1
CPU-Optimization
1
CUDA VMM
1
CXL
6
Cache Coherence
1
Cacheline Locking
1
Cambricon-C
1
Chiplet
2
Cloud
1
Code Generation
1
Coherence
1
Collective-Communication
1
Command Processor
1
Comparator-based_Neural_Network
1
Compiler
1
Compiler Framework
1
Compiler Optimization
1
Composition-of-Experts
1
Compute-in-Memory
1
Computing-in-Memory
2
Concurrency Control
1
Consistency
1
Continuous Batching
1
Cost-Optimization
1
D-RaNGe
1
DDR DRAM
1
DDR5
1
DIMM-Link
1
DIMM-NMP
1
DLRM
1
DMA Descriptor
1
DNN Accelerator
1
DNN compiler
1
DNN training
1
DNN-accelerator
1
DRAM
25
DRAM Cache
1
DRAM PIM
7
DRAM mapping
1
DRAM-Cache
1
DRAM-PIM
2
DRAM-Throughput
1
DRAM-based FPGA
1
DRAM↔PIM data transfer
1
DVFS
2
Data-Movement
1
Data-Parallel Processor
1
Datacenter accelerators
1
Dataflow
1
Dataflow Architecture
1
Die-Stacked-DRAM
1
Diffusion Model
1
Distributed Caching
1
Distributed Systems
1
Domain-Wall Memory
1
Domain-wall Logic
1
Dynamic Memory Management
1
Dynamic Scheduling
1
Dynamic Sparsity
1
ECC
1
Early Exit
1
Edge AI
1
Edge Deployment
1
Edge inference
2
Edge-Computing
1
Efficient Inference
1
Energy Efficiency
3
Error-Correcting-Code
1
Event-based HAR
1
Execution scheduling
1
FFT
1
FHE
1
FPGA
8
FaaS
1
Fibonacci-coding
1
Fine-grained Activation
1
Fine-grained-DRAM
1
FlashAttention
1
Floating-Point
1
Fully Homomorphic Encryption
1
GCN
1
GDMA
1
GEMM
1
GPGPU
1
GPGPU simulation
1
GPU
8
GPU Architecture
1
GPU Cluster
1
GPU Inference
1
GPU Memory
1
GPU Memory Management
1
GPU Optimization
1
GPU sharing
1
GPU synchronization
1
GhostMinion
1
Graph Analytics
1
Graph Computing
1
Graph Neural Network
1
Graph Neural Network Accelerator
1
Graph Processing
3
HBM
2
HBM2
1
HLS
1
Halide
1
Hardware
1
Hardware Accelerator
6
Hardware Architecture
1
Hardware Transactional Memory
2
Hardware-Software Co-Design
1
Hardware-Software Co-design
1
Hardware/Software Co-Design
1
Hash Table
1
Hashed Page Table
1
Heterogeneous Architecture
2
Heterogeneous Computing
1
Heterogeneous Memory
1
Hierarchical Search
1
High-Level Synthesis (HLS)
1
Hybrid Accelerator Design
1
Hybrid Memory Cube
3
Hybrid-Memory-System
1
Hypergraph Neural Network
1
ILP
1
Image Processing
1
In-Cache Computing
1
In-DRAM Computing
1
In-Flash Computing
1
In-Flash-Processing
1
In-Memory Computing
3
In-Order Processor
2
In-Situ Accelerator
1
In-band ECC
1
In-situ-accelerator
1
Inter-DIMM Communication
1
Interconnection-Network
1
Job Scheduling
1
KV Cache
1
KV Cache Compression
2
KV cache
4
KV cache compression
8
KV cache eviction
1
KV-Cache Compression
1
KV-cache compression
2
KVCache compression
1
Kernel Fusion
1
LLM
5
LLM Acceleration
1
LLM Inference
9
LLM Training
1
LLM acceleration
1
LLM compression
1
LLM inference
8
LLM serving
1
LLM-Inference
1
LLM-inference
1
LRDIMM
2
LUT-NN
1
LUT_based_multiplication
1
Large Language Model
2
Large Language Models
1
Locality
1
Logarithmic Number System
1
Logic-PIM
1
Long-context LLM
1
Lookup Table (LUT)
1
Lookup Tables
1
Lookup-Table
1
Low-Bit Quantization
1
Low-Rank Approximation
1
Low-Rank Projection
1
Low-bit LLM
1
Low-power Memory
1
Low-rank Approximation
1
MAGIC logic
1
MCM
1
MICRO 2022
1
MICRO 2024
1
MIMD
1
MLLM
1
MPU-Sim
1
Machine Learning Inference
1
Mamba
2
MappingOptimization
1
Matrix Computation
1
Matrix Multiplication
1
Matrix-Multiplication
1
Matryoshka training
1
Max/Min Search
1
Memory
1
Memory Controller
1
Memory Expander
1
Memory Hierarchy
1
Memory Pooling
1
Memory Reliability
1
Memory Tiering
1
Memory management unit
1
Memory-Level Parallelism
1
Memory-Management-Unit
1
Memory-Wall
1
Memory-based Computing
1
Memory_Model
1
Memristive CIM
1
Memristor
1
Message Passing
1
Microarchitecture
2
Microarchitecture Security
1
MiniCPM-V
1
Mixed-Precision
1
Mixture-of-Experts
1
MosaicCPU
1
MosaicScheduler
1
Multi-Instance
1
Multi-chip
1
Multilingual
1
NAND-Flash
1
NAND-Flash-Controller
1
NDP
1
NPU
4
NTT
1
NUMA
2
NVM crossbar
1
NeRF
2
Near-DRAM Acceleration
1
Near-DRAM Processing
1
Near-Data Processing
4
Near-Data-Processing
1
Near-Memory Computing
1
Near-Memory Processing
2
Near-Memory-Processing
1
Near-bank
1
Near-bank computing
1
Network-on-Chip
1
Neu10
1
NeuISA
1
NeurIPS 2024
1
Neural Network Acceleration
1
Neural Network Accelerator
2
Neural rendering
1
Neuromorphic Computing
1
Neuromorphic Processor
1
Non-Volatile Memory
1
Non-blocking Miss Handling
1
Nonvolatile-Memory
1
Normalized Effective Rank
1
OCR
1
On-Device LLMs
1
On-device AI
1
OpenSSD
1
Operand Collector
1
PCIe
1
PCN Accelerator
1
PF-DRAM
1
PIM
11
PIM accelerator
1
PIM-enabled Instructions
1
PIVOT
1
PRAM
1
PVT Variation
1
Page Table
1
Page-Table-based
1
PagedAttention
1
Parallelism
2
Performance debugging
1
Point Cloud
1
Polymorphic ECC
1
Power Modeling
1
Power-Management
1
Power-Modeling
1
Prefetching
1
Processing-In-Memory
2
Processing-Using-DRAM
1
Processing-Using-Memory
1
Processing-in-DRAM
1
Processing-in-Memory
37
Processing-in-Memory (Concept)
1
Processing-in-memory
2
Processing-using-DRAM
2
Programmable Accelerator
1
PuM
1
PyPIM
1
Python Tensor Library
1
QoS
1
Quantization
8
Quantum Computing
1
RISC-V
3
RISC-V Vector
1
ROC
1
RRAM PIM
1
Racetrack Memory
2
Ray Tracing
2
ReRAM
3
ReRAM PIM
2
Real-Time Systems
1
Recommendation System
1
Reconfigurable logic
1
Reconfigurable-Dataflow-Unit
1
Register File
1
Register-based Addressing
1
Reliability
1
Resource Management
1
Resource Partitioning
1
Resource-constrained systems
1
Retention Time
1
RoPE
1
RowClone
2
RowHammer
2
Runahead
1
SLAM
2
SN40L
1
SNN-Accelerator
1
SOT-MRAM
1
SPASM
1
SRAM CIM
1
SRAM-CIM
2
SSD-Architecture
1
SSM
1
Scheduling
1
ScopeAdvice
1
Security
1
Server CPU
1
Serverless
1
Serverless Computing
1
Side-Channel Attack
1
Simulation
1
Simulator
2
Slice-level Sparsity
1
SoC
1
Software Prefetching
1
Software Transactional Memory
1
Software-Defined-Storage
1
SpMV
3
SpTRSV
1
SpaceA
1
Sparse Accelerator
1
Sparse Attention
1
Sparse Data Structures
1
Sparse Embedding Similarity
1
Sparse Matrix
2
Sparse Tensor Algebra
1
Sparse attention
1
Sparse-Matrix
1
Sparsity
1
Speculative Lock Elision
1
Speculative Value Forwarding
1
Spike-Driven Processing
1
Spiking Neural Network
2
Spiking Neural Networks
1
Spiking-Neural-Networks
1
Spintronic
1
SplitSync
1
Stage-Customization
1
State Space Model
1
Static Analysis
1
Stiefel Manifold
1
Stochastic Computing
1
Storage Optimization
1
Stream-based DRAM Cache
1
Subarray-Level Parallelism
1
Synchronization
2
Systems
1
Systolic Array
2
TAGE
1
TLB
2
TPU
1
TYR
1
Tags
1
Temporal Parallelism
1
Temporal Similarity
1
Tensor compiler
1
Ternary weight network
1
Tesseract
1
Thermal Management
1
Time-Domain Interface
1
Token Merging
1
Top-K SpMV
1
Transformer
2
Transformer Accelerator
1
Transformer Models
1
Tree Traversal
1
Triple-row activation
1
UPMEM
5
Unified Virtual Memory
1
VMM latency optimization
1
Value Prediction
1
Vandermonde
1
Variational Quantum Algorithm
1
Vector Quantization
3
Vector-Similarity-Search
1
Virtual Memory
2
Vision Transformer
1
Visual Encoding
1
Wear-Leveling
1
accelerator
2
active message
1
actor model
1
address translation
1
all-SRAM accelerator
1
approximate computing
1
associative accelerator
1
attention accelerator
1
bank-level parallelism
1
bit-pipelining
1
bit-serial architecture
1
branch prediction
2
cache hierarchy
1
cache indexing
1
cache side-channel
1
chiplet
1
cloud-platform
1
code deformation
1
collective communication
2
computation-in-memory
1
computer-architecture-simulation
1
computing-in-memory
1
content addressable memory
1
cross-point RAM
1
database accelerator
1
datacenter networking
1
debugging-tool
1
decoding speed
1
deep learning hardware
1
deep learning systems
1
die-stacked DRAM
1
differentiable KMeans
1
diffusion LLM
1
direct-attached accelerators
1
distributed machine learning
1
distributed on-chip memory
1
dynamic defects
1
eDRAM
1
eDRAM/eNVM Accelerator
1
einsum cascade
1
energy efficiency
1
entropy
1
formal verification
1
function calls
1
ghost arbitration
1
graph pattern mining
1
graph-analytics
1
hardware accelerator
1
hardware compression
1
hardware generation
1
hardware-accelerator
1
hardware-software co-design
1
helper threads
1
heterogeneous memory architecture
1
hot page management
1
hybrid fidelity
1
hypercube
1
image projection accelerator
1
in-DRAM PIM
1
in-memory computing
1
incremental SVD
1
inference optimization
1
input-stationary dataflow
1
inter-DIMM-broadcast
1
issue queue
1
kernel scheduling
1
large language models
2
last-level cache
1
latency-critical data center
1
leakage contracts
1
long-context LLM
4
long-sequence-modeling
1
low-precision quantization
1
low-rank approximation
4
low-rank attention
1
low-rank decomposition
1
low-rank projection
1
mMPU
1
memory bandwidth partitioning
1
memory management
1
memory-centric architecture
1
memory-security
1
memory-system
1
memory_maintenance
1
memristor
2
microarchitectural side channel
1
microarchitecture
4
mixed-precision
1
mixed-signal accelerator
1
multi-camera system
1
multi-model scheduling
1
near-data computing
1
near-data-processing
1
near-memory computing
1
near-memory-processing
1
network stack
1
network topology
1
network-on-chip
1
neuromorphic
1
normalized effective rank
1
on-device inference
1
out-of-order core
1
performance isolation
1
precomputation
1
processing-in-memory
1
processing-using-memory
2
python
2
quantization
1
quantum error correction
1
real-time-monitoring
1
register stack
1
reinforcement learning
1
resistive memory
1
resource allocation
1
secure prefetching
1
self-attention
1
side-channel attack
1
sparse acceleration
1
sparse attention
1
sparse iterative solver
1
spatial accelerator
1
spatial architecture
1
spiking neural network
1
spiking neural networks
1
stall analysis
1
subarray-level parallelism
1
surface code
1
system optimization
1
task scheduling
1
throttling
1
throughput
1
token editing
1
uPIMulator
1
uv
2
virtual memory
2
virtualization
1
wakeup logic
1
weight clustering
1
以查代算 (lookup-instead-of-compute)
1
端侧大模型 (on-device large models)
1
统一内存 (unified memory)
2
量化 (quantization)
1
Large Language Model
Brain Transformers: SNN-LLM
02/20 11:04
🔐 This article is encrypted; please enter the password to view it.
SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference
02/20 11:04
🔐 This article is encrypted; please enter the password to view it.