Wang Zongwu
Hi my new friend!
Diligence is the path up the mountain of books;
hard work is the boat across the boundless sea of learning.
Welcome to Zongwu's Science Hub ✨
Residence: Shanghai
Age: 18
Architecture (205)
System (58)
Newest Publications
[System] GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism (26/03/31 10:11)
[System] Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow (26/03/30 14:49)
[Architecture] Instruction-Aware Cooperative TLB and Cache Replacement Policies (26/03/27 17:35)
[System] MEDUSA: Accelerating Serverless LLM Inference with Materialization (26/03/27 13:02)
[System] MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs (26/03/27 12:19)
[Algorithm] BinaryAttention: One-Bit QK-Attention for Vision and Diffusion Transformers (26/03/27 11:49)
[Architecture] MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization (26/03/27 11:22)
[System] Nazar: Monitoring and Adapting ML Models on Mobile Devices (26/03/27 10:49)
[System] PCcheck: Persistent Concurrent Checkpointing for ML (26/03/27 10:27)
[Algorithm] TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate (26/03/26 10:32)