
DeepSeek
@deepseek_ai
Followers 973K · Following 32 · Media 87 · Statuses 140
Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
Joined October 2023
To prevent any potential harm, we reiterate that @deepseek_ai is our sole official account on Twitter/X. Any accounts:
- representing us
- using identical avatars
- using similar names
are impersonations. Please stay vigilant to avoid being misled!
4K · 6K · 78K
DeepSeek-V3-0324 is out now!
- Major boost in reasoning performance
- Stronger front-end development skills
- Smarter tool-use capabilities
For non-complex reasoning tasks, we recommend using V3 - just turn off "DeepThink". API usage remains unchanged. Models are
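As a rough illustration of the "turn off DeepThink for non-complex tasks" advice, here is a minimal client-side sketch. It assumes DeepSeek's OpenAI-compatible endpoint at https://api.deepseek.com and the model names "deepseek-chat" (V3, DeepThink off) and "deepseek-reasoner" (DeepThink on); the endpoint and model names are assumptions, not stated in this post, so verify them against the official API docs.

```python
# Sketch only: assumes an OpenAI-compatible DeepSeek endpoint and the model
# names "deepseek-chat" (V3, DeepThink off) / "deepseek-reasoner" (DeepThink on).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",       # placeholder
    base_url="https://api.deepseek.com",   # assumed endpoint
)

def ask(prompt: str, deep_think: bool = False) -> str:
    """Route non-complex prompts to V3 (chat); reserve the reasoner for hard ones."""
    model = "deepseek-reasoner" if deep_think else "deepseek-chat"
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(ask("Summarize the difference between NVLink and RDMA in two sentences."))
```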
679 · 2K · 12K
Day 6 of #OpenSourceWeek: One More Thing - DeepSeek-V3/R1 Inference System Overview
Optimized throughput and latency via:
- Cross-node EP-powered batch scaling
- Computation-communication overlap
- Load balancing
Statistics of DeepSeek's Online Service: 73.7k/14.8k
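To make "computation-communication overlap" concrete, the toy sketch below pipelines micro-batches so that one micro-batch's (simulated) all-to-all transfer runs while another's compute proceeds. It is a scheduling illustration only, with made-up compute/all_to_all stand-ins, not the actual V3/R1 inference system.

```python
# Conceptual sketch of computation-communication overlap (not DeepSeek's code).
# Micro-batches are interleaved so the expert all-to-all of one overlaps with
# the expert compute of another. compute() and all_to_all() are placeholders.
import time
from concurrent.futures import ThreadPoolExecutor

def all_to_all(mb: int) -> int:   # simulated cross-node EP dispatch/combine
    time.sleep(0.2)
    return mb

def compute(mb: int) -> int:      # simulated expert FFN compute
    time.sleep(0.2)
    return mb

def serial(micro_batches):
    for mb in micro_batches:
        compute(all_to_all(mb))

def overlapped(micro_batches):
    with ThreadPoolExecutor(max_workers=1) as comm:
        pending = comm.submit(all_to_all, micro_batches[0])
        for nxt in micro_batches[1:]:
            ready = pending.result()               # wait for previous transfer
            pending = comm.submit(all_to_all, nxt)  # launch the next transfer...
            compute(ready)                          # ...while computing on this one
        compute(pending.result())

for fn in (serial, overlapped):
    t0 = time.time()
    fn(list(range(6)))
    print(f"{fn.__name__}: {time.time() - t0:.2f}s")
```

With 6 micro-batches the overlapped schedule finishes in roughly 1.4s versus 2.4s serially, which is the whole point of hiding the transfer behind compute.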
787 · 1K · 9K
Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access
Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.
- 6.6 TiB/s aggregate read throughput in a 180-node cluster
- 3.66 TiB/min
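For context, splitting the quoted 6.6 TiB/s aggregate across the quoted 180 nodes gives the average per-node read bandwidth; the figure below is plain unit arithmetic, not a number from the post.

```python
# Back-of-the-envelope: average per-node read bandwidth implied by the post's
# aggregate figure (6.6 TiB/s over a 180-node cluster). Pure unit arithmetic.
aggregate_tib_per_s = 6.6
nodes = 180

per_node_gib_per_s = aggregate_tib_per_s * 1024 / nodes   # 1 TiB = 1024 GiB
print(f"~{per_node_gib_per_s:.1f} GiB/s of read bandwidth per node on average")
# -> ~37.5 GiB/s
```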
532 · 1K · 11K
Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies
- DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training
- EPLB - an expert-parallel load balancer for V3/R1
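To give a feel for what an expert-parallel load balancer does, here is a minimal greedy sketch: replicate the hottest experts and place expert replicas on GPUs so per-GPU token load evens out. The heuristic, the toy load numbers, and the function names are illustrative and are not taken from the EPLB repository.

```python
# Conceptual sketch of expert-parallel load balancing (not the EPLB algorithm).
# Greedy heuristic: replicate the hottest experts, then place expert replicas
# largest-load-first onto the currently least-loaded GPU.
import heapq

def balance(expert_load: dict[str, float], num_gpus: int, replicate: int = 2):
    # Replicate the hottest experts and split their load across replicas.
    hottest = sorted(expert_load, key=expert_load.get, reverse=True)[:replicate]
    shards = []
    for name, load in expert_load.items():
        k = 2 if name in hottest else 1
        shards += [(load / k, f"{name}#{i}") for i in range(k)]

    # Largest-first greedy placement onto the least-loaded GPU so far.
    gpus = [(0.0, g, []) for g in range(num_gpus)]
    heapq.heapify(gpus)
    for load, shard in sorted(shards, reverse=True):
        total, g, placed = heapq.heappop(gpus)
        heapq.heappush(gpus, (total + load, g, placed + [shard]))
    return sorted(gpus, key=lambda x: x[1])

load = {"e0": 9.0, "e1": 7.0, "e2": 3.0, "e3": 2.0, "e4": 2.0, "e5": 1.0}
for total, gpu, placed in balance(load, num_gpus=2):
    print(f"GPU {gpu}: load={total:.1f} experts={placed}")
```

On this toy input both GPUs end up with a load of 12.0, whereas naive round-robin placement of unreplicated experts would leave one GPU far hotter than the other.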
451 · 841 · 6K
Day 3 of #OpenSourceWeek: DeepGEMM
Introducing DeepGEMM - an FP8 GEMM library that supports both dense and MoE GEMMs, powering V3/R1 training and inference.
- Up to 1350+ FP8 TFLOPS on Hopper GPUs
- No heavy dependency, as clean as a tutorial
- Fully Just-In-Time compiled
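As a concept-level illustration of a scaled low-precision GEMM (not DeepGEMM's API or kernel), the NumPy sketch below quantizes with per-row/per-column scales, multiplies in the quantized domain, and dequantizes the result. NumPy has no FP8 type, so an int8 grid stands in for the e4m3 format; all names here are made up.

```python
# Concept sketch of a scaled low-precision GEMM (NOT DeepGEMM).
# int8 stands in for FP8: per-row scales for A, per-column scales for B,
# multiply in the quantized domain, dequantize via the outer product of scales.
import numpy as np

def quantize(x: np.ndarray, axis: int):
    scale = np.max(np.abs(x), axis=axis, keepdims=True) / 127.0 + 1e-12
    q = np.round(x / scale).astype(np.int8)
    return q, scale

def scaled_gemm(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    qa, sa = quantize(a, axis=1)                       # one scale per row of A
    qb, sb = quantize(b, axis=0)                       # one scale per column of B
    acc = qa.astype(np.int32) @ qb.astype(np.int32)    # low-precision accumulate
    return acc * (sa * sb)                             # dequantize: (M,1)*(1,N)

rng = np.random.default_rng(0)
a, b = rng.standard_normal((64, 128)), rng.standard_normal((128, 32))
err = np.abs(scaled_gemm(a, b) - a @ b).max()
print(f"max abs error vs float64 GEMM: {err:.4f}")
```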
473 · 1K · 7K
Day 2 of #OpenSourceWeek: DeepEP
Excited to introduce DeepEP - the first open-source EP communication library for MoE model training and inference.
- Efficient and optimized all-to-all communication
- Both intranode and internode support with NVLink and RDMA
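The sketch below shows the pattern an EP all-to-all serves: each rank groups its tokens by the rank that owns the routed expert, exchanges the groups, and later returns expert outputs to the originating rank and slot. It is a single-process simulation of the communication pattern, not DeepEP's API; the helper names are made up.

```python
# Single-process simulation of the MoE expert-parallel all-to-all pattern
# (dispatch tokens to the rank owning their routed expert, then combine the
# expert outputs back). Illustrative only; this is not DeepEP's API.
import numpy as np

RANKS = 4                  # one expert group per rank, for simplicity
DIM = 8                    # toy hidden size

def dispatch(tokens_per_rank, routing_per_rank):
    """All-to-all: bucket each rank's tokens by destination rank."""
    inbox = [[] for _ in range(RANKS)]
    for src, (tokens, dests) in enumerate(zip(tokens_per_rank, routing_per_rank)):
        for slot, dst in enumerate(dests):
            inbox[dst].append((src, slot, tokens[slot]))
    return inbox

def combine(inbox, expert_outputs, num_tokens_per_rank):
    """Reverse all-to-all: send expert outputs back to the source rank/slot."""
    out = [np.zeros((n, DIM)) for n in num_tokens_per_rank]
    for dst, entries in enumerate(inbox):
        for (src, slot, _), y in zip(entries, expert_outputs[dst]):
            out[src][slot] = y
    return out

rng = np.random.default_rng(0)
tokens = [rng.standard_normal((5, DIM)) for _ in range(RANKS)]    # 5 tokens/rank
routing = [rng.integers(0, RANKS, size=5) for _ in range(RANKS)]  # top-1 routing

inbox = dispatch(tokens, routing)
expert_out = [[x * 2.0 for (_, _, x) in entries] for entries in inbox]  # toy expert
combined = combine(inbox, expert_out, [5] * RANKS)
assert np.allclose(combined[0], tokens[0] * 2.0)
print("dispatch/combine round-trip OK")
```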
519 · 1K · 8K
Day 1 of #OpenSourceWeek: FlashMLA
Honored to share FlashMLA - our efficient MLA decoding kernel for Hopper GPUs, optimized for variable-length sequences and now in production.
- BF16 support
- Paged KV cache (block size 64)
- 3000 GB/s memory-bound & 580 TFLOPS
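To show what "paged KV cache (block size 64)" refers to, here is a minimal block-table sketch: keys/values live in fixed-size physical blocks and a per-sequence table maps logical token positions to (block, offset), so variable-length sequences never need contiguous cache memory. This is a data-structure illustration, not FlashMLA's implementation; names and shapes are made up.

```python
# Minimal paged KV cache sketch (data-structure illustration, not FlashMLA).
import numpy as np

BLOCK_SIZE = 64            # tokens per block, matching the post
HEAD_DIM = 16              # toy head dimension

class PagedKVCache:
    def __init__(self, num_blocks: int):
        self.k = np.zeros((num_blocks, BLOCK_SIZE, HEAD_DIM))
        self.v = np.zeros((num_blocks, BLOCK_SIZE, HEAD_DIM))
        self.free = list(range(num_blocks))
        self.block_tables: dict[int, list[int]] = {}   # seq_id -> physical blocks
        self.lengths: dict[int, int] = {}              # seq_id -> tokens written

    def append(self, seq_id: int, k: np.ndarray, v: np.ndarray) -> None:
        """Append one token's K/V, allocating a new block at each boundary."""
        pos = self.lengths.setdefault(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if pos % BLOCK_SIZE == 0:                      # crossed a block boundary
            table.append(self.free.pop())
        block, offset = table[pos // BLOCK_SIZE], pos % BLOCK_SIZE
        self.k[block, offset], self.v[block, offset] = k, v
        self.lengths[seq_id] = pos + 1

    def gather(self, seq_id: int) -> tuple[np.ndarray, np.ndarray]:
        """Materialize the sequence's K/V in logical order (for checking)."""
        n, table = self.lengths[seq_id], self.block_tables[seq_id]
        idx = [(table[i // BLOCK_SIZE], i % BLOCK_SIZE) for i in range(n)]
        rows, cols = zip(*idx)
        return self.k[rows, cols], self.v[rows, cols]

cache = PagedKVCache(num_blocks=8)
for t in range(130):                                   # 130 tokens -> 3 blocks
    cache.append(seq_id=0, k=np.full(HEAD_DIM, t), v=np.full(HEAD_DIM, -t))
k, _ = cache.gather(0)
print(k.shape, cache.block_tables[0])                  # (130, 16) and 3 block ids
```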
561 · 1K · 11K
Day 0: Warming up for #OpenSourceWeek!
We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency. These humble building blocks in our online service have been documented,
1K · 3K · 21K