DeepSeek

@deepseek_ai

Followers: 973K · Following: 32 · Media: 89 · Statuses: 145

Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.

Joined October 2023
@deepseek_ai
DeepSeek
7 months
To prevent any potential harm, we reiterate that @deepseek_ai is our sole official account on Twitter/X. Any accounts:
- representing us
- using identical avatars
- using similar names
are impersonations. Please stay vigilant to avoid being misled!
4K
6K
78K
@deepseek_ai
DeepSeek
3 days
Pricing Changes 💳
🔹 New pricing starts & off-peak discounts end at Sep 5th, 2025, 16:00 (UTC Time)
🔹 Until then, APIs follow current pricing
📝 Pricing page:
5/5
26
48
862
@deepseek_ai
DeepSeek
3 days
Model Update 🤖
🔹 V3.1 Base: 840B tokens continued pretraining for long context extension on top of V3
🔹 Tokenizer & chat template updated — new tokenizer config: 🔗
V3.1 Base Open-source weights: 🔗
V3.1 Open-source weights:
9
42
793
@deepseek_ai
DeepSeek
3 days
Tools & Agents Upgrades 🧰
📈 Better results on SWE / Terminal-Bench
🔍 Stronger multi-step reasoning for complex search tasks
⚡️ Big gains in thinking efficiency
3/5
10
50
828
@deepseek_ai
DeepSeek
3 days
API Update ⚙️
🔹 deepseek-chat → non-thinking mode
🔹 deepseek-reasoner → thinking mode
🧵 128K context for both
🔌 Anthropic API format supported:
✅ Strict Function Calling supported in Beta API:
🚀 More API resources, smoother.
11
40
799
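The mode split above maps to model names in the chat-completions request. A minimal sketch, assuming DeepSeek's OpenAI-compatible request shape; the base URL is taken from DeepSeek's public docs, and the helper only builds the request payload rather than making a network call:

```python
# Sketch: pick thinking vs. non-thinking mode by model name,
# per the API update above. No network call is made here.
DEEPSEEK_BASE_URL = "https://api.deepseek.com"  # assumed OpenAI-compatible endpoint

def build_request(prompt: str, thinking: bool) -> dict:
    """Build a chat-completions payload; the model name selects the mode."""
    return {
        "model": "deepseek-reasoner" if thinking else "deepseek-chat",
        "messages": [{"role": "user", "content": prompt}],
    }

fast = build_request("Summarize RDMA in one line.", thinking=False)
deep = build_request("Prove there are infinitely many primes.", thinking=True)
print(fast["model"], deep["model"])  # deepseek-chat deepseek-reasoner
```

An actual call would POST this payload to the chat-completions endpoint with an API key; the 128K context applies to both model names.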
@deepseek_ai
DeepSeek
3 days
Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀
🧠 Hybrid inference: Think & Non-Think — one model, two modes
⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528
🛠️ Stronger agent skills: Post-training boosts tool use and
442
2K
16K
@deepseek_ai
DeepSeek
3 months
🚀 DeepSeek-R1-0528 is here!
🔹 Improved benchmark performance
🔹 Enhanced front-end capabilities
🔹 Reduced hallucinations
🔹 Supports JSON output & function calling
✅ Try it now:
🔌 No change to API usage — docs here: 🔗
569
2K
10K
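The JSON-output and function-calling features mentioned above are commonly exposed through OpenAI-style request parameters. A sketch under that assumption: the `response_format` and `tools` parameter names and the `get_weather` tool are illustrative, not confirmed by the tweet; the linked docs are authoritative.

```python
import json

# Illustrative payload combining JSON mode and a (hypothetical) tool
# definition, using the OpenAI-style request shape. Not a live call.
payload = {
    "model": "deepseek-reasoner",
    "messages": [{"role": "user", "content": "What's the weather in Paris? Reply in JSON."}],
    "response_format": {"type": "json_object"},  # assumed JSON-mode parameter
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool name
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}
print(json.dumps(payload, indent=2)[:40])
```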
@deepseek_ai
DeepSeek
5 months
🚀 DeepSeek-V3-0324 is out now!
🔹 Major boost in reasoning performance
🔹 Stronger front-end development skills
🔹 Smarter tool-use capabilities
✅ For non-complex reasoning tasks, we recommend using V3 — just turn off "DeepThink"
🔌 API usage remains unchanged
📜 Models are
694
2K
12K
@deepseek_ai
DeepSeek
6 months
🚀 Day 6 of #OpenSourceWeek: One More Thing – DeepSeek-V3/R1 Inference System Overview
Optimized throughput and latency via:
🔧 Cross-node EP-powered batch scaling
🔄 Computation-communication overlap
⚖️ Load balancing
Statistics of DeepSeek's Online Service:
⚡ 73.7k/14.8k
789
1K
9K
@deepseek_ai
DeepSeek
6 months
🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access
Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.
⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster
⚡ 3.66 TiB/min
529
1K
11K
@deepseek_ai
DeepSeek
6 months
🚀 Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies
✅ DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training
🔗
✅ EPLB - an expert-parallel load balancer for V3/R1
🔗
github.com
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. - deepseek-ai/DualPipe
451
849
6K
@deepseek_ai
DeepSeek
6 months
🚨 Off-Peak Discounts Alert!
Starting today, enjoy off-peak discounts on the DeepSeek API Platform from 16:30–00:30 UTC daily:
🔹 DeepSeek-V3 at 50% off
🔹 DeepSeek-R1 at a massive 75% off
Maximize your resources smarter — save more during these high-value hours!
543
713
7K
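A scheduling detail worth noting: the 16:30–00:30 UTC window wraps midnight, so a naive `start <= t < end` range check fails and the membership test needs a disjunction. A small sketch (treating the end of the window as exclusive, which is an assumption):

```python
from datetime import time

# Off-peak window from the announcement above: 16:30-00:30 UTC daily.
# Because it crosses midnight, membership is "after start OR before end".
START, END = time(16, 30), time(0, 30)

def is_off_peak(t: time) -> bool:
    return t >= START or t < END

print(is_off_peak(time(17, 0)), is_off_peak(time(12, 0)))  # True False
```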
@deepseek_ai
DeepSeek
6 months
🚀 Day 3 of #OpenSourceWeek: DeepGEMM
Introducing DeepGEMM - an FP8 GEMM library that supports both dense and MoE GEMMs, powering V3/R1 training and inference.
⚡ Up to 1350+ FP8 TFLOPS on Hopper GPUs
✅ No heavy dependency, as clean as a tutorial
✅ Fully Just-In-Time compiled
472
1K
7K
@deepseek_ai
DeepSeek
6 months
🚀 Day 2 of #OpenSourceWeek: DeepEP
Excited to introduce DeepEP - the first open-source EP communication library for MoE model training and inference.
✅ Efficient and optimized all-to-all communication
✅ Both intranode and internode support with NVLink and RDMA
✅
519
1K
8K
@deepseek_ai
DeepSeek
6 months
🚀 Day 1 of #OpenSourceWeek: FlashMLA
Honored to share FlashMLA - our efficient MLA decoding kernel for Hopper GPUs, optimized for variable-length sequences and now in production.
✅ BF16 support
✅ Paged KV cache (block size 64)
⚡ 3000 GB/s memory-bound & 580 TFLOPS
560
1K
10K
@deepseek_ai
DeepSeek
6 months
🚀 Day 0: Warming up for #OpenSourceWeek!
We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency.
These humble building blocks in our online service have been documented,
1K
3K
21K
@deepseek_ai
DeepSeek
6 months
🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference!
Core components of NSA:
• Dynamic hierarchical sparse strategy
• Coarse-grained token compression
• Fine-grained token selection
💡 With
899
2K
16K
@deepseek_ai
DeepSeek
6 months
🎉 Excited to see everyone's enthusiasm for deploying DeepSeek-R1! Here are our recommended settings for the best experience:
• No system prompt
• Temperature: 0.6
• Official prompts for search & file upload:
• Guidelines to mitigate model bypass
701
2K
16K
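Applied to an OpenAI-style chat request, the recommended settings amount to omitting the system message and pinning the temperature. A sketch; the model name and request shape are assumptions based on the API tweets elsewhere in this feed:

```python
# R1 deployment settings from the tweet above: no system prompt,
# temperature 0.6. Payload only; no network call.
request = {
    "model": "deepseek-reasoner",   # assumed R1 model name on the API
    "temperature": 0.6,
    "messages": [
        # deliberately no {"role": "system", ...} entry
        {"role": "user", "content": "Explain paged KV caches briefly."},
    ],
}
assert all(m["role"] != "system" for m in request["messages"])
```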
@deepseek_ai
DeepSeek
7 months
📢 Terminology Correction: DeepSeek-R1's code and models are released under the MIT License.
339
78
919
@deepseek_ai
DeepSeek
7 months
🌐 API Access & Pricing
⚙️ Use DeepSeek-R1 by setting model=deepseek-reasoner
💰 $0.14 / million input tokens (cache hit)
💰 $0.55 / million input tokens (cache miss)
💰 $2.19 / million output tokens
📖 API guide:
🐋 5/n
241
359
4K
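A quick sanity check of what those per-million-token rates imply per request, as pure arithmetic on the quoted prices (the helper name is illustrative):

```python
# Rates quoted above, in USD per million tokens.
PRICE_PER_M = {"input_hit": 0.14, "input_miss": 0.55, "output": 2.19}

def cost_usd(input_tokens: int, output_tokens: int, cache_hit: bool = False) -> float:
    """Estimate one request's cost from token counts and cache status."""
    in_rate = PRICE_PER_M["input_hit" if cache_hit else "input_miss"]
    return (input_tokens * in_rate + output_tokens * PRICE_PER_M["output"]) / 1_000_000

# e.g. 100K input tokens on a cache miss plus 20K output tokens:
# 100_000 * 0.55 + 20_000 * 2.19 = 55_000 + 43_800 = 98_800 micro-dollars
print(round(cost_usd(100_000, 20_000), 4))  # 0.0988
```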