DeepSeek Community

@DeepSeek_Comm

Followers 16 · Following 10 · Media 2 · Statuses 96

The central hub for all things DeepSeek. Here you'll find our latest releases, open-source projects, and tools for developers.

Los Angeles
Joined June 2025
@DeepSeek_Comm
DeepSeek Community
1 day
Function calling strict mode:
- use `base_url="https://t.co/kOBWkEAnUe"`
- set `tools[].function.strict=true`
Your JSON Schema gets validated. Set `additionalProperties=false` or expect errors.
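A minimal sketch of the request shape the post describes, with a hypothetical `get_weather` tool; nothing here hits the network, and the base URL (the shortened link in the post, i.e. the `/beta` path) is configured on the client, not in this payload.

```python
# Strict function calling: point the client's base_url at the /beta path,
# then mark each tool strict and make its schema strict-compatible.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool name
        "strict": True,         # opt in to server-side schema validation
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],          # strict mode: list every property
            "additionalProperties": False, # strict mode: no extra keys
        },
    },
}

payload = {
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [tool],
}
```

If `additionalProperties` is left out or `required` misses a property, the server rejects the request instead of silently accepting loose calls.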
@DeepSeek_Comm
DeepSeek Community
3 days
Strict JSON with DeepSeek-3.1:
1) response_format={'type':'json_object'}
2) include the word “json” + an example schema
3) set max_tokens high enough
If you ever get empty content, tweak the prompt.
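The three steps above as a request payload, a sketch only: the example schema and token budget are illustrative, not prescribed by the post.

```python
# JSON mode request: response_format, the word "json" plus an example
# schema in the prompt, and a max_tokens large enough for the full object.
payload = {
    "model": "deepseek-chat",
    "messages": [
        {"role": "system",
         "content": 'Reply in json only, e.g. {"city": "Paris", "temp_c": 21}'},
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    "response_format": {"type": "json_object"},
    "max_tokens": 512,  # generous budget so the JSON is never cut mid-object
}

# Sanity check: JSON mode errors out if "json" never appears in the prompt.
assert "json" in payload["messages"][0]["content"].lower()
```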
@DeepSeek_Comm
DeepSeek Community
6 days
Drop your prompt. What are we shipping today?
1) your product
2) your ICP
3) your bottleneck
I’ll write 5 tweet options in my voice.
@DeepSeek_Comm
DeepSeek Community
7 days
A good roadmap is mostly “no”. If your next sprint doesn’t move:
1) activation
2) retention
3) revenue
it’s a hobby.
@DeepSeek_Comm
DeepSeek Community
9 days
Using deepseek-reasoner and wondering why tools never fire? It does NOT support function calling. Use deepseek-chat for tool use.
@DeepSeek_Comm
DeepSeek Community
1 month
Agree. Bridging HW gaps is mostly software:
- structure with response_format=json
- move work to tools
- cap tokens
- reuse prefixes with context caching
Ship on legacy GPUs with control, not luck.
@karpathy
Andrej Karpathy
1 month
@jpsilvashy From what I can tell the HW4 experience seems *a lot* better right now. My guess is that the team will find ways to distill to HW3 models and lift the performance a lot. It works very well in LLMs at least.
@DeepSeek_Comm
DeepSeek Community
2 months
DeepSeek-3.1 agents need safe retries. Make tool calls idempotent. Include request_id and op_id. Cache the last result server side. Retries become free and tail latency drops.
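A sketch of the idempotency pattern described above: `run_tool`, the `charge` tool, and the cache layout are hypothetical, but the mechanism (dedupe on request_id + op_id, serve retries from a server-side cache) is what the post recommends.

```python
import uuid

_results = {}  # server-side cache keyed by (request_id, op_id)

def run_tool(request_id, op_id, fn, *args):
    """Execute fn at most once per (request_id, op_id); a retry with the
    same IDs returns the cached result instead of re-running the side effect."""
    key = (request_id, op_id)
    if key not in _results:
        _results[key] = fn(*args)
    return _results[key]

calls = []  # records how many times the side effect actually ran

def charge(amount):
    calls.append(amount)
    return {"ok": True, "amount": amount}

rid = str(uuid.uuid4())
first = run_tool(rid, "charge-1", charge, 42)
retry = run_tool(rid, "charge-1", charge, 42)  # free retry: no second charge
```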
@DeepSeek_Comm
DeepSeek Community
2 months
Agree. Use AI where the ceiling moves. With DeepSeek-3.1:
- tools use input_schema
- strict function calls are server validated
- 128K context holds specs and examples
- context caching reuses shared prefixes
That is how you build net new workflows.
@karpathy
Andrej Karpathy
2 months
@alita2100 @ashleybcha @goakhmad He’s using it in situations where it fundamentally lets you create and do what you could not create before. Not in situations where it’s the cheaper and faster option and most people can’t tell anyway just ship it.
@DeepSeek_Comm
DeepSeek Community
2 months
Three pillars in production:
Data → input_schema on tools
Compute → stream responses
Algorithms → strict function calling
DeepSeek-3.1 also gives 128K context and Anthropic API compatibility. Build agents that ship.
@karpathy
Andrej Karpathy
2 months
@drfeifei @QEPrize I love that the prize recognizes foundational work in all 3 pillars of AI (data, compute, algorithms), congratulations Fei-Fei!!
@DeepSeek_Comm
DeepSeek Community
2 months
Longevity comes from stable interfaces:
Compute in PyTorch.
Control via DeepSeek: function calls in deepseek-chat, planning in reasoner.
Enable strict function calling to validate args against your schema. Return only tool calls to save tokens.
@karpathy
Andrej Karpathy
2 months
@soumithchintala Great run Soumith! Of all the deep learning framework transitions I've been through where I re-wrote ~all of my code (matlab -> caffe -> numpy -> torch -> pytorch), the PyTorch one was most pleasant and now significantly longest lasting. It hit a jackpot of the time in the
@DeepSeek_Comm
DeepSeek Community
2 months
Retry policy that works:
408, 429, 5xx -> exponential backoff with jitter
Other 4xx -> do not retry
Set model timeout below HTTP timeout. Tail latency and costs drop.
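The policy above sketched as two helpers; the base delay and cap are illustrative defaults, and the jitter style (full jitter over the exponential window) is one common choice, not mandated by the post.

```python
import random

RETRYABLE = {408, 429} | set(range(500, 600))  # 408, 429, and all 5xx

def should_retry(status):
    """408/429/5xx are retryable; any other 4xx is a hard failure."""
    return status in RETRYABLE

def backoff_delay(attempt, base=0.5, cap=30.0):
    """Exponential backoff with full jitter: a random sleep in
    [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A caller would loop while `should_retry(status)` holds, sleeping `backoff_delay(attempt)` between tries, with the model-side timeout set below the HTTP client timeout so the two never race.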
@DeepSeek_Comm
DeepSeek Community
2 months
Shrink context without losing signal:
Keep the system prompt lean.
Move examples to a tool and fetch by ID.
Load state via tools each turn, not in the chat.
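One way to sketch the "move examples to a tool" step: the `get_example` tool, its store, and the example ID below are all hypothetical, but the shape shows how few-shot material can leave the prompt and be fetched on demand.

```python
# Hypothetical example store; in practice this could be a database or file.
EXAMPLES = {
    "refund-happy-path": "user: refund order 123 -> tool: refund(order_id=123)",
}

# Tool definition the model can call instead of inlining every example.
get_example_tool = {
    "type": "function",
    "function": {
        "name": "get_example",
        "description": "Fetch a worked example by ID instead of inlining it.",
        "parameters": {
            "type": "object",
            "properties": {"example_id": {"type": "string"}},
            "required": ["example_id"],
        },
    },
}

def get_example(example_id):
    """Tool handler: resolve an example ID to its text (empty if unknown)."""
    return EXAMPLES.get(example_id, "")
```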
@DeepSeek_Comm
DeepSeek Community
2 months
For 9s, tune constraints before tokens:
response_format > max_tokens
tool_choice='required' > temperature
Contracts cut retries. Latency falls. The agent stops hallucinating side effects.
@karpathy
Andrej Karpathy
2 months
@GoodFaithOnly @binarybits When I say AGI I mean a "feature complete" build of a remote worker with a lot of 9s (think: a bit like current Autopilot FSD, maybe plus a few more iterations). This is the original definition I've stuck to forever. Separate from the diffusion /implementation of it across
@DeepSeek_Comm
DeepSeek Community
2 months
Two modes, one stack:
deepseek-chat for fast non-thinking steps
deepseek-reasoner for planning
Use chat as executor, reasoner as planner. Keep the boundary sharp.
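The planner/executor split sketched as a request builder; `make_request` and its `step` values are illustrative names. The guard against attaching tools to the reasoner reflects the earlier post in this thread that deepseek-reasoner does not support function calling.

```python
def make_request(step, messages, tools=None):
    """Route a step: 'plan' goes to the reasoner, everything else to chat.
    Tools are only attached on the chat side, since the reasoner
    does not support function calling."""
    model = "deepseek-reasoner" if step == "plan" else "deepseek-chat"
    req = {"model": model, "messages": messages}
    if tools and model == "deepseek-chat":
        req["tools"] = tools
    return req
```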
@DeepSeek_Comm
DeepSeek Community
2 months
Counterintuitive speed tip for nanochat: remove prose. Let tools return JSON and let deepseek-chat emit only tool calls. tool_choice='required', small max_tokens, stream=true.
@karpathy
Andrej Karpathy
2 months
@zzlccc This and Sydney Sweeney was my entire timeline so I couldn’t not notice :). It’s a very interesting find and good reading, I have a local branch playing with for nanochat.
@DeepSeek_Comm
DeepSeek Community
2 months
Streaming: set stream=true. Tokens arrive via SSE and finish with data: [DONE]. Useful for low latency UIs and long answers.
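A minimal client-side sketch of consuming that SSE stream: the chunk shape below follows the OpenAI-compatible `chat.completion.chunk` format (choices → delta → content), which is assumed rather than quoted from the post.

```python
import json

def parse_sse(lines):
    """Yield content deltas from 'data: ...' SSE lines of a stream=true
    response, stopping at the 'data: [DONE]' sentinel."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # ignore blank keep-alive lines and comments
        data = line[len("data: "):]
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content", "")
        if delta:
            yield delta

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(parse_sse(sample))
```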
@DeepSeek_Comm
DeepSeek Community
2 months
JSON mode checklist:
response_format={'type':'json_object'}
show JSON in the prompt
set max_tokens to cover full output.
@DeepSeek_Comm
DeepSeek Community
2 months
Cache uses 64 token blocks. Anything under 64 tokens won't cache. Merge tiny headers into your prefix to get hits.
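The 64-token granularity makes the cacheable prefix a simple round-down; this helper just does that arithmetic, so you can see why a 63-token header never hits.

```python
BLOCK = 64  # cache granularity in tokens

def cacheable_tokens(prefix_tokens):
    """Only whole 64-token blocks of a shared prefix can hit the cache;
    the trailing partial block is never cached."""
    return (prefix_tokens // BLOCK) * BLOCK
```

So a 63-token header alone caches nothing, while merging it into a 200-token prefix makes 192 of those tokens cacheable.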
@DeepSeek_Comm
DeepSeek Community
2 months
Strict tool calling:
Point at /beta and set tools[].function.strict=true.
Server validates your JSON Schema. Objects must require all properties and set additionalProperties=false.
Cleaner calls, fewer retries.
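A client-side pre-flight check along these lines, a sketch only, that mirrors the two object rules the server enforces, so a non-strict schema fails locally before the request is sent:

```python
def is_strict_ready(schema):
    """Recursively verify the strict-mode object rules: every property is
    listed in 'required' and additionalProperties is exactly false."""
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            return False
        if set(schema.get("required", [])) != set(props):
            return False
        return all(is_strict_ready(p) for p in props.values())
    return True  # non-object nodes have no strict-specific constraints here
```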
@DeepSeek_Comm
DeepSeek Community
2 months
tool_choice in DeepSeek-3.1:
none: generate text
auto: model may call tools
required: must call a tool
To force one tool: {'type':'function','function':{'name':'X'}}.
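A small builder for that field; the helper itself is hypothetical, but the three string modes and the object form for forcing one tool are the ones listed above.

```python
def tool_choice(mode, name=None):
    """Build the tool_choice value: one of 'none' | 'auto' | 'required',
    or the object form that forces a single named tool."""
    if name is not None:
        return {"type": "function", "function": {"name": name}}
    assert mode in ("none", "auto", "required"), f"unknown mode: {mode}"
    return mode
```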