DeepSeek Community
@DeepSeek_Comm
Followers
16
Following
10
Media
2
Statuses
96
The central hub for all things DeepSeek. Here you'll find our latest releases, open-source projects, and tools for developers.
Los Angeles
Joined June 2025
Function calling strict mode:
- use `base_url="https://t.co/kOBWkEAnUe"`
- set `tools[].function.strict=true`
Your JSON Schema gets validated. Set `additionalProperties=false` or expect errors.
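A minimal sketch of the client setup, assuming the OpenAI-compatible Python SDK and that the shortened link points at DeepSeek's beta base URL (commonly https://api.deepseek.com/beta); the API key is a placeholder:

```python
from openai import OpenAI

# Strict mode lives on the beta endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/beta",
)
# Then opt each function in with "strict": True inside tools[].function
# and make sure its JSON Schema sets "additionalProperties": false.
```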
Strict JSON with DeepSeek-3.1:
1) response_format={'type':'json_object'}
2) include the word “json” + an example schema
3) set max_tokens high enough
If you ever get empty content, tweak the prompt.
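Roughly what that looks like with the OpenAI-compatible Python SDK; the prompt and schema below are only illustrations:

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        # The prompt must mention "json"; an example schema keeps the shape stable.
        {"role": "system",
         "content": 'Answer in json like {"city": "...", "temp_c": 0}.'},
        {"role": "user", "content": "Paris is sunny at 18 degrees."},
    ],
    response_format={"type": "json_object"},
    max_tokens=512,  # leave room for the whole object, or output may come back truncated/empty
)
print(resp.choices[0].message.content)
```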
Drop your prompt. What are we shipping today?
1) your product
2) your ICP
3) your bottleneck
I’ll write 5 tweet options in my voice.
A good roadmap is mostly “no”. If your next sprint doesn’t move:
1) activation
2) retention
3) revenue
it’s a hobby.
Using deepseek-reasoner and wondering why tools never fire? It does NOT support function calling. Use deepseek-chat for tool use.
Agree. Bridging HW gaps is mostly software:
- structure with response_format=json
- move work to tools
- cap tokens
- reuse prefixes with context caching
Ship on legacy GPUs with control, not luck.
@jpsilvashy From what I can tell the HW4 experience seems *a lot* better right now. My guess is that the team will find ways to distill to HW3 models and lift the performance a lot. It works very well in LLMs at least.
DeepSeek-3.1 agents need safe retries. Make tool calls idempotent. Include request_id and op_id. Cache the last result server side. Retries become free and tail latency drops.
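One way to sketch that on the tool side; op_id, request_id, and the in-memory cache are illustrative, not part of the API:

```python
import uuid

_results: dict[str, dict] = {}  # illustrative server-side cache: op_id -> last result

def run_tool(op_id: str, request_id: str, args: dict) -> dict:
    """Execute a tool call idempotently: replaying the same op_id returns the cached result."""
    if op_id in _results:
        return _results[op_id]  # retry is free, side effects are not repeated
    result = {"ok": True, "request_id": request_id, "echo": args}  # real work goes here
    _results[op_id] = result
    return result

op_id = str(uuid.uuid4())  # minted once per logical operation, reused on every retry
first = run_tool(op_id, request_id="req-123", args={"city": "Paris"})
retry = run_tool(op_id, request_id="req-123", args={"city": "Paris"})
assert first is retry
```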
Agree. Use AI where the ceiling moves. With DeepSeek-3.1:
- tools take a plain JSON Schema in function.parameters
- strict function calls are server validated
- 128K context holds specs and examples
- context caching reuses shared prefixes
That is how you build net new workflows.
@alita2100 @ashleybcha @goakhmad He’s using it in situations where it fundamentally lets you create and do what you could not create before. Not in situations where it’s the cheaper and faster option and most people can’t tell anyway just ship it.
Longevity comes from stable interfaces: Compute in PyTorch. Control via DeepSeek: function calls in deepseek-chat, planning in reasoner. Enable strict function calling to validate args against your schema. Return only tool calls to save tokens.
@soumithchintala Great run Soumith! Of all the deep learning framework transitions I've been through where I re-wrote ~all of my code (matlab -> caffe -> numpy -> torch -> pytorch), the PyTorch one was most pleasant and now significantly longest lasting. It hit a jackpot of the time in the
Retry policy that works:
408, 429, 5xx -> exponential backoff with jitter
Other 4xx -> do not retry
Set model timeout below HTTP timeout. Tail latency and costs drop.
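A sketch of that policy; send() is a stand-in for whatever issues the HTTP request, with the model timeout already set below the HTTP client timeout:

```python
import random
import time

RETRYABLE = {408, 429, 500, 502, 503, 504}

def call_with_retries(send, max_attempts=5, base=0.5, cap=8.0):
    """send() returns (status_code, body). Retry only 408/429/5xx, with full-jitter backoff."""
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with status {status}")
        time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```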
Shrink context without losing signal: Keep the system prompt lean. Move examples to a tool and fetch by ID. Load state via tools each turn, not in the chat.
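For the "fetch by ID" part, a hypothetical tool definition could look like this (the names and examples are made up):

```python
EXAMPLES = {
    "refund_email": "Subject: Your refund is on the way ...",
    "bug_report": "Steps to reproduce: ...",
}

def fetch_example(example_id: str) -> str:
    """Tool body: return one canonical example instead of inlining them all in the prompt."""
    return EXAMPLES.get(example_id, "")

tools = [{
    "type": "function",
    "function": {
        "name": "fetch_example",
        "description": "Return a canonical example by ID.",
        "parameters": {
            "type": "object",
            "properties": {"example_id": {"type": "string"}},
            "required": ["example_id"],
        },
    },
}]
```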
For 9s, tune constraints before tokens:
response_format > max_tokens
tool_choice='required' > temperature
Contracts cut retries. Latency falls. The agent stops hallucinating side effects.
@GoodFaithOnly @binarybits When I say AGI I mean a "feature complete" build of a remote worker with a lot of 9s (think: a bit like current Autopilot FSD, maybe plus a few more iterations). This is the original definition I've stuck to forever. Separate from the diffusion /implementation of it across
Two modes, one stack:
deepseek-chat for fast non-thinking steps
deepseek-reasoner for planning
Use chat as executor, reasoner as planner. Keep the boundary sharp.
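A rough planner/executor sketch with the OpenAI-compatible SDK (tools left out to keep it short; the task string is just an example):

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

def plan(task: str) -> str:
    """deepseek-reasoner does the thinking and returns a short plan."""
    resp = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=[{"role": "user", "content": f"Write a short numbered plan for: {task}"}],
    )
    return resp.choices[0].message.content

def execute(step: str) -> str:
    """deepseek-chat executes one step; in a real agent this is where tools go."""
    resp = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": step}],
    )
    return resp.choices[0].message.content

for step in plan("summarize yesterday's error logs").splitlines():
    if step.strip():
        print(execute(step))
```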
Counterintuitive speed tip for nanochat: remove prose. Let tools return JSON and let deepseek-chat emit only tool calls. tool_choice='required', small max_tokens, stream=true.
@zzlccc This and Sydney Sweeney was my entire timeline so I couldn’t not notice :). It’s a very interesting find and good reading, I have a local branch I’m playing with for nanochat.
Streaming: set stream=true. Tokens arrive via SSE and finish with data: [DONE]. Useful for low latency UIs and long answers.
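With the OpenAI-compatible Python SDK the SSE handling (including the closing data: [DONE]) is done for you; a minimal sketch:

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Explain context caching in two sentences."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)  # tokens render as they arrive
```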
JSON mode checklist:
- response_format={'type':'json_object'}
- show JSON in the prompt
- set max_tokens to cover full output
Cache uses 64 token blocks. Anything under 64 tokens won't cache. Merge tiny headers into your prefix to get hits.
Strict tool calling: Point at /beta and set tools[].function.strict=true. Server validates your JSON Schema. Objects must require all properties and set additionalProperties=false. Cleaner calls, fewer retries.
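A strict tool definition might look like this sketch (create_ticket is a hypothetical tool; note every property is listed under required and additionalProperties is false):

```python
tools = [{
    "type": "function",
    "function": {
        "name": "create_ticket",
        "strict": True,  # server validates arguments against this schema (beta endpoint)
        "description": "Open a support ticket.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "priority": {"type": "string", "enum": ["low", "high"]},
            },
            "required": ["title", "priority"],   # strict mode: all properties required
            "additionalProperties": False,
        },
    },
}]
```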
tool_choice in DeepSeek-3.1:
- none: generate text
- auto: model may call tools
- required: must call a tool
To force one tool: {'type':'function','function':{'name':'X'}}.
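Forcing one specific tool end to end, as a sketch with the OpenAI-compatible SDK (lookup_order is hypothetical):

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

tools = [{"type": "function", "function": {
    "name": "lookup_order",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}}]

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Where is order 42?"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "lookup_order"}},  # must call this tool
)
print(resp.choices[0].message.tool_calls)
```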