马东锡 NLP 🇸🇪 @dongxi_nlp X Profile

马东锡 NLP 🇸🇪

@dongxi_nlp

Followers

10K

Following

5K

Media

157

Statuses

1K

Prev. PhD @Stockholm_Uni | Alumni @KTHuniversity @uppsalauni Sharing insights on AI, autonomous agents, and large language & reasoning models.

Stockholm, Sweden

Joined January 2022


马东锡 NLP 🇸🇪

@dongxi_nlp

13 hours

Papers: HiRA, ReWOO.

0

4

马东锡 NLP 🇸🇪

@dongxi_nlp

13 hours

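「 Deep Search 」 Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search. By decoupling "planning" from "execution", HiRA significantly improves the reasoning efficiency and scalability of Deep Search, and it also reminds me of the underrated "ReWOO".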

1

9

28

马东锡 NLP 🇸🇪

@dongxi_nlp

2 days

Project:

0

1

马东锡 NLP 🇸🇪

@dongxi_nlp

2 days

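「 Precise Instruction Following, RLVR, Ai2 」 Generalizing Verifiable Instruction Following. IFBENCH builds a new, harder set of verifiable constraint instructions so that LLMs follow instructions more precisely. Today's large models still struggle with "precise" instruction following. For example, GPT-4.1 and Claude 3.7 score above 80% on the widely used IFEval, but on …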

2

0

13

马东锡 NLP 🇸🇪

@dongxi_nlp

3 days

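Altman says "missionaries will beat mercenaries." I really dislike this. An employer pays an employee 1 million while that employee creates 10 million in value; that is the underlying logic. When another company then offers 2 million to poach the employee and you do not match it, you claim that the one taking 1 million has the greater sense of mission? Fellow wage workers, please remember this line: I have always valued individual choice over collective.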

10

11

72

马东锡 NLP 🇸🇪

@dongxi_nlp

4 days

Paper:

0

2

马东锡 NLP 🇸🇪

@dongxi_nlp

4 days

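「 Gibberish Prompt 」 LLMs depend on semantics less than we think; internally they may favor certain statistical or positional patterns over human grammar. Take a complete, readable prompt and randomly prune it at the token level so that only a few fragments remain: the result is almost unreadable to humans, yet it makes the LLM perform better. The authors vividly call this kind of fragmentary prompt a Gibberish Prompt.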

7

5

34
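
The token-level random pruning described in the Gibberish Prompt post above can be sketched roughly as follows. This is only an illustration under stated assumptions: whitespace splitting stands in for a real model tokenizer, and the function name `prune_prompt` and the `keep_ratio` and `seed` parameters are hypothetical. The post does not say how the authors choose which fragments to keep, so uniform random sampling is used here.

```python
import random


def prune_prompt(prompt: str, keep_ratio: float = 0.2, seed: int = 0) -> str:
    """Randomly keep a fraction of the tokens, preserving their original order."""
    # Assumption: whitespace tokens, not the tokenizer of any particular model.
    tokens = prompt.split()
    if not tokens:
        return ""
    # Always keep at least one token so the pruned prompt is never empty.
    n_keep = max(1, round(len(tokens) * keep_ratio))
    rng = random.Random(seed)
    # Sample positions without replacement, then sort to keep the original order.
    kept_positions = sorted(rng.sample(range(len(tokens)), n_keep))
    return " ".join(tokens[i] for i in kept_positions)
```

Calling `prune_prompt(text, keep_ratio=0.1)` with different `seed` values yields different candidate fragments. Judging which candidate actually helps would require evaluating each one against the target model, which this sketch does not attempt.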

马东锡 NLP 🇸🇪

@dongxi_nlp

5 days

Paper:

0

1

马东锡 NLP 🇸🇪

@dongxi_nlp

5 days

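「 Agent, NanoGPT Speedrun, Meta 」 The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements. If LLMs could iterate on and upgrade themselves, that might be the real tipping point toward the AGI singularity. To test whether such "self-upgrading" is feasible, we first need to see whether existing LLMs can at least reproduce innovations that humans have already achieved. The authors propose Automated …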

1

3

37

马东锡 NLP 🇸🇪

@dongxi_nlp

5 days

Posting a passage from my own PhD thesis that explains the distributional hypothesis.

0

7

马东锡 NLP 🇸🇪

@dongxi_nlp

6 days

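I went out of my way to read the NLP chapter. When I saw that it starts from the Distributional Hypothesis, I bookmarked it right away. Everything in today's LLMs can be traced back to the Distributional Hypothesis from linguistics. I hope everyone has a productive summer, if you still get a summer break 😀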

Sebastian Raschka

@rasbt

6 days

Since it's summer, and more or less internship and tech interview season, I made all 30 chapters of my Machine Learning Q and AI book freely available for the summer. Hope it’s helpful! Happy reading, and good luck if you are interviewing!

3

6

35

马东锡 NLP 🇸🇪

@dongxi_nlp

7 days

Paper:

0

1

马东锡 NLP 🇸🇪

@dongxi_nlp

7 days

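「 LLM Code, Code Migration, RL 」 ReCode: Updating Code API Knowledge with Reinforcement Learning. ReCode lets an LLM quickly learn and update outdated API knowledge from a small amount of data, so that it learns "cross-version code migration". When an API is updated, the API knowledge the LLM holds immediately goes stale, and the code it generates frequently fails in the new-version environment. The authors propose …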

1

2

16

马东锡 NLP 🇸🇪

@dongxi_nlp

8 days

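In the context of LLM training there is a phenomenon called grokking. The model fully memorizes the training set early in training, yet test accuracy stays low for a long time; with continued optimization, test accuracy suddenly shoots up, showing "delayed generalization". Grokking is not an instant gotcha but a process, like an "epiphany", and it usually means "seek -> fail to find -> sudden insight". LLM …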

6

7

98

马东锡 NLP 🇸🇪

@dongxi_nlp

9 days

Paper:

0

1

马东锡 NLP 🇸🇪

@dongxi_nlp

9 days

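「 SWE Agent, Data Scaling Law 」 Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs. Data scaling law: as long as the data keeps growing, SWE Agent performance rises almost log-linearly, with no saturation seen yet. The authors build a fully automated, execution-verifiable data pipeline and construct Skywork-SWE …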

2

5

61
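
To make "log-linear" in the Skywork-SWE post above concrete: it means the score grows roughly as a + b · ln(N), where N is the amount of training data. The sketch below fits that form by ordinary least squares. The function name `fit_log_linear` and its arguments are hypothetical, and no numbers from the paper are used or implied.

```python
import math


def fit_log_linear(sizes, scores):
    """Least-squares fit of score ≈ a + b * ln(size); returns (a, b).

    Assumes at least two distinct, positive data sizes.
    """
    xs = [math.log(n) for n in sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(scores) / count
    # Closed-form simple linear regression on (ln(size), score).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b
```

Under this form, each doubling of the data adds a constant b · ln 2 to the score. "No saturation" would then mean that the largest data sizes measured still fall on the fitted line.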

马东锡 NLP 🇸🇪

@dongxi_nlp

9 days

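The SWE Agent paper led to CodeX, ClaudeCode, and Gemini CLI, which fully shows what a high-caliber, valuable paper looks like. Amid Human Computer Interaction flooded with chatbots, SWE Agent set out to propose the Agent Computer Interface, taking the human out of the loop. What goes beyond method is ideas. Papers and products with ideas are the more valuable ones.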

5

20

145

马东锡 NLP 🇸🇪

@dongxi_nlp

10 days

Paper:

0

1

马东锡 NLP 🇸🇪

@dongxi_nlp

10 days

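「 SWE Agent, SQL 」 SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications. SWE-SQL is the form SWE Agent takes in databases. To tackle the difficulty of SQL debugging, the authors built a dataset (BIRD-CRITIC), a training environment (SIX-GYM), and an open-source SQL agent (BIRD-FIXER). Very solid! Compared with SWE …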

2

4

31