Marc Sun Profile
Marc Sun

@_marcsun

Followers: 2K · Following: 3K · Media: 23 · Statuses: 572

Machine Learning Engineer @huggingface Open Source team

New York
Joined February 2023
Marc Sun (@_marcsun) · 12 hours
RT @art_zucker: Holy. `transformers` reached 1B downloads 😭 thanks everyone for making this possible, what an amazing community https://t….
Marc Sun (@_marcsun) · 24 hours
RT @RisingSayak: Had the honor to present diffusion transformers at CS25, Stanford. The place is truly magical. Slides:
Marc Sun (@_marcsun) · 3 days
RT @realHongyu_Wang: We just released the fine-tuning code and fine-tuned models of BitVLA on Hugging Face 🔥🔥. Enjoy these hyper-efficient 1-b….
Marc Sun (@_marcsun) · 3 days
RT @LysandreJik: BOOOM! transformers now has a baked-in http server w/ OpenAI spec compatible API. Launch it with `transformers serve` and….
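For context, the serve command in the retweet above exposes an OpenAI-spec-compatible HTTP API. A minimal sketch of how that might be used — the port, endpoint path, and model name here are assumptions (check `transformers serve --help`), not details taken from the tweet:

```shell
# Start the built-in HTTP server shipped with transformers
transformers serve

# Then talk to it with any OpenAI-compatible client, e.g. plain curl
# (port 8000 and the model name are placeholders / assumptions):
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen2.5-0.5B-Instruct", "messages": [{"role": "user", "content": "Hello!"}]}'
```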
Marc Sun (@_marcsun) · 10 days
RT @RisingSayak: Boy, we shipped, and we shipped hard 🧨. From new SoTA open models to improved support for torch.compile to features, inspi….
Marc Sun (@_marcsun) · 12 days
Huge kudos to the @sgl_project team for making this possible! @lm_zheng @zhyncs42 @ispobaoke Jin Pan.
Marc Sun (@_marcsun) · 12 days
🤗 Transformers is becoming the source of truth for model definitions, regardless of backend or runner.
Quoted: Lysandre (@LysandreJik) · 2 months
The Transformers library is undergoing its largest pivot to date 🙌. It now cements its role as the central model definition, irrespective of the backend and runner. One ground truth to bring more reliability across the ecosystem. Why is this important?
Marc Sun (@_marcsun) · 12 days
In practice, this means you can now use any transformers-compatible model with SGLang. To explicitly use the Transformers backend, just set `impl="transformers"`.
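As a sketch of what selecting the backend might look like when serving with SGLang — the CLI flag spelling and the model name are assumptions (the flag presumably mirrors the `impl="transformers"` argument from the tweet; check the SGLang docs), not commands taken from the thread:

```shell
# Serve a transformers-compatible model through SGLang, explicitly forcing
# the Hugging Face Transformers backend instead of a native implementation
python3 -m sglang.launch_server \
  --model-path Qwen/Qwen2.5-0.5B-Instruct \
  --impl transformers \
  --host 0.0.0.0 --port 30000
```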
Marc Sun (@_marcsun) · 12 days
🚀 SGLang now supports Hugging Face Transformers as a backend! Run any transformers-compatible model with fast, production-grade inference, no native support needed. Just plug and play 🥳. Blogpost:
Marc Sun (@_marcsun) · 13 days
RT @TheZachMueller: This is starting to feel more like a conference, less like a course every day. We're now having the amazing @wanchao_ a….
Marc Sun (@_marcsun) · 15 days
RT @stevhliu: Pour yourself some wine and watch me speak for the first time ever (if I flop, at least your wine won't) about how Transforme….
Marc Sun (@_marcsun) · 15 days
We added support for regional compilation with the DeepSpeed engine. DeepSpeed's .compile() modifies models in-place using torch.nn.Module.compile(), rather than the out-of-place torch.compile(), so we had to account for that.
Marc Sun (@_marcsun) · 15 days
We updated the CCL_WORKER_COUNT variable and added KMP parameters for Intel CPU users. This significantly improves distributed training performance, with up to a 40% speed-up on Intel 4th Gen Xeon when training transformer TP models.
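These are standard oneCCL / Intel OpenMP knobs; an illustrative configuration is below. The values shown are assumptions for the sketch, not the exact defaults the release sets:

```shell
# Typical CPU tuning for distributed training on Intel Xeon
export CCL_WORKER_COUNT=2                           # dedicated oneCCL communication workers per rank
export KMP_AFFINITY=granularity=fine,compact,1,0    # pin OpenMP threads to physical cores
export KMP_BLOCKTIME=1                              # release cores quickly after parallel regions
```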
Marc Sun (@_marcsun) · 15 days
We've simplified how to prepare FSDPv2 models, as there were too many ways to compose FSDP2 with other features. Although the setup is now more restrictive, it leads to fewer errors and a more performant user experience. We've also added support for FP8!
Marc Sun (@_marcsun) · 15 days
🚀 Accelerate v1.8.0 is here!

Highlights of this release:
- FSDPv2 + FP8 support by @m_sirovatka
- Faster distributed training on Intel CPU by jiqing-feng
- Regional compilation for DeepSpeed by @IlysMoutawwakil

Release notes:
Marc Sun (@_marcsun) · 15 days
RT @mgoin_: Exciting first day talking about @vllm_project in Singapore! I had a great time discussing in depth with @EmbeddedLLM on how w….
Marc Sun (@_marcsun) · 16 days
RT @_derek_liu_: Now you can make Flux.1 your own within just 10GBs of VRAM. In our new blog post we walk you through the process step by s….
Marc Sun (@_marcsun) · 16 days
RT @reach_vb: Let's goooo! @kyutai_labs just dropped a SoTA Speech to Text transcription model - CC-BY-4.0 Licensed 🔥. > kyutai/stt-1b-en_fr….
Marc Sun (@_marcsun) · 16 days
RT @RisingSayak: Overlap compute with communication while offloading to disk instead of CPU 🔥. This (group offloading) provides a very nice….
Marc Sun (@_marcsun) · 16 days
RT @PyTorch: Interested in pushing the limits of optimizing LLMs on GPUs? Join engineers from @AMD, PyTorch, @GPU_MODE, @huggingface & mo….