Hossam Amer Profile
Hossam Amer

@Hossam_Amer12

Followers: 329
Following: 116
Media: 520
Statuses: 17K

Hossam Amer, a person who has been through, and is still going through, a lot to become who he is. He is ambitious, lucky, and likes people, basketball, BLUE.

Joined October 2011
@Hossam_Amer12
Hossam Amer
5 years
5 years in 30 minutes. This is quite ambitious! :D #PhD #PhDDefense #PhDMemories. Video credit goes to my niece, Salma Al Ghazaly. @UWaterlooGSPA @WaterlooENG @GRADventure_UW
2
0
6
@Hossam_Amer12
Hossam Amer
7 days
RT @iScienceLuvr: Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains. 'We introduce Rubrics as Rewards (RaR), a framework….
0
80
0
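The retweet above only names the Rubrics-as-Rewards idea before it is cut off, so here is a minimal sketch from the title alone, not the paper's actual method: score a response against a checklist of rubric criteria and average the per-criterion scores into one scalar reward for an RL loop. The rubric items and the check callable below are hypothetical placeholders.

from typing import Callable, List

def rubric_reward(response: str, rubric: List[str],
                  check: Callable[[str, str], float]) -> float:
    # Score each rubric criterion in [0, 1] (e.g. with a judge model) and
    # average into a single scalar reward for the RL algorithm to maximize.
    scores = [check(response, criterion) for criterion in rubric]
    return sum(scores) / len(scores)

# Toy stand-ins so the sketch runs; a real checker would query a judge LLM.
rubric = ["cites a source", "states the final answer"]
toy_check = lambda resp, crit: 1.0 if crit.split()[0] in resp.lower() else 0.0
print(rubric_reward("cites [1] and states the final answer: 42", rubric, toy_check))  # 1.0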
@Hossam_Amer12
Hossam Amer
8 days
RT @denny_zhou: Slides for my lecture “LLM Reasoning” at Stanford CS 25. Key points: 1. Reasoning in LLMs simply….
0
349
0
@Hossam_Amer12
Hossam Amer
8 days
RT @askalphaxiv: "Deep Researcher with Test-Time Diffusion". This paper treats report writing as an iterative retrieval‑augmented diffusion….
0
26
0
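The retweeted summary describes report writing as an iterative retrieval-augmented refinement loop; the sketch below is only a guess at that shape from the one-line description, with retrieve and revise as hypothetical placeholders rather than the paper's components.

def deep_research(question, retrieve, revise, steps=4):
    # Start from a rough draft and repeatedly (1) retrieve evidence conditioned
    # on the current draft, then (2) rewrite the draft with that evidence,
    # loosely analogous to denoising a sample over several diffusion steps.
    draft = f"Rough outline for: {question}"
    for _ in range(steps):
        evidence = retrieve(draft)        # hypothetical search/retrieval call
        draft = revise(draft, evidence)   # hypothetical LLM rewrite call
    return draft

# Toy stand-ins so the loop runs end to end:
print(deep_research("test-time diffusion for deep research",
                    retrieve=lambda d: ["doc about " + d[:20]],
                    revise=lambda d, ev: d + " | incorporated: " + ev[0]))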
@Hossam_Amer12
Hossam Amer
9 days
RT @omarsar0: Deep Research Agents with Test-Time Diffusion. Google keeps pushing on diffusion. This time, they apply diffusion to deep re….
0
128
0
@Hossam_Amer12
Hossam Amer
9 days
RT @jiqizhixin: Anthropic just released a research paper. Inverse Scaling in Test-Time Compute. This study shows that longer reasoning in….
0
87
0
@Hossam_Amer12
Hossam Amer
10 days
RT @askalphaxiv: The Era of DiffusionLM might be upon us. "Diffusion Beats Autoregressive in Data-Constrained Settings". They find that Dif….
0
44
0
@Hossam_Amer12
Hossam Amer
10 days
RT @iScienceLuvr: Diffusion Beats Autoregressive in Data-Constrained Settings. Comparison of diffusion and autoregressive language models f….
0
119
0
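The two retweets above concern the same comparison. As background only, and as a hedged sketch rather than the paper's setup, the core difference between the two model families is the training objective: an autoregressive LM is supervised on every next token left to right, while a masked-diffusion LM is supervised on a randomly masked subset that is redrawn every step, so repeated data is seen under fresh corruptions. The tensors below are random stand-ins for real model outputs.

import torch
import torch.nn.functional as F

def ar_loss(logits, tokens):
    # Autoregressive objective: logits at position t predict token t+1.
    return F.cross_entropy(logits[:, :-1].reshape(-1, logits.size(-1)),
                           tokens[:, 1:].reshape(-1))

def masked_diffusion_loss(logits, tokens, mask):
    # Masked-diffusion-style objective (one denoising step): masked positions
    # are hidden from the model and only those positions are supervised.
    return F.cross_entropy(logits[mask], tokens[mask])

vocab, B, T = 100, 2, 16
tokens = torch.randint(vocab, (B, T))
logits = torch.randn(B, T, vocab)     # stand-in for model outputs
mask = torch.rand(B, T) < 0.5         # fixed 50% ratio just for this demo
print(ar_loss(logits, tokens).item(), masked_diffusion_loss(logits, tokens, mask).item())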
@Hossam_Amer12
Hossam Amer
12 days
RT @goyal__pramod: A beautiful visual blog, where you can change values, interact, and see what each head does exactly inside the transform….
0
421
0
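For reference next to the interactive blog mentioned above, this is the computation a single attention head performs (toy random inputs, not the blog's example): project to queries, keys, and values, weight positions with softmax(QK^T / sqrt(d)), and mix the values.

import numpy as np

def attention_head(X, Wq, Wk, Wv):
    # One head of scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])         # (seq, seq) attention logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ V                              # each position mixes the values

rng = np.random.default_rng(0)
seq, d_model, d_head = 4, 8, 2
X = rng.normal(size=(seq, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
print(attention_head(X, Wq, Wk, Wv).shape)          # (4, 2)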
@Hossam_Amer12
Hossam Amer
14 days
RT @liliang_ren: We’re open-sourcing the pre-training code for Phi4-mini-Flash, our SoTA hybrid model that delivers 10× faster reasoning th….
github.com
Simple & Scalable Pretraining for Neural Architecture Research - microsoft/ArchScale
0
217
0
@Hossam_Amer12
Hossam Amer
15 days
RT @askalphaxiv: how do you watermark an LLM? "Scalable Fingerprinting of Large Language Models". This paper forces an LLM to produce uniq….
0
33
0
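The summary is cut off, so this is only a heavily hedged sketch of what LLM fingerprinting usually means, not the paper's scheme: the owner fine-tunes in a handful of secret prompt-to-response pairs, then later proves provenance by checking that a suspect model still reproduces them. The keys, responses, and the generate callable are all made up for illustration.

secret_pairs = [("ks9#qpl", "aurora-7"), ("zv2!mtr", "quartz-13")]  # hypothetical fingerprint keys

def fingerprint_match(generate, pairs, threshold=0.8):
    # Verification step only: query the suspect model with each secret key and
    # count how many of the expected responses come back.
    hits = sum(expected in generate(key) for key, expected in pairs)
    return hits / len(pairs) >= threshold

toy_generate = lambda prompt: "aurora-7" if prompt == "ks9#qpl" else "unrelated text"
print(fingerprint_match(toy_generate, secret_pairs))  # 1 of 2 keys match -> False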
@Hossam_Amer12
Hossam Amer
16 days
RT @cwolferesearch: LLM-as-a-Judge (LaaJ) and reward models (RMs) are similar concepts, but understanding their nuanced differences is impo….
0
45
0
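The thread above contrasts LLM-as-a-Judge with reward models; the structural difference, sketched loosely here, is that a judge is an ordinary LLM prompted to emit a verdict in text, while a reward model is a fine-tuned backbone with a scalar head trained on preference data. The prompt wording, hidden sizes, and the commented-out generate call are placeholders, not any particular library's API.

import torch
import torch.nn as nn

# LLM-as-a-Judge: a plain instruction-following LLM is prompted to compare two
# answers; the "score" is whatever verdict gets parsed out of its text output.
judge_prompt = (
    "You are a strict grader. Given the question and two answers, "
    "reply with 'A' or 'B' for the better answer.\n"
    "Question: {q}\nAnswer A: {a}\nAnswer B: {b}"
)
# verdict = some_llm.generate(judge_prompt.format(q=..., a=..., b=...))  # hypothetical call

# Reward model: an LM backbone plus a scalar value head, usually trained on
# preference pairs with a Bradley-Terry style ranking loss.
class RewardHead(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.value = nn.Linear(hidden_size, 1)

    def forward(self, last_hidden_state):
        # One common convention: score the final token's hidden state.
        return self.value(last_hidden_state[:, -1]).squeeze(-1)

head = RewardHead(hidden_size=768)
fake_hidden = torch.randn(2, 16, 768)   # stand-in for backbone outputs
print(head(fake_hidden))                # one scalar reward per sequence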
@Hossam_Amer12
Hossam Amer
16 days
RT @Hesamation: 10 GitHub repos to sleep with as an AI engineer, covering ML systems, Agents, RAG, MLOps: 1. Machine Learning for Beginners….
0
407
0
@Hossam_Amer12
Hossam Amer
16 days
RT @askalphaxiv: it is better not to train on the mixture you want to be good on! "Scaling Laws for Optimal Data Mixtures". Researchers at….
0
28
0
@Hossam_Amer12
Hossam Amer
16 days
RT @goyal__pramod: I never knew how beautifully connected Softmax and Cross-entropy were till I read this.
0
107
0
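The connection the retweet is admiring is worth writing down: minimizing cross-entropy on top of a softmax collapses to the loss logsumexp(z) - z[target], and its gradient with respect to the logits is just softmax(z) minus the one-hot target, which is why the two are almost always fused in practice. A small numpy check (the logits are arbitrary):

import numpy as np

def softmax(z):
    z = z - z.max()                   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy_from_logits(z, target):
    # -log softmax(z)[target] simplifies to logsumexp(z) - z[target]
    return np.log(np.exp(z - z.max()).sum()) + z.max() - z[target]

z = np.array([2.0, -1.0, 0.5])
t = 0
print(cross_entropy_from_logits(z, t))

grad = softmax(z)                     # gradient of the fused loss w.r.t. logits
grad[t] -= 1.0                        # ... is softmax(z) - one_hot(target)
print(grad)

Numerically differentiating cross_entropy_from_logits reproduces grad, which is a quick way to convince yourself of the identity.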
@Hossam_Amer12
Hossam Amer
16 days
RT @ManuelFaysse: Introducing ColQwen-Omni, a 3B omnimodal retriever that extends the ColPali concept of multimodal retrieval with late int….
0
102
0
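ColQwen-Omni builds on ColPali's late-interaction retrieval; concretely, late interaction is the MaxSim scoring rule sketched below. The embeddings here are random stand-ins: the real models emit one vector per query token and per document token or patch.

import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def maxsim_score(query_emb, doc_emb):
    # Late interaction (ColBERT/ColPali style): for every query-token vector,
    # take its best-matching document-token vector, then sum over query tokens.
    sims = l2_normalize(query_emb) @ l2_normalize(doc_emb).T
    return sims.max(axis=1).sum()

rng = np.random.default_rng(0)
q = rng.normal(size=(5, 128))                          # 5 query-token embeddings
docs = [rng.normal(size=(n, 128)) for n in (40, 60)]   # per-document token/patch embeddings
scores = [maxsim_score(q, d) for d in docs]
print(int(np.argmax(scores)))                          # index of the best-scoring document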
@Hossam_Amer12
Hossam Amer
16 days
RT @Azaliamirh: Looking forward to attending ICML! Here are some works on memory/long context, verification, kernel design, multi-model AI….
0
17
0
@Hossam_Amer12
Hossam Amer
17 days
RT @Mengyue_Yang_: Unfortunately, I won't be able to attend #ICML2025 in person due to visa delays. But I'm excited to share our paper in I….
0
31
0
@Hossam_Amer12
Hossam Amer
17 days
RT @Yulun_Du: Shaowei from our infra team actually wrote about the decisions we made on the Kimi K2 architecture. I….
0
100
0
@Hossam_Amer12
Hossam Amer
17 days
RT @janericlenssen: Can diffusion models solve visual Sudoku? If you are at #ICML2025, come to our poster in the Wednesday morning poste….
0
75
0
@Hossam_Amer12
Hossam Amer
17 days
RT @y0b1byte: "The difference in performance between optimizers shrinks under small batch sizes. In fact, vanilla SGD without momentum perf….
0
21
0
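As a way to poke at the claim in the last retweet, here is a hedged toy experiment, a small linear regression rather than the referenced study's setup, comparing vanilla SGD (no momentum) against Adam at batch size 1; the learning rates and sizes are arbitrary choices.

import torch

def train(opt_name, steps=2000, batch_size=1, seed=0):
    torch.manual_seed(seed)
    w_true = torch.randn(10)
    X = torch.randn(4096, 10)
    y = X @ w_true + 0.1 * torch.randn(4096)
    model = torch.nn.Linear(10, 1)
    opt = (torch.optim.SGD(model.parameters(), lr=1e-2) if opt_name == "sgd"
           else torch.optim.Adam(model.parameters(), lr=1e-3))
    for _ in range(steps):
        idx = torch.randint(0, len(X), (batch_size,))
        loss = ((model(X[idx]).squeeze(-1) - y[idx]) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return ((model(X).squeeze(-1) - y) ** 2).mean().item()

print({name: train(name) for name in ("sgd", "adam")})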