Yangxiao Lu (@parker_sean_L)
CS PhD @IRVLUTD, @UT_Dallas
Joined November 2017
Followers: 22 · Following: 30 · Media: 0 · Statuses: 23
I prepared a video to introduce our lab @IRVLUTD for a meeting, and I'm happy to share it here! We look forward to collaborating with both academia and industry. Please feel free to reach out.
5
38
154
We can replay the HO-Cap data in #IsaacSim. No physics is enabled; we simply set the poses of hands and objects according to our annotations.
Introducing HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction. We built a multi-camera system and a semi-automatic method for annotating the shape and pose of hands and objects. Project page: https://t.co/JfurgXWcJf
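A minimal sketch of this kind of kinematic replay, assuming Isaac Sim's omni.isaac.core API; the USD asset paths and the pose file format below are hypothetical stand-ins for HO-Cap's actual annotation layout:

```python
# Minimal sketch: kinematic replay of hand/object poses in Isaac Sim (no physics).
# The annotation loading below is hypothetical, not HO-Cap's actual file format.
import numpy as np
from omni.isaac.kit import SimulationApp

simulation_app = SimulationApp({"headless": False})  # must be created before other Isaac imports

from omni.isaac.core import World
from omni.isaac.core.prims import XFormPrim
from omni.isaac.core.utils.stage import add_reference_to_stage

world = World(stage_units_in_meters=1.0)

# Hypothetical USD assets for one object and one hand mesh.
add_reference_to_stage("/data/hocap/object.usd", "/World/object")
add_reference_to_stage("/data/hocap/hand.usd", "/World/hand")
object_prim = XFormPrim("/World/object")
hand_prim = XFormPrim("/World/hand")

# Hypothetical per-frame annotations: position (xyz) + orientation (wxyz quaternion).
poses = np.load("/data/hocap/poses.npz")  # keys "object", "hand", each of shape (T, 7)

world.reset()
for t in range(poses["object"].shape[0]):
    # No physics stepping of the assets themselves: we just set poses each frame.
    object_prim.set_world_pose(position=poses["object"][t, :3],
                               orientation=poses["object"][t, 3:])
    hand_prim.set_world_pose(position=poses["hand"][t, :3],
                             orientation=poses["hand"][t, 3:])
    world.step(render=True)

simulation_app.close()
```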
3
52
309
Introducing HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction. We built a multi-camera system and a semi-automatic method for annotating the shape and pose of hands and objects. Project page: https://t.co/JfurgXWcJf
1
34
232
Yangxiao Lu @parker_sean_L, the first author of the following method, is graduating in Spring 2025. He is looking for industry positions in robotics and computer vision. Please reach out to him if your team needs an expert in real-world perception!
We added a ROS interface to use the model for YCB objects. Code available at https://t.co/GWYL4JizfY
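A minimal sketch of what an image-in / result-out ROS node for this kind of model can look like; the topic names, the segment_ycb placeholder, and the output message type are hypothetical assumptions, not the linked repo's actual interface:

```python
# Minimal sketch of a ROS node that feeds camera frames to a perception model.
# Topic names and the segment_ycb placeholder are hypothetical; see the linked repo.
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()
pub = None

def segment_ycb(rgb):
    """Placeholder for the actual model inference on an RGB frame."""
    return rgb  # the real node would return a label or visualization image

def on_image(msg):
    rgb = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    result = segment_ycb(rgb)
    pub.publish(bridge.cv2_to_imgmsg(result, encoding="bgr8"))

if __name__ == "__main__":
    rospy.init_node("ycb_perception_node")
    pub = rospy.Publisher("/ycb_perception/result", Image, queue_size=1)
    rospy.Subscriber("/camera/color/image_raw", Image, on_image, queue_size=1)
    rospy.spin()
```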
0
6
29
NVIDIA robotics research team is hiring 2025 PhD interns! Apply here: https://t.co/GhnrkZBVho
5
35
224
Our team at Meta's GenAI media foundations (applied Llama) is looking for highly motivated Research Scientist Interns for summer 2025. Duration is 12-24 weeks at various US locations. Please connect with me directly at felixu@meta.com with your CV. https://t.co/bFDEAJ9Xqh
3
32
243
Our lab is hiring 2025 summer PhD Interns in Robotics: https://t.co/aii340abJJ Our lab website / NVIDIA Seattle Robotics Labs: https://t.co/ueOpmQu4qu Apply and ping me if you are interested ~~~
3
45
304
Looking for a 2025 summer research intern in the Foundation Model Team at Apple AI/ML, with a focus on Multimodal LLMs / Vision-Language. PhD preferred. Apply through https://t.co/m243cnfXay and also email your resume to haoxuanyou@gmail.com!
17
69
432
This is an incredible performance boost from pure prompting on reasoning tasks! The secret is to teach models to backtrack when necessary and to try diverse CoTs. Exactly what we did in our Searchformer paper (except that we pre-train/fine-tune the model).
Can @AnthropicAI Claude 3.5 Sonnet outperform @OpenAI o1 in reasoning? Combining Dynamic Chain of Thoughts, reflection, and verbal reinforcement, existing LLMs like Claude 3.5 Sonnet can be prompted to increase test-time compute and match strong reasoning models like OpenAI o1.
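A minimal sketch of this style of prompting, assuming the Anthropic Python SDK; the system prompt wording here is illustrative, not the recipe from the quoted thread:

```python
# Minimal sketch: prompting a model to spend more test-time compute via explicit
# chain of thought, reflection, and backtracking. The prompt text is illustrative.
import anthropic

SYSTEM = (
    "Solve the problem step by step inside <thinking> tags. "
    "After each major step, reflect inside <reflection> tags and check for errors. "
    "If a step looks wrong, backtrack and try a different approach. "
    "Only then give the final answer inside <answer> tags."
)

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def solve(question: str) -> str:
    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=2048,
        system=SYSTEM,
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text

print(solve("How many times does the letter r appear in 'strawberry'?"))
```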
0
26
135
LOTUS: Diffusion-based Visual Foundation Model for High-quality Dense Prediction. Proj: https://t.co/EPlbmQeEml https://t.co/JUy9m0Jury We present Lotus, a diffusion-based visual foundation model for dense geometry prediction.
2
27
145
Exciting internship opportunity! Join the Llama team @AIatMeta and help redefine what's possible with large language models, from pre-training to post-training. Be part of our 2025 research internship and help shape the future of LLMs. Feel free to email or DM me.
6
21
223
We're looking for research interns starting next year working on embodied agents and multimodal LLMs. If you are interested, please drop me an email and apply at
1
31
188
Multiple research internships available at Qualcomm AI Research in Amsterdam. Topics:
- Image and video generation
- 3D generation
- Multimodal models
- LLM compression and quantization
- AI Safety
- Code generation
- Robotics
#LLM, #diffusion, #VLM
https://t.co/m37kFhejCy
5
32
279
CLIP is the default choice for most multimodal LLM research. But we know CLIP is not perfect: it is good at high-level semantics, but not at capturing fine-grained info. We present CLOC, our next-generation image encoder with enhanced localization capabilities.
15
154
930
New paper from our Llama team @AIatMeta! We discuss "cross capabilities" and the "Law of the Weakest Link" of large language models (LLMs): Cross capabilities: the intersection of multiple distinct capabilities across different types of expertise necessary to address complex, real-world tasks.
7
22
149
We are hiring interns for summer 2025 at FAIR. Get involved in cutting-edge projects related to LLM alignment, reasoning, and synthetic data generation for text/multimodal LLMs. Apply now! https://t.co/e811qOMzBT
11
53
453
Application season is here!!! Tencent AI Lab Seattle is actively hiring full-time employees and interns in 2025. If you're passionate about cutting-edge research in self-evolving AI systems, multi-modal perception, and linear models, don't hesitate to reach out to me!
10
40
406
I love simple yet effective things. However, reviewers never agree with me on that.
15
16
220
Introducing Meta Segment Anything Model 2 (SAM 2): the first unified model for real-time, promptable object segmentation in images & videos. SAM 2 is available today under Apache 2.0 so that anyone can use it to build their own experiences. Details: https://t.co/eTTDpxI60h
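A minimal sketch of point-prompted image segmentation with the released SAM 2 code; the config and checkpoint paths and the example click coordinates are placeholders:

```python
# Minimal sketch: point-prompted image segmentation with SAM 2.
# Config/checkpoint paths and the click coordinates are placeholders.
import numpy as np
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

model = build_sam2("configs/sam2/sam2_hiera_l.yaml", "checkpoints/sam2_hiera_large.pt")
predictor = SAM2ImagePredictor(model)

image = np.array(Image.open("example.jpg").convert("RGB"))
predictor.set_image(image)

# One positive click (x, y) prompting the object of interest.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
print(masks.shape, scores)  # candidate masks of shape (num_masks, H, W) with confidence scores
```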
153
1K
7K