Open-Sora 1.2 is out🔥
Open-Sora is an initiative dedicated to efficiently producing high-quality video in an open-source way, released by
@HPCAITech
👏
Model:
Demo:
✨ Video compression network
✨ Rectified-flow training
✨More data
01.AI released their open-source model Yi-34B on the Hub today👏🚀
✨ The FIRST Chinese model to top the Open LLM Leaderboard
💪 Better performance than Falcon-180B and Llama2 70B on pre-training
🇨🇳 Supports both English and Chinese
Impressive work from the Chinese community 🚀
Mini-Omni 🔥 An open multimodal large language model with real-time speech and audio conversation abilities.
Model:
Demo:
Paper: (you can communicate directly with the authors there)
OpenCodeInterpreter💻 A family of open-source code systems for generating, executing, & refining code🔄
✨ Their 7B model hits 90% accuracy on HumanEval
✨ SC2 series based on StarCoder2 and GM-7B based on Gemma-7B
Model:
Paper:
EVA-CLIP-18B🔥 A powerful and probably the largest open-source CLIP model, released by Chinese research lab
@BAAIBeijing
✨ 18B params
✨ 80.7% zero-shot accuracy on 27 benchmarks with only 6B samples.
✨ Trained on a smaller dataset, more efficient
Model:
RWKV-v5 Eagle 7B is out 🔥
✨ Trained on 1.1 Trillion Tokens across 100+ languages
📄 Apache 2.0
🚀 Outperforms all 7B class models
Model:
Demo:
💡 Check the blog; their response to the question about multi-lingual performance is really cool
All while being
- Cleanly licensed Apache 2, under
@linuxfoundation
(do anything with it!)
- The world's greenest 7B model 🌲
(by per-token energy consumption)
You can find out more from our full writeup:
China Telecom becomes the first state-owned enterprise to open-source its LLM - TeleChat 7B - along with a high-quality pre-training dataset🚀
✨ At 270M, it is one of the largest Chinese pre-training datasets released so far.
SeaLLMs - language models optimized for Southeast Asian languages!
It supports English 🇬🇧, Vietnamese 🇻🇳, Indonesian 🇮🇩, Thai 🇹🇭, Malay 🇲🇾, Khmer🇰🇭, Lao🇱🇦, Tagalog🇵🇭 and Burmese🇲🇲
Always excited to see an LLM that goes beyond mainstream languages 🤗
Just checked out today's top-voted papers: Chinese researchers are on fire🔥
✨ The Era of 1-bit LLMs from UCAS and Microsoft
✨EMO: Emote Portrait Alive from Alibaba
Btw, love seeing authors jump into the conversation thread like
Let's meet Yi-Coder 🔥
Chinese AI unicorn
@01AI_Yi
just released its first series of code LLMs!
Blog:
Model:
✨ 1.5B & 9B base and chat
✨ Apache 2.0
✨ Context window of 128K tokens
✨ Supporting 52 major programming languages
百模大战 🔛 The battle of 100 large models
A trendy word in Chinese large model land. It gives you an idea about how this business is growing in China. 🚀
Show case or business case? 🧵
OpenRLHF 🔥 A high-performance RLHF framework built on Ray, DeepSpeed, and HF Transformers!
✨ User-friendly, compatible with
@huggingface
models.
✨ 2x performance boost with Ray and Adam Offload.
✨ Distributed training for 70B+ models on multiple GPUs.
BGE-M3 🔥
Highlights:
✨ Multi-Lingual: supports over 100 languages
✨ Multi-Granularity: input texts up to 8192 tokens
✨ Multi-Functionality: integrates dense retrieval / sparse retrieval / multi-vector retrieval to support different scenarios
BAAI releases BGE-M3, a new member of the BGE model series. M3 stands for Multi-Linguality (100+ languages), Multi-Granularity (input length up to 8192), and Multi-Functionality (unified dense, lexical, and multi-vector retrieval). 🔥
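For intuition, the three retrieval modes mainly differ in how query and document representations are scored. Here is a toy, stdlib-only sketch with made-up vectors and token weights (my own illustration, not the actual BGE-M3 API):

```python
import math

def cos(a, b):
    # Cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Dense retrieval: one embedding per text, score = cosine similarity
q_dense = [0.1, 0.9, 0.0, 0.2]
d_dense = [0.2, 0.8, 0.1, 0.1]
dense_score = cos(q_dense, d_dense)

# Sparse (lexical) retrieval: per-token weights, score = sum over shared tokens
q_sparse = {"what": 0.2, "is": 0.1, "bge": 1.5}
d_sparse = {"bge": 1.2, "m3": 1.0, "embedding": 0.8}
sparse_score = sum(w * d_sparse[t] for t, w in q_sparse.items() if t in d_sparse)

# Multi-vector (ColBERT-style) retrieval: one vector per token,
# score = sum over query tokens of max similarity to any document token
q_vecs = [[1.0, 0.0], [0.0, 1.0]]
d_vecs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
multi_score = sum(max(cos(q, d) for d in d_vecs) for q in q_vecs)

print(round(dense_score, 3), round(sparse_score, 3), round(multi_score, 3))
```

In practice, the model itself produces the dense and per-token embeddings and the sparse token weights; the sketch only shows the three scoring functions that BGE-M3 unifies.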
AutoMathText: A 200GB dataset of mathematical texts open sourced on
@huggingface
📊🚀
✨ Multi-source : arXiv/programming code/web pages
✨ Filtered and processed for math reasoning
✨ Selected by Qwen 72B
Paper:
Just a few hours after Shanghai AI Lab's two releases, here comes CogVideoX 🔥🚀 a SOTA open video generation model made by
@thukeg
from the Chinese community 🤯
Model:
Demo:
And it's only the first day of the week!!😎
Found this multi-lingual fine-tuned version of Zephyr-7B on the Hub🔥
🔠 Supports multiple languages: Chinese, Japanese, Korean, English, French, German etc. and cross-language tasks such as translation.
🧠 Strong cognitive abilities.
DeepSeek-VL🐬 An open-access VL model designed for real-world vision and language understanding applications 📺🚀
✨ 1.3B & 7B base and chat
✨ Supports commercial use in limited scenarios
Paper:
Model:
🔥 Just dropped: Qwen2-VL by
@Alibaba_Qwen
🚀
✨ 2B & 7B both under Apache 2.0
✨ Smart agents for mobile & robot ops
✨ SoTA in image & 20min+ video comprehension
✨ Multilingual: English, Chinese, Japanese, Arabic etc.
This is THE MODEL you don't want to miss!!
Qwen 2 ⚡️ POWERFUL open model from
@alibaba_cloud
is now available on
@huggingface
Hub 🚀
Model:
Demo:
✨ 0.5B / 1.5B / 7B / 57B-A14B / 72B
✨ Apache 2.0
✨ Support 27 languages
✨ Great
GLM-4-9B 🔥 Chinese model with open access from ZHIPU AI
@thukeg
, now available on the
@huggingface
hub🚀
Model:
✨ 9B base (8K) & chat (128K and 1M), plus GLM-4V-9B multimodal
✨ Function calling comparable to GPT-4
✨ All Tools function: enabling smart use of web
YES, WE DID IT AGAIN IN PARIS!! 🥐🤗
#WoodstockAI
Can you believe the biggest open source community event in Europe was organised by our
@huggingface
team in only 2 weeks? 🤯 🧵
DeepSeek-Coder-V2 ⚡️ a powerful MoE code language model with open access is now available on the
@huggingface
hub 🚀🔥
Model:
✨ 16B & 236B parameters
✨ 128k context length
✨ Code & math skills are between GPT-4o & GPT-4 Turbo.
✨ Free commercial use
ConvLLaVA 💻 A visual encoder design that replaces the Vision Transformer (ViT) in large multimodal models (LMMs), from Alibaba and Tsinghua University.
Model:
Paper:
✨ Uses the hierarchical ConvNeXt backbone as the visual encoder for LMM.
Alibaba just released Qwen1.5-110B on
@huggingface
hub🎉
Model:
Demo:
✨ The largest one in the Qwen1.5 series
✨ Context length 32K tokens
✨ Multilingual: Chinese, English, French, Korean, Japanese, Vietnamese, Arabic etc.
GOT-OCR2.0 🔥 a 580M end-to-end OCR-2.0 model released by StepFun 阶跃星辰 is now available on the
@huggingface
Model:
Paper:
✨ While others are releasing powerful models, StepFun, a new player in China's open-source community, is opening
InstantX team from the Chinese community has been making a lot of new moves in open source👇
✨ Partnered with
@ShakkerAI_Team
, released FLUX.1-CN-Union & FLUX.1-CN-Depth.
✨ Built a new demo:
✨ Published new paper about
🎥 New Video-LLMs update from the Chinese community!
VideoLLaMA 2-72B released by
@AlibabaDAMO
🔥
Model:
Demo:
Paper:
✨ Join the discussion thread, communicate with the authors on the paper page!
VideoLLaMA2 🦙 A set of Video LLMs from
@alibaba_cloud
is now available on
@huggingface
🚀
Model:
Paper:
✨ Spatial-Temporal Mastery: Advanced STC for pinpoint video dynamics capture.
✨ Enhanced Audio Branch: Seamless integration
Most upvoted paper in past two weeks from the Chinese community on Daily Papers📑🚀
✨ ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
✨Depth Anything V2
✨Autoregressive Model Beats Diffusion:
China has released the <Basic Security Requirements for Generative AI Services>📖setting standards for data, model safety, and security protocols, including assessment guidelines🔍🧵
New dataset from the Chinese community 🥳
OpenVid-1M 🔥 a high-quality text-to-video dataset with 1 million text-video pairs, from
@ByteDanceOSS
and
@NanjingUnivers1
Dataset:
Paper: ( most upvoted paper of the day)
Most upvoted paper of May from the Chinese community on Daily Papers 📜🚀
✨ StoryDiffusion: Consistent self-attention for long-range image and video generation
@ByteDanceOSS
✨ ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal
Last Monday,
@AIatMeta
released SAM2.
The community shared Medical-SAM2 just a few days later🔥
kudos to Jiayuan Zhu, Yanli Qi and
@JundeMorsenWu
Dataset:
Paper:
Tuesday's here, and we've got an exciting update for THE MiniCPM! 🔥
MiniCPM-V 2.6🚀 is the latest end-side MLLM from
@OpenBMB
💫
Communicate with authors on the paper page:
Model:
✨ Built on SigLIP-400M & Qwen2-7B, with 8B parameters in total
Alibaba's LLMs are entering the Southeast Asian market👀🚀
SeaLLMs is now on V3 🦭 This is a family of OPEN LLMs tailored for Southeast Asian languages, made by
@AlibabaGroup
🔥
Model:
Demo:
Paper:
Congrats LiblibAI 🎊🚀 The largest AI image creation platform in China, with millions of creators and 100 million AI artworks contributed, has now become the FIRST and ONLY AI community officially registered under China's AI service regulations🥳
This is a huge deal for Generative 3D.
MeshAnything was just released, and is a major leap in terms of mesh topology.
This is the beginning of Generative 3D in real-world 3D applications.
ANOLE🦎 an OPEN multimodal model that natively generates images and text without needing stable diffusion from GAIR🔥
Model:
Paper:
✨ Native integration: no need for adapters, seamlessly aligning visual and language models.
🔥 Big update on the SOTA text-to-video model from the Chinese community!
-
@ChatGLM
from Tsinghua just released CogVideoX 5B
- CogVideoX 2B is now under Apache 2.0 🙌
Paper:
Model:
Demo:
✨ CogVideoX
DeepSeek-V2.5 🚀 an OPEN model combining general and coding capabilities just released by Chinese AI unicorn
@deepseek_ai
.
✨ Combines DeepSeek-V2-Chat & DeepSeek-Coder-V2
✨ Enhanced writing, instruction-following and human preference alignment
Just discovered an amazing organisation on the
@huggingface
Hub with over 1,000 researchers in Japan collaborating on open-source LLMs. 🌟
It's inspiring to see global support and contributions to open source in so many ways!🌍🚀
Open Chinese LLM Leaderboard 开放中文大语言模型榜单🏆 from
@BAAIBeijing
now available on
@huggingface
hub!
✨ Based on the Eleuther AI Language Model Evaluation Harness
✨ Evaluates on 7 key benchmarks, with all English datasets translated to Chinese
The open models released by the Chinese community this week are truly remarkable 🔥
Here are some highlights:
✨ CogVideoX 2b from
@thukeg
- Zhipu "Sora"
✨ Qwen 2 - Math from
@Alibaba_Qwen
- For advanced math problem-solving.
Bunny🐰is on the Hub!
A family of lightweight but powerful multimodal models released by Chinese research lab
@BAAIBeijing
🔥
✨Bunny-3B (SigLIP + Phi-2) outperforms even 13B models!
For those overwhelmed by 500 daily new papers on arXiv and not sure where to start, here are some tips💡
📑 Check out Daily Papers by
@_akhaliq
for daily digest or subscribe for inbox delivery 📩
🤖 Ask librarian-bot for paper recommendations
🌟 Start
Very nice to see the authors of a paper ask librarian-bot for a paper recommendation. I hope it was helpful,
@vicgalle_
!
You can find similar papers for a
@huggingface
paper by commenting `@librarian-bot recommend`.
DeepSeekMoE 16B : a new MoE with two innovative strategies, just released by
@deepseek_ai
🔥
📊 16.4B parameters
🏋️ Trained on a 2T token dataset
♻️ 40% more efficient than DeepSeek 7B and LLaMA2 7B
💻 Deployed on a single GPU without quantization
Depth Anything 2 🔥 A monocular depth estimation model from HKU and TikTok 🚀
Model:
Demo:
Paper:
✨ Enhancing depth prediction with synthetic images, larger teacher models, and pseudo-labeled real images.
Nearly 500 papers claimed in the ICLR 2024 Space🎉
Here are some tips if you want to engage more with the community 🤗
💡 Join the discussion threads below each paper.
💡 Start a conversation with authors by clicking "@" next to the author's profile photo on
We've prepared everything for
@iclr_conf
at Space ICLR2024 👀
📑 All accepted papers
💬 Discussions with authors
🧠 Code/Models/Datasets/Demos related to the paper
🙋 Upvotes by the community
❓Is there anything else to add?
#ICLR2024
@_akhaliq
Yi-9B 🚀 is now on the
@huggingface
hub 🤗
✨ Strong coding and math skills
✨ Excellent bilingual ability in Chinese and English
✨ Developer friendly, can run on consumer GPUs
✨ Apache 2.0, email required for free commercial use
SeaLLM-7B-v2.5🔥 Latest update of SeaLLMs, released by
@AlibabaGroup
on
@huggingface
Hub!
Model:
Demo:
✨ Support Vietnamese 🇻🇳, Indonesian 🇮🇩, Thai 🇹🇭, Malay 🇲🇾, Khmer🇰🇭, Lao🇱🇦, Tagalog🇵🇭 and Burmese🇲🇲
✨ Great on math, common
ChatTTS 💬 A text-to-speech model designed for dialogue scenarios like LLM assistants.
Model:
Demo:
✨ OS version on
@huggingface
is pre-trained with 40,000 hours of data.
✨Support English and Chinese.
✨Supports multiple
XVERSE 元象 has released one of the largest MoE models from the Chinese community 🇨🇳
👉 XVERSE-MoE-A36B
✨ 255B total parameters, with 36B actively used during inference
✨ Supports 40 languages, including English, Chinese, Russian, and Spanish
✨ Apache
LLM performances have been plateauing... so we decided to make the Open LLM Leaderboard steep again 🏔️ 😈
Introducing the Leaderboard 2️⃣
Expect...
- new benchmarks
- fairer reporting
- cool features (did I hear voting and chat template?)
🧵
ChronoDepth🔥A new work in video depth estimation!
Demo:
Paper:
✨Achieves frame-to-frame consistency and spatial accuracy
✨Transforms depth estimation into a conditional generation problem, enhancing learning and generalizability.
An exclusive community for Chinese LLMs is here 💫🚀
We're building this org on the
@huggingface
hub to keep you in the loop with the latest works from the Chinese language community 🔥
And we'd love for you to join us and help sharpen this community
MindSearch (思·索)🔍 An open AI search engine framework made by
@intern_lm
🔥
Communicate with the authors here:
✨ Apache 2.0
✨ Gathers and integrates info from 300+ web pages in 3 mins
✨ Accurately handles complex queries by breaking them into
Dive into the world of "Benchmark" and "Arena" with today's Daily Papers on
@huggingface
📑🧐
✨ GenAI Arena: An Open Evaluation Platform for Generative Models
✨ WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
When you're going to call it a day...
Qwen released Qwen2 audio🤯
✨ 7b base & instruct
✨ Voice chat: use your own voice to give instructions
✨ Audio analysis: including speech, sound, music, etc.
✨ Multilingual: supports more than 8 languages and
25% of global AI papers have come from the Chinese community🤯
Staying tuned to the latest papers on the Daily Papers page is a great way to keep up with AI developments from the Chinese community.
Here are the MOST UPVOTED ones from the last two weeks🔥🚀
LLaMA-Omni 🦙 a speech-language model built upon Llama-3.1-8B-Instruct, released by the Chinese community 🚀🔥
✨ Low-latency speech interaction with a latency as low as 226ms.
✨ Simultaneous generation of both text and speech responses.
✨ Trained in
ChatGLM 3 was released by THUDM today 🔥 Probably the best model of its size in China.
✨ It supports complex scenarios like: Function call, code interpreter, Agent tasks etc.
✨ Fully OPEN to academic research and FREE commercial use with registration.
New MoE from Chinese community 🚀
Skywork-MoE from Kunlun Tech is now on
@huggingface
Hub🔥
✨ 146 billion parameters, 16 experts and 22 billion activated parameters
✨ With two innovative techniques: Gating Logit Normalization, which enhances expert
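Gating Logit Normalization is Skywork's own technique; the snippet below is only my toy illustration of the general idea (standardize the router's logits before the softmax, then route each token to its top-k experts), not their implementation:

```python
import math

def top_k_gate(logits, k=2, tau=1.0):
    """Toy MoE router: normalize gating logits, apply a softmax with
    temperature tau, then keep only the top-k experts (renormalized)."""
    # Gating logit normalization: zero-mean, unit-variance logits
    mean = sum(logits) / len(logits)
    var = sum((x - mean) ** 2 for x in logits) / len(logits)
    norm = [(x - mean) / math.sqrt(var + 1e-6) for x in logits]
    # Softmax over the normalized logits
    exps = [math.exp(x / tau) for x in norm]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the top-k experts and renormalize their weights to sum to 1
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    z = sum(probs[i] for i in top)
    return {i: probs[i] / z for i in top}

# One token's gating logits over 4 hypothetical experts
weights = top_k_gate([2.0, -1.0, 0.5, 0.1], k=2)
print(weights)  # two experts selected; their weights sum to 1
```

This top-k routing is why a model like this can hold 146B parameters while only ~22B are active per token: each token runs through just the selected experts.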
HunyuanDiT from
@TencentGlobal
has released version 1.2🔥 along with the Hunyuan-Captioner on
@huggingface
Model:
✨ Generating high-quality image descriptions from various angles and supporting bilingual
虎头帮 TIGER-LAB🐯 The name caught my attention first, then I realized they were behind all these cool works!
⚔️ GenAI-Arena ⚔️ : Benchmarking Visual Generative Models in the Wild
✨Mantis: Optimized for multi-image reasoning with text/image format
The Chinese community is on fire with the release of open models! 🔥🙀
MeshAnything V2 🚀 A transformer that generates artist-created meshes (AM) from given shapes, made by
@NTUsg
&
@Tsinghua_Uni
Model weight:
Demo:
✨ Upvote
Tele-FLM-1T 🚀 An open LLM released by
@BAAIBeijing
and TeleAI with 1T parameters💫
✨Support Chinese & English
✨Apache 2.0
✨Cost-effective progressive pre-training.
✨Enhanced with Input/Output scalers, RoPE, RMSNorm, SwiGLU.
Model:
CodeGeeX4-ALL-9B🔥 An OPEN multilingual code generation model from the latest CodeGeeX4 series, released by
@thukeg
🚀
Github:
Model:
✨ 128k context length
✨ Supports Function Call
✨ Trained on GLM-4-9B
✨ Open for academic
CraftsMan✂️ A novel generative 3D modeling system from
@hkust
🔥
Demo:
Paper:
✨ High quality and efficient generation
✨ Interactive Refinement: Users can interactively refine and customize the generated 3D models.
✨ Multiple
MM-Vet is now on v2 🔥🔥🔥
MM-Vet v2 is a benchmark to evaluate large multimodal models on integrated capabilities, released by
@NUSingapore
🚀⚖️
Paper:
✨ Includes a new capability called "image-text sequence understanding."
✨ Expands the
Huggy bandanas are ready for the community!🔥
Drop a 🤗in the thread if you're in for our new bandanas and I'll send you one! Can't wait to see everyone rocking them! 🥳✨🤘
ShareGPT4Video 📺 a series that helps large video-language models and video generation models better understand and create videos, using detailed captions.
Model & Dataset:
Paper:
✨ ShareGPT4Video: the dataset contains 40K high-quality videos
🚀👏Another amazing project by the PKU Yuan Group, who also created Open-sora-Plan and MagicTime!
ChronoMagic-Bench 📊 A benchmark for metamorphic evaluation of text-to-time-lapse video generation 🔥🎬
Code:
Paper:
OmniLMM-12B & OmniLMM-3B, open-access LMMs from Chinese research lab
@OpenBMB
🔥
✨ OmniLMM-12B: Outperforms on benchmarks; trustworthy behavior; real-time interaction.
🚀 OmniLMM-3B (MiniCPM-V): Runs on most devices, mobile included; supports both Chinese & English.
🪪 Apache
This is the speed of AI in 2024!!🚀🤯
24 hours after its release, the FIRST Llama 3.1 Chinese fine-tune is already available on the
@huggingface
hub🔥 Kudos to OpenBuddy!
Two interesting papers on Daily Papers today:
✨Top trending: Diffusion Models Are Real-Time Game Engines
✨Most upvotes: Writing in the Margins: Better Inference Pattern for Long Context Retrieval
@ authors in the
New benchmark for Chinese LLM Evaluation: CMMMU ⚖️🚀
A Chinese Massive Multi-discipline Multimodal Understanding Benchmark with 12k manually collected multimodal questions.
Code:
Paper:
Dataset:
HuatuoGPT-Vision 华佗 🩺 OPEN medical MLLMs released by the Shenzhen Research Institute of Big Data and
@cuhksz
🔥
Dataset:
Model:
Paper:
✨ The HuatuoGPT series is already being used in hospitals in
🌋 MoE-LLaVA, with just 3B activated parameters, outperforms LLaVA-1.5-7B on an average of 9 benchmarks, and the 2.2B version even surpasses LLaVA-1.5-13B on the object hallucination benchmark.
🤗We have open-sourced all data, code, and models.
Code:
Introducing Vision Arena! Inspired by the awesome Chatbot Arena, we built a web demo on
@huggingface
for testing Vision LMs (GPT-4V, Gemini, Llava, Qwen-VL, etc.). You can easily test two VLMs side by side and vote! It's still a work in progress. Feedback is welcome!
🔗