apolinario
@multimodalart
Followers
15K
Following
6K
Media
992
Statuses
4K
ML for Art and Creativity, working @HuggingFace ([email protected])
Joined July 2021
this is so good! mid-frames are here, multi-frame to video is an easy to use workflow! kudos to @morphic for open sourcing it
Morphic's frames-to-video, with up to 5 frames and time control, is now open-source. GitHub: https://t.co/O15v89IaSr Hugging Face: https://t.co/yRDpUni69j More details in the thread:
3
39
299
Weights are out! 🔥 FIBO is a new open weights high quality 8B image model by @bria_ai_ trained on json prompts! It can generate and modify images with a precisely crafted json prompt, allowing for every detail of the image to be decided. Try it here: https://t.co/16xNHUGQ8B
3
13
87
In our NeurIPS paper, we bring encoder-decoders back... for diffusion language models! Encoder-decoders make diffusion sampling fast: a small (fast) decoder denoises tokens progressively and a large (slower) encoder represents clean context.
8
36
236
Morphic's frames-to-video, with up to 5 frames and time control, is now open-source. GitHub: https://t.co/O15v89IaSr Hugging Face: https://t.co/yRDpUni69j More details in the thread:
6
45
329
many folks have been trying json prompting with image and video models, bria trained the model to take that in natively! super cool model
Weights are out! 🔥 FIBO is a new open weights high quality 8B image model by @bria_ai_ trained on json prompts! It can generate and modify images with a precisely crafted json prompt, allowing for every detail of the image to be decided. Try it here: https://t.co/16xNHUGQ8B
0
2
23
A next-gen visual model trained on structured JSON for precise, controllable generation. FIBO is a text-to-image model that transforms prompts into JSON schemas, enabling predictable visuals at scale. Trained on extended, structured captions (often 1,000+ words), FIBO
2
9
50
China's open source is just on fire. Soul, China's Tinder (?), has just open sourced their podcast model on @huggingface
https://t.co/laie6kUMqd
huggingface.co
16
48
509
Today, we are open-sourcing Hunyuan World 1.1 (WorldMirror), a universal feed-forward 3D reconstruction model. While our previously released Hunyuan World 1.0 (open-sourced, lite version deployable on consumer GPUs) focused on generating 3D worlds from text or
45
265
2K
Great work scaling Self-Forcing up to 14B models, improving extrapolation, while still keeping it running in real time.
Krea Realtime is distilled from the Wan 2.1 14B text-to-video model using Self-Forcing. It achieves a text-to-video inference speed of 11 fps using 4 inference steps on a single NVIDIA B200 GPU. Check all our training methodology and sampling innovations in our technical report!
1
5
111
today we're open-sourcing Krea Realtime. this 14B autoregressive model is 10x larger than any open-source equivalent, and it can generate long-form videos at 11 fps on a single B200. weights and technical report below
58
202
1K
this now can be a benchmark/eval for agentic systems: could your agent re-create nanochat from scratch? (at least until nanochat gets fed into the future models)
@zenitsu_aprntc Good question, it's basically entirely hand-written (with tab autocomplete). I tried to use claude/codex agents a few times but they just didn't work well enough at all and net unhelpful, possibly the repo is too far off the data distribution.
0
0
3
many free tiers of Nano Banana vanished over time, but if you are a @huggingface PRO, it is still around
2
0
5
Next destination 🇫🇷 Super hyped to get together with @huggingface and @bfl_ml (again!!) in Paris! I already booked my flights. Paris has a great AI community, and I'm excited to meet them. @fal has so much to offer European builders. @mervenoyann & @stephenbtl many thanks for
11
5
57
I created a @huggingface Space app so that you can try the tiny 7M-parameter Shakespeare character diffusion model by @ash_at_tt. Looking at the denoising is so mesmerizing 😵‍💫🔥
I turned @karpathy's baby GPT into a character-level text diffusion model, using @aaron_lou et al.'s score entropy-based training objective.
3
2
17
for example, here my preference would probably be biased if I didn't test out both options or deeply analyse the quality of the code. But both answers feel good, and if I hit skip it is a bit frustrating to remove them and generate a new one. I wish there was a way to continue the
0
0
2