Just added something new to my 360° image generation pipeline, and I'm really excited about the details and perspective it brings!
Here's an example output of cats in a moss-covered Japanese alley.
I'd love to hear your thoughts and ideas for what kind of scene to explore next.
360 photos in 1:1 aspect ratio look pretty cool 😎
This used Proteus 0.2 at 2008x2008 resolution. While I work on being able to traverse these images, I'll be sharing a lot more scenes like this one.
Just harnessed ControlNet to enhance equirectangular consistency in image models not originally trained for 360° views.
Tested it with Proteus 0.2 on my first attempt at crafting a 360° scene: a wizard casting lightning. The results? Electrifying!
This took 30 seconds to generate on my RTX 3060, using an update to my pipeline focused on speed.
I couldn't help adding some music for this late night post, I promise I'll go to sleep now ;)
🪻🎶
Right now I'm working on ControlNet + Ultimate SD Upscale to create super high-resolution images. Try zooming in on this image, which I made with Proteus v0.2.
With distortion correction and R-ESRGAN, it feels like I'm an actual mage 🧙‍♂️
I'll share the full image this time since Twitter keeps shrinking my videos to dust.
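For anyone curious how tile-based upscalers like Ultimate SD Upscale cover a big canvas, here's a minimal sketch of the overlapping-tile scheduling idea. The function name and parameters are my own illustration, not the extension's actual code:

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Yield (x0, y0, x1, y1) crop boxes covering a width x height image
    with overlapping tiles. Tile-based upscalers diffuse each crop
    separately, then blend the overlaps to hide the seams."""
    step = tile - overlap
    for y0 in range(0, height, step):
        for x0 in range(0, width, step):
            yield (x0, y0, min(x0 + tile, width), min(y0 + tile, height))
```

On a 1024x1024 canvas with 512px tiles and 64px overlap this produces a 3x3 grid of crops, each sharing a band of pixels with its neighbors.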
Prompt share: "lava lakes with a soft orange glow, demon creatures flying in the sky, dark obsidian landscape"
With my current workflow, you can control the output very easily with a simple description.
All the tuning I've done has made it idiot-proof.
@austinvhuang
One idea I like about LLMs is that in an apocalyptic scenario, people could still run a synthetic internet locally as long as they can generate power XD
Sora with a 360 image input will be amazing.
Imagine the water flowing and the hot air balloons floating around in the sky.
Adding ambient sound 🔊 already adds a lot to the scene in my opinion.
You can make 360 images with open source models very easily it turns out!
Here's a terraria inspired 360 image using Proteus v0.2 and Auto1111
None of them are perfect, but it really satisfies my sense of exploration making these.
I'll describe my workflow below 😊...
I'm thinking automatic background remover + images with a blank background + 360 images; just add some code for good placement and you can have consistent characters between "areas" in a game :) More on this tomorrow with a PoC
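A toy sketch of the "good placement" part: mapping a yaw angle to a horizontal pixel position in an equirectangular panorama and pasting a transparent-background character there with Pillow. All names here are hypothetical, and wrap-around pasting at the left/right seam is left out for brevity:

```python
from PIL import Image

def place_character(panorama, sprite, yaw_deg, y):
    """Paste a transparent-background sprite onto an equirectangular
    panorama. yaw_deg (0-360) picks the horizontal position; y is the
    top edge in pixels. The sprite's alpha channel is used as the mask."""
    pano_w, _ = panorama.size
    x = int((yaw_deg % 360) / 360 * pano_w) - sprite.width // 2
    out = panorama.copy()
    out.paste(sprite, (x, y), sprite)
    return out
```

Run the same placement against every "area" panorama and the character stays consistent from scene to scene.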
I'm excited for the release of Proteus V0.3
It's a model trained on GPT-4V data labels for powerful prompt comprehension.
The newest version has better fine detail and aesthetic in my opinion.
Polaroid art style can be incredible with a good enough model.
Share any polaroid style images in the comments!!!
I love the borders that generate as if the image was directly from the camera.
@cliff_swan
I know from personal experience that published research from "reputable" universities is very often riddled with enormous mistakes that even a basic, proper peer-review process should catch. We need to cut down on the publishing of low-quality work, or research will become too bloated.
@abacaj
If GPT-5 were just a language model, how do you think it would differ in order to be more useful?
I know some people have said auto-regressive language models have already plateaued, but it seems to me they could get even better; there's a lot of room for growth.
@GozukaraFurkan
@lmsysorg
@artificialguybr
Did you read what they wrote? They explain that they rephrased the training data to subvert contamination; this post illustrates why benchmarks can be dubious.
@RealJonahBlake
@CultureCrave
HuggingFace models by default produce distinct noise distributions and metadata that identify AI images.
However, both of these can be removed without much complication.
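As a minimal illustration of the metadata half (not a comment on whether stripping it is a good idea): PNG text chunks, like the generation-parameters chunk Auto1111 writes, simply disappear on a plain re-save with Pillow.

```python
import io
from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Create a PNG carrying generator metadata (mimicking Auto1111's
# "parameters" text chunk -- the chunk name and value are illustrative).
meta = PngInfo()
meta.add_text("parameters", "prompt: a cat, steps: 20")
buf = io.BytesIO()
Image.new("RGB", (8, 8)).save(buf, "PNG", pnginfo=meta)
buf.seek(0)
tagged = Image.open(buf)

# Re-saving without passing pnginfo drops every text chunk.
clean_buf = io.BytesIO()
tagged.save(clean_buf, "PNG")
clean_buf.seek(0)
clean = Image.open(clean_buf)
```

The noise-distribution signal is harder to remove cleanly, but metadata really is this fragile.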
Regarding "we can tell the difference": if we rely on human intuition, there will be errors in both directions.
@MattWalshBlog
If you ever go to towns like these in West Virginia, they're mostly empty, and the people who stayed are pretty nice. Almost the complete opposite of SF.
ProteusV0.4: The Style Update
This update enhances stylistic capabilities, similar to Midjourney's approach, rather than advancing prompt comprehension. Methods used do not infringe on any copyrighted material.
@Teknium1
I am shocked to see how consistently this guy succeeds at grifting; I see him everywhere, and he has yet to show the public a single useful thing of substance.
I'm not hating on him or anything, I just don't get it
"(polaroid photo:1.1) of a man holding a red apple and giving a thumbs up, best quality, hd, dark contrast, strong perspective from shadow, bold vivid and natural colors"
1. Install Auto1111 and an extension called Asymmetric Tiling.
2. Check the option to activate Asymmetric Tiling for the x axis, which allows for seamless 360 images.
3. Make images with the model of your choosing, at a 2:1 aspect ratio.
4. Write a prompt with the "equirectangular" tag.
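Under the hood, the extension patches the model's convolutions to wrap around on the x axis only. Here's a rough NumPy sketch of that padding behavior (my own illustration, not the extension's code):

```python
import numpy as np

def pad_seamless_x(feat, pad=1):
    """Circular (wrap) padding on the x axis, zero padding on y.
    Wrapping x makes the left and right edges continuous, which is
    exactly what an equirectangular 360 image needs; the poles (y)
    get ordinary zero padding."""
    wrapped = np.concatenate([feat[:, -pad:], feat, feat[:, :pad]], axis=1)
    return np.pad(wrapped, ((pad, pad), (0, 0)))
```

Applying this before every convolution means the model never "sees" a hard border at the image's left/right edge, so the panorama closes on itself.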
I’m not a part of this weak ass reality anymore. This how I’m showin up everywhere so just don’t even say anything about it when I come in the room. ITS OVER!!
You can score any image automatically using GPT-4V and Suno.
Here is an example of what that looks like for some dune images made with the new Proteus Mobius model.
In the near future, I think AI generated films will use a similar strategy for music, it's pretty satisfying :D
"Black cat crawling on an old box car, rusty door and a fallen star."
Proteus-RunDiffusion 2 trained by
@DataPlusEngine
is the best model I've ever tested on my local machine, and it definitely challenges MJ in my opinion.
@NickADobos
@SpencerKSchiff
That was for GPT-3 and was then misinterpreted by the media to mean 3.5.
If you read OpenAI's blog posts, they never explicitly stated the size; it was kinda leaked unofficially later on.
I suggest using Proteus v0.2, it is my favorite SDXL model right now and it listens to prompting very well.
Also I suggest using this website to view your images in 360 view:
@Yampeleg
@ShaykhSulaiman
But seriously... A GAN detector? Those things have horrific accuracy.
I'm not saying the image is real or fake, but using a GAN detector is not the way to check.