
Midjourney uses SD under the hood (you can see this in their license), but they augment the model in various ways.


The results from Midjourney are significantly better than from SD. I find it much easier to get to a good result in MJ, and I've been trying to understand why. Any more insight you could share?


Good engineering. Midjourney likely has a lot going on under the hood before your prompt actually gets to Stable Diffusion. As an example, you can check out this research paper [0], which seeks to add prompt chaining to GPT-3 so you can "correct" its outputs before they reach the user. There's also no rule that says you can only make one call to SD; MJ likely bounces a picture through a pipeline they've tuned to make your generated image look more reasonable (rough sketch below).

[0]: https://arxiv.org/abs/2110.01691
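
To make that concrete, here's a rough sketch of what a multi-call setup could look like with the open-source diffusers library: generate once, then feed the result back through img2img at low strength to clean it up. This is not Midjourney's actual pipeline (which isn't public); the model name, prompt, and parameter values are just illustrative.

    import torch
    from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

    # First pass: plain text-to-image.
    txt2img = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Second pass: image-to-image, reusing the same weights to refine the first result.
    img2img = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    prompt = "a castle on a cliff at sunset"

    draft = txt2img(prompt, num_inference_steps=30).images[0]

    # Low strength keeps the composition of the draft but lets the model
    # re-denoise the details; a real service might chain several such passes
    # (upscale, fix faces, re-diffuse) with carefully tuned parameters.
    final = img2img(prompt=prompt, image=draft, strength=0.35,
                    num_inference_steps=30).images[0]
    final.save("out.png")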


Midjourney takes their base models and does further training/guidance on them to bring out intentional aesthetic qualities. One of their main goals is to ensure that their “default” style is beautiful no matter how simple the user’s prompt is.


Opinionated prompt suffixes injected behind the scenes, varying based on user input, plus post-processing pipelines.
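
Speculating on what that could look like in practice (the actual suffixes and rules aren't known), a minimal sketch:

    # Hypothetical suffix injection: pick "house style" modifiers based on
    # what the user's prompt already mentions, then append them before the
    # prompt ever reaches the diffusion model. Suffix strings are made up.
    DEFAULT_SUFFIX = "highly detailed, cinematic lighting, sharp focus"
    PORTRAIT_SUFFIX = "85mm lens, shallow depth of field, studio lighting"

    def build_prompt(user_prompt: str) -> str:
        suffix = PORTRAIT_SUFFIX if "portrait" in user_prompt.lower() else DEFAULT_SUFFIX
        # Don't pile on modifiers the user already asked for.
        extras = [m for m in suffix.split(", ") if m not in user_prompt.lower()]
        return f"{user_prompt}, {', '.join(extras)}"

    print(build_prompt("portrait of an astronaut"))
    # portrait of an astronaut, 85mm lens, shallow depth of field, studio lighting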


Midjourney is doing "secret sauce" post-processing to enhance the image returned from the model, whereas SD just gives you back what the model spits out. That's how I understand it, at least.
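
Whatever the actual secret sauce is, a post-processing stage is conceptually just a function applied to the decoded image before it's returned. A toy sketch with Pillow (a real service would more likely run learned upscalers or restoration models; the filter values here are arbitrary):

    from PIL import Image, ImageEnhance, ImageFilter

    def postprocess(img: Image.Image) -> Image.Image:
        # Illustrative enhancement pass: sharpen, then nudge contrast and saturation.
        img = img.filter(ImageFilter.UnsharpMask(radius=2, percent=120, threshold=3))
        img = ImageEnhance.Contrast(img).enhance(1.08)
        img = ImageEnhance.Color(img).enhance(1.12)
        return img

    final = postprocess(Image.open("raw_model_output.png"))
    final.save("enhanced.png")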



