I wish Stable Diffusion had focused more on fine art and professional photography in its training corpus, the way DALL-E did. All the outputs I've seen have a decidedly low-quality or amateurish look, which must have been picked up from somewhere.
Thing is, the primary power of large models is not in the styles they memorized, but in their ability to be fine-tuned and guided by reference content. The focus is your choice: you can shift the model's bias with any available fine-tuning method and a modest set of reference images. 99% of the training has already been done for you, so all you need is a relatively subtle push, achievable on a single GPU (or a few).
Take a look at Analog Diffusion outputs, for example.
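If you want to try that locally, here's a minimal sketch using Hugging Face diffusers. I'm assuming the checkpoint is published as wavymulder/Analog-Diffusion and responds to an "analog style" trigger phrase, so check the model card before relying on either.

    import torch
    from diffusers import StableDiffusionPipeline

    # Load a community fine-tune instead of the base SD weights.
    # Model id and trigger phrase are assumptions; verify on the model card.
    pipe = StableDiffusionPipeline.from_pretrained(
        "wavymulder/Analog-Diffusion",
        torch_dtype=torch.float16,
    ).to("cuda")

    prompt = "analog style portrait of a woman in a diner, soft window light"
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
    image.save("analog_portrait.png")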
With patience, the guidance can even be done without training at all: enough prompting, blending of prompts, feeding input images to img2img, and so on. I'm no expert on art history, but those who are seem able to get just about any style onto any image with just the base Stable Diffusion model. Adding specific photography terms can reproduce some of the Analog Diffusion results in regular SD as well.
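Here's a rough idea of that no-training route via img2img in diffusers. The init image path and strength value are placeholders; strength is the knob that trades faithfulness to the input against how far the prompt can pull the style.

    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    # Base SD 1.5 weights; no fine-tuning involved, only prompt + init image guidance.
    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")

    init_image = Image.open("my_sketch.png").convert("RGB").resize((512, 512))

    prompt = "oil painting in the style of John Singer Sargent, dramatic lighting"
    # strength controls how much of the init image is kept (lower = closer to input)
    image = pipe(prompt=prompt, image=init_image, strength=0.6,
                 guidance_scale=7.5).images[0]
    image.save("restyled.png")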
It has nothing to do with "style". DALL-E 2 consistently astonishes with its composition, framing, variety of visual elements, and so on, qualities that also tend to be strong in oil paintings and professional photographs, but not at all in the sort of content that predominates on user-generated art-sharing apps.
If you mean porn (which is censored in DALL-E), SD also can't make it, as it's mostly filtered out of the training set. All of it comes from custom fine-tuned models, which is my point. Same with hands, which SD can struggle with. Styles are not the only thing you can transfer; subjects and concepts transfer as well.
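For the concept side specifically, textual inversion is one lightweight way to do it. A minimal sketch, assuming the standard diffusers load_textual_inversion API and the public sd-concepts-library/cat-toy embedding with its <cat-toy> placeholder token; swap in whatever concept you actually trained.

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")

    # Pull in a textual-inversion embedding that adds a new concept token.
    # Repo name and <cat-toy> token are assumed from the public sd-concepts-library.
    pipe.load_textual_inversion("sd-concepts-library/cat-toy")

    image = pipe("a photo of a <cat-toy> on a beach",
                 num_inference_steps=30).images[0]
    image.save("concept_transfer.png")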
Well... That's what predominates as user-generated content, I think :)
Anyway, none of the existing image-gen models can be used for content production in their vanilla state. Prompt-to-image is fine as a hobby, but style/concept transfer is what makes them usable in a professional setting. Or rather, will make them usable in the near future, as this is all still highly experimental. SD in particular is quite small and is not a ready-to-use product; it's not intended for direct use. It's a middleware model to build products on top of, such as Midjourney.
You're just prompting it wrong. Try adding qualifiers that have well-known associations with pro-level photography. For example, these two make prompts dramatically better: append either "Canon EOS 5D Mark IV, Sigma 85mm f/1.4" or "photography by Annie Leibovitz" (for slightly more artsy photos) to the end of your prompt and see the difference.
If you're not doing portrait photography, replace the Sigma 85mm f/1.4 with something else, for example a Sigma 24mm for wider-angle shots.
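If you want to A/B this yourself, here's a quick sketch with diffusers comparing a bare prompt against the same prompt with the camera and lens suffix. The base prompt is just an example; fixing the seed means the only variable is the prompt text.

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")

    base = "portrait of an elderly fisherman on a pier at dawn"
    suffix = ", Canon EOS 5D Mark IV, Sigma 85mm f/1.4"

    # Same seed for both runs so the only difference is the prompt qualifiers.
    for name, prompt in [("plain", base), ("pro_photo", base + suffix)]:
        generator = torch.Generator("cuda").manual_seed(42)
        image = pipe(prompt, generator=generator, guidance_scale=7.5).images[0]
        image.save(f"{name}.png")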
Note that in my extensive testing, specific lenses and cameras tend to make broad, random changes to the output unless they happen to have a specific aesthetic associated with them.
E.g. 85mm vs 24mm makes no specific change to a photo. SD appears to interpret these simply as "make this look like a realistic photo", and any changes as you switch between them are incidental.
Yeah, the particular focal length matters less than hinting in the prompt that you want a photorealistic picture in general. Still, the conditioning carries a lot of terms, and the more details you add, the more they might tip the diffusion process in a good direction at otherwise ambiguous points.
You need to look at a better source. SD 2.x is trained on a lot of professional photography, and there is plenty of output that is not low-quality or amateurish. There is also the Analog Diffusion model, an SD 1.5 fine-tune for film photography.