Midjourney, a renowned name in AI-driven image generation, has recently introduced its V6 alpha model, marking a significant step in the evolution of AI imagery technology. While it's crucial to emphasize that this is an alpha release, both the advancements it brings and the debate it is sparking are noteworthy.
Midjourney says V6 has been in development for nine months and is the third model the company has trained from scratch on its AI superclusters. Key improvements in the V6 alpha include:
- Enhanced Prompt Adherence and Length: The V6 base model boasts an improved ability to accurately follow longer prompts, a critical feature for users requiring detailed image creation.
- Coherence and Model Knowledge: There's a noticeable improvement in the coherence of generated images and the underlying model's knowledge, ensuring more contextually relevant outputs.
- Introduction of Text Drawing: The model is also getting better at generating text. While it still lags considerably behind DALL-E 3, it is a welcome improvement. For now, text must be written in "quotations", and Midjourney recommends using --style raw or lower --stylize values.
- Improved Upscalers: V6 introduces subtle and creative options when upscaling images.
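The text-drawing guidance above can be sketched as a small prompt-assembly helper. This is only an illustration of the recommended prompt shape: the function name build_text_prompt, its arguments, and the stylize value of 50 are hypothetical and not part of any Midjourney tooling. The helper does nothing more than build the string you would paste into Midjourney yourself.

```python
def build_text_prompt(description, text, raw=True, stylize=None):
    """Assemble a prompt string that asks for literal text in the image.

    Hypothetical helper: it only formats a string and does not talk to
    Midjourney in any way.
    """
    if '"' in text:
        # The text to render is wrapped in double quotes, so it cannot contain them.
        raise ValueError("text must not contain double quotes")
    parts = [f'{description} "{text}"']
    if raw:
        parts.append("--style raw")
    if stylize is not None:
        # A lower stylize value is recommended for text; 50 below is an arbitrary example.
        parts.append(f"--stylize {int(stylize)}")
    return " ".join(parts)


prompt = build_text_prompt(
    "a neon sign above a diner door that reads", "OPEN LATE", stylize=50
)
print(prompt)
```

Keeping the quoted text separate from the description makes it harder to forget the quotation marks, which is the part of the guidance most easily missed.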
V6 already supports a range of features, including aspect ratio adjustments, style variations, and blending options. However, some familiar V5 image editing options, such as pan, zoom, and vary region, will be added in the coming months. The company is also teasing a revamped /describe feature.
Prompting in V6 differs from previous versions, and Midjourney recommends essentially relearning how to properly prompt the new model. Its heightened sensitivity to prompts means that clarity and specificity are paramount. Users are encouraged to avoid unnecessary "junk" words (e.g. award winning, photorealistic, 4k, 8k) and focus on explicit descriptions to achieve the best results.
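As a rough sketch of that advice, the snippet below drops filler terms from a comma-separated prompt. The term list contains only the examples named above, and the function name strip_junk_terms is hypothetical. It assumes fillers appear as their own comma-separated segments and does not attempt to rewrite them when they are embedded in a longer phrase.

```python
# Filler terms taken from the examples above; extend the set as needed.
JUNK_TERMS = {"award winning", "photorealistic", "4k", "8k"}


def strip_junk_terms(prompt, junk_terms=JUNK_TERMS):
    """Remove comma-separated segments that consist only of a filler term."""
    segments = [segment.strip() for segment in prompt.split(",")]
    kept = [
        segment
        for segment in segments
        # Compare case-insensitively and treat "award-winning" like "award winning".
        if segment and segment.lower().replace("-", " ") not in junk_terms
    ]
    return ", ".join(kept)


cleaned = strip_junk_terms(
    "an elderly fisherman mending a net at dawn, award-winning, photorealistic, 8K"
)
print(cleaned)
```

Removing fillers is only half of the recommendation; the remaining description still needs to state explicitly what the image should contain.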
V6's improved photorealism is raising important questions and stirring debate about generative AI technology, image rights, and deepfakes. The new model can render identifiable brands, celebrities, and public figures with a fidelity approaching that of actual photos.
Beyond realistic photos, V6 also exhibits an uncanny knack for replicating distinctive animation aesthetics from popular anime, cartoons, and major movie franchises. More concerning, V6 can also extrapolate and fabricate new scenes and sequences that blur the line between fan art and outright IP theft.
As quality improves, such AI-generated borrowing or remixing of heavily copyrighted material presents yet another frontier for legislators, legal experts, and creators themselves to grapple with in balancing creative expression and commercial protections. Nonetheless, legally sound applications of V6's capabilities certainly remain, and the technology itself should not shoulder blame for potential misuse.
As with previous alpha releases, Midjourney warns that users should expect frequent changes without prior notice as the company works towards a full release. Also, keep in mind that V6 will be slower and cost more than V5 (this will change as the company optimizes the model and gets closer to full release).