Zhipu AI has a gift for AI enthusiasts: it has open-sourced its latest image generation model, CogView4!
This is no ordinary model. It is billed as the first open-source text-to-image model to support bilingual prompts in Chinese and English. It excels at understanding Chinese prompts and can even render Chinese characters inside the images it generates. In simple terms, you describe what you want in Chinese or English, and it generates an image that matches your description. Whether you're working on ad design, creating short videos, or just playing around with ideas, this model can come in handy.
What is CogView4?
CogView4 is an AI image generation model developed by Zhipu AI. It is a "text-to-image" model, meaning it generates images from text descriptions. It has 6 billion parameters, which you can think of as the model's "brain capacity." Its standout feature is that it not only accepts input in both Chinese and English but also understands complex Chinese prompts accurately and can render clear Chinese characters in images. For example, if you input "a swordsman in ancient costume standing in a bamboo forest, with the characters '侠义' (chivalry) written beside him," CogView4 can produce such a scene. This capability is a first among open-source models and makes CogView4 particularly well suited to Chinese-speaking users.
Additionally, CogView4 can generate images at any resolution (within a certain range) and supports very long prompt descriptions. This means you can write a detailed creative idea, and it will do its best to bring your vision to life. Whether it's a simple "a cat" or a complex "nighttime city skyline with skyscrapers," it can handle it.
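To make "within a certain range" more concrete, here is a minimal Python sketch of a resolution check. The specific limits are assumptions based on my reading of the project's README (each side between 512 and 2048 pixels, divisible by 32, and no more than 2^21 pixels in total), so verify them against the official documentation before relying on them. The function names are hypothetical helpers, not part of CogView4.

```python
def is_supported_resolution(
    width: int,
    height: int,
    min_side: int = 512,        # assumed lower bound per side
    max_side: int = 2048,       # assumed upper bound per side
    multiple: int = 32,         # assumed divisibility requirement
    max_pixels: int = 2 ** 21,  # assumed cap on width * height
) -> bool:
    """Return True if (width, height) satisfies the assumed constraints."""
    for side in (width, height):
        if not (min_side <= side <= max_side):
            return False
        if side % multiple != 0:
            return False
    return width * height <= max_pixels


def snap_to_multiple(value: int, multiple: int = 32) -> int:
    """Round value to the nearest multiple (exact ties round up)."""
    return ((value + multiple // 2) // multiple) * multiple
```

Under these assumptions, a 1024x1024 or 2048x1024 image passes the check, while 2048x2048 fails because it exceeds the total pixel cap.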
How to Use CogView4?
The good news is that CogView4 is open-source, meaning anyone can download and use it for free! Its code and model files are available on GitHub: https://github.com/THUDM/CogView4
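For those who want to run the model locally, the sketch below shows one possible way to do it with the Hugging Face diffusers library, which provides a CogView4Pipeline class. Several things here are assumptions rather than facts from this article: the model ID "THUDM/CogView4-6B", the default settings (50 steps, guidance scale 3.5, 1024x1024), and the helper names build_call_kwargs and generate_image, which are hypothetical. It also assumes torch, diffusers, and accelerate are installed and that a suitable GPU is available.

```python
def build_call_kwargs(prompt, width=1024, height=1024, steps=50, guidance_scale=3.5):
    """Collect the arguments passed to the pipeline call (defaults are assumptions)."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate_image(prompt, output_path="cogview4.png",
                   model_id="THUDM/CogView4-6B", **overrides):
    """Generate one image and save it. Downloads the model weights on first use."""
    # Heavy imports are kept inside the function so the helper above
    # can be used without torch or diffusers installed.
    import torch
    from diffusers import CogView4Pipeline

    pipe = CogView4Pipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    # Offload idle submodules to CPU to reduce GPU memory use (needs accelerate).
    pipe.enable_model_cpu_offload()

    image = pipe(**build_call_kwargs(prompt, **overrides)).images[0]
    image.save(output_path)
    return output_path


# Example call (not executed here; it downloads several GB of weights):
# generate_image("a swordsman in ancient costume standing in a bamboo forest")
```

The prompt can be written in Chinese or English. The repository's own README remains the authoritative source for installation steps and recommended settings.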
If you're a beginner and worried about the technical details, don't fret. Zhipu AI also plans to launch the latest version, CogView4-6B-0304, on its "Zhipu Qingyan" platform on March 13. At that point, you'll just need to open the webpage or app, enter your image description, and click to see the results. It will be as simple as taking a photo with your phone.
Official Website for Online Use
https://open.bigmodel.cn/trialcenter/modeltrial?modelCode=glm-4-voice
What Similar Services Are Available in China?
The field of AI text-to-image in China is developing rapidly. Besides CogView4, there are several similar tools, such as:
- Wenxin Yige (Baidu): A text-to-image service launched by Baidu that supports Chinese input and generates artistic-style images, ideal for design and creativity.
- Tongyi Wanxiang (Alibaba): An image generation tool from Alibaba that also supports Chinese prompts with good results, leaning towards commercial applications.
- Doubao (ByteDance): An AI tool from ByteDance that supports text-to-image and multimodal creation, with a simple interface suitable for beginners.
Most of these services offer web or app versions with user-friendly operations, though some features may require payment. CogView4's advantage lies in being open-source and free, offering greater flexibility, especially for those who like to get hands-on.