
Zhipu AI has just given AI enthusiasts a great gift: it has open-sourced its latest image generation model, CogView4!

The image above, for example, was created using the model.

This isn't just any model; it's the industry's first open-source text-to-image model that supports both Chinese and English prompts. It excels at understanding Chinese prompts and can even render Chinese characters within images. Simply put, you describe what you want in Chinese or English, and it generates an image that matches the description. Whether you want to create advertising designs, produce short video content, or just have some creative fun, this model can come in handy.

What is CogView4?

CogView4 is an AI image generation model developed by Zhipu AI. It is a "text-to-image" model, which means it generates images from textual descriptions. It has 6 billion parameters (roughly, the model's "brain capacity"). What makes it special is that it not only accepts Chinese and English input but also accurately understands complex Chinese prompts and can render clear Chinese characters within images. For example, if you input "A swordsman in ancient costume standing in a bamboo forest, with the words '侠义' (chivalry) written beside him," CogView4 can generate such a scene. This capability is a first among open-source models, making it highly suitable for Chinese users.

In addition, CogView4 can generate images of any resolution (within a certain range) and supports ultra-long prompt descriptions. This means you can write a very detailed creative idea, and it will try its best to reproduce your vision. Whether it's a simple "a cat" or a complex "night city skyline with skyscrapers," it can handle it.
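
To make "any resolution (within a certain range)" concrete, here is a minimal sketch that snaps a requested size onto a valid grid. The limits it uses (sides from 512 to 2048 pixels, multiples of 32, at most 2^21 pixels in total) are assumptions that should be checked against the project README, and `snap_resolution` is a hypothetical helper, not part of CogView4.

```python
from __future__ import annotations

# Assumed limits; verify against the CogView4 README before relying on them.
MIN_SIDE = 512
MAX_SIDE = 2048
STEP = 32
MAX_PIXELS = 2 ** 21


def snap_resolution(width: int, height: int) -> tuple[int, int]:
    """Clamp each side to [MIN_SIDE, MAX_SIDE] and round it down to a multiple of STEP.

    Raises ValueError if the snapped size still exceeds MAX_PIXELS.
    """
    def snap(side: int) -> int:
        clamped = max(MIN_SIDE, min(MAX_SIDE, side))
        return (clamped // STEP) * STEP

    w, h = snap(width), snap(height)
    if w * h > MAX_PIXELS:
        raise ValueError(f"{w}x{h} exceeds the assumed pixel budget of {MAX_PIXELS}")
    return w, h


if __name__ == "__main__":
    print(snap_resolution(1000, 700))  # (992, 672)
```

Under these assumed limits, a request such as 1000x700 would be adjusted to 992x672, while 4000x4000 would be rejected because both sides clamp to 2048 and the result exceeds the pixel budget.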

How to use CogView4?

  • The good news is that CogView4 is open-source, meaning anyone can download and use it for free! Its code and model files can be found on GitHub: https://github.com/THUDM/CogView4

  • If you're a beginner, don't worry about the technical details. Zhipu also plans to launch the latest version, CogView4-6B-0304, on its "Zhipu Qingyan" platform on March 13th. Once it is live, you'll just need to open the website or app, enter a description of the image you want, and click a button to see the result. It's as simple as taking a photo with your phone.

Try it online on the official website: https://open.bigmodel.cn/trialcenter/modeltrial?modelCode=glm-4-voice
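
For readers who would rather run the open-source release locally, here is a minimal sketch using the Hugging Face `diffusers` library. It assumes a recent `diffusers` release that ships `CogView4Pipeline`, a checkpoint published under the ID `THUDM/CogView4-6B`, and a CUDA-capable GPU. The sampling settings are illustrative values rather than tuned recommendations, and `build_request` and `generate` are hypothetical helper names. Check the GitHub repository for the authoritative instructions.

```python
def build_request(prompt: str, width: int = 1024, height: int = 1024) -> dict:
    """Collect the keyword arguments passed to the pipeline call."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "guidance_scale": 3.5,       # illustrative value
        "num_inference_steps": 50,   # illustrative value
    }


def generate(prompt: str, out_path: str = "cogview4.png") -> str:
    """Load the model (downloaded on first run), generate one image, save it."""
    # Imported lazily so build_request works without the GPU stack installed.
    import torch
    from diffusers import CogView4Pipeline

    pipe = CogView4Pipeline.from_pretrained(
        "THUDM/CogView4-6B",  # assumed model ID
        torch_dtype=torch.bfloat16,
    )
    pipe.enable_model_cpu_offload()  # lowers GPU memory use at some speed cost
    image = pipe(**build_request(prompt)).images[0]
    image.save(out_path)
    return out_path
```

With the dependencies installed, calling `generate("A swordsman in ancient costume standing in a bamboo forest")` should write the result to `cogview4.png`. Keeping the request arguments in a separate function makes it easy to inspect or adjust them without loading the 6-billion-parameter model.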

What are some similar services in China?

The field of AI text-to-image is developing rapidly in China. In addition to CogView4, there are several similar tools, such as:

  • Wenxin Yige (Baidu): Baidu's text-to-image service. It supports Chinese input and generates artistic-style images, making it suitable for design and creative work.
  • Tongyi Wanxiang (Alibaba): Alibaba's image generation tool. It also supports Chinese prompts, produces good results, and is geared towards commercial applications.
  • Doubao (ByteDance): ByteDance's AI tool. It supports text-to-image and multimodal creation and has a simple interface, making it suitable for beginners.

These services mostly have web versions or apps, making them easy to use, but some features may require payment. CogView4's advantage is that it's open-source and free, offering greater flexibility, especially for those who want to get their hands dirty.