Beijing-based artificial intelligence startup Baichuan Intelligent Technology today announced the release of Baichuan-13B, an open-source Chinese and English language model containing 13 billion parameters. The new model demonstrates Baichuan's rapid progress in developing large language models that can rival prominent Western efforts like OpenAI's GPT-3.
Baichuan-13B represents a significant expansion over the company's previous 7B-parameter model. Trained on a dataset of 1.4 trillion tokens, it is among the most extensively trained open-source models of its size. For comparison, Meta's 13B-parameter LLaMA model was trained on 1 trillion tokens.
The release includes two variants: a pre-trained base model (Baichuan-13B-Base) aimed at developers, and an aligned chat model (Baichuan-13B-Chat) tailored for end users. According to the company, Baichuan-13B achieves state-of-the-art results among 13B models on standard Chinese and English benchmarks. Baichuan-13B-Chat in particular shows strong conversational ability out of the box and, per the company, requires just a few lines of code to deploy.
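To illustrate, the chat model can be driven through the standard Hugging Face transformers API. The sketch below follows the pattern the project documents; the repository id `baichuan-inc/Baichuan-13B-Chat` and the `chat()` helper reflect its published usage, though exact details may differ from the official README.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id as published on Hugging Face (assumed here).
MODEL_ID = "baichuan-inc/Baichuan-13B-Chat"

# trust_remote_code is required because the model ships its own
# modeling and tokenization code alongside the weights.
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID, use_fast=False, trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # spread layers across available devices
    trust_remote_code=True,
)

# The chat variant exposes a chat() helper (defined in the remote code)
# that takes a list of role/content messages and returns the reply.
messages = [{"role": "user", "content": "What is the second-highest mountain in the world?"}]
response = model.chat(tokenizer, messages)
print(response)
```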
Baichuan has also focused on efficiency, providing int8 and int4 quantized versions that bring resource requirements down to the level of consumer GPUs like the Nvidia RTX 3090.
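For a sense of how this works in practice, here is a minimal sketch using the generic bitsandbytes-backed quantization flags in transformers. Baichuan's repository may document its own quantization helper instead, so treat this as one common approach rather than the official recipe; the repository id is again an assumption.

```python
from transformers import AutoModelForCausalLM

MODEL_ID = "baichuan-inc/Baichuan-13B-Chat"  # assumed repository id

# load_in_8bit (or load_in_4bit) relies on the bitsandbytes library and
# roughly halves (or quarters) the memory footprint relative to float16.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    load_in_8bit=True,   # use load_in_4bit=True for int4 instead
    device_map="auto",   # place layers on the available GPU(s)
    trust_remote_code=True,
)
```

The arithmetic explains the RTX 3090 claim: at int8, 13 billion parameters occupy roughly 13 GB of VRAM, and at int4 roughly half that, comfortably within the card's 24 GB.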
Baichuan Intelligent Technology was founded by Wang Xiaochuan, who previously established the successful search engine provider Sogou. After leaving Sogou, Wang launched Baichuan with a vision of making China a formidable player in generative AI. His track record in the field has helped position Baichuan as a promising developer of large language models.
The new model reflects China's burgeoning effort to develop competitive large language models, an area of AI currently dominated by Western labs like OpenAI, Anthropic, Google, and Meta. Wang has explicitly cited OpenAI as a target to emulate.
Perhaps the most notable feature of Baichuan-13B is its open-source status. The model is freely available not only for academic research but also for commercial application, provided an official commercial license is obtained. All resources necessary for inference, including model weights, source code, and configuration, have been published on Hugging Face.
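Fetching those artifacts programmatically is straightforward with the standard huggingface_hub client, as in the sketch below; the base-model repository id is assumed.

```python
from huggingface_hub import snapshot_download

# Downloads weights, configuration, and the custom modeling code that
# trust_remote_code later executes; the repo id is an assumption.
local_dir = snapshot_download(repo_id="baichuan-inc/Baichuan-13B-Base")
print(f"Model files downloaded to: {local_dir}")
```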
Baichuan's rapid progress also comes amidst evolving Chinese regulations around AI content generation. While supportive of AI development, regulators are expected to impose licensing rules around large language models, which could slow the release of Chinese models. Baichuan will need to navigate this emerging regulatory landscape.
For now, Baichuan-13B stands as an impressive technical achievement and a statement of intent from one of China's newest AI startups. With its commitment to open-source availability and efficient deployment, the model promises to further power China's AI ecosystem. Baichuan's forthcoming larger models will bear watching as China pushes to lead in this pivotal area of AI.