Elon Musk's X.ai has announced Grok-1.5, the latest version of its AI model, which offers improved reasoning capabilities and a context length of 128,000 tokens. The new model, set to be available on the X platform in the coming days, promises improvements in coding, math-related tasks, and long-context understanding.
Grok-1.5 is the successor to Grok-1, whose model weights X.ai released just two weeks ago. The new model builds on its predecessor with improved reasoning and problem-solving capabilities.
In testing, Grok-1.5 scored 50.6% on the MATH benchmark and 90% on the GSM8K benchmark, which together cover math problems ranging from grade-school level to high-school competition level. The model also scored 74.1% on the HumanEval benchmark, which measures code generation and problem-solving abilities.
One of the standout features of Grok-1.5 is its context window of up to 128K tokens. This larger memory capacity allows the model to use information from substantially longer documents and handle more complex prompts while maintaining its instruction-following capability. In the Needle In A Haystack (NIAH) evaluation, Grok-1.5 achieved perfect retrieval of text embedded in contexts of up to 128K tokens.
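For readers unfamiliar with the evaluation, the idea behind a needle-in-a-haystack test can be sketched in a few lines of Python. This is a minimal illustration, not X.ai's actual harness: the filler sentences, the passcode "needle", the `ask_model` callback, and the `oracle_model` stand-in are all hypothetical, and length is counted in sentences here rather than tokens to keep the example self-contained.

```python
import random
import re


def build_haystack(needle: str, n_filler: int, depth: float, seed: int = 0) -> str:
    """Return filler text with `needle` inserted at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    rng = random.Random(seed)
    fillers = [
        "The harbor was quiet that morning.",
        "Trains left the station every hour.",
        "Rain was expected later in the week.",
    ]
    sentences = [rng.choice(fillers) for _ in range(n_filler)]
    # depth 0.0 puts the needle first, depth 1.0 puts it last.
    sentences.insert(round(depth * n_filler), needle)
    return " ".join(sentences)


def run_niah(ask_model, depths, n_filler: int = 200) -> dict:
    """Ask the model to recall the needle at each depth; map depth -> success."""
    needle = "The secret passcode is 7319."
    question = "What is the secret passcode?"
    results = {}
    for depth in depths:
        prompt = build_haystack(needle, n_filler, depth) + "\n\n" + question
        results[depth] = "7319" in ask_model(prompt)
    return results


def oracle_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call: finds the passcode by regex."""
    match = re.search(r"passcode is (\d+)", prompt)
    return match.group(1) if match else "unknown"


if __name__ == "__main__":
    print(run_niah(oracle_model, [0.0, 0.25, 0.5, 0.75, 1.0]))
```

A real evaluation would repeat this across many context lengths as well as depths, which is what produces the familiar grid of retrieval results; "perfect retrieval" corresponds to every cell in that grid succeeding.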
Grok-1.5 was built on a custom distributed training framework based on JAX, Rust, and Kubernetes. This training stack enables X.ai's team to prototype ideas and train new architectures at scale with minimal effort. A custom training orchestrator keeps the training job reliable by automatically detecting problematic nodes and ejecting them from the job. Checkpointing, data loading, and training job restarts have also been optimized to minimize downtime in the event of a failure.
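X.ai has not published its orchestrator, but the general pattern described here (detect an unhealthy node, eject it, resume from the last checkpoint) can be illustrated with a toy simulation. Everything in the sketch below is an assumption made for illustration: the `Orchestrator` class, the `is_healthy` probe, and the fixed checkpoint interval are hypothetical and do not reflect X.ai's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Orchestrator:
    """Toy training-job supervisor (hypothetical, for illustration only)."""
    nodes: Set[str]
    checkpoint_every: int = 10
    step: int = 0
    last_checkpoint: int = 0
    restarts: int = 0
    ejected: List[str] = field(default_factory=list)

    def run(self, total_steps: int, is_healthy: Callable[[str, int], bool]) -> int:
        """Advance to `total_steps`, ejecting nodes that fail the health probe."""
        while self.step < total_steps:
            bad = {n for n in self.nodes if not is_healthy(n, self.step)}
            if bad:
                # Eject failing nodes and roll back to the last saved state.
                self.nodes -= bad
                self.ejected.extend(sorted(bad))
                if not self.nodes:
                    raise RuntimeError("no healthy nodes left")
                self.step = self.last_checkpoint
                self.restarts += 1
                continue
            self.step += 1  # one simulated training step
            if self.step % self.checkpoint_every == 0:
                self.last_checkpoint = self.step  # simulated checkpoint save
        return self.step


if __name__ == "__main__":
    # Node "b" starts failing at step 25; the job resumes from step 20 without it.
    orch = Orchestrator(nodes={"a", "b", "c"})
    orch.run(50, lambda node, step: not (node == "b" and step >= 25))
    print(orch.ejected, orch.restarts, orch.last_checkpoint)
```

The trade-off the sketch makes visible is the one the paragraph alludes to: the more frequent and cheaper the checkpoints, the less work is lost each time a node is ejected.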
Grok-1.5 will soon be available to early testers, and X.ai says it looks forward to feedback that will help improve the model. The company plans to introduce several new features over the coming days as it gradually rolls out Grok-1.5 to a wider audience.