AI chip startup Cerebras unveils open-source ChatGPT-like models ranging up to 13 billion parameters
Just two weeks before the launch of ChatGPT, Silicon Valley-based AI chip startup Cerebras Systems unveiled Andromeda, an AI supercomputer aimed at commercial and academic research. Andromeda was built by linking 16 Cerebras CS-2 systems, the company’s most recent AI computers, each built around its oversized Wafer-Scale Engine 2 chip.
Four months later, Cerebras on Tuesday released open-source ChatGPT-like models trained on Andromeda, free for the research and business communities to use, in an effort to foster more collaboration. The release comprises seven models, all trained on Andromeda, ranging from a 111 million parameter language model up to a 13 billion parameter model.
The newly released models still fall well short of OpenAI’s GPT-3, the foundation model behind ChatGPT, which has 175 billion parameters. But unlike ChatGPT, which runs on Microsoft’s large cloud infrastructure, Cerebras said its smaller models can be deployed on phones or smart speakers, while the bigger ones can run on PCs or servers.
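As a rough illustration of how one of the smaller checkpoints might be tried out, the sketch below assumes the models are published on the Hugging Face Hub under an identifier such as cerebras/Cerebras-GPT-111M (a detail not given in the article) and uses the transformers library to sample a short continuation on an ordinary CPU:

```python
# Minimal sketch: trying one of the smaller open-source checkpoints locally.
# The Hugging Face Hub identifier below is an assumption, not stated in the article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "cerebras/Cerebras-GPT-111M"  # assumed repo name for the 111M-parameter model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Generative AI is"
inputs = tokenizer(prompt, return_tensors="pt")

# A 111M-parameter model is small enough to sample from on a laptop CPU.
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```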
However, Cerebras said complex tasks such as summarizing long passages will require the larger models; generally, the more parameters a model has, the more complex the generative tasks it can perform.
“There’s been some interesting papers published that show that (a smaller model) can be accurate if you train it more,” said Karl Freund, a chip industry analyst. “So there’s a trade-off between bigger and better trained.”
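One way to make Freund’s point concrete is the rule-of-thumb approximation from the scaling-law literature that training compute is roughly 6 × parameters × training tokens; under a fixed compute budget, a smaller model can therefore be trained on far more data. The sketch below applies that approximation to a hypothetical FLOP budget; none of the numbers come from the article:

```python
# Back-of-the-envelope view of the "bigger vs. better trained" trade-off,
# using the common approximation: training compute C ~ 6 * N * D
# (N = parameters, D = training tokens). The budget below is hypothetical.
def tokens_for_budget(compute_flops: float, params: float) -> float:
    """Training tokens a fixed compute budget affords a model with `params` parameters."""
    return compute_flops / (6 * params)

budget = 1e21  # hypothetical FLOP budget
for params in (13e9, 111e6):  # the two ends of the released parameter range
    print(f"{params / 1e9:6.3f}B params -> ~{tokens_for_budget(budget, params) / 1e9:,.0f}B tokens")
```

Under the same budget, the 111 million parameter model can be trained on over a hundred times more tokens than the 13 billion parameter one, which is the trade-off Freund describes.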
“There is a big movement to close what has been open-sourced in AI…it’s not surprising as there’s now huge money in it,” said Andrew Feldman, founder and CEO of Cerebras. “The excitement in the community, the progress we’ve made, has been in large part because it’s been so open.”
Founded in 2016 and based in Los Altos, California, Cerebras specializes in developing and manufacturing high-performance AI computer systems, used by organizations such as Argonne National Laboratory, Lawrence Livermore National Laboratory, and the Pittsburgh Supercomputing Center to accelerate their AI research and development.
One of the company’s main products is the Wafer Scale Engine (WSE), a massive semiconductor chip measuring 46,225 square millimeters, making it the largest computer chip in the world. The first-generation WSE contains 1.2 trillion transistors (the WSE-2 that powers the CS-2 raises that count to 2.6 trillion) and is designed to provide high-speed processing for AI applications such as natural language processing, image and video recognition, and drug discovery.
Buoyed by the surge of interest that ChatGPT has generated, Cerebras has raised substantial funding from investors including Benchmark, NEA, and Moore Capital.