Google launches Gemini, its largest and most powerful AI model to challenge ChatGPT
After several months of playing catch-up, Google has officially unveiled Gemini, its latest and most powerful AI model, claiming superiority over OpenAI’s GPT-3.5.
The announcement follows a report by The Information suggesting a potential delay in the full launch until 2024. Google supposedly delayed the launch of Gemini due to its lack of readiness, evoking memories of the company’s turbulent release of its AI tools earlier this year.
However, under increasing pressure to address how it plans to monetize AI, Google has accelerated the release of revolutionary AI models to compete in the intensifying AI race. Unlike other AI models, Gemini has been trained to identify, comprehend, and integrate various forms of information, such as text, images, audio, video, and code. Its cutting-edge performance introduces impressive new capabilities, all while prioritizing safety and responsibility at its foundation.
The Gemini large language models offer three different sizes: Gemini Ultra, the most extensive and capable category; Gemini Pro, designed for a broad range of tasks; and Gemini Nano, tailored for specific tasks and mobile devices.
Initially, Google intends to license Gemini to customers through Google Cloud for integration into their applications. Starting December 13, developers and enterprise customers can access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI. Android developers will also have the opportunity to leverage Gemini Nano. Furthermore, Google plans to deploy Gemini in its products, such as the Bard chatbot and Search Generative Experience, aiming to respond to search queries with conversational-style text (SGE availability is limited at present).
Gemini Ultra is highlighted as the first model to surpass human experts in Massive Multitask Language Understanding (MMLU), which assesses knowledge and problem-solving abilities across 57 subjects like math, physics, history, law, medicine, and ethics. Google asserts that Gemini Ultra can comprehend nuance and reasoning in complex subjects.
In a blog post, CEO Sundar Pichai emphasized that Gemini was the outcome of extensive collaboration across Google teams, incorporating multimodal capabilities to seamlessly understand and operate across various types of information, including text, code, audio, image, and video.
“Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research,” wrote CEO Sundar Pichai in a blog post on Wednesday. “It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across, and combine different types of information including text, code, audio, image, and video.”
Meanwhile, starting today, Google’s chatbot Bard will integrate Gemini Pro, enhancing advanced reasoning, planning, and understanding capabilities. Executives indicated that early next year, “Bard Advanced,” utilizing Gemini Ultra, will be launched, representing a significant update to Bard, akin to its ChatGPT-like chatbot.
Below is a video of Google CEO discussing how Gemini marks the next phase on the company’s “journey to making AI more helpful for everyone.”