Google I/O 2024: Everything That Was Announced

Published on 14 May, 2024, 8:32 PM IST

Updated on 21 Dec, 2024, 8:45 PM IST

Sahil Mohan Gupta

Google's strong focus on generative AI was evident: it did not announce updates for many of its core products.

Google's annual developer conference, Google I/O, concluded with a heavy focus on the company's Gemini AI models and their integration into various applications. The event, which lasted over two hours, featured numerous announcements highlighting Google's advancing AI technologies, particularly in the context of growing competition from industry rivals such as OpenAI, which had announced major updates to ChatGPT a day earlier.

OpenAI's Recent Announcements

Just a day before Google I/O, OpenAI announced the launch of GPT-4o, an updated iteration of its GPT-4 model, which powers the company's flagship product, ChatGPT. GPT-4o offers improved capabilities across text, vision, and audio, with the model's capabilities being rolled out iteratively. OpenAI also announced the opening of its GPT Store to all users, fostering an ecosystem of AI-powered applications and services, and the extension of several previously subscription-locked features to free users. The onus was therefore on Google not just to respond but also to assert its superiority, given that the transformer architecture underlying generative AI was invented at Google.

Enhanced Search Capabilities and New Features

One of the notable updates to Google Lens allows users to search by recording a video and asking questions during the recording. Google's AI will then analyse the video content and attempt to provide relevant answers from the web. This feature, combined with the "Ask Photos" functionality set to launch this summer, demonstrates Google's efforts to make search more interactive and intuitive. "Ask Photos," powered by Gemini, enables users to query their Google Photos library, with the AI model providing answers and relevant images based on the user's questions. CEO Sundar Pichai showcased this feature by asking Gemini to identify his licence plate number, with the model correctly providing the number and displaying a picture of the licence plate.

Gemini Model Updates and Improvements

Google introduced Gemini 1.5 Flash, a new multimodal model optimised for fast responses in specific tasks. The company also announced significant improvements to Gemini 1.5, enhancing its translation, reasoning, and coding abilities. The context window of Gemini 1.5 Pro has been doubled from 1 million to 2 million tokens, allowing the model to process and understand more information. These updates reflect Google's commitment to refining its AI models to better serve users' needs and compete with rival offerings like OpenAI's GPT-4o.

Workspace Integration and Project Astra

Paid subscribers of Workspace will soon benefit from the integration of Gemini 1.5 Pro in the sidebar of Docs, Sheets, Slides, Drive, and Gmail. This integration will transform Gemini into a general-purpose assistant, capable of retrieving information from Drive content and assisting with tasks such as writing emails and setting reminders.

Project Astra, Google's ambitious multimodal AI assistant, aims to become a comprehensive virtual assistant that can understand visual input, remember object locations, and perform tasks on behalf of users. A demonstration video showcased Astra's capabilities, such as identifying and explaining various elements in an office environment, locating misplaced objects like glasses, and even suggesting improvements to code displayed on a monitor. The video also hinted at the potential return of Google Glass, with Astra providing real-time information and suggestions based on the wearer's surroundings. Astra's ability to understand and respond to voice commands, observe its surroundings through a device's camera, and remember the location of objects marks a significant step forward in the development of AI assistants.

Video Generation and Custom Chatbots

Google introduced Veo, a generative AI model that can produce 1080p videos based on text, image, and video-based prompts. Google is offering Veo to select creators for use in YouTube videos and is also pitching it to Hollywood for potential use in films.

Gems, a custom chatbot creator built on Gemini, allows users to customise the chatbot's responses and specialisations. This feature will enable users to create chatbots tailored to their specific needs.

Enhanced Voice Chats and Multimodal Updates

To enhance voice chats with Gemini, Google introduced Gemini Live. This feature incorporates personality updates to Gemini's voice, allows users to interrupt the chatbot, and enables Gemini to watch through the user's smartphone camera and provide real-time information. Gemini will also receive new integrations with Google Calendar, Tasks, and Keep.

Android users can now use the Circle to Search feature to get help with maths problems by circling them on the screen. Google's AI will break down the problem into steps, making it easier for students to understand and complete.

AI in Google Search, Android, and Chrome

Google Search is receiving a significant AI update with "AI Overviews," formerly known as "Search Generative Experience." This feature, rolling out to everyone in the US this week, utilises a specialised Gemini model to design and populate search results pages with summarised answers from the web. The AI-generated overviews aim to provide users with concise and relevant information, streamlining the search experience and reducing the need to navigate through multiple web pages.

Android phones will soon feature AI-powered scam detection, using on-device Gemini Nano AI to identify common scammer conversation patterns and display warnings to users. This feature highlights Google's efforts to leverage AI to improve user security and protect against fraudulent activities.

Google announced that Gemini will soon allow Android users to ask questions about videos on-screen, with the AI model providing answers based on automatic captions. This functionality extends the multimodal capabilities of Gemini, enabling users to interact with video content more effectively. Additionally, paid Gemini Advanced users will have the ability to process PDFs and receive information from them, further enhancing the AI assistant's utility across various file formats.

Google Chrome is getting an AI assistant in the form of Gemini Nano. This built-in assistant will use on-device AI to help users generate text for various purposes within the Chrome browser on desktop, streamlining tasks such as composing emails, writing social media posts, or creating product reviews.

AI Watermarking Expansion

Google announced the expansion of SynthID, its AI watermarking technology. SynthID will now embed watermarking into content created with the Veo video generator and has gained the ability to detect AI-generated videos.

Audio Outputs in NotebookLM

Google showcased its progress in multimodal AI with Audio Overviews in NotebookLM, which uses Gemini 1.5 Pro to take source materials and generate a personalised and interactive audio conversation.

New Trillium TPUs

Google announced its 6th generation of TPUs, called Trillium, to support the growing demand for AI computing power. Trillium delivers a 4.7x improvement in compute performance per chip over the previous generation, TPU v5e. Google will make Trillium available to its Cloud customers in late 2024.

Alongside TPUs, Google offers CPUs and GPUs to support any workload, including the new Axion processor, Google's first custom Arm-based CPU, which the company says delivers industry-leading performance and energy efficiency. Google will also be one of the first Cloud providers to offer Nvidia's Blackwell GPUs, available in early 2025.

Google's AI Hypercomputer will allow businesses and developers to tackle more complex challenges with more than twice the efficiency relative to buying raw hardware and chips. The AI Hypercomputer advancements are made possible in part by Google's approach to liquid cooling in its data centres, which has been in use for nearly a decade and has a total deployed fleet capacity of nearly 1 gigawatt.

Google also revealed that its network, spanning more than 2 million miles of terrestrial and subsea fibre, connects its infrastructure globally and has over 10 times the reach of the next leading cloud provider.