Google has launched Gemini, a new artificial intelligence system that can seemingly understand and speak intelligently about almost any kind of prompt—pictures, text, speech, music, computer code, and much more.
This type of AI system is known as a multimodal model. It's a step beyond models that can handle only text or only images, and it provides a strong hint of where AI may be going next: analyzing and responding to real-time information from the outside world.
Although Gemini’s capabilities might not be quite as advanced as they seemed in a viral demonstration video, which was edited together from carefully curated text and still-image prompts, it is clear that AI systems are rapidly advancing and heading toward handling ever more complex inputs and outputs.