OpenAI recently announced powerful new ChatGPT capabilities, including the ability to use pictures in addition to text prompts for conversations with the AI chatbot.
The company offered examples:
“Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow-up questions for a step-by-step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.”
(The company also announced that its mobile app would support voice input and output for the chatbot. You’ll be able to talk to ChatGPT, just as dozens of third-party apps already allow. OpenAI officials also announced that ChatGPT will soon be able to access Microsoft’s Bing search engine for additional information.)
OpenAI isn’t the only AI company promising picture prompts.
Meta’s new camera glasses
Meta, the company formerly known as Facebook, recently unveiled the second version of its camera glasses, created in a partnership with EssilorLuxottica’s Ray-Ban division. The new specs, which cost $299 and ship Oct. 17, boast more and better cameras, microphones and speakers than the first version — and they enable live streaming to Facebook and Instagram.
Gadget nerds and social influencers are excited about these features. But the real upgrade is artificial intelligence (AI). The glasses contain Qualcomm’s powerful new AR1 Gen 1 chip, which lets wearers of the Meta Ray-Ban smartglasses hold conversations with AI via the built-in speakers and microphones. But this is not just any old AI.
In a related announcement, Meta unveiled a ChatGPT alternative called Meta AI that also supports voice chat, with responses spoken by any of 28 available synthetic voices. Meta has been baking Meta AI into all its social platforms (including the glasses) — and Meta AI will also be able to search Microsoft’s Bing search engine for more up-to-date information than what its underlying Llama large language model (LLM) has been trained on.
Meta promised a software update next year that will make the Meta Ray-Ban glasses “multimodal.” Instead of interacting with the Meta AI chatbot only through voice, the glasses will gain the ability to accept “picture prompts,” as OpenAI’s ChatGPT now does. But instead of requiring users to upload an image file, the Meta Ray-Ban glasses will simply grab the image using the cameras built into the glasses.
While wearing the glasses, you’ll be able to look at a building and ask, “What building is this?” and the AI will tell you the answer. Meta also promised real-time language translation of signs and menus, instructions for repairing whatever household appliance you’re looking at, and other uses. I expect it’s only a matter of time before the glasses tell you who you’re talking to through…
Published 2023-10-10. Source: www.computerworld.com