ChatGPT can now hold spoken conversations, and it sounds pretty human!

Posted by: icemessenger, 2023-09-26 3:55


You Can Now Talk With ChatGPT and It Sounds Like a Human (Pretty Much)



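ChatGPT can now speak out loud, and its natural voice, conversational tone and eloquent answers are at times almost indistinguishable from a human. It can also “see” you now.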



OpenAI’s ChatGPT now has a voice, making it more like other AI assistants.


You’ll have two reactions to hearing my conversation with the now-vocal ChatGPT:


1) Holy crap! This is the future of communicating with computers that sci-fi writers promised us.


2) I’m building an underground bunker and stockpiling toilet paper and granola bars.


Yes, OpenAI’s popular chatbot is speaking up, literally. The company on Monday announced an update to its iOS and Android apps that will allow the artificially intelligent bot to talk out loud in five different voices. I’ve been doing a lot of talking with ChatGPT over the past few days, and testing another new tool that lets the bot respond to images you show it.


What is ChatGPT like now?


Think Siri or Alexa except…not. The natural voice, the conversational tone and the eloquent answers are almost indistinguishable from a human at times. Remember “Her”? The movie where Joaquin Phoenix falls in love with an AI operating system that’s really a faceless Scarlett Johansson? That’s the vibe I’m talking about.


“It’s not just that typing is tedious,” Joanne Jang, a product lead at OpenAI, told me in an interview. “You can now have two-way conversations.”


The new photo-comprehension tool also makes the bot more interactive. You can snap a shot and ask ChatGPT questions about it. Spoiler: It’s terrible at Tic-Tac-Toe. The image and voice features will be available over the next few weeks for those who subscribe to ChatGPT Plus for $20 a month.


In essence, OpenAI is giving its chatbot a mouth and eyes. I’ve been running both features through tests: a best-friend chat, plumbing repairs, games. It’s all very cool and…creepy.


The mouth


Before we go any further, crank up the volume and listen to our brief conversation:


While the system is just reading back a ChatGPT text response, this isn’t the robotic, staid text-to-speech we’ve grown up with. There are five available voices and each of them sounds like a real human is talking to you: there’s cadence, intonation and personality.


These voices were generated from “just a few seconds of sample speech” provided by professional voice actors, Jang told me. Those samples are then run through OpenAI’s computer models to create text-to-speech voices. Remember my column and video where I used AI tools to clone my voice? It’s like that. But better.


OpenAI says it is collaborating with some other organizations, allowing them to develop synthetic voices. It’s working with Spotify on a tool that helps translate podcasters’ voices into other languages. Given how easy it could be to clone someone’s voice with just seconds of audio, for the safety of the entire internet (and really, the world), the company says the technology is available only to business partners right now. Could that change in the future? Good luck to us all.


Unlike Siri or Alexa, there’s no wake word to summon ChatGPT. In the app’s settings menu, enable “Voice conversations” and then tap the headphone icon in the app’s upper-right corner. A white circle morphs into a comic-book-style thought bubble as the system listens for your prompt. There’s a button to tap to interrupt lengthy responses.


I have been captivated by it all. The natural voice, combined with the advanced answers and the system’s knowledge of me, makes it feel like I’m having a real conversation. When I asked it to pretend to be my best friend and talk to me, we had a solid five-minute chat about my day at work, video production and the snacks we like. It did just as well when I asked it to explain Pokémon to me like I’m a 6-year-old.


But you’re definitely still talking to a machine. The response time, as you can hear in the clip above, can be extremely slow, and the connection can fail (restarting the app helps). A few times it abruptly cut off the conversation. (I thought only rude humans did that!) OpenAI says that the issues I encountered were due to an early version of the app I was given to test and that consumers shouldn’t experience them.


The eyes


If voice gives ChatGPT the ability to talk to the world, the new camera feature gives the bot the ability to see it. Instead of describing something in words, you can now tap the + button in the iOS, Android and web apps, upload or snap a photo, circle the area you want the AI to focus on and ask a question. Here were some images I tried:


Broken house stuff: A shot of the leaking hose in my garage with just the prompt “How do I fix this?” quickly returned seven steps, including wrapping the threads on the connection with Teflon tape.



ChatGPT the plumber? With just a photo, the AI can offer advice on how to patch a leak.


Food: A photo of a moldy strawberry with the question “Can I eat this?” Great advice: No. A photo of bananas, eggs and (non-moldy) strawberries with the question “What can I make with this?” Great advice: Strawberry-banana pancakes.


Injuries and health issues: It quickly recognized a cut on my son’s cheek as a “mark or rash” but said “I cannot help with that” and “it’s best to consult with a medical professional.”


Games and puzzles: A photo of a stalemate in Tic-Tac-Toe? ChatGPT didn’t know the game was over. It said to place my X in the (already occupied) bottom center. It said I would win and even added an exclamation mark and confetti emoji. Wrong!


That’s what we really have to remember at this moment in the AI revolution. As the lines continue to blur between human and bot interactions, these systems can lack context and depth, and are often wrong.


As my new ChatGPT voice friend said to me, “While I sound conversational, remember I’m just processing data. Always use your judgment, especially for important matters.”

