I Met Me | Cultivating Virtual Digital Humans: The FaceGood Virtual Digital Human Open Source Technology Seminar
2022-07-20 19:30:00 【Magic Data】
At the start of 2022, Magic Data, together with the virtual digital human production company FaceGood, Tsinghua University, Spichi and other institutions, held a virtual digital human open source technology seminar. Dr. Zhang Qingqing, founder and CEO of Magic Data, was invited to attend. At the forum, participants had a lively exchange on virtual digital human driving technology, soft tissue motion capture and tracking, interaction technology, and data processing technology.
Magic Data's multimodal data processing system, Annotator5.0, provides basic and important multi-dimensional data support for the underlying architecture of virtual digital humans.
The rise of virtual digital humans
In the huge metaverse market, high-precision, intelligent virtual digital humans seem to have penetrated all walks of life overnight, and they are drawing manufacturers at home and abroad to speed up their moves into the metaverse race. Being able to meet a version of oneself in another universe with a different voice, a different appearance, or even a different gender gives people living in reality something concrete to look forward to.
Virtual digital humans need multimodal technology and rich data support to become “three-dimensional”. Digital humans of the past were mostly just “commodities” that existed at the sensory level. They could smile, talk, and sing, but they had no feelings and did not communicate with people.
Emotional human-computer interaction: “conversational AI” technology and data
The ultimate form of artificial intelligence is meeting emotional needs. Virtual digital humans can stimulate human emotional needs, on the premise that they interact with people. “Conversational AI” technology and data make communication between virtual digital humans and people possible.
“Conversational AI” requires accuracy and efficiency across the whole pipeline: converting the user's speech to text, understanding the meaning of the text, searching for the best response that matches the context, and finally using a text-to-speech tool to deliver the response.
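The four stages described above can be sketched as a chain of functions. This is only an illustrative outline, not Magic Data's or FaceGood's implementation: `transcribe`, `understand`, `select_response`, `synthesize`, `DialogueState`, and the `RESPONSES` table are all hypothetical names, and each stage is a trivial stand-in for what would be a trained model in a real system.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    """Running context for one conversation (hypothetical structure)."""
    history: list = field(default_factory=list)


# Hypothetical canned replies, keyed by intent.
RESPONSES = {
    "greeting": "Hello, nice to meet you.",
    "ask_weather": "I cannot check the weather yet.",
    "unknown": "Sorry, could you say that again?",
}


def transcribe(audio: bytes) -> str:
    # Stand-in for speech recognition: the "audio" here is just UTF-8 text.
    return audio.decode("utf-8").strip().lower()


def understand(text: str) -> str:
    # Stand-in for language understanding: keyword match on whole words.
    words = set(text.split())
    if words & {"hello", "hi"}:
        return "greeting"
    if "weather" in words:
        return "ask_weather"
    return "unknown"


def select_response(intent: str, state: DialogueState) -> str:
    # Stand-in for response search: the reply depends on the intent and on
    # the conversation so far, so a repeated greeting is answered differently.
    if intent == "greeting" and "greeting" in state.history:
        reply = "Hello again."
    else:
        reply = RESPONSES[intent]
    state.history.append(intent)
    return reply


def synthesize(text: str) -> bytes:
    # Stand-in for text-to-speech: returns bytes instead of a waveform.
    return text.encode("utf-8")


def respond(audio: bytes, state: DialogueState) -> bytes:
    """Run one user turn through all four stages in order."""
    text = transcribe(audio)
    intent = understand(text)
    reply = select_response(intent, state)
    return synthesize(reply)
```

Calling `respond(b"Hello there", DialogueState())` walks a single turn through all four stages. Because each stage consumes the previous stage's output, a recognition error in the first step carries through to the rest, which is one reason accuracy matters at every stage.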
On a technical level, conversational AI involves core technologies such as speech recognition, natural language understanding, and speech synthesis. Achieving more natural dialogue between people and machines through these technologies faces two technical challenges:
The first is the differentiation of individual language systems. Because of differences in region, culture, and educational background, everyone's way of expressing themselves is unique. This personalization inevitably leads to misunderstanding even between people, let alone with a machine. For AI, Chinese is not one language; it is 1.3 billion languages.
The second is that conversational spoken language often contains inverted word order, hesitation, and the pauses that hesitation causes. In complex multi-party interactions, utterances also inevitably involve interruptions, speakers cutting in on one another, overlapping speech, and other issues. These speech characteristics make AI modeling very difficult.
Real dialogue data and multilingual corpus construction are the key to solving these problems: machines are given knowledge graphs and language material in Chinese, dialects, foreign languages, and more, so that they can understand natural language the way people do.
1. Speech recognition technology and data
Speech recognition mainly converts the words and other content in human speech into computer-readable input. This is the first step for a computer learning human language. The highly individual ways of speaking mentioned above, along with the inversions and hesitations of spoken dialogue, are all input “content” and a very important learning element for machines.
2. Speech synthesis technology and data
Speech synthesis mainly turns text, whether generated by the computer itself or supplied from outside, into fluent spoken Chinese output that people can understand. People always attach tone and emotion to what they say. Synthesized audio aims to imitate a real human voice, so the prosody of the text has to be predicted: where to pause, how long each pause should be, which characters or words to stress, and which to read lightly, so that the voice rises and falls with a natural cadence.
3. Natural language understanding technology and datasets
Natural language understanding, mainly by understanding and analyzing input data, lets people and machines communicate effectively in natural language. The machine can not only “understand human speech” but also “speak like a human”.
More dataset examples can be found on the MagicData official website:
https://www.magicdatatech.cn/datasets
We hope that in the future virtual humans will not only communicate with people but also be as individual as people are, truly like “me”: with the same movement habits, the same expressions, a familiar tone of voice, and so on. In human-computer interaction, if a machine is to perceive human emotion, sound alone is not enough; much of the information is conveyed through facial expressions or the content of what is said. Emotion perception is the result of a combined multimodal evaluation. All of this requires customized multimodal data collection and characterization for each individual.
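One simple way to picture a combined multimodal evaluation is weighted late fusion, where each modality produces its own emotion scores and the scores are averaged. The sketch below is an assumption made for illustration, not a description of how Annotator5.0 or any FaceGood system works; the function names, the modality names, the emotion labels, and the weights are all hypothetical.

```python
def fuse_emotions(scores_by_modality: dict, weights: dict) -> dict:
    """Weighted average of per-modality emotion scores (late fusion)."""
    # Normalize over the modalities actually present, so a missing
    # modality does not shrink the fused scores.
    total_weight = sum(weights[m] for m in scores_by_modality)
    fused = {}
    for modality, scores in scores_by_modality.items():
        w = weights[modality] / total_weight
        for emotion, p in scores.items():
            fused[emotion] = fused.get(emotion, 0.0) + w * p
    return fused


def predict_emotion(scores_by_modality: dict, weights: dict) -> str:
    """Return the emotion label with the highest fused score."""
    fused = fuse_emotions(scores_by_modality, weights)
    return max(fused, key=fused.get)


# Hypothetical per-modality outputs for one utterance.
observation = {
    "voice": {"neutral": 0.6, "happy": 0.4},
    "face": {"neutral": 0.2, "happy": 0.8},
    "text": {"neutral": 0.3, "happy": 0.7},
}
equal_weights = {"voice": 1.0, "face": 1.0, "text": 1.0}
```

With these made-up numbers, the voice scores alone favor `neutral`, while fusing all three modalities favors `happy`, which mirrors the point that sound alone is not enough.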
Magic Data's multimodal data processing system, Annotator5.0, provides the most basic and important multi-dimensional data support for building virtual digital humans. At its foundation, Annotator5.0 uses a large amount of AI technology: data preprocessing captures basic characterization capabilities, and later manual fine-grained processing further optimizes the features, so that each individual can be depicted as faithfully as possible.
PC trial link:
https://www.magicdatatech.cn/
Data is the foundation of artificial intelligence. Whether for the metaverse or for virtual digital humans, all AI development is inseparable from data. Only by using data reasonably and effectively can machines better understand people and people better explore the unknown.