[Natural Language Processing and Text Analysis] The two sub-models of word2vec (a supervised neural network model): skip-gram and CBOW
2022-07-22 00:39:00 【Sunny qt01】
- Word2vec prerequisites
First, let us explain how a neural network operates.
On the far left is the input layer (3 neurons). The weights in the middle are the weights leading into the hidden layer, bias is the bias term, and the node in the middle performs the summation.
This is where the products are computed: Z = 4 is obtained by multiplying each input neuron by its weight and adding the bias. The result is then passed through the activation function; this last part is the function processing.
Taking these parts as the basic unit and connecting many of them together gives the following neural network.
One neuron is connected to the next, and the connection passes through the activation function.
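The single-neuron computation described above can be sketched in a few lines. This is only an illustration: the input values, weights, and bias below are hypothetical numbers chosen so that the weighted sum comes out to Z = 4 as in the text, and the sigmoid is an assumed choice of activation function, since the original figure is not available.

```python
import numpy as np

def sigmoid(z):
    """Assumed activation function; the article does not say which one it uses."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """Weighted sum of the inputs plus the bias, then the activation."""
    z = np.dot(w, x) + b
    return z, sigmoid(z)

# Hypothetical values for a 3-neuron input layer, picked so that z == 4.
x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, 0.5, 0.5])
b = 1.0

z, a = neuron(x, w, b)
print("z =", z)           # weighted sum plus bias
print("activation =", a)  # value passed on to the next neuron
```

Stacking many such units, with the activation of one feeding the weighted sum of the next, gives the multi-layer network discussed in the rest of the article.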
- The XOR problem
XOR is not linearly separable, so it cannot be solved with multiple linear regression: whatever line is fitted, some of the predictions must be wrong.
The input data has 4 rows, the hidden layer has two nodes, the output is 1 or 0, and we train for 500 iterations.
In each of these 500 iterations the weights are adjusted, which in turn changes the values in the hidden layer.
We find that the error keeps falling over the 500 iterations.
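A minimal sketch of such a training run is shown below, assuming a 2-2-1 network with sigmoid activations, a squared-error loss, and plain gradient descent. The learning rate, random seed, and initialisation are assumptions, not the article's settings, and a run this short is not guaranteed to classify all four rows correctly; the point is only to show the error being recorded and reduced across the 500 iterations.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The four XOR rows and their labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)   # fixed seed, an arbitrary choice
W1 = rng.normal(size=(2, 2))     # input -> hidden (two hidden nodes)
b1 = np.zeros(2)
W2 = rng.normal(size=(2, 1))     # hidden -> output
b2 = np.zeros(1)
lr = 0.5                         # assumed learning rate

errors = []
for _ in range(500):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    err = out - y
    errors.append(float(np.mean(err ** 2)))

    # Backward pass: gradients of the squared error through the sigmoids.
    d_out = err * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Adjust the weights and biases.
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

print("first error:", errors[0])
print("last error: ", errors[-1])
```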
The final adjusted weights are:
Let's take a look at the result table.
This raises a question: values that were originally not linearly separable can now be classified successfully once a hidden layer is added.
So we want to know what the hidden layer actually does.
We can feed in 0,0 and see what the hidden layer outputs. Taking these outputs and fitting a logistic regression (LR) model on them (this is the neural network process described above), we find that once the inputs have passed through the hidden layer they become linearly separable.
Observe the output of the hidden layer.
We can see that going from the input layer (Input Layer) to the hidden layer (Hidden Layer) is a transformation of the data, which can also reduce the dimension of the analysis.
Because our input layer has only 2 nodes, no dimensionality reduction takes place in this case. But if we train a neural network on the transformed features New_X, we again find that the accuracy is 100%.
Neural networks can help us generate new features.
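To make the role of the hidden layer concrete, here is a small sketch that uses hand-picked weights rather than the trained weights from the article (which are not reproduced here). With a step activation, one hidden node behaves like OR and the other like AND; the names `W_hidden`, `b_hidden`, `w_out`, and `b_out` are hypothetical. In the hidden-layer space the four XOR rows can be split by a single linear boundary.

```python
import numpy as np

def step(z):
    """Threshold activation: 1 where z > 0, otherwise 0."""
    return (z > 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# Hand-picked hidden layer (an assumption, not trained weights).
# Column 0 acts like OR, column 1 acts like AND.
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])

hidden = step(X @ W_hidden + b_hidden)   # the new features
print(hidden)

# A single linear boundary on the new features: OR minus AND.
w_out = np.array([1.0, -1.0])
b_out = -0.5
pred = step(hidden @ w_out + b_out)
print(pred, (pred == y).all())
```

The rows (0,1) and (1,0) collapse onto the same hidden point, which is what makes one straight line sufficient after the transformation.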
With this foundation in place, we move on to Word2Vec.
- Word2Vec (a supervised model)
Model 1 (Skip-gram)
CBOW
The input layer is the keyword, and the output layer is the result.
The hidden layer has two matrices: matrix 1 is the word embedding matrix, and matrix 2 stands for
We take the previous 5 articles and train on them to obtain the word embeddings.
Two words end up far apart when the words used before and after them are completely different.
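The skip-gram setup described above (a keyword goes in, its surrounding words are the targets, and matrix 1 holds the embeddings) can be sketched as follows. The toy sentence, the window size, the embedding dimension, and the helper name `skipgram_pairs` are all assumptions for illustration; the embedding matrix here is random rather than trained.

```python
import numpy as np

def skipgram_pairs(tokens, window=1):
    """Build (keyword, context word) training pairs from a token list."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

tokens = "the cat sat on the mat".split()   # toy sentence
vocab = sorted(set(tokens))
index = {word: i for i, word in enumerate(vocab)}

pairs = skipgram_pairs(tokens, window=1)
print(pairs[:3])

# Matrix 1: one row per vocabulary word (random here, learned in practice).
rng = np.random.default_rng(0)
embedding = rng.normal(size=(len(vocab), 3))

# Multiplying a one-hot input by matrix 1 just selects that word's row.
one_hot = np.zeros(len(vocab))
one_hot[index["cat"]] = 1.0
hidden = one_hot @ embedding
print(np.allclose(hidden, embedding[index["cat"]]))
```

Because the one-hot product is a row lookup, the rows of matrix 1 are exactly the word vectors that are kept after training.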
Model 2 (CBOW)
Because a word has many words before and after it, we can use the one-hot encoding of every word as an input and then average the hidden-layer results of all of them. From the result we can tell which category the news article belongs to.
The matrix is as follows:
Likewise, if we feed in article 6, the model also predicts the appropriate result.
Similarly, we can find related words and synonyms by computing distances between the word vectors.
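The averaging step in CBOW can be sketched like this. The vocabulary, the matrix `W`, and the helper names are hypothetical, and `W` is random instead of trained; the sketch only shows how the one-hot inputs of several words are turned into a single hidden vector, which a classifier could then use.

```python
import numpy as np

vocab = ["stocks", "market", "team", "match", "goal"]   # toy vocabulary
index = {word: i for i, word in enumerate(vocab)}

rng = np.random.default_rng(0)
W = rng.normal(size=(len(vocab), 3))   # input -> hidden matrix (random here)

def one_hot(word):
    """One-hot encoding of a vocabulary word."""
    vec = np.zeros(len(vocab))
    vec[index[word]] = 1.0
    return vec

def cbow_hidden(words):
    """Average the hidden-layer results of every input word."""
    vectors = [one_hot(word) @ W for word in words]
    return np.mean(vectors, axis=0)

article = ["team", "match", "goal"]    # hypothetical article tokens
h = cbow_hidden(article)
print(h)
```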
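One common way to compute such a distance is cosine similarity; the article does not say which measure it uses, so this is an assumption. The three vectors below are made-up values, not trained embeddings, and `most_similar` is a hypothetical helper.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def most_similar(word, embeddings, top_n=1):
    """Rank the other words by cosine similarity to the given word."""
    target = embeddings[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

# Made-up vectors for illustration only.
embeddings = {
    "king": np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.85, 0.9, 0.15]),
    "apple": np.array([0.1, 0.2, 0.95]),
}

nearest = most_similar("king", embeddings, top_n=1)
print(nearest)
```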