Fusion paper notes (1): fusing language models with acoustic models
Fusing language models with acoustic models. Fusion explained: fusing E2E models with LMs trained with text data (usually referred to this as fu
...
LM shallow fusion
GitHub project: End-to-end-ASR-Pytorch
GitHub project: espnet
[Key reference] GitHub project: neural_sp
Two-P
...
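As a rough illustration of what shallow fusion does at decoding time, the sketch below combines an E2E model's log-probability with an external LM's log-probability through a single interpolation weight and reranks hypotheses by the combined score. The function names, the tuple layout, and the default weight of 0.3 are assumptions made for this example; they are not taken from End-to-end-ASR-Pytorch, espnet, or neural_sp.

```python
from typing import List, Tuple

# (text, log-prob from the E2E model, log-prob from the external LM)
Hypothesis = Tuple[str, float, float]


def shallow_fusion_score(log_p_e2e: float, log_p_lm: float, lm_weight: float = 0.3) -> float:
    """Combined score: log p_E2E(y|x) + lm_weight * log p_LM(y)."""
    return log_p_e2e + lm_weight * log_p_lm


def rerank(hyps: List[Hypothesis], lm_weight: float = 0.3) -> List[Hypothesis]:
    """Sort hypotheses from best to worst under the fused score."""
    return sorted(
        hyps,
        key=lambda h: shallow_fusion_score(h[1], h[2], lm_weight),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical scores, made up purely for illustration.
    candidates = [("hyp a", -1.0, -5.0), ("hyp b", -1.2, -2.0)]
    print(rerank(candidates, lm_weight=0.3)[0][0])
```

In a real decoder the LM term is usually added per token inside beam search rather than once per finished hypothesis; the reranking form is used here only because it is the shortest way to show the scoring rule.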
NLP study notes
http://jalammar.github.io/a-visual-guide-to-using-bert-for-the-first-time/
[CLS] token
http
...
Code-switching language models
GitHub: https://github.com/sagorbrur/codeswitch
https://huggingface.co/sagorsarker/co
...
Using Hugging Face
https://github.com/huggingface/transformers
https://github.com/tal-tech/edu-bert
李理 Hu
...
Hugging Face training code
https://huggingface.co/docs/transformers/main/en/perf_train_gpu_one
https://huggingf
...
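jieba word segmentation
Note: OOV words still occur even when segmenting with a custom dictionary. Segmentation with the dictionary turned out worse than segmentation without it.
Note a pitfall: after segmentation, let’s becomes let ‘ s, so the segmented data needs post-processing: sed 's/ '\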
...
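The apostrophe pitfall above can also be handled in Python. The sketch below is not the author's sed command (which is cut off in the excerpt); it is one possible equivalent that assumes tokens are separated by single spaces and that only apostrophes sitting between two Latin letters should be rejoined. The function name `rejoin_apostrophes` is hypothetical.

```python
import re

# A straight or curly apostrophe with one space on each side,
# directly between two Latin letters: "let ' s" -> "let's".
_SPLIT_APOSTROPHE = re.compile(r"(?<=[A-Za-z]) (['\u2018\u2019]) (?=[A-Za-z])")


def rejoin_apostrophes(line: str) -> str:
    """Undo the 'let ' s' split produced by word segmentation."""
    return _SPLIT_APOSTROPHE.sub(r"\1", line)


if __name__ == "__main__":
    print(rejoin_apostrophes("let ' s go"))
```

Restricting the match to Latin letters on both sides keeps quotation marks between Chinese tokens untouched.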
kenlm
import kenlm
# Load the file into a kenlm language model
model = kenlm.LanguageModel("/data/NLP/Lang
...
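Once a model is loaded, kenlm returns sentence scores as log10 probabilities. The sketch below shows how such a score maps to perplexity, assuming the score includes the end-of-sentence token (hence the `+ 1` in the token count). The helper name `perplexity_from_log10` is made up for this example, and the kenlm call appears only in a comment so the block runs without a model file.

```python
def perplexity_from_log10(log10_score: float, n_words: int) -> float:
    """Convert a sentence-level log10 probability into perplexity.

    n_words is the number of whitespace-separated words; one extra
    token is counted for the end-of-sentence symbol </s>.
    """
    n_tokens = n_words + 1
    return 10.0 ** (-log10_score / n_tokens)


# With a loaded model this would typically be used as (assumed usage):
#   score = model.score(sentence, bos=True, eos=True)
#   ppl = perplexity_from_log10(score, len(sentence.split()))

if __name__ == "__main__":
    # Hypothetical score for a two-word sentence.
    print(perplexity_from_log10(-3.0, 2))
```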
N-gram LM experiments (1)
N-gram LM
https://zhuanlan.zhihu.com/p/273606445
Training an n-gram LM. Data: train set (27.15 million lines, of which 2.12 million contain English)
1loca
...
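Statistics like the line counts above (total lines versus lines containing English) can be gathered with a short script. This is only a sketch: it assumes one utterance per line and treats any line with an ASCII letter as "containing English", which may differ from how the original numbers were produced.

```python
import re
from typing import Iterable, Tuple

_LATIN_LETTER = re.compile(r"[A-Za-z]")


def count_lines_with_english(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (non-empty line count, count of lines with an ASCII letter)."""
    total = 0
    with_english = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        total += 1
        if _LATIN_LETTER.search(line):
            with_english += 1
    return total, with_english


if __name__ == "__main__":
    # Made-up sample lines for illustration.
    sample = ["今天 天气 很 好", "我 想 听 hello kitty", ""]
    print(count_lines_with_english(sample))
```

Passing an open file handle instead of a list streams the corpus line by line, which matters at the scale of tens of millions of lines.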