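N-gram experience

Posted on 2022-07-22 | In Language Models

BIGO's end-to-end speech recognition technology. In tasks that lack paired speech-text data, shallow fusion improves recognition accuracy considerably. However, typical fusion techniques use only one external LM, RNNLM (Recurrent ...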
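
As a rough illustration of the shallow fusion mentioned in this excerpt, the sketch below rescores hypotheses by adding a weighted external LM log-probability to the ASR log-probability. The function names, the weight of 0.3, and the toy scores are all assumptions made for illustration; they are not taken from the BIGO system.

```python
def shallow_fusion_score(asr_logprob, lm_logprob, lm_weight=0.3):
    """Combine ASR and external LM log-probabilities (lm_weight is an assumed value)."""
    return asr_logprob + lm_weight * lm_logprob


def rerank(hypotheses, lm_weight=0.3):
    """Return the text of the hypothesis with the highest fused score.

    hypotheses: list of (text, asr_logprob, lm_logprob) tuples.
    """
    best = max(
        hypotheses,
        key=lambda h: shallow_fusion_score(h[1], h[2], lm_weight),
    )
    return best[0]


# Toy scores, invented for illustration only.
hyps = [
    ("recognize speech", -4.0, -6.0),
    ("wreck a nice beach", -3.8, -12.0),
]
best_with_lm = rerank(hyps, lm_weight=0.3)
best_without_lm = rerank(hyps, lm_weight=0.0)
```

With the LM weight set to zero the acoustically preferred hypothesis wins; with a positive weight the external LM can change the ranking.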

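N-gram LM experiments (2)

Posted on 2022-07-22 | In Language Models

40 million sentences are used to train language model G1, and 7 million Chinese-English sentences to train language model G2. The two models are interpolated, and the interpolation weight is chosen by the overall perplexity (PPL) on the test set. G1 and G2 each compute PPL on the test set ==computed at the word level== 1 ...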
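
One way to picture choosing the interpolation weight by test-set PPL is a simple grid search, sketched below. It assumes each model has already produced a per-word probability for every word of the test set; the lists `p_g1` and `p_g2` are invented toy numbers, and the helper names are hypothetical rather than part of any toolkit.

```python
import math


def perplexity(probs):
    """Perplexity from per-word probabilities: exp of the mean negative log-probability."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def interpolate(p1, p2, lam):
    """Linear interpolation of two models' per-word probabilities."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(p1, p2)]


def best_weight(p1, p2, steps=100):
    """Grid-search the weight on the first model that minimizes perplexity."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda lam: perplexity(interpolate(p1, p2, lam)))


# Toy per-word probabilities from G1 and G2 on the same test words (assumed values).
p_g1 = [0.20, 0.01, 0.30, 0.02]
p_g2 = [0.02, 0.25, 0.01, 0.40]

lam = best_weight(p_g1, p_g2)
ppl_mix = perplexity(interpolate(p_g1, p_g2, lam))
```

Because the grid includes 0 and 1, the interpolated perplexity found this way is never worse than that of either model alone on the same data.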

Posted on 2022-07-22 | In Language Models

Perplexity (PPL) https://huggingface.co/docs/transformers/perplexity https://huggingface.co/spaces ...

Posted on 2022-07-22 | In Language Models

Selecting text to train a domain LM ==Moore, Robert C., and William Lewis. “Intelligent selection of language mod ...
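
The selection criterion usually associated with the Moore and Lewis reference is the cross-entropy difference: score each candidate sentence by its per-word cross-entropy under an in-domain LM minus that under a general LM, and keep the lowest-scoring sentences. The sketch below illustrates the idea with add-one-smoothed unigram models as a stand-in for real n-gram LMs; the corpora and helper names are invented for illustration.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count words; returns (counts, total) as a toy unigram model."""
    counts = Counter(w for s in sentences for w in s.split())
    return counts, sum(counts.values())


def cross_entropy(sentence, model, vocab_size):
    """Per-word cross-entropy (nats) under an add-one-smoothed unigram model."""
    counts, total = model
    words = sentence.split()
    logp = sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in words)
    return -logp / len(words)


def cross_entropy_difference(candidates, in_domain, general):
    """Return (score, sentence) pairs sorted so the most in-domain-like come first."""
    vocab = {w for s in in_domain + general + candidates for w in s.split()}
    m_in, m_gen = train_unigram(in_domain), train_unigram(general)
    scored = [
        (cross_entropy(s, m_in, len(vocab)) - cross_entropy(s, m_gen, len(vocab)), s)
        for s in candidates
    ]
    return sorted(scored)


# Toy corpora, assumed for illustration only.
in_domain = ["play the next song", "turn up the volume", "pause the music"]
general = [
    "the stock market fell today",
    "the weather is sunny today",
    "parliament passed the budget",
]
candidates = ["play the music", "the market is closed today"]

ranked = cross_entropy_difference(candidates, in_domain, general)
```

In practice a threshold on the score, tuned on held-out in-domain data, decides how many sentences to keep.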

Posted on 2022-07-22 | In Language Models

Language model experience https://cloud.tencent.com/developer/article/1116533 We first used an n-gram LM to generate word lattices, and ...

Posted on 2022-07-22 | In Language Models

Autoregressive language models https://github.com/DengBoCong/nlp-paper https://github.com/infinitylogesh/mutate ...

Posted on 2022-07-22 | In Language Models

Autoencoding language models ==Devlin, Jacob, et al. “Bert: Pre-training of deep bidirectional transformers fo ...

Posted on 2022-07-22 | In Language Models

Pretraining datasets. Official site: https://www.cluebenchmarks.com/ Data: https://github.com/CLUEbenchmark/CLUE 100GB raw corpus ...

Posted on 2022-07-22 | In Language Models

Model roundup: Hugging Face Transformers (nicknamed 抱抱脸 in Chinese) https://github.com/huggingface/transformers https://huggingface.co/ ...

Posted on 2022-07-22 | In Language Models

Pretrained language models. Collections of open-source pretrained language models. GitHub: https://github.com/ZhuiyiTechnology/pretrained-models [good] GitHub: https://g ...