voice recognition - How large must a corpus be to create a language model for Sphinx?
I would like to know how many documents, sentences, or words I need to process in order to build a good language model for a domain and use it in voice recognition tools such as CMU Sphinx.
To create a decent language model for a small domain, it is usually enough to have about 100 MB of text. You can mix it with a generic language model to improve the generalization of the resulting language model.
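As a rough illustration of what "mixing" means here, the sketch below linearly interpolates a small domain model with a generic one. It is a toy under stated assumptions, not how Sphinx or any particular toolkit does it: the models are plain unigram models, the sample sentences are made up, and the function names and the 0.7 weight are hypothetical. Real toolkits interpolate full n-gram models with backoff.

```python
from collections import Counter


def unigram_model(text):
    """Maximum-likelihood unigram probabilities from whitespace-tokenized text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(domain, generic, weight=0.7):
    """Linear interpolation: weight * P_domain(w) + (1 - weight) * P_generic(w)."""
    vocab = set(domain) | set(generic)
    return {
        w: weight * domain.get(w, 0.0) + (1.0 - weight) * generic.get(w, 0.0)
        for w in vocab
    }


# Made-up sample texts standing in for a domain corpus and a generic corpus.
domain_lm = unigram_model("turn on the light turn off the light")
generic_lm = unigram_model("the cat sat on the mat and the dog slept")

# The weight is an assumption; in practice it is tuned on held-out domain text.
mixed_lm = interpolate(domain_lm, generic_lm, weight=0.7)
```

Words that only occur in the generic text, such as "cat", end up with a small nonzero probability in the mixed model, which is the generalization benefit the answer refers to.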
To create a generic language model, developers use very large corpora. For example, there is the Google 1TB corpus, which contains millions of words and about a terabyte of data. The trigram part of it alone is about 40 GB of counts, and the texts behind those counts must amount to around a hundred terabytes.
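To make the difference between raw text and n-gram counts concrete, here is a minimal sketch of trigram counting with a frequency cutoff. The function name, the `min_count` parameter, and the sample tokens are hypothetical and only for illustration. Dropping rare n-grams is one reason a count file can be much smaller than the text it was built from.

```python
from collections import Counter


def ngram_counts(tokens, n=3, min_count=1):
    """Count n-grams in a token list and keep those seen at least min_count times."""
    # zip over n shifted views of the token list yields consecutive n-grams.
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(grams)
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Made-up token sequence for illustration.
tokens = "a b c a b c a b d".split()

all_trigrams = ngram_counts(tokens, n=3)
frequent_trigrams = ngram_counts(tokens, n=3, min_count=2)
```

With the cutoff set to 2, the trigram that occurs only once is dropped from the table while the repeated ones are kept.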
voice-recognition sphinx4