Conference (International) Web-based Topic Language Modeling for Audio Indexing
International Conference on Multimedia and Expo (ICME 2009)
We describe the implementation of a scalable architecture for audio indexing, in which topic-dependent language models (LMs) were trained on web pages categorized in a portal web directory and stored on distributed servers. Input speech was decoded in parallel on servers that each had an LM for an individual topic. From the decoders' outputs, an optimal hypothesis was chosen for each utterance according to a topic-selection criterion that minimizes an energy function consisting of three terms: likelihood scores for the utterances; keyword co-occurrence statistics, which measure long-distance correlation; and web-based hypothesis verification scores, which penalize misrecognized trigrams using web search results. Experimental results showed that the proposed approach significantly outperformed the baseline topic-independent system, by 6.0% absolute (20.0% relative) in character accuracy.
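The selection step can be illustrated with a minimal sketch. The abstract does not give the exact form of the energy function, so the version below assumes a simple weighted sum of the three terms, with each score oriented so that higher is better. The names `Hypothesis`, `energy`, and `select_hypothesis`, the weights, and all score values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Hypothesis:
    """One decoder output for a single utterance (hypothetical structure)."""
    topic: str             # topic of the LM that produced this hypothesis
    text: str              # recognized text
    log_likelihood: float  # decoder likelihood score (higher is better)
    cooccurrence: float    # keyword co-occurrence score (higher is better)
    verification: float    # web-based verification score (higher is better)


def energy(h: Hypothesis, w_lik: float = 1.0, w_co: float = 1.0,
           w_ver: float = 1.0) -> float:
    """Assumed energy: negated weighted sum of the three scores.

    The weighted-sum form and the weights are assumptions for illustration.
    """
    return -(w_lik * h.log_likelihood
             + w_co * h.cooccurrence
             + w_ver * h.verification)


def select_hypothesis(hyps: Sequence[Hypothesis], **weights: float) -> Hypothesis:
    """Return the hypothesis with the lowest energy for one utterance."""
    if not hyps:
        raise ValueError("at least one hypothesis is required")
    return min(hyps, key=lambda h: energy(h, **weights))


# Made-up scores from two topic-dependent decoders for the same utterance.
candidates = [
    Hypothesis("sports", "hypothesis A", -10.0, 2.0, 1.0),
    Hypothesis("news", "hypothesis B", -9.0, 0.5, 0.5),
]
best = select_hypothesis(candidates)
print(best.topic)
```

With these made-up scores, the likelihood term alone would favor the second candidate; the other two terms shift the choice to the first, which is the kind of correction the combined criterion is meant to allow.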