德国世界杯_2012年世界杯 - fyycdq.com

Lightweight Chinese Lipreading Model and Dataset Construction

Abstract: To promote the development and practical application of Chinese lipreading, a lightweight lipreading model is proposed that combines interleaved group convolution with dilated convolution. Interleaved group convolution learns the correlations between different features, and dilated convolution expands the model's receptive field; together they greatly reduce the number of parameters and the complexity of the model while improving recognition accuracy. In addition, the largest sentence-level Chinese lipreading dataset is recorded in a controlled environment to enrich the available Chinese lipreading data. The applicability of the lightweight model is verified on the recorded dataset and on public datasets. Heatmaps are used to visually analyze how well the model learns the mapping between video frames and text.
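The two building blocks named in the abstract can be illustrated with simple arithmetic. The sketch below is not the authors' architecture: the abstract gives no layer configuration, so the channel count, the partition sizes `L` and `M`, the kernel size, and the dilation rates are all hypothetical values chosen for illustration, and bias terms are ignored. It assumes the usual formulation of interleaved group convolution (a primary group convolution with `L` partitions of `M` channels, a channel permutation, then a secondary 1x1 group convolution with `M` partitions of `L` channels) and stride-1 dilated convolutions in 2D.

```python
def standard_conv_params(channels, kernel):
    """Weights in a dense kernel x kernel convolution with equal in/out channels."""
    return channels * channels * kernel * kernel


def igc_params(L, M, kernel):
    """Weights in an interleaved group convolution block (assumed formulation).

    Primary: L groups, each mapping M -> M channels with a kernel x kernel filter.
    Secondary: M groups, each mapping L -> L channels with a 1x1 filter.
    """
    primary = L * (M * M * kernel * kernel)
    secondary = M * (L * L)
    return primary + secondary


def interleave(L, M):
    """New position of each channel after the permutation between the two stages.

    The channel in slot m of primary partition l (index l*M + m) moves to
    index m*L + l, so every secondary partition sees one channel from each
    primary partition.
    """
    return [m * L + l for l in range(L) for m in range(M)]


def receptive_field(kernel, dilations):
    """Receptive field (along one axis) of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf


if __name__ == "__main__":
    # Hypothetical setting: 64 channels split as L=32 partitions of M=2 channels.
    print(standard_conv_params(64, 3))    # 36864
    print(igc_params(32, 2, 3))           # 3200
    print(interleave(2, 3))               # [0, 2, 4, 1, 3, 5]
    print(receptive_field(3, [1, 1, 1]))  # 7
    print(receptive_field(3, [1, 2, 4]))  # 15
```

In this hypothetical setting the interleaved block uses roughly a tenth of the weights of the dense convolution, and three dilated layers cover 15 positions instead of 7 with the same number of weights, which is the kind of trade-off the abstract describes.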