Keyword spotting in handwritten Chinese documents using semi-Markov conditional random fields
Zhang, Heng; Zhou, Xiang-Dong; Liu, Cheng-Lin
2017-02-01
Abstract: This paper proposes a document indexing method for keyword spotting based on semi-Markov conditional random fields (semi-CRFs), which provide a theoretical framework for fusing information from different contexts. The candidate segmentation-recognition lattice is first augmented using the linguistic context to improve recognition results. To speed up retrieval and save storage space, the lattice is then pruned by a forward-backward procedure. In the reduced lattice, character similarity scores are estimated with the semi-CRF model. The parameters of the semi-CRF model are estimated using a binary classification objective, namely the cross-entropy (CE), to discriminate between candidate characters in the lattice. To locate misrecognized character instances in the lattice, confusable similar characters are used as proxies, and the index file is searched for these proxy characters. The proxy-character driven search significantly improves performance compared with the authors' previous character-synchronous dynamic search (CSDS) method. Experimental results on the online handwriting database CASIA-OLHWDB demonstrate the effectiveness of the proposed method.
Keywords: Online handwritten Chinese documents; Semi-Markov conditional random fields; Keyword spotting; Proxy-character driven search
DOI: 10.1016/j.engappai.2016.11.006
Journal: ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
ISSN: 0952-1976
Volume: 58; Pages: 49-61
Indexed in: SCI
WOS accession number: WOS:000392684200004
Language: English
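
Note: The abstract mentions pruning the candidate lattice with a forward-backward procedure. The abstract does not give the exact criterion, so the following is only a minimal sketch of one common form of forward-backward pruning, assuming the lattice is a directed acyclic graph whose nodes are numbered in topological order and whose edges are candidate characters carrying log-scores. The function name `prune_lattice`, the edge tuple layout, and the posterior threshold are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def prune_lattice(edges, n_nodes, threshold):
    """Keep only edges whose posterior probability is at least `threshold`.

    edges: list of (src, dst, label, log_score) with src < dst.
    Node 0 is the start of the text line, node n_nodes - 1 the end.
    Returns a list of (src, dst, label, posterior) in input order.
    (Hypothetical interface, for illustration only.)
    """
    neg_inf = float("-inf")
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for edge in edges:
        outgoing[edge[0]].append(edge)
        incoming[edge[1]].append(edge)

    # Forward pass: alpha[n] = log-sum of scores of all paths start -> n.
    alpha = [neg_inf] * n_nodes
    alpha[0] = 0.0
    for node in range(1, n_nodes):
        terms = [alpha[s] + w for s, _, _, w in incoming[node] if alpha[s] > neg_inf]
        if terms:
            alpha[node] = logsumexp(terms)

    # Backward pass: beta[n] = log-sum of scores of all paths n -> end.
    beta = [neg_inf] * n_nodes
    beta[-1] = 0.0
    for node in range(n_nodes - 2, -1, -1):
        terms = [w + beta[d] for _, d, _, w in outgoing[node] if beta[d] > neg_inf]
        if terms:
            beta[node] = logsumexp(terms)

    log_z = alpha[-1]
    if log_z == neg_inf:
        return []  # no complete path through the lattice

    kept = []
    for s, d, label, w in edges:
        if alpha[s] == neg_inf or beta[d] == neg_inf:
            continue  # edge lies on no complete path
        posterior = math.exp(alpha[s] + w + beta[d] - log_z)
        if posterior >= threshold:
            kept.append((s, d, label, posterior))
    return kept


# Toy lattice: two competing candidates for the first segment ("A", "B"),
# one candidate for the second ("C"), and one candidate spanning both ("D").
toy_edges = [
    (0, 1, "A", math.log(0.9)),
    (0, 1, "B", math.log(0.1)),
    (1, 2, "C", 0.0),
    (0, 2, "D", math.log(0.5)),
]
toy_kept = prune_lattice(toy_edges, n_nodes=3, threshold=0.1)
```

In this toy lattice the low-scoring candidate "B" has a posterior below 0.1 and is dropped, while "A", "C", and "D" are kept. An index built from the retained edges would be smaller and faster to search, which is the motivation the abstract gives for pruning.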