Name-It: Naming and Detecting Faces in Video by the Integration of Image and Natural Language Processing
We have been developing Name-It, a system that associates faces and names in news videos. As its only knowledge source, the system is given news videos, which include image sequences and transcripts obtained from audio tracks or closed-caption text. The system can then either infer the name of a given face and output name candidates, or locate faces in news videos given a name. To accomplish this task, the system extracts faces from image sequences and names from transcripts, both of which might correspond to key persons in news topics. The proposed system takes full advantage of advanced image and natural language processing. The image processing contributes to the extraction of face sequences, which provide rich information for face-name association. It also helps to select the best frontal view of a face in a face sequence, which enhances the face identification that the processing requires. The natural language processing, in turn, effectively extracts names by using lexical/grammatical analysis and knowledge of the topic structure of news videos. The success of our experiments demonstrates the benefits of the advanced image and natural language processing methods and of their integration.
@conference{Satoh-1997-16394,
author = {Shin'ichi Satoh and Yuichi Nakamura and Takeo Kanade},
title = {Name-It: Naming and Detecting Faces in Video by the Integration of Image and Natural Language Processing},
booktitle = {Proceedings of 15th International Joint Conference on Artificial Intelligence (IJCAI '97)},
year = {1997},
month = {August},
pages = {1488--1493},