TY - GEN
T1 - Haber videolarinda nesne tanima ve otomatik etiketleme
TT - Object recognition and automatic annotation in news videos
AU - Baştan, Muhammet
AU - Duygulu, Pinar
PY - 2006
Y1 - 2006
N2 - We propose a new approach to the object recognition problem, motivated by the availability of large annotated image and video collections. By analogy with translation from one language to another, this approach treats object recognition as the translation of visual elements into words. The visual elements, represented in feature space, are first categorized into a finite set of blobs. Then, the correspondences between the blobs and the words are learned using a method adapted from Statistical Machine Translation. Finally, the correspondences, in the form of a probability table, are used to predict words for particular image regions (region naming), for entire images (auto-annotation), or to associate the automatically generated speech transcript text with the correct video frames (video alignment). Experimental results are presented on the TRECVID 2004 data set, which consists of about 150 hours of news videos associated with manual annotations and speech transcript text.
UR - https://www.scopus.com/pages/publications/34247129082
U2 - 10.1109/SIU.2006.1659821
DO - 10.1109/SIU.2006.1659821
M3 - Conference contribution
AN - SCOPUS:34247129082
SN - 1424402395
SN - 9781424402397
T3 - 2006 IEEE 14th Signal Processing and Communications Applications Conference
BT - 2006 IEEE 14th Signal Processing and Communications Applications Conference
T2 - 2006 IEEE 14th Signal Processing and Communications Applications
Y2 - 17 April 2006 through 19 April 2006
ER -