Major casts, such as the anchorpersons or reporters in news broadcasts and the principal characters in movies, play an important role in video, and their occurrences provide good indices for organizing and presenting video content. This paper describes a new approach for automatically generating the list of major casts in a video sequence based on multiple modalities, specifically speaker and face information. The list of major casts is created and ordered by the accumulated temporal and spatial presence of the corresponding casts. Preliminary simulation results show that the detected major casts are meaningful and that the proposed approach is promising.
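As a rough illustration of ordering casts by accumulated temporal and spatial presence, the sketch below sums a per-appearance score for each cast and sorts the totals. The abstract does not say how the two kinds of presence are combined, so the product of on-screen duration and face-area fraction used here is an assumption, as are the function name `rank_major_casts`, the tuple layout, and the sample data. It also assumes that speaker and face analysis has already assigned each appearance to a cast identifier.

```python
from collections import defaultdict


def rank_major_casts(appearances, top_k=None):
    """Rank casts by accumulated presence.

    appearances: iterable of (cast_id, duration_s, face_area_fraction).
    The score duration_s * face_area_fraction is an assumed way of
    combining temporal and spatial presence, not the paper's formula.
    """
    scores = defaultdict(float)
    for cast_id, duration_s, area_fraction in appearances:
        # Accumulate each appearance's contribution to the cast's total.
        scores[cast_id] += duration_s * area_fraction
    # Highest score first; ties are broken by cast identifier.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k] if top_k is not None else ranked


# Hypothetical appearances: (cast, seconds on screen, fraction of frame).
appearances = [
    ("anchor", 30.0, 0.20),
    ("reporter", 12.0, 0.10),
    ("anchor", 45.0, 0.25),
    ("guest", 8.0, 0.05),
]
ranking = rank_major_casts(appearances)
for cast_id, score in ranking:
    print(f"{cast_id}: {score:.2f}")
```

With this made-up data the anchor ranks first, since that cast appears both longer and larger in the frame than the others; passing `top_k` would truncate the list to the most prominent casts.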
|Original language||English (US)|
|Number of pages||4|
|Journal||ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings|
|State||Published - 2001|
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering