A speech/music discriminator of radio recordings based on dynamic programming and Bayesian networks

Authors: A. Pikrakis, T. Giannakopoulos, Sergios Theodoridis
Citation: IEEE Transactions on Multimedia 10 (5): 846-857. 2008
DOI: 10.1109/TMM.2008.922870

A speech/music discriminator of radio recordings based on dynamic programming and Bayesian networks is a paper describing a signal analysis system that segments audio recordings into speech and music.

The system uses chromatic entropy as a feature: "The standard deviation of chromatic entropy is significantly lower for the case of music" than for speech (according to the talk).
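To make the feature concrete, the following is a minimal sketch of one way to compute an entropy over pitch classes per frame and then its standard deviation over time. It is not the paper's definition: the 12-class mapping relative to a 440 Hz reference, the 55 to 4000 Hz band, the Hann window, the frame and hop sizes, and the function names `chromatic_entropy` and `chromatic_entropy_std` are all assumptions made for illustration.

```python
import numpy as np


def chromatic_entropy(frame, sample_rate, f_ref=440.0, f_min=55.0, f_max=4000.0):
    """Entropy (in bits) of the spectral energy spread over 12 pitch classes.

    Simplified, assumed definition; the paper's exact formulation may differ.
    """
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Keep only the band of interest (this also excludes the 0 Hz bin).
    keep = (freqs >= f_min) & (freqs <= f_max)

    # Map each FFT bin to a pitch class 0..11: semitone offset from f_ref, modulo 12.
    semitones = np.round(12.0 * np.log2(freqs[keep] / f_ref)).astype(int)
    pitch_class = semitones % 12

    # Sum the power falling into each pitch class.
    energy = np.bincount(pitch_class, weights=power[keep], minlength=12)
    total = energy.sum()
    if total <= 0:
        return 0.0  # silent frame: define the entropy as zero

    p = energy / total
    p = p[p > 0]  # skip empty classes, since 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))


def chromatic_entropy_std(signal, sample_rate, frame_len=2048, hop=1024):
    """Standard deviation of the per-frame chromatic entropy over a signal."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    values = [chromatic_entropy(signal[s:s + frame_len], sample_rate) for s in starts]
    return float(np.std(values))
```

Under these assumptions, a steady tone yields a nearly constant entropy from frame to frame, while a signal that alternates between tonal and noise-like stretches yields a larger standard deviation. Such synthetic signals only illustrate the behaviour of the statistic and are not a substitute for real speech and music recordings.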

The system reaches a precision of 99.5% on music recognition (according to the talk).
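For reference, precision for the music class is the fraction of items labelled as music that really are music. The sketch below assumes the comparison is made over equal-length label sequences (for example one label per frame); whether the paper scores frames, segments, or durations is not stated here, and the function name `precision` and the label strings are hypothetical.

```python
def precision(predicted, reference, positive="music"):
    """Precision for one class: true positives / (true positives + false positives)."""
    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p == positive and r == positive)
    false_pos = sum(1 for p, r in pairs if p == positive and r != positive)
    if true_pos + false_pos == 0:
        return 0.0  # nothing was labelled as the positive class
    return true_pos / (true_pos + false_pos)
```

With this definition, a precision of 99.5% means that about 5 in every 1000 items labelled as music are actually something else.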

Data

3 datasets:

  1. 170 minutes
  2. 60 minutes
  3. 9 hours of uninterrupted audio recordings "divided on a radio genre basis"

The data were segmented manually.

Related papers

  1. A computationally efficient speech/music discriminator for radio recordings