A Speech/Music Discriminator based on Frequency energy, Spectrogram and Autocorrelation
Sumit Kumar Banchhor (1), Om Prakash Sahu (2), Prabhakar (3)
(1) Sumit Kumar Banchhor, Electronics and Telecommunication, Chhattisgarh Swami Vivekananda Technical University, GD Rungta College of Engineering and Technology, Bhilai, India.
(2) Om Prakash Sahu, Electronics and Telecommunication, Chhattisgarh Swami Vivekananda Technical University, GD Rungta College of Engineering and Technology, Bhilai, India.
(3) Prabhakar, Electronics and Telecommunication, Chhattisgarh Swami Vivekananda Technical University, GD Rungta College of Engineering and Technology, Bhilai, India.
Manuscript received on February 15, 2012. | Revised Manuscript received on February 20, 2012. | Manuscript published on March 05, 2012. | PP: 480-483 | Volume-2 Issue-1, March 2012. | Retrieval Number: F0318121611/2012©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Over the last few years, major efforts have been made to develop methods for extracting information from audio-visual media so that they can be stored in and retrieved from databases automatically, based on their content. In this work we deal with the characterization of an audio signal, which may be part of a larger audio-visual system or may be autonomous, as for example in the case of an audio recording stored digitally on disk. Our goal was first to develop a system for segmenting the audio signal, and then to classify each segment into one of two main categories: speech or music. Segmentation is based on the distribution of the mean signal amplitude, whereas classification uses an additional frequency-related characteristic. The basic characteristics are computed over 2 s intervals, so segment boundaries are located to within 2 s. The results show the difference between the human voice and musical instruments.
Keywords: Speech/music classification, audio segmentation, zero crossing rate, short time energy, spectrum flux.
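The abstract does not specify the exact features, thresholds, or decision rule, so the following is only a minimal sketch of the general idea: per-window characteristics computed over non-overlapping 2 s intervals, with boundaries therefore resolved to 2 s. It assumes mean absolute amplitude as the amplitude characteristic and zero crossing rate (taken from the keywords) as the frequency-related one. The function names, the `rel_change` threshold, and the boundary rule are hypothetical and are not taken from the paper.

```python
import numpy as np

def window_features(signal, sample_rate, window_sec=2.0):
    """Return (mean_abs_amplitude, zero_crossing_rate) per non-overlapping window.

    Assumption: these two features stand in for the paper's amplitude and
    frequency characteristics, which the abstract does not define.
    """
    signal = np.asarray(signal, dtype=float)
    win = int(round(window_sec * sample_rate))
    n_windows = len(signal) // win
    # Drop the trailing partial window and stack the rest as rows.
    frames = signal[: n_windows * win].reshape(n_windows, win)
    mean_amp = np.mean(np.abs(frames), axis=1)
    # Fraction of adjacent sample pairs whose signs differ.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return mean_amp, zcr

def candidate_boundaries(mean_amp, window_sec=2.0, rel_change=0.5):
    """Times (in seconds) where mean amplitude changes sharply between windows.

    Hypothetical rule: flag a boundary when the relative change between two
    adjacent windows exceeds rel_change.
    """
    prev, curr = mean_amp[:-1], mean_amp[1:]
    denom = np.maximum(np.maximum(prev, curr), 1e-12)  # avoid division by zero
    jumps = np.abs(curr - prev) / denom > rel_change
    return (np.flatnonzero(jumps) + 1) * window_sec
```

Because every feature is one value per 2 s window, a boundary can only fall on a multiple of the window length, which is the resolution limit stated in the abstract. A real discriminator would still need a classification rule mapping the per-segment features to speech or music, which this sketch leaves out.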