Abstract

The separation of mixed signals into music and voice components has recently attracted the attention of many researchers. Iterative kernel back-fitting, also known as kernel additive modeling, was recently proposed and achieves good results for music/voice separation. To obtain mi...


AI summary of the main text

* These sentences were identified automatically by AI and some may be unsuitable, so please use them with caution.

Proposed method

  • From the complex spectrogram X of the input music signal, the complex spectrograms SV, SH, and SP of the vocal, harmonic, and percussive components are each estimated using the corresponding generalized WbE gain (GV, GH, and GP, respectively) of the spectral amplitude decomposed by singular value decomposition (SVD). The WbE estimation gain, Gj, for each source j (= 0, 1, 2, … , J) is explained in detail in Algorithm 2.
  • In this paper, a generalized weighted β-order MMSE estimation (WbE) method based on kernel back-fitting (KBF) was proposed and evaluated for the separation of mixed signals into music/voice components.
  • In this paper, an advanced music/voice separation method is proposed, in which WbE and KBF are combined for improvement of the separation performance.
  • The proposed estimation method takes full advantage of both a generalized weighted β-order spectral amplitude estimator and an SVD-based subspace decomposition.
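
The general idea of applying per-source gains to a mixture spectrogram, with amplitudes taken from an SVD-based low-rank decomposition, can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: the WbE gain itself is defined in Algorithm 2, which is not reproduced here, so a simple Wiener-like power ratio stands in for it. The function names `low_rank_amplitude`, `ratio_gains`, and `separate` are hypothetical.

```python
import numpy as np


def low_rank_amplitude(amplitude, rank):
    """Truncated-SVD approximation of an amplitude spectrogram (freq x time)."""
    u, s, vt = np.linalg.svd(amplitude, full_matrices=False)
    # Keep only the `rank` largest singular components.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    # Truncation can produce small negative values; amplitudes must be >= 0.
    return np.maximum(approx, 0.0)


def ratio_gains(amplitudes, eps=1e-12):
    """Stand-in gains (ASSUMPTION, not WbE): each source's share of total power."""
    power = np.stack([a ** 2 for a in amplitudes])  # shape (J, freq, time)
    return power / (power.sum(axis=0) + eps)


def separate(X, amplitudes):
    """Apply one real-valued gain per source to the complex mixture X."""
    gains = ratio_gains(amplitudes)
    # Broadcasting: (J, freq, time) * (freq, time) -> (J, freq, time).
    return gains * X
```

Because these stand-in gains sum to approximately one in every time-frequency bin, the separated spectrograms add back up to the mixture X. A WbE gain need not have this property.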

Data

  • For the first experiment, 150 full-length song tracks [23] were used (50 songs from the ccMixter database containing many different musical genres, 50 songs from a self-recording studio music database, and 50 songs from the MIR-1K database), where all singing voices and music accompaniments were recorded separately. All of the song data were stored in PCM format with mono, 16-bit depth, and 44.
  • For the second performance comparison, the proposed algorithm, SVD-WbE-KAM, was compared with REPETSIM [26], RPCA [27], and SVD-GW-KAM. To evaluate the separation of background music and singing voice, 40 full-length song tracks [24] were used (20 songs from the ccMixter database containing many different musical genres, and 20 songs from the MIR-1K database). Figures 1 and 2 show boxplots of the SDR for the vocals and the music accompaniment, respectively.
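
For readers unfamiliar with the SDR metric behind those boxplots, the sketch below computes a simplified signal-to-distortion ratio and the quartiles that a boxplot summarizes. It assumes the plain energy-ratio definition, 10·log10(‖s‖² / ‖s − ŝ‖²). The summary does not say which evaluation toolkit was used, and BSS Eval style SDR additionally projects the estimate onto the reference sources, so values from this sketch are not directly comparable with the reported ones. The names `simple_sdr` and `boxplot_quartiles` are hypothetical.

```python
import numpy as np


def simple_sdr(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over error energy (ASSUMPTION)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = reference - estimate
    # eps guards against division by zero and log of zero.
    return 10.0 * np.log10((np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps))


def boxplot_quartiles(sdr_values):
    """Lower quartile, median, and upper quartile of per-track SDR values."""
    return np.percentile(np.asarray(sdr_values, dtype=float), [25, 50, 75])
```

A higher SDR means that the estimated source is closer to the separately recorded reference, which is why separately recorded vocals and accompaniments are needed for this kind of evaluation.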
