RUJA: Repositorio Institucional de Producción Científica

 

Multichannel Blind Sound Source Separation Using Spatial Covariance Model With Level and Time Differences and Nonnegative Matrix Factorization

dc.contributor.author: Carabias-Orti, Julio José
dc.contributor.author: Nikunen, Joonas
dc.contributor.author: Virtanen, Tuomas
dc.contributor.author: Vera-Candeas, Pedro
dc.date.accessioned: 2024-02-07T00:37:07Z
dc.date.available: 2024-02-07T00:37:07Z
dc.date.issued: 2018-04-27
dc.description.abstract: This paper presents an algorithm for multichannel sound source separation using explicit modeling of level and time differences in source spatial covariance matrices (SCMs). We propose a novel SCM model in which the spatial properties are modeled by a weighted sum of direction of arrival (DOA) kernels. DOA kernels are obtained as the combination of phase and level difference covariance matrices representing both time and level differences between microphones for a grid of predefined source directions. The proposed SCM model is combined with a nonnegative matrix factorization (NMF) model for the magnitude spectrograms. Unlike other SCM models in the literature, in this work source localization is implicitly defined in the model and estimated during the signal factorization. Therefore, no localization preprocessing is required. Parameters are estimated using complex-valued NMF with both Euclidean distance and Itakura-Saito divergence. Separation performance of the proposed system is evaluated using the two-channel SiSEC development dataset and four-channel signals recorded in a regular room with moderate reverberation. Finally, a comparison with other state-of-the-art methods is performed, showing better separation performance in terms of SIR and perceptual measures.
dc.description.sponsorship: Academy of Finland project (Grant Number: 290190)
dc.identifier.citation: J. J. Carabias-Orti, J. Nikunen, T. Virtanen and P. Vera-Candeas, "Multichannel Blind Sound Source Separation Using Spatial Covariance Model With Level and Time Differences and Nonnegative Matrix Factorization," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 9, pp. 1512-1527, Sept. 2018, doi: 10.1109/TASLP.2018.2830105. Keywords: Time-frequency analysis; Covariance matrices; Microphones; Direction-of-arrival estimation; Kernel; Source separation; Spectrogram; Multichannel source separation; spatial covariance model; interaural time difference; interaural level difference; non-negative matrix factorization; direction of arrival estimation.
dc.identifier.issn: 2329-9290
dc.identifier.other: 10.1109/TASLP.2018.2830105
dc.identifier.uri: https://hdl.handle.net/10953/2184
dc.language.iso: eng
dc.publisher: IEEE
dc.relation.ispartof: IEEE/ACM Transactions on Audio, Speech, and Language Processing 2018; vol. 26, no. 9, pp. 1512-1527
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.subject: Time-frequency analysis
dc.subject: Covariance matrices
dc.subject: Microphones
dc.subject: Direction-of-arrival estimation
dc.subject.udc: 621.39
dc.title: Multichannel Blind Sound Source Separation Using Spatial Covariance Model With Level and Time Differences and Nonnegative Matrix Factorization
dc.type: info:eu-repo/semantics/article
dc.type.version: info:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: level_time_SCM2018.pdf
Size: 1.04 MB
Format: Adobe Portable Document Format
Description: PDF file of the article

License bundle

Name: license.txt
Size: 1.98 KB
Format: Item-specific license agreed upon to submission

Collections