Browsing by Author "Vera-Candeas, Pedro"

Showing 1 - 8 of 8

A constrained tonal semi-supervised non-negative matrix factorization to classify presence/absence of wheezing in respiratory sounds
(Elsevier, 2020-04-01) Torre-Cruz, Juan; Cañadas-Quesada, Francisco Jesús; García-Galán, Sebastián; Ruiz-Reyes, Nicolás; Vera-Candeas, Pedro; Carabias-Orti, Julio José
From a clinical point of view, detecting the presence of wheezing in respiratory sounds is a challenging task for the early identification of pulmonary diseases, since wheezing is the main manifestation associated with airway obstruction. In this article, we propose a novel method to detect the presence or absence of wheeze sounds in breath recordings in order to increase the reliability of the subjective diagnosis provided by the physician during auscultation. Specifically, a subject is assumed to be unhealthy when wheeze sounds can be detected during breathing. The proposed method consists of three stages. The first stage estimates the spectral interval, or band of interest (BOI), that is most likely to contain wheeze sounds. In the second stage, a constrained tonal semi-supervised non-negative matrix factorization (NMF) approach is applied to obtain spectral patterns that model the periodic or tonal nature typically shown by wheeze sounds. The third stage analyzes the estimated wheezing spectrogram based on the smoothness of the spectral trajectories of the most significant energy previously factorized in the BOI. Our system has been evaluated and compared to other state-of-the-art methods, yielding competitive results in detecting the presence of wheezing in respiratory sounds.
A novel wheezing detection approach based on constrained non-negative matrix factorization
(Elsevier, 2019-05-01) Torre-Cruz, Juan; Cañadas-Quesada, Francisco Jesús; Carabias-Orti, Julio José; Vera-Candeas, Pedro; Ruiz-Reyes, Nicolás
Early wheezing detection is still a challenging task in biomedical signal processing because the presence of wheeze sounds often indicates respiratory diseases caused by airway obstructions. Currently, most initial clinical examinations to detect airway obstructions are carried out using auscultation. However, a high percentage of cases are misdiagnosed, since the diagnosis depends heavily on the physician's training in wheezing detection, especially in noisy environments in which weak wheeze sounds can be masked by louder respiratory sounds. In this work, we propose a novel wheezing detection approach, based on Constrained Non-negative Matrix Factorization, that uses a two-stage cascade: separation and detection. The novelty of the separation stage is that it models wheeze and respiratory sounds as closely as possible to how they are observed in nature, by incorporating constraints (sparseness and smoothness) into the NMF factorization. Once the estimated wheezing and respiratory signals are obtained from the separation stage, the detection stage uses the Kullback-Leibler divergence to discriminate between wheezing and respiratory areas. The experiments have been conducted using three different datasets composed of healthy and unhealthy patients. First, an optimization process is applied to obtain the optimal parameters of the separation stage. Then, the wheezing detection performance of the proposed method is evaluated against other state-of-the-art methods. Experimental results show that i) the proposed method outperforms recent state-of-the-art wheezing detection approaches, with robust detection performance even in noisy environments, and ii) the proposed method can reliably detect healthy patients.
Classification and Separation Techniques based on Fundamental Frequency for Speech Enhancement
(Jaén : Universidad de Jaén, 2016-01-11) Cabañas-Molero, Pablo-Antonio; Vera-Candeas, Pedro; Ruiz-Reyes, Nicolás; Universidad de Jaén. Departamento de Ingeniería de Telecomunicación
This thesis develops new speech classification and enhancement algorithms based on the properties of the fundamental frequency (F0) of the speech signal. These properties make it possible to discriminate speech from the other signals in the acoustic scene, either by defining features (for classification) or by defining signal models (for separation). The thesis makes three contributions: 1) an F0-based acoustic environment classification algorithm for digital hearing aids, able to classify the signal into speech and non-speech classes; 2) a voiced speech detection algorithm based on aperiodicity, able to operate in non-stationary noise and applicable to speech enhancement; 3) a speech and noise separation algorithm based on NMF decomposition, in which the noise is modeled generically through mathematical constraints.
Monitoring the internal quality of ornamental stone using impact-echo testing
(Elsevier, 2019-12-01) Montiel-Zafra, María Violeta; Cañadas-Quesada, Francisco Jesús; Campos-Suñol, María José; Vera-Candeas, Pedro; Ruiz-Reyes, Nicolás
The decay and durability of stone materials are a natural response to their progressive adjustment to different, often harsh, environmental conditions. Stone building elements with no apparent sign of decay are often affected by loss of cohesion. Non-invasive, early and low-cost identification of the internal damage of stone materials would be a great step forward. This paper presents an impact-echo (IE) method to analyse the internal quality of ornamental stone. The proposed method estimates the P-wave velocity in the material by applying a frequency estimator that best explains the energy distribution of the possible modes of vibration in the captured IE signals. The velocity estimate is analysed over a set of freeze-thaw cycles in order to establish a correlation with the internal damage caused in the material, as confirmed by its porosity. Porosity has been measured for each stone specimen after several freeze-thaw cycles. Experimental results show that the proposed method can be considered a valid and effective tool for determining the internal damage of ornamental stone materials. In addition, the proposed method could be easily adapted to analyse specimens of different sizes, shapes and rock types.
Multichannel Blind Sound Source Separation Using Spatial Covariance Model With Level and Time Differences and Nonnegative Matrix Factorization
(IEEE, 2018-04-27) Carabias-Orti, Julio José; Nikunen, Joonas; Virtanen, Tuomas; Vera-Candeas, Pedro
This paper presents an algorithm for multichannel sound source separation using explicit modeling of level and time differences in source spatial covariance matrices (SCM). We propose a novel SCM model in which the spatial properties are modeled by the weighted sum of direction of arrival (DOA) kernels. DOA kernels are obtained as the combination of phase and level difference covariance matrices representing both time and level differences between microphones for a grid of predefined source directions. The proposed SCM model is combined with the NMF model for the magnitude spectrograms. Unlike other SCM models in the literature, in this work source localization is implicitly defined in the model and estimated during the signal factorization. Therefore, no localization preprocessing is required. Parameters are estimated using complex-valued nonnegative matrix factorization with both Euclidean distance and Itakura-Saito divergence. Separation performance of the proposed system is evaluated using the two-channel SiSEC development dataset and four-channel signals recorded in a regular room with moderate reverberation. Finally, a comparison to other state-of-the-art methods is performed, showing better separation performance in terms of SIR and perceptual measures.
Multimodal speaker diarization for meetings using volume-evaluated SRP-PHAT and video analysis
(Springer, 2018-04-11) Cabañas-Molero, Pablo Antonio; Lucena, Manuel; Fuertes, José Manuel; Vera-Candeas, Pedro; Ruiz-Reyes, Nicolás
Speaker diarization is traditionally defined as the problem of determining “who speaks when” given an audio or video stream. This is an important task in many applications for meeting rooms, including automatic transcription of conversations, camera steering or content summarization. When the room is equipped with microphone arrays and cameras, speakers can be distinguished according to their location and the problem can be addressed through localization techniques. This article proposes a multimodal speaker diarization system for meeting environments based on a modified SRP-PHAT function evaluated on space volumes rather than discrete points. In our system, this function is used in combination with a circular array, enabling audio-based localization through the selection of local maxima. Voicing detection is used to detect speech frames, whereas video analysis is introduced to aid the decision when users move or speak simultaneously. The approach is evaluated on the well-known AMI dataset, with approximately 100 hours of realistic meeting recordings, and shows an average diarization error rate of 21% – 25%.
Separación de fuentes sonoras en señales acústicas
(Jaén : Universidad de Jaén, 2014) Rodríguez-Serrano, Francisco-Jose; Vera-Candeas, Pedro; Ruiz-Reyes, Nicolás; Universidad de Jaén. Departamento de Ingeniería de Telecomunicación
Any source that generates sound capable of being captured by a microphone can be considered a sound source. The main objective of this research is to develop separation techniques and to evaluate which kinds of input information are worth using so that the separation achieves the best possible results. This thesis rests on the hypothesis that instrument spectral models can be a good tool for determining whether part of the signal energy belongs to one instrument or another, yielding more competitive results than current methods. A second hypothesis also underpins the thesis: if the problem of overlapping harmonic partials of concurrent notes can be solved, the sources will be better isolated, which in turn improves the quality of the source separation.
The music demixing machine: toward real-time remixing of classical music
(Springer, 2023-04-06) Cabañas-Molero, Pablo Antonio; Muñoz-Montoro, Antonio Jesús; Vera-Candeas, Pedro; Ranilla, José
Classical music, unlike popular music, is usually recorded live with close microphone techniques. For this reason, isolated tracks are not available to create the final mixture/stream, and so the mixing process requires greater effort. Source separation methods are a potential solution to this problem. However, current algorithms are not fast enough to yield real-time separation in professional setups with dozens of microphones and sources. In this paper, we propose a fast approach consisting of a panning-based multichannel non-negative matrix factorization model to separate classical music. We tested the system on real professional recordings, where it achieved real-time operation with very low latency and promising quality.

RUJA: Repositorio Institucional de Producción Científica
