Computational Methods of Information Geometry with Real-Time Applications in Audio Signal Processing

This thesis proposes novel computational methods of information geometry with real-time applications in audio signal processing. In this context, we address in parallel two applied problems: real-time audio segmentation and real-time polyphonic music transcription. This is achieved by developing theoretical frameworks for sequential change detection with exponential families and for non-negative matrix factorization with convex-concave divergences, respectively. On the one hand, sequential change detection is studied in light of the dually flat information geometry of exponential families. We notably develop a generic and unifying statistical framework relying on multiple hypothesis testing with decision rules based on exact generalized likelihood ratios. This is applied to devise a modular system for real-time audio segmentation with arbitrary types of signals and homogeneity criteria. The proposed system controls the information rate of the audio stream as it unfolds in time to detect changes. On the other hand, non-negative matrix factorization is investigated by way of convex-concave divergences on the space of discrete positive measures. In particular, we formulate a generic and unifying optimization framework for non-negative matrix factorization based on variational bounding with auxiliary functions. This is employed to design a real-time system for polyphonic music transcription with explicit control over the frequency compromise during the analysis. The developed system decomposes the music signal, as it arrives in time, onto a dictionary of note spectral templates. These contributions provide insights and directions for future research in audio signal processing, and more generally in machine learning and signal processing, within the relatively young but nonetheless prolific field of computational information geometry.

Additional Info

Field: Value
Source: https://theses.hal.science/tel-00768524
Author: Dessein, Arnaud
Maintainer: CCSD
Last Updated: May 10, 2026, 12:02 (UTC)
Created: May 10, 2026, 12:02 (UTC)
Identifier: tel-00768524
Language: en
Rights: https://about.hal.science/hal-authorisation-v1/
contributor: Synchronous Realtime Processing and Programming of Music Signals (MuTant) ; Institut de Recherche et Coordination Acoustique/Musique (IRCAM)-Inria Paris-Rocquencourt ; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université Pierre et Marie Curie - Paris 6 (UPMC)-Centre National de la Recherche Scientifique (CNRS)
creator: Dessein, Arnaud
date: 2012-12-13T00:00:00
harvest_object_id: 823ff666-eb66-4e8f-9d3f-b078f4eb826b
harvest_source_id: 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title: test moissonnage SELUNE
metadata_modified: 2025-03-01T00:00:00
set_spec: type:THESE