Statistical Approaches for Segmentation: Application to Genome Annotation

We propose to model the output of transcriptome sequencing technologies (RNA-Seq) with the negative binomial distribution, and to build segmentation models suited to studying these data at different biological scales. These technologies are becoming a valuable tool for genome annotation, gene expression analysis, and the discovery of new transcripts. We develop a fast segmentation algorithm for analyzing series that span whole chromosomes, and we propose two methods for estimating the number of segments, a key quantity related to the number of genes expressed in the cell, whether those genes were identified in previous experiments or are discovered in the current one. Research on precise gene annotation, and in particular the comparison of transcription boundaries across individuals, leads naturally to the statistical comparison of change-points in independent series. To address these questions, we build tools within a Bayesian segmentation framework that provide uncertainty measures. We illustrate our models, all implemented in R packages, on an RNA-Seq dataset from a yeast study and show, for instance, that intron boundaries are conserved across conditions, whereas the beginnings and ends of transcripts are subject to differential splicing.
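
To make the idea of segmenting count data under a negative binomial model concrete, here is a minimal Python sketch. It is not the fast algorithm or the R packages described in the abstract: it is a plain quadratic-time dynamic program, and it assumes a dispersion parameter `phi` that is known and shared by all segments. The function names `nb_segment_cost` and `segment_counts`, and the toy read counts, are hypothetical and chosen only for illustration.

```python
import math


def nb_segment_cost(n, s, phi):
    """Negative log-likelihood of one segment, up to terms that do not depend
    on where the segment boundaries are.

    Assumes counts follow a negative binomial with fixed dispersion `phi` and
    success probability p, whose maximum-likelihood estimate for a segment of
    n counts summing to s is p = n*phi / (n*phi + s).
    """
    if s == 0:
        return 0.0  # p = 1: the likelihood term vanishes
    p = n * phi / (n * phi + s)
    return -(n * phi * math.log(p) + s * math.log(1.0 - p))


def segment_counts(y, k, phi=1.0):
    """Split the count series `y` into `k` contiguous segments that minimise
    the summed negative binomial cost. Returns (change_points, total_cost),
    where each change-point is the index at which a new segment starts."""
    n = len(y)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(y)")

    # Prefix sums give the sum of any segment y[i:j] in constant time.
    prefix = [0]
    for value in y:
        prefix.append(prefix[-1] + value)

    def cost(i, j):
        return nb_segment_cost(j - i, prefix[j] - prefix[i], phi)

    # best[m][j]: minimal cost of splitting y[:j] into m segments.
    best = [[math.inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                candidate = best[m - 1][i] + cost(i, j)
                if candidate < best[m][j]:
                    best[m][j] = candidate
                    back[m][j] = i

    # Walk the back-pointers from the last segment to recover the boundaries.
    change_points = []
    j = n
    for m in range(k, 1, -1):
        j = back[m][j]
        change_points.append(j)
    change_points.reverse()
    return change_points, best[k][n]


# Toy read counts: a low region, a highly expressed region, then a low region.
toy_counts = [2, 3, 1, 2, 3, 40, 38, 45, 41, 5, 4, 6]
change_points, total_cost = segment_counts(toy_counts, k=3, phi=1.0)
print(change_points)
```

In this sketch the number of segments `k` is supplied by the caller. The abstract treats the choice of that number as a separate estimation problem, which this example does not attempt to address.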

Additional Info

Field	Value
Source	https://theses.hal.science/tel-00913851
Author	Cleynen, Alice
Maintainer	CCSD
Last Updated	May 7, 2026, 23:13 (UTC)
Created	May 7, 2026, 23:13 (UTC)
Identifier	NNT: 2013PA112258
Language	en
Rights	https://about.hal.science/hal-authorisation-v1/
contributor	Mathématiques et Informatique Appliquées (MIA-Paris) ; Institut National de la Recherche Agronomique (INRA)-AgroParisTech
creator	Cleynen, Alice
date	2013-11-15T00:00:00
harvest_object_id	abcb7d35-fbae-4470-a035-74004e3c8dcc
harvest_source_id	3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title	test moissonnage SELUNE
metadata_modified	2026-04-01T00:00:00
set_spec	type:THESE