Automatic sentence compression : towards abstract summarization

This dissertation presents a novel approach to automatic text summarization, one of the most challenging tasks in Natural Language Processing (NLP). To date, no summarization method produces summaries comparable in quality to those written by humans. Many state-of-the-art approaches still form the summary by selecting a subset of sentences from the original text. Since some of the selected sentences may still contain superfluous information, a finer-grained analysis is needed. We propose an automatic sentence compression method based on the elimination of intra-sentence discourse segments. Using a large, manually annotated corpus, we obtained a linear model that predicts the probability of eliminating a segment on the basis of three simple criteria: informativity, grammaticality and compression rate. We discuss the difficulties of assessing these criteria automatically in documents and sentences, and we propose a solution based on existing techniques from the NLP literature, which we apply in two different algorithms that produce summaries with compressed sentences. Applied to documents in Spanish, both algorithms produce high-quality results. Finally, we evaluate the produced summaries using a Turing test to determine whether human judges can distinguish human-produced summaries from machine-produced ones. This dissertation addresses several previously ignored aspects of NLP, namely the subjectivity of informativity, sentence compression in Spanish documents, and the evaluation of NLP using the Turing test.
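
As a rough illustration of the idea summarized above, a linear model over the three criteria could be sketched as follows. This is only a minimal sketch under stated assumptions: the `Segment` type, the score ranges, the weights, the clamping to [0, 1] and the 0.5 threshold are all hypothetical and are not taken from the dissertation, whose fitted coefficients and feature definitions are not given in this record.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A discourse segment inside a sentence (hypothetical representation)."""
    text: str
    informativity: float     # assumed score in [0, 1]; higher means more informative
    grammaticality: float    # assumed score in [0, 1] for the sentence once the segment is removed
    compression_rate: float  # assumed fraction of the sentence removed if the segment is dropped


# Illustrative weights only; NOT the coefficients fitted in the dissertation.
BIAS = 0.5
W_INFORMATIVITY = -0.6
W_GRAMMATICALITY = 0.3
W_COMPRESSION = 0.2


def elimination_probability(seg: Segment) -> float:
    """Linear combination of the three criteria, clamped to [0, 1]."""
    score = (
        BIAS
        + W_INFORMATIVITY * seg.informativity
        + W_GRAMMATICALITY * seg.grammaticality
        + W_COMPRESSION * seg.compression_rate
    )
    return min(1.0, max(0.0, score))


def compress(segments: list[Segment], threshold: float = 0.5) -> str:
    """Keep only the segments whose elimination probability is below the threshold."""
    kept = [s.text for s in segments if elimination_probability(s) < threshold]
    return " ".join(kept)
```

With these made-up weights, a segment that is uninformative and can be removed without hurting grammaticality receives a high elimination probability, while an informative segment is kept.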

Additional Info

Field	Value
Source	https://theses.hal.science/tel-00998924
Author	Molina Villegas, Alejandro
Maintainer	CCSD
Last Updated	May 5, 2026, 09:51 (UTC)
Created	May 5, 2026, 09:51 (UTC)
Identifier	NNT: 2013AVIG0195
Language	fr
Rights	https://about.hal.science/hal-authorisation-v1/
contributor	Laboratoire Informatique d'Avignon (LIA) ; Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI
creator	Molina Villegas, Alejandro
date	2013-09-30T00:00:00
harvest_object_id	0c609c7f-7e1c-4c92-b647-738087c8d0c0
harvest_source_id	3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title	test moissonnage SELUNE
metadata_modified	2026-03-31T00:00:00
set_spec	type:THESE