Text line segmentation is one of the preprocessing stages of modern optical
character recognition (OCR) systems. The algorithmic approach proposed in this paper has been
designed for this exact purpose. Its main characteristic is the combination of
two different techniques: morphological image operations and horizontal
histogram projections. The method was developed for a historical data
collection that commonly exhibits quality issues such as degraded paper,
blurred text, or the presence of noise. For that reason, the segmenter in question
could be of particular interest to cultural institutions that want access to
robust line bounding boxes for a given historical document. Because its
promising segmentation results come at low computational cost, the
algorithm was incorporated into the OCR pipeline of the National Library of
Luxembourg as part of the initiative to reprocess its historical
newspaper collection. The general contribution of this paper is to outline the
approach and to evaluate its gains in accuracy and speed over the
segmentation algorithm bundled with the open-source OCR software used.
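
To make the combination of the two techniques concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes a binarized page (ink = True), smears the ink horizontally with a simple morphological dilation, and then thresholds the horizontal projection profile to find line bands. The function names and the parameters `smear_width` and `min_fraction` are hypothetical and chosen only for illustration.

```python
import numpy as np


def dilate_horizontal(binary, width):
    """Binary dilation with a 1 x (2 * (width // 2) + 1) structuring element."""
    half = width // 2
    padded = np.pad(binary, ((0, 0), (half, half)), mode="constant")
    out = np.zeros_like(binary)
    # Logical OR of all horizontal shifts within the window.
    for shift in range(2 * half + 1):
        out |= padded[:, shift:shift + binary.shape[1]]
    return out


def segment_lines(binary, smear_width=5, min_fraction=0.3):
    """Return line boxes (x0, y0, x1, y1), with x1 and y1 exclusive.

    Assumption: rows whose smeared ink covers at least `min_fraction`
    of the page width are treated as belonging to a text line.
    """
    binary = np.asarray(binary, dtype=bool)
    smeared = dilate_horizontal(binary, smear_width)

    # Horizontal projection: fraction of ink pixels in each row.
    profile = smeared.sum(axis=1) / binary.shape[1]
    is_text = profile >= min_fraction

    # Locate the start and (exclusive) end of each run of text rows.
    padded = np.concatenate(([False], is_text, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[::2], changes[1::2]

    boxes = []
    for y0, y1 in zip(starts, ends):
        # Horizontal extent is taken from the original, unsmeared ink.
        cols = np.flatnonzero(binary[y0:y1].any(axis=0))
        if cols.size:
            boxes.append((int(cols[0]), int(y0), int(cols[-1]) + 1, int(y1)))
    return boxes
```

In this sketch the dilation merges neighbouring characters into long horizontal runs, so genuine text rows exceed the projection threshold while isolated noise pixels do not. A production segmenter would need further steps, such as handling skew and multiple columns, that are outside the scope of this illustration.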