Novel Approach to Background-Text-Non-Text Separation in Ancient Degraded Document Images

Asatryan David


Title: Novel Approach to Background-Text-Non-Text Separation in Ancient Degraded Document Images

Abstract:

Many handwritten and printed ancient documents need to be digitized for automated processing and analysis. This paper proposes an approach to background-text-non-text separation based on differences in the sizes of the objects present in a document image, which can be obtained by binarization and segmentation algorithms. The image is first binarized by a suitable method and then segmented, and the distribution of segment sizes is obtained. The three types of objects present in an image are assumed to have significantly different sizes, so the separation problem reduces to partitioning the set of segments into three groups. The thresholds separating these groups can be found by minimizing the intra-sample (within-group) variation used in discriminant analysis. Several example images from the Matenadaran collection are considered, and the separated parts of each image are illustrated and interpreted.
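
The threshold search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact criterion, so the sketch assumes that "intra-sample variation" means the total within-group sum of squared deviations of the segment sizes, that sizes are raw pixel counts, and that the two thresholds are found by exhaustive search over the sorted sizes. The function names `within_group_ss` and `split_into_three` are hypothetical.

```python
from itertools import accumulate


def within_group_ss(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the mean for sorted values[i:j]."""
    n = j - i
    if n == 0:
        return 0.0
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def split_into_three(sizes):
    """Split segment sizes into three groups (small, medium, large).

    Assumption: the best split minimizes the total within-group sum of
    squared deviations. Returns ((t1, t2), (small, medium, large)),
    where t1 and t2 are midpoints between neighbouring groups.
    """
    values = sorted(float(v) for v in sizes)
    n = len(values)
    if n < 3:
        raise ValueError("need at least three segment sizes")

    # Prefix sums let each candidate split be scored in constant time.
    prefix = [0.0] + list(accumulate(values))
    prefix_sq = [0.0] + list(accumulate(v * v for v in values))

    best = None
    for a in range(1, n - 1):          # first group is values[:a]
        for b in range(a + 1, n):      # second group is values[a:b]
            cost = (within_group_ss(prefix, prefix_sq, 0, a)
                    + within_group_ss(prefix, prefix_sq, a, b)
                    + within_group_ss(prefix, prefix_sq, b, n))
            if best is None or cost < best[0]:
                best = (cost, a, b)

    _, a, b = best
    t1 = (values[a - 1] + values[a]) / 2
    t2 = (values[b - 1] + values[b]) / 2
    return (t1, t2), (values[:a], values[a:b], values[b:])


if __name__ == "__main__":
    # Made-up segment sizes in pixels, for illustration only.
    demo_sizes = [2, 3, 4, 3, 2, 40, 45, 50, 42, 900, 950, 920]
    thresholds, groups = split_into_three(demo_sizes)
    print(thresholds)
    print(groups)
```

Which of the three size groups corresponds to background, text, or non-text depends on the image and is not decided by this sketch. The exhaustive search is quadratic in the number of segments, so for large images a histogram of sizes, or sizes on a logarithmic scale, may be a more practical input.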