
Intelligent Document Scanner

Devi Devapal, Aravind M.P, Athul P, Pranav S. Chandran, Vishak K.K

Abstract


There is a strong trend today toward scanning paper documents and converting them into a digital format. Most scanners treat the whole document as a single image. We propose a novel “Intelligent Document Scanner” that automatically segments and classifies the contents of a document image, including text, tables and images, and stores them as a PDF document with three sections containing the text, tables and images extracted from the image. Segmentation consists of three steps: object extraction, object clustering and object filtering. Before object extraction, Mean-Shift filtering is performed to smooth the document. The K-means algorithm is employed for object clustering, and a Kernel Propagation algorithm is used to find the relationships between kernels. Classification comprises seed point calculation and categorization of the contents based on weighted priority. The categorized results are then stored as a PDF document.
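
The abstract gives no implementation details, so the following is only a minimal sketch of the object-clustering step. It assumes each extracted object is reduced to the centre of its bounding box and that plain Lloyd's K-means is run on those 2-D points. The function name `kmeans`, the deterministic initialization from the first `k` points, and the sample coordinates are hypothetical and are not taken from the paper.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-means on 2-D points; returns (centroids, labels)."""
    # Assumption: initialize centroids from the first k points so the
    # result is deterministic. The paper does not state its initialization.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical bounding-box centres (x, y) of extracted objects:
# three near the top-left of the page, three further down and right.
object_centres = [
    (10, 10), (12, 11), (11, 9),
    (200, 300), (205, 298), (198, 305),
]
centroids, labels = kmeans(object_centres, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this sketch, objects that lie close together on the page end up with the same label, which is the kind of grouping a later filtering and categorization stage could work from. The Mean-Shift filtering and Kernel Propagation stages are not covered here.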

Cite this Article: Devi Devapal, Aravind MP, Athul P et al. Intelligent Document Scanner. Journal of Computer Technology & Applications. 2016; 7(2): 47–57p.


Keywords


Mean-Shift filtering, Object extraction, Object Clustering, K-means, Object Filter, Kernel Propagation, Seed Point, Categorization, Weight priority

