Office Documents Classification under Limited Sample. A Case of Table Detection Inside Court Files
Journal: Journal of Information and Organizational Sciences (JIOS) (Vol. 46, No. 2)
Publication Date: 2022-12-22
Authors: Paweł Baranowski; Adrian Stepniak
Page: 293-304
Keywords: Convolutions; Deep learning; Document processing; Image classification; Office documents
Abstract
Deep convolutional neural networks (CNNs) have become an industry standard in image processing. However, in supervised learning they need a large annotated sample to maintain their high performance. In this paper we apply techniques designed for relatively small samples to a dataset of court files. Specifically, we propose transfer learning and semi-supervised learning to classify each scanned page as containing a table or not. We use four CNN architectures established in the literature and find that transfer learning improves classification performance compared to fully supervised learning. This result is especially evident in scenarios where only some of the convolutional layers are transferred. The gains from semi-supervised learning are ambiguous, as the results vary across CNN architectures. Overall, our results show that office document classification can achieve high accuracy when the initial convolutional layers are transferred.
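The idea of transferring only the initial convolutional layers can be illustrated with a small, framework-free sketch. The abstract does not name the four architectures, the deep learning library, or whether the transferred layers were frozen, so everything below is an assumption made for illustration: the `Layer` class, the `make_network` and `transfer_initial_conv_layers` helpers, the toy weight vectors, and the choice to freeze the copied layers are all hypothetical and are not the authors' implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class Layer:
    """Toy stand-in for a network layer (hypothetical, not a real library class)."""
    name: str
    kind: str              # "conv" or "dense"
    weights: list          # flat list of floats standing in for a weight tensor
    trainable: bool = True


def make_network(seed, n_conv=4, n_out=2):
    """Build a toy CNN: n_conv convolutional layers followed by one classifier layer."""
    rng = random.Random(seed)
    layers = [
        Layer(f"conv{i + 1}", "conv", [rng.gauss(0, 1) for _ in range(9)])
        for i in range(n_conv)
    ]
    layers.append(Layer("classifier", "dense", [rng.gauss(0, 1) for _ in range(n_out)]))
    return layers


def transfer_initial_conv_layers(source, target, n_transfer, freeze=True):
    """Copy the first n_transfer convolutional layers from source into target.

    Later convolutional layers and the classifier keep their own (random)
    initialization and stay trainable. Freezing the copied layers is an
    assumption of this sketch, not something stated in the abstract.
    """
    conv_pairs = [
        (s, t) for s, t in zip(source, target)
        if s.kind == "conv" and t.kind == "conv"
    ]
    if not 0 <= n_transfer <= len(conv_pairs):
        raise ValueError("n_transfer must be between 0 and the number of conv layers")
    for s, t in conv_pairs[:n_transfer]:
        t.weights = list(s.weights)   # copy, so later updates do not alias the source
        t.trainable = not freeze
    return target


# "pretrained" plays the role of a network trained on a large generic image set;
# "table_net" is the table / no-table page classifier trained on the small sample.
pretrained = make_network(seed=0)
table_net = make_network(seed=1)
transfer_initial_conv_layers(pretrained, table_net, n_transfer=2)

for layer in table_net:
    print(layer.name, "trainable" if layer.trainable else "frozen (transferred)")
```

In a real setting the same pattern would be expressed with the chosen framework's pretrained models, and `n_transfer` would be treated as a hyperparameter to compare partial against full transfer.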