document processing
Recently Published Documents


TOTAL DOCUMENTS: 251 (FIVE YEARS 35)

H-INDEX: 17 (FIVE YEARS 1)

Author(s):  
Andrey M. Kitenko

The paper explores the use of neural networks to isolate target artifacts in different types of documents. Neural networks are widely used in document processing, from text analysis to locating regions that may contain the desired information. To date, however, no document processing system can work fully autonomously and compensate for human errors caused by stress, fatigue, and many other factors. This work focuses on finding and extracting target artifacts in drawings when only a small amount of initial data is available. The proposed method consists of two main parts: detection of the region containing an artifact, and semantic segmentation of the detected region. It relies on supervised training of two convolutional neural networks on labeled data. The first network detects the region containing an artifact; in this example it is based on YOLOv4. Semantic segmentation uses the U-Net architecture with a pre-trained EfficientNet-B0 as its backbone. Combining these networks gave good results, even for extracting certain handwritten text, without relying on model designs specific to text recognition. The method can be used to find and highlight artifacts in large datasets; the artifacts may differ in shape, color, and type, may appear anywhere in the image, and may or may not overlap other objects.
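The two-stage idea in this abstract (detect a region, then segment only inside it) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `find_artifacts`, `toy_detect`, and `toy_segment` are hypothetical, and the two toy functions are trivial stand-ins for the trained YOLOv4 detector and the U-Net segmenter, which are not reproduced here.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def find_artifacts(
    image: Image,
    detect: Callable[[Image], List[Box]],
    segment: Callable[[Image], Image],
) -> Image:
    """Stage 1 proposes boxes; stage 2 segments each crop.

    The per-crop masks are pasted back into a full-size binary mask.
    """
    height, width = len(image), len(image[0])
    full_mask = [[0] * width for _ in range(height)]
    for box in detect(image):
        top, left, _, _ = box
        local_mask = segment(crop(image, box))
        for dy, mask_row in enumerate(local_mask):
            for dx, value in enumerate(mask_row):
                if value:
                    full_mask[top + dy][left + dx] = 1
    return full_mask


def toy_detect(image: Image) -> List[Box]:
    """Hypothetical stand-in for a detector: one box around all non-zero pixels."""
    coords = [(y, x) for y, row in enumerate(image) for x, v in enumerate(row) if v > 0]
    if not coords:
        return []
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return [(min(ys), min(xs), max(ys) + 1, max(xs) + 1)]


def toy_segment(patch: Image) -> Image:
    """Hypothetical stand-in for a segmenter: threshold the crop at 128."""
    return [[1 if v > 128 else 0 for v in row] for row in patch]


image = [
    [0, 0, 0, 0, 0],
    [0, 200, 50, 0, 0],
    [0, 90, 255, 0, 0],
    [0, 0, 0, 0, 0],
]
mask = find_artifacts(image, toy_detect, toy_segment)
```

Passing the detector and segmenter in as callables keeps the pipeline independent of any particular model, so trained networks could be substituted for the toy functions without changing `find_artifacts`.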


2021 ◽  
Vol 2 (2) ◽  
pp. 169-174
Author(s):  
Diva Permata Tri Putri ◽  
Eva Wina Aprielya Damayanti ◽  
Intan Sianturi

The Covid-19 pandemic has had severe consequences for the world economy, including in Indonesia. Government regulations require the public to follow health protocols, notably social distancing, which hampers traders in their buying and selling activities. The purpose of this study is to analyze the impact of Covid-19 on export-import activities in Indonesia. The study uses the desk study method, collecting secondary data obtained from BPS in 2020. The results show that Covid-19 has had the following impacts: (1) the largest declines in the value of imports in Indonesia occurred in February 2020 and May 2020; (2) document processing takes longer because of the pandemic, and all import-export activities must follow health protocols; (3) delays in handling ships at port (ship delay) postpone the arrival of goods to their owner (the importer), who must therefore budget for higher import costs.


Data ◽  
2021 ◽  
Vol 6 (7) ◽  
pp. 78
Author(s):  
Dipali Baviskar ◽  
Swati Ahirrao ◽  
Ketan Kotecha

The day-to-day working of an organization produces a massive volume of unstructured data in the form of invoices, legal contracts, mortgage processing forms, and many more. Organizations can utilize the insights concealed in such unstructured documents for their operational benefit. However, analyzing and extracting insights from such numerous and complex unstructured documents is a tedious task. Hence, the research in this area is encouraging the development of novel frameworks and tools that can automate the key information extraction from unstructured documents. However, the availability of standard, best-quality, and annotated unstructured document datasets is a serious challenge for accomplishing the goal of extracting key information from unstructured documents. This work expedites the researcher’s task by providing a high-quality, highly diverse, multi-layout, and annotated invoice documents dataset for extracting key information from unstructured documents. Researchers can use the proposed dataset for layout-independent unstructured invoice document processing and to develop an artificial intelligence (AI)-based tool to identify and extract named entities in the invoice documents. Our dataset includes 630 invoice document PDFs with four different layouts collected from diverse suppliers. As far as we know, our invoice dataset is the only openly available dataset comprising high-quality, highly diverse, multi-layout, and annotated invoice documents.


Author(s):  
Rajneesh Rani ◽  
Renu Dhir ◽  
Deepti Kakkar ◽  
Nonita Sharma

Identifying the script in a document page image is the first step for an OCR system that processes multi-script documents. In a multilingual, multi-script world, document processing systems that need human involvement to select the appropriate OCR package are undesirable and inefficient. Developing robust and efficient methods for automatic script identification is therefore of major importance for automatic document processing in a multilingual, multi-script environment. The basic objective is to find intuitive methods that are straightforward to implement without compromising efficiency. The aim of this work is to evaluate state-of-the-art feature extraction and classification techniques for automatic script identification of printed and handwritten documents and to propose the best combination of the two.
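The role of script identification as a routing step ahead of OCR can be sketched as follows. This is an illustration only, under several assumptions: the nearest-centroid classifier is a simple stand-in and is not the technique proposed by the authors, the two-dimensional feature vectors and centroid values are made up, and the script labels and package names (`ocr-latin`, `ocr-devanagari`) are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def nearest_script(features: Sequence[float],
                   centroids: Dict[str, Sequence[float]]) -> str:
    """Return the script whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda script: math.dist(features, centroids[script]))


def route_page(features: Sequence[float],
               centroids: Dict[str, Sequence[float]],
               packages: Dict[str, str]) -> Tuple[str, str]:
    """Identify the script, then pick the OCR package registered for it."""
    script = nearest_script(features, centroids)
    return script, packages[script]


# Hypothetical per-script centroids, e.g. averaged from labeled training pages.
CENTROIDS = {"Latin": [0.2, 0.8], "Devanagari": [0.9, 0.1]}
# Hypothetical mapping from script to OCR package name.
OCR_PACKAGES = {"Latin": "ocr-latin", "Devanagari": "ocr-devanagari"}

script, package = route_page([0.25, 0.7], CENTROIDS, OCR_PACKAGES)
```

With a classifier in this position, no person has to choose the OCR package for each page; a stronger feature extractor or classifier can replace `nearest_script` without changing the routing step.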


2021 ◽  
Author(s):  
Yeti Komalasari

The purpose of this study is to evaluate document processing measures at a One Stop Service Center managed by PT. Lintas Samudera Borneo Lines. The study uses a qualitative method involving data collection, direct interviews with research subjects, and a literature study. Based on the research results and the discussion of document processing issues at the One Stop Service Center, the measure taken was to distribute work outside regular working hours by recruiting new employees for operations. The main obstacle is the lack of two-way communication between ships and shipping companies. Keywords: Process, Ship Document, One Stop Service Center


2021 ◽  
Vol 69 (3) ◽  
pp. 3399-3411
Author(s):  
Suliman Aladhadh ◽  
Hidayat Ur Rehman ◽  
Ali Mustafa Qamar ◽  
Rehan Ullah Khan

2021 ◽  
Author(s):  
Dániel Görög ◽  
Mátyás Rényi

The presentation gives an overview of the Mikes Kelemen Program, which has been running since 2013 under the auspices of various public entities including the National Széchényi Library, in terms of its processes, results and future potential. Since its launch the Program has processed and made available for public use 250,000 documents collected from eight countries on four continents. The 90,000 documents placed so far offer insight into the document needs of the domestic library system and of Hungarian minorities abroad. After two years of development, the initial HTML-based service interface listing the documents on offer was replaced in 2020 by a new one built on an SQL database. The implementation was driven by knowledge gained in the first years of the Program, including document processing experience, utilization statistics for the documents on offer, and feedback from partner institutions that joined the Program. The new database-based interface manages the exchange of duplicates differently from current Hungarian practice. The Mikes Kelemen Program website is characterized by service-oriented operation and integrated management of the processing and recommendation processes. It enables accurate, reliable, automated and trackable document management, which may provide a blueprint for the overhaul of the national duplicates exchange system, replacing “digital paper-based” records such as Excel or Word files.


2021 ◽  
Author(s):  
Khalid Al Khatib ◽  
Tirthankar Ghosal ◽  
Yufang Hou ◽  
Anita de Waard ◽  
Dayne Freitag