Retrieving and Saving Meaningful Keywords in Unstructured PDF Documents using Binary Decision Diagrams
As the data generated and processed across platforms grows more complex, the need for consistency has grown with it. Structured data supports many uses that are not feasible with unstructured data. The purpose of this study was to convert unstructured data in Portable Document Format (PDF) documents into structured form using a new framework based on Binary Decision Diagrams (BDDs) and Boolean operations. A BDD is a data structure that represents a Boolean function, which takes Boolean inputs and produces a Boolean output, as a diagram of binary decisions. The research is carried out mainly to show how a large amount of data can be stored easily in the form of bits. The focus is on retrieving meaningful information from unstructured textual data in PDF documents using Boolean operations and the bag model, and on saving the meaningful keywords in the form of binary decision trees. The documents are then clustered based on the commonalities between them. This research presents a way to increase the efficiency of converting unstructured data to structured data in PDF documents and to save a large amount of data in the form of bits using this novel framework.
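The idea of storing keywords as bits can be illustrated with a minimal sketch. It is not the framework proposed in this study, only one possible reading of it under stated assumptions: the text has already been extracted from the PDF, the "bag model" is taken to mean a record of which words are present, and the keyword sets are stored in a small reduced ordered BDD. The stop-word list and all names (`extract_keywords`, `to_bits`, `common_keywords`, `KeywordBDD`) are hypothetical.

```python
import re
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical stop-word list; a real system would use a fuller one.
STOPWORDS = frozenset({"a", "an", "and", "for", "in", "of", "the", "to", "with"})

Bits = Tuple[int, ...]
Node = Tuple[int, int, int]  # (variable index, low child, high child)


def extract_keywords(text: str) -> FrozenSet[str]:
    """Lower-case the text, split it into words, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return frozenset(t for t in tokens if t not in STOPWORDS)


def to_bits(keywords: FrozenSet[str], vocabulary: List[str]) -> Bits:
    """One bit per vocabulary word: 1 if the document contains it, else 0."""
    return tuple(1 if word in keywords else 0 for word in vocabulary)


def common_keywords(a: Bits, b: Bits, vocabulary: List[str]) -> List[str]:
    """Boolean AND of two bit vectors, mapped back to the shared words."""
    return [word for word, x, y in zip(vocabulary, a, b) if x & y]


class KeywordBDD:
    """Reduced ordered BDD of the function f(bits) = 1 iff bits was stored.

    Node ids 0 and 1 are the FALSE and TRUE terminals.
    """

    def __init__(self, num_vars: int) -> None:
        self.num_vars = num_vars
        self.nodes: Dict[int, Node] = {}
        self._unique: Dict[Node, int] = {}

    def _make(self, var: int, low: int, high: int) -> int:
        if low == high:  # the test on this variable is redundant
            return low
        key = (var, low, high)
        if key not in self._unique:  # share structurally identical nodes
            node_id = len(self._unique) + 2
            self._unique[key] = node_id
            self.nodes[node_id] = key
        return self._unique[key]

    def build(self, vectors: Set[Bits], var: int = 0) -> int:
        """Return the root node id for the given set of bit vectors."""
        if not vectors:
            return 0
        if var == self.num_vars:
            return 1
        low = self.build({v for v in vectors if v[var] == 0}, var + 1)
        high = self.build({v for v in vectors if v[var] == 1}, var + 1)
        return self._make(var, low, high)

    def contains(self, root: int, vector: Bits) -> bool:
        """Follow one branch per bit until a terminal is reached."""
        node = root
        while node > 1:
            var, low, high = self.nodes[node]
            node = high if vector[var] else low
        return node == 1


# Example usage on two short, made-up texts.
texts = {
    "doc1": "Binary decision diagrams represent Boolean functions",
    "doc2": "Clustering the documents with Boolean operations",
}
keyword_sets = {name: extract_keywords(t) for name, t in texts.items()}
vocabulary = sorted(set().union(*keyword_sets.values()))
bit_vectors = {name: to_bits(k, vocabulary) for name, k in keyword_sets.items()}
bdd = KeywordBDD(len(vocabulary))
root = bdd.build(set(bit_vectors.values()))
shared = common_keywords(bit_vectors["doc1"], bit_vectors["doc2"], vocabulary)
```

In this sketch the shared keywords found by the Boolean AND are the kind of commonality that a later clustering step could use. How the study itself measures commonality is not specified in the abstract.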