Text Analysis of Assembly Work Instructions
The objective of this research is to investigate the requirements and performance of part-of-speech tagging of assembly work instructions. Natural language processing of assembly work instructions is a prerequisite for mining them for knowledge reuse. Assembly work instructions are key process engineering elements that make the assembly quality and assembly lead times of products predictable. Authoring them is a subjective process, and it has been observed that most assembly work instructions are not grammatically complete sentences. It is hypothesized that this incompleteness leads natural language processing tools to assign incorrect part-of-speech tags. To test this hypothesis, two part-of-speech taggers are used to tag 500 assembly work instructions obtained from the automotive industry. The first tagger is from the Natural Language Toolkit (NLTK, nltk.org) and the second is from the Stanford Natural Language Processing Group (nlp.stanford.edu). Two experiments are conducted with each tagger. In the first experiment, the assembly work instructions are input to the tagger in raw form. In the second experiment, the assembly work instructions are first preprocessed to make them grammatically complete and then input to the tagger. The Stanford tagger applied to the preprocessed assembly work instructions is found to produce the fewest incorrect part-of-speech tags.
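The two experiments can be illustrated with a minimal Python sketch. It is not the procedure used in this research: the abstract does not specify how instructions are made grammatically complete, so the rule in `make_complete` (prepend a subject, lower-case the leading word, add a final period) is an assumption chosen only for illustration, and both function names are hypothetical. `count_false_tags` assumes a hand-assigned reference tagging is available for comparison.

```python
from typing import List, Tuple

Tagged = List[Tuple[str, str]]  # list of (token, part-of-speech tag) pairs


def make_complete(instruction: str, subject: str = "You") -> str:
    """Hypothetical preprocessing: turn an imperative fragment into a sentence.

    Assumed rule (not taken from the paper): prepend a subject, lower-case
    the leading word unless it is all upper case (e.g. an acronym), and
    make sure the sentence ends with a period.
    """
    text = instruction.strip()
    if not text:
        return text
    first_word = text.split()[0]
    if not first_word.isupper():
        text = text[0].lower() + text[1:]
    if not text.endswith("."):
        text += "."
    return f"{subject} {text}"


def count_false_tags(predicted: Tagged, reference: Tagged) -> int:
    """Count tokens whose predicted tag differs from the reference tag."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(1 for (_, p), (_, r) in zip(predicted, reference) if p != r)


if __name__ == "__main__":
    raw = "Install bolt to bracket"
    print(make_complete(raw))

    # Illustrative tags only; these are not results from the study.
    predicted = [("Install", "NN"), ("bolt", "NN")]
    reference = [("Install", "VB"), ("bolt", "NN")]
    print(count_false_tags(predicted, reference))
```

In the raw-form experiment the instruction would be passed to a tagger unchanged; in the preprocessed experiment the output of `make_complete` would be tagged instead. With NLTK, for example, tagging could be done with `nltk.pos_tag(nltk.word_tokenize(text))`, provided the required tokenizer and tagger models have been downloaded.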