Optimal stop word selection for text mining in critical infrastructure domain

Author(s):  
Kasun Amarasinghe ◽  
Milos Manic ◽  
Ryan Hruska

At present, a huge quantity of data is recorded in a variety of forms, such as text, image, video, and audio, and this quantity is expected to grow in the future. The major text-related tasks, namely entity extraction, information extraction, entity relation modeling, and document summarization, are performed using text mining. The main focus of this paper is document clustering, a subtask of text mining, and measuring the performance of different clustering techniques. We use an enhanced feature selection method for clustering text documents to show that it produces better results than traditional feature selection.


2021 ◽  
Vol 9 (2) ◽  
pp. 244-252
Author(s):  
Rizka Safitri Lutfiyani ◽  
Niken Retnowati

Email is quite popular as a digital communication medium because sending messages by email is easy. Unfortunately, most email messages are spam. Spam is a message the recipient does not want, because it usually contains advertisements or scams. Ham is a message the recipient does want. One way to sort these messages is to classify email messages as spam or ham. Naïve Bayes and the J48 decision tree are algorithms that can be used to classify email messages. This study therefore aims to compare the effectiveness of the Naïve Bayes and J48 decision tree algorithms in sorting spam email. The method used is text mining. Data containing the text of English-language email messages is preprocessed before being classified with Naïve Bayes and the J48 decision tree. The preprocessing stage comprises tokenization, stop word removal, stemming, and attribute selection. The email text data is then processed with the Naïve Bayes and J48 decision tree algorithms. Naïve Bayes is a classification algorithm based on Bayesian decision theory, whereas the J48 decision tree algorithm is a development of the ID3 decision tree algorithm. The result of this study is that the J48 decision tree achieved higher accuracy than Naïve Bayes: J48 achieved 93.117%, whereas Naïve Bayes achieved 88.5284%. The conclusion is that the J48 decision tree algorithm outperforms Naïve Bayes for sorting spam email in terms of accuracy.
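The preprocessing and Naïve Bayes steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop word list, the four training messages, and the function names `preprocess`, `train_naive_bayes`, and `classify` are all hypothetical, the classifier is a plain multinomial Naïve Bayes with Laplace smoothing, and stemming, attribute selection, and the J48 decision tree are left out.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, very small stop word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "to", "for", "at", "and", "of", "your", "now"}


def preprocess(text):
    """Lowercase the text, tokenize on runs of letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def train_naive_bayes(labeled_messages):
    """Count documents and tokens per class from (text, label) pairs."""
    class_docs = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_messages:
        class_docs[label] += 1
        for token in preprocess(text):
            token_counts[label][token] += 1
            vocabulary.add(token)
    return class_docs, token_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior (Laplace smoothing)."""
    class_docs, token_counts, vocabulary = model
    total_docs = sum(class_docs.values())
    tokens = [t for t in preprocess(text) if t in vocabulary]  # skip unseen words
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_tokens = sum(token_counts[label].values())
        for token in tokens:
            count = token_counts[label][token]
            score += math.log((count + 1) / (total_tokens + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, not taken from the study.
TRAINING = [
    ("Win a free prize now", "spam"),
    ("Cheap loans, claim your free money", "spam"),
    ("Meeting at noon for the project review", "ham"),
    ("Please send the project report", "ham"),
]

model = train_naive_bayes(TRAINING)
print(classify("Claim your free prize", model))
print(classify("Project meeting report", model))
```

With this toy data the first message is labeled spam and the second ham. A comparison like the one reported in the abstract would need a real corpus and a held-out test set.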


A research paper is a rich source of academic and innovative writing on a particular topic, and it is unstructured in nature. Document categorization refers to classifying documents into predefined classes. It is arduous for a user to categorize research papers into different domains because extracting meaningful and relevant words from a research paper is a challenging task. To extract important information, we used certain methods and classifiers. Bag of words and TF-IDF are used for processing the data. Preprocessing includes string tokenization and stop-word removal. The processed data is then classified using an SVM classifier. Because there are four predefined classes, a one-vs-rest (1-v-r) classifier is used for multiclass classification. The system performance is 88% with 800 training and 200 testing documents. The analysis shows that the model performs better when more training data is available. The aim of this work is to categorize the documents and assign a set of predefined tags to them. It also evaluates the performance of the model using different percentages of documents for the training and testing sets.
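The bag-of-words and TF-IDF step mentioned in this abstract might look roughly like the sketch below. The abstract does not say which TF-IDF variant was used, so this assumes the plain form, term frequency times log(N / document frequency). The stop word list, the sample documents, and the names `tokenize` and `tfidf_vectors` are hypothetical, and the SVM one-vs-rest classification step is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical stop word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "is", "for", "on"}


def tokenize(text):
    """Lowercase, split into alphabetic tokens, and remove stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * log(N / document frequency)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # bag-of-words counts for this document
        length = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample documents.
DOCUMENTS = [
    "Stop word removal in text mining",
    "Clustering of text documents",
    "Air defence of the region",
]

vectors = tfidf_vectors(DOCUMENTS)
for vector in vectors:
    print(sorted(vector.items(), key=lambda item: -item[1]))
```

In this sketch a term that appears in several documents, such as "text", receives a lower weight than a term unique to one document, such as "mining". The resulting vectors are the kind of input a classifier like an SVM would be trained on.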


2017 ◽  
Vol 3 ◽  
pp. 25-30
Author(s):  
Szymanowski Karol

In this article the author studies the requirements for air defence subunits in the forest-lake environment of the Suwalki isthmus. To that end, the author underlines the importance of the region, identifies the specifics of the isthmus, performs a terrain analysis, and defines the requirements for air defence subunits operating in that area. The proposals drawn from the research indicate the need to use so-called light air defence in order to provide efficient defence for forces and critical infrastructure in the region. The author also takes into account the origin and presence of the air defence elements of the Territorial Defence Forces (TDFs), drawing attention to their place in the overall air defence system of the area and emphasizing the importance of suitable armament selection for the newly formed TDF air defence subunits.


2021 ◽  
Vol 75 (2) ◽  
pp. 45-51
Author(s):  
Oleksiy Tsurkan ◽  

The level of trust in the police service depends on objectively transparent and unprejudiced requirements. We therefore agree with the necessity of using the European approach in conducting «selection» for duty in the National Police of Ukraine. This is because creating a fundamentally new structure for the modern police of Ukraine starts directly with selection for service, which should be in accordance with European terms. One of the main approaches to translating legal terminology is the use of a system of legal terms. Jurisprudence notes that it is essential to achieve unambiguity of each term in legal texts and legislation. This means striving for the minimum required number of terms, but with the loss of those nuances that are necessary for public administration practice. The article focuses on the differences between the translated definitions of the word «selection» used in Ukrainian legal texts and research. Some researchers describe the process of personnel «selection», through the principles of systemic character, as a procedure for differentiating staff according to their compliance with a certain type of activity and for deciding on the suitability or unsuitability of candidates. The author determines the etymological origin of the concepts of «selection» in Ukrainian and differentiates the use of their translations. The paper proposes changes to legal acts in order to remove inaccuracy in the interpretation of the concepts.
The system of «selection» of civil service staff in Ukraine includes: defining the requirements for applicants for specific vacant civil service positions; review and evaluation of internal and external sources for attracting candidates, and placement of recruitment advertisements; competitive selection; acceptance for positions outside the competition (according to another procedure provided by current legislation: the transfer system; appointment to the post; the selection system; by contract); formation of a personnel reserve; internship; and personnel assessment. The notion of «selection» has a broader and more widely used meaning, which indicates the need for its use in legislation.

