Ranking Biomedical Annotations with Annotator’s Semantic Relevancy

Computational and Mathematical Methods in Medicine ◽

10.1155/2014/258929 ◽

2014 ◽

Vol 2014 ◽

pp. 1-11

Author(s):

Aihua Wu

Keyword(s):

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Semantic Relationship ◽

Critical Problem ◽

Data Set ◽

Relational Information ◽

Common Features ◽

Data User

Biomedical annotation is a common and effective artifact for researchers to discuss, express opinions, and share discoveries. It is becoming increasingly popular in many online research communities and carries much useful information. Ranking biomedical annotations is a critical problem for data users who need to find information efficiently. Because the annotator’s knowledge about the annotated entity normally determines the quality of the annotations, we evaluate that knowledge, that is, the semantic relationship between annotator and entity, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second is frequent pattern mining over historical annotations, which reveals common features of the biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes, and we merge the information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users’ votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large.


Frequent Pattern Mining Based on Pattern Space Division in Map/Reduce Cluster

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.588-589.2038 ◽

2012 ◽

Vol 588-589 ◽

pp. 2038-2041

Author(s):

Qian Liu ◽

Ming Chen

Keyword(s):

Pattern Mining ◽

Recursive Algorithm ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Map Reduce ◽

Data Set ◽

Combinatorial Explosion ◽

Space Division ◽

Pattern Space ◽

The Many

By means of pattern space division and based on Map/Reduce, the problem of processing the many-to-many correspondence between the data set and the pattern set is converted into the problem of processing the many-to-many correspondence between the data subsets and the pattern subspaces associated with the frequent 1-itemsets. Thus, the scale of the intermediate key/value pair set is reduced so dramatically that the single Map node bottleneck, which results from the combinatorial explosion of the candidate pattern space, is avoided. Over three rounds of Map/Reduce tasks, the pattern space is constructed and divided, the filtering rules are established and applied, and, furthermore, the mining of frequent patterns is carried out in each pattern subspace independently. By making the best of both the universal traits of the entire pattern space and the individuality of each pattern subspace, an optimized non-recursive algorithm is designed and implemented to improve the efficiency of the mining phase.
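The idea of dividing the pattern space by frequent 1-itemsets can be illustrated with a small single-process simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: the function names (`frequent_items`, `map_phase`, `reduce_phase`, `mine`) are hypothetical, the abstract's filtering rules and optimized non-recursive miner are not specified and are replaced here by brute-force enumeration inside each subspace, and the toy transactions are invented. Each frequent item acts as a pivot, and the subspace of a pivot holds the patterns whose smallest item is that pivot.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_items(transactions, min_count):
    """Round 1 (assumed): count single items and keep the frequent ones."""
    counts = Counter(item for t in transactions for item in set(t))
    return {item for item, c in counts.items() if c >= min_count}


def map_phase(transactions, keep):
    """Emit (pivot, suffix) pairs: one projected data subset per subspace."""
    for t in transactions:
        items = sorted(set(t) & keep)  # drop infrequent items, fix an order
        for pos, pivot in enumerate(items):
            yield pivot, tuple(items[pos + 1:])


def reduce_phase(pivot, suffixes, min_count):
    """Mine one subspace independently (brute force, exponential in suffix size)."""
    counts = Counter()
    for suffix in suffixes:
        for r in range(len(suffix) + 1):
            for combo in combinations(suffix, r):
                counts[(pivot,) + combo] += 1
    return {p: c for p, c in counts.items() if c >= min_count}


def mine(transactions, min_count):
    """Group map output by pivot, then reduce each group on its own."""
    keep = frequent_items(transactions, min_count)
    groups = defaultdict(list)
    for pivot, suffix in map_phase(transactions, keep):
        groups[pivot].append(suffix)
    result = {}
    for pivot, suffixes in groups.items():
        result.update(reduce_phase(pivot, suffixes, min_count))
    return result


# Invented toy data, for illustration only.
TOY = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
patterns = mine(TOY, min_count=2)
```

Because every pattern belongs to exactly one pivot's subspace, the reduce groups never need to exchange counts, which is the property that lets each subspace be mined on a separate node.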


An improved and efficient frequent pattern mining approach to discover frequent patterns among important attributes in large data set using IA-TJ-FGTT

2016 IEEE International Conference on Advances in Computer Applications (ICACA) ◽

10.1109/icaca.2016.7887920 ◽

2016 ◽

Author(s):

Saravanan Suba ◽

T. Christopher

Keyword(s):

Pattern Mining ◽

Frequent Pattern Mining ◽

Large Data ◽

Frequent Pattern ◽

Frequent Patterns ◽

Data Set ◽

Large Data Set


Proposing Pattern Growth Methods for Frequent Pattern Mining on Account of Its Comparison Made with the Candidate Generation and Test Approach for a Given Data Set

Advances in Intelligent Systems and Computing - Software Engineering ◽

10.1007/978-981-10-8848-3_20 ◽

2018 ◽

pp. 203-209

Author(s):

Vaibhav Kant Singh

Keyword(s):

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Data Set ◽

Pattern Growth ◽

Growth Methods


APPLICATION OF DATA MINING USING THE APRIORI ALGORITHM TO DETERMINE PATTERNS AMONG CATEGORIES OF PEOPLE WITH SOCIAL WELFARE PROBLEMS

Sebatik ◽

10.46984/sebatik.v26i1.1622 ◽

2022 ◽

Vol 26 (1) ◽

Author(s):

Irwan Adji Darmawan ◽

Muhammad Fakhri Randy ◽

Imam Yunianto ◽

Muhamad Malik Mutoffar ◽

M Tio Putra Salis

Keyword(s):

Data Mining ◽

Association Rule ◽

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Itemset ◽

Frequent Pattern ◽

Data Set ◽

Minimum Support

People with Social Welfare Problems (Penyandang Masalah Kesejahteraan Sosial, PMKS) are one of the many problems found in urban areas, because they can disrupt urban development, public order, security, and stability. So far the measures taken have focused on handling PMKS and have not yet turned to prevention. Determining the pattern of PMKS categories is one approach that can be taken. The Apriori algorithm, an Association Rule method in data mining for determining frequent itemsets, helps find patterns in data (frequent pattern mining). Manual calculation with the Apriori algorithm yielded combination patterns including 3 rules with a minimum support of 15% and a highest confidence of 100%. The Apriori algorithm was tested with the RapidMiner application. RapidMiner is one of several data mining software packages; it supports, for example, text analysis and extraction of patterns from data sets, combined with statistical methods, databases, and artificial intelligence, to obtain high-quality information from the processed data. The test results give a comparison of patterns across PMKS categories. From the RapidMiner tests and the manual Apriori calculation it is concluded, under the testing criteria, that the category patterns (rules) and confidence values (c) from the manual calculation are not close to the RapidMiner results, so the accuracy of the test is low, only 37.5%.


Application of Data Mining Using the Apriori Algorithm to Determine Patterns in the Causes of Homelessness and Begging

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2020721376 ◽

2020 ◽

Vol 7 (2) ◽

pp. 229

Author(s):

Wirta Agustin ◽

Yulya Muharmi

Keyword(s):

Data Mining ◽

Association Rule ◽

Urban Areas ◽

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Itemset ◽

Frequent Pattern ◽

Data Sets ◽

Apriori Algorithm ◽

Data Set

Homeless people and beggars are one of the problems in urban areas, as they can disrupt public order, security, stability, and urban development. Current efforts still focus on handling existing homeless people and beggars rather than on prevention. One approach to this problem is to determine the age pattern of homeless people and beggars. The Apriori algorithm is an Association Rule method in data mining for determining frequent itemsets, which helps find patterns in data (frequent pattern mining). Manual calculation with the Apriori algorithm produced a combination pattern of 3 rules with a minimum support of 30% and a highest confidence of 100%. These patterns serve as a reference for the department in charge when taking precautionary action against rising numbers of homeless people and beggars. The Apriori algorithm was tested with the RapidMiner application, a data mining software package whose functions include text analysis and extracting patterns from data sets, combined with statistical methods, artificial intelligence, and databases, to obtain high-quality information from the processed data. The test results give a comparison of age patterns among people at risk of becoming homeless or beggars. Under the testing criteria, the age patterns (rules) and confidence values (c) from the manual Apriori calculation are not close to the values obtained with RapidMiner, so the accuracy of the test is low, namely 37.5%.
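How minimum support and confidence drive rule extraction can be sketched in a few lines. This is a minimal illustration assuming a plain textbook Apriori (level-wise join and prune), not the study's RapidMiner workflow or its manual calculation; the function names `apriori` and `rules` are hypothetical, the transactions are invented toy data, and only the 30% support threshold is taken from the abstract.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    rows = [frozenset(t) for t in transactions]
    n = len(rows)

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(itemset <= row for row in rows) / n

    current = {frozenset([item]) for row in rows for item in row}
    frequent = {}
    k = 1
    while current:
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent.
        current = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules meeting the threshold."""
    found = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = s / frequent[lhs]  # support(lhs + rhs) / support(lhs)
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, s, confidence))
    return found


# Invented toy data, for illustration only.
TOY = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
freq = apriori(TOY, min_support=0.3)  # 30% threshold, as in the abstract
strong = rules(freq, min_confidence=0.7)
```

A rule `lhs -> rhs` with 100% confidence, as reported in the abstract, would mean that every transaction containing `lhs` also contains `rhs`.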


A Data-Driven Based Dynamic Rebalancing Methodology for Bike Sharing Systems

Applied Sciences ◽

10.3390/app11156967 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6967

Author(s):

Marco Cipriano ◽

Luca Colomba ◽

Paolo Garza

Keyword(s):

Decision Making ◽

Pattern Mining ◽

Frequent Pattern Mining ◽

Real Data ◽

Frequent Pattern ◽

Data Driven ◽

System Usability ◽

Bike Sharing ◽

Data Driven Approach

Mobility in cities is a fundamental asset and opens several problems in decision making and the creation of new services for citizens. In recent years, transportation sharing systems have been growing continuously. Among these, bike sharing systems have become commonly adopted. There are two categories of bike sharing systems: station-based systems and free-floating services. In this paper, we concentrate our analyses on station-based systems. Such systems require periodic rebalancing operations, which move bicycles from full stations to empty stations, to guarantee good quality of service and system usability. In particular, we propose a dynamic bicycle rebalancing methodology based on frequent pattern mining, together with its implementation. The extracted patterns represent frequent unbalanced situations among nearby stations. They are used to predict upcoming critical statuses and plan the most effective rebalancing operations using an entirely data-driven approach. Experiments performed on real data from the Barcelona bike sharing system show the effectiveness of the proposed approach.


An Adaptive Data Distribution Through Tree Rules in Frequent Pattern Mining

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit183894 ◽

2018 ◽

pp. 300-305

Keyword(s):

Information Sharing ◽

Pattern Mining ◽

Data Distribution ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

General Development ◽

Secure Information ◽

Evaluation Parameters ◽

Secure Information Sharing

Information sharing among organizations is a general trend in several areas such as business promotion and marketing. Some of the sensitive rules that ought to be kept private may be disclosed, and such disclosure of sensitive patterns may affect the interests of the organization that owns the data. The sensitive rules must therefore be protected before the data is shared. In this paper, to provide secure information sharing, the sensitive rules, which are found with a frequent pattern tree, are first perturbed. Here the sensitive set of rules is perturbed by substitution. This kind of substitution reduces the risk and increases the utility of the dataset compared with other methods. Experiments are carried out on a real-world dataset. The results show that the proposed work performs better than various previous approaches on the basis of the evaluation parameters.
