Methodology for Exploring Patterns of Epigenetic Information in Cancer Cells Using Data Mining Technique

Epigenetic changes are a necessary characteristic of all cancer types. Tumor cells usually undergo genetic changes and epigenetic alterations as well. Identifying epigenetic features shared among various cancer types is valuable for discovering appropriate treatments, and existing epigenetic alteration profiles can aid in this goal. In this paper, we propose a new technique that applies data mining and clustering methodologies to the analysis of epigenetic changes in cancer. The proposed technique aims to detect common patterns of epigenetic changes in various cancer types. We validated the new technique by detecting epigenetic patterns across seven cancer types and by determining epigenetic similarities among them. The experimental results demonstrate that common epigenetic patterns do exist across these cancer types. Additionally, epigenetic analysis of the associated genes found a strong relationship with the development of various types of cancer and showed high risk across the studied cancer types. We used the frequent pattern data mining approach to represent cancer types compactly in terms of the promoters for some epigenetic marks. From the resulting frequent pattern itemsets, the most frequent items are identified and yield the bi-clusters of these patterns. In experiments, the proposed method achieved a success rate of 88% in detecting cancer types according to specific epigenetic patterns.
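The frequent pattern step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `frequent_itemsets`, the `geneX:markY` labels, and the toy data are all hypothetical, and the enumeration is brute force, suitable for tiny inputs only.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    Support is the fraction of transactions that contain the itemset.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                result[frozenset(candidate)] = count / n
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so none of a larger size either.
            break
    return result

# Hypothetical toy data: one set per cancer type, listing made-up
# promoter/mark combinations flagged as altered in that type.
samples = [
    {"geneA:mark1", "geneB:mark2", "geneC:mark3"},
    {"geneA:mark1", "geneB:mark2"},
    {"geneA:mark1", "geneC:mark3"},
    {"geneA:mark1", "geneB:mark2", "geneC:mark3"},
]
patterns = frequent_itemsets(samples, min_support=0.75)
for itemset, support in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

In a real analysis the itemsets shared by many cancer types would then feed the bi-clustering step; that step is not sketched here because the abstract does not specify it.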

MINING SIMILAR PATTERN WITH ATTRIBUTE ORIENTED INDUCTION HIGH LEVEL EMERGING PATTERN (AOI-HEP) DATA MINING TECHNIQUE

Jurnal Teknologi ◽

10.11113/jt.v79.11876 ◽

2017 ◽

Vol 79 (7-2) ◽

Author(s):

Harco Leslie Hendric Spits Warnars ◽

Nizirwan Anwar ◽

Richard Randriatoamanana ◽

Horacio Emilio Perez Sanchez

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Data Mining ◽

Similar Pattern ◽

Frequent Pattern ◽

Data Mining Technique ◽

Mining Technique ◽

High Level ◽

Concept Attribute ◽

Emerging Pattern

AOI-HEP (Attribute Oriented Induction High Level Emerging Pattern), a new data mining technique, has succeeded in mining frequent patterns and is extended here to mine similar patterns. AOI-HEP successfully mined 3 and 1 similar patterns from the IPUMS and breast cancer UCI machine learning datasets, respectively. Meanwhile, the experiments found no similar patterns in the adult and census UCI machine learning datasets. The experiments showed that finding AOI-HEP similar patterns in a dataset is influenced by the high level concept attribute chosen for learning in the concept hierarchy, as was also the case for AOI-HEP frequent patterns in previous research. The experiments chose the high level concept attributes workclass, clump thickness, means and marts for the adult, breast cancer, census and IPUMS datasets, respectively. To prove that the chosen high level concept attribute influences the AOI-HEP similar patterns in a dataset, extended experiments were carried out. They found that the census dataset, which previously had no AOI-HEP similar pattern, did have such a pattern when learned on the high level concept in the marital attribute. Meanwhile, the breast cancer dataset, which previously had 1 AOI-HEP similar pattern, had none when learned on the high level concept in attributes such as cell size, cell shape and bare nuclei. Two of the 3 similar patterns found in the IPUMS dataset are strong discriminant rules, since they have large growth rates of 1.53% and 3.47% and large supports in the target dataset of 4.54% and 5.45%, respectively. Moreover, they have small supports in the contrasting dataset of 2.96% and 1.57%, respectively.
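The growth rates quoted above can be checked with a short sketch. It assumes the usual emerging pattern definition, in which growth rate is the support in the target dataset divided by the support in the contrasting dataset; the abstract does not state AOI-HEP's exact formula, so the function `growth_rate` is illustrative only.

```python
def growth_rate(support_target, support_contrast):
    """Ratio of target support to contrasting support (assumed definition)."""
    if support_contrast == 0:
        # Conventional edge cases for emerging patterns.
        return float("inf") if support_target > 0 else 0.0
    return support_target / support_contrast

# (target support %, contrasting support %) as quoted in the abstract.
pairs = [(4.54, 2.96), (5.45, 1.57)]
rates = [round(growth_rate(t, c), 2) for t, c in pairs]
print(rates)
```

Under this assumption the two ratios round to 1.53 and 3.47, which coincide numerically with the growth rates reported for the two IPUMS patterns.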

Research of Improved FP-Growth Algorithm in Association Rules Mining

Scientific Programming ◽

10.1155/2015/910281 ◽

2015 ◽

Vol 2015 ◽

pp. 1-6 ◽

Cited By ~ 10

Author(s):

Yi Zeng ◽

Shiqun Yin ◽

Jiangyue Liu ◽

Miao Zhang

Keyword(s):

Data Mining ◽

Association Rules ◽

Experimental Results ◽

Frequent Pattern ◽

Association Rules Mining ◽

Classical Algorithm ◽

Pattern Growth ◽

Data Volume ◽

Better Than

Association rules mining is an important technology in data mining. The FP-Growth (frequent-pattern growth) algorithm is a classical algorithm in association rules mining. However, FP-Growth needs to scan the database twice during mining, which reduces the efficiency of the algorithm. Through the study of association rules mining and the FP-Growth algorithm, we worked out two improved versions of FP-Growth: the Painting-Growth algorithm and the N (not) Painting-Growth algorithm, which removes the painting steps and achieves the same result in another way. We compared the two improved algorithms with the FP-Growth algorithm. Experimental results show that, with a data volume of more than 1050 for the Painting-Growth algorithm and less than 10000 for the N Painting-Growth algorithm, the performance of the two improved algorithms is better than that of the FP-Growth algorithm.
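The two database scans that the abstract identifies as the cost of classical FP-Growth can be sketched as follows. This shows only the standard FP-tree construction, not the Painting-Growth or N Painting-Growth variants, whose details the abstract does not give; `Node`, `build_fp_tree`, and the toy transactions are illustrative names and data.

```python
from collections import Counter

class Node:
    """One FP-tree node: an item, its count, and child nodes keyed by item."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_count):
    # Scan 1: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    # Scan 2: drop infrequent items, order the rest by descending count
    # (ties broken alphabetically), and insert each transaction as a path.
    root = Node(None)
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root, frequent

# Toy transactions, invented for illustration.
db = [["a", "b", "c"], ["a", "b"], ["a", "c", "d"], ["a", "b"]]
root, frequent = build_fp_tree(db, min_count=2)
print(frequent)
```

Because the item ordering in the second scan depends on the counts from the first, the classical algorithm cannot merge the two passes directly, which is the inefficiency the improved algorithms target.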

An application of data mining techniques in designing catalogue for a laundry service

MATEC Web of Conferences ◽

10.1051/matecconf/201815401099 ◽

2018 ◽

Vol 154 ◽

pp. 01099

Author(s):

Annisa Uswatun Khasanah ◽

Deliana Ardhitama Erlangga ◽

Ahmad Mustopa Jamil

Keyword(s):

Data Mining ◽

Association Rule ◽

Strong Relationship ◽

Target Market ◽

Rule Mining ◽

Data Mining Technique ◽

Mining Technique ◽

Interesting Pattern ◽

Customer Segment ◽

The Media

Catalogues are media that companies use to promote their products or services. Since a catalogue is a marketing medium, the first essential step before designing a product catalogue is determining the target market. It is also important to include information that appeals to the target market, such as discounts or promos, by analysing customers' preference patterns in using services or buying products. This study applies two data mining techniques. The first is clustering analysis to segment customers, and the second is association rule mining to discover interesting patterns in the services that customers commonly use at the same service time. The results are used as recommendations for an attractive marketing strategy to be included in the service catalogue promo for a laundry in Sleman, Yogyakarta. The clustering result showed that the biggest customer segment is university students who come 3 to 5 times a month on weekends, while the association rule result showed that clothes, shoes, and bed sheets have a strong relationship. The catalogue design is presented at the end of the paper.
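A "strong relationship" between services is usually quantified with support, confidence, and lift. The sketch below computes these standard measures for one rule; the orders are invented for illustration and are not the study's data, and `rule_metrics` is a hypothetical helper name.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    n_a = sum(1 for t in transactions if a <= t)
    n_c = sum(1 for t in transactions if c <= t)
    n_ac = sum(1 for t in transactions if (a | c) <= t)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Invented laundry orders; item names follow the abstract.
orders = [
    {"clothes", "shoes", "bed sheet"},
    {"clothes", "shoes"},
    {"clothes", "bed sheet"},
    {"clothes"},
    {"clothes", "shoes", "bed sheet"},
]
support, confidence, lift = rule_metrics(orders, {"clothes", "shoes"}, {"bed sheet"})
print(support, confidence, lift)
```

A lift above 1 suggests the services are used together more often than independence would predict, which is the kind of evidence that could justify a bundled promo in the catalogue.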

FP-outlier: Frequent pattern based outlier detection

Computer Science and Information Systems ◽

10.2298/csis0501103h ◽

2005 ◽

Vol 2 (1) ◽

pp. 103-118 ◽

Cited By ~ 86

Author(s):

Zengyou He ◽

Xiaofei Xu ◽

Zhexue Huang ◽

Shengchun Deng

Keyword(s):

Data Mining ◽

Outlier Detection ◽

Frequent Itemsets ◽

Research Community ◽

Experimental Results ◽

New Method ◽

Frequent Pattern ◽

Data Detection ◽

Frequent Patterns ◽

Data Set

An outlier in a dataset is an observation or a point that is considerably dissimilar to or inconsistent with the remainder of the data. Detection of such outliers is important for many applications and has recently attracted much attention in the data mining research community. In this paper, we present a new method to detect outliers by discovering frequent patterns (or frequent itemsets) in the data set. Outliers are defined as the data transactions whose itemsets contain fewer frequent patterns. We define a measure called FPOF (Frequent Pattern Outlier Factor) to detect the outlier transactions and propose the FindFPOF algorithm to discover outliers. The experimental results show that our approach outperforms existing methods in identifying interesting outliers.
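A minimal sketch of the FPOF idea follows. It assumes the commonly cited definition, in which a transaction's FPOF is the summed support of the frequent patterns it contains, divided by the total number of frequent patterns; the abstract itself does not spell out the formula. The brute-force miner and the toy data are illustrative, and this is not the FindFPOF algorithm itself.

```python
from itertools import combinations

def frequent_patterns(transactions, min_support):
    """Brute-force frequent itemset mining; toy-sized inputs only."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    patterns = {}
    for size in range(1, len(items) + 1):
        for cand in combinations(items, size):
            s = sum(1 for t in transactions if set(cand) <= t) / n
            if s >= min_support:
                patterns[frozenset(cand)] = s
    return patterns

def fpof(transaction, patterns):
    """Assumed FPOF: summed support of contained patterns / number of patterns."""
    if not patterns:
        return 0.0
    contained = sum(s for p, s in patterns.items() if p <= transaction)
    return contained / len(patterns)

# Toy data: the last transaction shares no item with the others.
data = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"d", "e"}]
patterns = frequent_patterns(data, min_support=0.5)
scores = [fpof(t, patterns) for t in data]
outlier_index = min(range(len(data)), key=lambda i: scores[i])
print(scores, outlier_index)
```

Transactions with the lowest scores contain the fewest frequent patterns and are the outlier candidates; here the `{"d", "e"}` transaction scores lowest.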

A Study on the Analysis of Employment Decision Factor of the Visually Impaired using Data Mining Technique

Disability & Employment ◽

10.15707/disem.2013.23.1.011 ◽

2013 ◽

Vol 23 (1) ◽

pp. 273-302 ◽

Cited By ~ 6

Author(s):

임은정 ◽

신현욱 ◽

김성진

Keyword(s):

Data Mining ◽

Visually Impaired ◽

Data Mining Technique ◽

Mining Technique ◽

Employment Decision ◽

Using Data

Retrieving Information and Discovering Knowledge from Unstructured Data Using Big Data Mining Technique: Heavy Oil Fields Example

10.2523/17805-ms ◽

2014 ◽

Cited By ~ 1

Author(s):

Wenkuang Wu ◽

Xiaoguang Lu ◽

Ben Cox ◽

Guoqiang Li ◽

Lihua Lin ◽

...

Keyword(s):

Data Mining ◽

Big Data ◽

Heavy Oil ◽

Oil Fields ◽

Unstructured Data ◽

Data Mining Technique ◽

Big Data Mining ◽

Mining Technique

THE ROLE OF THE GENETIC ABNORMALITIES, EPIGENETIC AND microRNA IN THE PROGNOSIS OF CHRONIC LYMPHOCYTIC LEUKEMIA

Experimental Oncology ◽

10.31768/2312-8852.2018.40(4):261-267 ◽

2018 ◽

Vol 40 (4) ◽

pp. 261-267 ◽

Cited By ~ 5

Author(s):

K Tari ◽

Z Shamsi ◽

H Reza Ghafari ◽

A Atashi ◽

M Shahjahani ◽

...

Keyword(s):

B Cells ◽

Chronic Lymphocytic Leukemia ◽

Kinase Inhibitors ◽

Bone Marrow Involvement ◽

P53 Mutation ◽

Lymphocytic Leukemia ◽

Gene Promoters ◽

Epigenetic Changes ◽

Genetic Changes ◽

Trisomy 12

Chronic lymphocytic leukemia (CLL) is characterized by increased proliferation of B-cells with peripheral blood and bone marrow involvement, and is usually observed in older people. Genetic mutations, epigenetic changes and miRs play a role in CLL pathogenesis. Del 11q, del 17q, del 6q, trisomy 12, p53 and IgVH mutations are the most important genetic changes in CLL. Deletion of miR-15a and miR-16a can increase bcl2 gene expression, miR-29 and miR-181 deletions decrease the expression of TCL1, and miR-146a deletion prevents tumor metastasis. Epigenetic changes such as hypo- and hypermethylation, ubiquitination, and hypo- and hyperacetylation of gene promoters involved in CLL pathogenesis can also play a role in CLL. Expression of CD38 and ZAP70, the presence or absence of IgVH mutation, and P53 mutation are among the factors involved in CLL prognosis. Monoclonal antibodies against B-cell surface markers, such as anti-CD20, and tyrosine kinase inhibitors are the most important therapeutic approaches for CLL.

OFCOD: On the Fly Clustering Based Outlier Detection Framework

Data ◽

10.3390/data6010001 ◽

2020 ◽

Vol 6 (1) ◽

pp. 1

Author(s):

Ahmed Elmogy ◽

Hamada Rizk ◽

Amany M. Sarhan

Keyword(s):

Data Mining ◽

Image Processing ◽

Intrusion Detection ◽

Real Time ◽

Outlier Detection ◽

Real World ◽

Medical Data ◽

Experimental Results ◽

Real Time Applications ◽

Real World Datasets

In data mining, outlier detection is a major challenge, as it has an important role in many applications such as medical data, image processing, fraud detection, intrusion detection, and so forth. An extensive variety of clustering based approaches have been developed to detect outliers. However, they are by nature time consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers effectively, in time and on request, even within huge datasets. The proposed framework has been tested and evaluated using two real world datasets with different features and applications: one with 699 records, and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.

Know Your Stars Before They Fall Apart: A Social Network Analysis of Telecom Industry to Foster Employee Retention using Data Mining Technique

IEEE Access ◽

10.1109/access.2021.3050327 ◽

2021 ◽

pp. 1-1

Author(s):

Sundus Younis ◽

Ali Ahsan

Keyword(s):

Data Mining ◽

Social Network ◽

Social Network Analysis ◽

Network Analysis ◽

Employee Retention ◽

Data Mining Technique ◽

Mining Technique ◽

Telecom Industry ◽

Using Data

A Technique for the Laboratory Determination of Recirculation in Single Needle Dialysis

The International Journal of Artificial Organs ◽

10.1177/039139889301600202 ◽

1993 ◽

Vol 16 (2) ◽

pp. 63-70 ◽

Cited By ~ 3

Author(s):

N.A. Hoenich ◽

P.T. Smirthwaite ◽

C. Woffindin ◽

P. Lancaster ◽

T.H. Frost ◽

...

Keyword(s):

Experimental Data ◽

Flow Rate ◽

Experimental Results ◽

New Technique ◽

Treatment Efficiency ◽

Single Lumen ◽

Theoretical Predictions ◽

Lumen Catheter ◽

A New Technique

Recirculation is an important factor in single needle dialysis and, if high, can compromise treatment efficiency. To provide information regarding recirculation characteristics of access devices used in single needle dialysis, we have developed a new technique to characterise recirculation and have used this to measure the recirculation of a Terumo 15G fistula needle and a VasCath SC2300 single lumen catheter. The experimentally obtained results agreed well with those established clinically (8.5 ± 2.4% and 18.4 ± 3.4%). The experimental results have also demonstrated a dependence on access type, pump speeds and fistula flow rate. A comparison of experimental data with theoretical predictions showed that the latter exceeded those measured with the largest contribution being due to the experimental fistula.
