sequence rules
Recently Published Documents


TOTAL DOCUMENTS: 38 (FIVE YEARS: 3)

H-INDEX: 8 (FIVE YEARS: 0)

2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Jiawei Li

To address the difficulty of setting the support threshold for sequential pattern mining algorithms, and to make threshold selection more effective without guidance from domain experts, an improved SPADE (sequential pattern discovery using equivalence classes) algorithm is proposed. The support threshold is selected dynamically by analyzing the relationship between the number of frequent sequences and the support threshold. Using electronic medical record data from a medical centre, the time-series relationship of the drugs taken by hypertension patients was extracted as the drug sequence dataset. After the optimal support threshold of the dataset is determined, the frequent sequence set is mined, and sequence rules are generated from the resulting sequences and visualized. The sequence rules of medication for hypertensive patients were combined with the patients’ physical indicators to produce recommendations. For patients with obstetric hypertension, a combination of nifedipine and captopril is recommended. The curative effect of the drugs was studied by comparing an observation group with a control group. The total effective rate of the observation group was about 96.6%, and the difference from the control group was significant (P < 0.05). The comparison of blood pressure levels between the two groups after treatment also favoured the observation group (P < 0.05). In addition, the incidence of postpartum haemorrhage and perinatal complications in the observation group was significantly reduced (P < 0.05). Therefore, combination medication for pregnancy hypertension syndrome can effectively improve the treatment effect and reduce the rate of postpartum haemorrhage and the incidence of perinatal complications.
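
The relationship between the support threshold and the number of frequent sequences, which the abstract above uses to pick a threshold, can be illustrated with a minimal sketch. This is a brute-force count over ordered item pairs, not SPADE itself (which works on vertical id-lists), and it does not reproduce the paper's selection criterion. The drug records and the helper names `is_subsequence`, `support` and `frequent_pairs` are hypothetical.

```python
from itertools import permutations

def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def support(pattern, database):
    """Fraction of sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in database) / len(database)

def frequent_pairs(database, min_support):
    """All ordered item pairs whose support reaches `min_support`."""
    items = sorted({item for s in database for item in s})
    return [p for p in permutations(items, 2) if support(p, database) >= min_support]

# Hypothetical drug sequences, one per patient (illustrative data only).
records = [
    ["nifedipine", "captopril"],
    ["nifedipine", "labetalol", "captopril"],
    ["labetalol", "nifedipine"],
    ["nifedipine", "captopril", "labetalol"],
]

# Number of frequent ordered pairs at each candidate threshold.
counts = {t: len(frequent_pairs(records, t)) for t in (0.25, 0.5, 0.75)}
print(counts)
```

On this toy data the count of frequent pairs falls as the threshold rises, which is the kind of curve a dynamic threshold selection would inspect.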


2020 ◽  
Author(s):  
Qing Yang ◽  
Ting Luo ◽  
Wei Zhang ◽  
Xiaorong Zhong ◽  
Ping He ◽  
...  

Background: Because the cancer data in this study were multidimensional, multilayered, and chronologically ordered, extracting treatment paths was challenging. It was therefore necessary to design a new data mining scheme to effectively extract the treatment path of breast cancer. We aimed to determine whether the cSPADE algorithm and system clustering proposed in this study can effectively identify the treatment pathways for early breast cancer. Methods: We applied data mining technology to the electronic medical records of 6891 early breast cancer patients to mine treatment pathways. We provided a method of extracting data from EMR and performed three-stage mining: determining the treatment stage through the cSPADE algorithm → system clustering for treatment plan extraction → cSPADE mining of sequence patterns for treatment. The Kolmogorov-Smirnov test and correlation analysis were used to cross-validate the sequence rules of early breast cancer treatment pathways. Results: We unearthed 55 sequence rules for early breast cancer treatment, 3 preoperative neoadjuvant chemotherapy regimens, 3 postoperative chemotherapy regimens, and 2 chemotherapy regimens for patients without surgery. Pearson and Spearman correlation tests were performed with 5-fold cross-validation. At the significance level of P < 0.05, all correlation coefficients of support, confidence and lift were greater than 0.89. Using the Kolmogorov-Smirnov test, we found no significant differences between the sequence distributions. Conclusions: The cSPADE algorithm combined with system clustering can achieve hierarchical and vertical mining of breast cancer treatment models. Uncovering the treatment pathways of early breast cancer patients with this method allows the real-world breast cancer treatment behavior model to be evaluated and provides a reference for the redesign and optimization of the treatment pathways.
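
The support, confidence and lift values mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common convention for a sequence rule A ⇒ B: support is the share of sequences containing A followed by B, confidence is that support divided by the support of A, and lift is confidence divided by the support of B. The abstract does not state which definitions the authors used, and the treatment paths and the helper `rule_metrics` are hypothetical.

```python
def contains(pattern, sequence):
    """True if the steps of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(step in it for step in pattern)

def support(pattern, database):
    """Fraction of sequences in `database` that contain `pattern`."""
    return sum(contains(pattern, s) for s in database) / len(database)

def rule_metrics(antecedent, consequent, database):
    """Support, confidence and lift of the sequence rule antecedent => consequent."""
    supp_rule = support(antecedent + consequent, database)
    supp_a = support(antecedent, database)
    supp_c = support(consequent, database)
    confidence = supp_rule / supp_a if supp_a else 0.0
    lift = confidence / supp_c if supp_c else 0.0
    return supp_rule, confidence, lift

# Hypothetical treatment paths, one tuple per patient (illustrative data only).
paths = [
    ("surgery", "chemotherapy", "radiotherapy"),
    ("surgery", "chemotherapy"),
    ("chemotherapy", "surgery"),
    ("surgery", "radiotherapy"),
]

supp, conf, lift = rule_metrics(("surgery",), ("chemotherapy",), paths)
print(supp, conf, round(lift, 3))
```

Under this convention a lift below 1 means the consequent follows the antecedent less often than its overall frequency would suggest.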


2019 ◽  
Vol 19 (2) ◽  
pp. 95
Author(s):  
Lestari Fidi Astuti ◽  
Kiswara Agung Santoso ◽  
Ahmad Kamsyakawuni

The affine cipher is a classical cryptographic algorithm based on substitution. In a substitution technique, every character of the plaintext is replaced by another character during encryption. The affine cipher uses two keys, and every plaintext character is encrypted with the same keys. This research modifies one of the affine cipher keys so that a different key is applied to each plaintext character. The key modification follows the Fibonacci sequence rules. The study also compares the affine cipher with the key-modified affine cipher by computing correlation coefficient values. The comparison shows that encryption with the key-modified affine cipher performs better than the standard affine cipher. Keywords: Cryptography, Affine Cipher, Fibonacci, Correlation Value
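
A minimal sketch of the idea follows. The classic affine cipher encrypts a letter index x as (a·x + b) mod 26 with a coprime to 26. The abstract does not say which key is modified or how, so the variant here is an assumption: the additive key b is offset by the i-th Fibonacci number for the i-th character. The function names and the `vary_key` flag are hypothetical, and input is assumed to be uppercase A-Z only.

```python
from math import gcd

M = 26  # alphabet size (A-Z)

def fibonacci(n):
    """First n Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    seq, a, b = [], 1, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq

def _offsets(length, vary_key):
    # Assumed modification: per-character offset added to the additive key.
    return fibonacci(length) if vary_key else [0] * length

def encrypt(plaintext, a, b, vary_key=False):
    """Affine encryption; with vary_key=True the key b changes per character."""
    if gcd(a, M) != 1:
        raise ValueError("a must be coprime with 26")
    out = []
    for ch, f in zip(plaintext, _offsets(len(plaintext), vary_key)):
        x = ord(ch) - ord("A")
        out.append(chr((a * x + b + f) % M + ord("A")))
    return "".join(out)

def decrypt(ciphertext, a, b, vary_key=False):
    """Inverse of encrypt, using the modular inverse of a (Python 3.8+)."""
    a_inv = pow(a, -1, M)
    out = []
    for ch, f in zip(ciphertext, _offsets(len(ciphertext), vary_key)):
        y = ord(ch) - ord("A")
        out.append(chr((a_inv * (y - b - f)) % M + ord("A")))
    return "".join(out)

classic = encrypt("AFFINE", 5, 8)
modified = encrypt("AFFINE", 5, 8, vary_key=True)
print(classic, modified)
```

With the fixed key, the repeated letter F maps to the same ciphertext letter both times. With the varying key it maps to two different letters, which is the property the key modification is meant to provide.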


2018 ◽  
Author(s):  
Xiyao Long ◽  
Jeliazko R Jeliazkov ◽  
Jeffrey J Gray

Antibodies are proteins generated by the adaptive immune system to recognize and counteract a plethora of pathogens through specific binding. This adaptive binding is mediated by structural diversity in the six complementarity-determining region (CDR) loops (H1, H2, H3, L1, L2 and L3), which also makes accurate structural modeling of CDRs challenging. Both homology and de novo modeling approaches have been used; to date, the former has achieved greater accuracy for the non-H3 loops. Homology modeling performs better for non-H3 CDRs because most non-H3 CDR loops of the same length and type can be grouped into a few structural clusters. Most antibody-modeling suites use homology modeling for the non-H3 CDRs, differing only in the alignment algorithm and in whether and how they use structural clusters. While RosettaAntibody and SAbPred do not explicitly assign query CDR sequences to clusters, two other approaches, PIGS and Kotai Antibody Builder, use sequence-based rules to assign CDR sequences to clusters. Manually curated sequence rules can identify better structural templates, but because their curation requires extensive literature search and human effort, they lag behind the deposition of new antibody structures and are infrequently updated. In this study, we propose a machine learning approach (Gradient Boosting Machine [GBM]) to learn the structural clusters of non-H3 CDRs from sequence alone. We argue that the GBM method offers simpler feature selection and immediate integration of new data compared with manual curation of sequence rules. We compare the classification results of the GBM method with those of RosettaAntibody in a 3-repeat 10-fold cross-validation scheme on the cluster-annotated antibody database PyIgClassify, and we observe an improvement in classification accuracy from 78.8±0.2% to 85.1±0.2%.
We find that the GBM models can reduce errors in specific cluster membership misclassifications if the clusters involved have relatively abundant data. Based on the factors identified, we suggest that methods that enrich structural classes with sparse data could further improve prediction accuracy in future studies.


2018 ◽  
Author(s):  
Robert M. Hanson ◽  
John Mayfield ◽  
Mikko J. Vainio ◽  
Andrey Yerin ◽  
Dmitry Redkin ◽  
...  

The most recent version of the Cahn-Ingold-Prelog rules for the determination of stereodescriptors, as described in Nomenclature of Organic Chemistry: IUPAC Recommendations and Preferred Names 2013 (the “Blue Book”), was analyzed by an international team of cheminformatics software developers. Algorithms for machine implementation were designed, tested, and cross-validated. Deficiencies in Sequence Rules 1b and 2 were found, and proposed language for their modification is presented. A concise definition of an additional rule (“Rule 6,” below) is proposed, which succinctly covers several cases only tangentially mentioned in the 2013 recommendations. Each rule is discussed from the perspective of machine implementation. The four resultant implementations are supported by validation suites in 2D and 3D SDF format as well as SMILES. The validation suites include all significant examples in Chapter 9 of the Blue Book, as well as several additional structures that highlight more complex aspects of the rules not addressed or not clearly analyzed in that work. These additional structures support the case for modifying the Sequence Rules.


2017 ◽  
Vol 13 (1) ◽  
pp. 36-50 ◽  
Author(s):  
Haitao Zhang ◽  
Zewei Chen ◽  
Zhao Liu ◽  
Yunhong Zhu ◽  
Chenxue Wu

Analyzing large-scale spatial-temporal anonymity sets can benefit many LBS applications. However, traditional spatial-temporal data mining algorithms cannot be used for anonymity datasets because the uncertainty of anonymity datasets renders those algorithms ineffective. In this paper, the authors adopt the uncertainty of anonymity datasets and propose a probabilistic method for mining sequence rules (PMSR) from sequences of LBS cloaking regions generated from a series of LBS continuous queries. The main concept of the method is that it designs a probabilistic measurement of a support value of a sequence rule, and the implementation principle of the method is to iteratively achieve sequence rules. Finally, the authors conduct extensive experiments, and the results show that, compared to the non-probabilistic method, their proposed method has a significant matching ratio when the mined sequence rules are used as predictors, while the average accuracy of the sequence rules is comparable and computing performance is only slightly decreased.


2016 ◽  
Vol 33 (2) ◽  
pp. 325-325
Author(s):  
Roberta Siciliano ◽  
Antonio D’Ambrosio ◽  
Massimo Aria ◽  
Sonia Amodio