Shallow learning model for diagnosing neuro muscular disorder from splicing variants

Purpose Diagnosing genetic neuromuscular disorders such as muscular dystrophy is complicated when the defect arises during splicing. This paper aims to predict the type of muscular dystrophy from gene sequences by extracting well-defined descriptors related to splicing mutations. An automatic model is built to classify the disease through pattern recognition techniques coded in Python using the scikit-learn framework. Design/methodology/approach In this paper, the cloned gene sequences are synthesized based on the mutation position and its location on the chromosome by using the positional cloning approach. For instance, in the Human Gene Mutation Database (HGMD), the mutational information for a splicing mutation specified as IVS1-5 T > G indicates the first intron (IVS, intervening sequence or intron) and a position five nucleotides before the consensus intron site AG, where nucleotide T is altered to G. IVS (+ve) denotes the forward strand (3′), with positive numbers counted from the G of the donor site invariant, and IVS (−ve) denotes the backward strand (5′), with negative numbers counted from the G of the acceptor site. The key idea in this paper is to identify discriminative descriptors from diseased gene sequences based on splicing variants and to provide an effective machine learning solution for predicting the type of muscular dystrophy from splicing mutations. Multi-class classification is carried out through data modeling of gene sequences. Synthetic mutational gene sequences are created because diseased gene sequences are not readily obtainable for this intricate disease. The positional cloning approach supports generating diseased gene sequences based on mutational information acquired from HGMD. SNP-, gene- and exon-based discriminative features are identified and used to train the model. An effective muscular dystrophy prediction model is built using supervised learning techniques in the scikit-learn environment. 
The data frame is built from the extracted features as a NumPy array. The data are normalized by transforming the feature values into the range between 0 and 1, which aids in scaling the input attributes for a model. Naïve Bayes, decision tree, K-nearest neighbor and SVM models are developed using the scikit-learn Python library. Findings To the best of the authors' knowledge, this is the first pattern recognition model to classify muscular dystrophy with respect to splicing mutations. Certain essential SNP-, gene- and exon-based descriptors related to splicing mutations are proposed and extracted from the cloned gene sequences. An effective model is built using statistical learning techniques through scikit-learn in the Anaconda framework. This paper also discusses the results of statistical learning carried out on the same set of gene sequences with synonymous and non-synonymous mutational descriptors. Research limitations/implications The data frame is built as a NumPy array. Normalizing the data by transforming the feature values into the range between 0 and 1 aids in scaling the input attributes for a model. Naïve Bayes, decision tree, K-nearest neighbor and SVM models are developed using the scikit-learn Python library. While training the SVM model, the cost, gamma and kernel parameters are tuned to attain good results. Scoring parameters of the classifiers are evaluated using tenfold cross-validation with the metric functions of the scikit-learn library. Results of the disease identification model based on non-synonymous, synonymous and splicing mutations were analyzed. Practical implications Certain essential SNP-, gene- and exon-based descriptors related to splicing mutations are proposed and extracted from the cloned gene sequences. An effective model is built using statistical learning techniques through scikit-learn in the Anaconda framework. 
The performance of the classifiers is increased by using different estimators from the scikit-learn library. Several types of mutations such as missense, nonsense and silent mutations are also considered to build models through statistical learning techniques, and their results are analyzed. Originality/value To the best of the authors' knowledge, this is the first pattern recognition model to classify muscular dystrophy with respect to splicing mutations.
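
The workflow outlined in this abstract (scaling features to the range 0 to 1, training Naïve Bayes, decision tree, K-nearest neighbor and SVM classifiers, and scoring them with tenfold cross-validation) could look roughly like the sketch below. It is not the authors' code: the data are synthetic stand-ins generated with `make_classification`, the Gaussian variant of Naïve Bayes is an assumption, and the SVM values for `C`, `gamma` and `kernel` are placeholders, not the tuned settings from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the SNP-, gene- and exon-based feature matrix
# (assumption: 300 sequences, 12 descriptors, 3 disease classes).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, n_classes=3, random_state=0
)

# The four classifier families named in the abstract; hyperparameters are
# illustrative defaults, not values reported by the authors.
models = {
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "svm": SVC(C=1.0, gamma="scale", kernel="rbf"),
}

# Tenfold cross-validation; stratification keeps class proportions per fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

scores = {}
for name, model in models.items():
    # MinMaxScaler maps each feature into [0, 1]; placing it inside the
    # pipeline means it is refit on the training part of every fold.
    pipeline = make_pipeline(MinMaxScaler(), model)
    fold_scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
    scores[name] = fold_scores.mean()

for name, mean_accuracy in scores.items():
    print(f"{name}: mean accuracy {mean_accuracy:.3f}")
```

Keeping the scaler inside the pipeline is a design choice of this sketch: it avoids leaking the minimum and maximum of held-out samples into training, which scaling the whole array up front would do.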


Identification of Rare Genetic Disorder from Single Nucleotide Variants Using Supervised Learning Technique

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v6.i4.pp174-184 ◽

2017 ◽

Vol 6 (4) ◽

pp. 174-184

Author(s):

Sathyavikasini K ◽

Vijaya M S

Keyword(s):

Muscular Dystrophy ◽

Positional Cloning ◽

Genetic Disorders ◽

Genetic Disorder ◽

Gene Sequences ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Rare Genetic Disorder ◽

Learning Techniques ◽

Fold Cross Validation

Muscular dystrophy is a rare genetic disorder that affects the muscular system, deteriorating the skeletal muscles and hindering locomotion. In diagnosing genetic disorders such as muscular dystrophy, the disease is identified based on mutations in the gene sequence. A new model is proposed for classifying the disease accurately using gene sequences mutated by adopting positional cloning on the reference cDNA sequence. Features of the mutated gene sequences for missense, nonsense and silent mutations are used to distinguish the type of disease, and the classifiers are trained with commonly used supervised pattern learning techniques. Tenfold cross-validation results show that the decision tree algorithm attained the best accuracy of 100%. In summary, this study provides an automatic model to classify muscular dystrophy and sheds new light on predicting the genetic disorder from gene-based features through a pattern recognition model.
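
To make the three mutation types in this abstract concrete, the sketch below applies a single-nucleotide substitution to a reference coding sequence and labels the result as silent, missense or nonsense. It is an illustration under stated assumptions, not the authors' procedure: the function names and the 12-base toy reference are hypothetical, and the reading frame is assumed to start at position 1 with a sequence length that is a multiple of three.

```python
# Standard genetic code, with codons enumerated in T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def apply_substitution(reference, position, new_base):
    """Return `reference` with the base at 1-based `position` replaced."""
    index = position - 1
    if not 0 <= index < len(reference):
        raise ValueError("position outside the reference sequence")
    return reference[:index] + new_base + reference[index + 1:]


def classify_substitution(reference, position, new_base):
    """Label a substitution as 'silent', 'missense' or 'nonsense'."""
    mutated = apply_substitution(reference, position, new_base)
    start = (position - 1) // 3 * 3  # first base of the affected codon
    before = CODON_TABLE[reference[start:start + 3]]
    after = CODON_TABLE[mutated[start:start + 3]]
    if before == after:
        return "silent"    # same amino acid encoded
    if after == "*":
        return "nonsense"  # codon became a stop codon
    return "missense"      # a different amino acid (stop-loss is not separated here)


# Hypothetical toy reference: Met-Ala-Trp-stop.
REFERENCE = "ATGGCTTGGTAA"
for position, base in [(6, "C"), (4, "A"), (9, "A")]:
    print(position, base, classify_substitution(REFERENCE, position, base))
```

Labels produced this way could serve as one categorical descriptor per synthetic sequence, alongside whatever other gene-based features a classifier is given.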


Ankle Angle Prediction Using a Footwear Pressure Sensor and a Machine Learning Technique

Sensors ◽

10.3390/s21113790 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3790

Author(s):

Zachary Choffin ◽

Nathan Jeong ◽

Michael Callihan ◽

Savannah Olmstead ◽

Edward Sazonov ◽

...

Keyword(s):

Machine Learning ◽

Pressure Sensor ◽

Learning Algorithm ◽

Flexible Substrate ◽

Ankle Injuries ◽

Sensor System ◽

Measurement Unit ◽

K Nearest Neighbor ◽

Machine Learning Technique ◽

Learning Technique

Ankle injuries may adversely increase the risk of injury to the joints of the lower extremity and can lead to various impairments in workplaces. The purpose of this study was to predict the ankle angles by developing a footwear pressure sensor and utilizing a machine learning technique. The footwear sensor was composed of six FSRs (force sensing resistors), a microcontroller and a Bluetooth LE chipset in a flexible substrate. Twenty-six subjects were tested in squat and stoop motions, which are common positions utilized when lifting objects from the floor and pose distinct risks to the lifter. The kNN (k-nearest neighbor) machine learning algorithm was used to create a representative model to predict the ankle angles. For the validation, a commercial IMU (inertial measurement unit) sensor system was used. The results showed that the proposed footwear pressure sensor could predict the ankle angles at more than 93% accuracy for squat and 87% accuracy for stoop motions. This study confirmed that the proposed plantar sensor system is a promising tool for the prediction of ankle angles and thus may be used to prevent potential injuries while lifting objects in workplaces.
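
As a rough illustration of how a k-nearest-neighbor model can map six FSR readings to an ankle angle, the sketch below averages the angles of the k closest training samples. The abstract does not say whether the authors treated the task as regression or classification, nor which k or distance they used; regression with Euclidean distance and k = 3 is assumed here, and the sample readings and angles are invented for the example.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training samples."""
    if k < 1 or k > len(train_x):
        raise ValueError("k must be between 1 and the number of samples")
    # Pair each Euclidean distance with its target, then sort by distance.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Invented data: each row is six FSR readings, each target an ankle angle
# in degrees (in the study, reference angles came from an IMU system).
fsr_readings = [
    (0, 0, 0, 0, 0, 0),
    (1, 1, 1, 1, 1, 1),
    (2, 2, 2, 2, 2, 2),
    (10, 10, 10, 10, 10, 10),
]
ankle_angles = [0.0, 10.0, 20.0, 100.0]

predicted = knn_predict(fsr_readings, ankle_angles, (1, 1, 1, 1, 1, 1), k=3)
print(f"predicted ankle angle: {predicted:.1f} degrees")
```

In practice the readings would usually be scaled before distances are computed, so that one sensor with a larger raw range does not dominate the neighbor search.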


Sediminihaliea albiluteola gen. nov., sp. nov., a new member of the family Halieaceae, isolated from marine sediment

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijsem.0.004959 ◽

2021 ◽

Vol 71 (8) ◽

Author(s):

Shan Jiang ◽

Feng-Bai Lian ◽

You-Yang Sun ◽

Xiao-Kui Zhang ◽

Zong-Jun Du

Keyword(s):

Type Species ◽

Phylogenetic Analyses ◽

Rrna Gene ◽

Gene Sequences ◽

Unidentified Phospholipid ◽

Respiratory Quinone ◽

Content Type ◽

Link Type ◽

The Family ◽

Sequence Similarities

A Gram-stain-negative, rod-shaped and facultatively aerobic bacterial strain, designated F7430T, was isolated from coastal sediment collected at Jingzi Wharf in Weihai, PR China. Cells of strain F7430T were 0.3–0.4 µm wide, 2.0–2.6 µm long, non-flagellated, non-motile and formed pale-beige colonies. Growth was observed at 4–40 °C (optimum, 30 °C), pH 6.0–9.0 (optimum, pH 7.5–8.0) and at NaCl concentrations of 1.0–10.0 % (w/v; optimum, 1.0 %). The sole respiratory quinone of strain F7430T was ubiquinone 8 and the predominant cellular fatty acids were summed feature 8 (C18 : 1 ω7c/C18 : 1 ω6c; 60.7 %), summed feature 3 (C16 : 1 ω7c/C16 : 1 ω6c; 30.2 %) and C15 : 0 iso (13.9 %). The polar lipids of strain F7430T consisted of diphosphatidylglycerol, phosphatidylethanolamine, phosphatidylglycerol, phosphatidylcholine, one unidentified phospholipid and three unidentified lipids. Results of 16S rRNA gene sequence analyses indicated that this strain belonged to the family Halieaceae and had high sequence similarities to Parahaliea aestuarii JCM 51547T (95.3 %) and Halioglobus pacificus DSM 27932T (95.2 %) followed by 92.9–95.0 % sequence similarities to other type species within the aforementioned family. The rpoB gene sequence analyses indicated that the novel strain had the highest sequence similarities to Parahaliea aestuarii JCM 51547T (82.2 %) and Parahaliea mediterranea DSM 21924T (82.2 %) followed by 75.2–80.5 % sequence similarities to other type species within this family. Phylogenetic analyses showed that strain F7430T constituted a monophyletic branch clearly separated from the other genera of the family Halieaceae. Whole-genome sequencing of strain F7430T revealed a 3.3 Mbp genome size with a DNA G+C content of 52.6 mol%. The genome encoded diverse metabolic pathways including the Entner–Doudoroff pathway, assimilatory sulphate reduction and biosynthesis of dTDP-l-rhamnose. 
Based on results from the current polyphasic study, strain F7430T is proposed to represent a novel species of a new genus within the family Halieaceae, for which the name Sediminihaliea albiluteola gen. nov., sp. nov. is proposed. The type strain of the type species is F7430T (=KCTC 72873T=MCCC 1H00420T).


Pseudomonas bijieensis sp. nov., isolated from cornfield soil

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijsem.0.004676 ◽

2019 ◽

Vol 71 (3) ◽

Cited By ~ 4

Author(s):

Jingling Liang ◽

Sai Wang ◽

Ayizekeranmu Yiming ◽

Luoyi Fu ◽

Iftikhar Ahmad ◽

...

Keyword(s):

16S Rrna ◽

Type Species ◽

Novel Species ◽

Guizhou Province ◽

Rrna Gene ◽

Gene Sequences ◽

Content Type ◽

Link Type ◽

Pr China ◽

Genome Comparisons

Strain L22-9T, a Gram-stain-negative and rod-shaped bacterium, motile by one polar flagellum, was isolated from cornfield soil in Bijie, Guizhou Province, PR China. Based on 16S rRNA gene sequences, it was identified as a Pseudomonas species. Multilocus sequence analysis of concatenated 16S rRNA, gyrB, rpoB and rpoD gene sequences showed that strain L22-9T formed a clearly separated branch, located in a cluster together with Pseudomonas brassicacearum LMG 21623T, Pseudomonas kilonensis DSM 13647T and Pseudomonas thivervalensis DSM 13194T. Whole-genome comparisons based on average nucleotide identity (ANI) and digital DNA–DNA hybridization (dDDH) confirmed that strain L22-9T should be classified as a novel species. It was most closely related to P. kilonensis DSM 13647T with ANI and dDDH values of 91.87 and 46.3 %, respectively. Phenotypic features that can distinguish strain L22-9T from P. kilonensis DSM 13647T are the assimilation ability of N-acetyl-d-glucosamine, poor activity of arginine dihydrolase and failure to ferment ribose and d-fucose. The predominant cellular fatty acids of strain L22-9T are C16 : 0, summed feature 3 (C16 : 1 ω6c and/or C16 : 1 ω7c) and summed feature 8 (C18 : 1 ω7c and/or C18 : 1 ω6c). The respiratory quinones consist of Q-9 and Q-8. The polar lipids are diphosphatidylglycerol, phosphatidylethanolamine, two unidentified phosphoglycolipids, two unidentified aminophospholipids and an unidentified glycolipid. Based on the evidence, we conclude that strain L22-9T represents a novel species, for which the name Pseudomonas bijieensis sp. nov. is proposed. The type strain is L22-9T (=CGMCC 1.18528T=LMG 31948T), with a DNA G+C content of 60.85 mol%.


Reclassification of Parvularcula flava as Aquisalinus luteolus nom. nov. and emended description of the genus Aquisalinus

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijsem.0.005072 ◽

2021 ◽

Vol 71 (10) ◽

Author(s):

Jun-Jie Ying ◽

Zhi-Cheng Wu ◽

Yuan-Chun Fang ◽

Lin Xu ◽

Cong Sun

Keyword(s):

16S Rrna ◽

16S Rrna Gene ◽

Type Species ◽

Comparative Genomic ◽

Rrna Gene ◽

Gene Sequences ◽

16S Rrna Gene Sequences ◽

Content Type ◽

Link Type ◽

Type Strains

Parvularcula flava was proposed as a novel member of the genus Parvularcula in 2016. Some time earlier, Aquisalinus flavus had been proposed as a novel species of a novel genus named Aquisalinus. When the 16S rRNA gene sequences of type strains P. flava NH6-79T and A. flavus D11M-2T were compared, they showed 97.9 % sequence identity, much higher than the sequence identities of 92.7–94.3 % between P. flava NH6-79T and type strains in the genus Parvularcula, indicating that the later proposed novel taxon Parvularcula flava needs reclassification. The phylogenetic trees based on 16S rRNA gene sequences and genome sequences both showed that P. flava NH6-79T and A. flavus D11M-2T formed a separate branch away from strains in the genera Parvularcula, Marinicaulis and Amphiplicatus. The average amino acid identity and average nucleotide identity values of P. flava NH6-79T and A. flavus D11M-2T were 87.9 and 85.0 %, respectively, much higher than the values between P. flava NH6-79T and other closely related type strains (54.3–58.1 % and 68.6–70.4 %, respectively). P. flava NH6-79T and A. flavus D11M-2T also contained summed feature 8 (C18 : 1 ω6c and/or C18 : 1 ω7c) and C16 : 0 as major fatty acids, distinguishing them from other closely related taxa. Based on the results of the phylogenetic, comparative genomic and phenotypic analyses, Parvularcula flava should be reclassified as Aquisalinus luteolus nom. nov. and the description of the genus Aquisalinus is emended.


Limosilactobacillus urinaemulieris sp. nov. and Limosilactobacillus portuensis sp. nov. isolated from urine of healthy women

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijsem.0.004726 ◽

2019 ◽

Vol 71 (3) ◽

Cited By ~ 7

Author(s):

Magdalena Ksiezarek ◽

Teresa Gonçalves Ribeiro ◽

Joana Rocha ◽

Filipa Grosso ◽

Svetlana Ugarcina Perovic ◽

...

Keyword(s):

Type Species ◽

Novel Species ◽

Rrna Gene ◽

Gram Stain ◽

Gene Sequences ◽

16S Rrna Gene Sequences ◽

Content Type ◽

Link Type ◽

Healthy Women ◽

Genome Distance

Two Gram-stain-positive strains, c9Ua_26_MT and c11Ua_112_MT, were isolated from voided urine samples from two healthy women. Comparative 16S rRNA gene sequences demonstrated that these novel strains were members of the genus Limosilactobacillus. Phylogenetic analysis based on pheS gene sequences and core genomes showed that each strain formed a separate branch and was closest to Limosilactobacillus vaginalis DSM 5837T. The average nucleotide identity (ANI) and Genome-to-Genome Distance Calculator (GGDC) values between c9Ua_26_MT and the closest relative DSM 5837T were 90.7 and 42.9 %, respectively. The ANI and GGDC values between c11Ua_112_MT and the closest relative DSM 5837T were 91.2 and 45.0 %, and those between the two strains were 92.9 and 51.0 %, respectively. The major fatty acids were C12 : 0 (40.2 %), C16 : 0 (26.7 %) and C18 : 1 ω9c (17.7 %) for strain c9Ua_26_MT, and C18 : 1 ω9c (38.0 %), C16 : 0 (33.3 %) and C12 : 0 (17.6 %) for strain c11Ua_112_MT. The genomic DNA G+C content of strains c9Ua_26_MT and c11Ua_112_MT was 39.9 and 39.7 mol%, respectively. On the basis of the data presented here, strains c9Ua_26_MT and c11Ua_112_MT represent two novel species of the genus Limosilactobacillus, for which the names Limosilactobacillus urinaemulieris sp. nov. (c9Ua_26_MT=CECT 30144T=LMG 31899T) and Limosilactobacillus portuensis sp. nov. (c11Ua_112_MT=CECT 30145T=LMG 31898T) are proposed.


Adaptive clothing features to support daily exercising needs of muscular dystrophy victimized women in Sri Lanka

Research Journal of Textile and Apparel ◽

10.1108/rjta-08-2020-0087 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Niromi Seram ◽

Rivini Mataraarachchi ◽

Thanuri Jayaneththi

Keyword(s):

Muscular Dystrophy ◽

Sri Lanka ◽

Research Effort ◽

Health And Safety ◽

Psychological Needs ◽

Daily Basis ◽

Structured Interviews ◽

Content Type ◽

Daily Exercise ◽

Custom Made

Purpose Exercising is a key approach adopted by muscular dystrophy patients to halt the weakening of muscles as it can eventually lead to serious immobility issues. Though it is essential to exercise on a daily basis for healthy living, there is no mention of any research effort in the current literature regarding the development of an apparel product for these mobility-affected patients that might assist them both in meeting their exercising needs and providing them some comfort in their daily living. Thus, this paper aims to focus on identifying the specific needs of muscular dystrophy victims and proposing special adaptive clothing solutions to support their daily exercise and mobility needs. Design/methodology/approach To achieve the objectives of this study, attention was focused on the muscular dystrophy afflicted women in Sri Lanka. Semi-structured interviews were conducted with the female victims of muscular dystrophy and their lifestyles were observed carefully; additional data were gathered by holding semi-structured interviews with their physiotherapists. Further, interviews were conducted with both garment technologists and fabric technologists too. Data gathered through these methods were analyzed qualitatively using the principles of thematic analysis and then aggregate conclusions were drawn. Findings It was observed that the patients were engaged in special activities such as exercising three times a day besides following their normal day-to-day activities to maintain and develop muscle strength. It soon became evident that these women found it difficult to perform their daily exercise routines with their regular clothing and were looking for custom made clothing they could wear all day long in comfort and avoid the problems that arose while exercising. The study specifies the requirements that must be met to satisfy both generic and specific needs. 
Considering all these aspects some adaptive clothing solutions were proposed to support daily exercising activity with respect to comfort, convenience, health and safety, as well as socio-cultural and psychological needs. Originality/value The area of fusing generic and specific features to support the daily exercising needs of muscular dystrophy victims is an untouched field of experimentation and being a need of the disabled, the present study marks a milestone on the way to a novel area of apparel design, besides exploring a new field of research.


Sponge-Associated Bacteria Are Strictly Maintained in Two Closely Related but Geographically Distant Sponge Hosts

Applied and Environmental Microbiology ◽

10.1128/aem.05285-11 ◽

2011 ◽

Vol 77 (20) ◽

pp. 7207-7216 ◽

Cited By ~ 76

Author(s):

Naomi F. Montalvo ◽

Russell T. Hill

Keyword(s):

16S Rrna ◽

16S Rrna Gene ◽

Bacterial Communities ◽

Sponge Species ◽

Rrna Gene ◽

Gene Sequences ◽

Associated Bacteria ◽

Content Type ◽

Key Species ◽

Sponge Associated Bacteria

ABSTRACT The giant barrel sponges Xestospongia muta and Xestospongia testudinaria are ubiquitous in tropical reefs of the Atlantic and Pacific Oceans, respectively. They are key species in their respective environments and are hosts to diverse assemblages of bacteria. These two closely related sponges from different oceans provide a unique opportunity to examine the evolution of sponge-associated bacterial communities. Mitochondrial cytochrome oxidase subunit I gene sequences from X. muta and X. testudinaria showed little divergence between the two species. A detailed analysis of the bacterial communities associated with these sponges, comprising over 900 full-length 16S rRNA gene sequences, revealed remarkable similarity in the bacterial communities of the two species. Both sponge-associated communities include sequences found only in the two Xestospongia species, as well as sequences found also in other sponge species and are dominated by three bacterial groups, Chloroflexi, Acidobacteria, and Actinobacteria. While these groups consistently dominate the bacterial communities revealed by 16S rRNA gene-based analysis of sponge-associated bacteria, the depth of sequencing undertaken in this study revealed clades of bacteria specifically associated with each of the two Xestospongia species, and also with the genus Xestospongia, that have not been found associated with other sponge species or other ecosystems. This study, comparing the bacterial communities associated with closely related but geographically distant sponge hosts, gives new insight into the intimate relationships between marine sponges and some of their bacterial symbionts.


Pengenalan Gerak Manusia Menggunakan Algoritma Relevance Vector Machine pada MSRC-12 Dataset [Human Motion Recognition Using the Relevance Vector Machine Algorithm on the MSRC-12 Dataset]

JSAI (Journal Scientific and Applied Informatics) ◽

10.36085/jsai.v3i1.850 ◽

2020 ◽

Vol 3 (1) ◽

Author(s):

Vina Ayumi ◽

Erwin Dwika Putra

Keyword(s):

Machine Learning ◽

Data Processing ◽

Statistical Learning ◽

Gesture Recognition ◽

Learning Theory ◽

Relevance Vector Machine ◽

Motion Model ◽

Communication Tools ◽

Machine Learning Technique ◽

Learning Technique

The relevance vector machine (RVM) is a popular machine learning technique motivated by statistical learning theory. RVM can be used for gesture recognition, gestures being one of the communication tools used by humans. This study proposes an experiment using the RVM algorithm on gesture data from Microsoft Research Cambridge-12 (MSRC-12) as a proposed solution to overcome imbalance problems in data processing. The results show that accuracy reaches 100% for the 1-person motion model and falls to its lowest value, 96%, for the 5-person motion model. Graphically, the more people or models, the lower the algorithm's accuracy.


Molecular Characterization of an Endozoicomonas-Like Organism Causing Infection in the King Scallop (Pecten maximus L.)

Applied and Environmental Microbiology ◽

10.1128/aem.00952-17 ◽

2017 ◽

Vol 84 (3) ◽

Cited By ~ 1

Author(s):

Irene Cano ◽

Ronny van Aerle ◽

Stuart Ross ◽

David W. Verner-Jeffreys ◽

Richard K. Paley ◽

...

Keyword(s):

16S Rrna ◽

16S Rrna Gene ◽

Molecular Characterization ◽

Mass Mortality ◽

Rrna Gene ◽

Gene Sequences ◽

16S Rrna Gene Sequences ◽

Content Type ◽

The 16S Rrna Gene

ABSTRACT One of the fastest growing fisheries in the UK is the king scallop (Pecten maximus L.), also currently rated as the second most valuable fishery. Mass mortality events in scallops have been reported worldwide, often with the causative agent(s) remaining uncharacterized. In May 2013 and 2014, two mass mortality events affecting king scallops were recorded in the Lyme Bay marine protected area (MPA) in Southwest England. Histopathological examination showed gill epithelial tissues infected with intracellular microcolonies (IMCs) of bacteria resembling Rickettsia-like organisms (RLOs), often with bacteria released in vascular spaces. Large colonies were associated with cellular and tissue disruption of the gills. Ultrastructural examination confirmed the intracellular location of these organisms in affected epithelial cells. The 16S rRNA gene sequences of the putative IMCs obtained from infected king scallop gill samples, collected from both mortality events, were identical and had a 99.4% identity to 16S rRNA gene sequences obtained from “Candidatus Endonucleobacter bathymodioli” and 95% with Endozoicomonas species. In situ hybridization assays using 16S rRNA gene probes confirmed the presence of the sequenced IMC gene in the gill tissues. Additional DNA sequences of the bacterium were obtained using high-throughput (Illumina) sequencing, and bioinformatic analysis identified over 1,000 genes with high similarity to protein sequences from Endozoicomonas spp. (ranging from 77 to 87% identity). Specific PCR assays were developed and applied to screen for the presence of IMC 16S rRNA gene sequences in king scallop gill tissues collected at the Lyme Bay MPA during 2015 and 2016. 
There was 100% prevalence of the IMCs in these gill tissues, and the 16S rRNA gene sequences identified were identical to the sequence found during the previous mortality event. IMPORTANCE Molluscan mass mortalities associated with IMCs have been reported worldwide for many years; however, apart from histological and ultrastructural characterization, characterization of the etiological agents is limited. In the present work, we provide detailed molecular characterization of an Endozoicomonas-like organism (ELO) associated with an important commercial scallop species.
