scholarly journals Using Data Science Software to Address Health Disparities

Author(s):  
Jose O. Huerta ◽  
Gayle L. Prybutok ◽  
Victor R. Prybutok

The article assesses data science software to evaluate the usefulness of data science technology in addressing concerns such as health disparities. Data science software was analyzed using KDnuggets data related to analytics, data science, and machine learning software. Data science functionalities include computational processes and frameworks that are relevant for healthcare. This study demonstrates the importance of leading applications for conducting data science operations that can improve care in healthcare networks by addressing such factors as health disparities.

Author(s):  
Ihor Ponomarenko ◽  
Oleksandra Lubkovska

The subject of the research is the approach to the possibility of using data science methods in the field of health care for integrated data processing and analysis in order to optimize economic and specialized processes The purpose of writing this article is to address issues related to the specifics of the use of Data Science methods in the field of health care on the basis of comprehensive information obtained from various sources. Methodology. The research methodology is system-structural and comparative analyzes (to study the application of BI-systems in the process of working with large data sets); monograph (the study of various software solutions in the market of business intelligence); economic analysis (when assessing the possibility of using business intelligence systems to strengthen the competitive position of companies). The scientific novelty the main sources of data on key processes in the medical field. Examples of innovative methods of collecting information in the field of health care, which are becoming widespread in the context of digitalization, are presented. The main sources of data in the field of health care used in Data Science are revealed. The specifics of the application of machine learning methods in the field of health care in the conditions of increasing competition between market participants and increasing demand for relevant products from the population are presented. Conclusions. The intensification of the integration of Data Science in the medical field is due to the increase of digitized data (statistics, textual informa- tion, visualizations, etc.). Through the use of machine learning methods, doctors and other health professionals have new opportunities to improve the efficiency of the health care system as a whole. Key words: Data science, efficiency, information, machine learning, medicine, Python, healthcare.


Author(s):  
R. Suganya ◽  
Rajaram S. ◽  
Kameswari M.

Currently, thyroid disorders are more common and widespread among women worldwide. In India, seven out of ten women are suffering from thyroid problems. Various research literature studies predict that about 35% of Indian women are examined with prevalent goiter. It is very necessary to take preventive measures at its early stages, otherwise it causes infertility problem among women. The recent review discusses various analytics models that are used to handle different types of thyroid problems in women. This chapter is planned to analyze and compare different classification models, both machine learning algorithms and deep leaning algorithms, to classify different thyroid problems. Literature from both machine learning and deep learning algorithms is considered. This literature review on thyroid problems will help to analyze the reason and characteristics of thyroid disorder. The dataset used to build and to validate the algorithms was provided by UCI machine learning repository.


2014 ◽  
Vol 3 (4) ◽  
pp. 231
Author(s):  
Kevin Daimi ◽  
Shadi Banitaan

Depression is a disorder characterized by misery and gloominess felt over a period of time. Some symptoms of depression overlap with somatic illnesses implying considerable difficulty in diagnosing it. This paper contributes to its diagnosis through the application of data mining, namely classification, to predict patients who will most likely develop depression or are currently suffering from depression. Synthetic data is used for this study. To acquire the results, the popular suite of machine learning software, WEKA, is used.


Author(s):  
Akshata Kulkarni

Abstract: Officials around the world are using several COVID-19 outbreak prediction models to make educated decisions and enact necessary control measures. In this study, we developed a Machine Learning model which predicts and forecasts the COVID-19 outbreak in India, with the goal of determining the best regression model for an in-depth examination of the novel coronavirus. Based on data available from January 31 to October 31, 2020, collected from Kaggle, this model predicts the number of confirmed cases in Maharashtra. We're using a Machine Learning model to foresee the future trend of these situations. The project has the potential to demonstrate the importance of information dissemination in improving response time and planning ahead of time to help reduce risk.


2020 ◽  
Vol 862 ◽  
pp. 72-77
Author(s):  
José Alberto Guzmán Torres ◽  
Francisco Javier Domínguez Mota ◽  
Elia Mercedes Alonso-Guzmán ◽  
Wilfrido Martínez-Molina ◽  
José Gerardo Tinoco Ruiz ◽  
...  

The inclusion of additions to concrete blends helps to improve performance in certain conditions. The analysis of two concrete blends was performed, a blend with the addition of a natural organic polymer and a control blend to make predictive models and find a correlation. Tree tests were performed: Electrical resistivity (Er) test, Tensile strength (Ft) and Carbonation resistance. One of the most popular non-destructive tests on concrete is , due to the simplicity of measuring readings on concrete elements. It is a non-destructive test that determines the interconnectivity that exists in the concrete cementitious matrix by determining the quality of the concrete. The blend with the addition showed improved performance in all the tests. Data science techniques were used to generate artificial data, the Machine Learning technique (ML) is based on Tree regression (Tr) with satisfactory accuracy to assess the reliability.


2021 ◽  
pp. 167-175
Author(s):  
Hema Mahajan ◽  
Santosh Madeva Naik ◽  
Ch. Kannaiah ◽  
Shaik Meer Subhaniali

2019 ◽  
Author(s):  
Marc Bocquet ◽  
Julien Brajard ◽  
Alberto Carrassi ◽  
Laurent Bertino

Abstract. Recent progress in machine learning has shown how to forecast and, to some extent, learn the dynamics of a model from its output, resorting in particular to neural networks and deep learning techniques. We will show how the same goal can be directly achieved using data assimilation techniques without leveraging on machine learning software libraries, with a view to high-dimensional models. The dynamics of a model are learned from its observation and an ordinary differential equation (ODE) representation of this model is inferred using a recursive nonlinear regression. Because the method is embedded in a Bayesian data assimilation framework, it can learn from partial and noisy observations of a state trajectory of the physical model. Moreover, a space-wise local representation of the ODE system is introduced and is key to cope with high-dimensional models. It has recently been suggested that neural network architectures could be interpreted as dynamical systems. Reciprocally, we show that our ODE representations are reminiscent of deep learning architectures. Furthermore, numerical analysis considerations on stability shed light on the assets and limitations of the method. The method is illustrated on several chaotic discrete and continuous models of various dimensions, with or without noisy observations, with the goal to identify or improve the model dynamics, build a surrogate or reduced model, or produce forecasts from mere observations of the physical model.


2020 ◽  
Vol 24 (106) ◽  
pp. 79-87
Author(s):  
Fredy Humberto Troncoso Espinosa ◽  
Javiera Valentina Ruiz Tapia

La fuga de clientes es un problema relevante al que enfrentan las empresas de servicios y que les puede generar pérdidas económicas significativas. Identificar los elementos que llevan a un cliente a dejar de consumir un servicio es una tarea compleja, sin embargo, mediante su comportamiento es posible estimar una probabilidad de fuga asociada a cada uno de ellos. Esta investigación aplica minería de datos para la predicción de la fuga de clientes en una empresa de distribución de gas natural, mediante dos técnicas de machine learning: redes neuronales y support vector machine. Los resultados muestran que mediante la aplicación de estas técnicas es posible identificar los clientes con mayor probabilidad de fuga para tomar sobre estas acciones de retenciónoportunas y focalizadas, minimizando los costos asociados al error en la identificación de estos clientes. Palabras Clave: fuga de clientes, minería de datos, machine learning, distribución de gas natural. Referencias [1]J. Miranda, P. Rey y R. Weber, «Predicción de Fugas de Clientes para una Institución Financiera Mediante Support Vector Machines,» Revista Ingeniería de Sistemas Volumen XIX, pp. 49-68, 2005. [2]P. A. Pérez V., «Modelo de predicción de fuga de clientes de telefonía movil post pago,» Universidad de Chile, Santiago, Chile, 2014. [3]Gas Sur S.A., «https://www.gassur.cl/Quienes-Somos/,» [En línea]. [4]J. Xiao, X. Jiang, C. He y G. Teng, «Churn prediction in customer relationship management via GMDH-based multiple classifiers ensemble,» IEEE IntelligentSystems, vol. 31, nº 2, pp. 37-44, 2016. [5]A. M. Almana, M. S. Aksoy y R. Alzahrani, «A survey on data mining techniques in customer churn analysis for telecom industry,» International Journal of Engineering Research and Applications, vol. 4, nº 5, pp. 165-171, 2014. [6]A. Jelvez, M. Moreno, V. Ovalle, C. Torres y F. Troncoso, «Modelo predictivo de fuga de clientes utilizando mineríaa de datos para una empresa de telecomunicaciones en chile,» Universidad, Ciencia y Tecnología, vol. 18, nº 72, pp. 100-109, 2014. [7]D. Anil Kumar y V. Ravi, «Predicting credit card customer churn in banks using data mining,» International Journal of Data Analysis Techniques and Strategies, vol. 1, nº 1, pp. 4-28, 2008. [8]E. Aydoğan, C. Gencer y S. Akbulut, «Churn analysis and customer segmentation of a cosmetics brand using data mining techniques,» Journal of Engineeringand Natural Sciences, vol. 26, nº 1, 2008. [9]G. Dror, D. Pelleg, O. Rokhlenko y I. Szpektor, «Churn prediction in new users of Yahoo! answers,» de Proceedings of the 21st International Conference onWorld Wide Web, 2012. [10]T. Vafeiadis, K. Diamantaras, G. Sarigiannidis y K. Chatzisavvas, «A comparison of machine learning techniques for customer churn prediction,» SimulationModelling Practice and Theory, vol. 55, pp. 1-9, 2015. [11]Y. Xie, X. Li, E. Ngai y W. Ying, «Customer churn prediction using improved balanced random forests,» Expert Systems with Applications, vol. 36, nº 3, pp.5445-5449, 2009. [12]U. Fayyad, G. Piatetsky-Shapiro y P. Smyth, «Knowledge Discovery and Data Mining: Towards a Unifying Framework,» de KDD-96 Proceedings, 1996. [13]R. Brachman y T. Anand, «The process of knowledge discovery in databases,» de Advances in knowledge discovery and data mining, 1996. [14]K. Lakshminarayan, S. Harp, R. Goldman y T. Samad, «Imputation of Missing Data Using Machine Learning Techniques,» de KDD, 1996. [15]B. Nguyen , J. L. Rivero y C. Morell, «Aprendizaje supervisado de funciones de distancia: estado del arte,» Revista Cubana de Ciencias Informáticas, vol. 9, nº 2, pp. 14-28, 2015. [16]I. Monedero, F. Biscarri, J. Guerrero, M. Peña, M. Roldán y C. León, «Detection of water meter under-registration using statistical algorithms,» Journal of Water Resources Planning and Management, vol. 142, nº 1, p. 04015036, 2016. [17]I. Guyon y A. Elisseeff, «An introduction to variable and feature selection,» Journal of machine learning research, vol. 3, nº Mar, pp. 1157-1182, 2003. [18]K. Polat y S. Güneş, «A new feature selection method on classification of medical datasets: Kernel F-score feature selection,» Expert Systems with Applications, vol. 36, nº 7, pp. 10367-10373, 2009. [19]D. J. Matich, «Redes Neuronales. Conceptos Básicos y Aplicaciones,» de Cátedra: Informática Aplicada ala Ingeniería de Procesos- Orientación I, 2001. [20]E. Acevedo M., A. Serna A. y E. Serna M., «Principios y Características de las Redes Neuronales Artificiales, » de Desarrollo e Innovación en Ingeniería, Medellín, Editorial Instituto Antioqueño de Investigación, 2017, pp. Capítulo 10, 173-182. [21]M. Hofmann y R. Klinkenberg, RapidMiner: Data mining use cases and business analytics applications, CRC Press, 2016. [22]R. Pupale, «Towards Data Science,» 2018. [En línea]. Disponible: https://towardsdatascience.com/https-medium-com-pupalerushikesh-svm-f4b42800e989. [23]F. H. Troncoso Espinosa, «Prediction of recidivismin thefts and burglaries using machine learning,» Indian Journal of Science and Technology, vol. 13, nº 6, pp. 696-711, 2020. [24]L. Tashman, «Out-of-sample tests of forecasting accuracy: an analysis and review,» International journal of forecasting, vol. 16, nº 4, pp. 437-450, 2000. [25]S. Varma y R. Simon, «Bias in error estimation when using cross-validation for model selection,» BMC bioinformatics, vol. 7, nº 1, p. 91, 2006. [26]N. V. Chawla, K. W. Bowyer, L. O. Hall y W. Kegelmeyer, «SMOTE: Synthetic Minority Over-sampling Technique,» Journal of Artificial Inteligence Research16, pp. 321-357, 2002. [27]M. Sokolova y G. Lapalme, «A systematic analysis of performance measures for classification tasks,» Information processing & management, vol. 45, nº 4, pp. 427-437, 2009. [28]S. Narkhede, «Understanding AUC-ROC Curve,» Towards Data Science, vol. 26, 2018. [29]R. Westermann y W. Hager, «Error Probabilities in Educational and Psychological Research,» Journal of Educational Statistics, Vol 11, No 2, pp. 117-146, 1986.  


2022 ◽  
Author(s):  
Hannah Paris Cowley ◽  
Michael S. Robinette ◽  
Jordan K. Matelsky ◽  
Daniel Xenes ◽  
Aparajita Kashyap ◽  
...  

Abstract As clinicians are faced with a deluge of new information, data science can play a key role in highlighting key features towards developing new clinical hypotheses. Indeed, insights derived from machine learning can serve as a clinical support tool by connecting care providers with results from big data analysis to identify latent patterns that may not be easily detected by even skilled human observers. In this work, we show an example of collaboration between clinicians and data scientists during the COVID-19 pandemic, identifying subgroups of COVID-19 patients with unanticipated outcomes or who are high-risk for severe disease or death. We apply a random forest classifier model to predict adverse patient outcomes early in the disease course, and we connect our classification results to unsupervised clustering of patient features that may underpin patient risk. The paradigm for using data science for hypothesis generation and clinical decision support, as well as our triage classification approach and unsupervised clustering methods to determine patient cohorts, are applicable to driving rapid hypothesis generation and iteration in a variety of clinical challenges, including future public health crises.


Sign in / Sign up

Export Citation Format

Share Document