An HSV-Based Visual Analytic System for Data Science on Music and Beyond

Author(s):  
Carson K.S. Leung ◽  
Yibin Zhang

In the current era of big data, high volumes of a wide variety of valuable data—which may be of different veracities—can be easily generated or collected at a high speed in various real-life applications related to art, culture, design, engineering, mathematics, science, and technology. A data science solution helps manage, analyze, and mine these big data—such as musical data—for the discovery of interesting information and useful knowledge. As “a picture is worth a thousand words,” a visual representation provided by the data science solution helps visualize the big data and comprehend the mined information and discovered knowledge. This journal article presents a visual analytic system—which uses a hue-saturation-value (HSV) color model to represent big data—for data science on musical data and beyond (e.g., other types of big data).

2013 ◽  
Vol 3 (4) ◽  
pp. 120-140 ◽  
Author(s):  
Carson K.S. Leung ◽  
Christopher L. Carmichael ◽  
Patrick Johnstone ◽  
David Sonny Hung-Cheung Yuen

In numerous real-life applications, large databases can be easily generated. Implicitly embedded in these databases is previously unknown and potentially useful knowledge such as frequently occurring sets of items, merchandise, or events. Different algorithms have been proposed for managing and retrieving useful information from these databases. Various algorithms have also been proposed for mining these databases to find frequent sets, which are usually presented in a lengthy textual list. As “a picture is worth a thousand words”, the use of visual representations can enhance user understanding of the inherent relationships among the mined frequent sets. Many of the existing visualizers were not designed to visualize these mined frequent sets. In this journal article, an interactive visual analytic system is proposed for providing visual analytic solutions to the frequent set mining problem. The system enables the management, visualization, and advanced analysis of the original transaction databases as well as the frequent sets mined from these databases.
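To make "frequent sets" concrete, here is a minimal sketch of the mining problem the abstract refers to. It is not the authors' algorithm or system: it is a brute-force illustration that assumes a frequent set is any itemset contained in at least `min_support` transactions. The function name `frequent_sets`, the `min_support` parameter, and the toy transaction database are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Brute force: every non-empty subset of every transaction is counted,
    so this is only suitable for tiny illustrative inputs.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical transaction database (each inner list is one transaction).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

result = frequent_sets(transactions, min_support=2)

# Print the "lengthy textual list" that a visualizer would aim to replace.
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(itemset, count)
```

Even for four transactions the output is a flat list of six itemsets. That flat textual form is what the abstract contrasts with visual representations. Practical miners avoid the exponential subset enumeration used here.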


2017 ◽  
Vol 7 (2) ◽  
Author(s):  
Dicky R. M. Nainggolan

Data are prominent elements of scientific research and approaches. Data science methodology is used to select and prepare enormous amounts of data for further processing and analysis. Big data technology collects vast amounts of data from many sources in order to exploit the information, visualise trends, or discover phenomena in the past, present, or future, with high-speed processing capability. Predictive analytics provides in-depth analytical insights, and the emergence of machine learning brings data analytics to a higher level by processing raw data with artificial intelligence technology. Predictive analytics and machine learning produce visual reports for decision makers and stakeholders. Regarding cyberspace security, big data promises opportunities to prevent and detect advanced cyber-attacks by using internal and external security data. Keywords: Big Data, Cyber Security, Data Science, Intelligence, Predictive Analytics


Author(s):  
Michele Ianni ◽  
Elio Masciari ◽  
Giancarlo Sperlí

The pervasive diffusion of Social Networks (SN) has produced an unprecedented amount of heterogeneous data. Traditional approaches therefore quickly became impractical for real-life applications because of the intrinsic properties of these data: large volumes of user-generated content (text, video, image, and audio), data heterogeneity, and a high generation rate. In more detail, the analysis of user-generated data from popular social networks (e.g., Facebook (https://www.facebook.com/), Twitter (https://www.twitter.com/), Instagram (https://www.instagram.com/), LinkedIn (https://www.linkedin.com/)) poses intriguing challenges for both the research and industry communities in analyzing user behavior, user interactions, link evolution, opinion spreading, and several other important aspects. This survey focuses on the analyses performed in the last two decades on this kind of data with respect to the dimensions defined for the Big Data paradigm (the so-called Big Data 6 V’s).


2019 ◽  
pp. 1-9
Author(s):  
Jerome Jourquin ◽  
Stephanie Birkey Reffey ◽  
Cheryl Jernigan ◽  
Mia Levy ◽  
Glendon Zinser ◽  
...  

Integrating different types of data, including electronic health records, imaging data, administrative and claims databases, large data repositories, the Internet of Things, genomics, and other omics data, is both a challenge and an opportunity that must be tackled head on. We explore some of the challenges and opportunities in optimizing data integration to accelerate breast cancer discovery and improve patient outcomes. Susan G. Komen convened three meetings (2015, 2017, and 2018) with various stakeholders to discuss challenges, opportunities, and next steps to enhance the use of big data in the field of breast cancer. Meeting participants agreed that big data approaches can enhance the identification of better therapies, improve outcomes, reduce disparities, and optimize precision medicine. One challenge is that databases must be shared, linked with each other, standardized, and interoperable. Patients want to be active participants in research and their own care, and to control how their data are used. Many patients have privacy concerns and do not understand how sharing their data can help to effectively drive discovery. Public education is essential, and breast cancer researchers who are skilled in using and analyzing big data are needed. Patient advocacy groups can play multiple roles to help maximize and leverage big data to better serve patients. Komen is committed to educating patients on big data issues, encouraging data sharing by all stakeholders, assisting in training the next generation of data science breast cancer researchers, and funding research projects that will use real-life data in real time to revolutionize the way breast cancer is understood and treated.


Author(s):  
Carson K.-S. Leung ◽  
Christopher L. Carmichael ◽  
Patrick Johnstone ◽  
Roy Ruokun Xing ◽  
David Sonny Hung-Cheung Yuen

High volumes of a wide variety of data can be easily generated at a high velocity in many real-life applications. Implicitly embedded in these big data is previously unknown and potentially useful knowledge such as frequently occurring sets of items, merchandise, or events. Different algorithms have been proposed for either retrieving information about the data or mining the data to find frequent sets, which are usually presented in a lengthy textual list. As “a picture is worth a thousand words”, the use of visual representations can enhance user understanding of the inherent relationships among the mined frequent sets. However, many of the existing visualizers were not designed to visualize these mined frequent sets. This book chapter presents an interactive next-generation visual analytic system. The system enables the management, visualization, and advanced analysis of the original big data and the frequent sets mined from the data.


Author(s):  
Shaveta Bhatia

The era of big data presents many opportunities for development in data science, biomedical research, cyber security, and cloud computing. As big data has gained popularity, it has also raised many challenges and consequences for security and privacy. Various threats and attacks, such as data leakage, unauthorized third-party access, viruses, and vulnerabilities, undermine the security of big data. This paper discusses these security threats, and methods for addressing them, in the fields of biomedical research, cyber security, and cloud computing.


2020 ◽  
Author(s):  
Bankole Olatosi ◽  
Jiajia Zhang ◽  
Sharon Weissman ◽  
Zhenlong Li ◽  
Jianjun Hu ◽  
...  

BACKGROUND: Coronavirus Disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), remains a serious global pandemic. Currently, all age groups are at risk of infection, but the elderly and persons with underlying health conditions are at higher risk of severe complications. In the United States (US), the pandemic curve is rapidly changing, with over 6,786,352 cases and 199,024 deaths reported. South Carolina (SC), as of 9/21/2020, reported 138,624 cases and 3,212 deaths across the state. OBJECTIVE: The growing availability of COVID-19 data provides a basis for deploying Big Data science to leverage multitudinal and multimodal data sources for incremental learning. Doing this requires the acquisition and collation of multiple data sources at the individual and county level. METHODS: The population for the comprehensive database comes from statewide COVID-19 testing surveillance data (March 2020 to present) for all SC COVID-19 patients (N≈140,000). This project will 1) connect multiple partner data sources for prediction and intelligence gathering, and 2) build a REDCap database that links de-identified multitudinal and multimodal data sources useful for machine learning and deep learning algorithms to enable further studies. Additional data will include hospital-based COVID-19 patient registries, Health Sciences South Carolina (HSSC) data, data from the office of Revenue and Fiscal Affairs (RFA), and Area Health Resource Files (AHRF). RESULTS: The project was funded as of June 2020 by the National Institutes of Health. CONCLUSIONS: The development of such a linked and integrated database will allow for the identification of important predictors of short- and long-term clinical outcomes for SC COVID-19 patients using data science.


Author(s):  
Leilah Santiago Bufrem ◽  
Fábio Mascarenhas Silva ◽  
Natanael Vitor Sobral ◽  
Anna Elizabeth Galvão Coutinho Correia

Introduction: The current dynamics of scientific production and communication reveal the leading role of Data-Driven Science, broadly conceived and represented mainly by terms such as “e-Science” and “Data Science”. Objectives: To present the worldwide scientific production on Data-Driven Science, based on the terms “e-Science” and “Data Science” in Scopus and Web of Science, between 2006 and 2016. Methodology: The research is structured in five stages: a) searching for information in the Scopus and Web of Science databases; b) obtaining the bibliometric records; c) supplementing the keywords; d) correcting and cross-referencing the data; e) analytical representation of the data. Results: The most prominent terms in the scientific production analyzed were Distributed computer systems (2006), Grid computing (2007 to 2013), and Big data (2014 to 2016). In Library and Information Science, the emphasis is on the topics Digital library and Open access, which shows the field's central role in discussions about mechanisms for providing access to scientific information in digital form. Conclusions: From a diachronic perspective, there is a visible shift in focus from themes concerned with data-sharing operations to the analytical perspective of searching for patterns in large volumes of data. Keywords: Data Science. E-Science. Data-driven science. Scientific production. Link: http://www.uel.br/revistas/uel/index.php/informacao/article/view/26543/20114

