Integrating Big Data Services Into an Undergraduate MIS Curriculum

Author(s):  
Scott Jensen

There is an insatiable demand in industry for data scientists, and graduate programs and certificates are gearing up to meet this demand. However, there is agreement in the industry that 80% of a data scientist's work consists of the transformation and profiling aspects of wrangling Big Data; work that may not require an advanced degree. In this paper, the authors present hands-on exercises to introduce Big Data to undergraduate MIS students using the CoNVO Framework and Big Data tools to scope a data problem and then wrangle the data to answer questions using a real-world dataset. This can provide undergraduates with a single course introduction to an important aspect of data science.

Author(s):  
Gurdeep S Hura

This chapter presents the emerging technology of social media and networking, with a detailed discussion of basic definitions and applications, how the technology has evolved over the last few years, and the need for dynamicity in a data mining environment. It also provides a comprehensive design and analysis of the popular social networking media and sites available to users. It briefly discusses data mining methodologies for implementing the variety of new applications that deal with huge/big data in data science. Further, the chapter attempts to present an emerging perspective on data mining methodologies and their dynamicity for social networking media and sites, as a new trend and a needed framework for collecting, analyzing, and interpreting huge amounts of data in a number of real-world applications. The chapter also discusses the current and future status of data mining for social media and networking applications.


Author(s):  
Gary Smith ◽  
Jay Cordes

Scientific rigor and critical thinking skills are indispensable in this age of big data because machine learning and artificial intelligence are often led astray by meaningless patterns. The 9 Pitfalls of Data Science is loaded with entertaining real-world examples of both successful and misguided approaches to interpreting data, from grand successes to epic failures. Anyone can learn to distinguish between good data science and nonsense. We are confident that readers will learn how to avoid being duped by data and how to make better, more informed decisions. Whether they want to be effective creators, interpreters, or users of data, they need to know the nine pitfalls of data science.


2019 ◽  
Vol 15 (S367) ◽  
pp. 458-460
Author(s):  
A. Bayo ◽  
M. J. Graham ◽  
D. Norman ◽  
M. Cerda ◽  
G. Damke ◽  
...  

La Serena School for Data Science is a multidisciplinary program with six editions so far and a constant format: over 10-14 days, a group of ∼30 students (15 from the US, 15 from Chile and 1-3 from Caribbean countries) and ∼9 faculty gather in La Serena (Chile) to complete an intensive program in Data Science with an emphasis on applications to astronomy and bio-sciences. The students attend theoretical and hands-on sessions and, from early on, work in multidisciplinary groups with their “mentors” (from the faculty) on real data science problems. The SOC and LOC of the school have developed student selection guidelines to maximize diversity. The program is very successful, as shown by the high over-subscription rate (a factor of 5-8) and the plethora of positive testimony, not only from alumni but also from current and former faculty who keep in contact with them.


Author(s):  
Shaveta Bhatia

The era of big data presents many opportunities for development in data science, biomedical research, cyber security, and cloud computing. Big data has recently gained popularity, but it also brings many challenges and consequences for security and privacy. Various threats and attacks stand against the security of big data, such as data leakage, unauthorized third-party access, viruses, and vulnerabilities. This paper discusses these security threats and methods for approaching them in the fields of biomedical research, cyber security, and cloud computing.


2020 ◽  
Author(s):  
Bankole Olatosi ◽  
Jiajia Zhang ◽  
Sharon Weissman ◽  
Zhenlong Li ◽  
Jianjun Hu ◽  
...  

Background: Coronavirus Disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), remains a serious global pandemic. Currently, all age groups are at risk of infection, but the elderly and persons with underlying health conditions are at higher risk of severe complications. In the United States (US), the pandemic curve is changing rapidly, with over 6,786,352 cases and 199,024 deaths reported. As of 9/21/2020, South Carolina (SC) had reported 138,624 cases and 3,212 deaths across the state. Objective: The growing availability of COVID-19 data provides a basis for deploying Big Data science to leverage multitudinal and multimodal data sources for incremental learning. Doing this requires the acquisition and collation of multiple data sources at the individual and county level. Methods: The population for the comprehensive database comes from statewide COVID-19 testing surveillance data (March 2020 to present) for all SC COVID-19 patients (N≈140,000). This project will 1) connect multiple partner data sources for prediction and intelligence gathering, and 2) build a REDCap database that links de-identified multitudinal and multimodal data sources useful for machine learning and deep learning algorithms to enable further studies. Additional data will include hospital-based COVID-19 patient registries, Health Sciences South Carolina (HSSC) data, data from the office of Revenue and Fiscal Affairs (RFA), and Area Health Resource Files (AHRF). Results: The project was funded as of June 2020 by the National Institutes of Health. Conclusions: The development of such a linked and integrated database will allow for the identification of important predictors of short- and long-term clinical outcomes for SC COVID-19 patients using data science.


Author(s):  
Leilah Santiago Bufrem ◽  
Fábio Mascarenhas Silva ◽  
Natanael Vitor Sobral ◽  
Anna Elizabeth Galvão Coutinho Correia

Introduction: The current configuration of the dynamics of scientific production and communication reveals the leading role of Data-Driven Science, broadly conceived, represented mainly by terms such as “e-Science” and “Data Science”. Objectives: To present worldwide scientific production on Data-Driven Science, based on the terms “e-Science” and “Data Science” in Scopus and Web of Science, between 2006 and 2016. Methodology: The research is structured in five stages: a) searching for information in the Scopus and Web of Science databases; b) obtaining the bibliometric records; c) complementing the keywords; d) correcting and cross-referencing the data; e) analytical representation of the data. Results: The most prominent terms in the scientific production analyzed were Distributed computer systems (2006), Grid computing (2007 to 2013) and Big data (2014 to 2016). In Library and Information Science, emphasis is given to the topics Digital library and Open access, showing the centrality of the field in discussions about mechanisms for providing access to scientific information in digital form. Conclusions: From a diachronic perspective, there is a visible shift in focus from topics concerned with data-sharing operations to the analytical perspective of searching for patterns in large volumes of data. Keywords: Data Science. E-Science. Data-driven science. Scientific production. Link: http://www.uel.br/revistas/uel/index.php/informacao/article/view/26543/20114


Author(s):  
Muhammad Waqar Khan ◽  
Muhammad Asghar Khan ◽  
Muhammad Alam ◽  
Wajahat Ali

Over the past few years, data has grown exponentially, attracting researchers to work on a popular term: Big Data. Big Data is observed in various fields, such as information technology, telecommunication, theoretical computing, mathematics, data mining and data warehousing. Data science is frequently mentioned together with Big Data because it uses methods to scale down Big Data. Currently more than 3.2 billion of the world's population is connected to the internet, 46% of them via smartphones. Over 5.5 billion people use cell phones. As technology rapidly shifts from ordinary cell phones towards smartphones, the proportion of people using the internet is also growing. It is forecast that by 2020 around 7 billion people around the globe will be using the internet, 52% of them connecting via smartphones. In 2050 that figure will reach 95% of the world population. Every device connected to the internet generates data. Because the majority of devices generating this data are smartphones running applications such as Instagram, WhatsApp, Apple, Google, Google+, Twitter, Flickr, etc., this huge amount of data is becoming a big threat to the telecom sector. This paper compares the amounts of Big Data generated by the telecom industry. Based on the collected data, we use forecasting tools to predict the amount of Big Data that will be generated in the future and identify the threats the telecom industry will face from it.


2019 ◽  
Vol 36 (2) ◽  
pp. 75-82
Author(s):  
Tomohide Iwao ◽  
Genta Kato ◽  
Isao Ito ◽  
Toyohiro Hirai ◽  
Tomohiro Kuroda
