Big Data: A Very Short Introduction



Published By Oxford University Press

9780198779575, 9780191824616

Author(s):  
Dawn E. Holmes

‘Big data and society’ considers how big data is changing the society we live in, through the development of sophisticated robots and their role in the workplace. It discusses smart vehicles, smart homes, and smart cities of the future. The data generated by the world is only going to get bigger. The big data revolution marks a sea-change in the way the world works, and as with all technological developments, individuals, scientists, and governments together have a moral responsibility to ensure its proper use. Big data is power. Its potential for good is enormous. How we prevent its abuse is up to us.


The rapid growth in computing power and storage has led to progressively more data being collected. Big datasets are certainly large and complex, but in order to fully define ‘big data’ we need first to understand ‘small data’ and its role in statistical analysis. ‘Why is big data special?’ considers the four main characteristics of big data: volume, variety, velocity, and veracity, which present a considerable challenge in data management. The advantages we expect to gain from meeting this challenge and the questions we hope to answer with big data can be understood through data mining. The use of big data mining in credit card fraud detection is discussed.


Since the use of computers became feasible in commercial enterprise, there has been interest in using computers to improve efficiency, cut costs, and generate profits. When IBM launched the IBM-PC in 1981, with the use of floppy disks for data storage, the idea really took off for business, but it was the widespread adoption of the Internet that made e-commerce a practical proposition. ‘Big data, big business’ considers pay-per-click advertising, cookies, targeted advertising, recommender systems, and collaborative filtering used by a wide range of businesses. Alongside the analysis of business practices it provides case studies on Amazon and Netflix, each highlighting different features of marketing using big data.
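The collaborative filtering mentioned above can be illustrated with a minimal user-based sketch. It is not the method used by Amazon or Netflix; the ratings, user names, and the helper names `cosine` and `recommend` are all hypothetical, chosen only to show the idea of scoring unseen items by the ratings of similar users.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); illustrative only.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "D": 4},
    "cat": {"C": 5, "D": 2},
}


def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings):
    """Rank items the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only score items the user has not seen
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)


print(recommend("ann", ratings))
```

Production recommender systems work with far larger and sparser rating matrices and typically use more elaborate models; the sketch only shows the core step of borrowing preferences from similar users.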


Big data analysis is changing the world of healthcare. Its potential, which has yet to be fully realized, includes medical diagnosis, epidemic prediction, gauging public response to government health warnings, and cost reduction. ‘Big data and medicine’ considers healthcare informatics, which uses big data to improve patient care and reduce costs. Mining social media data for public health research is now a recognized practice within the academic community. Google Flu Trends is discussed along with the Watson medical system, which retrieves and analyses structured and unstructured data. Big data use in healthcare evidently has great potential, but what about the privacy of the individual’s medical data?


‘Big data analytics’ argues that big data is valuable only if we can extract useful information from it. It looks at some of the techniques used to discover such information, for example customer preferences or how fast an epidemic is spreading. Big data analytics is changing rapidly as the size of the datasets increases and classical statistics makes room for this new paradigm. An example of big data analytics is MapReduce, a method for distributed data processing that forms part of the core functionality of the Hadoop Ecosystem. Amazon, Google, Facebook, and many others use Hadoop to store and process their data.
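The MapReduce idea can be sketched in a few lines as a single-process word count. This is a toy illustration under stated assumptions, not Hadoop's API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real cluster would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["big data", "Big business"]))
```

Because each document is mapped independently and each key is reduced independently, both steps can in principle be distributed across a cluster, which is the property that makes the approach suitable for very large datasets.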


The amount of data generated is approximately doubling every two years. How do we store and manage these colossal amounts of data? ‘Storing big data’ considers database storage and the idea of distributing tasks across clusters of computers. Relational database management systems are used to create, maintain, access, and manipulate structured data, whereas distributed file systems provide effective and reliable storage for unstructured data across many servers. NoSQL databases and their architecture are discussed along with the CAP Theorem and Cloud storage. The difference between lossless compression for text files and lossy data compression for sound and image files is also explained.
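The contrast between lossless and lossy compression can be shown with a small sketch. The lossless half uses Python's standard `zlib` module; the lossy half is a deliberately crude quantization of hypothetical sample values, standing in for what audio and image codecs do in a far more sophisticated way. The sample data and the helper name `quantize` are assumptions made for illustration.

```python
import zlib

# Lossless: repetitive text compresses well and is restored byte for byte.
text = ("big data " * 200).encode("utf-8")
packed = zlib.compress(text)
restored = zlib.decompress(packed)
print(len(text), len(packed), restored == text)


def quantize(samples, step):
    """Lossy: snap each sample to the nearest multiple of `step`."""
    return [round(s / step) * step for s in samples]


# Hypothetical signal samples; after quantization the originals cannot be recovered.
samples = [0.12, 0.47, 0.51, 0.88]
coarse = quantize(samples, 0.25)
print(coarse)
```

The text survives the round trip exactly, whereas the quantized samples only approximate the originals; that loss is acceptable for sound and images but not for text files.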


‘Big data security and the Snowden case’ looks at some of the security issues surrounding big data and the importance of encryption. Some of the problems facing big data systems include ensuring they actually work as intended, can be fixed when they break down, and are tamper-proof and accessible only to those with the correct authorization. Hacking and data theft have become the main problem facing big data. The hacks of Home Depot in 2014 and of Yahoo! in 2016 are described, as well as the Edward Snowden case, Wikileaks, TOR, and the Dark Web. If we want to make big data secure, encryption is vital.


‘The data explosion’ introduces the reader to the diversity of data in general before explaining how the digital age has led to changes in the way we define data. Big data is introduced informally through the idea of the data explosion, involving computer science, statistics, and the interface between them. It discusses search engine data, healthcare data, astronomical data, and real-time data such as the information provided by Global Positioning Systems. But what use is all this data? The ultimate aim of working with big data is to extract useful information. Combining traditional statistics and computer science allows large datasets to be analysed to search for key patterns.

