From data points to people: feminist situated ethics in online big data research

2019 ◽  
Vol 23 (2) ◽  
pp. 155-168 ◽  
Author(s):  
Danielle J. Corple ◽  
Jasmine R. Linabary
Keyword(s):  
Big Data ◽  
Author(s):  
Shweta Kumari

In a business enterprise, an enormous amount of data is generated or processed daily across different data points, and the volume grows day by day. Such data are difficult to handle with traditional applications like Excel or similar tools, so big data analytics and environments can be helpful in this scenario. This paper discusses approaches to big data management and the impact of computational methodologies. It also covers the applicable domains and areas, and explores scenarios for applying computational methods and their conceptual design based on the previous literature. Machine learning, artificial intelligence, and data mining techniques are discussed for the same environment based on the related studies.
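As a rough illustration of the workload described above — daily aggregations that outgrow spreadsheet tools — here is a minimal PySpark sketch; the file paths and column names are invented placeholders, not from the paper:

```python
# Minimal PySpark sketch: aggregate daily transactions at enterprise scale.
# Paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-aggregation").getOrCreate()

# Read a day's worth of transaction records (far beyond spreadsheet limits).
tx = spark.read.parquet("hdfs:///enterprise/transactions/2024-01-01/")

# Revenue per store, computed in parallel across the cluster.
summary = (tx.groupBy("store_id")
             .agg(F.sum("amount").alias("revenue"),
                  F.count("*").alias("n_transactions")))

summary.write.mode("overwrite").parquet("hdfs:///enterprise/reports/daily/")
```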


2020 ◽  
Vol 11 (2) ◽  
pp. 396
Author(s):  
Ahmad Nurul FAJAR ◽  
Aldian NURCAHYO ◽  
Nunung Nurul QOMARIYAH

Nowadays, more and more people enjoy fast internet access that can be used for various activities such as browsing, shopping online, video calls, and playing games. Businesses are also exploiting this rapid growth in internet technology: they sell products and services through the internet with various attractive offers, competing with each other to increase their sales. One strategy for winning more sales is personalizing services for customers. The personalization aspect of e-tourism is predicted to increase. Customers generate valuable data at every stage of their journey, which challenges travel companies to collect and link these data points to improve the customer experience. Learning customer behaviour can be very significant for an Online Travel Agent: because they collect millions of search results through their services and must provide a smart travel experience, Online Travel Agents in Indonesia must align Big Data and Cloud technology to win the competition in the market. The entire data lifecycle must stay simple because users need to keep batch-ingesting large amounts of data, for example once an hour. Streaming analytics has grown over the past few years and has become one of the most critical components of many businesses. We propose an Online Travel Agent (OTA) tourism system using Big Data and Cloud.
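The hourly batch ingestion the authors mention could, for instance, be realized with Spark Structured Streaming. The sketch below is one possible arrangement, not the proposed system's actual design; the Kafka topic, schema, and storage paths are hypothetical, and the Kafka connector package is assumed to be available:

```python
# Sketch of hourly ingestion of OTA search logs with Spark Structured
# Streaming. Topic, schema, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, TimestampType

spark = SparkSession.builder.appName("ota-search-ingest").getOrCreate()

schema = (StructType()
          .add("user_id", StringType())
          .add("destination", StringType())
          .add("searched_at", TimestampType()))

# Consume raw search events from a Kafka topic.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "search-events")
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Land the events in cloud storage once an hour for downstream analytics.
query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://ota-datalake/searches/")
         .option("checkpointLocation", "s3a://ota-datalake/checkpoints/searches/")
         .trigger(processingTime="1 hour")
         .start())
```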


2017 ◽  
Author(s):  
Mohith Manjunath ◽  
Yi Zhang ◽  
Steve H. Yeo ◽  
Omar Sobh ◽  
Nathan Russell ◽  
...  

Summary: Clustering is one of the most common techniques used in data analysis to discover hidden structures by grouping together data points that are similar in some measure into clusters. Although there are many programs available for performing clustering, a single web resource that provides both state-of-the-art clustering methods and interactive visualizations is lacking. ClusterEnG (acronym for Clustering Engine for Genomics) provides an interface for clustering big data and interactive visualizations including 3D views, cluster selection and zoom features. ClusterEnG also aims at educating the user about the similarities and differences between various clustering algorithms and provides clustering tutorials that demonstrate potential pitfalls of each algorithm. The web resource will be particularly useful to scientists who are not conversant with computing but want to understand the structure of their data in an intuitive manner.

Availability: ClusterEnG is part of a bigger project called KnowEnG (Knowledge Engine for Genomics) and is available at http://education.knoweng.org/

Contact: [email protected]
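ClusterEnG itself is a web resource, but the kind of algorithm comparison its tutorials teach can be reproduced in a few lines. This sketch (not the service's code) contrasts k-means, which assumes roughly convex clusters, with DBSCAN on data where that assumption fails:

```python
# Sketch comparing two clustering algorithms on the same data, in the spirit
# of side-by-side clustering tutorials (not ClusterEnG's actual code).
import numpy as np
from sklearn.datasets import make_moons
from sklearn.cluster import KMeans, DBSCAN

X, _ = make_moons(n_samples=500, noise=0.05, random_state=0)

# k-means needs k up front and tends to split non-convex clusters...
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# ...while DBSCAN finds arbitrarily shaped, density-connected clusters.
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

print("k-means clusters:", np.unique(kmeans_labels))
print("DBSCAN clusters: ", np.unique(dbscan_labels))  # -1 marks noise points
```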


2019 ◽  
Vol 17 (2) ◽  
pp. 272-280
Author(s):  
Adeel Hashmi ◽  
Tanvir Ahmad

Anomaly/outlier detection is the process of finding abnormal data points in a dataset or data stream. Most anomaly detection algorithms require setting parameters that significantly affect the performance of the algorithm. These parameters are generally set by trial and error, so performance is compromised with default or random values. In this paper, the authors propose a self-optimizing algorithm for anomaly detection based on the firefly meta-heuristic, named the Firefly Algorithm for Anomaly Detection (FAAD). The proposed solution is a non-clustering, unsupervised learning approach to anomaly detection. The algorithm is implemented on Apache Spark for scalability, so the solution can handle big data as well. Experiments were conducted on various datasets, and the results show that the proposed solution is more accurate than standard anomaly detection algorithms.
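The paper's FAAD implementation is not reproduced here, but the firefly meta-heuristic it builds on can be sketched compactly: each firefly is a candidate parameter setting, and fireflies move toward brighter (better-scoring) ones. The objective below is a toy stand-in for whatever detector-quality measure is being optimized; all constants are illustrative:

```python
# Toy sketch of the firefly meta-heuristic itself (not the authors' FAAD):
# fireflies move toward brighter (better-scoring) fireflies, here minimizing
# a simple objective standing in for an anomaly-detector quality measure.
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Hypothetical stand-in for "quality of detector with parameters x".
    return np.sum((x - 0.3) ** 2)

n, dim, iters = 20, 2, 100
alpha, beta0, gamma = 0.2, 1.0, 1.0   # randomness, attractiveness, absorption
pos = rng.uniform(0, 1, size=(n, dim))

for _ in range(iters):
    scores = np.array([objective(p) for p in pos])
    for i in range(n):
        for j in range(n):
            if scores[j] < scores[i]:          # firefly j is "brighter"
                r2 = np.sum((pos[i] - pos[j]) ** 2)
                beta = beta0 * np.exp(-gamma * r2)
                pos[i] += (beta * (pos[j] - pos[i])
                           + alpha * (rng.uniform(size=dim) - 0.5))
    alpha *= 0.97                              # anneal the random walk

best = pos[np.argmin([objective(p) for p in pos])]
print("best parameters found:", best)
```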


2019 ◽  
Vol 9 (1) ◽  
pp. 53-72 ◽  
Author(s):  
Brian J. Galli

Innovation technologies are used for consistent and continuous improvement, as well as for examining past performance in business. Furthermore, the many insights gained about a business can inform planning and future decisions. Big data is one way to create connections between various data points. Currently, business processes face many challenges because of technological advancement and the pace of data generation. Since big data has grown and become so popular, it can be applied to unique as well as conventional business organizations. Additionally, if big data is used to meet a business's needs, it can yield organizational changes in infrastructure and real-world improvement. Through big data, analysts can reveal continuous improvement methods and performance measurement systems in data administration and management, in transactions, and in support of central decision making.


Author(s):  
Rajendra Akerkar

Emergencies are typically complex problems with serious consequences that must be solved in a limited amount of time to reduce any possible damage. Big data analysis leads to more confident decision making, and better decisions can mean greater operational efficiency, cost reductions and reduced risk. In this chapter, we discuss some issues in tackling emergency situations from the perspective of big data processing and management, including our approach to processing social media content. Communications during emergencies are so plentiful that it is necessary to sift through enormous numbers of data points to find the information that is most useful during a given event. The chapter also presents our ongoing IT system that processes and analyses social media data to transform an excessive volume of low-information content into a small volume of rich content that is useful to emergency personnel.
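As a hedged illustration of "sifting enormous numbers of data points down to useful information", the following toy filter scores posts by emergency keywords and geotagging; the terms, scoring, and data are invented, not the chapter's actual system:

```python
# Minimal sketch of sifting a high-volume message stream down to items likely
# useful to responders. Keywords and scoring are illustrative placeholders.
from dataclasses import dataclass

@dataclass
class Post:
    text: str
    has_location: bool

EMERGENCY_TERMS = {"trapped", "flood", "fire", "injured", "collapsed", "help"}

def relevance(post: Post) -> int:
    words = set(post.text.lower().split())
    score = len(words & EMERGENCY_TERMS)
    if post.has_location:          # geotagged reports are more actionable
        score += 2
    return score

stream = [
    Post("Great concert tonight!", False),
    Post("Family trapped on roof, water rising fast", True),
    Post("Bridge collapsed near the station, several injured", True),
]

useful = [p for p in stream if relevance(p) >= 2]
for p in useful:
    print(p.text)
```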


2017 ◽  
Vol 55 (9) ◽  
pp. 2038-2052
Author(s):  
Huifeng Pan ◽  
Man-Su Kang ◽  
Hong-Youl Ha

Purpose: Although the study of credit ratings has focused on traditional credit bureau resources, scholars have recently emphasized the importance of big data. The purpose of this paper is to examine both how these data affect the credit evaluations of small businesses and how financial managers use them to stabilize their risks.

Design/methodology/approach: Using data from 97,889 data points for normal guarantees and 1,678 data points for accidents in public funds, the authors explore the effects of trade area grades as well as the superiority of the use of big data when evaluating credit ratings for small businesses.

Findings: The results indicate that the grade information of trade areas is useful in predicting accident rates, particularly for small businesses with high credit scores (AAA-A). On the other hand, the accident rates of small businesses with low credit scores increased from 3.15-16.67 to 3.20-33.3 percent. These findings demonstrate that accident rates for businesses with high credit scores decrease, but accident rates for businesses with low credit scores increase, when using the grades of trade areas.

Originality/value: The authors contribute to the literature in two ways. First, this study provides one of the first investigations of information on trade areas from a public financial perspective, thereby extending the financial risk and retail literature. Second, the current study extends the research on the credit evaluation of small businesses through the big data application of real transaction-based trade areas, answering the call of Park et al. (2012), who recommended an exploration of the relationship between business start-ups and financial risk.
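To make the idea concrete, one could fold a trade-area grade into a standard credit-risk model as an extra feature. The sketch below uses synthetic, invented data and a plain logistic regression; it is not the authors' methodology:

```python
# Illustrative sketch of adding a trade-area grade to a credit-risk model.
# All features, codings, and data are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

credit_score = rng.integers(300, 900, size=n)      # bureau-style score
trade_area_grade = rng.integers(1, 6, size=n)      # 1 = weak ... 5 = strong

# Synthetic accident labels: worse scores and weaker trade areas raise risk.
logit = -0.01 * (credit_score - 600) - 0.5 * (trade_area_grade - 3) - 2.0
accident = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-logit))

X = np.column_stack([credit_score, trade_area_grade])
model = LogisticRegression().fit(X, accident)
print("coefficients (score, trade-area grade):", model.coef_[0])
```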


Author(s):  
David Pfander ◽  
Gregor Daiß ◽  
Dirk Pflüger

Clustering is an important task in data mining that has become more challenging due to the ever-increasing size of available datasets. To cope with these big data scenarios, a high-performance clustering approach is required. Sparse grid clustering is a density-based clustering method that uses a sparse grid density estimation as its central building block. The underlying density estimation approach enables the detection of clusters with non-convex shapes and without a predetermined number of clusters. In this work, we introduce a new distributed and performance-portable variant of the sparse grid clustering algorithm that is suited for big data settings. Our compute kernels were implemented in OpenCL to enable portability across a wide range of architectures. For distributed environments, we added a manager-worker scheme that was implemented using MPI. In experiments on two supercomputers, Piz Daint and Hazel Hen, with up to 100 million data points in a 10-dimensional dataset, we show the performance and scalability of our approach. The dataset with 100 million data points was clustered in 1198 s using 128 nodes of Piz Daint. This translates to an overall performance of 352 TFLOPS. On the node-level, we provide results for two GPUs, Nvidia's Tesla P100 and the AMD FirePro W8100, and one processor-based platform that uses Intel Xeon E5-2680v3 processors. In these experiments, we achieved between 43% and 66% of the peak performance across all compute kernels and devices, demonstrating the performance portability of our approach.
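The manager-worker distribution scheme the authors describe can be sketched in a few lines with mpi4py. This is an illustrative skeleton only; the actual system dispatches OpenCL kernel work over sparse grid data, not the toy tasks used here:

```python
# Minimal mpi4py sketch of a manager-worker pattern (illustrative only).
# Run with e.g.: mpirun -n 4 python manager_worker.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
TAG_WORK, TAG_DONE = 1, 2

if rank == 0:                                  # manager
    tasks = list(range(10))                    # hypothetical chunks of work
    results = []
    active = 0
    for w in range(1, comm.Get_size()):        # seed every worker once
        if tasks:
            comm.send(tasks.pop(), dest=w, tag=TAG_WORK)
            active += 1
        else:
            comm.send(None, dest=w, tag=TAG_DONE)
    while active:                              # collect, then refill or retire
        status = MPI.Status()
        results.append(comm.recv(source=MPI.ANY_SOURCE, status=status))
        src = status.Get_source()
        if tasks:
            comm.send(tasks.pop(), dest=src, tag=TAG_WORK)
        else:
            comm.send(None, dest=src, tag=TAG_DONE)
            active -= 1
    print("all results:", sorted(results))
else:                                          # worker
    while True:
        status = MPI.Status()
        task = comm.recv(source=0, status=status)
        if status.Get_tag() == TAG_DONE:
            break
        comm.send(task * task, dest=0)         # stand-in for a compute kernel
```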


2020 ◽  
Author(s):  
Alexander Jung

We propose networked exponential families for non-parametric machine learning from massive network-structured datasets ("big data over networks"). High-dimensional data points are interpreted as the realizations of a random process distributed according to some exponential family. Networked exponential families make it possible to jointly leverage the information contained in high-dimensional data points and their network structure. For data points representing individuals, we obtain perfectly personalized models which enable high-precision medicine or more general recommendation systems. We learn the parameters of networked exponential families using the network Lasso, which implicitly pools (or clusters) the data points according to the intrinsic network structure and a local likelihood function. Our main theoretical result characterizes how the accuracy of the network Lasso depends on the network structure and the information geometry of the node-wise exponential families. The network Lasso can be implemented as highly scalable message passing over the data network. Such message passing is appealing for federated machine learning relying on edge computing. The proposed method is also privacy preserving in the sense that no raw data, but only parameter estimates, are shared among different nodes.
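A drastically simplified instance of the network Lasso makes the pooling behaviour visible: for scalar Gaussian node models, it minimizes a local squared loss plus a total-variation penalty over the edges. The sketch below solves this by plain subgradient descent on an invented path graph; it is a toy simplification, not the paper's algorithm or its message-passing solver:

```python
# Toy network-Lasso sketch for scalar Gaussian node models: minimize
#   sum_i (theta_i - x_i)^2 + lam * sum_{(i,j) in E} |theta_i - theta_j|
# by subgradient descent. Graph and data are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Path graph on 10 nodes: two "communities" with different true parameters.
edges = [(i, i + 1) for i in range(9)]
truth = np.array([0.0] * 5 + [3.0] * 5)
x = truth + 0.3 * rng.standard_normal(10)    # one noisy sample per node

theta = x.copy()
lam, step = 0.5, 0.05
for _ in range(2000):
    grad = 2.0 * (theta - x)                 # gradient of the local losses
    for i, j in edges:                       # TV subgradient: edge "messages"
        s = np.sign(theta[i] - theta[j])
        grad[i] += lam * s
        grad[j] -= lam * s
    theta -= step * grad

print(np.round(theta, 2))   # estimates are (approximately) pooled per community
```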

