massive data
Recently Published Documents

TOTAL DOCUMENTS: 966 (five years: 302)
H-INDEX: 27 (five years: 7)

2022 ◽  
Vol 16 (4) ◽  
pp. 1-19
Author(s):  
Fei Gao ◽  
Jiada Li ◽  
Yisu Ge ◽  
Jianwen Shao ◽  
Shufang Lu ◽  
...  

With the popularization of visual object tracking (VOT), ever-larger volumes of trajectory data are being collected and are attracting attention in fields such as mobile robotics and intelligent video surveillance. Cleaning the anomalous trajectories hidden in this massive data has become a research hotspot: anomalous trajectories must be detected and removed before the trajectory data can be used effectively. This article proposes a Trajectory Evaluator by Sub-tracks (TES) for detecting VOT-based anomalous trajectories. A Feature of Anomalousness is defined as the feature vector fed to a classifier that filters out tracklet anomalies and identity-switch anomalies; it comprises a Feature of Anomalous Pose and a Feature of Anomalous Sub-tracks (FAS). In comparative experiments, TES achieves better results across different scenes than state-of-the-art methods, and FAS outperforms point flow, least-squares fitting, and Chebyshev polynomial fitting. The results verify that TES is more accurate and effective and supports sub-track-level trajectory analysis.
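
As a rough illustration of the sub-track idea (the paper's exact feature definitions and classifier are not reproduced here), the sketch below splits a trajectory into overlapping sub-tracks, computes simple kinematic features per sub-track, and filters whole trajectories with a binary classifier. The function names, features, and classifier choice are illustrative assumptions, not the authors' TES implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def split_subtracks(track, window=10):
    """Split an (N, 2) array of (x, y) points into overlapping sub-tracks."""
    return [track[i:i + window] for i in range(0, len(track) - window + 1, window // 2)]

def subtrack_features(sub):
    """Simple kinematic features for one sub-track: step statistics and turning angles."""
    steps = np.diff(sub, axis=0)                    # displacement vectors between frames
    speed = np.linalg.norm(steps, axis=1)           # per-step speed
    angles = np.arctan2(steps[:, 1], steps[:, 0])
    turn = np.abs(np.diff(angles))                  # heading change between steps
    return np.array([speed.mean(), speed.std(), speed.max(), turn.mean(), turn.max()])

def clean_trajectories(feature_matrix, subtrack_labels, new_tracks):
    """Fit on labeled sub-track features (0 = normal, 1 = anomalous), then keep
    only trajectories whose sub-tracks are all classified as normal."""
    clf = RandomForestClassifier(n_estimators=100).fit(feature_matrix, subtrack_labels)
    kept = []
    for track in new_tracks:
        feats = np.array([subtrack_features(s) for s in split_subtracks(track)])
        if feats.size and clf.predict(feats).max() == 0:  # no anomalous sub-track found
            kept.append(track)
    return kept
```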


2022 ◽  
Vol 14 (2) ◽  
pp. 390
Author(s):  
Dinh Ho Tong Minh ◽  
Yen-Nhi Ngo

Modern Synthetic Aperture Radar (SAR) missions provide unprecedentedly massive interferometric SAR (InSAR) time series, and processing this Big InSAR Data is challenging for long-term monitoring. Because most deformation phenomena develop slowly, the processing scheme can operate on reduced-volume data sets. This paper introduces a novel ComSAR algorithm based on a compression technique that reduces computational effort while robustly maintaining performance. The algorithm divides the massive data into many mini-stacks and then compresses them. The compressed estimator is close to the theoretical Cramer–Rao lower bound under a realistic C-band Sentinel-1 decorrelation scenario. Both persistent and distributed scatterers (PSDS) are exploited in the ComSAR algorithm. Its performance is validated via simulation and via application to Sentinel-1 data to map land subsidence over the Vauvert salt-mine area, France, where ComSAR consistently outperforms the state-of-the-art PSDS technique. We release our PSDS and ComSAR algorithms as an open-source TomoSAR package and, to make it more practical, build on other open-source projects so that users can apply the PSDS and ComSAR methods in an end-to-end processing chain. To our knowledge, TomoSAR is the first public-domain tool that jointly handles PS and DS targets.
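
The mini-stack compression can be pictured as follows: within each small block of acquisitions, the pixel stack is projected onto the dominant eigenvector of its sample covariance, leaving one "virtual" image per block. This is only a minimal sketch of the general compression idea under assumed array shapes; the actual ComSAR estimator in the TomoSAR package differs in detail.

```python
import numpy as np

def compress_ministacks(slc, size=10):
    """Compress a (T, P) complex SLC time series (T acquisitions, P pixels)
    into one virtual image per mini-stack of `size` acquisitions."""
    T, P = slc.shape
    compressed = []
    for start in range(0, T, size):
        block = slc[start:start + size]           # (t, P) mini-stack
        cov = block @ block.conj().T / P          # (t, t) sample covariance
        w, v = np.linalg.eigh(cov)                # Hermitian eigendecomposition, ascending
        leading = v[:, -1]                        # dominant eigenvector
        compressed.append(leading.conj() @ block) # one compressed image of shape (P,)
    return np.array(compressed)
```

Downstream phase estimation then runs on roughly T/size compressed images instead of all T acquisitions, which is where the computational saving comes from.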


2022 ◽  
Author(s):  
Rabeeha Fazal ◽  
Munam Ali Shah ◽  
Hasan Ali Khattak ◽  
Hafiz Tayyab Rauf ◽  
Fadi Al-Turjman

2022 ◽  
pp. 41-67
Author(s):  
Vo Ngoc Phu ◽  
Vo Thi Ngoc Tran

Machine learning (ML), neural networks (NNs), evolutionary algorithms (EAs), fuzzy systems (FSs), and computer science more broadly have been prominent and significant for many years. They have been applied to many different areas and have contributed much to the development of large corporations and organizations, which in turn generate vast amounts of information and massive data sets (MDSs). These big data sets (BDSs) pose challenges for many commercial applications and research efforts, and many ML, NN, EA, FS, and computer-science algorithms have therefore been developed to handle them successfully. To support this process, this chapter surveys the NN algorithms applicable to large-scale data sets (LSDSs). Finally, it presents a novel NN model for BDSs in both a sequential environment (SE) and a distributed network environment (DNE).
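
As a concrete example of what a sequential environment implies for a neural model (the chapter's own models are not reproduced here), the sketch below streams mini-batches from chunked files so that only one batch is ever resident in memory, making the total data set size unbounded. The file layout and the single-layer logistic model are illustrative assumptions.

```python
import numpy as np

def stream_batches(path_pattern, n_chunks, batch=256):
    """Yield mini-batches from data stored in chunked .npz files (hypothetical layout)."""
    for i in range(n_chunks):
        chunk = np.load(path_pattern.format(i))   # one chunk fits in memory at a time
        X, y = chunk["X"], chunk["y"]
        for j in range(0, len(X), batch):
            yield X[j:j + batch], y[j:j + batch]

def train_sequential(path_pattern, n_chunks, dim, lr=0.01):
    """One pass of mini-batch SGD for a logistic model over the streamed data."""
    w = np.zeros(dim)
    for X, y in stream_batches(path_pattern, n_chunks):
        p = 1.0 / (1.0 + np.exp(-X @ w))          # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)          # gradient step on the current batch
    return w
```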


2022 ◽  
pp. 979-992
Author(s):  
Pavani Konagala

A large volume of data is stored electronically, so large that its total volume is difficult to measure. It comes from many sources: stock exchanges may generate terabytes of data every day, Facebook may require about a petabyte of storage, and internet archives may hold up to two petabytes of data. Managing such data with relational database management systems is very difficult, and with massive data, reading from and writing to disk takes more time, so the storage and analysis of this data has become a major problem. Big data provides the solution, specifying methods to store and analyze very large data sets. This chapter presents a brief study of big data techniques for analyzing such data, including a broad treatment of Hadoop's characteristics, the Hadoop architecture, the advantages of big data, and the big data ecosystem. Further, the chapter includes a comprehensive study of Apache Hive for querying U.S. government health-related and death data.
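
As an illustration of the kind of Hive analysis the chapter describes, a query over death records might be submitted from Python via PyHive. The table name, columns, and connection parameters below are hypothetical (the chapter's actual schema and cluster setup are not shown here), and the sketch assumes PyHive is installed and a HiveServer2 endpoint is reachable.

```python
from pyhive import hive  # assumes a running HiveServer2 instance

# Hypothetical table of U.S. death records; the real schema may differ.
QUERY = """
    SELECT cause_of_death, COUNT(*) AS n
    FROM deaths_us
    WHERE year = 2015
    GROUP BY cause_of_death
    ORDER BY n DESC
    LIMIT 10
"""

conn = hive.Connection(host="localhost", port=10000, database="health")
cursor = conn.cursor()
cursor.execute(QUERY)   # Hive compiles the query into jobs that scan data in HDFS
for cause, n in cursor.fetchall():
    print(cause, n)
```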


2021 ◽  
Vol 2021 ◽  
pp. 1-25
Author(s):  
Shuoben Bi ◽  
Ruizhuang Xu ◽  
Aili Liu ◽  
Luye Wang ◽  
Lei Wan

Because density-based clustering is sensitive to its input data, which limits its computing space and timeliness, a new method for mining taxi passenger hotspots is proposed based on a grid information-entropy clustering algorithm. This paper selects representative geographical areas of Nanjing and Beijing as study areas and uses information entropy and aggregation degree to analyze the distribution of passenger pickup points. The algorithm computes over a grid instead of the original trajectory data to mine taxi passenger hotspots. Comparative analysis of pickup-point data from Nanjing and Beijing shows that the experimental results are consistent with the cities' actual passenger hotspots, verifying the algorithm's effectiveness. The method overcomes the computing-space and timeliness limitations of density-based clustering, reduces the volume of data to be processed, and offers greater flexibility for processing and analyzing massive data. The results can provide an important scientific basis for urban traffic guidance and urban management.
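
The grid idea can be sketched in a few lines: bin pickup coordinates into regular cells, rank cells by count, and use the entropy of the count distribution to summarize how concentrated the pickups are. This is a simplified stand-in for the paper's entropy-and-aggregation-degree measures; the cell size and ranking rule are assumptions.

```python
import numpy as np

def grid_hotspots(points, cell=0.005, top_k=20):
    """Mine hotspot cells from an (N, 2) array of (lon, lat) pickup points.

    Returns the top-k cell corners, their pickup counts, and the entropy of
    the grid's count distribution (lower entropy = more concentrated pickups).
    """
    ij = np.floor(points / cell).astype(int)               # grid index per point
    cells, counts = np.unique(ij, axis=0, return_counts=True)
    p = counts / counts.sum()
    entropy = -np.sum(p * np.log2(p))                      # concentration summary
    order = np.argsort(counts)[::-1][:top_k]               # densest cells first
    return cells[order] * cell, counts[order], entropy
```

Because only per-cell counts are retained, the working set scales with the number of occupied cells rather than with the raw trajectory volume.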


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Mitra Sadat Lavasani ◽  
Nahid Raeisi Ardali ◽  
Rahmat Sotudeh-Gharebagh ◽  
Reza Zarghami ◽  
János Abonyi ◽  
...  

Abstract Big data is an expression for massive data sets, consisting of both structured and unstructured data, that are particularly difficult to store, analyze, and visualize. Big data analytics has the potential to help companies and organizations improve operations and to disclose hidden patterns and correlations, enabling faster and more intelligent decisions. This article provides useful information on this emerging and promising field for companies, industries, and researchers seeking a richer and deeper insight into its advancements. It first presents an overview of big data content, key characteristics, and related topics, then gives a systematic review of available big data techniques and analytics, and categorizes the available big data analytics tools and platforms. The article also discusses recent applications of big data in the chemical industries to deepen understanding and encourage its adoption in their engineering processes. Finally, by emphasizing the adoption of big data analytics in various areas of process engineering, it aims to provide a practical vision of big data.


2021 ◽  
Vol 258 (1) ◽  
pp. 1 ◽  
Author(s):  
Federica B. Bianco ◽  
Željko Ivezić ◽  
R. Lynne Jones ◽  
Melissa L. Graham ◽  
Phil Marshall ◽  
...  

Abstract Vera C. Rubin Observatory is a ground-based astronomical facility under construction, a joint project of the National Science Foundation and the U.S. Department of Energy, designed to conduct a multipurpose 10 yr optical survey of the Southern Hemisphere sky: the Legacy Survey of Space and Time. Significant flexibility in survey strategy remains within the constraints imposed by the core science goals of probing dark energy and dark matter, cataloging the solar system, exploring the transient optical sky, and mapping the Milky Way. The survey's massive data throughput will be transformational for many other astrophysics domains, and Rubin's data access policy sets the stage for a huge community of potential users. To ensure that the survey's science potential is maximized while serving as broad a community as possible, Rubin Observatory has involved the scientific community at large in setting and refining the details of the observing strategy. This paper details the motivation, history, and decision-making process of this strategy optimization, giving context to the science-driven proposals and recommendations for the survey strategy included in this Focus Issue.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Fangpeng Ming ◽  
Liang Tan ◽  
Xiaofan Cheng

Big data has been developing for nearly a decade, and the volume of information on the network is exploding. Faced with such complex and massive data, it is difficult for people to find the information they need quickly, and recommendation algorithms have become one of the important methods for addressing this information overload. In particular, the rise of the e-commerce industry has driven the development of recommendation algorithms. Traditional single recommendation algorithms often suffer from problems such as cold start, data sparsity, and long-tail items, whereas hybrid recommendation algorithms can avoid some of the drawbacks of any single algorithm. To address these problems, this paper proposes IA-CN, a hybrid recommendation algorithm based on deep learning that compensates for the shortcomings of a single collaborative model. The algorithm first uses an integration strategy to fuse user-based and item-based collaborative filtering and to generalize and classify the output results; improved deep-learning techniques then capture deeper, more abstract nonlinear interactions between users and items. Finally, experiments comparing the algorithm with a benchmark on the Amazon item-rating dataset show that IA-CN achieves better rating-prediction performance on the test set.
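
IA-CN itself is not reproduced here; the sketch below shows only the first stage of such a hybrid, blending user-based and item-based collaborative-filtering scores for a ratings matrix, with `alpha` as an assumed fusion weight. The deep-learning stage that IA-CN layers on top is omitted.

```python
import numpy as np

def cosine_sim(M):
    """Row-wise cosine similarity with zero-safe normalization."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    N = M / np.maximum(norms, 1e-12)
    return N @ N.T

def hybrid_predict(R, alpha=0.5):
    """Blend user-based and item-based CF scores for a ratings matrix R (users x items).

    Zeros in R mark unrated entries; `alpha` weights the two collaborative signals.
    """
    Su = cosine_sim(R)             # user-user similarity
    Si = cosine_sim(R.T)           # item-item similarity
    user_based = Su @ R / np.maximum(np.abs(Su).sum(1, keepdims=True), 1e-12)
    item_based = R @ Si / np.maximum(np.abs(Si).sum(0, keepdims=True), 1e-12)
    return alpha * user_based + (1 - alpha) * item_based
```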

