large datasets
Recently Published Documents


TOTAL DOCUMENTS: 1404 (five years: 554)
H-INDEX: 44 (five years: 9)

2022 · pp. 153575972110686
Author(s): Fernando Cendes, Carrie R. McDonald

Artificial intelligence (AI) is increasingly used in medical image analysis and has accelerated scientific discoveries across fields of medicine. In this review, we highlight how AI has been applied to neuroimaging in patients with epilepsy to enhance classification of clinical diagnosis, prediction of treatment outcomes, and the understanding of cognitive comorbidities. We outline the strengths and shortcomings of current AI research and the need for future studies using large datasets that test the reproducibility and generalizability of current findings, as well as studies that test the clinical utility of AI approaches.


2022 · Vol 21 (4) · pp. 346-363
Author(s): Hubert Anysz

The use of data mining and machine learning tools is becoming increasingly common. Their usefulness is most noticeable with large datasets, where the information sought, or new relationships, must be extracted from information noise. As these tools mature, they are also being applied to datasets with far fewer records, usually associated with specific phenomena. That specificity most often makes it impossible to increase the number of cases, although it can facilitate the search for dependences in the phenomena under study. The paper discusses the features of applying the selected tools to a small dataset. Methods of data preparation and of measuring tool performance are presented, taking into account the specifics of databases with a small number of records. The author proposes techniques that helped to break a deadlock in the calculations, i.e., a situation in which the results were much worse than expected. The small amount of analysed data made it necessary to apply methods that improve forecast accuracy and classification accuracy. This paper is not a review of popular machine learning and data mining methods; nevertheless, the collected and presented material should help the reader shorten the path to satisfactory results when using the described computational methods.
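One standard way to measure tool performance on a dataset with very few records, as discussed above, is leave-one-out cross-validation: every record serves as the test case exactly once, so no data is wasted on a fixed hold-out split. The sketch below is a generic illustration (not the paper's method), using a toy 1-nearest-neighbour classifier on made-up data:

```python
# Illustration only: leave-one-out cross-validation (LOOCV) on a tiny dataset.
# With few records, LOOCV gives a less noisy accuracy estimate than a single
# train/test split, because every record is tested exactly once.
from math import dist

def knn_predict(train, labels, x):
    """Classify x with the label of its nearest training point (1-NN)."""
    i = min(range(len(train)), key=lambda j: dist(train[j], x))
    return labels[i]

def loocv_accuracy(X, y):
    """Hold out each record in turn, train on the rest, test on the held-out one."""
    hits = 0
    for i in range(len(X)):
        train = X[:i] + X[i + 1:]
        labels = y[:i] + y[i + 1:]
        hits += knn_predict(train, labels, X[i]) == y[i]
    return hits / len(X)

# Toy dataset: two well-separated clusters, only 8 records in total.
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
     (2.0, 2.1), (2.2, 2.0), (2.1, 2.3), (1.9, 2.2)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
print(loocv_accuracy(X, y))  # → 1.0 (clusters are well separated)
```

The same loop works with any classifier in place of 1-NN; the point is the evaluation protocol, not the model.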


Information · 2022 · Vol 13 (1) · pp. 28
Author(s): Saïd Mahmoudi, Mohammed Amin Belarbi

Multimedia applications deal, in most cases, with extremely high volumes of multimedia data (2D and 3D images, sound, and video). Efficient algorithms are therefore needed to analyze and process these large datasets. On the other hand, multimedia management relies on an efficient representation of knowledge that allows effective data processing and retrieval. The main challenge in this era is to achieve clever and quick access to these huge datasets, so that the data can be reached in a reasonable time. In this context, large-scale image retrieval is a fundamental task. Many methods have been developed in the literature to enable fast and efficient navigation in large databases, combining the well-known content-based image retrieval (CBIR) methods with techniques that reduce computing time, such as dimensionality reduction and hashing. More recently, methods based on convolutional neural networks (CNNs) for feature extraction and image classification have become widely used. In this paper, we present a comprehensive review of recent multimedia retrieval methods and algorithms applied to large datasets of 2D/3D images and videos. This editorial paper discusses the main challenges of multimedia retrieval in the context of large databases.
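The hashing idea mentioned above can be sketched in a few lines. This is a generic SimHash-style illustration (not taken from the paper): each hyperplane contributes one bit of a binary code, so high-dimensional image features can be compared cheaply in Hamming space. The feature vectors and hyperplanes are made-up toy values; real systems draw the hyperplanes at random over CNN features:

```python
# Toy sketch of hashing-based image retrieval: hyperplane signatures compress
# feature vectors into short binary codes; retrieval compares codes in Hamming
# space instead of computing exact distances over full feature vectors.
DIM, BITS = 8, 6
planes = [  # fixed example hyperplanes (in practice drawn at random)
    [ 0.5, -1.2,  0.3,  0.8, -0.7,  0.2, -0.4,  1.1],
    [-0.9,  0.4,  1.0, -0.3,  0.6, -1.1,  0.7,  0.2],
    [ 1.3,  0.1, -0.6,  0.4,  0.9, -0.2,  0.5, -0.8],
    [-0.2,  0.7, -1.0,  0.6, -0.4,  0.9,  0.3, -0.5],
    [ 0.8, -0.3,  0.2, -0.9,  1.1,  0.4, -0.6,  0.7],
    [-0.6,  1.0,  0.5,  0.2, -0.8, -0.3,  0.9,  0.4],
]

def signature(vec):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(int(sum(p * v for p, v in zip(plane, vec)) >= 0)
                 for plane in planes)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

# Index: one binary code per (pretend) image feature vector.
db = {name: signature(vec) for name, vec in {
    "img_a": [1, 0, 0, 1, 0, 0, 1, 0],
    "img_b": [0.9, 0.1, 0, 1.1, 0, 0, 0.9, 0],  # near-duplicate of img_a
    "img_c": [0, 1, 1, 0, 1, 1, 0, 1],          # unrelated image
}.items()}

query = signature([1, 0, 0, 1, 0, 0, 1, 0.1])
best = min(db, key=lambda name: hamming(db[name], query))
print(best)  # → img_a (its code matches the query code exactly)
```

Only candidates with small Hamming distance need the expensive exact comparison, which is where the computing-time reduction comes from.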


Author(s): Gioele Ciaparrone, Leonardo Chiariglione, Roberto Tagliaferri

Face-based video retrieval (FBVR) is the task of retrieving videos that contain the same face shown in a query image. In this article, we present the first end-to-end FBVR pipeline able to operate on large datasets of unconstrained, multi-shot, multi-person videos. We adapt an existing audiovisual recognition dataset to the task of FBVR and use it to evaluate our proposed pipeline. We compare a number of deep learning models for shot detection, face detection, and face feature extraction as part of our pipeline on a validation dataset of more than 4000 videos. We obtain 97.25% mean average precision on an independent test set composed of more than 1000 videos. The pipeline extracts features from videos at ~7 times real-time speed, and it can answer a query over thousands of videos in less than 0.5 s.
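The final retrieval step of such a pipeline can be illustrated with a short sketch. This is an assumed, simplified reading (the names and toy embeddings below are hypothetical, not from the article): once face embeddings have been extracted per video offline, a query image's embedding is compared against the index by cosine similarity, and each video is ranked by its best-matching face:

```python
# Hypothetical sketch of the query stage of a face-based video retrieval
# pipeline: rank videos by the cosine similarity of their closest face
# embedding to the query embedding.
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    num = sum(x * y for x, y in zip(a, b))
    return num / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Toy index: each video holds the embeddings of faces detected in its shots.
video_index = {
    "vid_1": [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]],
    "vid_2": [[0.1, 0.1, 0.9]],
    "vid_3": [[0.8, 0.2, 0.1], [0.0, 0.9, 0.3]],
}

def query_videos(q, index):
    """Score each video by its best-matching face, then sort by score."""
    scores = {v: max(cosine(q, f) for f in faces) for v, faces in index.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(query_videos([1.0, 0.0, 0.0], video_index))  # → ['vid_1', 'vid_3', 'vid_2']
```

Because the per-video work at query time is only a similarity scan over precomputed embeddings, sub-second queries over thousands of videos are plausible.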


2022 · pp. 104-124
Author(s): Hugo Garcia Tonioli Defendi, Vanessa de Arruda Jorge, Ana Paula da Silva Carvalho, Luciana da Silva Madeira, Suzana Borschiver

The process of knowledge construction, widely discussed in the literature, follows a common structure that encompasses the transformation of data into information and then into knowledge, converging social, technological, organizational, and strategic aspects. The advancement of information technologies and growing global research efforts in the health field have dynamically generated large datasets, offering potential innovative solutions to health problems while posing important challenges in the selection and interpretation of useful information and possibilities. The COVID-19 pandemic has intensified this data generation: global efforts and cooperation have promoted a level of scientific production aimed at overcoming the pandemic that had never been experienced before. In this context, the search for an effective and safe vaccine that can prevent the spread of the virus has become a common goal of societies, governments, institutions, and companies. These collaborative efforts have helped speed up the development of these vaccines at a pace unprecedented in history.


2022 · pp. 1330-1345
Author(s): John G. McNutt, Lauri Goldkind

Governments have long dealt with the issue of engaging their constituents in the process of governance, and e-participation efforts have been part of this effort. Almost all of these efforts have been controlled by government. Civic technology and data4good, fueled by the movement toward open government and open civic data, represent a sea change in this relationship. Data for good, in particular, uses volunteer data scientists to address social problems with advanced analytics and large datasets; working through a variety of organizations, they apply the power of data to these problems. This chapter explores these possibilities and outlines a set of scenarios that might be possible. The chapter has four parts. The first part looks at citizen participation in broad brush, with special attention to e-participation. The next two sections look at civic technology and data4good. The final section looks at the possible changes that these two embryonic movements can bring to the structure of participation in government and to the nature of public management.


2021 · Vol 46 (4) · pp. 1-45
Author(s): Chenhao Ma, Yixiang Fang, Reynold Cheng, Laks V. S. Lakshmanan, Wenjie Zhang, ...

Given a directed graph G, the directed densest subgraph (DDS) problem asks for a subgraph of G whose density is the highest among all subgraphs of G. The DDS problem is fundamental to a wide range of applications, such as fraud detection, community mining, and graph compression. However, existing DDS solutions suffer from efficiency and scalability problems: on a 3,000-edge graph, one of the best exact algorithms takes three days to complete. In this article, we develop an efficient and scalable DDS solution. We introduce the notion of the [x, y]-core, a dense subgraph of G, and show that the densest subgraph can be accurately located through the [x, y]-core with theoretical guarantees. Based on the [x, y]-core, we develop exact and approximation algorithms. We further study the problems of maintaining the DDS over dynamic directed graphs and finding the weighted DDS on weighted directed graphs, and we develop efficient non-trivial algorithms for both by extending our DDS algorithms. We have performed an extensive evaluation of our approaches on 15 real large datasets. The results show that our proposed solutions are up to six orders of magnitude faster than the state-of-the-art.
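To make the objective concrete: in the standard formulation of directed densest subgraph, one picks two vertex sets S and T, and the density is |E(S, T)| / sqrt(|S| * |T|), where E(S, T) is the set of edges from S to T. The brute-force sketch below (a toy illustration, not the article's algorithm) enumerates all pairs on a five-edge graph; its exponential cost is exactly why pruning structures like the [x, y]-core matter:

```python
# Toy illustration of the directed densest subgraph objective:
# density(S, T) = |E(S, T)| / sqrt(|S| * |T|), maximized over all
# non-empty vertex subsets S and T (S and T may overlap).
from itertools import combinations
from math import sqrt

edges = {(0, 1), (0, 2), (1, 2), (2, 1), (3, 0)}
nodes = {0, 1, 2, 3}

def density(S, T):
    """Count edges going from S into T, normalized by sqrt(|S| * |T|)."""
    e = sum((u, v) in edges for u in S for v in T)
    return e / sqrt(len(S) * len(T))

def densest(nodes):
    """Brute force over all (S, T) pairs — feasible only for toy graphs."""
    subsets = [set(c) for r in range(1, len(nodes) + 1)
               for c in combinations(nodes, r)]
    return max(((S, T) for S in subsets for T in subsets),
               key=lambda p: density(*p))

S, T = densest(nodes)
print(S, T, density(S, T))  # → {0, 1, 2} {1, 2} ≈ 1.633 (4 edges / sqrt(6))
```

With n vertices there are (2^n - 1)^2 candidate pairs, so exact search collapses immediately at scale; that gap is what core-based pruning and approximation algorithms address.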

