Big data and portfolio optimization: A novel approach integrating DEA with multiple data sources

Omega ◽  
2021 ◽  
pp. 102479
Author(s):  
Zhongbao Zhou ◽  
Meng Gao ◽  
Helu Xiao ◽  
Rui Wang ◽  
Wenbin Liu

2019 ◽  
Vol 253 ◽  
pp. 403-411 ◽  
Author(s):  
YuJie Ben ◽  
FuJun Ma ◽  
Hao Wang ◽  
Muhammad Azher Hassan ◽  
Romanenko Yevheniia ◽  
...  

IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 31269-31280 ◽  
Author(s):  
Busik Jang ◽  
Sangdon Park ◽  
Joohyung Lee ◽  
Sang-Geun Hahn

2019 ◽  
Vol 36 (4) ◽  
Author(s):  
José F. Torres ◽  
Alicia Troncoso ◽  
Irena Koprinska ◽  
Zheng Wang ◽  
Francisco Martínez‐Álvarez

2021 ◽  
pp. 1-15
Author(s):  
Ali Reza Honarvar ◽  
Ashkan Sami

At present, air quality in densely populated urban areas is recognized as an environmental crisis, and air pollution threatens the sustainability of cities. Air quality data are essential for controlling air pollution and protecting people from its hazards. However, constructing and maintaining air quality monitoring infrastructure is very expensive, and readings recorded at one point cannot be generalized even a few kilometers away. Some insights come only from the integration of multiple data sources and can never be achieved through independent single-source processing. Urban organizations in each city independently produce and record data relevant to their own goals and objectives, creating separate data silos within a single urban system. These data vary in model and structure, and their integration provides an opportunity to discover knowledge useful for urban planning and decision making. This paper aims to show the generality of our previous research, which proposed a novel model that integrates urban big data resources to predict Particulate Matter (PM), the main indicator of air quality, in city regions where air quality sensors are unavailable, by extending the model and experiments with various configurations for different smart-city settings. This work extends the evaluation scenarios of the model with an extended dataset from the city of Aarhus, Denmark, and compares the model's performance against several specified baselines. We also detail how the heterogeneity of multiple data sources is removed in the Multiple Data Set Aggregator & Heterogeneity Remover (MDA&HR) and how the Train Data Splitter (TDS) component of the model is improved by focusing on finding more similar air quality patterns. The acceptable accuracy of the results demonstrates the generality of the model.
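The two components named above can be illustrated with a minimal sketch. All names here (`aggregate_sources`, `split_by_pattern`) are hypothetical stand-ins, not the paper's actual implementation: the first merges records from heterogeneous sources onto a common (timestamp, region) key, in the spirit of an MDA&HR-style aggregator; the second picks training data from the region whose average PM level most resembles the target region's, a crude proxy for a TDS that looks for similar air-quality patterns.

```python
import statistics

def aggregate_sources(sources):
    """Merge records from heterogeneous urban data sources into one
    flat record list keyed on (timestamp, region)."""
    merged = {}
    for source in sources:
        for rec in source:
            key = (rec["timestamp"], rec["region"])
            merged.setdefault(key, {}).update(
                {k: v for k, v in rec.items() if k not in ("timestamp", "region")}
            )
    return [
        dict(timestamp=t, region=r, **vals)
        for (t, r), vals in sorted(merged.items())
    ]

def split_by_pattern(records, target_region, key="pm"):
    """Choose the region whose mean PM level is closest to the target
    region's, as a simple notion of 'similar air-quality pattern'."""
    by_region = {}
    for rec in records:
        by_region.setdefault(rec["region"], []).append(rec[key])
    target_mean = statistics.mean(by_region[target_region])
    others = {r: statistics.mean(v) for r, v in by_region.items() if r != target_region}
    return min(others, key=lambda r: abs(others[r] - target_mean))
```

The real model predicts PM for sensorless regions; this sketch only shows the data-shaping steps that precede such a predictor.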


Author(s):  
Nayyer Masood ◽  
Gul Jabeen

Schema merging is the process of integrating multiple data sources into a GCS (Global Conceptual Schema). It is pivotal to various application domains, such as data warehousing and multi-databases. Schema merging requires the identification of corresponding elements, which is done through a schema matching process in which the data sources are compared with each other to identify corresponding elements across them. In this way, for a given set of data sources and the correspondences between them, different possibilities for creating the GCS arise. In applications like multi-databases and data warehousing, new data sources keep joining in, and schema merging approaches usually expand GCS relations either horizontally or vertically as they do. Such expansions produce an unbalanced GCS that either yields too many NULL values in response to global queries or, because of too many joins, causes poor query processing. In this paper, a novel approach, the TuSMe (Tuned Schema Merging) technique, is introduced to overcome this issue by developing a balanced GCS that controls both the vertical and horizontal expansion of GCS relations. The approach employs a weighting mechanism in which weights are assigned to the individual attributes of the GCS. These weights reflect the connectedness of GCS attributes with respect to the attributes of the underlying data sources, and combining them allows the overall strength of the GCS to be scrutinized. A prototype implementation of TuSMe shows significant improvement over other contemporary state-of-the-art approaches.
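The weighting mechanism can be sketched as follows. This is an illustrative interpretation, not TuSMe's actual formula: here the weight of a GCS attribute is simply the fraction of source schemas that contain a corresponding attribute (its "connectedness"), and the overall GCS strength is the mean of those weights.

```python
def attribute_weights(gcs_attrs, source_schemas):
    """Weight of each GCS attribute = share of source schemas that
    provide a corresponding attribute."""
    n = len(source_schemas)
    return {a: sum(a in s for s in source_schemas) / n for a in gcs_attrs}

def gcs_strength(weights):
    """Overall strength of the GCS: mean attribute weight. A low value
    signals many sparsely supported attributes (NULL-heavy global
    queries); balancing expansion aims to keep it high."""
    return sum(weights.values()) / len(weights)
```

An attribute supported by every source gets weight 1.0; an attribute found in only one of many sources gets a low weight, flagging it as a candidate to fold differently during merging.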


Author(s):  
Lijing Wang ◽  
Aniruddha Adiga ◽  
Srinivasan Venkatramanan ◽  
Jiangzhuo Chen ◽  
Bryan Lewis ◽  
...  

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Jin Chen ◽  
Tianyuan Chen ◽  
Yifei Song ◽  
Bin Hao ◽  
Ling Ma

Abstract
Prior literature emphasizes the distinct roles of differently affiliated venture capitalists (VCs) in nurturing innovation and entrepreneurship. Although China has become the second largest VC market in the world, the unavailability of high-quality datasets on VC affiliation in China’s market has hindered such research efforts. To fill this important gap, we compiled a new panel dataset of VC affiliation in China’s market from multiple data sources. Specifically, we drew on a list of 6,553 VCs that invested in China between 2000 and 2016 from the CVSource database, collected each VC’s shareholder information from public sources, and developed a multi-stage procedure to label each VC as one of the following types: GVC (public agency-affiliated or state-owned enterprise-affiliated), CVC (corporate VC), IVC (independent VC), BVC (bank-affiliated VC), FVC (financial/non-bank-affiliated VC), UVC (university endowment/spin-out unit), and PenVC (pension-affiliated VC). We also denote whether a VC has a foreign background. This dataset helps researchers conduct more nuanced investigations into the investment behaviors of different VCs and their distinct impacts on innovation and entrepreneurship in China’s context.
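A multi-stage labeling procedure of this kind can be sketched as an ordered rule cascade. The rule order, field names, and keywords below are invented for illustration and are not the dataset's actual rules; the one property taken from the abstract is that a VC matching no affiliation rule falls through to IVC (independent VC).

```python
# Hypothetical stages, applied in order; earlier rules take precedence.
STAGES = [
    ("GVC", lambda s: s["state_owned"] or s["public_agency"]),
    ("BVC", lambda s: s["shareholder_type"] == "bank"),
    ("FVC", lambda s: s["shareholder_type"] == "non_bank_financial"),
    ("CVC", lambda s: s["shareholder_type"] == "corporation"),
    ("UVC", lambda s: s["shareholder_type"] == "university"),
    ("PenVC", lambda s: s["shareholder_type"] == "pension"),
]

def label_vc(shareholder_info):
    """Run the stages in order; anything unmatched is labeled IVC."""
    for label, rule in STAGES:
        if rule(shareholder_info):
            return label
    return "IVC"
```

Ordering matters in such cascades: placing the state-ownership check first means a state-owned corporate shareholder is labeled GVC rather than CVC.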

