SoftRec: Multi-Relationship Fused Software Developer Recommendation

Xinqiang Xie; Bin Wang; Xiaochun Yang

doi:10.3390/app10124333

Applied Sciences ◽

10.3390/app10124333 ◽

2020 ◽

Vol 10 (12) ◽

pp. 4333 ◽

Cited By ~ 2

Author(s):

Xinqiang Xie ◽

Bin Wang ◽

Xiaochun Yang

Keyword(s):

Software Development ◽

User Study ◽

State Of The Art ◽

Software Developer ◽

Implicit Information ◽

Software Company ◽

Joint Matrix Factorization ◽

Art Works ◽

Developer Recommendation ◽

Real World Datasets

Collaboration efficiency is of primary importance in software development. It is widely recognized that choosing suitable developers is an efficient and effective practice for improving the efficiency of software development and collaboration. Recommending suitable developers is complex and time-consuming due to the difficulty of learning developers’ expertise and willingness. Existing works focus on learning developers’ expertise and interactions from their explicit historical information and matching them to specific tasks. However, such procedures may suffer from low accuracy because they ignore implicit information, such as (1) developer–developer collaboration relationships, (2) developer–task implicit interaction relationships, and (3) task–task association relationships. To that end, this paper proposes a multi-relationship fused approach for software developer recommendation (termed SoftRec). First, in addition to explicit developer–task interactions, it considers multivariate implicit relationships, including the three types mentioned above. Second, it integrates these relationships based on joint matrix factorization and generates predictions using a deep neural network architecture. Furthermore, we propose a fast update method to address the cold-start issue by making online recommendations for new developers and new tasks. Extensive experiments are conducted on two real-world datasets, and a user study is conducted in a well-known software company. The results demonstrate that SoftRec outperforms four state-of-the-art works.
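
The idea of fusing relationships through joint matrix factorization can be illustrated with a minimal sketch. This is not SoftRec's actual model: it assumes a plain squared-error objective, omits the task–task relationships and the deep neural network stage, and all names (`joint_factorize`, `alpha`, the toy matrices) are hypothetical. Developer factors `U` are shared between a developer–task matrix `R` and a developer–developer matrix `C`, so both relationships shape the same developer representation.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def joint_loss(R, C, U, V, Z, alpha):
    """Squared error on developer-task entries plus weighted error on developer-developer entries."""
    loss = sum((r - dot(U[d], V[t])) ** 2 for (d, t), r in R.items())
    loss += alpha * sum((c - dot(U[d], Z[e])) ** 2 for (d, e), c in C.items())
    return loss

def joint_factorize(R, C, n_dev, n_task, rank=2, alpha=0.5,
                    lr=0.05, reg=0.01, epochs=200, seed=0):
    """Fit R ~ U V^T and C ~ U Z^T by SGD, sharing the developer factors U."""
    rng = random.Random(seed)

    def init(n):
        return [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n)]

    U, V, Z = init(n_dev), init(n_task), init(n_dev)
    loss_before = joint_loss(R, C, U, V, Z, alpha)
    for _ in range(epochs):
        for (d, t), r in R.items():          # developer-task interactions
            err = r - dot(U[d], V[t])
            for k in range(rank):
                u, v = U[d][k], V[t][k]
                U[d][k] += lr * (err * v - reg * u)
                V[t][k] += lr * (err * u - reg * v)
        for (d, e), c in C.items():          # developer-developer collaborations
            err = alpha * (c - dot(U[d], Z[e]))
            for k in range(rank):
                u, z = U[d][k], Z[e][k]
                U[d][k] += lr * (err * z - reg * u)
                Z[e][k] += lr * (err * u - reg * z)
    loss_after = joint_loss(R, C, U, V, Z, alpha)
    return U, V, Z, loss_before, loss_after

# Toy data (hypothetical): 3 developers, 3 tasks, sparse observed entries.
R = {(0, 0): 1.0, (0, 1): 1.0, (1, 1): 1.0, (2, 2): 1.0}
C = {(0, 1): 1.0, (1, 0): 1.0}
U, V, Z, initial_loss, final_loss = joint_factorize(R, C, n_dev=3, n_task=3)
```

After training, `dot(U[d], V[t])` gives a score for recommending developer `d` for task `t`, including pairs that were never observed in `R`.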

Recent trends in Component Based software development and Efficiency analysis of Semantic search based component retrieval Technique

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit183836 ◽

2018 ◽

pp. 105-113 ◽

Cited By ~ 1

Author(s):

Vishnu Sharma ◽

Vijay Singh Rathore

Keyword(s):

Software Development ◽

Efficiency Analysis ◽

Software Components ◽

High Demand ◽

Component Retrieval ◽

Software Developer ◽

Software Libraries ◽

Retrieval Technique ◽

Executable File ◽

Recent Trends

Most software development today uses preexisting software components. This approach provides many benefits over traditional development. Most software companies maintain their own domain-based software libraries, where components reside in the form of modules, code, executable files, documentation, and test plans, which may be used as is or with minor changes. Due to shrinking timelines and the high demand for software development, it is necessary to use pretested software components to ensure high functionality in the software developed. Software components can be used easily and without worrying about errors and bugs because they are developed under expert supervision and are well tested. All that remains is to embed these components in the project. In this paper, a survey was conducted among 112 software developers, testers, and freelancers. The survey identified several issues in component-based software development (CBSD). An efficient repository along with a component search engine was developed. All the component retrieval techniques were evaluated and compared using precision and recall.
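
Precision and recall for a single retrieval query can be computed as in the sketch below. It uses the standard definitions, not the paper's evaluation code; the function name and the component identifiers are hypothetical and serve only as an illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall

# Hypothetical component identifiers returned by a search engine for one query.
retrieved = ["sort_util", "csv_reader", "http_client", "json_parser"]
# Hypothetical ground-truth components that actually satisfy the query.
relevant = ["csv_reader", "json_parser", "xml_parser"]

precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four retrieved components are relevant, and two of the three relevant components were found, so precision is 2/4 and recall is 2/3.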

G-Tric: generating three-way synthetic datasets with triclustering solutions

BMC Bioinformatics ◽

10.1186/s12859-020-03925-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

João Lobo ◽

Rui Henriques ◽

Sara C. Madeira

Keyword(s):

State Of The Art ◽

Synthetic Data ◽

Ground Truth ◽

Real Data ◽

Three Dimensions ◽

Additional Advantage ◽

Urban Dynamics ◽

Data Generator ◽

Real World Datasets ◽

Synthetic Datasets

Background: Three-way data started to gain popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions along time, urban dynamics, or complex geophysical phenomena. Triclustering, subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with state-of-the-art algorithms is paramount. These comparisons are usually performed using real data, without a known ground truth, thus limiting the assessments. In this context, we propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social data domains, with the additional advantage of further providing the ground truth (triclustering solution) as output. Results: G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions: Triclustering evaluation using G-Tric provides the possibility to combine both intrinsic and extrinsic metrics to compare solutions that produce more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric’s potential to advance the triclustering state of the art by easing the process of evaluating the quality of new triclustering approaches.

Bringing Things Closer: Enhancing Low-Vision Interaction Experience with Office Productivity Applications

Proceedings of the ACM on Human-Computer Interaction ◽

10.1145/3457144 ◽

2021 ◽

Vol 5 (EICS) ◽

pp. 1-18

Author(s):

Hae-Na Lee ◽

Vikas Ashok ◽

IV Ramakrishnan

Keyword(s):

Assistive Technology ◽

User Study ◽

State Of The Art ◽

Low Vision ◽

Spatial Separation ◽

Usability Study ◽

Presentation Software ◽

Screen Magnifier ◽

Word Processors ◽

Grid Layout

Many people with low vision rely on screen-magnifier assistive technology to interact with productivity applications such as word processors, spreadsheets, and presentation software. Despite the importance of these applications, little is known about their usability with respect to low-vision screen-magnifier users. To fill this knowledge gap, we conducted a usability study with 10 low-vision participants having different eye conditions. In this study, we observed that most usability issues were due to the high spatial separation between the main edit area and the command ribbons on the screen, as well as the wide-span grid layout of the command ribbons; these two GUI aspects did not gel with the screen-magnifier interface due to the lack of instantaneous WYSIWYG (What You See Is What You Get) feedback after applying commands, given that the participants could only view a portion of the screen at any time. Informed by the study findings, we developed MagPro, an augmentation to productivity applications, which significantly improves usability by not only bringing application commands as close as possible to the user's current viewport focus, but also enabling easy and straightforward exploration of these commands using simple mouse actions. A user study with nine participants revealed that MagPro significantly reduced the time and workload required for routine command-access tasks, compared to using the state-of-the-art screen magnifier.

TransET: Knowledge Graph Embedding with Entity Types

Electronics ◽

10.3390/electronics10121407 ◽

2021 ◽

Vol 10 (12) ◽

pp. 1407

Author(s):

Peng Wang ◽

Jing Zhou ◽

Yuzhang Liu ◽

Xingchen Zhou

Keyword(s):

Link Prediction ◽

State Of The Art ◽

Score Function ◽

Graph Embedding ◽

Vector Spaces ◽

Knowledge Graph ◽

Semantic Features ◽

Knowledge Graphs ◽

Real World Datasets ◽

Low Dimensional

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods only focus on triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circle convolution based on the embeddings of entities and entity types is used to map the head entity and tail entity to type-specific representations, and a translation-based score function is then used to learn the representations of triples. We evaluated our model on real-world datasets with two benchmark tasks, link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
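
A translation-based score function can be sketched as follows. This is the generic TransE-style score, which treats a relation as a translation from head to tail, and is given here as an assumption about what "translation-based" means; it does not reproduce TransET's type-specific mapping via circle convolution. The function name and the toy vectors are hypothetical.

```python
import math

def translation_score(head, relation, tail):
    """Generic translation-based score: negative L2 norm of (head + relation - tail).

    A score closer to zero means the triple (head, relation, tail) is more plausible.
    """
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

# Toy 2-dimensional embeddings (hypothetical values).
head = [1.0, 0.0]
relation = [0.0, 1.0]
good_tail = [1.0, 1.0]   # exactly head + relation
bad_tail = [4.0, 5.0]    # far from head + relation

score_good = translation_score(head, relation, good_tail)
score_bad = translation_score(head, relation, bad_tail)
```

For link prediction, candidate tails are ranked by this score, so `good_tail` would rank above `bad_tail`.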

Sampo-UI: A full stack JavaScript framework for developing semantic portal user interfaces

Semantic Web ◽

10.3233/sw-210428 ◽

2021 ◽

pp. 1-16

Author(s):

Esko Ikkala ◽

Eero Hyvönen ◽

Heikki Rantala ◽

Mikko Koho

Keyword(s):

User Interfaces ◽

Linked Data ◽

State Of The Art ◽

Software Framework ◽

End User ◽

Faceted Search ◽

Software Developer ◽

Current State ◽

Knowledge Graphs ◽

User Friendly

This paper presents a new software framework, Sampo-UI, for developing user interfaces for semantic portals. The goal is to provide the end-user with multiple application perspectives on Linked Data knowledge graphs, and a two-step usage cycle based on faceted search combined with ready-to-use tooling for data analysis. For the software developer, the Sampo-UI framework makes it possible to create highly customizable, user-friendly, and responsive user interfaces using current state-of-the-art JavaScript libraries and data from SPARQL endpoints, while saving substantial coding effort. Sampo-UI is published on GitHub under the open MIT License and has been utilized in several internal and external projects. The framework has been used thus far in creating six published and five forthcoming portals, mostly related to the Cultural Heritage domain, that have had tens of thousands of end-users on the Web.

Density Guarantee on Finding Multiple Subgraphs and Subtensors

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3446668 ◽

2021 ◽

Vol 15 (5) ◽

pp. 1-32

Author(s):

Quang-huy Duong ◽

Heri Ramampiaro ◽

Kjetil Nørvåg ◽

Thu-lan Dam

Keyword(s):

Lower Bound ◽

State Of The Art ◽

The State ◽

The Other ◽

Exact Methods ◽

Practical Solution ◽

Novel Approach ◽

Wide Range ◽

Real World Datasets ◽

Tensor Data

Dense subregion (subgraph and subtensor) detection is a well-studied area with a wide range of applications, and numerous efficient approaches and algorithms have been proposed. Approximation approaches are commonly used for detecting dense subregions due to the complexity of the exact methods. Existing algorithms are generally efficient for dense subtensor and subgraph detection, and can perform well in many applications. However, most of the existing works utilize the state-of-the-art greedy 2-approximation algorithm, which provides solutions with only a loose theoretical density guarantee. The main drawback of most of these algorithms is that they can estimate only one subtensor, or subgraph, at a time, with a low guarantee on its density. While some methods can, on the other hand, estimate multiple subtensors, they can give a guarantee on the density with respect to the input tensor for the first estimated subtensor only. We address these drawbacks by providing both theoretical and practical solutions for estimating multiple dense subtensors in tensor data and giving a higher lower bound on the density. In particular, we guarantee and prove a higher lower bound on the density of the estimated subgraphs and subtensors. We also propose a novel approach to show that there are multiple dense subtensors with a guarantee on their density that is greater than the lower bound used in the state-of-the-art algorithms. We evaluate our approach with extensive experiments on several real-world datasets, which demonstrate its efficiency and feasibility.
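
For the graph case, a greedy 2-approximation is commonly realized as a peeling procedure: repeatedly remove a minimum-degree vertex and remember the densest intermediate subgraph, where density is the number of edges divided by the number of vertices. The sketch below assumes that this classic peeling scheme is the baseline the abstract refers to; it is not the paper's multi-subtensor method. It handles simple undirected graphs only, and the function name and toy graph are hypothetical.

```python
def greedy_densest_subgraph(edges):
    """Greedy peeling for the densest-subgraph problem (density = |E| / |V|).

    Assumes a simple undirected graph (no self-loops). This simple version
    scans for the minimum-degree vertex each round, so it runs in O(n^2).
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    num_edges = sum(len(nbrs) for nbrs in adjacency.values()) // 2

    best_density = 0.0
    best_vertices = set()
    while adjacency:
        density = num_edges / len(adjacency)
        if density > best_density:
            best_density = density
            best_vertices = set(adjacency)
        # Peel one minimum-degree vertex (ties broken by name for determinism).
        victim = min(adjacency, key=lambda x: (len(adjacency[x]), str(x)))
        for nbr in adjacency.pop(victim):
            adjacency[nbr].discard(victim)
            num_edges -= 1
    return best_vertices, best_density

# Toy graph (hypothetical): a 4-clique on a, b, c, d with a tail d - e - f.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("d", "e"), ("e", "f")]
best_vertices, best_density = greedy_densest_subgraph(edges)
```

On this toy graph the peeling removes the tail vertices first and returns the 4-clique, whose density is 6 edges over 4 vertices.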

Chi-Squared Distance Metric Learning for Histogram Data

Mathematical Problems in Engineering ◽

10.1155/2015/352849 ◽

2015 ◽

Vol 2015 ◽

pp. 1-12 ◽

Cited By ~ 2

Author(s):

Wei Yang ◽

Luhui Xu ◽

Xiaopan Chen ◽

Fengbin Zheng ◽

Yang Liu

Keyword(s):

Nearest Neighbor ◽

State Of The Art ◽

Metric Learning ◽

Nearest Neighbors ◽

Distance Metric Learning ◽

Distance Metric ◽

Projected Gradient Method ◽

Proper Distance ◽

Chi Squared ◽

Real World Datasets

Learning a proper distance metric for histogram data plays a crucial role in many computer vision tasks. The chi-squared distance is a nonlinear metric and is widely used to compare histograms. In this paper, we show how to learn a general form of chi-squared distance based on the nearest neighbor model. In our method, the margin of a sample is first defined with respect to the nearest hits (nearest neighbors from the same class) and the nearest misses (nearest neighbors from different classes), and then the simplex-preserving linear transformation is trained by maximizing the margin while minimizing the distance between each sample and its nearest hits. With the iterative projected gradient method for optimization, we naturally introduce the ℓ2,1-norm regularization into the proposed method for sparse metric learning. Comparative studies with the state-of-the-art approaches on five real-world datasets verify the effectiveness of the proposed method.
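
The standard chi-squared distance between two histograms can be computed as in the sketch below. It uses the common form with a factor of 1/2 and shows only the fixed, unlearned distance, not the general learned form proposed in the paper; the function name and the toy histograms are hypothetical.

```python
def chi_squared_distance(p, q):
    """Chi-squared distance: 0.5 * sum((p_i - q_i)^2 / (p_i + q_i)).

    Bins where both histograms are zero contribute nothing and are skipped
    to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(p, q):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total

# Toy normalized histograms (hypothetical values).
p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]

distance = chi_squared_distance(p, q)
self_distance = chi_squared_distance(p, p)
```

Because each squared difference is divided by the bin mass, a given difference counts for more in sparsely populated bins than in heavily populated ones, which is why this distance suits histogram comparison.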

Integrating Approaches in Software Development: A Case Analysis in a Small Software Company

Communications in Computer and Information Science - Systems, Software and Services Process Improvement ◽

10.1007/978-3-030-56441-4_7 ◽

2020 ◽

pp. 95-106

Author(s):

Mary Sánchez-Gordón ◽

Ricardo Colomo-Palacios ◽

Alex Sánchez ◽

Sandra Sanchez-Gordon

Keyword(s):

Software Development ◽

Case Analysis ◽

Software Company

SetRank: A Setwise Bayesian Approach for Collaborative Ranking from Implicit Feedback

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6077 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6127-6136

Author(s):

Chao Wang ◽

Hengshu Zhu ◽

Chen Zhu ◽

Chuan Qin ◽

Hui Xiong

Keyword(s):

Bayesian Approach ◽

Posterior Probability ◽

State Of The Art ◽

User Preferences ◽

Implicit Feedback ◽

Pairwise Preference ◽

Entire List ◽

Collaborative Ranking ◽

Real World Datasets ◽

Made In

The recent development of online recommender systems has focused on collaborative ranking from implicit feedback, such as user clicks and purchases. Unlike explicit ratings, which reflect graded user preferences, implicit feedback only generates positive and unobserved labels. While considerable efforts have been made in this direction, the well-known pairwise and listwise approaches are still limited by various challenges. Specifically, for the pairwise approaches, the assumption of independent pairwise preference does not always hold in practice. Also, the listwise approaches cannot efficiently accommodate “ties” due to the precondition of the entire list permutation. To this end, in this paper, we propose a novel setwise Bayesian approach for collaborative ranking, namely SetRank, to inherently accommodate the characteristics of implicit feedback in recommender systems. Specifically, SetRank aims at maximizing the posterior probability of novel setwise preference comparisons and can be implemented with matrix factorization and neural networks. Meanwhile, we also present the theoretical analysis of SetRank to show that the bound of excess risk can be proportional to √M/N, where M and N are the numbers of items and users, respectively. Finally, extensive experiments on four real-world datasets clearly validate the superiority of SetRank compared with various state-of-the-art baselines.

Sentence Generation for Entity Description with Content-Plan Attention

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6439 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9057-9064

Author(s):

Bayu Trisedya ◽

Jianzhong Qi ◽

Rui Zhang

Keyword(s):

State Of The Art ◽

Neural Models ◽

Time Step ◽

Two Stage ◽

Sentence Generation ◽

Neural Data ◽

Attention Model ◽

Linear Sequence ◽

Proper Order ◽

Real World Datasets

We study neural data-to-text generation. Specifically, we consider a target entity that is associated with a set of attributes. We aim to generate a sentence to describe the target entity. Previous studies use encoder-decoder frameworks where the encoder treats the input as a linear sequence and uses an LSTM to encode the sequence. However, linearizing a set of attributes may not yield the proper order of the attributes, and hence leads the encoder to produce an improper context to generate a description. To handle disordered input, recent studies propose two-stage neural models that use pointer networks to generate a content-plan (i.e., content-planner) and use the content-plan as input for an encoder-decoder model (i.e., text generator). However, in two-stage models, the content-planner may yield an incomplete content-plan, due to missing one or more salient attributes in the generated content-plan. This will in turn cause the text generator to generate an incomplete description. To address these problems, we propose a novel attention model that exploits the content-plan to highlight salient attributes in a proper order. The challenge of integrating a content-plan in the attention model of an encoder-decoder framework is to align the content-plan and the generated description. We handle this problem by devising a coverage mechanism to track the extent to which the content-plan is exposed in the previous decoding time-step, which helps our proposed attention model select the attributes to be mentioned in the description in a proper order. Experimental results show that our model outperforms state-of-the-art baselines by up to 3% and 5% in terms of BLEU score on two real-world datasets, respectively.
