Large Scale Graph Mining with MapReduce

Social Network Mining, Analysis, and Research Trends ◽

10.4018/978-1-61350-513-7.ch005 ◽

2011 ◽

pp. 66-78 ◽

Cited By ~ 1

Author(s):

Charalampos E. Tsourakakis

Keyword(s):

Survey Research ◽

Present State ◽

Graph Mining ◽

Large Scale ◽

State Of The Art ◽

Source Code ◽

Research Work

In this chapter, the authors present state-of-the-art work on large-scale graph mining using MapReduce. They survey research on an important graph mining problem: estimating the diameter of a graph and the eccentricities/radii of its vertices. Using the algorithm presented in the chapter, the authors are able to mine graphs with billions of edges and thus extract surprising patterns. The source code is publicly available at http://www.cs.cmu.edu/~pegasus/.
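For readers unfamiliar with the quantities named in this abstract, the following is a minimal sketch of how eccentricities and the diameter are defined, computed exactly with breadth-first search on a tiny in-memory graph. It is not the chapter's MapReduce algorithm, which estimates these values at a scale where exact BFS from every vertex is impractical. The function name `eccentricities` and the example graph are hypothetical, and the graph is assumed to be connected, undirected, and unweighted.

```python
from collections import deque


def eccentricities(adj):
    """Return {vertex: eccentricity} for a connected, unweighted graph.

    `adj` maps each vertex to a list of its neighbours (hypothetical format).
    The eccentricity of a vertex is its largest shortest-path distance
    to any other vertex.
    """
    ecc = {}
    for source in adj:
        # Plain BFS from `source`, recording hop distances.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        ecc[source] = max(dist.values())
    return ecc


# Illustrative path graph 0 - 1 - 2 - 3 (an assumption, not data from the chapter).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ecc = eccentricities(path)
diameter = max(ecc.values())  # the diameter is the largest eccentricity
```

This exact approach runs one BFS per vertex, which is why approximate, distributed methods are needed for graphs with billions of edges.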


Extrinsic Camera Calibration with Line-Laser Projection

Sensors ◽

10.3390/s21041091 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1091

Author(s):

Izaak Van Crombrugge ◽

Rudi Penne ◽

Steve Vanlanduit

Keyword(s):

Camera Calibration ◽

Real World ◽

Large Scale ◽

State Of The Art ◽

Bundle Adjustment ◽

Field Of View ◽

Extrinsic Calibration ◽

Practical Procedure ◽

Partial Overlap

Knowledge of precise camera poses is vital for multi-camera setups. Camera intrinsics can be obtained for each camera separately in lab conditions. For fixed multi-camera setups, the extrinsic calibration can only be done in situ. Usually, some markers are used, like checkerboards, requiring some level of overlap between cameras. In this work, we propose a method for cases with little or no overlap. Laser lines are projected on a plane (e.g., floor or wall) using a laser line projector. The pose of the plane and cameras is then optimized using bundle adjustment to match the lines seen by the cameras. To find the extrinsic calibration, only a partial overlap between the laser lines and the field of view of the cameras is needed. Real-world experiments were conducted both with and without overlapping fields of view, resulting in rotation errors below 0.5°. We show that the accuracy is comparable to other state-of-the-art methods while offering a more practical procedure. The method can also be used in large-scale applications and can be fully automated.


Legal Judgment Prediction Based on Multiclass Information Fusion

Complexity ◽

10.1155/2020/3089189 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Kongfan Zhu ◽

Rundong Guo ◽

Weifeng Hu ◽

Zeqiang Li ◽

Yujun Li

Keyword(s):

Information Fusion ◽

Real World ◽

Large Scale ◽

State Of The Art ◽

External Information ◽

Criminal Cases ◽

Law System ◽

Large Scale Dataset ◽

Assistant Systems ◽

Civil Law System

Legal judgment prediction (LJP), an effective and critical application in legal assistant systems, aims to determine judgment results from the information established in fact determination. In real-world scenarios, when dealing with criminal cases, judges not only draw on the fact description but also consider external information, such as the basic information of the defendant and the court view. However, most existing works take the fact description as the sole input for LJP and ignore the external information. We propose a Transformer-Hierarchical-Attention-Multi-Extra (THME) Network to make full use of the information based on the fact determination. We conduct experiments on a real-world large-scale dataset of criminal cases in the civil law system. Experimental results show that our method outperforms state-of-the-art LJP methods on all judgment prediction tasks.


Efficient Heterogeneous Collaborative Filtering without Negative Sampling for Recommendation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5329 ◽

2020 ◽

Vol 34 (01) ◽

pp. 19-26 ◽

Cited By ~ 5

Author(s):

Chong Chen ◽

Min Zhang ◽

Yongfeng Zhang ◽

Weizhi Ma ◽

Yiqun Liu ◽

...

Keyword(s):

Collaborative Filtering ◽

Real World ◽

Large Scale ◽

State Of The Art ◽

Heterogeneous Data ◽

Model Parameters ◽

Online Systems ◽

Practical Applications ◽

Real World Datasets ◽

Primary Type

Recent studies on recommendation have largely focused on exploring state-of-the-art neural networks to improve the expressiveness of models, while typically applying the Negative Sampling (NS) strategy for efficient learning. Despite its effectiveness, two important issues have not been well considered in existing methods: 1) NS suffers from dramatic fluctuation, making it difficult for sampling-based methods to achieve optimal ranking performance in practical applications; 2) although heterogeneous feedback (e.g., view, click, and purchase) is widespread in many online systems, most existing methods leverage only one primary type of user feedback, such as purchase. In this work, we propose a novel non-sampling transfer learning solution, named Efficient Heterogeneous Collaborative Filtering (EHCF), for Top-N recommendation. It can not only model fine-grained user-item relations, but also efficiently learn model parameters from the whole heterogeneous data (including all unlabeled data) with a rather low time complexity. Extensive experiments on three real-world datasets show that EHCF significantly outperforms state-of-the-art recommendation methods in both traditional (single-behavior) and heterogeneous scenarios. Moreover, EHCF shows significant improvements in training efficiency, making it more applicable to real-world large-scale systems. Our implementation has been released to facilitate further developments on efficient whole-data based neural methods.


Material Extrusion Additive Manufacturing of Wood and Lignocellulosic Filled Composites

Polymers ◽

10.3390/polym12092115 ◽

2020 ◽

Vol 12 (9) ◽

pp. 2115

Author(s):

Meghan E. Lamm ◽

Lu Wang ◽

Vidya Kishore ◽

Halil Tekinalp ◽

Vlastimil Kunc ◽

...

Keyword(s):

Additive Manufacturing ◽

3D Printing ◽

Present State ◽

Material Properties ◽

Large Scale ◽

State Of The Art ◽

Material Extrusion ◽

Functional Additives ◽

Natural Filler ◽

Natural Fillers

Wood and lignocellulosic-based material components are explored in this review as functional additives and reinforcements in composites for extrusion-based additive manufacturing (AM) or 3D printing. The motivation for using these sustainable alternatives in 3D printing includes enhancing material properties of the resulting printed parts, while providing a green alternative to carbon or glass filled polymer matrices, all at reduced material costs. Previous review articles on this topic have focused only on introducing the use of natural fillers with material extrusion AM and discussion of their subsequent material properties. This review not only discusses the present state of materials extrusion AM using natural filler-based composites but will also fill in the knowledge gap regarding state-of-the-art applications of these materials. Emphasis will also be placed on addressing the challenges associated with 3D printing using these materials, including use with large-scale manufacturing, while providing insight to overcome these issues in the future.


Present State of the Art and Prospects for On-line Security Control in Large Scale Electric Power Systems

IFAC Proceedings Volumes ◽

10.1016/s1474-6670(17)63961-2 ◽

1981 ◽

Vol 14 (2) ◽

pp. 3307-3308

Author(s):

T.E. Dy Liacco

Keyword(s):

Power Systems ◽

Electric Power ◽

Present State ◽

Large Scale ◽

State Of The Art ◽

Electric Power Systems ◽

Security Control ◽

On Line


Scene text removal via cascaded text stroke detection and erasing

Computational Visual Media ◽

10.1007/s41095-021-0242-8 ◽

2021 ◽

Vol 8 (2) ◽

pp. 273-287

Author(s):

Xuewei Bian ◽

Chaoqun Wang ◽

Weize Quan ◽

Juntao Ye ◽

Xiaopeng Zhang ◽

...

Keyword(s):

Performance Improvement ◽

Real World ◽

Large Scale ◽

State Of The Art ◽

The State ◽

Experimental Results ◽

Processing Unit ◽

Final Model ◽

Scene Text ◽

End To End

Recent learning-based approaches show promising performance improvement for the scene text removal task but usually leave several remnants of text and provide visually unpleasant results. In this work, a novel end-to-end framework is proposed based on accurate text stroke detection. Specifically, the text removal problem is decoupled into text stroke detection and stroke removal; we design separate networks to solve these two subproblems, the latter being a generative network. These two networks are combined as a processing unit, which is cascaded to obtain our final model for text removal. Experimental results demonstrate that the proposed method substantially outperforms the state-of-the-art for locating and erasing scene text. A new large-scale real-world dataset with 12,120 images has been constructed and is being made available to facilitate research, as current publicly available datasets are mainly synthetic and so cannot properly measure the performance of different methods.


Local feature extraction based facial emotion recognition: a survey

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v10i4.pp4080-4092 ◽

2020 ◽

Vol 10 (4) ◽

pp. 4080

Author(s):

Khadija Slimani ◽

Mohamed Kas ◽

Youssef El Merabet ◽

Yassine Ruichek ◽

Rochdi Messoussi

Keyword(s):

Facial Expression ◽

Large Scale ◽

State Of The Art ◽

Research Work ◽

Emotional Expressions ◽

Expression Recognition ◽

Technological Advancement ◽

Recognition Systems ◽

Local Feature Extraction ◽

Micro Patterns

Notwithstanding recent technological advancement, the identification of facial and emotional expressions is still one of the greatest challenges scientists have ever faced. Generally, the human face is identified as a composition made up of textures arranged in micro-patterns. There has been a tremendous increase in the use of local binary pattern (LBP) based texture algorithms, which have been identified as essential in the completion of a variety of tasks and in the extraction of essential attributes from an image. Over the years, many LBP variants have been reviewed in the literature. However, what is still missing is a thorough and comprehensive analysis of their independent performance. This research work aims to fill this gap by performing a large-scale performance evaluation of 46 recent state-of-the-art LBP variants for facial expression recognition. Extensive experimental results on the well-known and challenging benchmark databases KDEF, JAFFE, CK and MUG, taken under different facial expression conditions, indicate that a number of the evaluated state-of-the-art LBP-like methods achieve promising results, which are better than or competitive with several recent state-of-the-art facial recognition systems. Recognition rates of 100%, 98.57%, 95.92% and 100% have been reached for the CK, JAFFE, KDEF and MUG databases, respectively.
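As background for the local binary pattern operator that this survey builds on, here is a minimal sketch of the basic 8-neighbour LBP code for a single 3x3 patch. It is an illustration only and not any of the 46 variants evaluated in the paper. The function name `lbp_code`, the clockwise bit ordering starting at the top-left pixel, and the `>=` threshold convention are assumptions; published variants differ on all of these.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for a 3x3 patch given as three rows.

    Each neighbour is compared with the centre pixel; a neighbour that is
    greater than or equal to the centre sets one bit of the 8-bit code.
    Bit order (an assumption): clockwise, starting at the top-left pixel.
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (row, col) in enumerate(offsets):
        if patch[row][col] >= center:
            code |= 1 << bit
    return code


# Hypothetical grey-level patch, used only to exercise the function.
example_patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
example_code = lbp_code(example_patch)
```

In a full texture descriptor, a code like this is computed at every pixel and the histogram of codes over an image region serves as the feature vector.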


MG-DVD: A Real-time Framework for Malware Variant Detection Based on Dynamic Heterogeneous Graph Learning

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/209 ◽

2021 ◽

Author(s):

Chen Liu ◽

Bo Li ◽

Jun Zhao ◽

Ming Su ◽

Xu-Dong Liu

Keyword(s):

Real Time ◽

Real World ◽

Large Scale ◽

State Of The Art ◽

Evolutionary Patterns ◽

Graph Learning ◽

Fine Grained ◽

Variant Detection ◽

Effectiveness And Efficiency ◽

The Cost

Detecting the newly emerging malware variants in real time is crucial for mitigating cyber risks and proactively blocking intrusions. In this paper, we propose MG-DVD, a novel detection framework based on dynamic heterogeneous graph learning, to detect malware variants in real time. Particularly, MG-DVD first models the fine-grained execution event streams of malware variants into dynamic heterogeneous graphs and investigates real-world meta-graphs between malware objects, which can effectively characterize more discriminative malicious evolutionary patterns between malware and their variants. Then, MG-DVD presents two dynamic walk-based heterogeneous graph learning methods to learn more comprehensive representations of malware variants, which significantly reduces the cost of the entire graph retraining. As a result, MG-DVD is equipped with the ability to detect malware variants in real time, and it presents better interpretability by introducing meaningful meta-graphs. Comprehensive experiments on large-scale samples prove that our proposed MG-DVD outperforms state-of-the-art methods in detecting malware variants in terms of effectiveness and efficiency.


Summarizing Source Code with Transferred API Knowledge

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/314 ◽

2018 ◽

Cited By ~ 14

Author(s):

Xing Hu ◽

Ge Li ◽

Xin Xia ◽

David Lo ◽

Shuai Lu ◽

...

Keyword(s):

Real World ◽

Software Maintenance ◽

Large Scale ◽

State Of The Art ◽

Source Code ◽

Code Search ◽

Novel Approach ◽

Software Maintenance And Evolution ◽

World Industry ◽

Similar Code

Code summarization, which aims to generate succinct natural language descriptions of source code, is extremely useful for code search and code comprehension. It has played an important role in software maintenance and evolution. Previous approaches generate summaries by retrieving summaries from similar code snippets. However, these approaches rely heavily on whether similar code snippets can be retrieved and how similar the snippets are, and they fail to capture the API knowledge in the source code, which carries vital information about the functionality of the source code. In this paper, we propose a novel approach, named TL-CodeSum, which successfully applies API knowledge learned in a different but related task to code summarization. Experiments on large-scale real-world industry Java projects indicate that our approach is effective and outperforms the state-of-the-art in code summarization.
