Bootstrapping Distributional Feature Vector Quality

2009 ◽  
Vol 35 (3) ◽  
pp. 435-461 ◽  
Author(s):  
Maayan Zhitomirsky-Geffet ◽  
Ido Dagan

This article presents a novel bootstrapping approach for improving the quality of feature vector weighting in distributional word similarity. The method was motivated by attempts to utilize distributional similarity for identifying the concrete semantic relationship of lexical entailment. Our analysis revealed that a major reason for the rather loose semantic similarity obtained by distributional similarity methods is insufficient quality of the word feature vectors, caused by deficient feature weighting. This observation led to the definition of a bootstrapping scheme which yields improved feature weights, and hence higher quality feature vectors. The underlying idea of our approach is that features which are common to similar words are also most characteristic for their meanings, and thus should be promoted. This idea is realized via a bootstrapping step applied to an initial standard approximation of the similarity space. The superior performance of the bootstrapping method was assessed in two different experiments, one based on direct human gold-standard annotation and the other based on an automatically created disambiguation dataset. These results are further supported by applying a novel quantitative measurement of the quality of feature weighting functions. Improved feature weighting also allows massive feature reduction, which indicates that the most characteristic features for a word are indeed concentrated at the top ranks of its vector. Finally, experiments with three prominent similarity measures and two feature weighting functions showed that the bootstrapping scheme is robust and is independent of the original functions over which it is applied.
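As a rough illustration of the bootstrapping idea, the sketch below re-weights a word-by-feature matrix so that features shared with a word's most similar neighbours are promoted. The matrix layout, the neighbourhood size `k`, and the exact promotion rule are illustrative assumptions; the paper's actual weighting function differs in its details.

```python
import numpy as np

def bootstrap_weights(V, k=10):
    """One bootstrapping pass over a word-by-feature weight matrix.

    V : (n_words, n_features) array of initial weights (e.g. PMI values).
    k : number of nearest neighbours whose shared features are promoted.
    """
    # Cosine similarities over the initial approximation of the similarity space.
    U = V / (np.linalg.norm(V, axis=1, keepdims=True) + 1e-12)
    S = U @ U.T
    np.fill_diagonal(S, 0.0)

    W = np.zeros_like(V, dtype=float)
    for w in range(V.shape[0]):
        nbrs = np.argsort(S[w])[-k:]            # top-k most similar words
        support = (V[nbrs] > 0).T @ S[w, nbrs]  # similarity mass behind each feature
        W[w] = V[w] * support                   # promote features shared with neighbours
    return W
```

Feature reduction then amounts to keeping only the top-ranked entries of each row of `W`, which is consistent with the observation that the most characteristic features concentrate at the top ranks.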

2020 ◽  
Vol 10 (5) ◽  
pp. 1793
Author(s):  
Lina Du ◽  
Li Zhuo ◽  
Jiafeng Li ◽  
Jing Zhang ◽  
Xiaoguang Li ◽  
...  

DASH (Dynamic Adaptive Streaming over HTTP) is a widely adopted multimedia streaming standard that selects an appropriate video bitrate according to network conditions, client status, and other factors in order to improve the user's Quality of Experience (QoE). Because quantifying the user's QoE is itself a difficult problem, this paper studies the distortion introduced by video compression, network transmission, and other factors, and proposes a video QoE metric for dynamic adaptive streaming services. Three-Dimensional Convolutional Neural Networks (3D CNN) and Long Short-Term Memory (LSTM) networks are used together to extract deep spatial-temporal features that represent the content characteristics of the video. These are combined with other factors, such as video quality, video fluency, and the quality fluctuations caused by bitrate switching, to form the input feature vector. Ridge regression is adopted to establish a QoE metric that dynamically describes the relationship between the input feature vector and the Mean Opinion Score (MOS). Experimental results on different datasets demonstrate that the prediction accuracy of the proposed method surpasses state-of-the-art methods, indicating that the proposed QoE model can effectively guide the client's bitrate selection in dynamic adaptive streaming services.
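A minimal sketch of the final regression step, assuming the deep 3D CNN/LSTM features have already been extracted; the feature names, dimensions, and placeholder data below are illustrative, not the paper's.

```python
import numpy as np
from sklearn.linear_model import Ridge

def build_feature_vector(content_feats, video_quality, fluency, switch_magnitude):
    # Deep spatial-temporal content descriptors concatenated with
    # hand-crafted quality, fluency and bitrate-switching factors.
    return np.concatenate([content_feats, [video_quality, fluency, switch_magnitude]])

# Placeholder training data: one row per streaming session, MOS labels in [1, 5].
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 131))        # e.g. 128 deep dims + 3 QoE factors
y = rng.uniform(1.0, 5.0, size=200)

model = Ridge(alpha=1.0).fit(X, y)     # ridge regression from features to MOS
predicted_mos = model.predict(X[:3])
```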


Author(s):  
Jagruti Ketan Save

Thousands of images are generated every day, which implies the need for an easy, fast, automated classifier to classify and organize them. Classification means selecting an appropriate class for a given image from a set of pre-defined classes. The main objective of this work is to explore feature vector generation using the Walsh transform for classification. In the first method, we apply the Walsh transform to the columns of an image to generate feature vectors. In the second method, a Walsh wavelet matrix is used for feature vector generation. In the third method, we apply vector quantization (VQ) to the feature vectors generated by the earlier methods, which gives better accuracy, faster computation, and lower storage requirements than those methods. Nearest-neighbor and nearest-mean classification algorithms are used to classify the input test image. The image database used for the experimentation contains 2000 images. These methods generate a large number of outputs for a single test image by considering four similarity measures, six sizes of feature vector, two ways of classification, four VQ techniques, three sizes of codebook, and five combinations of wavelet transform matrix generation. We observed an improvement in accuracy from 63.22% to 74% (with 55% training data) across the series of techniques.
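The column-wise transform of the first method might look like the sketch below, with a Hadamard matrix standing in for the sequency-ordered Walsh matrix; the image size constraint, coefficient count, and averaging step are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import hadamard

def walsh_column_features(img, n_coeffs=64):
    """Apply a Walsh-type transform to the columns of a grayscale image
    and keep low-order coefficients as the feature vector."""
    h, _ = img.shape                      # height must be a power of two
    H = hadamard(h)                       # Hadamard stand-in for the Walsh matrix
    coeffs = (H @ img.astype(float)) / h  # transform every column at once
    return coeffs[:n_coeffs].mean(axis=1) # average coefficients across columns
```

Vector quantization would then map such feature vectors onto a small codebook, which is where the third method gains its storage and speed advantages.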


1999 ◽  
Vol 5 (2) ◽  
pp. 157-170
Author(s):  
JEONG-MI CHO ◽  
JUNGYUN SEO ◽  
GIL CHANG KIM

This paper presents a system for automatic verb sense disambiguation in Korean using a small corpus and a Machine-Readable Dictionary (MRD). The system learns a set of typical uses, listed in the MRD usage examples for each sense of a polysemous verb in the MRD definitions, from verb-object co-occurrences acquired from the corpus. The paper addresses the problem of data sparseness in two ways. First, by extending word similarity measures from direct co-occurrences to co-occurrences of co-occurring words, word similarities can be computed even between words that do not directly co-occur, by comparing their co-occurring clusters. Second, IS-A relations of nouns are acquired from the MRD definitions, which makes it possible to roughly cluster the nouns by identifying IS-A relationships. Using these methods, two words may be considered similar even if they do not share any word elements. Experiments show that this method can learn from a very small training corpus, achieving over 86% correct disambiguation without any restriction on a word's senses.
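The extension from direct co-occurrences to co-occurrences of co-occurring words can be sketched as a second-order similarity; the matrix layout and normalisation below are one plausible reading, not the paper's exact measure.

```python
import numpy as np

def second_order_similarity(C, a, b):
    """C : (n_nouns, n_verbs) co-occurrence count matrix.
    Returns a similarity for nouns a and b that can be non-zero even
    when the two nouns share no verb, via similar co-occurrence clusters."""
    P = C / (C.sum(axis=1, keepdims=True) + 1e-12)  # co-occurrence profiles
    S1 = P @ P.T                                     # first-order similarities
    pa, pb = S1[a] @ P, S1[b] @ P                    # similarity-weighted profiles
    return float(pa @ pb / (np.linalg.norm(pa) * np.linalg.norm(pb) + 1e-12))
```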


2021 ◽  
Vol 1 ◽  
pp. 11-20
Author(s):  
Owen Freeman Gebler ◽  
Mark Goudswaard ◽  
Ben Hicks ◽  
David Jones ◽  
Aydin Nassehi ◽  
...  

Abstract Physical prototyping during early-stage design is typically an iterative process. Commonly, a single prototype is used throughout the process, with its form modified as the design evolves. If the form of the prototype is not captured at each iteration, understanding how specific design changes affect the satisfaction of requirements is challenging, particularly retrospectively.

In this paper, two different systems for digitising physical artefacts, structured light scanning (SLS) and photogrammetry (PG), are investigated as means of capturing iterations of physical prototypes. First, a series of test artefacts is presented and procedures for operating each system are developed. Next, the artefacts are digitised using both SLS and PG, and the resulting models are compared against a master model of each artefact. Results indicate that both systems can reconstruct the majority of each artefact's geometry to within 0.1 mm of the master; however, SLS demonstrated superior overall performance in terms of both completion time and model quality. Additionally, the quality of the PG models was far more dependent on the effort and expertise of the user than that of the SLS models.
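The geometric comparison reported here can be approximated as below, assuming the digitised and master models are already registered (e.g. via ICP) and sampled as point clouds; the function and variable names are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def fraction_within_tolerance(master_pts, scan_pts, tol_mm=0.1):
    """Fraction of digitised points lying within tol_mm of the master model."""
    tree = cKDTree(master_pts)        # spatial index over the master geometry
    dists, _ = tree.query(scan_pts)   # nearest-point deviation for every scan point
    return float(np.mean(dists < tol_mm))
```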


2016 ◽  
Vol 16 (6) ◽  
pp. 27-42 ◽  
Author(s):  
Minghan Yang ◽  
Xuedong Gao ◽  
Ling Li

Abstract Although the Clustering Algorithm Based on Sparse Feature Vector (CABOSFV) and its related algorithms are efficient for high-dimensional sparse data clustering, they have several shortcomings: parameters must be designated subjectively, and the clustering process is sensitive to the order of the data, which ultimately increases the time complexity and degrades the quality of the algorithm. This paper proposes a parameter adjustment method for Bidirectional CABOSFV for optimization purposes. By optimizing the Parameter Vector (PV) and Parameter Selection Vector (PSV) with clustering validity as the objective function, an improved Bidirectional CABOSFV algorithm using simulated annealing is proposed, which circumvents the need to determine initial parameters. Experiments on UCI data sets show that the proposed algorithm, which can perform multi-adjustment clustering, achieves higher accuracy than single-adjustment clustering, along with decreased time complexity through iterations.
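A minimal simulated-annealing loop for the parameter adjustment idea, with `validity` scoring the clustering a Parameter Vector induces and `neighbour` perturbing it; both callbacks and the cooling schedule are assumptions, since the abstract does not give the exact encoding of PV and PSV.

```python
import math
import random

def anneal_parameters(pv0, validity, neighbour, t0=1.0, cooling=0.95, steps=200):
    """Search for a Parameter Vector maximising a clustering-validity score."""
    pv, f = pv0, validity(pv0)
    best, f_best = pv, f
    t = t0
    for _ in range(steps):
        cand = neighbour(pv)
        fc = validity(cand)
        # Accept improvements, or worse moves with Boltzmann probability.
        if fc > f or random.random() < math.exp((fc - f) / max(t, 1e-9)):
            pv, f = cand, fc
            if f > f_best:
                best, f_best = pv, f
        t *= cooling                   # geometric cooling schedule
    return best, f_best
```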


1970 ◽  
pp. 33-36
Author(s):  
A. ANBURANI

The present investigation was carried out to study the effect of off-season soil management practices on the yield and quality of turmeric (Curcuma longa L.) cultivars. The experiment was laid out in a Factorial Randomized Block Design with ten treatments in three replications, comprising five off-season land management treatments, viz., fallow (S1), summer ploughing two times (S2), summer ploughing one time (S3), solarization with transparent polyethylene film of 0.05 mm thickness for 40 days (S4), and black polyethylene film for 40 days (S5). These were tested with two popular cultivars, viz., Curcuma longa-1 CL-1 (V1) and Curcuma longa-2 CL-2 (V2), collected from Erode and Chidambaram. Various yield components were recorded at the time of harvest and analysed. The yield-attributing characters, viz., number, length, girth and weight of mother, primary and secondary rhizomes, were recorded. The treatment with solarization using transparent polyethylene film of 0.05 mm thickness recorded the highest yield and yield-attributing characters compared to the other treatments. The same treatment also exhibited the highest fresh rhizome yield per plant, curing percentage and cured rhizome yield. Quality parameters such as curcumin, oleoresin and essential oil content also showed superior performance under this treatment.


2018 ◽  
Vol 29 (01) ◽  
pp. 1850003 ◽  
Author(s):  
Chuang Liu ◽  
Linan Fan ◽  
Zhou Liu ◽  
Xiang Dai ◽  
Jiamei Xu ◽  
...  

Community detection in complex networks is a key problem in network analysis. In this paper, a new membrane algorithm is proposed to solve community detection in complex networks. The proposed algorithm is based on membrane systems, which consist of objects, reaction rules, and a membrane structure. Each object represents a candidate partition of a complex network, and the quality of an object is evaluated by network modularity. The reaction rules include evolutionary rules and communication rules: evolutionary rules improve the quality of objects by evolving them with the differential evolution algorithm, while communication rules implement the exchange of information among membranes. Finally, the proposed algorithm is evaluated on synthetic networks, real-world networks with known partitions, and large-scale networks with unknown partitions. The experimental results indicate the superior performance of the proposed algorithm in comparison with the other algorithms tested.
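The evaluation step, scoring a candidate partition by modularity, can be sketched with networkx; the label encoding and the toy graph are illustrative, and the evolutionary and communication rules are omitted.

```python
import networkx as nx
from networkx.algorithms.community import modularity

def object_quality(G, labels):
    """Score one membrane-system object (a candidate partition) by modularity.
    labels maps each node to a community identifier."""
    groups = {}
    for node, community in labels.items():
        groups.setdefault(community, set()).add(node)
    return modularity(G, list(groups.values()))

G = nx.karate_club_graph()
labels = {n: G.nodes[n]["club"] for n in G}   # the two known factions
print(object_quality(G, labels))
```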


2021 ◽  
Author(s):  
Xuan Thao Nguyen ◽  
Shuo Yan Chou

Abstract Intuitionistic fuzzy sets (IFSs), defined by membership and non-membership functions, have many applications in managing uncertain information. Similarity measures of IFSs have been proposed to represent the similarity between different types of sensitive fuzzy information. However, some existing similarity measures do not satisfy the axioms of similarity, and in some cases they cannot be applied appropriately. In this study, we propose some novel similarity measures of IFSs constructed by combining the exponential function of the membership functions with the negative function of the non-membership functions. We also propose a new entropy measure as a stepping stone to calculate the weights of the criteria in the proposed multi-criteria decision making (MCDM) model. The similarity measures are used to rank the alternatives in the model. Finally, we use this MCDM model to evaluate the quality of software projects.
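One way to read the abstract's construction is sketched below: an exponential term in the membership gap combined with a term in the negated non-memberships, averaged over the universe; the exact formula in the paper may differ.

```python
import numpy as np

def ifs_similarity(mu_a, nu_a, mu_b, nu_b):
    """Illustrative similarity between two IFSs given as arrays of
    membership (mu) and non-membership (nu) degrees on a finite universe."""
    term_mu = np.exp(-np.abs(mu_a - mu_b))           # exponential of the membership gap
    term_nu = 1.0 - np.abs((1 - nu_a) - (1 - nu_b))  # negative-function term
    return float(np.mean((term_mu + term_nu) / 2.0))

# Identical sets score 1.0; diverging memberships score lower.
a_mu, a_nu = np.array([0.7, 0.2, 0.5]), np.array([0.2, 0.6, 0.3])
b_mu, b_nu = np.array([0.6, 0.3, 0.5]), np.array([0.3, 0.5, 0.4])
print(ifs_similarity(a_mu, a_nu, b_mu, b_nu))
```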

