Generating Alternative Item Types Using Auxiliary Information

Abstract. Given the consistent interest in comparing achievement across sub-populations in international assessments such as TIMSS, PIRLS, and PISA, it is critical that sub-population achievement be estimated reliably and with sufficient precision. We therefore systematically examine the limitations of the estimation methods these programs currently use. Using a simulation study along with empirical results from the 2007 cycle of TIMSS, we show that a combination of missing and misclassified data in the conditioning model induces biases in sub-population achievement estimates, and that the magnitude of these biases can be readily explained by data quality. Importantly, the estimated biases are limited to the conditioning variable with poor-quality data; other sub-population achievement estimates are unaffected. The findings are generally in line with theory on missing and error-prone covariates. This research adds to a small body of literature that has noted some of the limitations of sub-population estimation.


Item Types, Cognitive Strategies, and Gender Differences in Mental Rotation

PsycEXTRA Dataset ◽ 10.1037/e520602012-461 ◽ 2011

Author(s): Randi A. Doyle ◽ Daniel Voyer

Keyword(s): Gender Differences ◽ Mental Rotation ◽ Cognitive Strategies ◽ And Gender ◽ Item Types


Item Types and Upper Basic Education Students’ Performance in Mathematics in the Southern Senatorial District of Cross River State, Nigeria

Journal of Modern Education Review ◽ 10.15341/jmer(2155-7993)/01.04.2014/008 ◽ 2014 ◽ Vol 4 (1) ◽ pp. 57-73

Author(s): Joy Dianabasi Eduwem ◽ Imo Edet Umoinyang

Keyword(s): Basic Education ◽ Education Students ◽ Cross River ◽ Cross River State ◽ Item Types


GENERALISED SYNTHETIC ESTIMATOR USING DOUBLE SAMPLING SCHEME AND AUXILIARY INFORMATION

Mathematical Journal of Interdisciplinary Sciences ◽ 10.15415/mjis.2015.41002 ◽ 2015 ◽ Vol 4 (1) ◽ pp. 15-21

Author(s): Shashi Bahl ◽ Sangeeta

Keyword(s): Auxiliary Information ◽ Sampling Scheme ◽ Double Sampling


Privacy-preserving Collaborative Training for Medical Image Analysis Based on Multi-Blockchain

Combinatorial Chemistry & High Throughput Screening ◽ 10.2174/1386207323666201022110616 ◽ 2020 ◽ Vol 23

Author(s): Wanlu Zhang ◽ Qigang Wang ◽ Mei Li

Keyword(s): Medical Image ◽ Data Privacy ◽ Medical Image Analysis ◽ Auxiliary Information ◽ Training Process ◽ Private Data ◽ Medical Institutions ◽ Model Training ◽ Collaborative Training ◽ Similar Task

Background: As artificial intelligence and big data analysis develop rapidly, data privacy, especially the privacy of patient medical data, is receiving more and more attention. Objective: To strengthen the protection of private data while still allowing model training to proceed, this article introduces a multi-Blockchain-based decentralized collaborative machine learning training method for medical image analysis. With this method, researchers from different medical institutions can collaborate to train models without exchanging sensitive patient data. Method: A partial parameter update method is applied to prevent indirect privacy leakage during model propagation. Through peer-to-peer communication in the multi-Blockchain system, a machine learning task can leverage auxiliary information from a similar task on another Blockchain. In addition, after the collaborative training process, personalized models are trained for the different medical institutions. Results: The experimental results show that our method achieves performance similar to that of the centralized model-training method, which collects the data sets of all participants, while preventing private data leakage. Transferring auxiliary information from a similar task on another Blockchain is also shown to accelerate model convergence and improve model accuracy, especially when data are absent. The personalization training process further improves model performance. Conclusion: Our approach can effectively help researchers from different organizations to train collaboratively without disclosing their private data.


A Survey of Network Embedding for Drug Analysis and Prediction

Current Protein and Peptide Science ◽ 10.2174/1389203721666200702145701 ◽ 2020 ◽ Vol 21

Author(s): Zhixian Liu ◽ Qingfeng Chen ◽ Wei Lan ◽ Jiahai Liang ◽ Yiping Pheobe Chen ◽ ...

Keyword(s): Deep Learning ◽ Protein Function ◽ Dimensional Space ◽ Auxiliary Information ◽ Matrix Decomposition ◽ Drug Analysis ◽ Machine Learning Algorithms ◽ Superior Performance ◽ Network Embedding ◽ Similarity Estimation

Traditional network-based computational methods have shown good results in drug analysis and prediction. However, these methods are time consuming and lack universality, and they make it difficult to exploit the auxiliary information of nodes and edges. Network embedding offers a promising way to alleviate these problems by transforming a network into a low-dimensional space while preserving network structure and auxiliary information, which in turn facilitates the application of machine learning algorithms in subsequent processing. Network embedding has been introduced into drug analysis and prediction in the last few years and has shown superior performance over traditional methods. However, there is no systematic review of this topic. This article offers a comprehensive survey of the primary network embedding methods and their applications in drug analysis and prediction. The network embedding technologies applied to homogeneous and heterogeneous networks are investigated and compared, including matrix decomposition, random walk, and deep learning. In particular, the graph neural network (GNN) methods in deep learning are highlighted. Further, the applications of network embedding in drug similarity estimation, drug-target interaction prediction, adverse drug reaction prediction, and protein function and therapeutic peptide prediction are discussed. Several potential future research directions are also discussed.


A simulated annealing-based algorithm for selecting balanced samples

Computational Statistics ◽ 10.1007/s00180-021-01113-3 ◽ 2021

Author(s): Roberto Benedetti ◽ Maria Michela Dickson ◽ Giuseppe Espa ◽ Francesco Pantalone ◽ Federica Piersimoni

Keyword(s): Simulated Annealing ◽ Optimization Problem ◽ Sample Selection ◽ Auxiliary Information ◽ Real Data ◽ Simulation Experiments ◽ Balanced Sampling ◽ Inclusion Probabilities ◽ Random Method ◽ Annealing Algorithms

Balanced sampling is a random method for sample selection, the use of which is preferable when auxiliary information is available for all units of a population. However, implementing balanced sampling can be challenging, in part because of the computational effort required and the need to respect both balancing constraints and inclusion probabilities. In the present paper, a new algorithm for selecting balanced samples is proposed. The method is inspired by simulated annealing algorithms, since balanced sample selection can be interpreted as an optimization problem. A set of simulation experiments and an example using real data show the efficiency and accuracy of the proposed algorithm.
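To illustrate how sample selection can be read as an optimization problem, here is a minimal Python sketch of simulated annealing over fixed-size samples. It is not the algorithm proposed in the paper: the energy function (squared distance between sample means and population means of the auxiliary variables), the swap move, the geometric cooling schedule, and the names `balance_energy` and `anneal_balanced_sample` are all assumptions made for illustration. In particular, this toy version does not enforce prescribed inclusion probabilities, which the abstract names as a requirement of balanced sampling.

```python
import math
import random


def balance_energy(sample, x, pop_means):
    """Squared distance between sample means and population means of the auxiliaries."""
    n = len(sample)
    return sum(
        (sum(x[i][j] for i in sample) / n - pop_means[j]) ** 2
        for j in range(len(pop_means))
    )


def anneal_balanced_sample(x, n, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Search for a size-n sample whose auxiliary means match the population means.

    x is a list of rows of auxiliary values, one row per population unit.
    Assumes 0 < n < len(x). Returns (sorted unit indices, energy of that sample).
    """
    rng = random.Random(seed)
    N = len(x)
    pop_means = [sum(row[j] for row in x) / N for j in range(len(x[0]))]

    inside = rng.sample(range(N), n)          # current sample
    chosen = set(inside)
    outside = [i for i in range(N) if i not in chosen]

    energy = balance_energy(inside, x, pop_means)
    best, best_energy = list(inside), energy
    temp = t0
    for _ in range(steps):
        # Propose swapping one sampled unit with one unsampled unit.
        a, b = rng.randrange(n), rng.randrange(N - n)
        inside[a], outside[b] = outside[b], inside[a]
        new_energy = balance_energy(inside, x, pop_means)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if new_energy <= energy or rng.random() < math.exp((energy - new_energy) / temp):
            energy = new_energy
            if energy < best_energy:
                best, best_energy = list(inside), energy
        else:
            inside[a], outside[b] = outside[b], inside[a]  # undo the swap
        temp *= cooling                        # geometric cooling (assumed schedule)
    return sorted(best), best_energy
```

A call such as `anneal_balanced_sample([[float(i)] for i in range(20)], n=5)` returns five unit indices together with the energy of that sample; lower energy means the sample means sit closer to the population means.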


Short-term Traffic Flow Prediction Based on Multi-Auxiliary Information

2020 the 4th International Conference on Big Data Research (ICBDR'20) ◽ 10.1145/3445945.3445951 ◽ 2020

Author(s): Kai Zhang ◽ Buliao Jia ◽ Yuhan Dong

Keyword(s): Traffic Flow ◽ Auxiliary Information ◽ Short Term ◽ Traffic Flow Prediction ◽ Flow Prediction


Practical Wavelet Tree Construction

Journal of Experimental Algorithmics ◽ 10.1145/3457197 ◽ 2021 ◽ Vol 26 ◽ pp. 1-67

Author(s): Patrick Dinklage ◽ Jonas Ellert ◽ Johannes Fischer ◽ Florian Kurpicz ◽ Marvin Löbel

Keyword(s): Parallel Algorithms ◽ Shared Memory ◽ Distributed Memory ◽ Auxiliary Information ◽ Parallel Computers ◽ External Memory ◽ Sequential Algorithms ◽ Bottom Up ◽ Memory Efficiency ◽ Tree Construction

We present new sequential and parallel algorithms for wavelet tree construction based on a new bottom-up technique. A wavelet tree refines the characters represented in a node with increasing depth; our technique exploits this structure in the opposite direction, first computing the leaves (the most refined level) and then propagating this information upwards to the root of the tree. We first describe new sequential algorithms, both in RAM and in external memory. We then adapt these algorithms to parallel computers, addressing both shared memory and distributed memory settings. In practice, all our algorithms outperform previous ones in both time and memory efficiency, because we can compute all auxiliary information solely from the information obtained when computing the leaves. Most of our algorithms are also adapted to the wavelet matrix, a variant that is particularly suited to large alphabets.
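The bottom-up idea can be sketched in a few lines of Python. This is an unoptimized sequential illustration, not the authors' implementation: it assumes an integer alphabet of size `2**bits`, a level-wise layout (one bit vector per tree level), and that the information computed at the leaves is a character histogram. The function name `build_wavelet_tree_levels` is hypothetical.

```python
def build_wavelet_tree_levels(text, bits):
    """Build the level-wise bit vectors of a wavelet tree, bottom-up.

    text: sequence of integers in [0, 2**bits).
    Returns a list `levels`, where levels[l][i] is the i-th bit of tree level l
    (level 0 is the root and stores the most significant bit of each character).
    """
    n = len(text)

    # Leaves first: count how often each full character occurs.
    hist = [0] * (1 << bits)
    for c in text:
        hist[c] += 1

    levels = [None] * bits
    for level in range(bits - 1, -1, -1):
        # Propagate upward: merge sibling counts to get one count per node
        # of this level (a node corresponds to a bit prefix of length `level`).
        hist = [hist[2 * p] + hist[2 * p + 1] for p in range(1 << level)]

        # An exclusive prefix sum gives the start position of every node.
        borders = [0] * (1 << level)
        for p in range(1, 1 << level):
            borders[p] = borders[p - 1] + hist[p - 1]

        # Scan the text once and write each character's bit into its node.
        bv = [0] * n
        for c in text:
            p = c >> (bits - level)                     # node of c on this level
            bv[borders[p]] = (c >> (bits - 1 - level)) & 1
            borders[p] += 1
        levels[level] = bv
    return levels
```

For the text `[0, 1, 3, 2, 1, 0]` with `bits=2`, the root level holds the high bits `[0, 0, 1, 1, 0, 0]`, and the second level holds the low bits grouped by node, `[0, 1, 1, 0, 1, 0]`. Note that the node borders of every level are derived from the leaf histogram alone, without sorting or re-partitioning the text.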


Improving the quality of disaggregated SDG indicators with cluster information for small area estimates

Statistical Journal of the IAOS ◽ 10.3233/sji-200741 ◽ 2020 ◽ Vol 36 (4) ◽ pp. 955-961

Author(s): Rizky Zulkarnain ◽ Dwi Jayanti ◽ Tri Listianingrum

Keyword(s): Small Area ◽ Research University ◽ Direct Method ◽ Random Effect ◽ Auxiliary Information ◽ District Level ◽ Small Areas ◽ Associate Dean ◽ Development Goals

The increasing need for more disaggregated data motivates National Statistical Offices (NSOs) to develop efficient methods for producing official statistics without compromising on quality. In Indonesia, regional autonomy requires that Sustainable Development Goals (SDGs) indicators be available down to the district level. However, several surveys, such as the Indonesian Demographic and Health Survey, produce estimates down to the provincial level only. This leaves gaps in support for district-level policies. Small area estimation (SAE) techniques are often considered as a way to overcome this issue. SAE enables more reliable estimation for small areas by utilizing auxiliary information from other sources. However, the standard SAE approach has limitations in estimating non-sampled areas. This paper introduces an approach to estimating the random effect of a non-sampled area by utilizing cluster information. The model is demonstrated via the estimation of contraception prevalence rates at the district level in North Sumatera province. The results show that small area estimates that consider cluster information (SAE-cluster) are more precise than those of the direct method. The SAE-cluster approach revises the direct estimates upward or downward. This approach has important implications for improving the quality of disaggregated SDGs indicators without increasing cost. The paper was prepared under the kind mentorship of Professor James J. Cochran, Associate Dean for Research, Prof. of Statistics and Operations Research, University of Alabama.
