In Pursuit of Knowledge: Preschoolers Expect Agents to Weigh Information Gain and Information Cost When Deciding Whether to Explore

2021 ◽  
Author(s):  
Rosie Aboody ◽  
Caiqin Zhou ◽  
Julian Jara-Ettinger

When deciding whether to explore, agents must consider both their need for information and its cost. Do children recognize that exploration reflects a trade-off between action costs and expected information gain, inferring epistemic states accordingly? In two experiments, 4- and 5-year-olds (N=144; of diverse race and ethnicity) judge that an agent who refuses to obtain low-cost information must have already known it, and an agent who incurs a greater cost to gain information must have a greater epistemic desire. Two control studies suggest that these findings cannot be explained by low-level associations between competence and knowledge. Our results suggest that preschoolers’ Theory of Mind includes expectations about how costs interact with epistemic desires and states to produce exploratory action.
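One way to picture the trade-off these children appear to track: exploration is worthwhile only when the expected information gain outweighs the action cost. The sketch below is purely illustrative and is not the authors' model; the utility values are hypothetical.

```python
# Purely illustrative sketch (not the authors' model): an agent explores only when
# the expected value of the information exceeds the cost of obtaining it.
def should_explore(info_value: float, cost: float) -> bool:
    """Explore when expected information gain outweighs the action cost."""
    return info_value > cost

# An agent who refuses a low-cost peek is inferred to already know the answer
# (info_value near 0); an agent who pays a high cost must value the information more.
print(should_explore(info_value=0.0, cost=0.5))   # False: nothing left to learn
print(should_explore(info_value=2.0, cost=1.5))   # True: the knowledge is worth the effort
```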


2020 ◽  
Vol 17 (1) ◽  
pp. 319-328
Author(s):  
Ade Muchlis Maulana Anwar ◽  
Prihastuti Harsani ◽  
Aries Maesya

Population data are individual or aggregate data structured as a result of population registration and civil registration activities. A birth certificate is a civil registration deed recording the birth of a baby, whose birth is reported so that it can be registered on the Family Card and assigned a Population Identification Number (NIK) as a basis for obtaining other community services. Of the 570,637 birth certificate reports integrated into the 2018 Population Administration Information System (SIAK), 503,946 were reported late and only 66,691 were reported on time. Clustering is a method used to group data so that items within one group are more similar to each other than to items in other groups. K-nearest neighbor is a method for classifying objects based on the training data closest to the test data. K-means is a method used to divide a number of objects into groups around category midpoints (centroids). In the data mining preprocessing stage, the data are cleaned by filling empty values with the most frequent (dominant) value, and attributes are selected using the information gain method. Applying the k-nearest neighbor method to predict reporting delays and the k-means method to group priority service areas, on 10,000 birth certificate records from 2019, yields reasonably good performance: a prediction accuracy of 74.00% and, with K = 2, a Davies-Bouldin index of 1.179 for k-means.
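A minimal sketch of the pipeline described above, using scikit-learn in place of whatever tooling the authors used. The file name, the "reported_late" label column, and the preprocessing details are assumptions, and the feature columns are assumed to already be numeric.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score, davies_bouldin_score

df = pd.read_csv("birth_certificates_2019.csv")   # hypothetical local file
df = df.fillna(df.mode().iloc[0])                 # fill blanks with the most dominant value

X = df.drop(columns=["reported_late"])            # hypothetical label column
y = df["reported_late"]

# k-nearest neighbor to predict reporting delays
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, knn.predict(X_test)))

# k-means with K = 2 to group service areas, evaluated with the Davies-Bouldin index
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Davies-Bouldin:", davies_bouldin_score(X, kmeans.labels_))
```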


2017 ◽  
Author(s):  
James Gibson

Despite what we learn in law school about the “meeting of the minds,” most contracts are merely boilerplate—take-it-or-leave-it propositions. Negotiation is nonexistent; we rely on our collective market power as consumers to regulate contracts’ content. But boilerplate imposes certain information costs because it often arrives late in the transaction and is hard to understand. If those costs get too high, then the market mechanism fails. So how high are boilerplate’s information costs? A few studies have attempted to measure them, but they all use a “horizontal” approach—i.e., they sample a single stratum of boilerplate and assume that it represents the whole transaction. Yet real-world transactions often involve multiple layers of contracts, each with its own information costs. What is needed, then, is a “vertical” analysis, a study that examines fewer contracts of any one kind but tracks all the contracts the consumer encounters, soup to nuts. This Article presents the first vertical study of boilerplate. It casts serious doubt on the market mechanism and shows that existing scholarship fails to appreciate the full scale of the information cost problem. It then offers two regulatory solutions. The first works within contract law’s unconscionability doctrine, tweaking what the parties need to prove and who bears the burden of proving it. The second, more radical solution involves forcing both sellers and consumers to confront and minimize boilerplate’s information costs—an approach I call “forced salience.” In the end, the boilerplate experience is as deep as it is wide. Our empirical work should reflect that fact, and our policy proposals should too.


2021 ◽  
Vol 15 (8) ◽  
pp. 912-926
Author(s):  
Ge Zhang ◽  
Pan Yu ◽  
Jianlin Wang ◽  
Chaokun Yan

Background: Rapid developments in bioinformatics technologies have led to the accumulation of large amounts of biomedical data. However, these datasets usually involve thousands of features and contain much irrelevant or redundant information, which can confound diagnosis. Feature selection, which seeks the optimal feature subset, is one solution; it is known to be NP-hard because of the large search space. Objective: To address this issue, this paper proposes a hybrid feature selection method, called IGICRO, that combines an improved chemical reaction optimization algorithm (ICRO) with an information gain (IG) approach. Methods: IG is adopted to obtain a set of important features. A neighborhood search mechanism is combined with ICRO to increase the diversity of the population and improve the capacity for local search. Results: Experimental results on eight publicly available data sets demonstrate that the proposed approach outperforms the original CRO and other state-of-the-art approaches.
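A rough sketch of the two-stage idea on a stand-in dataset: an information-gain (mutual information) filter keeps the most informative features, and a subset search then refines the selection. A simple greedy neighborhood search is used here as a stand-in for the ICRO stage, which is considerably more involved; the dataset, classifier, and cut-off of 15 features are all assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Stage 1: keep the top features ranked by information gain (mutual information).
ig = mutual_info_classif(X, y, random_state=0)
top = np.argsort(ig)[::-1][:15]

# Stage 2: greedy neighborhood search over the reduced feature pool
# (stand-in for the ICRO search described in the paper).
def score(mask):
    cols = top[mask]
    if len(cols) == 0:
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, cols], y, cv=5).mean()

mask = np.ones(len(top), dtype=bool)
best = score(mask)
improved = True
while improved:
    improved = False
    for i in range(len(top)):          # flip one feature at a time
        trial = mask.copy()
        trial[i] = ~trial[i]
        s = score(trial)
        if s > best:
            mask, best, improved = trial, s, True

print("selected feature indices:", top[mask], "cv accuracy:", round(best, 3))
```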


1991 ◽  
Vol 56 (3) ◽  
pp. 505-559 ◽  
Author(s):  
Karel Eckschlager

In this review, analysis is treated as a process of gaining information about chemical composition that takes place in a stochastic system. A model of this system is outlined, and a survey of information-theoretic measures and methods is presented to the extent that they are useful for qualitative (identification), quantitative, trace, and multicomponent analysis. A distinction is drawn between the information content of an analytical signal and the information gain, or amount of information, obtained by the analysis, and their interrelation is demonstrated. Some notions of analytical chemistry are quantified from the points of view of information theory and systems theory; it is also shown that fuzzy set theory can usefully be applied. The review sums up the principal results of a series of 25 papers published in this journal since 1971.
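For orientation, the information gain of an analysis is commonly formalized in this tradition as the divergence between the analyst's prior and posterior distributions over the composition; the sketch below gives that standard definition and is not necessarily the exact formulation used in the review.

```latex
% Information gain of an analysis: Kullback-Leibler divergence between the
% posterior p_1(x) and the prior p_0(x) distribution of the composition x.
I(p_1 \,\|\, p_0) \;=\; \int p_1(x)\,\log_2 \frac{p_1(x)}{p_0(x)}\,\mathrm{d}x
```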


2018 ◽  
Vol 7 (1) ◽  
pp. 57-72
Author(s):  
H.P. Vinutha ◽  
Poornima Basavaraju

Network security is becoming a more challenging task day by day. Intrusion detection systems (IDSs) are one of the methods used to monitor network activities, and data mining algorithms play a major role in this field. The NSL-KDD'99 dataset is used to study network traffic patterns, which helps identify possible attacks taking place on the network. The dataset contains 41 attributes and one class attribute categorized as normal, DoS, Probe, R2L and U2R. The proposed methodology aims to reduce the false positive rate and improve the detection rate by reducing the dimensionality of the dataset, since using all 41 attributes for detection is not good practice. Four feature selection methods, Chi-Square, Symmetrical Uncertainty (SU), Gain Ratio and Information Gain, are used to evaluate the attributes, and unimportant features are removed to reduce the dimensionality of the data. Ensemble classification techniques, Boosting, Bagging, Stacking and Voting, are then used to observe the detection rate separately with three base algorithms: Decision Stump, J48 and Random Forest.
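A minimal sketch of one branch of this methodology: rank attributes with an information-gain-style score, drop the weakest ones, and train a bagging ensemble over a decision-tree base learner. The file name and "class" column are assumptions, a local, already-downloaded copy of NSL-KDD is presumed, and a recent scikit-learn is assumed.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("nsl_kdd_train.csv")              # hypothetical local copy of NSL-KDD
y = LabelEncoder().fit_transform(df["class"])      # normal / DoS / Probe / R2L / U2R
X = pd.get_dummies(df.drop(columns=["class"]))     # one-hot encode symbolic attributes

# Keep the 20 attributes with the highest mutual information with the class label.
X_sel = SelectKBest(mutual_info_classif, k=20).fit_transform(X, y)

# Bagging over decision trees, one of the ensemble schemes compared in the study.
clf = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=30, random_state=0)
print("cv detection accuracy:", cross_val_score(clf, X_sel, y, cv=5).mean())
```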

