correct object: Recently Published Documents

TOTAL DOCUMENTS: 31 (FIVE YEARS: 9)
H-INDEX: 6 (FIVE YEARS: 1)

Author(s):  
Wilka Carvalho ◽  
Anthony Liang ◽  
Kimin Lee ◽  
Sungryull Sohn ◽  
Honglak Lee ◽  
...  

Learning how to execute complex tasks involving multiple objects in a 3D world is challenging when there is no ground-truth information about the objects or any demonstration to learn from. When an agent receives only a task-completion signal, it is difficult to learn the object representations that support learning the correct object interactions needed to complete the task. In this work, we formulate learning an attentive object-dynamics model as a classification problem, using random object-images to define incorrect labels for our object-dynamics model. We show empirically that this enables object-representation learning that captures an object's category (is it a toaster?), its properties (is it on?), and object relations (is something inside of it?). With this, our core learner (a relational RL agent) receives the dense training signal it needs to rapidly learn object-interaction tasks. We demonstrate results in the 3D AI2Thor simulated kitchen environment with a range of challenging food-preparation tasks. We compare our method's performance to several related approaches and against the performance of an oracle: an agent that is supplied with ground-truth information about objects in the scene. We find that our agent achieves performance closest to the oracle in terms of both learning speed and maximum success rate.
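The key idea here, treating dynamics learning as classification with random object-images as incorrect labels, can be illustrated with a minimal numpy sketch. All names, dimensions, and scoring choices below are illustrative assumptions, not the paper's actual architecture: the predicted next-state embedding is scored against the true object embedding and several random "incorrect" embeddings, and cross-entropy is taken with the true object as the correct label.

```python
import numpy as np

def contrastive_dynamics_loss(pred, positive, negatives):
    """Score the predicted next-state embedding against the true object
    embedding (index 0) and random object embeddings, then take
    cross-entropy with index 0 as the correct label."""
    candidates = np.vstack([positive[None, :], negatives])  # (1 + K, d)
    logits = candidates @ pred                              # dot-product similarity
    logits -= logits.max()                                  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

d = 8
true_obj = np.ones(d)
random_objs = -np.ones((5, d))  # stand-ins for random object-images
good = contrastive_dynamics_loss(true_obj, true_obj, random_objs)   # accurate prediction
bad = contrastive_dynamics_loss(-true_obj, true_obj, random_objs)   # inaccurate prediction
```

A model whose prediction lands near the true object's embedding incurs a much lower loss than one whose prediction matches the negatives, which is the dense training signal the abstract refers to.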


Author(s):  
Van-Quang Nguyen ◽  
Masanori Suganuma ◽  
Takayuki Okatani

There is growing interest in the community in making an embodied AI agent perform a complicated task while interacting with an environment following natural-language directives. Recent studies have tackled the problem using ALFRED, a well-designed dataset for the task, but achieved only very low accuracy. This paper proposes a new method, which outperforms the previous methods by a large margin. It is based on a combination of several new ideas. One is a two-stage interpretation of the provided instructions. The method first selects and interprets an instruction without using visual information, yielding a tentative prediction of the action sequence. It then integrates the prediction with the visual information, yielding the final prediction of an action and an object. Because the class of the object to interact with is identified in the first stage, the method can accurately select the correct object from the input image. Moreover, our method considers multiple egocentric views of the environment and extracts essential information by applying hierarchical attention conditioned on the current instruction. This contributes to the accurate prediction of actions for navigation. A preliminary version of the method won the ALFRED Challenge 2020. The current version achieves a success rate of 4.45% in unseen environments with a single view, which is further improved to 8.37% with multiple views.
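The hierarchical attention over multiple egocentric views can be sketched in a few lines of numpy. The shapes, the dot-product scoring, and the toy embeddings are assumptions for illustration, not the paper's model: region features within each view are attended conditioned on the instruction embedding, and the resulting per-view summaries are attended again to produce one pooled feature.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend_views(instruction, view_feats):
    """Two-level attention: over regions within each view, then across views,
    both conditioned on the instruction embedding."""
    # view_feats: (views, regions, d); instruction: (d,)
    region_attn = softmax(view_feats @ instruction, axis=-1)          # (V, R)
    view_summary = (region_attn[..., None] * view_feats).sum(axis=1)  # (V, d)
    view_attn = softmax(view_summary @ instruction)                   # (V,)
    pooled = (view_attn[:, None] * view_summary).sum(axis=0)          # (d,)
    return pooled, view_attn

d = 4
instruction = np.eye(d)[0]             # toy instruction embedding
front = np.tile(np.eye(d)[0], (3, 1))  # view whose regions match the instruction
left = np.tile(np.eye(d)[1], (3, 1))   # irrelevant view
pooled, view_attn = attend_views(instruction, np.stack([front, left]))
```

The view aligned with the instruction receives the larger attention weight, so the pooled feature is dominated by the relevant egocentric view.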


2021 ◽  
Author(s):  
Dora Kampis ◽  
Helle Lukowski Duplessy ◽  
Victoria Southgate

Adults and children sometimes commit ‘egocentric errors’, failing to ignore their own perspective when interpreting others’ communication. Training imitation-inhibition reduces these errors in adults, facilitating perspective-taking. This study tested whether imitation-inhibition training may also facilitate perspective-taking in 3- to 6-year-olds, an age at which the egocentric perspective may be particularly influential. Children participated in a 10-minute imitation-inhibition, imitation, or non-social-inhibition training (White; n=25 per condition; 33 female), and subsequently completed the communicative-perspective-taking Director task. Training had a significant effect (F(2, 71) = 3.268, p = .044, η2 = .084): on critical trials the imitation-inhibition group selected the correct object more often than the imitation and non-social-inhibition training groups. The imitation-inhibition training thus specifically enhanced the perspective-taking process, indicating that perspective-taking from childhood onwards involves managing self-other representations.


2021 ◽  
pp. 136700692110286
Author(s):  
Giovanna Morini ◽  
Rochelle S. Newman

Aims and objectives: The purpose of this study was to examine whether differences in language exposure (i.e., being raised in a bilingual versus a monolingual environment) influence young children’s ability to comprehend words when speech is heard in the presence of background noise. Methodology: Forty-four children (22 monolinguals and 22 bilinguals) between the ages of 29 and 31 months completed a preferential looking task where they saw picture-pairs of familiar objects (e.g., balloon and apple) on a screen and simultaneously heard sentences instructing them to locate one of the objects (e.g., look at the apple!). Speech was heard in quiet and in the presence of competing white noise. Data and analyses: Children’s eye-movements were coded off-line to identify the proportion of time they fixated on the correct object on the screen and performance across groups was compared using a 2 × 3 mixed analysis of variance. Findings: Bilingual toddlers performed worse than monolinguals during the task. This group difference in performance was particularly clear when the listening condition contained background noise. Originality: There are clear differences in how infants and adults process speech in noise. To date, developmental work on this topic has mainly been carried out with monolingual infants. This study is one of the first to examine how background noise might influence word identification in young bilingual children who are just starting to acquire their languages. Significance: High noise levels are often reported in daycares and classrooms where bilingual children are present. Therefore, this work has important implications for learning and education practices with young bilinguals.
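The dependent measure described above, the proportion of time fixating the correct object, can be computed from frame-by-frame gaze codes roughly as follows. The coding scheme and the sample trial are hypothetical, not taken from the study:

```python
def proportion_target_looking(frames):
    """frames: one code per video frame, 'T' = fixating target,
    'D' = fixating distractor, '-' = looking away.
    Returns time on target as a proportion of time on either object."""
    on_target = frames.count('T')
    on_objects = on_target + frames.count('D')
    return on_target / on_objects if on_objects else float('nan')

trial = "TTTDT-TTDD"  # hypothetical coded trial (10 frames)
prop = proportion_target_looking(trial)
```

Per-trial proportions like this, averaged within listening condition (quiet vs. noise) and language group, are what would enter the 2 × 3 mixed analysis of variance.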


2021 ◽  
Author(s):  
Umesh Patil ◽  
Sol Lago

We propose a retrieval interference-based explanation of a prediction advantage effect observed in Stone et al. (2021). They reported two dual-task eye-tracking experiments in which participants listened to instructions involving German possessive pronouns, e.g. ‘Click on his blue button’, and were asked to select the correct object from a set of objects displayed on screen. Participants’ eye movements showed predictive processing, such that the target object was fixated before its name was heard. Moreover, when the target and the antecedent of the pronoun matched in gender, predictions arose earlier than when the two genders mismatched — a prediction advantage. We propose that the prediction advantage arises due to similarity-based interference during antecedent retrieval, such that the overlap of gender features between the antecedent and possessum boosts the activation level of the latter and helps predict it faster. We report an ACT-R model supporting this hypothesis. Our model also provides a computational implementation of the idea that prediction can be thought of as memory retrieval. In addition, we provide a preliminary ACT-R model of how linguistic processes could drive changes in visual attention.
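The interference account rests on ACT-R's standard activation and latency equations, A_i = B_i + sum_j W_j * S_ji and T = F * exp(-A_i). A toy numeric sketch (the weights and associative strengths are made-up values, not the model's fitted parameters) shows how an extra feature overlap raises activation and shortens predicted retrieval time:

```python
import math

def activation(base, cue_weights, assoc_strengths):
    """ACT-R spreading activation: A_i = B_i + sum_j W_j * S_ji."""
    return base + sum(w * s for w, s in zip(cue_weights, assoc_strengths))

def retrieval_time(a, latency_factor=0.2):
    """ACT-R retrieval latency: T = F * exp(-A)."""
    return latency_factor * math.exp(-a)

W = [0.5, 0.5]  # weights for two retrieval cues (hypothetical values)
gender_match = activation(0.0, W, [1.0, 1.0])     # gender features overlap: both cues spread activation
gender_mismatch = activation(0.0, W, [1.0, 0.0])  # no overlap on the gender cue
```

Under these equations, the gender-match configuration yields higher activation and hence a faster predicted retrieval, mirroring the earlier predictions observed in the matching condition.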


Author(s):  
Rebecca L. Monk ◽  
Lauren Colbert ◽  
Gemma Darker ◽  
Jade Cowling ◽  
Bethany Jones ◽  
...  

Abstract Background Theory of mind (ToM), the ability to understand that others have knowledge and beliefs different from our own, has been the subject of extensive research suggesting that we are not always efficient at taking another’s perspective, known as visual perspective taking (VPT). A growing literature has explored the individual-level factors that may affect perspective taking (e.g. empathy and group membership). However, while emotion and (dis)liking are key aspects of everyday social interaction, research has not hitherto explored how these factors may impact ToM. Method A total of 164 participants took part in a modified director task (31 males (19%), M age = 20.65, SD age = 5.34), exploring how correct object selection may be affected by another’s emotion (director facial emotion: neutral, happy, or sad) and by knowledge of their (dis)likes (i.e. whether the director likes specific objects). Result When the director liked the target object or disliked the competitor object, accuracy rates were higher than when he disliked the target object or liked the competitor object. When the emotion shown by the director was incongruent with their stated (dis)liking of an object (e.g. happy when he disliked an object), accuracy rates were also higher. None of these effects were significant in the analysis of response time. These findings suggest that knowledge of liking may impact ToM use, as can emotional incongruency, perhaps by increasing the saliency of perspective differences between participant and director. Conclusion As well as contributing further to our understanding of real-life social interactions, these findings may have implications for ToM research, where it appears that more consideration of the target/director’s characteristics may be prudent.


2020 ◽  
Author(s):  
Cornelia Schulze ◽  
David Buttelmann

Interpreting a speaker’s communicative acts is a challenge children face constantly in everyday life. In doing so, they seem to understand direct communicative acts more easily than indirect communicative acts. The present study investigated which step in the processing of communicative acts might cause difficulties in understanding indirect communication. To assess the developmental trajectory of this phenomenon, we tested 3- and 5-year-old children (N=105) using eye tracking and an object-choice task. The children watched videos that showed puppets during their everyday activities (e.g., pet care). For every activity, the puppets were asked which of two objects (e.g., rabbit or dog) they would rather have. The puppets responded either directly (“I want the rabbit”) or indirectly (“I have a carrot”). Results showed that children chose the object intended by the puppets more often in the direct- than in the indirect-communication condition, and 5-year-olds chose correctly more often than 3-year-olds. However, even though children’s pupil size increased while hearing the utterances, we found no effect of communication type before children had already settled on the correct object during object selection by looking at it. Only after this point, that is, in children’s subsequent fixation patterns and reaction times, did differences between communication types occur. Thus, although children’s object-choice performance suggests that indirect communication is harder to understand than direct communication, the cognitive demands during processing of both communication types seem similar. We discuss theoretical implications of these findings for developmental pragmatics in terms of a dual-process account of communication comprehension.


2019 ◽  
Author(s):  
Sho Tsuji ◽  
Nobuyuki Jincho ◽  
Reiko Mazuka ◽  
Alejandrina Cristia

Is infants’ word learning boosted by nonhuman social agents? An on-screen virtual agent taught infants word–object associations in a setup where the presence of contingent and referential cues could be manipulated using gaze contingency. In the study, 12-month-old Japanese-learning children (N = 36) looked significantly more to the correct object when it was labeled after exposure to a contingent and referential display versus a noncontingent and nonreferential display. These results show that communicative cues can augment learning even for a nonhuman agent, a finding highly relevant for our understanding of the mechanisms through which the social environment supports language acquisition and for research on the use of interactive screen media.


2019 ◽  
Vol 286 (1894) ◽  
pp. 20182332 ◽  
Author(s):  
Sarah A. Jelbert ◽  
Rachael Miller ◽  
Martina Schiestl ◽  
Markus Boeckle ◽  
Lucy G. Cheke ◽  
...  

Humans use a variety of cues to infer an object's weight, including how easily objects can be moved. For example, if we observe an object being blown down the street by the wind, we can infer that it is light. Here, we tested whether New Caledonian crows make this type of inference. After training that only one type of object (either light or heavy) was rewarded when dropped into a food dispenser, birds observed pairs of novel objects (one light and one heavy) suspended from strings in front of an electric fan. The fan was either on—creating a breeze which buffeted the light, but not the heavy, object—or off, leaving both objects stationary. In subsequent test trials, birds could drop one, or both, of the novel objects into the food dispenser. Despite having no opportunity to handle these objects prior to testing, birds touched the correct object (light or heavy) first in 73% of experimental trials, and were at chance in control trials. Our results suggest that birds used pre-existing knowledge about the behaviour exhibited by differently weighted objects in the wind to infer their weight, using this information to guide their choices.


2018 ◽  
Author(s):  
Lihui Wang ◽  
Fariba Sharifian ◽  
Jonathan Napp ◽  
Carola Nath ◽  
Stefan Pollmann

Abstract The perception gained by retina implants (RI) is limited, which calls for a learning regime to improve patients’ visual perception. Here we simulated RI vision and investigated whether object recognition in RI patients can be improved and maintained through training. Importantly, we asked whether the trained object recognition can be generalized to a new task context and to new viewpoints of the trained objects. For this purpose, we adopted two training tasks: a naming task, where participants had to choose the correct label out of other distracting labels for the presented object, and a discrimination task, where participants had to choose the correct object out of other distracting objects to match the presented label. Our results showed that, regardless of task order, recognition performance improved in both tasks and lasted for at least a week. The improved object recognition, however, could be transferred only from the naming task to the discrimination task, not vice versa. Additionally, the trained object recognition could be transferred to new viewpoints of the trained objects only in the naming task, not in the discrimination task. Training with the naming task is therefore recommended for RI patients to achieve persistent and flexible visual perception.

