Bio-Inspired Modality Fusion for Active Speaker Detection

2021 ◽  
Vol 11 (8) ◽  
pp. 3397
Author(s):  
Gustavo Assunção ◽  
Nuno Gonçalves ◽  
Paulo Menezes

Human beings have developed fantastic abilities to integrate information from various sensory sources exploring their inherent complementarity. Perceptual capabilities are therefore heightened, enabling, for instance, the well-known "cocktail party" and McGurk effects, i.e., speech disambiguation from a panoply of sound signals. This fusion ability is also key in refining the perception of sound source location, as in distinguishing whose voice is being heard in a group conversation. Furthermore, neuroscience has successfully identified the superior colliculus region in the brain as the one responsible for this modality fusion, with a handful of biological models having been proposed to approach its underlying neurophysiological process. Deriving inspiration from one of these models, this paper presents a methodology for effectively fusing correlated auditory and visual information for active speaker detection. Such an ability can have a wide range of applications, from teleconferencing systems to social robotics. The detection approach initially routes auditory and visual information through two specialized neural network structures. The resulting embeddings are fused via a novel layer based on the superior colliculus, whose topological structure emulates spatial neuron cross-mapping of unimodal perceptual fields. The validation process employed two publicly available datasets, with achieved results confirming and greatly surpassing initial expectations.

1982 ◽  
Vol 57 (3) ◽  
pp. 309-315
Author(s):  
Mortimer J. Adler

In his 1982 Cushing oration, a distinguished philosopher, author, and discerning critic presents a distillate of his phenomenally wide range of personal experience and his familiarity with the great books and teachers of the present and the past. He explores the differences and relationships between human beings, brute animals, and machines. Knowledge of the brain and nervous system contributes to the explanation of all aspects of animal behavior, intelligence, and mentality, but cannot completely explain human conceptual thought.


2005 ◽  
Vol 15 (2) ◽  
pp. 219-240 ◽  
Author(s):  
A. MARK SMITH

From the late thirteenth to the early seventeenth century, the process of visual imaging was understood in the Latin West as an essentially subjective act initiated by the eye and completed by the brain. The crystalline lens took center stage in this act, its role determined by its peculiar physical and sensitive capacities. As a physical body, on the one hand, it was disposed to accept the physical impressions of light and color radiating to it from external objects. As a sensitive body, on the other hand, it was enabled by the visual spirit flowing to it from the brain to feel those impressions visually. Acting as a sentient selector of visual information, the lens transformed the brute physical impressions of light and color into visual impressions. These, in turn, gave rise to perceptual “depictions” that were passed back along the stream of visual spirits to the brain. Known in Scholastic parlance as “intentional species,” these depictions served as virtual representations of their generating objects. As such, they provided the wherewithal not only for perception, but also for conception and cognition.


1977 ◽  
Vol 86 (3) ◽  
pp. 362-370 ◽  
Author(s):  
Douglas G. Mann ◽  
Clarence T. Sasaki ◽  
Hiroyuki Fukuda ◽  
Masafumi Suzuki ◽  
Juan R. Hernandez

The human nose is an important organ of respiration which by virtue of its valvular influence becomes a significant effector of respiratory resistance over a wide range of ventilatory requirements. In man its effectiveness in this regard is related to its flow limiting segment (FLS) located at the limen nasi. Its passive valvular effect is additionally modified by active respiratory contractions of the dilator naris muscle (DNM) controlled through the VII cranial nerve by the brain stem respiratory center. Its behavior, quantitatively determined in human beings and experimental animals, is summarized. 1) In man, phasic DNM activity operates during eupneic nasal breathing and varies directly with ventilatory resistance. 2) The elimination of all measurable ventilatory resistance results in complete cessation of DNM activity. 3) Over time, reduced resistance produces difficulty in reestablishing dilator function once it is physiologically lost. 4) DNM respiratory activity is modified by pulmonary mechano- and pressure-receptors via afferent vagal pathways. The response of nasal dilators in valvular control, therefore, appears dependent on the physiologic integrity of the vagus nerves. It is our belief that nasal valvular control has not previously been appreciated in this context.


Author(s):  
Marisel Moreno

Migration has always been at the core of Latina/o literature. In fact, it would be difficult to find any work in this corpus that does not address migration to some extent. This is because, save some exceptions, the experience of migration is the unifying condition from which Latina/o identities have emerged. All Latinas/os trace their family origins to Latin America and/or the Hispanic Caribbean. That said, not all of them experience migration first-hand or in the same manner; there are many factors that determine why, how, when, and where migration takes place. Yet, despite all of these factors, it is safe to say that a crucial reason behind the mass movements of people from Latin America and the Hispanic Caribbean to the United States has been direct or indirect US involvement in the countries of origin. This is evident, for instance, in the cases of Puerto Rico (invasion of 1898) and Central America (civil wars in the 1980s), where US intervention led to migration to the United States in the second half of the 20th century. Other factors that tend to affect the experience of migration include nationality, class, race, ethnicity, gender, sexuality, religion, language, citizenship status, age, ability, and the historical juncture at which migration takes place. The heterogeneous ways in which migration is represented in Latina/o literature reflect the wide range of factors that influence and shape the experience of migration. Latina/o narrative, poetry, theatre, essay, and other forms of literary expressions capture the diversity of the migration experience. Some of the constant themes that emerge in these works include nostalgia, transculturation, discrimination, racism, uprootedness, hybridity, and survival. In addressing these issues, Latina/o literature brings visibility to the complexities surrounding migration and Latina/o identity, while undermining the one-dimensional and negative stereotypes that tend to dehumanize Latinas/os in US dominant society. 
Most importantly, it allows the public to see that while migration is complex and in constant flux, those who experience it are human beings in search of survival.


Mind Shift ◽  
2021 ◽  
pp. 125-139
Author(s):  
John Parrington

This chapter explores how language helps human beings to group, distinguish, and differentiate between things in the world around them. In other words, what is the basis of conceptual thought, and how does this relate to language capacity as a whole? On the one hand, behaviourists have argued that human beings start life as a ‘blank slate’, and our language capacity is something that we learn by exposure to the language of others. On the other, Noam Chomsky proposed that humans are born with an innate capacity for language, as shown by the ease with which children rapidly learn a wide vocabulary, but also by their remarkable capacity for linking words together in complex grammatical structures. The chapter then looks at developments in the search for a biological basis for human language. Studies suggest that language in humans involves a complex interplay between a number of different genes, FOXP2 being a key one, which affect the connections between neurons in specific parts of the brain. However, it remains unclear whether the FOXP2 gene affects neurons involved in language processing per se or those that control muscles involved in speech.


2003 ◽  
Vol 90 (3) ◽  
pp. 1887-1903 ◽  
Author(s):  
Nicholas L. Port ◽  
Robert H. Wurtz

The visual world presents multiple potential targets that can be brought to the fovea by saccadic eye movements. These targets produce activity at multiple sites on a movement map in the superior colliculus (SC), an area of the brain related to saccade generation. The saccade made must result from competition between the populations of neurons representing these many saccadic goals, and in the present experiments we used multiple moveable microelectrodes to follow this competition. We recorded simultaneously from two sites on the SC map where each site was related to a different saccade target. The two targets appeared in rapid sequence, and the monkey was rewarded for making a saccade toward the one appearing first. Our study concentrated on trials in which the monkey made strongly curved saccades that were directed first toward one target and then toward the other. These curved saccades activated both sites on the SC map as they veered from one target to the other. The major finding was that the strongly curved saccades were preceded by sequential activity in the two neurons as indicated by three observations: the firing rate for the neuron related to the first target reached its peak earlier than did the rate of the neuron for the second target; the timing of the peak activity of the two neurons was related to the beginning and end of the saccade curvature; a weighted vector-average model based on the activity of the two neurons predicted the timing of saccade curvature. Straight averaging saccades ended between the targets so that they did not go to either target, and they were accompanied by simultaneous rather than sequential activation of the two neurons. Thus when multiple populations of neurons are active on the SC movement map, the resulting saccade is determined by the relative timing of the activity in the populations as well as their magnitude. 
In contrast, SC activity at the two sites did not predict the final direction of the saccade, and several control experiments found insufficient activity at other sites on the SC map to account for that final direction. We conclude that the SC neuronal activity predicts the timing of the saccade curvature, but not the final direction of the trajectory. These observations are consistent with SC activity being critical in selecting the goal of the saccade, but not in determining the exact trajectory.
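The weighted vector-average computation named in this abstract can be illustrated with a minimal sketch. The abstract does not give the model's equations, so the form below (each site's saccade vector weighted by its current firing rate, then normalized by the total rate) is an assumption, and the function name, firing rates, and target vectors are all hypothetical illustration values, not data from the study.

```python
import math


def weighted_vector_average(rates, vectors):
    """Average the saccade vectors of active sites, weighted by firing rate.

    Assumed form (not taken from the paper): sum(rate_i * v_i) / sum(rate_i).
    """
    if len(rates) != len(vectors):
        raise ValueError("need one firing rate per saccade vector")
    total = sum(rates)
    if total <= 0:
        raise ValueError("at least one site must be active")
    x = sum(r * vx for r, (vx, _) in zip(rates, vectors)) / total
    y = sum(r * vy for r, (_, vy) in zip(rates, vectors)) / total
    return (x, y)


def direction_deg(vector):
    """Direction of a 2-D vector in degrees, counterclockwise from +x."""
    return math.degrees(math.atan2(vector[1], vector[0]))


# Hypothetical saccade vectors for two SC sites (degrees of visual angle).
targets = [(10.0, 0.0), (0.0, 10.0)]

# Sequential activity: site 1 dominates early, site 2 dominates late,
# so the predicted vector swings from the first target toward the second.
early = weighted_vector_average([90.0, 10.0], targets)
late = weighted_vector_average([10.0, 90.0], targets)

# Simultaneous, equal activity: the prediction lands between the targets.
equal = weighted_vector_average([50.0, 50.0], targets)
```

With sequential activity the predicted vector rotates over time, which is the qualitative pattern the abstract associates with curved saccades; with simultaneous activity it falls between the targets, as for the averaging saccades. The sketch says nothing about the final direction of the trajectory, which the authors report SC activity did not predict.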


Author(s):  
J L Mazher Iqbal ◽  
S. Arun

The detection of human beings in camera footage is attracting growing attention because of its wide range of applications, such as abnormal event detection, person counting in dense crowds, person identification, and fall detection in elderly care. Over time, various techniques have evolved to enhance the visual information. This article presents a novel 3-D intelligent information system for identifying abnormal human activity using background subtraction, rectification, morphology, neural networks, and depth estimation, with a thermal camera and a pair of hand-held Universal Serial Bus (USB) cameras capturing uncalibrated images. The proposed system detects the strongest points using Speeded-Up Robust Features (SURF). The Sum of Absolute Differences (SAD) algorithm matches the strongest points detected by SURF. 3-D object modeling and image stitching from image sequences are carried out in the proposed work. A series of images captured from different cameras is stitched into a geometrically consistent mosaic, either horizontally or vertically, depending on the image acquisition. 3-D images and depth estimates of uncalibrated stereo images are obtained using rectification and disparity. The background is separated from the scene using a threshold approach. Features are extracted using morphological operators in order to obtain the skeleton. Junction points and end points are then obtained from the skeleton image. A data set of abnormal human activity is created using supervised learning, such as a neural network, with a thermal camera and a pair of webcams. The feature vector of an activity is compared with the previously created data set; if a match occurs, the classifier detects abnormal human activity. Additionally, the proposed algorithm performs depth estimation to measure the real-time distance of objects dynamically. The system uses a thermal camera, an Intel computing stick, a video graphics array (VGA) to high-definition multimedia interface (HDMI) converter, and webcams.
The proposed novel intelligent information system achieves a maximum accuracy of 94% and a minimum accuracy of 89% across different activities, and thus effectively detects suspicious activity during both day and night.
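As an illustration of the SAD matching step, the following minimal sketch scores a small image patch against candidate patches and picks the one with the lowest sum of absolute differences. It is not the authors' implementation: the function names, the patch size, and the pixel values are hypothetical, and SURF keypoint detection is assumed to have already supplied the patches.

```python
def sad(patch_a, patch_b):
    """Sum of absolute differences between two equally sized 2-D patches."""
    if len(patch_a) != len(patch_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(patch_a, patch_b)
    ):
        raise ValueError("patches must have the same shape")
    return sum(
        abs(a - b)
        for row_a, row_b in zip(patch_a, patch_b)
        for a, b in zip(row_a, row_b)
    )


def best_match(patch, candidates):
    """Return (index, score) of the candidate with the lowest SAD score."""
    scores = [sad(patch, candidate) for candidate in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical 2x2 grayscale patch around a keypoint in the first image.
patch = [[10, 20], [30, 40]]

# Hypothetical patches around keypoints in the second image.
candidates = [
    [[0, 0], [0, 0]],      # SAD = 100
    [[11, 19], [30, 42]],  # SAD = 4
    [[40, 30], [20, 10]],  # SAD = 80
]

match_index, match_score = best_match(patch, candidates)
```

A lower SAD score means the two patches are more alike, so the second candidate is selected here. In a stereo setting, the horizontal offset between matched points would then feed the disparity and depth estimation the abstract describes.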


Author(s):  
Caroline A. Miller ◽  
Laura L. Bruce

The first visual cortical axons arrive in the cat superior colliculus by the time of birth. Adultlike receptive fields develop slowly over several weeks following birth. The developing cortical axons go through a sequence of changes before acquiring their adultlike morphology and function. To determine how these axons interact with neurons in the colliculus, cortico-collicular axons were labeled with biocytin (an anterograde neuronal tracer) and studied with electron microscopy. Deeply anesthetized animals received 200-500 nl injections of biocytin (Sigma; 5% in phosphate buffer) in the lateral suprasylvian visual cortical area. After a 24 hr survival time, the animals were deeply anesthetized and perfused with 0.9% phosphate buffered saline followed by fixation with a solution of 1.25% glutaraldehyde and 1.0% paraformaldehyde in 0.1M phosphate buffer. The brain was sectioned transversely on a vibratome at 50 μm. The tissue was processed immediately to visualize the biocytin.


2018 ◽  
Vol 23 (1) ◽  
pp. 10-13
Author(s):  
James B. Talmage ◽  
Jay Blaisdell

Injuries that affect the central nervous system (CNS) can be catastrophic because they involve the brain or spinal cord, and determining the underlying clinical cause of impairment is essential in using the AMA Guides to the Evaluation of Permanent Impairment (AMA Guides), in part because the AMA Guides addresses neurological impairment in several chapters. Unlike the musculoskeletal chapters, Chapter 13, The Central and Peripheral Nervous System, does not use grades, grade modifiers, and a net adjustment formula; rather the chapter uses an approach that is similar to that in prior editions of the AMA Guides. The following steps can be used to perform a CNS rating: 1) evaluate all four major categories of cerebral impairment, and choose the one that is most severe; 2) rate the single most severe cerebral impairment of the four major categories; 3) rate all other impairments that are due to neurogenic problems; and 4) combine the rating of the single most severe category of cerebral impairment with the ratings of all other impairments. Because some neurological dysfunctions are rated elsewhere in the AMA Guides, Sixth Edition, the evaluator may consult Table 13-1 to verify the appropriate chapter to use.
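The four rating steps can be sketched as a short calculation. The abstract does not state how ratings are combined; the sketch below assumes the Combined Values formula, A + B(1 − A) on decimal equivalents, and leaves results unrounded, whereas a published chart may round. The function names and all percentages are hypothetical, and the sketch is an illustration rather than a rating tool.

```python
def combine_two(a_percent, b_percent):
    """Combine two whole-person impairment percentages.

    Assumed formula (not given in the abstract): A + B * (1 - A),
    applied to decimal equivalents and returned as a percentage.
    """
    for value in (a_percent, b_percent):
        if not 0 <= value <= 100:
            raise ValueError("ratings must be between 0 and 100 percent")
    return a_percent + b_percent * (1 - a_percent / 100)


def cns_rating(cerebral_categories, other_impairments):
    """Follow the four steps: take the most severe cerebral category,
    then combine it with every other neurogenic impairment rating."""
    if len(cerebral_categories) != 4:
        raise ValueError("expected ratings for the four cerebral categories")
    combined = max(cerebral_categories)  # steps 1 and 2
    for rating in sorted(other_impairments, reverse=True):  # steps 3 and 4
        combined = combine_two(combined, rating)
    return combined


# Hypothetical ratings (percent) for the four cerebral categories,
# plus two other neurogenic impairments.
cerebral = [10, 30, 0, 20]
others = [20, 10]

total = cns_rating(cerebral, others)  # approximately 49.6
```

Only the single most severe cerebral category (30% here) enters the calculation; the other three are not added. Under the assumed formula the combined value never exceeds 100%, because each further rating applies only to the remaining unimpaired fraction.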


1970 ◽  
Vol 6 (1) ◽  
Author(s):  
Muskinul Fuad

The education system in Indonesia emphasizes academic intelligence, which includes only two or three aspects, more than the other aspects of intelligence. For that reason, many children who are not strong in academic intelligence, but have good potential in other aspects of intelligence, do not develop optimally. They are often considered and labeled as "stupid children" by the existing system. This phenomenon is contrary to the theory of multiple intelligences proposed by Howard Gardner, who argues that intelligence is the ability to solve various problems in life and produce products or services that are useful in various aspects of life. Human intelligence is a combination of various general and specific abilities. This theory is different from the concept of IQ (intelligence quotient), which involves only language skills and mathematical and spatial logic. According to Gardner, there are nine aspects of intelligence, each with potential indicators, to be developed by every child born without a brain defect. What Gardner suggested can be considered a starting point for the perspective that every child has a unique individual intelligence. Parents have to treat and educate their children proportionally and equitably. This treatment will lead to a pattern of education that is friendly to the brain and to the plurality of children’s potential. Beyond the above points, the notion that multiple intelligences do not come from the brain alone needs to be pursued. Humans actually have different immaterial (spiritual) aspects that do not refer to brain functions. The belief in spiritual aspects and their potential means that human beings have various capacities that differ from physical capacities. This is what needs to be addressed from the perspective of education today. The philosophy and perspective on education of educators, education stakeholders, and especially parents are the first major issue to be addressed. With this step, every educational activity and communication within the family is expected to develop every aspect of children's intelligence, especially spiritual intelligence.

