Dataset Information

The Acquisition of Noun and Verb Categories by Bootstrapping From a Few Known Words: A Computational Model.

ABSTRACT: While many studies have shown that toddlers are able to detect syntactic regularities in speech, the learning mechanism allowing them to do this is still largely unclear. In this article, we use computational modeling to assess the plausibility of a context-based learning mechanism for the acquisition of nouns and verbs. We hypothesize that infants can assign basic semantic features, such as "is-an-object" and/or "is-an-action," to the very first words they learn, then use these words, the semantic seed, to ground proto-categories of nouns and verbs. The contexts in which these words occur would then be exploited to bootstrap the noun and verb categories: unknown words are attributed to the class that has been observed most frequently in the corresponding context. To test our hypothesis, we designed a series of computational experiments which used French corpora of child-directed speech and semantic seeds of different sizes. We partitioned these corpora into training and test sets: the model extracted the two-word contexts of the seed from the training sets, then used them to predict the syntactic category of content words from the test sets. This very simple algorithm proved highly efficient in a categorization task: even the smallest semantic seed (only 8 nouns and 1 verb known) yielded a very high precision (~90% of new nouns; ~80% of new verbs). Recall, in contrast, was low for small seeds, and increased with the seed size. Interestingly, we observed that the contexts used most often by the model featured function words, which is in line with what we know about infants' language development. Crucially, for the learning method we evaluated here, all initialization hypotheses are plausible and fit the developmental literature (semantic seed and ability to analyse contexts).
While this experiment cannot prove that this learning mechanism is indeed used by infants, it demonstrates the feasibility of a realistic learning hypothesis, by using an algorithm that relies on very few computational and memory resources. Altogether, this supports the idea that a probabilistic, context-based mechanism can be very efficient for the acquisition of syntactic categories in infants.
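
The bootstrapping procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that a "two-word context" means the two words immediately preceding the target, and the function names, the `<s>` padding token, and the toy French sentences, seed, and gold labels are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict

PAD = "<s>"  # hypothetical sentence-start padding token


def learn_contexts(sentences, seed):
    """Count how often each two-word left context precedes a seed noun or verb."""
    counts = defaultdict(Counter)
    for words in sentences:
        padded = [PAD, PAD] + words
        for i in range(2, len(padded)):
            category = seed.get(padded[i])
            if category is not None:
                counts[(padded[i - 2], padded[i - 1])][category] += 1
    return counts


def categorize(sentences, counts, seed):
    """Label each non-seed word with the category seen most often in its context.

    Words whose context was never observed with a seed word get None.
    """
    predictions = []
    for words in sentences:
        padded = [PAD, PAD] + words
        for i in range(2, len(padded)):
            word = padded[i]
            if word in seed:
                continue
            seen = counts.get((padded[i - 2], padded[i - 1]))
            label = seen.most_common(1)[0][0] if seen else None
            predictions.append((word, label))
    return predictions


def precision_recall(predictions, gold, category):
    """Token-level precision and recall for one category, over words listed in gold."""
    scored = [(gold[w], p) for w, p in predictions if w in gold]
    predicted = sum(1 for g, p in scored if p == category)
    actual = sum(1 for g, p in scored if g == category)
    correct = sum(1 for g, p in scored if g == p == category)
    precision = correct / predicted if predicted else 0.0
    recall = correct / actual if actual else 0.0
    return precision, recall


# Toy data (hypothetical, not taken from the corpora used in the study).
seed = {"chat": "noun", "chien": "noun", "manger": "verb"}
train = [["regarde", "le", "chat"], ["prends", "le", "chien"], ["tu", "vas", "manger"]]
test = [["regarde", "le", "ballon"], ["tu", "vas", "dormir"]]
gold = {"ballon": "noun", "dormir": "verb", "regarde": "verb"}  # content words only

counts = learn_contexts(train, seed)
predictions = categorize(test, counts, seed)
print(precision_recall(predictions, gold, "noun"))  # (1.0, 1.0)
print(precision_recall(predictions, gold, "verb"))  # (1.0, 0.5)
```

In this toy run the verb "regarde" opens its sentence, so its context was never seen with a seed word and it stays unlabeled. Precision remains perfect while verb recall drops to 0.5, which mirrors the qualitative pattern the abstract reports for small seeds.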

SUBMITTER: Brusini P

PROVIDER: S-EPMC8416756 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

Project description: A long-standing but implicit assumption is that words strongly associated with a presented cue are automatically activated in memory through rapid spread of activation within brain semantic networks. The current study aimed to provide direct evidence of such rapid access to words' semantic representations and to investigate its neural sources using magnetoencephalography (MEG) and a distributed source localization technique. Thirty-three neurotypical subjects underwent MEG recording during a verb generation task, which was to produce verbs related to the presented noun cues. Brain responses evoked by the noun cues were examined while manipulating the strength of association between the noun and the potential verb responses. The strong vs. weak noun-verb association led to a greater noun-related neural response at 250-400 ms after cue onset, and faster verb production. The cortical sources of the differential response were localized in the left temporal pole, previously implicated in semantic access, and the left ventrolateral prefrontal cortex (VLPFC), thought to subserve controlled semantic retrieval. The strength of the left VLPFC's response to the nouns with strong verb associates was positively correlated with the speed of verb production. Our findings empirically validate the theoretical expectation that in the case of a strongly connected noun-verb pair, successful access to the target verb representation may occur already at the stage of lexico-semantic analysis of the presented noun. Moreover, the MEG results suggest that, contrary to the previous conclusion derived from fMRI studies, the left VLPFC supports selection of the target verb representations, even if they were retrieved from semantic memory rapidly and effortlessly.
The discordance between MEG and fMRI findings in the verb generation task may stem from different modes of neural activation captured by phase-locked activity in MEG and slow changes in the blood-oxygen-level-dependent (BOLD) signal in fMRI.

Project description: The behavioural and neural processes underpinning different word classes, particularly nouns and verbs, have been a long-standing area of interest in psycholinguistics, neuropsychology and aphasiology research. This topic has theoretical implications concerning the organisation of the language system, as well as clinical consequences related to the management of patients with language deficits. Research findings, however, have diverged widely, which might, in part, reflect methodological differences, particularly related to controlling the psycholinguistic variations between nouns and verbs. The first aim of this study, therefore, was to develop a set of neuropsychological tests that assessed single-word production and comprehension with a matched set of nouns and verbs. Secondly, the behavioural profiles and neural correlates of noun and verb processing were explored, based on these novel tests, in a relatively large cohort of 48 patients with chronic post-stroke aphasia. A data-driven approach, principal component analysis (PCA), was also used to determine how noun and verb production and comprehension were related to the patients' underlying fundamental language domains. The results revealed no performance differences between noun and verb production and comprehension once matched on multiple psycholinguistic features including, most critically, imageability. Interestingly, the noun-verb differences found in previous studies were replicated in this study once un-matched materials were used. Lesion-symptom mapping revealed overlapping neural correlates of noun and verb processing along left temporal and parietal regions. These findings support the view that the neural representation of noun and verb processing at the single-word level is jointly supported by distributed cortical regions.
The PCA generated five fundamental language and cognitive components of aphasia: phonological production, phonological recognition, semantics, fluency, and executive function. Consistent with the behavioural analyses and lesion-symptom mapping results, both noun and verb processing loaded on common underlying language domains: phonological production and semantics. The neural correlates of these five principal components aligned with existing models of language and the regions implicated by other techniques such as functional neuroimaging and neuro-stimulation.

Project description: The aim of this study is to assess the role of readers' proficiency and of the base-word distributional properties on eye-movement behavior. Sixty-two typically developing children, attending 3rd, 4th, and 5th grade, were asked to read derived words in a sentence context. Target words were nouns derived from noun bases (e.g., umorista, 'humorist'), which in Italian are shared by few derived words, and nouns derived from verb bases (e.g., punizione, 'punishment'), which are shared by about 50 different inflected forms and several derived words. The data show that base and word frequency affected first-fixation duration for nouns derived from noun bases, but in opposite ways: base frequency had a facilitative effect on first fixation, whereas word frequency exerted an inhibitory effect. These results were interpreted as a competition between early accessed base words (e.g., camino, chimney) and target words (e.g., caminetto, fireplace). For nouns derived from verb bases, an inhibitory base frequency effect but no word frequency effect was observed. These results suggest that syntactic context, calling for a noun in the target position, led to an inhibitory effect when a verb base was detected, and made it difficult for readers to access the corresponding base+suffix combination (whole word) in the very early processing phases. Gaze duration was mainly affected by word frequency and length: for nouns derived from noun bases, this interaction was modulated by proficiency, as the length effect was stronger for less proficient readers when they were processing low-frequency words. For nouns derived from verb bases, though, all children, irrespective of their reading ability, showed sensitivity to the interaction between the frequency of the base+suffix combination (word frequency) and target length.
Results of this study are consistent with those of other Italian studies that contrasted noun and verb processing, and confirm that distributional properties of morphemic constituents have a significant impact on the strategies used for processing morphologically complex words.

Project description: Verbs are more difficult to produce than nouns. Thus, if executive resources are reduced as in Parkinson's disease (PD), verbs are penalized compared to nouns. However, in an experimental condition in which it is the noun that must be selected from a larger number of alternatives compared to the verb, it is the noun production that becomes slower and more prone to errors. Indeed, patients are slower and less accurate than normal subjects when required to produce nouns from verbs (VN) in a morphological derivation task (e.g., "osservazione" from "osservare") ["observation" from "observe"] than verbs from nouns in a morphological generation task, in which only a verb can be generated from the noun (NV) (e.g., "fallire" from "fallimento") ["to fail" from "failure"]. In Italian morphology, in fact, generation and derivation tasks differ in the number of lexical entries among which the response must be selected. The left Inferior Frontal Gyrus (IFG) has been demonstrated to be involved in selection processes. In the present study, we explored whether the ability to select words is related to the cortical thickness of the left IFG. Twelve right-sided PD patients with nigrostriatal hypofunctionality in the left hemisphere (RPD-LH), 9 left-sided PD patients with nigrostriatal hypofunctionality in the right hemisphere (LPD-RH) and 19 healthy controls (HC) took part in the study. NV and VN production tasks were administered; accuracy and reaction times (RTs) were collected. All 40 subjects received a structural MRI examination. Cortical thickness of the IFG and volumetric measurements for subcortical regions, thought to support selection processes, were computed using FreeSurfer. In VN derivation tasks, RPD-LH patients were less accurate than LPD-RH patients (accuracy: 66% vs. 77%). No difference emerged among the three groups in RTs. Task accuracy/RTs and IFG thickness showed a significant correlation only in RPD-LH.
Not only nouns (as expected) but also verbs were correlated with cortical thickness. This suggests that the linguistic nature of the stimuli along with executive resources are both relevant during word selection processes. Our data confirm that executive resources and language interact in the left IFG in word production tasks.

