Dataset Information

An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition.

ABSTRACT: Language recognition systems based on bottleneck features have recently become the state of the art in this research field, showing their success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based on a deep neural network (DNN) trained to discriminate between phonetic units, i.e. trained for the task of automatic speech recognition (ASR). This DNN aims to compress information in one of its layers, known as the bottleneck (BN) layer, which is used to obtain a new frame-level representation of the audio signal. This representation has proven useful for the task of language identification (LID). Thus, bottleneck features are used as input to the language recognition system, instead of a classical parameterization of the signal based on cepstral feature vectors such as MFCCs (Mel Frequency Cepstral Coefficients). Despite the success of this approach in language recognition, there is a lack of studies that systematically analyze how the topology of the DNN influences the performance of bottleneck feature-based language recognition systems. In this work, we try to fill this gap by analyzing language recognition results for different topologies of the DNN used to extract the bottleneck features, comparing them with one another and against a reference system based on a more classical cepstral representation of the input signal with a total variability model. In this way, we obtain useful knowledge about how the DNN configuration influences the performance of bottleneck feature-based language recognition systems.
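
The bottleneck extraction step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the layer sizes, the position and width of the bottleneck layer, the sigmoid activation, and the randomly initialized (untrained) weights. In a real system the DNN would first be trained on phonetic targets; the sketch only shows how a frame is propagated up to the bottleneck layer and how that layer's activations become the new frame representation.

```python
import math
import random

# Hypothetical topology (not from the paper): 39-dim cepstral-like input,
# two hidden layers, an 8-unit bottleneck, one more hidden layer, 40 outputs.
LAYER_SIZES = [39, 64, 64, 8, 64, 40]
BN_INDEX = 2  # 0-based index of the bottleneck layer among the weight layers


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases; stands in for a trained ASR layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_dnn(layer_sizes, seed=0):
    rng = random.Random(seed)
    return [init_layer(a, b, rng)
            for a, b in zip(layer_sizes, layer_sizes[1:])]


def forward_layer(x, layer):
    """Apply one fully connected layer followed by a sigmoid."""
    weights, biases = layer
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def bottleneck_features(frame, dnn, bn_index):
    """Propagate one frame up to the bottleneck layer and return its output.

    Layers after the bottleneck are only needed while training the DNN,
    so they are skipped at feature-extraction time.
    """
    h = frame
    for layer in dnn[: bn_index + 1]:
        h = forward_layer(h, layer)
    return h


dnn = build_dnn(LAYER_SIZES)
rng = random.Random(1)
# Five synthetic input frames standing in for cepstral feature vectors.
frames = [[rng.gauss(0.0, 1.0) for _ in range(LAYER_SIZES[0])]
          for _ in range(5)]
bn = [bottleneck_features(f, dnn, BN_INDEX) for f in frames]
print(len(bn), len(bn[0]))  # prints "5 8": one 8-dim vector per frame
```

Changing `LAYER_SIZES` or `BN_INDEX` in this sketch corresponds loosely to the kind of topology variation the abstract says is studied, such as the number of layers or the position of the bottleneck.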

SUBMITTER: Lozano-Diez A

PROVIDER: S-EPMC5552160 | biostudies-other | 2017

REPOSITORIES: biostudies-other

Similar Datasets

An Innovative Multi-Model Neural Network Approach for Feature Selection in Emotion Recognition Using Deep Feature Clustering.

Project description:Emotional awareness perception is a largely growing field that allows for more natural interactions between people and machines. Electroencephalography (EEG) has emerged as a convenient way to measure and track a user's emotional state. The non-linear characteristic of the EEG signal produces a high-dimensional feature vector resulting in high computational cost. In this paper, characteristics of multiple neural networks are combined using Deep Feature Clustering (DFC) to select high-quality attributes as opposed to traditional feature selection methods. The DFC method shortens the training time on the network by omitting unusable attributes. First, Empirical Mode Decomposition (EMD) is applied as a series of frequencies to decompose the raw EEG signal. The spatiotemporal component of the decomposed EEG signal is expressed as a two-dimensional spectrogram before the feature extraction process using Analytic Wavelet Transform (AWT). Four pre-trained Deep Neural Networks (DNN) are used to extract deep features. Dimensional reduction and feature selection are achieved utilising the differential entropy-based EEG channel selection and the DFC technique, which calculates a range of vocabularies using k-means clustering. The histogram characteristic is then determined from a series of visual vocabulary items. The classification performance of the SEED, DEAP and MAHNOB datasets combined with the capabilities of DFC show that the proposed method improves the performance of emotion recognition in short processing time and is more competitive than the latest emotion recognition methods.

| S-EPMC7374326 | biostudies-literature

Morphological diagnosis of hematologic malignancy using feature fusion-based deep convolutional neural network.

Project description:Leukemia is a cancer of white blood cells characterized by immature lymphocytes. Many people die of blood cancer every year. Hence, the early detection of these blast cells is necessary for avoiding blood cancer. A novel deep convolutional neural network (CNN), 3SNet, that has depth-wise convolution blocks to reduce computation costs has been developed to aid the diagnosis of leukemia cells. The proposed method includes three inputs to the deep CNN model. These inputs are grayscale images and their corresponding histogram of gradient (HOG) and local binary pattern (LBP) images. The HOG image captures the local shape, and the LBP image describes the leukemia cell's texture pattern. The suggested model was trained and tested with images from the AML-Cytomorphology_LMU dataset. The mean average precision (MAP) for cells with fewer than 100 images in the dataset was 84%, whereas for cells with more than 100 images in the dataset it was 93.83%. In addition, the ROC curve area for these cells is more than 98%. This confirmed that the proposed model could be an adjunct tool to provide a second opinion to a doctor.

| S-EPMC10562409 | biostudies-literature

Diagnostic Performance of a Convolutional Neural Network for Diminutive Colorectal Polyp Recognition

Project description:Interventions: None. Primary outcome(s): The primary outcome of the study is the accuracy of the CAD-CNN system for predicting histology of diminutive colorectal polyps (1-5 mm) compared with the accuracy of the prediction of the endoscopist. Both the CAD-CNN system and the endoscopist will use NBI for their predictions. Accuracy is defined as the percentage of correctly predicted optical diagnoses of the CAD-CNN system and/or endoscopist compared to the gold standard pathology. For the calculation of the accuracy, adenomas and SSLs will be dichotomised as neoplastic polyps, while HPs and other non-neoplastic histology are considered non-neoplastic. Study Design: N/A: single arm study, Open (masking not used), N/A, unknown, Other

| 2443187 | ecrin-mdr-crc

Object and anatomical feature recognition in surgical video images based on a convolutional neural network.

Project description:Purpose: Artificial intelligence-enabled techniques can process large amounts of surgical data and may be utilized for clinical decision support to recognize or forecast adverse events in an actual intraoperative scenario. To develop an image-guided navigation technology that will help in surgical education, we explored the performance of a convolutional neural network (CNN)-based computer vision system in detecting intraoperative objects. Methods: The surgical videos used for annotation were recorded during surgeries conducted in the Department of Surgery of Tokyo Women's Medical University from 2019 to 2020. Abdominal endoscopic images were cut out from manually captured surgical videos. An open-source programming framework for CNN was used to design a model that could recognize and segment objects in real time through IBM Visual Insights. The model was used to detect the GI tract, blood, vessels, uterus, forceps, ports, gauze and clips in the surgical images. Results: The accuracy, precision and recall of the model were 83%, 80% and 92%, respectively. The mean average precision (mAP), the calculated mean of the precision for each object, was 91%. Among surgical tools, the highest recall and precision of 96.3% and 97.9%, respectively, were achieved for forceps. Among the anatomical structures, the highest recall and precision of 92.9% and 91.3%, respectively, were achieved for the GI tract. Conclusion: The proposed model could detect objects in operative images with high accuracy, highlighting the possibility of using AI-based object recognition techniques for intraoperative navigation. Real-time object recognition will play a major role in navigation surgery and surgical education.

| S-EPMC8224261 | biostudies-literature

Meta-neural-network for real-time and passive deep-learning-based object recognition.

Project description:Analyzing scattered waves to recognize objects is of fundamental significance in wave physics. Recently emerged deep learning techniques have achieved great success in interpreting wave fields, such as in ultrasound non-destructive testing and disease diagnosis, but conventionally need time-consuming computer postprocessing or bulky diffractive elements. Here we theoretically propose and experimentally demonstrate a purely passive, small-footprint meta-neural-network for real-time recognition of complicated objects by analyzing acoustic scattering. We prove that the meta-neural-network mimics a standard neural network despite its compactness, thanks to the unique capability of its metamaterial unit cells (dubbed meta-neurons) to produce deep-subwavelength phase shifts as training parameters. The resulting device exhibits the "intelligence" to perform desired tasks with the potential to overcome current limitations, showcased by two distinctive examples: handwritten digit recognition and discerning misaligned orbital-angular-momentum vortices. Our mechanism opens the route to new metamaterial-based deep-learning paradigms and enables conceptual devices that automatically analyze signals, with far-reaching implications for acoustics and related fields.

| S-EPMC7725829 | biostudies-literature

Deep convolutional neural network architecture for facial emotion recognition.

Project description:Facial emotion detection is crucial in affective computing, with applications in human-computer interaction, psychological research, and sentiment analysis. This study explores how deep convolutional neural networks (DCNNs) can enhance the accuracy and reliability of facial emotion detection by focusing on the extraction of detailed facial features and robust training techniques. Our proposed DCNN architecture uses its multi-layered design to automatically extract detailed facial features. By combining convolutional and pooling layers, the model effectively captures both subtle facial details and higher-level emotional patterns. Extensive testing on the benchmark Fer2013Plus dataset shows that our DCNN model outperforms traditional methods, achieving high accuracy in recognizing a variety of emotions. Additionally, we explore transfer learning techniques, showing that pre-trained DCNNs can effectively handle specific emotion recognition tasks even with limited labeled data. Our research focuses on improving the accuracy of emotion detection, demonstrating the model's capability to capture emotion-related facial cues through detailed feature extraction. Ultimately, this work advances facial emotion detection, with significant applications in various human-centric technological fields.

| S-EPMC11784769 | biostudies-literature

Spatial-Frequency Feature Learning and Classification of Motor Imagery EEG Based on Deep Convolution Neural Network.

Project description:EEG pattern recognition is an important part of motor imagery- (MI-) based brain computer interface (BCI) system. Traditional EEG pattern recognition algorithm usually includes two steps, namely, feature extraction and feature classification. In feature extraction, common spatial pattern (CSP) is one of the most frequently used algorithms. However, in order to extract the optimal CSP features, prior knowledge and complex parameter adjustment are often required. Convolutional neural network (CNN) is one of the most popular deep learning models at present. Within CNN, feature learning and pattern classification are carried out simultaneously during the procedure of iterative updating of network parameters; thus, it can remove the complicated manual feature engineering. In this paper, we propose a novel deep learning methodology which can be used for spatial-frequency feature learning and classification of motor imagery EEG. Specifically, a multilayer CNN model is designed according to the spatial-frequency characteristics of MI EEG signals. An experimental study is carried out on two MI EEG datasets (BCI competition III dataset IVa and a self-collected right index finger MI dataset) to validate the effectiveness of our algorithm in comparison with several closely related competing methods. Superior classification performance indicates that our proposed method is a promising pattern recognition algorithm for MI-based BCI system.

| S-EPMC7387988 | biostudies-literature

forgeNet: a graph deep neural network model using tree-based ensemble classifiers for feature graph construction.

Project description:Motivation: A unique challenge in predictive model building for omics data has been the small number of samples (n) versus the large number of features (p). This 'n≪p' property brings difficulties for disease outcome classification using deep learning techniques. Sparse learning by incorporating known functional relationships between the biological units, such as the graph-embedded deep feedforward network (GEDFN) model, has been a solution to this issue. However, such methods require an existing feature graph, and potential mis-specification of the feature graph can be harmful to classification and feature selection. Results: To address this limitation and develop a robust classification model without relying on external knowledge, we propose a forest graph-embedded deep feedforward network (forgeNet) model, to integrate the GEDFN architecture with a forest feature graph extractor, so that the feature graph can be learned in a supervised manner and specifically constructed for a given prediction task. To validate the method's capability, we experimented with the forgeNet model on both synthetic and real datasets. The resulting high classification accuracy suggests that the method is a valuable addition to sparse deep learning models for omics data. Availability and implementation: The method is available at https://github.com/yunchuankong/forgeNet. Contact: tianwei.yu@emory.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

| S-EPMC7267822 | biostudies-literature

Rectal Cancer Treatment Management: Deep-Learning Neural Network Based on Photoacoustic Microscopy Image Outperforms Histogram-Feature-Based Classification.

Project description:We have developed a novel photoacoustic microscopy/ultrasound (PAM/US) endoscope to image post-treatment rectal cancer for surgical management of residual tumor after radiation and chemotherapy. Paired with a deep-learning convolutional neural network (CNN), the PAM images accurately differentiated pathological complete responders (pCR) from incomplete responders. However, the role of CNNs compared with traditional histogram-feature based classifiers needs further exploration. In this work, we compare the performance of the CNN models to generalized linear models (GLM) across 24 ex vivo specimens and 10 in vivo patient examinations. First order statistical features were extracted from histograms of PAM and US images to train, validate and test GLM models, while PAM and US images were directly used to train, validate, and test CNN models. The PAM-CNN model performed superiorly with an AUC of 0.96 (95% CI: 0.95-0.98) compared to the best PAM-GLM model using kurtosis with an AUC of 0.82 (95% CI: 0.82-0.83). We also found that both CNN and GLMs derived from photoacoustic data outperformed those utilizing ultrasound alone. We conclude that deep-learning neural networks paired with photoacoustic images is the optimal analysis framework for determining presence of residual cancer in the treated human rectum.

| S-EPMC8495416 | biostudies-literature

Deep learning-based idiomatic expression recognition for the Amharic language.

Project description:Idiomatic expressions are built into all languages and are common in ordinary conversation. Idioms are difficult to understand because their meaning cannot be deduced directly from the source words. Previous studies reported that idiomatic expressions affect many natural language processing tasks in the Amharic language. However, most natural language processing models used with the Amharic language, such as machine translation, semantic analysis, sentiment analysis, information retrieval, question answering, and next-word prediction, do not consider idiomatic expressions. As a result, in this paper, we propose a convolutional neural network (CNN) with a FastText embedding model for detecting idioms in Amharic text. We collected 1700 idiomatic and 1600 non-idiomatic expressions from Amharic books to test the proposed model's performance. The proposed model is then evaluated using this dataset. We employed an 80/10/10 splitting ratio to train, validate, and test the proposed idiomatic recognition model. The proposed model's accuracy on the training dataset is 98%, and the model achieves 80% accuracy on the testing dataset. We compared the proposed model to machine learning models like K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest classifiers. According to the experimental results, the proposed model produces promising results.

| S-EPMC10720994 | biostudies-literature
