Project description: Background: Multi-cellular segmentation of bright-field microscopy images is an essential computational step when quantifying collective cell migration in vitro. Despite the availability of various tools and algorithms, no publicly available benchmark has been proposed for evaluating and comparing the different alternatives. Description: A uniform framework is presented for benchmarking multi-cellular segmentation algorithms on bright-field microscopy images. A freely available set of 171 manually segmented images from diverse origins was partitioned into 8 datasets and evaluated with three leading dedicated tools. Conclusions: The presented benchmark resource for evaluating segmentation algorithms on bright-field images is the first public annotated dataset for this purpose. This annotated dataset of diverse examples allows fair evaluation and comparison of future segmentation methods. Scientists are encouraged to assess new algorithms on this benchmark and to contribute additional annotated datasets.
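A minimal sketch of how such a benchmark comparison could be scored: each algorithm's binary mask is compared with the manual annotation using the Jaccard index (intersection over union) and averaged over the dataset. The file-naming convention here is an assumption, not the benchmark's actual layout.

```python
import numpy as np
from pathlib import Path
from imageio.v3 import imread

def jaccard(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection-over-union of two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:          # both masks empty: treat as perfect agreement
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)

def benchmark(pred_dir: str, truth_dir: str) -> float:
    """Mean Jaccard over all images present in both directories
    (assumes predictions share file names with the manual masks)."""
    scores = []
    for truth_path in Path(truth_dir).glob("*.png"):
        pred = imread(Path(pred_dir) / truth_path.name) > 0
        truth = imread(truth_path) > 0
        scores.append(jaccard(pred, truth))
    return float(np.mean(scores))
```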
Project description: This paper presents a large publicly available multi-center lumbar spine magnetic resonance imaging (MRI) dataset with reference segmentations of vertebrae, intervertebral discs (IVDs), and spinal canal. The dataset includes 447 sagittal T1 and T2 MRI series from 218 patients with a history of low back pain and was collected from four different hospitals. An iterative data annotation approach was used by training a segmentation algorithm on a small part of the dataset, enabling semi-automatic segmentation of the remaining images. The algorithm provided an initial segmentation, which was subsequently reviewed, manually corrected, and added to the training data. We provide reference performance values for this baseline algorithm and nnU-Net, which performed comparably. Performance values were computed on a sequestered set of 39 studies with 97 series, which were additionally used to set up a continuous segmentation challenge that allows for a fair comparison of different segmentation algorithms. This study may encourage wider collaboration in the field of spine segmentation and improve the diagnostic value of lumbar spine MRI.
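The iterative annotation approach described above follows a generic model-in-the-loop pattern; a schematic sketch is given below. The `train` and `review` callables stand in for the authors' segmentation algorithm and expert correction step, which are not specified here, so this is an illustration of the loop structure only.

```python
def iterative_annotation(unlabelled, seed_labelled, train, review,
                         rounds=5, batch_size=50):
    """train: callable fitting a segmentation model on (image, mask) pairs;
    review: callable returning an expert-corrected mask for (image, draft)."""
    labelled = list(seed_labelled)
    model = train(labelled)                  # bootstrap on a small labelled set
    for _ in range(rounds):
        batch, unlabelled = unlabelled[:batch_size], unlabelled[batch_size:]
        for image in batch:
            draft = model.predict(image)     # initial automatic segmentation
            labelled.append((image, review(image, draft)))  # manual correction
        model = train(labelled)              # retrain on the enlarged set
    return model, labelled
```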
Project description: Purpose: To generate the first open dataset of retinal parafoveal optical coherence tomography angiography (OCTA) images with associated ground-truth manual segmentations, and to establish a standard for OCTA image segmentation by surveying a broad range of state-of-the-art vessel enhancement and binarization procedures. Methods: Handcrafted filters and neural network architectures were used to perform vessel enhancement. Thresholding methods and machine learning approaches were applied to obtain the final binarization. Evaluation was performed using pixelwise metrics and newly proposed topological metrics. Finally, we compared the error in the computation of clinically relevant vascular network metrics (e.g., foveal avascular zone area and vessel density) across segmentation methods. Results: Our results show that, for the set of images considered, deep learning architectures (U-Net and CS-Net) achieve the best performance (Dice = 0.89). For applications where manually segmented data are not available to retrain these approaches, our findings suggest that optimally oriented flux (OOF) is the best handcrafted filter (Dice = 0.86). Moreover, our results show up to 25% differences in vessel density accuracy depending on the segmentation method used. Conclusions: In this study, we derive and validate the first open dataset of retinal parafoveal OCTA images with associated ground-truth manual segmentations. Our findings should be taken into account when comparing the results of clinical studies and performing meta-analyses. Finally, we release our data and source code to support standardization efforts in OCTA image segmentation. Translational relevance: This work establishes a standard for OCTA retinal image segmentation and underscores the importance of evaluating segmentation performance in terms of clinically relevant metrics.
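The Dice scores quoted above compare a binary vessel segmentation with the manual ground truth; a minimal NumPy version of this pixelwise metric, assuming binary masks of equal shape:

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient: 2|A intersect B| / (|A| + |B|) for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:           # both masks empty: treat as perfect agreement
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)

# Vessel density, one of the clinical metrics mentioned, is simply the
# fraction of pixels classified as vessel: pred.astype(bool).mean()
```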
Project description: We present a new approach to segment and classify bacterial spore layers in Transmission Electron Microscopy (TEM) images using a hybrid Convolutional Neural Network (CNN) and Random Forest (RF) classifier algorithm. This approach utilizes deep learning, with the CNN extracting features from the images and the RF classifier using those features for classification. The proposed model achieved 73% accuracy, 64% precision, 46% sensitivity, and a 47% F1-score on test data. Compared with other classifiers such as AdaBoost, XGBoost, and SVM, our proposed model demonstrates greater robustness and higher generalization ability for non-linear segmentation. Our model can also identify spores with a damaged core, as verified on TEM images of chemically exposed spores. The proposed method will therefore be valuable for identifying and characterizing spore features in TEM images, reducing labor-intensive work as well as human bias.
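A minimal sketch of the hybrid pattern described above: a CNN serves purely as a feature extractor, and a Random Forest is fitted on the extracted vectors. The ResNet-18 backbone and input size here are assumptions for illustration, not the authors' network.

```python
import numpy as np
import torch
from torchvision import models
from sklearn.ensemble import RandomForestClassifier

# Pretrained backbone with its classification head removed (assumption:
# ResNet-18; the paper's actual CNN architecture is not specified here).
backbone = models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def extract_features(images: torch.Tensor) -> np.ndarray:
    """images: (N, 3, 224, 224) float tensor -> (N, 512) feature matrix."""
    return backbone(images).numpy()

# Usage, given tensors of TEM crops and integer layer labels:
# clf = RandomForestClassifier(n_estimators=200)
# clf.fit(extract_features(X_train), y_train)
# predictions = clf.predict(extract_features(X_test))
```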
Project description: Images and GPR files were examined using a novel saturation reduction method to determine whether accuracy could be improved by extending the dynamic range of saturated pixels. Three immunosignatures from human Valley Fever (Coccidioides) patients and three immunosignatures from human influenza vaccine recipients were examined to test an algorithm that extends the apparent dynamic range of a fluorescence image. These images had several saturated spots at a PMT setting of 70 and 100% laser power. The program compared the discrimination of Valley Fever from influenza under standard image processing versus segmentation with intensity estimation.
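The saturation reduction algorithm itself is not detailed in this description. As a hedged illustration of the general idea, the true peak of a saturated spot can be extrapolated by fitting a spot model (a 2-D Gaussian in this sketch) to the unsaturated pixels only; the function names, spot model, and 16-bit ceiling are all assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

SATURATION = 65535  # assumed 16-bit scanner ceiling

def gauss2d(xy, amp, x0, y0, sigma):
    """Isotropic 2-D Gaussian spot model."""
    x, y = xy
    return amp * np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))

def estimate_peak(spot: np.ndarray) -> float:
    """Fit only pixels below the ceiling; the fitted amplitude may exceed
    the ceiling, extending the apparent dynamic range of the spot."""
    ys, xs = np.indices(spot.shape)
    ok = spot < SATURATION
    p0 = (float(spot[ok].max()), spot.shape[1] / 2, spot.shape[0] / 2, 2.0)
    popt, _ = curve_fit(gauss2d, (xs[ok], ys[ok]),
                        spot[ok].astype(float), p0=p0)
    return popt[0]
```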
Project description: Materials discovery via machine learning has become an increasingly popular approach because of its ability to predict materials properties rapidly and at low cost. However, one limitation in this field is the lack of benchmark datasets, particularly datasets that span the range of sizes, tasks, material systems, and data modalities present in the materials informatics literature. This makes it difficult to identify optimal machine learning choices, including algorithm, model architecture, data splitting, and data featurization, for a given task. Here, we attempt to address this gap by assembling a unique repository of 50 different datasets for materials properties. The repository contains both experimental and computational data, data suited for regression as well as classification, dataset sizes ranging from 12 to 6354 samples, and materials systems spanning the diversity of materials research. Data were extracted from 16 publications. In addition to cleaning the data where necessary, each dataset was split into train, validation, and test portions. For datasets with more than 100 entries, train-validation-test splits were created using 5-fold or 10-fold cross-validation, matching the protocol of each source publication. Datasets with fewer than 100 entries received train-test splits created using leave-one-out cross-validation. These benchmark data can serve as the basis for a more diverse benchmark dataset in the future, further improving the comparison of machine learning models.
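A sketch of the size-dependent splitting rule described above. The exact fold count per dataset follows each source publication, so the threshold logic here is illustrative only.

```python
from sklearn.model_selection import KFold, LeaveOneOut

def make_splits(n_samples: int, n_folds: int = 5):
    """K-fold CV for datasets with more than 100 samples,
    leave-one-out CV otherwise. Returns (train, test) index pairs."""
    X = list(range(n_samples))
    if n_samples > 100:
        splitter = KFold(n_splits=n_folds, shuffle=True, random_state=0)
    else:
        splitter = LeaveOneOut()
    return list(splitter.split(X))

# e.g. make_splits(6354) -> 5 train/test index pairs
#      make_splits(12)   -> 12 pairs, one held-out sample each
```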
Project description: Background: Benchmark datasets are essential for both method development and performance assessment. These datasets have numerous requirements, representativeness being one. In the case of variant tolerance/pathogenicity prediction, representativeness means that the dataset covers the space of variations and their effects. Results: We performed the first analysis of the representativeness of variation benchmark datasets. We used statistical approaches to investigate how representative the proteins in the benchmark datasets were of the entire human protein universe. We investigated the distributions of variants across chromosomes, protein structures, CATH domains and classes, Pfam protein families, Enzyme Commission (EC) classifications, and Gene Ontology annotations in 24 datasets that have been used for training and testing variant tolerance prediction methods. All the datasets are available in the VariBench or VariSNP databases. We also tested whether the pathogenic variant datasets contained neutral variants, defined as those with a high minor allele frequency in the ExAC database. The distributions of variants over chromosomes and proteins varied greatly between the datasets. Conclusions: None of the datasets was found to be well representative, although many had quite good coverage of the different protein characteristics. Dataset size correlates with representativeness but only weakly with the performance of methods trained on them. The results imply that dataset representativeness is an important factor that should be taken into account in predictor development and testing.
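A sketch of the neutrality check described above: variants labelled pathogenic in a benchmark are flagged if their minor allele frequency in a population database (ExAC in the study) is too high to be plausibly disease-causing. The 1% cut-off and the data structures here are assumptions for illustration.

```python
def flag_common_variants(pathogenic_ids, maf_by_id, maf_cutoff=0.01):
    """pathogenic_ids: iterable of variant IDs labelled pathogenic in a
    benchmark dataset; maf_by_id: dict mapping variant ID -> minor allele
    frequency from a population database (e.g. ExAC). Returns the IDs
    that look neutral by frequency despite the pathogenic label."""
    return [v for v in pathogenic_ids
            if maf_by_id.get(v, 0.0) > maf_cutoff]
```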
Project description: Applying deep learning to images of cropping systems provides new knowledge and insights for research and commercial applications. Semantic segmentation, or pixel-wise classification, of ground-level RGB images into vegetation and background is a critical step in the estimation of several canopy traits. Current state-of-the-art methodologies based on convolutional neural networks (CNNs) are trained on datasets acquired under controlled or indoor conditions. These models are unable to generalize to real-world images and hence need to be fine-tuned on new labelled datasets. This motivated the creation of the VegAnn (Vegetation Annotation) dataset, a collection of 3775 multi-crop RGB images acquired at different phenological stages using different systems and platforms under diverse illumination conditions. We anticipate that VegAnn will help improve segmentation algorithm performance, facilitate benchmarking, and promote large-scale research on crop vegetation segmentation.
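A dataset like this also allows benchmarking against simple handcrafted baselines. One classic baseline for vegetation/background segmentation, shown here as a minimal sketch rather than the paper's CNNs, is thresholding the excess-green (ExG) index; the 0.1 threshold is an assumption.

```python
import numpy as np

def exg_vegetation_mask(rgb: np.ndarray, threshold: float = 0.1) -> np.ndarray:
    """rgb: (H, W, 3) array with values in [0, 255].
    Returns a boolean mask where True marks vegetation pixels."""
    norm = rgb.astype(float) / 255.0
    r, g, b = norm[..., 0], norm[..., 1], norm[..., 2]
    exg = 2.0 * g - r - b            # excess-green index
    return exg > threshold
```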
Project description: Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19), has spread globally and is being surveilled through an international genome sequencing effort. Surveillance consists of sample acquisition, library preparation, and whole-genome sequencing. This has necessitated a classification scheme detailing Variants of Concern (VOC) and Variants of Interest (VOI), and the rapid expansion of bioinformatics tools for sequence analysis. These bioinformatics tools deliver the major actionable results: maintaining quality assurance and checks, defining population structure, performing genomic epidemiology, and inferring lineage to allow reliable identification and classification. Additionally, the pandemic has required public health laboratories to rapidly reach high-throughput proficiency in sequencing library preparation and downstream data analysis. However, both processes can be limited by the lack of a standardized sequence dataset. Methods: We identified six SARS-CoV-2 sequence datasets from recent publications, public databases, and internal resources. In addition, we created a method to mine public databases to identify representative genomes for these datasets. Using this novel method, we identified several genomes as either VOI/VOC representatives or non-VOI/VOC representatives. To describe each dataset, we used a previously published dataset format, which captures accession information and whole-dataset information. Additionally, a script from the same publication was enhanced to download and verify all data from this study. Results: The benchmark datasets focus on the two most widely used sequencing platforms: long-read sequencing data from the Oxford Nanopore Technologies platform and short-read sequencing data from the Illumina platform. There are six datasets: three were derived from recent publications; two were derived by mining public databases to answer common questions not covered by published datasets; and one unique dataset representing common sequence failures was obtained by rigorously scrutinizing data that did not pass quality checks. The dataset summary table, data mining script, and quality control (QC) values for all sequence data are publicly available on GitHub: https://github.com/CDCgov/datasets-sars-cov-2. Discussion: The datasets presented here were generated to help public health laboratories build sequencing and bioinformatics capacity, benchmark different workflows and pipelines, and calibrate QC thresholds to ensure sequencing quality. Together, improvements in these areas support accurate and timely outbreak investigation and surveillance, providing actionable data for pandemic management. Furthermore, these publicly available and standardized benchmark data will facilitate the development and adjudication of new pipelines.
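The repository ships its own enhanced download-and-verify script, which is not reproduced here. As a generic, hypothetical sketch of the same workflow, run accessions listed in a dataset summary table could be fetched with the SRA Toolkit; the tab-separated layout and the "SRR" column name are assumptions about the table format.

```python
import csv
import subprocess

def fetch_runs(table_path: str, outdir: str = "reads"):
    """Download every SRA run listed in a dataset summary TSV using the
    SRA Toolkit (prefetch caches the run; fasterq-dump extracts FASTQs)."""
    with open(table_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            accession = row["SRR"]   # assumed column name
            subprocess.run(["prefetch", accession], check=True)
            subprocess.run(["fasterq-dump", accession, "-O", outdir],
                           check=True)
```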