Dataset Information

A multi-purpose dataset of Devanagari script comprising of isolated numerals and vowels

ABSTRACT: This article presents a dataset of handwritten isolated characters of the Devanagari script. The Devanagari script contains 10 numerals, 13 vowels, and 33 consonants. The Devanagari Character dataset covers 23 of these characters: the numerals and the vowels. For each numeral, 2,400 handwritten samples were collected, and 1,400 for each vowel. The collected samples were digitized and pre-processed, and noisy images were removed during pre-processing. The final dataset contains 38,750 images, with 2,250 samples for each numeral and 1,250 for each vowel. The data are available as images and as comma-separated values (CSV), with labels attached. The dataset could be used for optical character recognition (OCR) research and deep learning. In India, Devanagari is the base script from which more than 120 languages have evolved; hence this dataset can serve as a base for machine learning research in these languages. The dataset is publicly available at https://data.mendeley.com/datasets/pxrnvp4yy8/2.
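The per-class counts in the abstract can be checked against the stated total with a few lines of arithmetic. The sketch below is only a consistency check on the numbers quoted above. The constant and variable names are illustrative and do not come from the dataset or its files.

```python
# Counts quoted in the abstract; the names here are illustrative only.
NUMERAL_CLASSES = 10
VOWEL_CLASSES = 13

COLLECTED_PER_NUMERAL = 2400
COLLECTED_PER_VOWEL = 1400

FINAL_PER_NUMERAL = 2250
FINAL_PER_VOWEL = 1250

# Samples collected before noisy images were removed.
collected_total = (NUMERAL_CLASSES * COLLECTED_PER_NUMERAL
                   + VOWEL_CLASSES * COLLECTED_PER_VOWEL)

# Samples remaining in the final dataset.
final_total = (NUMERAL_CLASSES * FINAL_PER_NUMERAL
               + VOWEL_CLASSES * FINAL_PER_VOWEL)

# Images discarded during pre-processing, implied by the two totals.
removed_total = collected_total - final_total

print(f"classes:   {NUMERAL_CLASSES + VOWEL_CLASSES}")
print(f"collected: {collected_total}")
print(f"final:     {final_total}")
print(f"removed:   {removed_total}")
```

The final total works out to 38,750, matching the abstract. The collected and removed totals are derived figures, not numbers reported by the authors.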

SUBMITTER: Prashanth D

PROVIDER: S-EPMC8713117 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters.

Project description:BanglaLekha-Isolated, a Bangla handwritten isolated character dataset, is presented in this article. This dataset contains 84 different characters, comprising 50 Bangla basic characters, 10 Bangla numerals, and 24 selected compound characters. 2,000 handwriting samples for each of the 84 characters were collected, digitized, and pre-processed. After discarding mistakes and scribbles, 166,105 handwritten character images were included in the final dataset. The dataset also includes labels indicating the age and gender of the subjects from whom the samples were collected. This dataset could be used not only for optical handwriting recognition research but also to explore the influence of gender and age on handwriting. The dataset is publicly available at https://data.mendeley.com/datasets/hf6sf8zrkc/2.

| S-EPMC5382023 | biostudies-literature

BanglaWriting: A multi-purpose offline Bangla handwriting dataset.

Project description:This article presents a Bangla handwriting dataset named BanglaWriting that contains single-page handwriting samples from 260 individuals of different personalities and ages. Each page includes bounding boxes that bound each word, along with the Unicode representation of the writing. This dataset contains 21,234 words and 32,787 characters in total. Moreover, this dataset includes 5,470 unique words of Bangla vocabulary. Apart from the usual words, the dataset comprises 261 comprehensible overwritings and 450 handwritten strikes and mistakes. All of the bounding boxes and word labels are manually generated. The dataset can be used for complex optical character/word recognition, writer identification, handwritten word segmentation, and word generation. Furthermore, this dataset is suitable for extracting age-based and gender-based variation in handwriting.

| S-EPMC7744928 | biostudies-literature

MoVi: A large multi-purpose human motion and video dataset.

Project description:Large high-quality datasets of human body shape and kinematics lay the foundation for modelling and simulation approaches in computer vision, computer graphics, and biomechanics. Creating datasets that combine naturalistic recordings with high-accuracy data about ground truth body shape and pose is challenging because different motion recording systems are either optimized for one or the other. We address this issue in our dataset by using different hardware systems to record partially overlapping information and synchronized data that lend themselves to transfer learning. This multimodal dataset contains 9 hours of optical motion capture data, 17 hours of video data from 4 different points of view recorded by stationary and hand-held cameras, and 6.6 hours of inertial measurement units data recorded from 60 female and 30 male actors performing a collection of 21 everyday actions and sports movements. The processed motion capture data is also available as realistic 3D human meshes. We anticipate use of this dataset for research on human pose estimation, action recognition, motion modelling, gait analysis, and body shape reconstruction.

| S-EPMC8211257 | biostudies-literature

Performance counter dataset for behavioural biometric purpose.

Project description:To advance research in continuous user authentication, we introduce COUNT-OS-I and COUNT-OS-II, two distinct performance counter datasets from Windows operating systems. Encompassing data from 63 computers and users, the datasets offer rich, real-world insights for developing and evaluating authentication models. COUNT-OS-I spans 26 users in an IT department, capturing 159 attributes across diverse hardware and software environments over 26 h on average per user. COUNT-OS-II, on the other hand, encompasses 37 users with identical system configurations, recording 218 attributes per sample over a 48-hour period. Both datasets use pseudonymization to safeguard user identities while maintaining data integrity and statistical accuracy. The well-balanced nature of the data, confirmed by comprehensive statistical analysis, positions these datasets as reliable benchmarks for the continuous user authentication domain. Through their release, we aim to support the development of robust authentication models applicable to real-world settings, contributing to enhanced system security and user trust.

| S-EPMC10788215 | biostudies-literature

A benchmark dataset for printed Meitei/Meetei script character recognition.

Project description:The Manipuri language is the official language of the Indian state of Manipur. The language belongs to the Tibeto-Burman family of languages. A benchmark dataset of printed document images in the Meitei/Meetei script is presented in this article. The dataset contains 824 raw pages of printed documents, together with binarized images, text files, and XML files for each raw image. It also includes 51,460 isolated character samples, composed of 27 consonants, 7 half-consonants, 8 vowels, and 10 numerals. This dataset could be used for optical character recognition (OCR) research and in different research areas of natural language processing (NLP).

| S-EPMC9679442 | biostudies-literature

Multi-heme proteins: nature's electronic multi-purpose tool.

Project description:While iron is often a limiting nutrient to Biology, when the element is found in the form of heme cofactors (iron protoporphyrin IX), living systems have excelled at modifying and tailoring the chemistry of the metal. In the context of proteins and enzymes, heme cofactors are increasingly found in stoichiometries greater than one, where a single protein macromolecule contains more than one heme unit. When paired or coupled together, these protein associated heme groups perform a wide variety of tasks, such as redox communication, long range electron transfer and storage of reducing/oxidizing equivalents. Here, we review recent advances in the field of multi-heme proteins, focusing on emergent properties of these complex redox proteins, and strategies found in Nature where such proteins appear to be modular and essential components of larger biochemical pathways. This article is part of a Special Issue entitled: Metals in Bioenergetics and Biomimetics Systems.

| S-EPMC3880547 | biostudies-literature

Multi-purpose SLM-light-sheet microscope.

Project description:By integrating a phase-only Spatial Light Modulator (SLM) into the illumination arm of a cylindrical-lens-based Selective Plane Illumination Microscope (SPIM), we have created a versatile system able to deliver high quality images by operating in a wide variety of different imaging modalities. When placed in a Fourier plane, the SLM permits modulation of the microscope's light-sheet to implement imaging techniques such as structured illumination, tiling, pivoting, autofocusing and pencil beam scanning. Previous publications on dedicated microscope setups have shown how these techniques can deliver improved image quality by rejecting out-of-focus light (structured illumination and pencil beam scanning), reducing shadowing (light-sheet pivoting), and obtaining a more uniform illumination by moving the highest-resolution region of the light-sheet across the imaging Field of View (tiling). Our SLM-SPIM configuration is easy to build and use, and has been designed to allow all of these techniques to be employed on an easily reconfigurable optical setup, compatible with the OpenSPIM design. It offers the possibility to choose between three different light-sheets, in thickness and height, which can be selected according to the characteristics of the sample and the imaging technique to be applied. We demonstrate the flexibility and performance of the system with results obtained by applying a variety of different imaging techniques on samples of fluorescent beads, zebrafish embryos, and optically cleared whole mouse brain samples. Thus our approach allows easy implementation of advanced imaging techniques while retaining the simplicity of a cylindrical-lens-based light-sheet microscope.

| S-EPMC6238942 | biostudies-literature

Tutorial test dataset for training purpose

Project description:This is a test project description.

2022-06-24 | PXD028427 | Pride

Nanodiamonds as multi-purpose labels for microscopy.

Project description:Nanodiamonds containing fluorescent nitrogen-vacancy centers are increasingly attracting interest for use as a probe in biological microscopy. This interest stems from (i) strong resistance to photobleaching allowing prolonged fluorescence observation times; (ii) the possibility to excite fluorescence using a focused electron beam (cathodoluminescence; CL) for high-resolution localization; and (iii) the potential use for nanoscale sensing. For all these schemes, the development of versatile molecular labeling using relatively small diamonds is essential. Here, we show the direct targeting of a biological molecule with nanodiamonds as small as 70 nm using a streptavidin conjugation and standard antibody labelling approach. We also show internalization of 40 nm sized nanodiamonds. The fluorescence from the nanodiamonds survives osmium-fixation and plastic embedding making them suited for correlative light and electron microscopy. We show that CL can be observed from epon-embedded nanodiamonds, while surface-exposed nanoparticles also stand out in secondary electron (SE) signal due to the exceptionally high diamond SE yield. Finally, we demonstrate the magnetic read-out using fluorescence from diamonds prior to embedding. Thus, our results firmly establish nanodiamonds containing nitrogen-vacancy centers as unique, versatile probes for combining and correlating different types of microscopy, from fluorescence imaging and magnetometry to ultrastructural investigation using electron microscopy.

| S-EPMC5429637 | biostudies-literature

CEACAM1 as a multi-purpose target for cancer immunotherapy.

Project description:CEACAM1 is an extensively studied cell surface molecule with established functions in multiple cancer types, as well as in various compartments of the immune system. Due to its multi-faceted role as a recently appreciated immune checkpoint inhibitor and tumor marker, CEACAM1 is an attractive target for cancer immunotherapy. Herein, we highlight CEACAM1's function in various immune compartments and cancer types, including in the context of metastatic disease. This review outlines CEACAM1's role as a therapeutic target for cancer treatment in light of these properties.

| S-EPMC5543821 | biostudies-other
