Project description: This article introduces the Black gram Plant Leaf Disease (BPLD) dataset. Black gram, scientifically known as Vigna mungo and popularly known as Urad in India, is widely considered one of the most significant pulse crops farmed in India. Anthracnose, Leaf Crinkle, Powdery Mildew and Yellow Mosaic diseases have a significant impact on black gram production and cause financial losses to farmers. In recent years, combinations of image processing and computer vision algorithms have been widely used for the diagnosis and categorization of plant leaf diseases. To support the early detection and classification, using computer vision algorithms, of the leaf diseases that degrade the quality of the black gram crop, the BPLD dataset was created and is briefly described in this article. The dataset holds a total of 1000 images belonging to five classes: four diseases and one healthy. The images were captured in real cultivation fields at Nagayalanka, Krishna, Andhra Pradesh, using cameras and mobile phones. After image acquisition, the images were categorized and processed with the help of agricultural experts. Researchers who apply image processing, machine learning and, in particular, deep learning algorithms to the automated early-stage diagnosis and classification of black gram plant leaf diseases to assist farmers could benefit from this dataset. The dataset is publicly and freely available at https://doi.org/10.17632/zfcv9fmrgv.3.
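As a minimal illustration of working with such a dataset, the sketch below loads a five-class image folder with torchvision. The directory layout and class folder names are assumptions, since the abstract does not specify how the archive is organized.

```python
# Minimal sketch: loading a five-class leaf-disease image dataset with torchvision.
# The folder names below are assumptions; the actual BPLD archive layout may differ.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),   # common input size for pretrained CNNs
    transforms.ToTensor(),
])

# Expects a directory tree like bpld/<class_name>/*.jpg, one folder per class
# (e.g. anthracnose, leaf_crinkle, powdery_mildew, yellow_mosaic, healthy).
dataset = datasets.ImageFolder("bpld", transform=transform)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

print(dataset.classes)   # the five class labels inferred from folder names
print(len(dataset))      # should report 1000 images for the full dataset
```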
Project description: Preserving maritime ecosystems is a major concern for governments and administrations. Additionally, improving fishing industry processes, as well as those of fish markets, to obtain a more precise evaluation of the captures will lead to better control of fish stocks. Many automated fish species classification and size estimation proposals have appeared in recent years; however, they require data to train and evaluate their performance, and this data needs to be organized and labelled. This paper presents a dataset of images of fish trays from a local wholesale fish market. It includes pixel-wise (mask) labelled specimens, along with species information and different size measurements. A total of 1,291 labelled images were collected, including 7,339 specimens of 59 different species (in 60 different class labels). This dataset can be of interest for evaluating the performance of novel fish instance segmentation and/or size estimation methods, which are key for systems aimed at the automated control of stock exploitation, and therefore have a beneficial impact on fish populations in the long run. Measurement(s): specimen size • fish species. Technology Type(s): homography estimation • expert's knowledge. Sample Characteristic - Organism: Mediterranean fish. Sample Characteristic - Environment: fish market. Sample Characteristic - Location: Levantine Balearic sea.
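Since homography estimation is listed as one of the technology types, the following sketch illustrates the general idea of tray-plane size estimation: four reference points with known real-world coordinates define a homography, and pixel distances are then mapped to centimetres. All point values and the tray dimensions below are placeholders, not values taken from the dataset.

```python
# Minimal sketch of homography-based size estimation, assuming four known
# reference points on the fish tray (e.g. its corners) with real-world
# coordinates in centimetres. All numeric values are placeholders.
import numpy as np
import cv2

# Pixel coordinates of the tray corners in the image (assumed annotated).
img_pts = np.array([[102, 88], [1820, 95], [1835, 1010], [95, 1002]], dtype=np.float32)
# The same corners in tray coordinates, in cm (assumed 60 x 40 cm tray).
world_pts = np.array([[0, 0], [60, 0], [60, 40], [0, 40]], dtype=np.float32)

H, _ = cv2.findHomography(img_pts, world_pts)

def pixel_to_cm(pt):
    """Project an image point onto the tray plane (cm coordinates)."""
    p = cv2.perspectiveTransform(np.array([[pt]], dtype=np.float32), H)
    return p[0, 0]

# Fish length = distance between snout and tail tip after projection.
snout, tail = pixel_to_cm((400, 500)), pixel_to_cm((900, 520))
print(f"estimated length: {np.linalg.norm(snout - tail):.1f} cm")
```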
Project description: Computer vision is the science that enables computers and machines to see and perceive image content on a semantic level. It combines concepts, techniques, and ideas from various fields such as digital image processing, pattern matching, artificial intelligence, and computer graphics. A computer vision system is designed to model the human visual system on a functional basis as closely as possible. Deep learning, and in particular the biologically inspired Convolutional Neural Networks (CNNs), has significantly contributed to computer vision studies. This research develops a computer vision system that uses CNNs and handcrafted Log-Gabor filters in an ensemble manner to identify medicinal plants based on their leaf textural features. The system was tested on a dataset developed from the Centre of Plant Medicine Research, Ghana (MyDataset), consisting of forty-nine (49) plant species. Using the concept of transfer learning, ten pretrained networks, namely AlexNet, GoogLeNet, DenseNet201, Inceptionv3, MobileNetv2, ResNet18, ResNet50, ResNet101, VGG16, and VGG19, were used as feature extractors. Averaged across six supervised learning algorithms, the DenseNet201 architecture gave the best outcome with 87% accuracy and GoogLeNet the worst with 79%. The proposed model (OTAMNet), created by fusing a Log-Gabor layer into the transition layers of the DenseNet201 architecture, achieved 98% accuracy when tested on MyDataset. OTAMNet was also tested on other benchmark datasets: Flavia, Swedish Leaf, MD2020, and Folio. It achieved 99% on Flavia, 100% on Swedish Leaf, 99% on MD2020, and 97% on Folio. A false-positive rate of less than 0.1% was achieved in all cases.
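For readers unfamiliar with the handcrafted component, the sketch below shows one common way to build and apply a radial log-Gabor filter in the frequency domain; the centre frequency and bandwidth parameters are illustrative assumptions, not OTAMNet's actual settings.

```python
# Minimal sketch of a radial log-Gabor filter applied in the frequency domain,
# the kind of handcrafted texture filter the abstract fuses with CNN features.
# Parameter values (f0, sigma_ratio) are illustrative assumptions.
import numpy as np

def log_gabor_filter(shape, f0=0.1, sigma_ratio=0.55):
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.sqrt(fx**2 + fy**2)
    radius[0, 0] = 1.0                     # avoid log(0) at the DC component
    lg = np.exp(-(np.log(radius / f0) ** 2) / (2 * np.log(sigma_ratio) ** 2))
    lg[0, 0] = 0.0                         # a log-Gabor filter has no DC response
    return lg

def apply_filter(gray_image):
    """Return the magnitude of the log-Gabor response for a 2D grayscale array."""
    spectrum = np.fft.fft2(gray_image)
    response = np.fft.ifft2(spectrum * log_gabor_filter(gray_image.shape))
    return np.abs(response)
```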
Project description: Human beings rely heavily on social communication as one of the major aspects of communication, and language is the most effective means of verbal and nonverbal communication and association. To bridge the communication gap between deaf communities and hearing people, sign language is widely used. According to the World Federation of the Deaf, there are about 70 million deaf people around the globe and about 300 sign languages in use. Hence, structured hand gestures involving visual motions and signs are used as a communication system to help the deaf and speech-impaired community in daily interaction. The aim of this work is to collect a dataset of Urdu sign language (USL) and test it with machine learning classifiers. The proposed system is divided into four main stages: data collection, data acquisition, model training, and model testing. The USL dataset, comprising 1,560 images, was created by photographing various hand positions with a camera. This work provides a strategy for the automated identification of USL numbers based on a bag-of-words (BoW) paradigm. For classification purposes, support vector machine (SVM), Random Forest, and K-nearest neighbor (K-NN) classifiers are used with the BoW histogram bin frequencies as features. The proposed technique outperforms others in number classification, attaining accuracies of 88%, 90%, and 84% for Random Forest, SVM, and K-NN respectively.
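A minimal sketch of the bag-of-words pipeline follows: local descriptors are clustered into a visual vocabulary, each image is encoded as a histogram of visual-word frequencies, and a classifier is trained on those histograms. ORB is used here only for illustration, as the abstract does not name the local descriptor.

```python
# Minimal sketch of a BoW image-classification pipeline. ORB descriptors and
# the vocabulary size k are assumptions; the paper's exact choices may differ.
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

orb = cv2.ORB_create()

def descriptors(image):
    _, des = orb.detectAndCompute(image, None)
    return des if des is not None else np.empty((0, 32), dtype=np.uint8)

def build_vocabulary(images, k=200):
    stacked = np.vstack([descriptors(img) for img in images]).astype(np.float32)
    return KMeans(n_clusters=k, n_init=10).fit(stacked)

def bow_histogram(image, vocab):
    words = vocab.predict(descriptors(image).astype(np.float32))
    hist, _ = np.histogram(words, bins=vocab.n_clusters, range=(0, vocab.n_clusters))
    return hist / max(hist.sum(), 1)   # normalised bin frequencies as features

# train_images, train_labels assumed: grayscale arrays and USL number labels.
# vocab = build_vocabulary(train_images)
# X = np.vstack([bow_histogram(img, vocab) for img in train_images])
# clf = SVC().fit(X, train_labels)
```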
Project description: Purpose: The interpretation of genetic variants after genome-wide analysis is complex in heterogeneous disorders such as intellectual disability (ID). We investigate whether algorithms can be used to detect if a facial gestalt is present for three novel ID syndromes and if these techniques can help interpret variants of uncertain significance. Methods: Facial features were extracted from photos of ID patients harboring a pathogenic variant in three novel ID genes (PACS1, PPM1D, and PHIP) using algorithms that model human facial dysmorphism, and facial recognition. The resulting features were combined into a hybrid model to compare the three cohorts against a background ID population. Results: We validated our model using images from 71 individuals with Koolen-de Vries syndrome, and then show that facial gestalts are present for individuals with a pathogenic variant in PACS1 (p = 8 × 10⁻⁴), PPM1D (p = 4.65 × 10⁻²), and PHIP (p = 6.3 × 10⁻³). Moreover, two individuals with a de novo missense variant of uncertain significance in PHIP have significant similarity to the expected facial phenotype of PHIP patients (p < 1.52 × 10⁻²). Conclusion: Our results show that analysis of facial photos can be used to detect previously unknown facial gestalts for novel ID syndromes, which will facilitate both clinical and molecular diagnosis of rare and novel syndromes.
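As a rough illustration of how one might test whether a cohort shares a facial gestalt, the sketch below compares the mean pairwise similarity of feature vectors against permuted draws from a pooled background. This is a generic permutation scheme, not the paper's hybrid model, and the cosine similarity measure is an assumption.

```python
# Generic permutation test: are cohort feature vectors more similar to each
# other than random subsets of cohort + background would be? Feature
# extraction (dysmorphism / face-recognition models) is out of scope here.
import numpy as np

def mean_pairwise_cosine(X):
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = Xn @ Xn.T
    iu = np.triu_indices(len(X), k=1)
    return sims[iu].mean()

def permutation_p_value(cohort, background, n_perm=10000, seed=0):
    rng = np.random.default_rng(seed)
    observed = mean_pairwise_cosine(cohort)
    pooled = np.vstack([cohort, background])
    count = 0
    for _ in range(n_perm):
        idx = rng.choice(len(pooled), size=len(cohort), replace=False)
        if mean_pairwise_cosine(pooled[idx]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)   # small-sample corrected p-value
```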
Project description: The double-yolked (DY) egg is quite popular in some Asian countries because it is considered a sign of good luck; however, the double yolk is one of the reasons such eggs fail to hatch. Automatic methods for identifying DY eggs can increase efficiency in the poultry industry by decreasing egg loss during incubation or improving sale proceeds. In this study, two methods for DY duck egg identification were developed using computer vision technology. Transmittance images of DY and single-yolked (SY) duck eggs were acquired by a CCD camera to identify them according to their shape features. A Fisher's linear discriminant (FLD) model equipped with a set of normalized Fourier descriptors (NFDs) extracted from the acquired images, and a convolutional neural network (CNN) model using primary preprocessed images, were built to recognize duck egg yolk types. The classification accuracies of the FLD model for SY and DY eggs were 100% and 93.2% respectively, while the classification accuracies of the CNN model were 98% and 98.8% respectively. The CNN-based algorithm took about 0.12 s to recognize one sample image, slightly faster than the FLD-based one (about 0.20 s). Finally, this work compared the two classification methods and identified the better one for DY egg identification.
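The sketch below illustrates the normalized-Fourier-descriptor idea: the egg contour is read as a complex boundary signal, its Fourier coefficients are normalized for translation, scale, rotation and starting point, and a linear discriminant is fit on the resulting descriptors. The number of coefficients kept is an assumption.

```python
# Minimal sketch of NFD + FLD shape classification. The descriptor count and
# thresholding of the transmittance images are illustrative assumptions.
import cv2
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def normalized_fourier_descriptors(binary_mask, n_coeffs=16):
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea).squeeze()
    signal = contour[:, 0] + 1j * contour[:, 1]   # boundary as complex signal
    coeffs = np.fft.fft(signal)
    coeffs[0] = 0                    # drop DC -> translation invariance
    coeffs = np.abs(coeffs)          # magnitudes -> rotation / start-point invariance
    coeffs = coeffs / coeffs[1]      # divide by first harmonic -> scale invariance
    return coeffs[2:2 + n_coeffs]

# masks, labels assumed: thresholded transmittance images (uint8) and SY/DY labels.
# X = np.vstack([normalized_fourier_descriptors(m) for m in masks])
# fld = LinearDiscriminantAnalysis().fit(X, labels)
```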
Project description: Background: The explosive adaptive radiation of the cichlid fishes of Lake Malawi has yielded an amazing number of haplochromine species, estimated at 500 to 800, with a surprising degree of diversity not only in color and stripe pattern but also in jaw and body shape. As these morphological diversities have been a central subject of adaptive speciation and taxonomic classification, such high diversity could serve as a foundation for automating species identification of cichlids. Methodology/principal findings: Here we demonstrate a method for automatic classification of the Lake Malawi cichlids based on computer vision and geometric morphometrics. To this end we developed a pipeline that integrates multiple image processing tools to automatically extract informative features of color and stripe patterns from a large set of photographic images of wild cichlids. The extracted information was evaluated by the statistical classifiers Support Vector Machine and Random Forests. Both classifiers performed better when body shape information was added to the color and stripe features: body shape variables boosted the classification accuracy by about 10%. The programs were able to classify 594 live cichlid individuals belonging to 12 different classes (species and sexes) with an average accuracy of 78%, contrasting with a mere 42% success rate by human eyes. The variables that contributed most to the accuracy were body height and the hue of the most frequent color. Conclusions: Computer vision showed notable performance in extracting information from the color and stripe patterns of Lake Malawi cichlids, although the information was not enough for error-free species identification. Our results indicate an unavoidable difficulty in the automatic species identification of cichlid fishes, which may arise from short divergence times and gene flow between closely related species.
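As a simplified illustration of combining colour and shape variables, the sketch below extracts a hue histogram, a dominant-hue index, and a crude body-height ratio, then feeds them to a Random Forest. These stand-in features only echo, and do not reproduce, the paper's colour/stripe and geometric-morphometric pipeline.

```python
# Simplified colour + shape features for fish classification. The specific
# features and forest size are illustrative assumptions.
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def color_shape_features(bgr_image, fish_mask):
    # fish_mask: uint8 binary mask of the fish body.
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    hue = hsv[:, :, 0][fish_mask > 0]
    hue_hist, _ = np.histogram(hue, bins=18, range=(0, 180), density=True)
    dominant_hue = hue_hist.argmax()        # proxy for "most frequent colour"
    x, y, w, h = cv2.boundingRect(fish_mask)
    body_height_ratio = h / max(w, 1)       # crude body-height shape variable
    return np.concatenate([hue_hist, [dominant_hue, body_height_ratio]])

# images, masks, labels assumed: photos, segmentation masks, species/sex labels.
# X = np.vstack([color_shape_features(img, m) for img, m in zip(images, masks)])
# clf = RandomForestClassifier(n_estimators=300).fit(X, labels)
```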
Project description: A dataset of street light images is presented. Our dataset consists of ∼350k images taken from 140 UMBRELLA nodes installed in the South Gloucestershire region of the UK. Each UMBRELLA node is installed on the pole of a lamppost and is equipped with a Raspberry Pi Camera Module v1 facing upwards towards the sky and the lamppost light bulb. Each node collects an image at hourly intervals, 24 h a day. The data collection spans a period of six months. Each image taken is logged as a single entry in the dataset along with the Global Positioning System (GPS) coordinates of the lamppost. All entries in the dataset have been post-processed and labelled based on the operation of the lamppost, i.e., whether the lamppost is switched ON or OFF. The dataset can be used to train deep neural networks and generate pre-trained models providing feature representations for smart city CCTV applications, smart weather detection algorithms, or street infrastructure monitoring. The dataset can be found at 10.5281/zenodo.6046758.
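As a trivial baseline that the ON/OFF labels make easy to evaluate, the sketch below classifies a night-time frame by mean brightness: an upward-facing camera sees a switched-on bulb dominate the image. The threshold is an illustrative assumption (and daytime frames would defeat it); a trained network would replace this heuristic.

```python
# Naive night-time ON/OFF baseline for upward-facing lamppost images.
# The threshold value is an assumption and would need tuning per node.
import cv2
import numpy as np

def lamp_is_on(image_path, threshold=60.0):
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    return float(np.mean(gray)) > threshold

# The dataset's per-image ON/OFF labels allow this heuristic (or a CNN trained
# on the same images) to be scored directly against ground truth.
```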
Project description: In the field of transportation and logistics, smart vision systems have been successfully employed to automate various tasks such as number-plate recognition and vehicle identity recognition. The development of such automated systems is made possible by the availability of large image datasets with proper annotations. The TRODO dataset is a richly annotated collection of odometer displays that can enable automatic mileage reading from raw images. Initially, the dataset consisted of 2613 frames captured under different conditions in terms of resolution, quality, illumination, and vehicle type. After data pre-processing and cleaning, the number of images was reduced to 2389. The images were annotated using the CVAT image annotation tool. The dataset provides the following information for each frame: the type of odometer (analog or digital), the mileage value displayed on the odometer, the bounding boxes of the odometer, and the digits and characters displayed on the screen. Combined with machine learning and artificial intelligence, the TRODO dataset can be used to train odometer classifiers and digit recognition and number reading models for odometers and similar types of displays.
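Since the annotations were produced with CVAT, the sketch below shows how bounding boxes could be read from a "CVAT for images 1.1" XML export. Whether TRODO ships in exactly this format is an assumption, so tag and attribute names may need adjusting to the actual annotation files.

```python
# Minimal sketch of reading bounding boxes from a CVAT XML export
# ("CVAT for images 1.1" layout) -- an assumption about TRODO's format.
import xml.etree.ElementTree as ET

def load_boxes(xml_path):
    boxes = {}
    root = ET.parse(xml_path).getroot()
    for image in root.iter("image"):
        boxes[image.get("name")] = [
            {
                "label": box.get("label"),    # e.g. odometer type or digit class
                "xtl": float(box.get("xtl")),
                "ytl": float(box.get("ytl")),
                "xbr": float(box.get("xbr")),
                "ybr": float(box.get("ybr")),
            }
            for box in image.iter("box")
        ]
    return boxes
```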
Project description: Tremor is one of the most common neurological symptoms. Its clinical and neurobiological complexity necessitates novel approaches for granular phenotyping. Instrumented neurophysiological analyses have proven useful, but are highly resource-intensive and lack broad accessibility. In contrast, bedside scores are simple to administer but lack the granularity to capture subtle yet relevant tremor features. We utilise the open-source computer vision pose-tracking algorithm MediaPipe to track hands in clinical video recordings and use the resulting time series to compute canonical tremor features. This approach is compared to marker-based 3D motion capture, wrist-worn accelerometry, clinical scoring, and a second, specifically trained tremor-specific algorithm in two independent clinical cohorts. These cohorts consisted of 66 patients diagnosed with essential tremor, assessed under different task conditions and states of deep brain stimulation therapy. We find that MediaPipe-derived tremor metrics exhibit high convergent clinical validity with scores (Spearman's ρ = 0.55-0.86, p ≤ .01) as well as an accuracy of up to 2.60 mm (95% CI [-3.13, 8.23]) and ≤0.21 Hz (95% CI [-0.05, 0.46]) for tremor amplitude and frequency measurements, matching gold-standard equipment. MediaPipe, but not the disease-specific algorithm, was capable of analysing videos involving complex configurational changes of the hands. Moreover, it enabled the extraction of tremor features with diagnostic and prognostic relevance, a dimension which conventional tremor scores were unable to provide. Collectively, this demonstrates that current computer vision algorithms can be transformed into an accurate and highly accessible tool for video-based tremor analysis, yielding results comparable to gold-standard tremor recordings.
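A minimal sketch of the described pipeline, under the assumption that the wrist landmark's vertical position is a usable tremor proxy: MediaPipe tracks one hand per frame, and the dominant frequency in a 4-12 Hz band is read from the Welch power spectrum of the resulting time series. The landmark choice, tremor band, and detrending step are illustrative assumptions, not the paper's exact settings.

```python
# Hand tracking with MediaPipe + spectral tremor-frequency estimation.
import cv2
import numpy as np
import mediapipe as mp
from scipy.signal import welch, detrend

def wrist_trajectory(video_path):
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    ys = []
    with mp.solutions.hands.Hands(max_num_hands=1) as hands:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if result.multi_hand_landmarks:
                # Landmark 0 is the wrist; y is in normalised image coordinates.
                ys.append(result.multi_hand_landmarks[0].landmark[0].y)
    cap.release()
    return np.array(ys), fps

def tremor_frequency(signal, fps, band=(4.0, 12.0)):
    # Peak of the Welch power spectrum within an assumed tremor band.
    f, pxx = welch(detrend(signal), fs=fps, nperseg=min(len(signal), 256))
    in_band = (f >= band[0]) & (f <= band[1])
    return f[in_band][pxx[in_band].argmax()]
```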