
Dataset Information

Database of Potential Promoter Sequences in the Capsicum annuum Genome.


ABSTRACT: In this study, we used a mathematical method for the multiple alignment of highly divergent sequences (MAHDS) to create a database of potential promoter sequences (PPSs) in the Capsicum annuum genome. To search for PPSs, 20 statistically significant classes of sequences spanning positions -499 to +100 relative to annotated genes were calculated. For each class, a position-weight matrix (PWM) was computed and then used to identify PPSs in the C. annuum genome. In total, 825,136 PPSs were detected, with a false positive rate of 0.13%. The PPSs obtained with the MAHDS method were tested using TSSFinder, which detects transcription start sites. For each PPS, the database provides its chromosomal coordinates, its alignment with the PWM, and its level of statistical significance expressed as the argument of a normal distribution. The database can be used in genetic engineering and biotechnology.
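
To illustrate the general idea of scanning a genome with a position-weight matrix, the following is a minimal sketch of a generic log-odds PWM scan. It is not the MAHDS method described in the abstract, which aligns highly divergent sequences and allows for insertions and deletions. The function names (`build_pwm`, `score_window`, `scan`), the pseudocount, the uniform background frequency, and the toy sequences are all assumptions made for illustration and do not come from the study.

```python
import math

BASES = "ACGT"


def build_pwm(aligned_seqs, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length, gap-free aligned sequences.

    Assumption: uniform background frequency and a simple additive pseudocount.
    """
    length = len(aligned_seqs[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so no frequency is zero.
        counts = {base: pseudocount for base in BASES}
        for seq in aligned_seqs:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against the background.
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(pwm, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy example with invented sequences (hypothetical, not data from the study).
toy_alignment = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
toy_pwm = build_pwm(toy_alignment)
toy_hits = scan(toy_pwm, "GGGCTATAATGCC", threshold=5.0)
```

In this sketch the threshold is an arbitrary cutoff on the raw score. The study instead reports statistical significance as the argument of a normal distribution, which would require estimating the score distribution on randomized sequences.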

SUBMITTER: Rudenko V 

PROVIDER: S-EPMC9332048 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC4888997 | biostudies-literature
| S-EPMC7748824 | biostudies-literature
2011-04-30 | GSE28751 | GEO
| S-EPMC6332240 | biostudies-literature
2021-06-02 | GSE49432 | GEO
| S-EPMC5980001 | biostudies-literature
| S-EPMC7412467 | biostudies-literature
| PRJNA435683 | ENA
| PRJNA438562 | ENA
| PRJNA121467 | ENA