
Dataset Information


CodSeqGen: A tool for generating synonymous coding sequences with desired GC-contents.


ABSTRACT: Identifying regulatory elements is essential for understanding the mechanisms that regulate gene expression. These regulatory elements, located in or near genes, bind to proteins called transcription factors to initiate transcription. Their occurrence is influenced by GC-content, or nucleotide composition. Two stochastic methods exist for generating synthetic coding sequences with a pre-specified amino acid sequence and a desired GC-content: multinomial and maximum entropy. Both rely on the probability of choosing each synonymous codon for a given amino acid. Although the latter behaves in an unbiased manner, the sequences it produces do not exactly satisfy the GC-content constraint. In this paper, we present an algorithmic solution that produces coding sequences that exactly follow a primary amino acid sequence and a desired GC-content. The proposed tool, CodSeqGen, relies on random selection of smaller subsets that are then traversed using a backtracking approach.
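The idea in the abstract, choosing synonymous codons at random while backtracking so that the GC constraint is met exactly, can be illustrated with a small sketch. This is not the CodSeqGen algorithm itself, whose details are not given on this page. It is a minimal illustration that assumes the standard genetic code, expresses the GC target as an absolute count of G/C bases rather than a fraction, and uses hypothetical names (`generate`, `gc`, `SYNONYMS`).

```python
import random

# Standard genetic code in the conventional TCAG ordering (an assumption of
# this sketch; the page does not say which code table the tool uses).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
SYNONYMS = {}
for codon, aa in zip(CODONS, AMINO):
    SYNONYMS.setdefault(aa, []).append(codon)


def gc(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


def generate(protein, gc_target, rng=None):
    """Return a coding sequence for `protein` containing exactly `gc_target`
    G/C bases, or None if no such sequence exists (hypothetical helper)."""
    rng = rng or random.Random()
    n = len(protein)
    # lo[i] / hi[i]: fewest / most G/C bases that positions i..n-1 can supply.
    lo, hi = [0] * (n + 1), [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        counts = [gc(c) for c in SYNONYMS[protein[i]]]
        lo[i] = lo[i + 1] + min(counts)
        hi[i] = hi[i + 1] + max(counts)

    chosen = []       # codons picked so far, one per residue
    untried = []      # untried[i]: shuffled codons not yet tried at position i
    remaining = gc_target
    while len(chosen) < n:
        i = len(chosen)
        if len(untried) == i:
            options = SYNONYMS[protein[i]][:]
            rng.shuffle(options)          # random selection order
            untried.append(options)
        picked = None
        while untried[i]:
            codon = untried[i].pop()
            left = remaining - gc(codon)
            if lo[i + 1] <= left <= hi[i + 1]:   # prune unreachable targets
                picked = codon
                break
        if picked is None:
            # Dead end: discard this position and undo the previous choice.
            untried.pop()
            if not chosen:
                return None
            remaining += gc(chosen.pop())
        else:
            chosen.append(picked)
            remaining -= gc(picked)
    return "".join(chosen) if remaining == 0 else None
```

For example, `generate("MKTAYIAKQR", 15)` returns a 30-nucleotide sequence that still translates to the same peptide and contains exactly 15 G/C bases, while `generate("MW", 0)` returns `None` because Met and Trp each have a single codon. The bound check on the remaining positions keeps the search short; a stochastic method that samples codons independently has no such guarantee of hitting the target exactly.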

SUBMITTER: Al-Ssulami AM 

PROVIDER: S-EPMC7127556 | biostudies-literature | 2020 Jan

REPOSITORIES: biostudies-literature


Publications

CodSeqGen: A tool for generating synonymous coding sequences with desired GC-contents.

Al-Ssulami Abdulrakeeb M, Azmi Aqil M, Hussain Muhammad

Genomics 2019-02-06 1



Similar Datasets

| S-EPMC5106001 | biostudies-literature
| S-EPMC2989939 | biostudies-literature
| S-EPMC5438181 | biostudies-literature
| S-EPMC5014483 | biostudies-literature
| S-EPMC4911796 | biostudies-literature
| S-EPMC4316630 | biostudies-literature
2017-10-30 | GSE103037 | GEO
2017-10-30 | GSE103112 | GEO
2017-10-30 | GSE76406 | GEO
| S-EPMC4022393 | biostudies-literature