
Dataset Information


Distinguishing Functional DNA Words; A Method for Measuring Clustering Levels.


ABSTRACT: Functional DNA sub-sequences and genome elements are spatially clustered throughout the genome, just as keywords are in literary texts. Some of the methods for ranking words in texts can therefore also be used to compare different DNA sub-sequences. By analogy with literary texts, we claim that the distribution of distances between successive occurrences of a sub-sequence (word) is q-exponential, the distribution function of non-extensive statistical mechanics, so the q-parameter can serve as a measure of a word's clustering level. We analyzed the distribution of distances between consecutive occurrences of each of the 16 possible dinucleotides in human chromosomes to obtain their corresponding q-parameters. We found that CG, a two-letter word that is biologically important because of its methylation, has the highest clustering level. This finding shows the method's predictive ability in biology. We also propose that chromosome 18, which has the largest q-parameter value for gene promoters, is more sensitive to dietary and lifestyle factors. We extended the study to compare the genomes of some selected organisms and concluded that the clustering level of CGs is higher in evolutionarily higher organisms than in lower ones.
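The quantity the abstract analyzes, the distances between consecutive occurrences of a word, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `occurrence_gaps` and `q_exponential_decay` and the scale parameter `beta` are hypothetical, and the decay form shown is the standard Tsallis q-exponential, which may differ in parameterization from the form fitted in the paper.

```python
import math


def occurrence_gaps(sequence, word):
    """Return distances between start positions of successive occurrences of `word`.

    Overlapping occurrences are counted (assumption: the record does not say
    how the paper treats overlaps).
    """
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        start = sequence.find(word, start + 1)
    return [b - a for a, b in zip(positions, positions[1:])]


def q_exponential_decay(x, q, beta=1.0):
    """Standard Tsallis q-exponential decay e_q(-beta * x) for x >= 0 and q >= 1.

    e_q(-beta * x) = [1 + (q - 1) * beta * x] ** (-1 / (q - 1)),
    which reduces to exp(-beta * x) as q -> 1.
    `beta` is a hypothetical scale parameter used only for illustration.
    """
    if x < 0 or q < 1 or beta <= 0:
        raise ValueError("this sketch assumes x >= 0, q >= 1 and beta > 0")
    if abs(q - 1.0) < 1e-12:
        return math.exp(-beta * x)
    return (1.0 + (q - 1.0) * beta * x) ** (-1.0 / (q - 1.0))


if __name__ == "__main__":
    toy = "CGACGTTCGCG"  # toy sequence, not real genomic data
    print(occurrence_gaps(toy, "CG"))
    print(q_exponential_decay(2.0, q=2.0))
```

In the toy sequence, CG starts at positions 0, 3, 7 and 9, so the gaps are 3, 4 and 2. Larger q gives a heavier tail than the ordinary exponential, which is why a larger fitted q is read as stronger clustering; estimating q from real gap data would need a fitting step that this sketch leaves out.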

SUBMITTER: Moghaddasi H 

PROVIDER: S-EPMC5269680 | biostudies-literature | 2017 Jan

REPOSITORIES: biostudies-literature


Publications

Distinguishing Functional DNA Words; A Method for Measuring Clustering Levels.

Moghaddasi H, Khalifeh K, Darooneh AH

Scientific Reports, 2017 Jan 27



Similar Datasets

| S-EPMC6437941 | biostudies-literature
| S-EPMC3342474 | biostudies-literature
| S-EPMC55324 | biostudies-literature
2010-05-27 | E-GEOD-19788 | biostudies-arrayexpress
2010-05-27 | GSE19788 | GEO
| S-EPMC3477851 | biostudies-literature
| S-EPMC2265534 | biostudies-literature
| S-EPMC6403383 | biostudies-literature
| S-EPMC6638850 | biostudies-literature
| S-EPMC1774579 | biostudies-literature