
Dataset Information


The central exons of the human MUC2 and MUC6 mucins are highly repetitive and variable in sequence between individuals.


ABSTRACT: The DNA sequences of the two human mucin genes MUC2 and MUC6 have not been completely resolved because of the repetitive nature of their central exons, which code for sequences rich in proline, threonine and serine. The exact nucleotide sequences of these exons have long remained unknown owing to the limitations of traditional sequencing techniques. These exons are still very poorly covered in new whole-genome sequencing projects, and the corresponding protein sequences are partly missing. We used a BAC clone containing both genes and a third-generation sequencing technology, SMRT sequencing, to obtain the full-length contiguous MUC2 and MUC6 tandem repeat sequences. The new sequences span the entire repeat regions with good coverage, revealing their length, the variation in repeat sequences and their internal organization. Comparing the sequences obtained with available sequences from whole-genome sequencing projects indicated variation between individuals in the number of repeats and their internal organization. The lack of these sequences has limited the association of genetic alterations with disease. The full sequences of these mucins will now allow such studies, which could be of importance for inflammatory bowel diseases in the case of MUC2 and gastric ulcer diseases in the case of MUC6, where deficient mucus protection is assumed to play an important role.

SUBMITTER: Svensson F 

PROVIDER: S-EPMC6269512 | biostudies-literature | 2018 Nov

REPOSITORIES: biostudies-literature


Publications

The central exons of the human MUC2 and MUC6 mucins are highly repetitive and variable in sequence between individuals.

Frida Svensson, Tiange Lang, Malin E. V. Johansson, Gunnar C. Hansson

Scientific Reports, 30 Nov 2018, issue 1



Similar Datasets

| S-EPMC3706859 | biostudies-other
| S-EPMC8813977 | biostudies-literature
| S-EPMC7205180 | biostudies-literature
| S-EPMC5538358 | biostudies-literature
| S-EPMC3930937 | biostudies-literature
| S-EPMC6804933 | biostudies-literature
| S-EPMC3128253 | biostudies-literature
| S-EPMC1220647 | biostudies-other
| S-EPMC5576652 | biostudies-literature
| S-EPMC1885805 | biostudies-literature