Project description: Whole-genome sequencing (WGS) data are reported here for bacterial strain IITK SM2, isolated from an aquifer in the middle Indo-Gangetic plain, along with its physiological, morphological, biochemical, and redox-transformation characteristics in the presence of dissolved arsenic (As). The aquifer exhibits oxidizing conditions with respect to As speciation. Analyses based on 16S rRNA and recN sequences indicate that IITK SM2 clusters with C. youngae NCTC 13708T and C. pasteuri NCTC UMH17T. However, WGS analyses using digital DNA-DNA hybridization (dDDH) and Rapid Annotations using Subsystems Technology (RAST) suggest that IITK SM2 is a strain of C. youngae. This strain can effectively reduce As(V) to As(III) but cannot oxidize As(III) to As(V). It exhibited high resistance to As(V) [32,000 mg L⁻¹] and As(III) [1,100 mg L⁻¹], as well as to certain other heavy metals typically found in contaminated groundwater. WGS analysis also indicates the presence of the As-metabolizing genes arsC, arsB, arsA, arsD, arsR, and arsH in this strain. Although these genes have been identified in several As(V)-reducers, their arrangement as an arsACBADR cluster, an arsCBRH cluster, and an independent arsC gene has not been observed in any other Citrobacter species or in the other selected As(V)-reducing strains of the Enterobacteriaceae family. Moreover, IITK SM2 differed from the other strains in the number of genes associated with membrane transporters, virulence and defense, motility, protein metabolism, and phages, prophages, and transposable elements. This genomic dataset will facilitate subsequent molecular and biochemical analyses of strain IITK SM2 to identify the reasons for high arsenic resistance in Citrobacter youngae and to understand its role in As mobilization in aquifers of the middle Indo-Gangetic plain.