Project description:The Information Retrieval user experience has remained largely unchanged since its inception for computers and mobile devices alike. However, recent developments in Virtual Reality hardware (pioneered by Oculus Rift in 2013) could introduce a new environment for Information Retrieval. This paper reports the results of a Scoping Literature Review (PRISMA-ScR) that rigorously examines the entire body of relevant literature with reproducible methods. The following research questions are answered: "What prototypes and concepts of Virtual Reality Information Retrieval systems with current generation hardware exist?", "How are user interaction and especially user input realised in these systems?", "What Retrieval features are used in these systems?", "How are search results displayed in these systems?" and "Can these VR IR systems compare to traditional (non-VR) IR systems?". After querying Google Scholar, Scopus and Web of Science, 1042 documents were reviewed in depth. Key features and attributes of the systems are summarised and discussed. Sketches of the user interfaces are included as well. The 30 documents that were relevant to the research questions include 16 distinct systems or theories. They discuss and utilise several user input technologies, ranging from controllers to voice input and hand tracking. Although conventional retrieval features are less common, systems enable retrieval of literature, 3D objects, images, books and texts and arrange them in a virtual space (e.g. as grids, arcs or maps). Finally, many of these systems were compared to conventional counterparts through user evaluation (n = 10). Most found user task times to be shorter (n = 5) or equal (n = 3). In the seven papers that measured user performance (rate of correct solutions), three reported better performance and one reported equal performance. Notably, users were always more satisfied with the Virtual Reality systems than with conventional ones.
Possible limitations of these evaluations are demographic selection and the quality of baseline systems (control).
Project description:Systems biology is a data-heavy field that focuses on systems-wide depictions of biological phenomena, necessarily sacrificing a detailed characterization of individual components. As an example, genome-wide protein interaction networks are widely used in systems biology and continuously extended and refined as new sources of evidence become available. Despite the vast amount of information about individual protein structures and protein complexes that has accumulated in the past 50 years in the Protein Data Bank, the data, computational tools, and language of structural biology are not an integral part of systems biology. However, increasing effort has been devoted to this integration, and the related literature is reviewed here. Relationships between proteins that are detected via structural similarity offer a rich source of information not available from sequence similarity, and homology modeling can be used to leverage Protein Data Bank structures to produce 3D models for a significant fraction of many proteomes. A number of structure-informed genomic and cross-species (i.e., virus-host) interactomes will be described, and the unique information they provide will be illustrated with a number of examples. Tissue- and tumor-specific interactomes have also been developed through computational strategies that exploit patient information and through genetic interactions available from increasingly sensitive screens. Strategies to integrate structural information with these alternate data sources will be described. Finally, efforts to link protein structure space with chemical compound space offer novel sources of information in drug design, off-target identification, and the identification of targets for compounds found to be effective in phenotypic screens.
Project description:Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four or fewer subunits, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes.
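The graph representation mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each subunit is a node labelled with a family identifier and that each subunit-subunit interface is an undirected edge. The function name `same_topology` and the label scheme are hypothetical and are not taken from the 3D Complex implementation. Because most complexes have few subunits, a brute-force search over node permutations is enough to test whether two complexes share a topology.

```python
from itertools import permutations


def same_topology(labels_a, edges_a, labels_b, edges_b):
    """Return True if two small labelled graphs are isomorphic.

    labels_*: list of subunit labels (hypothetical family identifiers).
    edges_*:  list of (i, j) index pairs, one per subunit interface.
    Brute force over permutations; only suitable for small complexes.
    """
    n = len(labels_a)
    set_a = {frozenset(e) for e in edges_a}
    set_b = {frozenset(e) for e in edges_b}
    if n != len(labels_b) or len(set_a) != len(set_b):
        return False
    for perm in permutations(range(n)):
        # perm maps node i of complex A onto node perm[i] of complex B.
        if any(labels_a[i] != labels_b[perm[i]] for i in range(n)):
            continue
        mapped = {frozenset(perm[i] for i in edge) for edge in set_a}
        if mapped == set_b:
            return True
    return False


# Two homotetramers arranged as rings, with different chain numbering.
ring_1 = (["A"] * 4, [(0, 1), (1, 2), (2, 3), (3, 0)])
ring_2 = (["A"] * 4, [(0, 2), (2, 1), (1, 3), (3, 0)])
# A homotetramer arranged as a triangle with one extra subunit attached.
tailed = (["A"] * 4, [(0, 1), (1, 2), (2, 0), (2, 3)])

print(same_topology(*ring_1, *ring_2))  # same arrangement
print(same_topology(*ring_1, *tailed))  # different arrangement
```

Grouping complexes by such an equivalence test is one way a nonredundant set at the topology level could be derived; finer levels of a hierarchy would need additional criteria that this sketch does not model.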
Project description:QSCOP is a quantitative structural classification of proteins which distinguishes itself from other classifications by two essential properties: (i) QSCOP is concurrent with the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank and (ii) QSCOP covers the widely used SCOP classification with layers of quantitative structural information. The QSCOP-BLAST web server presented here combines the BLAST sequence search engine with QSCOP to retrieve, for a given query sequence, all structural information currently available. The resulting search engine is reliable in terms of the quality of results obtained, and it is efficient in that results are displayed instantaneously. The hierarchical organization of QSCOP is used to control the redundancy and diversity of the retrieved hits with the benefit that the often cumbersome and difficult interpretation of search results is an intuitive and straightforward exercise. We demonstrate the use of QSCOP-BLAST by example. The server is accessible at http://qscop-blast.services.came.sbg.ac.at/
Project description:As functional components in the three-dimensional (3D) conformation of an RNA, RNA structural motifs provide an easy way to associate molecular architectures with their biological mechanisms. In recent years, many computational tools have been developed to search for motif instances by using the existing knowledge of well-studied families. Recently, with the rapidly increasing number of resolved RNA 3D structures, there is an urgent need to discover novel motifs from the newly available information. In this work, we classify all the loops in non-redundant RNA 3D structures to detect plausible RNA structural motif families by using a clustering pipeline. Compared with other clustering approaches, our method has two benefits. First, the underlying alignment algorithm is tolerant to the variations in 3D structures. Second, sophisticated downstream analysis has been performed to ensure the clusters are valid and easily applied to further research. The final clustering results contain many interesting new variants of known motif families, such as GNAA tetraloop, kink-turn, sarcin-ricin and T-loop. We have also discovered potential novel functional motifs conserved in ribosomal RNA, sgRNA, SRP RNA, riboswitch and ribozyme.
Project description:Residue types at the interface of protein-protein complexes (PPCs) are known to be reasonably well conserved. However, we show, using a dataset of known 3-D structures of homologous transient PPCs, that the 3-D location of interfacial residues and their interaction patterns are only moderately and poorly conserved, respectively. Another surprising observation is that a residue at the interface that is conserved is not necessarily in the interface in the homolog. Such differences in homologous complexes are manifested by substitution of the residues that are spatially proximal to the conserved residue and structural differences at the interfaces as well as differences in spatial orientations of the interacting proteins. Conservation of interface location and the interaction pattern at the core of the interfaces is higher than at the periphery of the interface patch. Extents of variability of various structural features reported here for homologous transient PPCs are higher than the variation in homologous permanent homomers. Our findings suggest that straightforward extrapolation of interfacial nature and inter-residue interaction patterns from template to target could lead to serious errors in the modeled complex structure. Understanding the evolution of interfaces provides insights to improve comparative modeling of PPC structures.
Project description:In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial assumptions about the structure of data. Here, we reformulate the clustering problem from an information theoretic perspective that avoids many of these assumptions. In particular, our formulation obviates the need for defining a cluster "prototype," does not require an a priori similarity metric, is invariant to changes in the representation of the data, and naturally captures nonlinear relations. We apply this approach to different domains and find that it consistently produces clusters that are more coherent than those extracted by existing algorithms. Finally, our approach provides a way of clustering based on collective notions of similarity rather than the traditional pairwise measures.
Project description:Non-planar Fin Field Effect Transistors (FinFET) are already present in modern devices. The evolution from the well-established 2D planar technology to the design of 3D nanostructures gave rise to new fabrication processes, but a technique capable of fully characterizing these structures, particularly their dopant distribution, in a representative (high statistics) way is still lacking. Here we propose a methodology based on Medium Energy Ion Scattering (MEIS) to address this issue, allowing structural and compositional quantification of advanced 3D FinFET devices with nanometer spatial resolution. When ions are backscattered, their energy losses reveal the chemistry of the different 3D compounds present in the structure. The FinFET periodicity generates oscillatory features as a function of backscattered ion energy and, in fact, these features allow a complete description of the device dimensions. Additionally, each measurement is performed over more than a thousand structures, making it highly representative in a statistical sense. Finally, independent measurements using electron microscopy corroborate the proposed methodology.
Project description:The exclusion of monolingual natives from cyberspace is a global socioeconomic and cultural problem. Efforts to address this problem have been socioeconomic, culminating in training, empowerment, and digital access with the indelible hurt of language inequities. This paper is aimed at the cyber-inclusion of monolingual natives. Since cyber participation is basically through human interaction with cyber-applications in a human language, encapsulating these applications for interaction in any human language will help evade the hurt of language inequities. The information retrieval system (IRS) remains a fundamental cyber-application. Consequently, adopting the design science research methodology, we introduced a lingual agnostic IRS architecture designed on the principle of transparency in user language detection, information translation, and caching. The detailed design of the architecture was done using the unified modeling language. The designed IRS architecture has been implemented using the agile and component-based software engineering approaches. The resultant lingual agnostic IRS (LAIRS) was evaluated using heuristics and system evaluation methods for parity of language of interaction against the default language and was stable across queries and languages, guaranteeing 86% parity with the default language in the use of other languages for information access and retrieval. Furthermore, it has been shown that LAIRS is the most appropriate IRS to address the problem of language barriers to cyber-inclusion compared with existing IRSs.
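The three principles named above (transparent language detection, translation, and caching) can be sketched as a thin wrapper around an existing search function. This is a minimal illustration under stated assumptions, not the LAIRS design: the class name `LingualAgnosticSearch` and the injected `detect`, `translate`, and `search` callables are hypothetical placeholders for whatever detection, translation, and retrieval components a real system would use.

```python
from typing import Callable, Dict, List, Tuple


class LingualAgnosticSearch:
    """Wrap a default-language search engine so users can query in any language.

    All three callables are hypothetical and injected by the caller:
      detect(text) -> language code
      translate(text, src, dst) -> translated text
      search(query) -> list of result strings in the default language
    """

    def __init__(
        self,
        detect: Callable[[str], str],
        translate: Callable[[str, str, str], str],
        search: Callable[[str], List[str]],
        default_lang: str = "en",
    ) -> None:
        self.detect = detect
        self.translate = translate
        self.search = search
        self.default_lang = default_lang
        # Cache keyed by (text, source language, target language).
        self._cache: Dict[Tuple[str, str, str], str] = {}

    def _translate_cached(self, text: str, src: str, dst: str) -> str:
        if src == dst:
            return text  # Nothing to translate.
        key = (text, src, dst)
        if key not in self._cache:
            self._cache[key] = self.translate(text, src, dst)
        return self._cache[key]

    def query(self, text: str) -> List[str]:
        # 1. Detect the user's language without asking the user.
        lang = self.detect(text)
        # 2. Translate the query into the engine's default language.
        default_query = self._translate_cached(text, lang, self.default_lang)
        # 3. Retrieve, then translate each hit back to the user's language.
        hits = self.search(default_query)
        return [self._translate_cached(h, self.default_lang, lang) for h in hits]
```

Injecting the components keeps the wrapper independent of any particular translation service, and the cache avoids repeating a translation for queries or results that have been seen before.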
Project description:Letter-cued word fluency is conceptualized as a phonemically guided word retrieval process. Accordingly, word clusters typically are defined solely by their phonemic similarity. We investigated semantic clustering in two letter-cued (P and S) word fluency task performances by 315 healthy adults, each for 1 min. Singular value decomposition (SVD) and generalized topological overlap measure (GTOM) were applied to verbal outputs to conservatively extract clusters of high-frequency words. The results generally confirmed phonemic clustering. However, we also found considerable semantic/associative clusters of words (e.g., pen, pencil, and paper), and some words showed both phonemic and semantic associations within a single cluster (e.g., pair, pear, peach). We conclude that letter-cued fluency is not necessarily a purely phonemic word retrieval process. Strong automatic semantic activation mechanisms play an important role in letter-cued lexical retrieval. Theoretical conceptualizations of the word retrieval process with phonemic cues may also need to be reexamined in light of these analyses.