Dataset Information

Enabling Hardware Affinity in JVM-Based Applications: A Case Study for Big Data

ABSTRACT: Java has been the backbone of Big Data processing for more than a decade due to its interesting features such as object orientation, cross-platform portability and good programming productivity. In fact, most popular Big Data frameworks such as Hadoop and Spark are implemented in Java or using other languages designed to run on the Java Virtual Machine (JVM) such as Scala. However, modern computing hardware is increasingly complex, featuring multiple processing cores aggregated into one or more CPUs that are usually organized as a Non-Uniform Memory Access (NUMA) architecture. The platform-independent features of the JVM come at the cost of hardware abstraction, which makes it more difficult for Big Data developers to take advantage of hardware-aware optimizations based on managing CPU or NUMA affinities. In this paper we introduce jhwloc, a Java library for easily managing such affinities in JVM-based applications and gathering information about the underlying hardware topology. To demonstrate the functionality and benefits of our proposal, we have extended Flame-MR, our Java-based MapReduce framework, to provide support for setting CPU affinities through jhwloc. The experimental evaluation using representative Big Data workloads has shown that performance can be improved by up to 17% when efficiently exploiting the hardware. jhwloc is publicly available to download at https://github.com/rreye/jhwloc.

SUBMITTER: Krzhizhanovskaya V

PROVIDER: S-EPMC7302232 | biostudies-literature | 2020 May

REPOSITORIES: biostudies-literature

