HDF5 for efficient I/O

Fast Containers

Lead Image © Kirsty Pargeter, 123RF.com

Article from Issue 205/2017

HDF5 is a flexible, self-describing, and portable hierarchical data format that is supported by a number of languages and tools and can perform I/O in parallel.

Input/output (I/O) operations are an important part of many applications, which sometimes handle huge amounts of data through a large number of reads and writes. Applications can therefore spend a significant portion of their total run time performing I/O, which becomes critical in Big Data, machine learning, and high-performance computing (HPC).

In a previous article [1], I discussed options for improving I/O performance, focusing on parallel I/O. One of the options mentioned was to use a high-level library to perform the I/O. A great example of such a library is the Hierarchical Data Format (HDF) [2], a standard library used primarily for scientific computing.

In this article, I introduce HDF5, focusing on its concepts and its strengths in performing I/O; then, I look at some simple Python and Fortran code examples, before ending with an example of parallel I/O with HDF5 and Fortran.
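To make the terms "hierarchical" and "self-describing" concrete, here is a minimal sketch in Python. It is not taken from the article: it assumes the third-party h5py and NumPy packages are installed, and the group name, dataset name, attribute, and file name are all hypothetical, chosen only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np


def write_demo(path):
    """Write one group holding one dataset, with a units attribute."""
    with h5py.File(path, "w") as f:
        # Groups play the role of directories inside the file.
        grp = f.create_group("experiment")
        # Datasets are typed, shaped arrays stored under a group.
        dset = grp.create_dataset("temperature",
                                  data=np.arange(6.0).reshape(2, 3))
        # Attributes keep metadata next to the data they describe.
        dset.attrs["units"] = "K"


def read_demo(path):
    """Return the object names, the dataset shape, and its units attribute."""
    found = []
    with h5py.File(path, "r") as f:
        # visit() calls the function once per object name in the file.
        f.visit(found.append)
        dset = f["experiment/temperature"]
        return sorted(found), dset.shape, dset.attrs["units"]


# Hypothetical file name in a fresh temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
write_demo(demo_path)
names, shape, units = read_demo(demo_path)
print(names, shape, units)
```

The reader needs no prior knowledge of the layout: the object names, the array shape, and the units are all recovered from the file itself, which is what "self-describing" means in practice.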



