e-Book - Redis in Action

This book covers the use of Redis, an in-memory database/data structure server.

    6.6 Distributing files with Redis

    When building distributed software and systems, it’s common to need to copy, distribute,
    or process data files on more than one machine. There are a few different common
    ways of doing this with existing tools. If we have a single server that will always
    have files to be distributed, it’s not uncommon to use NFS or Samba to mount a path
    or drive. If we have files whose contents change little by little, it’s also common to use
    a piece of software called Rsync to minimize the amount of data to be transferred
    between systems. Occasionally, when many copies need to be distributed among
    machines, a protocol called BitTorrent can be used to reduce the load on the server
    by partially distributing files to multiple machines, which then share their pieces
    among themselves.

    Unfortunately, all of these methods have a significant setup cost, and their value
    depends on the situation. NFS and Samba can work well, but both can have significant issues
    when network connections aren’t perfect (or even if they are perfect), due to the way
    both of these technologies are typically integrated with operating systems. Rsync is
    designed to handle intermittent connection issues, since each file or set of files can be
    partially transferred and resumed, but it suffers from needing to download complete
    files before processing can start, and requires interfacing our software with Rsync in
    order to fetch the files (which may or may not be a problem). And though BitTorrent
    is an amazing technology, it only really helps if we’re running into limits sending from
    our server, or if our network is underutilized. It also relies on interfacing our software
    with a BitTorrent client that may not be available on all platforms, and which may not
    have a convenient method to fetch files.

    Each of the three methods described also requires setup and maintenance of users,
    permissions, and/or servers. Because we already have Redis installed, running, and
    available, we’ll use Redis to distribute files instead. By using Redis, we bypass issues
    that some other software has: our client handles connection issues well, we can fetch
    the data directly with our clients, and we can start processing data immediately (no
    need to wait for an entire file).
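
    To make the last point concrete, here is a minimal sketch of reading a file that
    has been stored in Redis as a single string value, one block at a time, so that
    processing can begin before the whole file has been fetched. It is an illustration
    under stated assumptions, not necessarily the approach developed later in this
    section: the names read_blocks and count_lines, the default block size, and the
    idea of storing each file under one string key are all hypothetical. The only
    Redis operation it relies on is GETRANGE, which redis-py exposes as
    getrange(key, start, end) with inclusive offsets.

```python
def read_blocks(conn, key, block_size=2**17):
    """Yield the string stored at `key` in chunks of at most `block_size` bytes.

    `conn` is assumed to be any object with a redis-py style
    getrange(key, start, end) method, where both offsets are inclusive.
    """
    pos = 0
    while True:
        # GETRANGE returns an empty string past the end of the value
        # (or when the key does not exist), which ends the loop.
        block = conn.getrange(key, pos, pos + block_size - 1)
        if not block:
            break
        yield block
        pos += len(block)


def count_lines(conn, key, block_size=2**17):
    """Hypothetical consumer: count newlines without holding the whole file."""
    return sum(block.count(b"\n")
               for block in read_blocks(conn, key, block_size))
```

    With a redis-py client, this might be called as
    count_lines(redis.Redis(), "file:demo"), where "file:demo" is a made-up key
    name. Because read_blocks is a generator, the caller can start working on the
    first block while later blocks are still on the server, at the price of one
    extra round trip to detect the end of the data.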