Setting up a file server cluster with Samba and CTDB
For a CTDB setup, the Samba developers recommend (at least) two – preferably physically separated – networks: a public network from which the clients will access the available services (Samba, NFS, ftp, …) and a private network, which CTDB uses to handle internal communications within the cluster.
The network for the cluster filesystem can be a separate network, or it can be the same network that CTDB uses internally. A separate management network can turn out to be a good thing for, say, SSH logins on the nodes. Figure 1 shows the basic configuration of a CTDB cluster.
See the box titled "Downloading and Compiling CTDB" for information on how to add CTDB to your own Samba implementation. CTDB's central configuration file is /etc/sysconfig/ctdb. The really important thing is to specify the recovery lock file via the CTDB_RECOVERY_LOCK variable. On top of this, the admin user has to populate the /etc/ctdb/nodes file with the IP addresses of all the CTDB nodes on the private network (Listing 1). This file also has to be identical on all nodes.
Downloading and Compiling CTDB
The Samba project has been using the decentralized Git  code management system since late in 2007. The developers maintain Samba and CTDB on the server at git://git.samba.org or on the web front end . The branches for the official Samba versions and the master developer branch are available from the git://git.samba.org/samba.git repository. The mirror  will even give you tarball snapshots of every single revision.
The official CTDB sources are available from Ronnie Sahlberg's repository . The repository at git://git.samba.org/obnox/samba-ctdb.git contains Samba versions with cluster extensions based on the official release branches – in particular, a production-ready cluster variant of Samba version 3.2 (v3-2-ctdb). As of this writing, the CTDB software will run on Linux and AIX. The normal sequence of commands will build and install the software:
cd ctdb/ ./autogen.sh ./configure [options] make make install
You don't need any special configure options. The normal --prefix allows the administrator to customize the installation directories. On RPM systems, you can generate a package directly from a Git checkout:
01 192.168.46.70 02 192.168.46.71 03 192.168.46.72
If you have Samba with cluster support (see the box "How-to Build Your Own Samba"), you will want to configure it with your own smb.conf parameters. The clustering = yes parameter enables clustering at run time. Without this parameter, the Samba clustering version will work like any old version of Samba without cluster support.
Despite what various pages of the Samba wiki say , you won't need to locate private dir on the cluster filesystem (well, maybe for a local smbpsswd). This information only applies to earlier versions of CTDB that could not handle persistent TDB databases, such as secrets.tdb and passdb.tdb in private dir. Current versions of CTDB automatically distribute the persistent TDBs over the cluster.
If you need group mappings, you must change the back end from the default of ldb to tdb with groupdb:backend = tdb.
Samba uses an identification code to store the locking information: smbd typically creates this code by stat()ing the file's device and inode number. However, the cluster setup needs an ID that is valid for multiple nodes because the device number is not invariable for the file in the cluster. The VFS fileid module provides an alternative approach to forming a file ID that is valid throughout the cluster. The vfs objects = fileid parameter in the corresponding configuration section enables the fileid module either globally or for a share. The value of the fileid:algorithm option in the [global] section configures the method, as in
vfs objects = fileid fileid:algorithm = fsid
How-to Build Your Own Samba
If you can not, or prefer not to, use prebuilt packages, you can build and install a cluster-capable Samba 3.3 from the source code using the standard sequence of commands:
cd samba/source ./autogen.sh ./configure --with-cluster-support --with-ctdb=/usr/include --with-shared-modules=idmap_tdb2 [Options] ./make everything ./make install
You only need to call autogen.sh if you are using a Git repository snapshot instead of the release tarball. The --with-ctdb= configure parameter specifies where the CTDB headers are on the system. Samba needs them to compile the code for communications with CTDB. If you have already installed CTDB from a package, /usr/include is normally okay.
Building the cluster variant of the standard ID mapping module, idmap_tdb, adds idmap_tdb2 to the list of modules in --with-shared-modules=. The Samba team is currently working on merging idmap_tdb and idmap_tdb2 to support idmap_tdb in the cluster. One of the next Samba versions will probably resolve this issue.
cd samba/source ./packaging/RHEL-CTDB/makerpms.sh
generate RPMs for Red Hat and SUSE systems directly from a Git checkout.
To distribute public IP addresses across the cluster nodes you can use any of three options. For example, you can assign addresses statically without involving CTDB. In this case, CTDB can't play its high-availability card. Or, you can use a single IP address as the public cluster address in what is known as LVS mode and let the LVS master node distribute the address to the participating nodes. Setting the CTDB_LVS_PUBLIC_IP and CTDB_PUBLIC_INTERFACE variables in /etc/sysconfig/ctdb enables this mode.
The third method is to allow CTDB to dynamically assign multiple public IP addresses to the nodes. In combination with round-robin DNS upstream, this option adds load balancing and high availability to your CTDB cluster. To allow this to happen, you need to specify a file – typically /etc/ctdb/public_addresses – with the /etc/sysconfig/ctdb CTDB_PUBLIC_ADDRESSES variable on each node; the file contains the address pool with the netmasks and network interfaces that CTDB will assign to the nodes.
The address list does not need to be on every node, and it does not need to be the same on each node. Instead, you can take the network topology of your public network into consideration and create partitions. If a node fails, CTDB transfers its public IP addresses to other cluster nodes, which have these addresses in their public_addresses lists.
It is important to understand that load balancing and client distribution over the client nodes are connection oriented. If an IP address is switched from one node to another, all the connections actively using this IP address are dropped and the clients have to reconnect.
To avoid delays, CTDB uses a trick: When an IP is switched, the new CTDB node "tickles" the client with an illegal TCP ACK packet (tickle ACK) containing an invalid sequence number of 0 and an ACK number of 0. The client responds with a valid ACK packet, allowing the new IP address owner to close the connection with an RST packet, thus forcing the client to reestablish the connection to the new node.
Kernel king admits his tone has alienated volunteers, but says the demands of the process require directness.
New flaw in an old encryption scheme leaves the experts scrambling to disable SSL 3
Lennart Poettering wants to change the way Linux developers talk to each other.
Enterprise giant frees itself from ink and home PCs (and visa versa).
Mozilla’s product think tank sinks silently into history.
TODO group will focus on open source tools in large-scale environments.
New tool will look like GParted but support a wider range of storage technologies.
New public key pinning feature will help prevent man-in-the-middle attacks.
Carnegie Mellon researchers say 3 million pages could fall down the phishing hole in the next year.
The US government rolls new best-practice rules for protecting SSH.