This blog post is aimed at helping you get started with the B-tree filesystem (BtrFS). The Linux kernel tree currently contains more than 55 filesystems, each with its own pros and cons. Here we’ll cover in detail how to administer a BtrFS filesystem in Linux.

Some filesystems have limited or very specific uses. The filesystems considered to be truly general purpose are the extN family: ext2, ext3 and ext4. They are stable and powerful but still have certain limitations.

The B-tree filesystem (BtrFS), pronounced “Better FS”, has been making inroads in Linux for quite some time now. As of this writing, the latest stable version is 4.9. Let’s now get to the basics of managing a BtrFS filesystem in Linux.

What is BtrFS?

BtrFS is a next-generation, general-purpose Linux filesystem that offers features such as advanced integrated device management, scalability and reliability. BtrFS scales to 16 exabytes (EB) and focuses on features that other Linux filesystems lack. Some even argue that BtrFS is the Linux answer to Sun/Oracle ZFS, and that its architecture is more scalable than that of ZFS. BtrFS is getting a lot of attention at the moment.

BtrFS builds the foundation for the Ceph distributed filesystem and its RADOS object store layer for “cloud” technologies. It encompasses ideas from the ext4, XFS, HP aufs and Reiser filesystems. BtrFS development is very active and new features are being added at a tremendous pace.

Why BtrFS

BtrFS provides the following features and capabilities:

  • Built-in copy on write
  • Powerful snapshot capabilities
  • Built-in volume management with subvolumes
  • Massive scalability, up to 16 exabytes
  • Built-in data integrity (checksums)
  • SSD optimization
  • Compression capabilities
  • Cloud ready
  • Built-in RAID
  • Manual defragmentation
  • Online filesystem management
  • Data and metadata integrity
  • In-place conversion from ext2/3/4 and ReiserFS
  • Quota groups
  • Online expansion and reduction of filesystem size
  • Object level RAID
  • Seeding devices
  • Support for multiple devices

Btrfs Specs

  • Max volume size: 16 EB (2^64 bytes)
  • Max file size: 16 EB
  • Max file name size: 255 bytes
  • Filesystem check: online and offline
  • Directory lookup algorithm: B-Tree
  • Characters in file name: any, except 0x00
  • Compatibility
    • Hard and symbolic links
    • Access Control Lists (ACLs)
    • Extended Attributes (xattrs)
    • POSIX file owner/permissions
    • Asynchronous and Direct I/O
    • Sparse files

BtrFS supports a maximum file size of 16 exabytes. If the term exabyte is unfamiliar, refer to the diagram below, which helps put the size in perspective.

[Figure: Understand file system sizing]
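To make the 16 EB figure concrete, the arithmetic can be checked in the shell. This is only a sketch: shell integers are normally signed 64-bit and cannot hold 2^64 itself, so the helper below (pow2_units, a name I made up for this post) subtracts exponents instead. Strictly speaking the result is in binary units (EiB, PiB, TiB).

```shell
# pow2_units EXP UNIT_EXP: express 2^EXP bytes in units of 2^UNIT_EXP bytes.
# Subtracting the exponents avoids overflowing the shell's 64-bit integers.
pow2_units() {
    echo $(( 1 << ($1 - $2) ))
}

# 2^64 bytes expressed in EiB (2^60), PiB (2^50) and TiB (2^40)
echo "EiB: $(pow2_units 64 60)"
echo "PiB: $(pow2_units 64 50)"
echo "TiB: $(pow2_units 64 40)"
```

In other words, the limit corresponds to 16 EiB, or roughly 16.7 million 1 TiB disks.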

To check which filesystems your kernel currently supports, look at the file /proc/filesystems. An example from my local system is shown below.

# cat /proc/filesystems 

nodev   sysfs
nodev   rootfs
..............
    btrfs
    ext3
    ext2
    ext4
    vfat
    xfs
    fuseblk
nodev   fuse
nodev   fusectl

For BtrFS support, the output should contain the keyword btrfs.
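The check above can be scripted. The function below is a minimal sketch (fs_supported is a hypothetical helper name, not a standard tool) that looks for an exact filesystem name in a /proc/filesystems-style listing. Keep in mind that a missing entry does not always mean the kernel lacks support: the btrfs module may simply not be loaded yet.

```shell
# fs_supported TYPE [FILE]: succeed if TYPE is listed in FILE
# (FILE defaults to /proc/filesystems).
fs_supported() {
    # The filesystem name is the last field on each line; virtual
    # filesystems carry a leading "nodev" marker in the first field.
    awk -v t="$1" '$NF == t { found = 1 } END { exit !found }' "${2:-/proc/filesystems}"
}

if fs_supported btrfs; then
    echo "btrfs is registered with the running kernel"
else
    echo "btrfs not listed; the module may not be loaded (try: modprobe btrfs)"
fi
```

Matching on the whole field avoids false positives from names that merely contain the search string.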

Installing BtrFS

On Debian based systems:

# apt-get -y install btrfs-progs

RHEL based systems:

# yum -y install btrfs-progs

Arch Linux

# pacman -S btrfs-progs

Gentoo

# emerge --ask sys-fs/btrfs-progs

BtrFS useful mount options

Each entry below gives the option followed by its meaning.

  • acl, noacl: Enable or disable support for POSIX Access Control Lists (ACLs). The default is on.
  • device=/dev/name: Tells BtrFS to scan the named device(s) for a BtrFS volume.
  • max_inline=number: Specify the maximum amount of space, in bytes, that can be inlined in a metadata B-tree leaf. The default is 2048 since kernel 4.6. Inlining can be turned off by specifying 0.
  • clear_cache: Clear all the free space caches during mount.
  • thread_pool=number: The number of worker threads to allocate. The default is min(NRCPUS + 2, 8), where NRCPUS is the number of online CPUs detected at mount time. A small number leads to less parallelism in processing data and metadata; higher numbers could lead to a performance hit due to increased locking contention, cache-line bouncing or costly data transfers between local CPU memories.
  • space_cache, space_cache=version, nospace_cache: The free space cache greatly improves performance when reading block group free space into memory. However, managing the space cache consumes some resources, including a small amount of disk space.
  • user_subvol_rm_allowed: Allow subvolumes to be deleted by their respective owner. Otherwise, only the root user can do that. This option is not enabled by default.
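As a quick illustration of the thread_pool default, min(NRCPUS + 2, 8), here is a small sketch that evaluates the formula for a given CPU count. The function name default_thread_pool is my own; the kernel computes this value internally at mount time.

```shell
# default_thread_pool NRCPUS: print min(NRCPUS + 2, 8), the documented default.
default_thread_pool() {
    n=$(( $1 + 2 ))
    if [ "$n" -gt 8 ]; then
        n=8
    fi
    echo "$n"
}

default_thread_pool 2    # prints 4
default_thread_pool 16   # prints 8

# An explicit value can be passed at mount time instead, for example:
#   mount -o thread_pool=4 /dev/vdb1 /data
```

So the cap of 8 is reached on any machine with six or more online CPUs.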

For more details on the available options, read the btrfs(5) man page:

# man 5 btrfs

Working with BtrFS – Using Examples

My lab machine currently has two secondary hard drives of 1 GB each, which I’ll use in the demonstrations that follow. To follow along, you can spin up a virtual machine, install the btrfs-progs package and add two secondary hard drives.

Creating and Mounting BtrFS partition

To kick off the demo, we’ll create a BtrFS filesystem on a single partition of a 1 GB disk and mount it on the /data directory. The partition is created on /dev/vdb and covers 30% of the block device. To make a basic BtrFS filesystem and mount it, use the following commands:

# parted --script /dev/vdb "mklabel gpt"
# parted --script /dev/vdb "mkpart primary 1 30%"
# parted  /dev/vdb print 
# mkdir /data
# mkfs.btrfs /dev/vdb1
# mount /dev/vdb1 /data

To confirm that the mounted partition works the way we want, let’s copy some data to it as follows:

# find /usr/share/doc -name '*[ab].html' -exec cp {} /data \;
# ls -l /data

Check the filesystem using btrfs commands as well:

# btrfs filesystem show /dev/vdb1
# btrfs filesystem df -h /data/
# btrfs filesystem usage /data/

From these commands, you’ll see that we copied some existing HTML files to give us real data for the demo. The last command confirms a size close to 300 MB (30% of 1 GB).

List the subvolumes of the root volume

# btrfs subvolume list /data/

View the disk space utilization:

# btrfs filesystem df -h /data
# btrfs filesystem show /dev/vdb1

Enlarging a btrfs File System

From the previous partitioning of /dev/vdb, we still have around 700 MB unpartitioned. We’re going to use part of this to enlarge the BtrFS filesystem.

# parted /dev/vdb mkpart primary 30% 60%
# btrfs device add /dev/vdb2 /data/
# btrfs filesystem show /data
# df -h /data/

That’s all we needed to do to grow the BtrFS filesystem. The output of the commands above confirms it:

[Figure: Grow BtrFS filesystem]

Removing btrfs devices

Use the btrfs device delete command to remove an online device. It redistributes any extents in use to the other devices in the filesystem so that the device can be removed safely.

Example:

# btrfs device delete /dev/vdb2 /data

If you run this command, add /dev/vdb2 back with btrfs device add before continuing, since the balancing examples further down assume both devices are present.

You can also resize the filesystem directly by specifying the intended size. The syntax is:

# btrfs filesystem resize amount /mount-point

The amount can be a relative size, such as “+3g” for an increase of 3 GiB or “-3g” for a decrease of 3 GiB, or it can be “max” to grow the filesystem to fill the whole block device. The example below adds a new partition, /dev/sda4, to /home and extends the filesystem:

# btrfs device add /dev/sda4 /home -f
# btrfs filesystem resize max /home
# btrfs filesystem show /home

Label: 'home'  uuid: b40ffd9b-c09d-403e-a5f3-b79b5c314505
    Total devices 2 FS bytes used 79.71GiB
    devid    1 size 88.81GiB used 88.81GiB path /dev/mapper/arch-home
    devid    2 size 8.89GiB used 1.00GiB path /dev/sda4

Note that the new device was added successfully.
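Here are a few concrete forms of the resize syntax. This is a hedged sketch rather than something to paste blindly: the run wrapper and the DRY_RUN variable are my own conventions, and by default the commands are only printed, because resizing needs root and a real BtrFS mount. The mount point /data and device ID 2 are assumptions carried over from the examples in this post.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

MNT=/data   # assumed mount point

run btrfs filesystem resize +100m "$MNT"   # grow by 100 MiB
run btrfs filesystem resize -100m "$MNT"   # shrink by 100 MiB
run btrfs filesystem resize max "$MNT"     # fill the underlying device
run btrfs filesystem resize 2:max "$MNT"   # same, for device ID 2 only
```

On a multi-device filesystem the devid: prefix selects which member device to resize; without it, device ID 1 is used.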

Balancing the filesystem

If we run out of disk space within the original volume, we can add an extra partition, as we did above. After the new device is added, the existing data and metadata are still stored only on /dev/vdb1. The filesystem must now be balanced to spread them across all partitions, using the command below:

# btrfs balance start -d -m /data

Arguments:

-d: balance the data block groups

-m: balance the metadata block groups

This ensures that the disks are used evenly.

Testing:

It’s time to do some testing on our BtrFS filesystem. To test that balancing works, I’ll generate two files of random data, 100 MB each.

# dd if=/dev/urandom of=/data/hugefile1 bs=1M count=100
# dd if=/dev/urandom of=/data/hugefile2 bs=1M count=100
# btrfs balance start -d -m /data
# btrfs filesystem show /data

You should notice that the data is well balanced across the two devices.

If you would like the /data directory mounted at boot time, append the entry below to the /etc/fstab file:

/dev/vdb1  /data btrfs device=/dev/vdb1,device=/dev/vdb2 0 0
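Device names such as /dev/vdb1 can change between boots, so many setups reference the filesystem UUID in /etc/fstab instead. The sketch below only builds such a line; fstab_entry is a hypothetical helper, the UUID is a sample value reused from an earlier listing in this post, and it assumes that your system scans for BtrFS member devices at boot.

```shell
# fstab_entry UUID MOUNTPOINT: print an fstab line for a BtrFS filesystem.
fstab_entry() {
    printf 'UUID=%s  %s  btrfs  defaults  0 0\n' "$1" "$2"
}

# On a real system the UUID could be read with:
#   blkid -s UUID -o value /dev/vdb1
# Here a sample UUID is hard-coded purely for illustration.
fstab_entry b40ffd9b-c09d-403e-a5f3-b79b5c314505 /data
```

Review the generated line before appending it to /etc/fstab, since a bad entry can stop the machine from booting cleanly.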

Multi-device File System Creation

With BtrFS it’s possible to manage multiple devices within one filesystem. This makes use of the -d (data) and -m (metadata) options of the mkfs.btrfs command. Valid specifications are:

  • single
  • raid0 : Striping without redundancy
  • raid1 : Disk mirroring
  • raid10 : Striped mirror

The -m single option instructs mkfs.btrfs not to duplicate metadata. This may be desired when using hardware RAID.
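To get a feel for what each profile costs in space, here is a rough sketch of the usable data capacity. It assumes equally sized devices and ignores metadata overhead, and usable_mib is a name I invented for illustration; for real filesystems, btrfs filesystem usage gives the authoritative numbers.

```shell
# usable_mib PROFILE DEVICES SIZE_MIB: rough usable data capacity, assuming
# DEVICES equally sized devices of SIZE_MIB each and ignoring metadata overhead.
usable_mib() {
    total=$(( $2 * $3 ))
    case "$1" in
        single|raid0) echo "$total" ;;          # every byte stored once
        raid1|raid10) echo $(( total / 2 )) ;;  # every byte stored twice
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

usable_mib raid0  2 1024   # prints 2048
usable_mib raid1  2 1024   # prints 1024
usable_mib raid10 4 1024   # prints 2048
```

The mirrored profiles trade half of the raw capacity for the ability to survive a device failure.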

Once a multi-device filesystem has been created, you can use any of its member devices in the mount command:

# mkfs.btrfs /dev/device1 /dev/device2 /dev/device3
# mount /dev/device3 /mount-point

After a reboot, or after reloading the btrfs module, run the following command to discover all multi-device filesystems:

# btrfs device scan

Let’s consider the examples below, which create raid10 and raid1 BtrFS filesystems. Notice that raid10 needs at least four devices to operate correctly.

Four devices with metadata mirrored, data striped

# mkfs.btrfs /dev/device1 /dev/device2 /dev/device3 /dev/device4

Two devices, metadata striping but no mirroring

# mkfs.btrfs -m raid0 /dev/device1 /dev/device2

raid10 being used for both data and metadata

# mkfs.btrfs -m raid10 -d raid10 /dev/device1 /dev/device2 /dev/device3 /dev/device4

Use the full capacity of each device when the drives are of different sizes (data neither mirrored nor striped):

# mkfs.btrfs -d single /dev/device1 /dev/device2 /dev/device3
# mount /dev/device1 /mount-point

Do not duplicate metadata on a single drive.

# mkfs.btrfs -m single /dev/device

BtrFS device scanning

Scan all block devices under /dev and probe for BtrFS volumes using:

# btrfs device scan
# btrfs device scan /dev/device

Create BtrFS subvolumes

Subvolumes allow discrete management identities within the BtrFS filesystem. In this section, we’ll create two subvolumes, subvolume1 and subvolume2. We’ll start by creating a new BtrFS filesystem on the /dev/vdb3 device, then create a mount point and mount it:

# parted /dev/vdb mkpart primary 60% 100%
# mkfs.btrfs /dev/vdb3
# mkdir /subvol_btrfs
# mount /dev/vdb3 /subvol_btrfs

Now let’s create the two subvolumes on /subvol_btrfs.

# btrfs subvolume create /subvol_btrfs/subvolume1
# btrfs subvolume create /subvol_btrfs/subvolume2

When we define the subvolumes, both the directories and BtrFS subvolume entities will be created in the filesystem.

Create a few files in /subvol_btrfs and the subvolumes:

# touch /subvol_btrfs/btrfsmainfile.txt
# touch /subvol_btrfs/subvolume1/subvolume1file.txt
# touch /subvol_btrfs/subvolume2/subvolume2file.txt

List the currently available subvolumes in /subvol_btrfs:

# btrfs subvolume list /subvol_btrfs

ID 256 gen 9 top level 5 path subvolume1
ID 257 gen 9 top level 5 path subvolume2

Unmount /subvol_btrfs:

# umount /subvol_btrfs

Mounting subvolumes

You can mount a subvolume to a mount point. Let’s do this and compare the results using the ls -l command:

# mount /dev/vdb3 /subvol_btrfs/
# ls -l /subvol_btrfs/
# umount  /subvol_btrfs/

# mount -o subvol=subvolume1 /dev/vdb3 /subvol_btrfs/
# ls -l /subvol_btrfs/
# umount /subvol_btrfs/

# mount -o subvol=subvolume2 /dev/vdb3 /subvol_btrfs/
# ls -l /subvol_btrfs/

Make a subvolume the default instead of the root volume

Let’s make subvolume1 the default subvolume. What we need is its ID:

# umount /subvol_btrfs/ 2>/dev/null
# mount /dev/vdb3 /subvol_btrfs/
# ID=$(btrfs subvolume list /subvol_btrfs/ | grep subvolume1 | awk '{print $2}')
# btrfs subvolume set-default ${ID} /subvol_btrfs
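One caveat with the grep-based lookup above: it matches any line that contains the text subvolume1, so a subvolume named subvolume10 would match as well. A stricter variant compares the whole path field. The sketch below uses a hypothetical helper, subvol_id, and a sample listing in which the subvolume10 entry is made up; it assumes subvolume paths contain no spaces.

```shell
# subvol_id PATH: read `btrfs subvolume list` output on stdin and print the
# ID of the subvolume whose path is exactly PATH.
subvol_id() {
    awk -v p="$1" '$NF == p { print $2 }'
}

# Sample listing; "subvolume10" is a made-up entry to show the difference.
listing='ID 256 gen 9 top level 5 path subvolume1
ID 257 gen 9 top level 5 path subvolume2
ID 260 gen 9 top level 5 path subvolume10'

printf '%s\n' "$listing" | subvol_id subvolume1   # prints 256
```

On a live system the same function could be fed directly, for example ID=$(btrfs subvolume list /subvol_btrfs/ | subvol_id subvolume1).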

Test by re-mounting /dev/vdb3:

# umount  /subvol_btrfs/
# mount /dev/vdb3 /subvol_btrfs/

# ls -l /subvol_btrfs/
total 0
-rw-r--r--. 1 root root 0 Jan 10 11:22 subvolume1file.txt

Notice from the output above that the data we had created on subvolume1 is the default available on mounting /dev/vdb3.

To set the default back to the root volume, use the ID of 0 or 5:

# btrfs subvolume set-default 0 /subvol_btrfs
# umount /subvol_btrfs 
# mount /dev/vdb3 /subvol_btrfs/

# ls -l /subvol_btrfs/
total 0
-rw-r--r--. 1 root root  0 Jan 10 11:22 btrfsmainfile.txt
drwxr-xr-x. 1 root root 36 Jan 10 11:22 subvolume1
drwxr-xr-x. 1 root root 36 Jan 10 11:22 subvolume2

[Figure: Create and Mount BtrFS subvolume]

This marks the end of working with subvolumes in BtrFS.

Working with BtrFS snapshots

BtrFS snapshots can be created as read-only or read/write copies of data. Snapshots can be used in the following ways:

1. Creating the snapshot as read only and subsequently implementing a backup of the snapshot. In this way, the backup will be of the host filesystem at the point in time that the snapshot was created.

2. Using it as revert point when modifying many files. If the modifications cause negative results, you can easily revert to the snapshot copy.

The snapshot has to be created on the same filesystem as the target data, since rapid creation of the snapshot is achieved through a form of internal linking within the filesystem.

NOTE: You cannot create a snapshot of the complete filesystem. This is because changes to the snapshot will need to be written back to itself resulting in infinite recursion.

For the purpose of demonstration, we’ll use the two subvolumes we created earlier. Our scenario is that we create a read-only snapshot of the working subvolume, subvolume1.

# btrfs subvolume snapshot -r /subvol_btrfs/subvolume1 /subvol_btrfs/subvolume2/backup/

Create a readonly snapshot of '/subvol_btrfs/subvolume1' in '/subvol_btrfs/subvolume2/backup'

We can list the available subvolumes with the command:

# btrfs subvolume list  /subvol_btrfs/

ID 256 gen 24 top level 5 path subvolume1
ID 257 gen 24 top level 5 path subvolume2
ID 258 gen 24 top level 257 path subvolume2/backup

From the output, we can see that the snapshot appears as a new subvolume. Listing the contents of both directories should indicate that the contents are the same:

# ls /subvol_btrfs/subvolume2/backup/
subvolume1file.txt

# ls /subvol_btrfs/subvolume1/
subvolume1file.txt

If we delete all the files from /subvol_btrfs/subvolume1/, they remain available in /subvol_btrfs/subvolume2/backup, because with copy-on-write (COW) the snapshot keeps referencing the original data blocks. In the event of a catastrophe we can simply copy the files back to the original location, since the snapshot is not modified when the original files change.
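For regular backups it helps to give each snapshot a unique name. The following is a minimal sketch of one way to do that; the run wrapper, DRY_RUN variable, snapshot_path helper and the subvolume1-restored name are all my own inventions, and by default the commands are only printed because they need root and a mounted BtrFS filesystem.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# snapshot_path DIR NAME STAMP: build a destination such as DIR/NAME-STAMP.
snapshot_path() {
    printf '%s/%s-%s\n' "$1" "$2" "$3"
}

SRC=/subvol_btrfs/subvolume1
stamp=$(date +%Y%m%d-%H%M%S)
dest=$(snapshot_path /subvol_btrfs/subvolume2 subvolume1 "$stamp")

# Read-only, timestamped snapshot of the working subvolume
run btrfs subvolume snapshot -r "$SRC" "$dest"

# Later, a writable copy can be made from the read-only snapshot
run btrfs subvolume snapshot "$dest" /subvol_btrfs/subvolume1-restored
```

Taking a writable snapshot of the read-only one is usually quicker than copying the files back one by one.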

BtrFS In-Place Migration; Convert an ext4 Filesystem to BtrFS

In this example, I’ll show you how to convert an ext4 filesystem to BtrFS. Since I’m running a CentOS server on KVM, I’ll add a secondary hard drive, create an ext4 partition, then convert it to BtrFS so that you get the full picture of how it is done.

Add a 1 GB secondary block device. This is to be done on the host machine:

# virsh vol-create-as default  --name btrfs-sec.qcow2 1G
# virsh vol-list --pool default
# virsh attach-disk --domain cs1 --source /var/lib/libvirt/images/btrfs-sec.qcow2 --persistent --target vdc

Confirm it’s added on vm:

# lsblk  /dev/vdc 

NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vdc  252:32   0   1G  0 disk

Create a partition and format it as ext4:

# parted --script /dev/vdc "mklabel gpt mkpart primary 0% 100%"
# mkfs.ext4 /dev/vdc1
# parted --script /dev/vdc print
# lsblk -f /dev/vdc

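Mount the newly created filesystem, create a few files and directories, then unmount it. Note that, as written, test-{1-4}-dir is not a brace range, so the shell creates a single directory with that literal name (as the listing further down shows); use test-{1..4}-dir if you want four directories.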

# mkdir /ext4tobtrfs
# mount /dev/vdc1 /ext4tobtrfs/
# mkdir /ext4tobtrfs/test-{1-4}-dir
# touch /ext4tobtrfs/test-file{1..10}.txt
# ls -l  /ext4tobtrfs/
# umount /ext4tobtrfs/

Convert the filesystem to Btrfs:

# btrfs-convert -l convertedfs /dev/vdc1 
create btrfs filesystem:
    blocksize: 4096
    nodesize:  16384
    features:  extref, skinny-metadata (default)
creating btrfs metadata.
copy inodes [o] [         0/        22]
creating ext2 image file.
set label to 'convertedfs'
cleaning up system chunk.
conversion complete.

Mount the filesystem again and view the filesystem type:

# mount /dev/vdc1 /ext4tobtrfs/
# df -hT /ext4tobtrfs/
Filesystem     Type   Size  Used Avail Use% Mounted on
/dev/vdc1      btrfs 1022M   51M  643M   8% /ext4tobtrfs

Note that the filesystem of /ext4tobtrfs is of type btrfs.

To view the BtrFS information, subvolumes and content, use:

# btrfs filesystem show /ext4tobtrfs/
Label: 'convertedfs'  uuid: 3e985770-66a0-4b85-810e-2e93182696f3
    Total devices 1 FS bytes used 34.78MiB
    devid    1 size 1022.00MiB used 616.25MiB path /dev/vdc1

# btrfs subvolume list /ext4tobtrfs/
ID 256 gen 6 top level 5 path ext2_saved

# ls -l /ext4tobtrfs/
total 16
drwxr-xr-x. 1 root root 10 Jan 10 13:22 ext2_saved
drwx------. 1 root root  0 Jan 10 13:14 lost+found
drwxr-xr-x. 1 root root  0 Jan 10 13:19 test-{1-4}-dir
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file10.txt
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file1.txt
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file2.txt
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file3.txt
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file4.txt
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file5.txt
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file6.txt
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file7.txt
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file8.txt
-rw-r--r--. 1 root root  0 Jan 10 13:19 test-file9.txt

# file  /ext4tobtrfs/ext2_saved/image
/ext4tobtrfs/ext2_saved/image: Linux rev 1.0 ext4 filesystem data, UUID=7e6849f2-8560-4b9d-add8-d344ef577650 (extents) (64bit) (large files) (huge files)

To mount the ext2_saved subvolume, or the image stored inside it, use:

# mount -o subvol=ext2_saved /dev/vdc1 /mnt/
# ls -l /mnt
# umount /mnt
# mount -o loop /ext4tobtrfs/ext2_saved/image /mnt/
# ls -la /mnt/

Roll back to the ext4 filesystem. Unmount the loop-mounted image first, otherwise /ext4tobtrfs/ is still busy:

# umount /mnt
# umount /ext4tobtrfs/

# btrfs-convert -r /dev/vdc1
rollback complete.

# mount /dev/vdc1 /ext4tobtrfs/

# df -hT /ext4tobtrfs/
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/vdc1      ext4  990M  2.6M  921M   1% /ext4tobtrfs

If you view the files in /ext4tobtrfs/, you’ll note that anything created after the conversion to BtrFS is gone; only the files and directories created initially on the ext4 filesystem are there.
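Putting the conversion steps together, here is a hedged outline of the whole workflow. The run wrapper and DRY_RUN variable are my own conventions, and commands are only printed by default. The device and mount point are the ones used in this example. The final step is the opposite of a rollback: deleting the ext2_saved subvolume reclaims the space held by the saved image, after which btrfs-convert -r can no longer be used.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

DEV=/dev/vdc1
MNT=/ext4tobtrfs

run e2fsck -f "$DEV"                      # check the ext4 filesystem first
run btrfs-convert -l convertedfs "$DEV"   # convert in place
run mount "$DEV" "$MNT"

# Only once the result is verified; rollback is impossible afterwards.
run btrfs subvolume delete "$MNT/ext2_saved"
```

Keep the ext2_saved subvolume for as long as you might want to roll back.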

Convert an existing single-device system

To convert an existing single-device system, /dev/vdb1 in this case, into a two-device RAID1 system that protects against a single disk failure, use the following commands:

# umount  /subvol_btrfs/
# mount /dev/vdb1 /subvol_btrfs/
# btrfs device add /dev/vdb2 /subvol_btrfs/ -f
# btrfs balance start -dconvert=raid1 -mconvert=raid1 /subvol_btrfs/
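After the conversion finishes, btrfs filesystem df should report RAID1 for data, metadata and system. The snippet below is a small sketch for checking that from a script. profile_of is a hypothetical helper and the sample output is made up for illustration; on a live system you would pipe in the real command output instead.

```shell
# profile_of TYPE: read `btrfs filesystem df` output on stdin and print the
# profile reported for TYPE (Data, Metadata or System).
profile_of() {
    awk -F'[,:] *' -v t="$1" '$1 == t { print $2 }'
}

# Made-up sample in the format printed by `btrfs filesystem df`
sample='Data, RAID1: total=1.00GiB, used=200.00MiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=256.00MiB, used=1.50MiB
GlobalReserve, single: total=16.00MiB, used=0.00B'

printf '%s\n' "$sample" | profile_of Data       # prints RAID1
printf '%s\n' "$sample" | profile_of Metadata   # prints RAID1
```

If a type is reported with more than one profile, the balance has most likely not finished converting every block group yet.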

BtrFS Maintenance Tasks

Anyone administering a BtrFS filesystem should know how to perform the following maintenance tasks.

1. Verify checksums with scrub:

Open a terminal window and run:

# watch btrfs scrub status  /subvol_btrfs/

Open another terminal window and run:

# btrfs scrub start /subvol_btrfs/

The watch at the first prompt will show the scrubbing progress.

2. Watch balance:

On one terminal run:

# watch btrfs balance status /subvol_btrfs/

On another terminal, run:

# btrfs balance start /subvol_btrfs/

3. Defragment the filesystem recursively:

# btrfs filesystem defragment -r /subvol_btrfs/

Replacing failed devices on a btrfs file system

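If a device is missing or its superblock is corrupted, the filesystem needs to be mounted in degraded mode before troubleshooting. In the example below, a three-device filesystem with mirrored metadata is created; should one of the devices later go missing, the filesystem is mounted degraded and the missing device is removed: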

# mkfs.btrfs -m raid1 /dev/vdb /dev/vdc /dev/vdd 
# mount -o degraded /dev/vdb /mnt
# btrfs device delete missing /mnt
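When a spare disk is available, the failed device can also be swapped for a new one with btrfs replace instead of just being removed. The outline below is a sketch under assumptions: device ID 3 as the missing device and /dev/vde as the replacement are invented for illustration, and the run wrapper only prints the commands unless DRY_RUN=0 is set.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

MNT=/mnt

run mount -o degraded /dev/vdb "$MNT"
run btrfs filesystem show "$MNT"            # note which devid is missing
run btrfs replace start 3 /dev/vde "$MNT"   # devid 3 and /dev/vde are assumed
run btrfs replace status "$MNT"
```

Check the devid reported by btrfs filesystem show before starting the replacement, since it will differ from system to system.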

Conclusion

In this guide, I covered the BtrFS filesystem in Linux from the basics through to hands-on configuration. BtrFS is well worth starting to work with now, as it may well become a default enterprise filesystem in the years to come. We saw how BtrFS simplifies filesystem and volume management by bundling the two into a single workflow. Hope you had fun working with BtrFS.

References

1. The man page btrfs(8) is a good place to start. It covers all important management commands, which includes:

  • Subvolume and snapshot management
  • Use of the scrub, balance and defragment commands
  • Filesystem management with the filesystem command
  • The device commands for managing devices.

To learn more about BtrFS administration, please refer to the following man pages:

# man mkfs.btrfs 
# man 5 btrfs
# man 8 fsck.btrfs