Over at Nex7’s Blog, Andrew Galloway from Nexenta Systems writes that while ZFS is one of the most powerful, flexible, and robust filesystems, it does have its own share of caveats, gotchas, and hidden “features.”
Deduplication Is Not Free. Another common misunderstanding is that ZFS deduplication, since its inclusion, is a nice, free feature you can enable to hopefully gain space savings on your ZFS filesystems/zvols/zpools. Nothing could be further from the truth. Unlike a number of other deduplication implementations, ZFS deduplication happens on the fly as data is read and written. This created a number of architectural challenges that the ZFS team had to conquer, and the methods by which this was achieved lead to a significant and sometimes unexpectedly high RAM requirement. Every block of data in a dedup’ed filesystem can end up having an entry in a database known as the DDT (DeDupe Table). DDT entries need RAM. It is not uncommon for DDTs to grow larger than available RAM on zpools that aren’t even that large (a couple of TB). If the hits against the DDT aren’t being serviced primarily from RAM or fast SSD, performance quickly drops to abysmal levels. Because enabling or disabling deduplication within ZFS doesn’t actually do anything to data already on disk, turning it off later leaves the existing deduplicated blocks and their DDT entries in place. Do not enable deduplication without a full understanding of its requirements and architecture first. You will be hard-pressed to get rid of it later.
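To get a feel for how quickly the DDT can outgrow RAM, here is a rough back-of-the-envelope sketch. It is not taken from the original post. It assumes the worst case, where every block is unique and gets its own DDT entry, and it assumes roughly 320 bytes of memory per entry, a commonly quoted rule of thumb rather than an exact figure. The function name and the block sizes are purely illustrative.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def estimate_ddt_bytes(data_bytes, avg_block_bytes=64 * 1024, bytes_per_entry=320):
    """Worst-case DDT size: one entry per block, all blocks unique.

    bytes_per_entry=320 is an assumed rule-of-thumb value, not a
    number guaranteed by any particular ZFS release.
    """
    if data_bytes < 0 or avg_block_bytes <= 0 or bytes_per_entry <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a partially filled block still needs an entry.
    blocks = -(-data_bytes // avg_block_bytes)
    return blocks * bytes_per_entry


if __name__ == "__main__":
    # 2 TiB of data at a few plausible average block sizes.
    for block_kib in (8, 64, 128):
        ddt = estimate_ddt_bytes(2 * TIB, block_kib * 1024)
        print(f"{block_kib:>3} KiB blocks: ~{ddt / GIB:.0f} GiB of DDT")
```

Under these assumptions, 2 TiB of data in 64 KiB blocks already implies about 10 GiB of DDT, and small blocks make it far worse. Real pools will differ, since duplicate blocks share an entry; on a live system, `zdb -S <pool>` can simulate deduplication and report a more realistic estimate.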
Read the Full Story.