SANITY CHECK: Block Cloning

nickf · August 27, 2023, 2:55pm

Hey all, just a quick little blurb and sanity check. Am I on the right track here?

Block Cloning:

With OpenZFS 2.2/TrueNAS Cobia we now have a really cool new feature called block cloning. What is it? “Basically free” deduplication…but with an asterisk.

Nothing is free in life, and block cloning has some limitations on when it can work its magic.

PREFACE:

I have a fairly elaborate system for ingesting media files. I have a “staging” folder for new media as it is ingested. My staging folder is actually in a separate dataset, which has been my design for years…this was in an attempt to get more granular snapshot retention policies and save on space efficiency by retaining the production files longer than the source/staging files.

I keep these original media files for a period of time, but I do not use them directly.
This media can be from

a camera
a DVR
old VHS tape captures or DVD rips
AI-upscaled projects from Topaz
Internet Archive Torrents

I have scripts which copy them to another dataset and then re-encode them. The “copy script” uses Python’s shutil.copy2(file_path, new_file_path).

Basically, my use case can be thought of as video production and Plex joined together.

QUESTION:

Once the copied files are converted, they are deleted. However, since there is some lag time between copying and re-encoding, there is always some duplicate data being stored. Currently my backlog is ~500GiB.

But unfortunately, I do not realize all of these savings from block cloning.

I believe this is because we are crossing the boundaries of the dataset? I’m fairly confident that by moving my ingestion folder from its current, separate dataset into a basic subfolder of the production dataset, I will be able to leverage block cloning for this workflow. But I am looking for a sanity check: I don’t want to re-engineer things only to find out my understanding was wrong lol.

FWIW: I do understand that for this workflow the savings as reported in the BRT would only be temporary because the resultant files from being re-encoded would no longer be copies of the originals. But the end-result is typically a much smaller file than the source, and I do not keep the originals forever. That fact is factored into my overall storage growth plans.

It just sucks that I waste about half a TiB every month or so for no reason, which is why I would love to try to get block cloning to help me out here. Since the production files are written once and never change, I’m fairly confident I can come up with some pretty sane snapshot retention policies which will help continue to realize most of the savings.

robn · August 27, 2023, 10:09pm

This is a very nice use case for block cloning!

I think the main problem is your use of shutil.copy2(). The docs say that it uses os.sendfile() under the hood. Currently sendfile() can’t clone a file. It’s likely that if you could do something with os.copy_file_range() you would start getting clones.

That would also deal with the cross-dataset parts; currently on Linux copy_file_range() is the only thing that can clone across datasets.
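A minimal sketch of what a copy_file_range()-based replacement for the copy step could look like, assuming Linux and Python 3.8 or later. The function name copy_with_clone and the chunk size are made up for illustration. Whether the blocks actually get cloned is decided by the kernel and OpenZFS, not by this code; it only issues the call that makes cloning possible, and falls back to an ordinary copy if the call is not available.

```python
import errno
import os
import shutil

# Errors that are taken to mean "copy_file_range() is not usable here",
# in which case an ordinary copy is done instead (an assumption of this sketch).
_FALLBACK_ERRNOS = {errno.EXDEV, errno.ENOSYS, errno.EINVAL, errno.EOPNOTSUPP}


def copy_with_clone(src, dst, chunk=1 << 30):
    """Copy src to dst using os.copy_file_range(), then copy metadata.

    copy_file_range() hands the copy to the kernel, which may let the
    filesystem clone blocks instead of duplicating the data.
    """
    try:
        with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
            remaining = os.fstat(fsrc.fileno()).st_size
            while remaining > 0:
                copied = os.copy_file_range(
                    fsrc.fileno(), fdst.fileno(), min(chunk, remaining)
                )
                if copied == 0:  # source shrank while copying
                    break
                remaining -= copied
    except (AttributeError, OSError) as exc:
        # AttributeError: this platform has no os.copy_file_range at all.
        if isinstance(exc, OSError) and exc.errno not in _FALLBACK_ERRNOS:
            raise
        shutil.copyfile(src, dst)
    # shutil.copy2() also preserves timestamps and mode, so do the same here.
    shutil.copystat(src, dst)
    return dst
```

In the copy script this would stand in for shutil.copy2(file_path, new_file_path), i.e. copy_with_clone(file_path, new_file_path).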

If you have coreutils 9.0 or later (highly likely; use cp --version to check), you can use cp to check if and when cloning works. cp --reflink=auto will use copy_file_range(), and so should be able to do cross-dataset clones, but will fall back to a regular copy if it can’t do the clone for some reason.

You can also try cp --reflink=always to try an explicit clone. This will never fall back to a copy if it can’t do a clone. Unfortunately this won’t work across datasets.
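For testing from Python rather than from cp, the explicit-clone behaviour can be approximated with the Linux FICLONE ioctl, which as far as I know is the same request cp --reflink=always makes. This is only a sketch: try_reflink is a made-up name, the request number is hard-coded from linux/fs.h, and it is Linux-only. Like --reflink=always, it never falls back to a copy; it just reports whether the clone was accepted.

```python
import fcntl
import os

# FICLONE request number from <linux/fs.h>, hard-coded because older
# Python versions do not expose it in the fcntl module.
FICLONE = 0x40049409


def try_reflink(src, dst):
    """Attempt an explicit clone of src into dst.

    Returns True if the kernel accepted the clone. Returns False, and
    removes the empty dst, if the kernel or filesystem refused it.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            pass  # refused: not supported here, or crossing a mount point
    os.unlink(dst)
    return False
```

A False result for two paths inside one dataset would suggest cloning is not available at all, while False only for paths in different datasets would match the Linux limitation described here.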

(These are all limitations in Linux; OpenZFS would quite happily clone across the entire pool, but Linux rejects the requests before OpenZFS sees them.)

And if those don’t work, we’ll have to dig deeper, but at least make sure your pool has the block_cloning feature enabled.
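Checking the feature flag can be scripted too. A rough sketch, assuming the zpool command is on the PATH and using a placeholder pool name ("tank") that you would replace with your own; the helper names are hypothetical. It asks zpool get for the value of feature@block_cloning, which should come back as disabled, enabled, or active.

```python
import subprocess


def block_cloning_state(pool):
    """Return the value of feature@block_cloning for the given pool.

    Runs `zpool get -H -o value`, which prints just the property value.
    """
    result = subprocess.run(
        ["zpool", "get", "-H", "-o", "value", "feature@block_cloning", pool],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def cloning_possible(state):
    """True if the feature is turned on ("enabled") or already in use ("active")."""
    return state in ("enabled", "active")


if __name__ == "__main__":
    state = block_cloning_state("tank")  # "tank" is a placeholder pool name
    print(state, cloning_possible(state))
```

If this reports disabled, no copy method will produce clones until the feature is enabled on the pool.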