A lot of time has passed, and a lot of code has been written. Bazil is still in heavy development, but it has reached a good milestone to blog about: it can synchronize changes from one peer to another.
Warning: at this stage in development, we will put no effort into compatibility of file formats or protocols. Do not stare into laser with remaining eye.
What follows is a walkthrough of a scenario where we have two computers sharing files – find me at GopherCon for a live demo, or follow the steps and run it yourself.
First, make sure you have a working Go (>=1.4) installation. You are expected to have basic familiarity with Go, at this point in development.
Unfortunately, to work around a missing gRPC feature, we need a custom branch of it for now. Let’s check that out:
$ go get google.golang.org/grpc
$ cd $GOPATH/src/google.golang.org/grpc
$ git remote add bazil https://github.com/bazil/grpc-go
$ git fetch bazil
$ git checkout -b auth bazil/auth
And then install Bazil itself:
$ go get bazil.org/bazil
For the rest, we’ll assume you have two computers, virtual machines or containers that will talk to each other.
You can also run the steps on one host, by passing the
-data-dir=PATH option as appropriate to keep two separate state directories.
We’ll call our two environments black and white, and differentiate
them with that hostname in the prompt.
white$ bazil create
black$ bazil create
To introduce the peers to each other, we need to pass their public keys to each other. As the current code doesn’t actually keep track of any nicknames or aliases for peers, we’ll need to refer to these public keys a lot. Let’s set shell variables to remember them.
To see the public key of a node, run
white$ bazil debug pubkey
Typically, debug commands access the database directly, and will only work if the server is not running.
Now set the variable
$BLACK on the host
white with the value being
the public key of
black, and vice versa. If you’re running the two
on the same host, the following will work; if not, copy-pasting with
the mouse is needed.
white$ BLACK="$(bazil -data-dir=path/to/datadir/of/black debug pubkey)"
black$ WHITE="$(bazil -data-dir=path/to/datadir/of/white debug pubkey)"
As is probably obvious from the
debug in the command name, this is
not the final UX for this.
Bazil has a (per-user) server component that the command-line
utilities communicate with. Let’s start the server on both peers:
white$ bazil server run &
bazil: Listening on [::]:34211
black$ bazil server run &
bazil: Listening on [::]:nnnnn
We believe in the value of encryption. Bazil uses convergent encryption with sharing keys where the people who know the relevant sharing key can have access to the data.
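For the curious, here is a minimal sketch of the idea behind convergent encryption with a sharing key. It is not Bazil’s actual code or storage format: the use of HMAC-SHA256 and AES-GCM, and the names contentKey, seal and open, are assumptions made purely for illustration. The key for each object is derived from the object’s own contents together with the sharing key, so identical data encrypts to identical ciphertext (which is what makes deduplication possible), while the sharing key is needed to derive the key at all.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// contentKey derives a per-object key from the plaintext and the sharing key.
// Assumption for this sketch: HMAC-SHA256, which yields 32 bytes (an AES-256 key).
func contentKey(sharingKey, plaintext []byte) []byte {
	mac := hmac.New(sha256.New, sharingKey)
	mac.Write(plaintext)
	return mac.Sum(nil)
}

// newAEAD builds an AES-GCM cipher for the given 32-byte key.
func newAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext deterministically and returns the derived key and
// the ciphertext. The all-zero nonce is acceptable only because each derived
// key is used for exactly one plaintext.
func seal(sharingKey, plaintext []byte) (key, box []byte, err error) {
	key = contentKey(sharingKey, plaintext)
	aead, err := newAEAD(key)
	if err != nil {
		return nil, nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	return key, aead.Seal(nil, nonce, plaintext, nil), nil
}

// open decrypts a ciphertext produced by seal, given the derived key.
func open(key, box []byte) ([]byte, error) {
	aead, err := newAEAD(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	return aead.Open(nil, nonce, box, nil)
}

func main() {
	sharingKey := []byte("0123456789abcdef0123456789abcdef") // stand-in for 32 random bytes
	key, a, err := seal(sharingKey, []byte("hello, world\n"))
	if err != nil {
		panic(err)
	}
	_, b, err := seal(sharingKey, []byte("hello, world\n"))
	if err != nil {
		panic(err)
	}
	fmt.Println("same ciphertext:", string(a) == string(b))
	plain, err := open(key, a)
	fmt.Printf("%q %v\n", plain, err)
}
```

Two peers holding the same sharing key and the same file end up with byte-identical encrypted objects, without ever coordinating.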
The default installation sets up one sharing key, but let’s make a new
one for our shared files; it’s just 32 bytes of random data. We’ll
name our new sharing key friends.
white$ dd if=/dev/urandom of=sekrit bs=32 count=1
white$ bazil sharing add friends <sekrit
Let’s create a volume using the new sharing key, and mount it.
white$ bazil volume create -sharing=friends myfiles
white$ mkdir mnt
white$ bazil volume mount myfiles mnt
We now have an encrypted, deduplicating, snapshottable, local file
system. Let’s share it with
black, using the public key stored in
$BLACK from earlier.
We introduce a new peer, identified by the public key stored in
$BLACK. We tell white to allow black to access its local
content-addressed storage, and the myfiles volume we just created.
white$ bazil peer add $BLACK
white$ bazil peer storage allow $BLACK local
white$ bazil peer volume allow $BLACK myfiles
Next, we set up black to use the new volume. First, we introduce
white as a new peer for black, giving the network location
where the server on white is listening. The server prefers
port 34211 (bazil, do you see it?), but will use any free port. We
saw the port in the output earlier.
black$ bazil peer add $WHITE
black$ bazil peer location set $WHITE 192.0.2.42:34211
Later, we’ll introduce more rendezvous mechanisms, including multicast DNS and an internet-wide lookup based on the public key, and mechanisms for working behind NATs.
black needs to know the sharing key from earlier. Copy the
sekrit file to black through whatever means are appropriate,
and then run
black$ bazil sharing add friends <sekrit
black$ bazil volume connect -sharing=friends $WHITE myfiles
black$ bazil volume storage add -sharing=friends myfiles peerkey:$WHITE
black$ mkdir mnt
black$ bazil volume mount myfiles mnt
We now have the same volume mounted on two machines.
Let’s make changes on
white and observe them on black.
white$ echo hello, world >mnt/greeting
black$ bazil volume sync myfiles $WHITE
black$ ls mnt
black$ cat mnt/greeting
white$ echo hello, again >mnt/greeting
black$ bazil volume sync myfiles $WHITE
black$ cat mnt/greeting
Hey! It works!
The sync implementation doesn’t currently handle deletions or subdirectories.
There is currently no user interface to resolve conflicts, or to finish sync merges that were postponed because a file was still open.
At this stage in development, we will put no effort into compatibility of file formats or protocols.
After the obvious missing functionality mentioned is done, there’s plenty of work to be done on making the user experience of managing peers better. The steps above are very manual and discrete right now, as that is what’s easiest to debug.
Once the common usage scenarios have been explored, more convenient mechanisms can be added on top of these low-level steps, e.g. bootstrapping a peer connection over ssh, and interacting with friends over IM with humans copy-pasting short messages.
To learn more about the why of Bazil, read the introductory blog post.
To understand the architecture of Bazil better, browse the documentation https://bazil.org/doc/ .
Bazil is still at an early stage in development, but the future looks really exciting. We’d love to have you participating.
GopherCon is here, and it is time to reveal what Bazil is all about.
Bazil, also known as
bazil.org/bazil, is a file system that lets
your data reside where it is most convenient for it to reside.
Bazil is still under heavy development, but welcomes developers and curious power users. Here’s a little teaser of what’s coming.
Imagine you have
On the desktop, you naturally want to be able to use the whole 3TB disk. And you’re not always using the desktop, even when you’re home – the sofa is just so comfortable. You’d like to work with your files even when you’re on the laptop.
So you install the currently fashionable large-corporate-backed cloud-sync solution.
A file sync based solution will try to copy all of your files from the desktop to the laptop – yet the laptop’s smaller SSD just can’t hold that much! You’re forced to play games with picking-and-choosing what folders get synchronized, and just don’t have the convenience of grabbing that 8-year-old wedding photo on a whim.
To modernize an aphorism, you can’t put ten terabytes of files on a 500 GB SSD. Syncing between very disproportionate systems is fundamentally a problematic design, and is best for a small hand-picked set of files, not as an actual storage solution.
Don’t take this the wrong way; you really should have some sort of remote backups for important data, in case the building burns down. S3 RRS/Glacier and Google Cloud Storage DRA seem very promising for backup cold storage; we’ll come back to that later.
Rocking it old school? We’re down with that.
A network file system like CIFS or NFS, or something like
would let you use the files from the desktop on the laptop – but your
wifi will never be as fast as the laptop’s local SSD, in either
bandwidth or latency, so now all your file accesses are crawling, and
you end up hunting for an ethernet cable whenever you need to transfer
To speed things up, you end up copying often used files to the SSD. Now you have several copies of the same files, and no idea what was modified when, or whether you’re looking at the last copy, or whether it’s safe to delete to free up space on the cramped laptop.
A network file system will also require you to stay within wifi range. For travel, you’re once again reduced to manually copying files around, and once again lose track of where the latest copy of each file is.
Desktop use is kinda sorta tolerable: you’re never sure whether the file you are looking at is the latest copy
Laptop use is miserable: you’re confused about which copies of your files are the right ones, the network file system is an umbilical tying you to your home network, and everything goes over the slow wifi all the time
Cloud storage is still expensive, but now you can use it as backup only and bypass the synchronization service providers: switching between clouds is easier, and cold storage and reduced availability is cheaper.
However, this leaves you installing & configuring cloud backup software in addition to your network file system woes; not the simplest ordeal, and don’t expect any kind of file history browsing / recovery integration for your network file system clients.
or You can choose to back up to your own disks – with the same caveats as above
or All of the bad parts of the computer in the closet, with the extra of needing to fiddle with the disks and remember things.
Bazil separates knowledge of a file from the contents of the file, letting the laptop know about all of the files, without having to store the contents of the file.
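As a rough illustration of that separation, here is a small sketch. The Entry and Peer types and their fields are hypothetical and are not Bazil’s real data structures; they only show that a catalog of files can be complete while the contents held locally are a subset.

```go
package main

import "fmt"

// Entry is what a peer knows about a file: its metadata plus a key that
// identifies the contents. All names here are hypothetical.
type Entry struct {
	Name       string
	Size       int64
	ContentKey string
}

// Peer knows about every file, but stores the contents of only some of them.
type Peer struct {
	Catalog []Entry           // knowledge of all files
	Local   map[string][]byte // contents held on this peer, by content key
}

// HasLocally reports whether the contents of e are stored on this peer.
func (p *Peer) HasLocally(e Entry) bool {
	_, ok := p.Local[e.ContentKey]
	return ok
}

// Usage returns the total size of all known files and the size of the
// subset whose contents are stored locally.
func (p *Peer) Usage() (known, stored int64) {
	for _, e := range p.Catalog {
		known += e.Size
		if p.HasLocally(e) {
			stored += e.Size
		}
	}
	return known, stored
}

func main() {
	laptop := &Peer{
		Catalog: []Entry{
			{Name: "notes.txt", Size: 13, ContentKey: "k1"},
			{Name: "wedding.jpg", Size: 4000000, ContentKey: "k2"},
		},
		Local: map[string][]byte{"k1": []byte("hello, world\n")},
	}
	for _, e := range laptop.Catalog {
		fmt.Printf("%s local=%v\n", e.Name, laptop.HasLocally(e))
	}
	known, stored := laptop.Usage()
	fmt.Printf("knows about %d bytes, stores %d bytes\n", known, stored)
}
```

Listing a directory only needs the catalog, so the laptop can show every file even though most of the bytes live elsewhere.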
With Bazil, the laptop SSD contents act as
And because Bazil keeps track of changes, it can synchronize them between the different peers; no more confusion about what copy is the latest.
When you try to read a file whose contents are not stored locally, the data will be fetched from the desktop or the cloud/closet server, whichever happens to be the fastest way. All the data is accessible even if it won’t fit on the SSD.
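One way to picture "whichever happens to be the fastest" is to ask every place that might hold the data at once and take the first good answer. The sketch below is only that, a picture: the Source type, fetchFastest, and the simulated peers are all made up for this example and say nothing about how Bazil actually chooses where to fetch from.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Source fetches the contents for a key from one place (hypothetical).
type Source func(key string) ([]byte, error)

// fetchFastest asks every source concurrently and returns the first
// successful answer. If all sources fail, it returns the last error seen.
func fetchFastest(key string, sources ...Source) ([]byte, error) {
	type result struct {
		data []byte
		err  error
	}
	// Buffered so that slower sources can still deliver and exit
	// after we have already returned.
	results := make(chan result, len(sources))
	for _, src := range sources {
		go func(src Source) {
			data, err := src(key)
			results <- result{data, err}
		}(src)
	}
	err := errors.New("no sources available")
	for range sources {
		r := <-results
		if r.err == nil {
			return r.data, nil
		}
		err = r.err
	}
	return nil, err
}

// delayedSource simulates a peer that answers after ms milliseconds.
func delayedSource(ms int, data string) Source {
	return func(key string) ([]byte, error) {
		time.Sleep(time.Duration(ms) * time.Millisecond)
		return []byte(data), nil
	}
}

// failingSource simulates a peer that cannot be reached.
func failingSource() Source {
	return func(key string) ([]byte, error) {
		return nil, errors.New("unreachable")
	}
}

func main() {
	data, err := fetchFastest("some-key",
		failingSource(),
		delayedSource(200, "from cloud"),
		delayedSource(5, "from desktop"),
	)
	fmt.Println(string(data), err)
}
```

An unreachable peer does not block the read here; it just loses the race.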
You can pin files for travel, so you’re no longer tied to your home network, or even any network connectivity.
Bazil is the archival solution, with the snapshot feature. Every Bazil peer can browse the earlier snapshots, making restoring files easy no matter what computer you’re on. You don’t have to manage both a network file system and a backup solution.
Bazil is the redundancy solution, with copies of file contents
stored on multiple computers. The CAS stores
immutable, write-once objects, so you can even mitigate software bugs
by taking an extra copy of the history with just
rsync, file system
snapshots, or any other file copy tool. A snapshot is just an object,
and refers to other objects; the objects contain everything needed to
regain access to your files.
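To make the "immutable, write-once objects" idea concrete, here is a toy content-addressed store. It is a sketch under assumptions, not Bazil’s CAS: the Store type, the in-memory map, the SHA-256 keys and the one-line snapshot format are all invented for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// Store is a toy content-addressed store: the key of an object is the
// hash of its bytes, so an object can never change under its key.
type Store struct {
	objects map[string][]byte
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{objects: make(map[string][]byte)}
}

// Put stores data and returns its key. Storing the same bytes twice is a
// no-op, which is where deduplication comes from.
func (s *Store) Put(data []byte) string {
	sum := sha256.Sum256(data)
	key := hex.EncodeToString(sum[:])
	if _, ok := s.objects[key]; !ok {
		s.objects[key] = append([]byte(nil), data...) // keep a private copy
	}
	return key
}

// Get returns a copy of the object stored under key.
func (s *Store) Get(key string) ([]byte, error) {
	data, ok := s.objects[key]
	if !ok {
		return nil, errors.New("object not found: " + key)
	}
	return append([]byte(nil), data...), nil
}

// Len reports how many distinct objects are stored.
func (s *Store) Len() int {
	return len(s.objects)
}

func main() {
	s := NewStore()
	greeting := s.Put([]byte("hello, world\n"))
	// A snapshot is itself just an object that names other objects by key.
	snapshot := s.Put([]byte("greeting " + greeting + "\n"))
	s.Put([]byte("hello, world\n")) // same bytes again: nothing new is stored
	fmt.Println("snapshot:", snapshot)
	fmt.Println("objects stored:", s.Len())
}
```

Because nothing is ever rewritten in place, copying the objects with any file copy tool yields a consistent extra copy of the history.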
All Bazil file storage can be encrypted to guarantee your privacy, whether in the cloud, on your own computers, or on external hard drives. Encryption is on by default.
Bazil is still under heavy development, and a lot of the functionality hinted at above is still not quite there. We welcome developers and curious power users.
The original gopher image was made by Renee French.
Hi. This blog post establishes the Bazil.org project. This is not an April Fool’s joke.
There are more ambitious things floating in the background, but many people have expressed interest in this, so here’s an early release: a Go FUSE filesystem programming library.
This is based on Russ Cox’s fuse library, as hosted at https://code.google.com/p/rsc/source/browse/#hg%2Ffuse
Here’s how to get going:
go get bazil.org/fuse
The github repository is at https://github.com/bazil/fuse