The backup tool, which we will call backupd, will be responsible for periodically checking the paths listed in the filedb database, hashing the folders to see whether anything has changed, and using the backup package to actually perform the archiving of folders that need it.

Create a new folder called backupd alongside the backup/cmds/backup folder, and let's jump right into handling the fatal errors and flags:
func main() {
    var fatalErr error
    defer func() {
        if fatalErr != nil {
            log.Fatalln(fatalErr)
        }
    }()
    var (
        interval = flag.Int("interval", 10, "interval between checks (seconds)")
        archive  = flag.String("archive", "archive", "path to archive location")
        dbpath   = flag.String("db", "./db", "path to filedb database")
    )
    flag.Parse()
}
You must be quite used to seeing this kind of code by now. We defer the handling of fatal errors before specifying three flags: interval, archive, and db. The interval flag represents the number of seconds between checks to see whether folders have changed, the archive flag is the path to the archive location where ZIP files will go, and the db flag is the path to the same filedb database that the backup command is interacting with. The usual call to flag.Parse sets the variables up and validates that we're ready to move on.
In order to check the hashes of the folders, we are going to need an instance of the Monitor type that we wrote earlier. Append the following code to the main function:
m := &backup.Monitor{
    Destination: *archive,
    Archiver:    backup.ZIP,
    Paths:       make(map[string]string),
}
Here we create a backup.Monitor instance, using the archive flag value as the Destination field. We'll use the backup.ZIP archiver and create a map ready for it to store the paths and hashes internally. At the start of the daemon, we want to load the paths from the database so that it doesn't archive unnecessarily as we stop and start things.
Add the following code to the main function:
db, err := filedb.Dial(*dbpath)
if err != nil {
    fatalErr = err
    return
}
defer db.Close()
col, err := db.C("paths")
if err != nil {
    fatalErr = err
    return
}
You have seen this code before too; it dials the database and creates an object that allows us to interact with the paths collection. If anything fails, we set fatalErr and return.
Since we're going to use the same path structure as in our user command-line tool program, we need to include a definition of it for this program too. Insert the following structure above the main function:
type path struct {
    Path string
    Hash string
}
The object-oriented programmers out there are no doubt by now screaming at the pages, demanding that this shared snippet exist in one place only and not be duplicated in both programs. I urge you to resist this compulsion toward early abstraction. These four lines of code hardly justify a new package, and therefore a new dependency for our code, when they can just as easily exist in both programs with very little overhead. Consider also that we might want to add a LastChecked field to our backupd program so that we could add rules where each folder only gets archived at most once an hour. Our backup program doesn't care about this and will chug along perfectly happily with its own view of what fields constitute a path.
We can now query all existing paths and update the Paths map, which is a useful technique to increase the speed of a program, especially given slow or disconnected data stores. By loading the data into a cache (in our case, the Paths map), we can access it at lightning speed without having to consult the files each time we need information.
Add the following code to the body of the main function:
var path path
col.ForEach(func(_ int, data []byte) bool {
    if err := json.Unmarshal(data, &path); err != nil {
        fatalErr = err
        return true
    }
    m.Paths[path.Path] = path.Hash
    return false // carry on
})
if fatalErr != nil {
    return
}
if len(m.Paths) < 1 {
    fatalErr = errors.New("no paths - use backup tool to add at least one")
    return
}
Using the ForEach method again allows us to iterate over all the paths in the database. We Unmarshal the JSON bytes into the same path structure as we used in our other program and set the values in the Paths map. Assuming nothing goes wrong, we do a final check to make sure there is at least one path, and if not, return with an error.
The next thing we need to do is perform a check on the hashes right away to see whether anything needs archiving, before entering an infinite timed loop where we check again at regular, specified intervals.
An infinite loop sounds like a bad idea; in fact, to some it sounds like a bug. However, since we're talking about an infinite loop within this program, and since infinite loops can easily be broken with a simple break statement, they're not as dramatic as they might sound.
In Go, writing an infinite loop is as simple as:
for {}
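As a standalone illustration (not part of backupd), here is an infinite loop that escapes with a simple break once a condition is met; countTo is a hypothetical helper invented for this example:

```go
package main

import "fmt"

// countTo spins in an infinite loop, escaping with break once n reaches limit.
func countTo(limit int) int {
	n := 0
	for {
		n++
		if n >= limit {
			break // a plain break is all it takes to leave the loop
		}
	}
	return n
}

func main() {
	fmt.Println(countTo(5)) // 5
}
```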
The instructions inside the braces get executed over and over again, as quickly as the machine running the code can execute them. Again, this sounds like a bad plan, unless you're careful about what you're asking it to do. In our case, we are immediately initiating a select case on two channels that will block safely until one of the channels has something interesting to say.
Add the following code:
check(m, col)
signalChan := make(chan os.Signal, 1)
signal.Notify(signalChan, syscall.SIGINT, syscall.SIGTERM)
for {
    select {
    case <-time.After(time.Duration(*interval) * time.Second):
        check(m, col)
    case <-signalChan: // stop
        fmt.Println()
        log.Printf("Stopping...")
        goto stop
    }
}
stop:
Of course, as responsible programmers, we care about what happens when the user terminates our programs. So after a call to the check function, which doesn't yet exist, we make a signal channel and use signal.Notify to ask for the termination signal to be given to the channel, rather than handled automatically. In our infinite for loop, we select on two possibilities: either the timer channel sends a message or the termination signal channel sends a message. If it's the timer channel message, we call check again; otherwise, we go about terminating the program.
The time.After function returns a channel that will send a signal (actually the current time) after the specified time has elapsed. The somewhat confusing time.Duration(*interval) * time.Second code simply indicates the amount of time to wait before the signal is sent; the first * character is a dereference operator, since the flag.Int function returns a pointer to an int, and not the int itself. The second * character multiplies the interval value by time.Second, which gives a value equivalent to the specified interval in seconds. Converting the *interval int to time.Duration is required so that both operands of the multiplication have the same type.
We take a short trip down memory lane in the preceding code snippet by using the goto statement to jump out of the select and for blocks. We could do away with the goto statement altogether and just return when a termination signal is received, but the pattern discussed here allows us to run non-deferred code after the for loop, should we wish to.
All that is left is for us to implement the check function, which should call the Now method on the Monitor type and update the database with new hashes if there are any.

Underneath the main function, add the following code:
func check(m *backup.Monitor, col *filedb.C) {
    log.Println("Checking...")
    counter, err := m.Now()
    if err != nil {
        log.Fatalln("failed to backup:", err)
    }
    if counter > 0 {
        log.Printf(" Archived %d directories ", counter)
        // update hashes
        var path path
        col.SelectEach(func(_ int, data []byte) (bool, []byte, bool) {
            if err := json.Unmarshal(data, &path); err != nil {
                log.Println("failed to unmarshal data (skipping):", err)
                return true, data, false
            }
            path.Hash, _ = m.Paths[path.Path]
            newdata, err := json.Marshal(&path)
            if err != nil {
                log.Println("failed to marshal data (skipping):", err)
                return true, data, false
            }
            return true, newdata, false
        })
    } else {
        log.Println(" No changes")
    }
}
The check function first tells the user that a check is happening, before immediately calling Now. If the Monitor type did any work for us (that is, if it archived any folders), we report it to the user and go on to update the database with the new values. The SelectEach method allows us to change each record in the collection, if we so wish, by returning the replacement bytes. So we Unmarshal the bytes to get the path structure, update the hash value, and return the marshaled bytes. This ensures that the next time we start a backupd process, it will do so with the correct hash values.
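To make the update step concrete, here is a sketch that mimics the body of the SelectEach callback with an illustrative refreshHash helper; the record and cache contents are made up for the example:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type path struct {
	Path string
	Hash string
}

// refreshHash mimics the SelectEach callback body: unmarshal a record,
// overwrite its Hash from the in-memory cache, and marshal it back.
// On any error it returns the original bytes unchanged, just as the
// daemon skips records it cannot process.
func refreshHash(data []byte, cache map[string]string) []byte {
	var p path
	if err := json.Unmarshal(data, &p); err != nil {
		return data // skip records we cannot parse
	}
	p.Hash = cache[p.Path]
	newdata, err := json.Marshal(&p)
	if err != nil {
		return data
	}
	return newdata
}

func main() {
	cache := map[string]string{"/home/user/photos": "newhash"}
	out := refreshHash([]byte(`{"Path":"/home/user/photos","Hash":"oldhash"}`), cache)
	fmt.Println(string(out)) // {"Path":"/home/user/photos","Hash":"newhash"}
}
```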