
TimeMachine soooo, yester-year?

Unix bash Backup

The way Apple's Time Machine works is really admirable. It is a fascinatingly simple solution in principle: every incremental backup starts out as a collection of hard links to the previous backup, and only the files that have changed are replaced with an updated copy.

elegant?

This seems like a very crude method for making incremental backups: if a large file changes, the whole file is copied again. But on the average desktop computer (the intended audience), it is mostly small files that change.

elegant!

The elegant part of this solution is that it does not require specialized software to restore any backup from a Time Machine drive, as long as the disk can be mounted (usually HFS+, possibly encrypted). Every backup presents the user with a complete set of files to browse through. There is no need to stitch together multiple folders of partial backups that have been created over time.
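
Restoring from such a backup is therefore nothing more than an ordinary copy back from whichever snapshot you like. For example (the snapshot path below is purely illustrative, a file manager or plain cp works just as well):

rsync -a /mnt/backup/20220101-020000/srv/data/test/ /srv/data/test/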

Can we use it on other systems?

Yes, as long as the file system supports hard links, this will work. Most *nix file systems such as ext2/3/4, btrfs and zfs support hard links. I believe even NTFS has this capability (FAT and friends, however, do not).
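
A quick way to check whether a target disk supports hard links is to create one and compare inode numbers; if both names report the same inode, the link worked (/mnt/backup below is just a placeholder for your backup mount):

# create a test file and try to hard link it (this fails on FAT, for example)
touch /mnt/backup/.linktest
ln /mnt/backup/.linktest /mnt/backup/.linktest2
# both names should report the same inode number
stat -c '%i' /mnt/backup/.linktest /mnt/backup/.linktest2
rm -f /mnt/backup/.linktest /mnt/backup/.linktest2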

but how?

It can be achieved pretty simply by the following two steps:

1. copy

copy all files and folders to a new location as hard links:

cp -al <last_backup> <dst>

The -l option makes cp create hard links instead of copying the file contents; -a recurses and preserves attributes.
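
To convince yourself that the copy really consists of hard links rather than duplicated data, compare the inode number and link count of the same file in both trees (the file name here is just an example):

# %i = inode number, %h = hard link count; both should match across the two backups
stat -c '%i %h %n' <last_backup>/some/file <dst>/some/file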

2. rsync the difference

Use rsync to copy only changed files, replacing the hard link in <dst> with the changed file:

rsync -a --delete <src> <dst>

That's basically it, simple! It works because rsync by default writes a changed file to a temporary file and then renames it over the old name instead of updating it in place, so the hard link is broken and the copy in the previous backup stays untouched.
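
Put together, a minimal sketch of the whole idea looks like this (paths and the date format are only examples, and it assumes at least one previous backup already exists):

#!/usr/bin/env bash
set -euo pipefail

src="/srv/data/test"                     # what to back up (example)
base="/srv/data/bkp"                     # where the backups live (example)
now=$(date +%Y%m%d-%H%M%S)               # folder name for the new backup
last=$(ls -1 "$base" | sort | tail -n1)  # most recent existing backup

# 1. clone the last backup as hard links (no file data is duplicated)
cp -al "$base/$last" "$base/$now"

# 2. replace changed files and drop deleted ones in the new backup only
rsync -a --delete "$src/" "$base/$now/"

The script below does exactly these two steps and adds logging, statistics and an optional config file on top.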

real world usage

I am using this principle to back up all my data on my home NAS to an external disk. The script is called MachineTime.sh:

  1#!/usr/bin/env bash
  2# -*- tab-width: 4; encoding: utf-8; -*-
  3#
  4#########
  5# License:
  6#
  7#             DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
  8#                    Version 2, December 2004
  9#
 10# Copyright (C) 2022 Simon Wunderlin <wunderlins@gmail.com>
 11#
 12# Everyone is permitted to copy and distribute verbatim or modified
 13# copies of this license document, and changing it is allowed as long
 14# as the name is changed.
 15#
 16#            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
 17#   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
 18#
 19#  0. You just DO WHAT THE FUCK YOU WANT TO.
 20#
 21#########
 22#
 23## @file
 24## @author Simon Wunderlin <wunderlins@gmail.com>
 25## @brief Hardlink Backup Script using rsync
 26## @copyright WTFPLv2
 27## @version 1
 28## @details
 29##
 30## @par Synopsis
 31##
 32## This script creates incremental backups by hard linking 
 33## all files from the last backup and only replacing changed files.
 34##
 35## This is achieved by using the built-in `cp -al <last_backup> <dst>` command
 36## (`-al` makes the copy as hard links) and `rsync -a --delete <src> <dst>`
 37## to replace changed files.
 38##
 39## @par Usage
 40##
 41## ```bash
 42## backup.sh [-f] <backup_base> <src_dir_1> [<src_dir_2> [<src_dir_3> [...]]]
 43## backup.sh [-f] -c <config.ini>
 44## ```
 45## 
 46## - `-f` (optional) @n
 47##   Full backup, otherwise incremental
 48## - `-c` <config.ini> @n
 49##   Take all parameters from this config file
 50## - `backup_base`  @n
 51##   base directory of backups
 52## - `src_dir_N`  @n
 53##   folder to backup
 54## 
 55## @par config.ini
 56##
 57## Example configuration file
 58##
 59## ```ini
 60## [main]
 61## backup_root = /srv/data/bkp
 62## src_dir     = /srv/data/test
 63## src_dir     = /srv/data/test2
 64## ```
 65## 
 66## @par Configuration
 67## 
 68## some parameters can be configured in the script or by setting 
 69## environment variables:
 70##
 71## - `SVC_NAME`: @n used as service name in logger
 72## - `NICENESS`: @n process niceness (0 neutral, 19 lowest, -20 highest priority)
 73## 
 74## @par Permanent Mount in OpenWRT
 75##
 76## Example of how to add a permanent mount for the backup disk in OpenWrt
 77## with the `uci` command:
 78## 
 79## Gives partition info, especially the UUID. Helps identify new USB drives easily.
 80##
 81## ```bash
 82## block detect 
 83## ```
 84## show current mount configuration
 85## 
 86## ```bash
 87## uci show fstab
 88## ```
 89## 
 90## add new mount and enable it
 91## ```bash 
 92## uci add fstab mount
 93## uci set fstab.@mount[-1].target='/mnt/backup'
 94## uci set fstab.@mount[-1].uuid='f1036013-33OO-4d7b-b3e8-c0c3f9a8NNNN'
 95## uci set fstab.@mount[-1].enabled='0'
 96## uci set fstab.@global[0].anon_mount='1'
 97## ```
 98## 
 99## review changes:
100## ```bash 
101## uci changes
102## ```
103##
104## save changes
105## ```bash 
106## uci commit
107## ```
108##
109## activate mount
110## ```bash 
111## /etc/init.d/fstab boot
112## ```
113## 
114## @see https://github.com/Anvil/bash-doxygen
115## @see https://earlruby.org/2013/05/creating-differential-backups-with-hard-links-and-rsync/
116##
117
118# configuration ################################################################
119
120## @brief this is the name used in the logger
121declare -r SVC_NAME=MachineTime
122## @brief nice value of i/o jobs
123## @details 
124## - 19 lowest priority
125## - 0 normal
126## -  -20 highest priority
127declare -r NICENESS=10
128
129# globals ######################################################################
130
131# absolute path to the running script
132declare -r script_dir=$(readlink -f $(dirname $0))
133
134## @brief the current process id 
135## @private
136declare PROCID=$$
137
138if [[ -z "$SVC_NAME" ]]; then
139	## @brief this is the name used in the logger
140	declare SVC_NAME=common.sh
141fi
142
143## @brief total of all copied bytes of all subdirectories
144## @private
145declare STATS_COPY_SUM=0
146
147## @brief total of all deleted bytes of all subdirectories
148## @private
149declare STATS_DELETE_SUM=0
150
151## @brief the current timestamp, will be used in base directory
152## @private
153declare -r START_DT=$(date +%Y%m%d-%H%M%S)
154
155# setup locations
156## @brief timestamp of the last backup
157## @private
158declare LAST_DT=""
159
160## @brief folder name of the last backup
161## @private
162declare LAST_BASE=""
163
164## @brief full backup or incremental? (default: incremental)
165## @details
166## use `-f` on the command line for full backup
167## @private
168declare FULL_BACKUP=0
169
170## @brief path to global log file 
171## @private
172declare global_logfile=""
173
174## @brief base directory of backups
175## @private
176declare BACKUP_BASE=""
177
178## @brief space separated list of source directories, no files allowed
179## @private
180declare SRC_DIRS=""
181
182# functions ####################################################################
183
184## @fn log()
185## @brief log string to console
186## @param message String message to print
187## @return String formatted log message
188function log() {
189	ts=`date +"%b %d %H:%M:%S"`
190	echo "${ts} ${USER} ${SVC_NAME}[${PROCID}]: ${1}"
191}
192
193## @fn fail()
194## @brief display error message and exit
195## @param message String Error message
196## @param exit_code int exit code
197function fail() {
198	local msg="$1"
199	log "$msg"
200	exit $2
201}
202
203#if ! which realpath; then # not in path, declare substitute
204#log "declaring function realpath()"
205## @fn realpath()
206## @brief realpath substitute using readlink
207## @param Path relative path to resolve
208## @retval 0 on success
209## @retval 1 if the path cannot be resolved
210## @return Path absolute path or empty string if the file does not exist
211function realpath() {
212	local file="$1"
213	local resolved=$(readlink -f "$file")
214	local ret=$?
215	echo "${resolved}"
216	return $ret
217}
218#fi
219
220## @fn iniget()
221## @brief Get values from a .ini file
222## @details
223##
224## Return values from an ini file. The special parameter `--list` returns 
225## a newline `\n` separated list of sections.
226##
227## If a section contains multiple keys with the same name, a newline `\n`
228## separated list of values is returned.
229## 
230## @par usage
231## usage syntax:
232## 
233##     iniget <file> [--list|<section> [key]]
234## 
235## @par examples
236## 
237## file.ini:
238## 
239##     [Machine1]
240##     app = version1
241##     
242##     [Machine2]
243##     app = version1
244##     app	= version2
245##     [Machine3]
246##     app=version1
247##     app = version3
248## 
249## Examples:
250## Get a list of sections
251## 
252##     $ iniget file.ini --list
253##     Machine1
254##     Machine2
255##     Machine3
256## 
257## get a list of `key=values` in `section`:
258## 
259##     $ iniget file.ini Machine3
260##     app=version1
261##     app=version3
262## 
263## get a list of values (one result)
264## 
265##     $ iniget file.ini Machine1 app
266##     version1
267## 
268## get a list of values (two results)
269## 
270##     $ iniget file.ini Machine2 app
271##     version1
272##     version2
273##
274## loop over results
275## 
276##     for v in $(iniget file.ini Machine2 app); do 
277##         echo "val: $v"; 
278##     done
279##
280## @param file path to `.ini` file
281## @param section or `--list` (returns a list of sections)
282## @param key key name to search for
283## @return String list of sections when used with `--list` else values of `section`/`key` combination
284function iniget() {
285	if [[ $# -lt 2 || ! -f $1 ]]; then
286		echo "usage: iniget <file> [--list|<section> [key]]"
287		return 1
288	fi
289	local inifile=$1
290	
291	if [ "$2" == "--list" ]; then
292		for section in $(cat $inifile | grep "^\\s*\[" | sed -e "s#\[##g" | sed -e "s#\]##g"); do
293			echo $section
294		done
295		return 0
296	fi
297	
298	local section=$2
299	local key
300	[ $# -eq 3 ] && key=$3
301	
302	# This awk line turns ini sections => [section-name]key=value
303	local lines=$(awk '/\[/{prefix=$0; next} $1{print prefix $0}' $inifile)
304	lines=$(echo "$lines" | sed -e 's/[[:blank:]]*=[[:blank:]]*/=/g')
305	while read -r line ; do
306		if [[ "$line" = \[$section\]* ]]; then
307			local keyval=$(echo "$line" | sed -e "s/^\[$section\]//")
308			if [[ -z "$key" ]]; then
309				echo $keyval
310			else		  
311				if [[ "$keyval" = $key=* ]]; then
312					echo $(echo $keyval | sed -e "s/^$key=//")
313				fi
314			fi
315		fi
316	done <<<"$lines"
317}
318
319## @fn cleanup()
320## @brief remove temp files, etc. run before shutdown
321function cleanup() {
322	rm -rf "/tmp/${PROCID}.log" >/dev/null 2>&1 || true
323	log "== Backup finished $(date)"
324	log "Copied/Deleted: $STATS_COPY_SUM/$STATS_DELETE_SUM bytes"
325}
326
327## @fn stats_copy()
328## @brief generate statistics of copied and deleted files
329## @param dst Path of destination folder
330## @param logfile Path to log file
331## @return Integer number of bytes
332function stats_copy() {
333	local dst="$1"
334	local logfile="$2"
335	
336	sum_copy_bytes=0;
337	sizes=$(awk -F' ' '/^file: .*[^\/]$/ {print $2}' $logfile);
338	for n in $sizes; do
339		sum_copy_bytes=$(($sum_copy_bytes + $n));
340	done;
341	echo $sum_copy_bytes
342}
343
344## @fn stats_delete()
345## @brief generate statistics of deleted files in incremental backup
346## @param dst Path of destination folder
347## @param last Path of reference folder
348## @param logfile Path to log file
349## @return Integer number of bytes
350function stats_delete() {
351	local dst="$1"
352	local last="$2"
353	local logfile="$3"
354	
355	sum_delete_bytes=0;
356	files=$(awk '/^deleting / {print $2}' $logfile)
357	for f in $files; do
358		#echo "${last}${f}"
359		size=$(stat -c "%s" "${last}${f}")
360		sum_delete_bytes=$(($sum_delete_bytes + $size));
361	done
362	#log "Deleted ${sum_delete_bytes} bytes from $last"
363	echo $sum_delete_bytes
364}
365
366## @fn backup_full()
367## @brief create a full backup
368## @param dir Path source directory to backup from
369## @param dst Path base destination directory
370## @return Integer exit code of `rsync`, exits if `dst` cannot be created
371function backup_full() {
372	local dir="$1"; shift
373	local dst="$1"; shift
374	log "${dir}${dst}"
375	
 376	mkdir -p "${dst}${dir}" || fail "Failed to create destination directory" 3
377	nice -n $NICENESS rsync --out-format="file: %l ${dst}${dir}%n%L" -a \
378		"$dir" "${dst}${dir}" | tee -a ${global_logfile} > /tmp/${PROCID}.log
 379	local ret=${PIPESTATUS[0]} # rsync's exit status, not tee's
380
381	# statistics
382	copy_bytes=$(stats_copy "${dst}" "/tmp/${PROCID}.log")
383	STATS_COPY_SUM=$(($STATS_COPY_SUM + copy_bytes))
384	log "» Copied/Deleted: $copy_bytes/0 bytes"
385
386	[ "$ret" != "0" ] && log "Error during full backup of ${dir}"
387	return $ret
388}
389
390## @fn backup_incremental()
 391## @brief create an incremental backup
 392## @param dir Path source directory to backup from
 393## @param dst Path base destination directory
394## @param last Path base destination directory of the last backup
395## @return Integer exit code of `rsync`, exits if `dst` cannot be created
396function backup_incremental() {
397	local dir="$1"; shift
398	local dst="$1"; shift
399	local last="$1"; shift
400	log "${dir}${dst} Δ ${last}"
401	#log "dir: $dir"
402	#log "dst: $dst"
403	#log "last: $last"
404
405	mkdir -p $(dirname "${dst}${dir}") || \
 406		fail "Failed to create destination directory" 3
407	# “cp -al” makes a hard link copy
408	nice -n $NICENESS cp -al "${last}${dir}" "${dst}${dir}"
409	nice -n $NICENESS rsync --out-format="file: %l ${dst}${dir}%n%L" \
410		-a --delete "${dir}" "${dst}${dir}" | \
411		tee -a ${global_logfile} > /tmp/${PROCID}.log
 412	local ret=${PIPESTATUS[0]} # rsync's exit status, not tee's
413	
414	# statistics
415	copy_bytes=$(stats_copy  "${dst}" "/tmp/${PROCID}.log")
416	del_bytes=$(stats_delete "${dst}" "${last}${dir}" "/tmp/${PROCID}.log")
417	STATS_COPY_SUM=$(($STATS_COPY_SUM + copy_bytes))
418	STATS_DELETE_SUM=$(($STATS_DELETE_SUM + del_bytes))
419	log "» Copied/Deleted: $copy_bytes/$del_bytes bytes"
420
421	[ "$ret" != "0" ] && log "Error during incremental backup of ${dir}"
422	return $ret
423}
424
425## @fn usage()
426## @brief show usage
427## @param Path script relative or absolute
428function usage() {
429	echo ""
430	cat <<-EOT
 431		usage: $(basename $1) [-f] <backup_base> <src_dir_1> [<src_dir_2> [<src_dir_3> [...]]]
432		       $(basename $1) [-f] -c <config.ini>
433		
434		Parameters:
435		    - -f (optional): Full backup, otherwise incremental
436		    - -c <config.ini>: config file
437		    - backup_base: base directory of backups
438		    - src_dir_N: folder to backup
439		
440		Example <config.ini>:
441		
442		    [main]
443		    backup_root = /srv/data/bkp
444		    src_dir     = /srv/data/test
445		    src_dir     = /srv/data/test2
446
447EOT
448	echo ""
449}
450
451## @fn log_parameters()
452## @brief log all parameters to backup directory
453function log_parameters() {
454# write parameter file
455cat <<-EOT > "${BASE_DIR}/parameters.ini"
456[main]
457backup_root      = $BACKUP_BASE
458src_dir          = $SRC_DIRS
459full_backup      = $FULL_BACKUP
460config_file      = $config_file
461
462[variables]
463SVC_NAME         = $SVC_NAME
464NICENESS         = $NICENESS
465script_dir       = $script_dir
466STATS_COPY_SUM   = $STATS_COPY_SUM
467STATS_DELETE_SUM = $STATS_DELETE_SUM
468START_DT         = $START_DT
469PROCID           = $PROCID
470LAST_DT          = $LAST_DT
471LAST_BASE        = $LAST_BASE
472global_logfile   = $global_logfile
473
474EOT
475}
476
477# Main #########################################################################
478if [ -z "$1" ] || [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
479	usage "$0"
480	exit
481fi
482
483log "== Starting backup @ $(date)"
484
485# check if we need to do a full backup or incremental
486[ "$1" == "-f" ] && FULL_BACKUP=1 && shift
487
488# check if we need to read a config file or if config is provided as parameters
489if [ "$1" == "-c" ]; then
490	config_file="$2"; shift; shift
491	[ ! -f "$config_file" ] && \
492		fail "Config file '$config_file' is not readable" 4
493		
494	BACKUP_BASE=$(iniget "$config_file" main backup_root)
495	SRC_DIRS=$(iniget "$config_file" main src_dir | tr '\n' ' ')
496else # read config from command line
497	BACKUP_BASE="$(realpath $1)"; shift
498	for d in $@; do SRC_DIRS="${SRC_DIRS} $(realpath $d)"; done
499fi
500
501# trim SRC_DIRS variable
502SRC_DIRS=$(echo "$SRC_DIRS" | sed -e 's,^\s*,,; s,\s*$,,')
503log "Backup base:        ${BACKUP_BASE}"
504log "Source directories: ${SRC_DIRS}"
505
506## FIXME: add input parameter checks
507if [ -z "$SRC_DIRS" ] || [ -z "$BACKUP_BASE" ]; then
508	usage "$0"
509	exit 1
510fi
511
512## @brief destination folder for the new backup
513## @private
514declare -r BASE_DIR="${BACKUP_BASE}/${START_DT}"
515global_logfile="${BASE_DIR}/${START_DT}-backup.log"
516
517# for incremental backups, we need to find a reference backup
518if [ "$FULL_BACKUP" != "1" ]; then
 519	# reference locations for incremental backup
520	LAST_DT=$(ls -1 $BACKUP_BASE 2>/dev/null | grep '[0-9-]\+' | sort -r | head -n1)
521	LAST_BASE="${BACKUP_BASE}/${LAST_DT}"
 522	# if we do not find at least one old folder, we
523	# cannot create an incremental backup, so abort
524	[ -z "$LAST_DT" ] && fail "Existing folder required for incremental backup" 2
525	log "Running incremental backup Δ ${LAST_BASE}"
526else
527	log "Running full backup"
528fi
529
530mkdir -p "$BASE_DIR"
531log_parameters
532
533# loop over all source folders and include them in the backup
534for d in $SRC_DIRS; do
535	# make sure we have trailing slash for rsync
536	[[ "$d" == */ ]] || d="$d/"
537	echo "== Base dir: $d" >> /tmp/${PROCID}.log
538	
539	# do a full backup 
540	if [ "$FULL_BACKUP" == "1" ]; then
541		backup_full "$d" "$BASE_DIR" # "$@"
542	else
543		# if the sub folder does not exist in the last backup, do a full backup
544		if [ ! -d "${LAST_BASE}/$d" ]; then 
545			backup_full "$d" "$BASE_DIR" # "$@"
546		else # if the sub folder in last backup exists, do an incremental backup
547			backup_incremental "$d" "${BASE_DIR}" "${LAST_BASE}" # "${@}"
548		fi
549	fi
550done
551
552## update parameters after a successful run
553log_parameters
554
555# cleanup temp files
556cleanup
557
558exit 0;