We have a new Canon multifunction device (printer, copier, scanner) in the office, and it is network ready, so I have configured it to upload scans to the office server for easy access. The catch is that I want to purge these files after a while so that they do not clog up the server. My solution was to archive them periodically with a cron job that runs a shell script. Here is the script I created:


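#!/bin/sh

# Archive the top-level scans without descending into subdirectories
tar --no-recursion -cjf ~/canon/.archive/scan_archive_`date '+%Y-%m-%d.%s'`.tar.bz2 ~/canon/*
# Only clean up if tar reported success
if [ $? -eq 0 ]
        then
                # Remove scans older than 6 days
                find ~/canon/ -name '*.jpg' -mtime +6 -delete
                find ~/canon/ -name '*.tif' -mtime +6 -delete
                find ~/canon/ -name '*.pdf' -mtime +6 -delete
                # Remove archives older than 180 days
                find ~/canon/ -name '*.tar.bz2' -mtime +180 -delete
        fi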

This creates an archive with a unique timestamp in its name and, if that succeeds, deletes any scans (JPG, TIF and PDF files) older than 6 days from the directory. It also removes any archives older than 180 days so that they do not pile up forever. I am contemplating having it delete very small archives too, because when there is nothing to archive it still creates a fragment of an archive. I must thank this site for inspiring these commands and give them full credit.
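
The small-archive cleanup could be sketched roughly like this. It is only a sketch, assuming a find that supports -delete (as the script above already does). The function name prune_small_archives and the 512-byte default threshold are made up for illustration; the threshold would need checking against the size of a real empty archive on your system.

```shell
#!/bin/sh

# Hypothetical helper: delete .tar.bz2 files smaller than a byte threshold.
# Usage: prune_small_archives DIR [MIN_BYTES]
prune_small_archives() {
        dir=$1
        min_bytes=${2:-512}     # assumed default, tune to taste
        # "-size -Nc" matches files smaller than N bytes
        find "$dir" -type f -name '*.tar.bz2' -size "-${min_bytes}c" -delete
}
```

Calling it as prune_small_archives ~/canon/.archive 512 at the end of the script would remove the fragments. Swapping -delete for -print first is a cheap way to preview what would be removed.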