Tuesday, April 1, 2014

Delete files older than x days - Python, bash etc

Our postgres logs directory had logs from almost 6 months and we only needed 5 days worth of data. Its pretty straightforward to remove stuff using bash:

find /db/logs/archive_data/* -type f -mtime +5 | xargs rm

But find sometimes fails with the error "too many arguments" when it has to deal with too many files.

Here is another way in Python:

Example usage: Save this as cleanup.py and run - python cleanup.py /db/data 5
from sys import argv
import os, time

script, dir, age = argv
print "Searching directory %s for file older than %s day(s)" % (str(argv[1]), str(argv[2]))

#convert age to sec. 1 day = 24*60*60
age = int(age)*86400

for file in os.listdir(dir):
    now = time.time()
    filepath = os.path.join(dir, file)
    modified = os.stat(filepath).st_mtime
    if modified < now - age:
        if os.path.isfile(filepath):
            os.remove(filepath)
            print 'Deleted: %s (%s)' % (file, modified)