Wednesday, January 19, 2011

Why is my Data.fs so big?

If your Data.fs is still big after packing, here are some tricks you can try to find out why.
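Packing itself can be done from the ZMI or from a debug prompt. A minimal sketch, assuming a standard FileStorage and that you started the site with bin/instance debug so that app is bound to the Zope application root:

# Pack the ZODB, dropping all old object revisions.
# Use days=7 instead to keep a week of undo history.
db = app._p_jar.db()
db.pack(days=0)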

1. Write a custom script (see the example below) that does a catalog query and prints each brain's getObjSize(). This will only find catalogued objects, but those are probably the ones making your Data.fs big.

2. Check portal_historiesstorage to see how many old versions of objects are stored and how big those objects are. Maybe there is one news item with a 4 MB image that has 365 stored versions; see the first sketch after this list.

3. Check portal_purgepolicy's settings: by default it keeps an infinite number of versions (-1); see the second sketch after this list.
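For tip 2, here is a minimal sketch that counts stored versions per catalogued object. It assumes CMFEditions is installed (so the portal_repository tool and its getHistory() call are available) and is meant to run as a Script (Python) in the ZMI, like the size script below; the threshold of 10 versions is arbitrary. Note that getObject() wakes up every object, so this can be slow on a large site.

# Sketch: report catalogued objects that have many stored versions.
from Products.CMFCore.utils import getToolByName

repository = getToolByName(context, 'portal_repository')
catalog = getToolByName(context, 'portal_catalog')
for brain in catalog():
    obj = brain.getObject()
    if not repository.isVersionable(obj):
        continue
    history = repository.getHistory(obj)
    if len(history) > 10:
        print len(history), brain.getURL()
return printed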
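For tip 3, a small sketch to inspect the purge policy, again as a Script (Python) in the ZMI. The attribute name maxNumberOfVersionsToKeep is what CMFEditions' default keep-the-last-n-versions policy uses, with -1 meaning keep everything; double-check it on your own portal_purgepolicy tool before changing anything.

# Sketch: show how many versions the purge policy keeps.
from Products.CMFCore.utils import getToolByName

purge_policy = getToolByName(context, 'portal_purgepolicy')
print "Keeping %s versions per object" % purge_policy.maxNumberOfVersionsToKeep
# Uncomment to keep only the last 5 versions of each object:
# purge_policy.maxNumberOfVersionsToKeep = 5
return printed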

Thanks to Huub for these tips.


# Example Script (Python) to get a list of large catalogued content objects.
# Run it in the ZMI; getObjSize on a catalog brain returns a string such as
# '345 kB' or '1.2 MB', which is converted to a size in kB here.
from Products.CMFCore.utils import getToolByName

catalog = getToolByName(context, 'portal_catalog')
# Conversion factors from getObjSize units to kB; unknown units count as 0.
factors = {'B': 0.001, 'kB': 1.0, 'MB': 1000.0, 'GB': 1000000.0}
total = 0
for brain in catalog():
    num, unit = brain.getObjSize.split(' ')
    size = int(float(num) * factors.get(unit, 0.0))
    if size > 100:
        # Report objects larger than 100 kB.
        print size, brain.getURL()
    total = total + size

print "Total %s kB" % total
return printed

1 comment:

Fred said...

If you want some insight into your (or a customer's) Data.fs, you can also try
mr.inquisition . We had to do this a little too often to keep writing the script again and again.

http://pypi.python.org/pypi/mr.inquisition

Fred van Dijk