1. Write a custom script (see the example below) that runs a catalog query and reports each brain's getObjSize. This only finds cataloged objects, but those are probably the ones that make your Data.fs big.
2. Check portal_historiesstorage to see how many old versions of objects are stored and how big those objects are. Maybe there is that one news item with a 4 MB image, of which 365 versions exist.
3. Check the portal_purgepolicy settings: by default it keeps an unlimited number of versions (-1).
Thanks to Huub for these tips.
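# Example script (a Zope "Script (Python)") to get a list of large content objects
from Products.CMFCore.utils import getToolByName

catalog = getToolByName(context, 'portal_catalog')
total = 0
for r in catalog():
    # getObjSize is catalog metadata: a string such as '12.3 kB' or '1.2 MB'
    parts = str(r.getObjSize).split(' ')
    if len(parts) != 2:
        continue  # no usable size on this brain
    num, unit = parts
    num = float(num)
    if unit == 'kB':
        size = num
    elif unit == 'MB':
        size = num * 1000  # convert MB to kB (rounded factor)
    else:
        continue  # unknown unit, skip
    size = int(size)
    if size > 100:  # only list objects larger than 100 kB
        print size, r.getURL()
    total = total + size
print "Total %s kB" % total
return printed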
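
The size-string parsing in the script is the part most likely to trip over unexpected values, so it may help to try it outside Zope first. The following is only a minimal sketch in plain Python: the names size_in_kb, large_items and UNIT_TO_KB are made up for this illustration, the 'KB' and 'GB' entries are assumptions for sites that use those labels, and 1 MB is counted as 1000 kB to match the script above.

```python
# Standalone sketch: convert a catalog size string to whole kilobytes.
# Factors mirror the script above (1 MB counted as 1000 kB); the 'KB'
# and 'GB' entries are assumptions, not something the script handles.
UNIT_TO_KB = {'kB': 1, 'KB': 1, 'MB': 1000, 'GB': 1000 * 1000}


def size_in_kb(size_str):
    """Return the size in kB as an int, or None if it cannot be parsed."""
    parts = str(size_str).split()
    if len(parts) != 2:
        return None
    num, unit = parts
    if unit not in UNIT_TO_KB:
        return None
    try:
        return int(float(num) * UNIT_TO_KB[unit])
    except (ValueError, OverflowError):
        return None


def large_items(rows, threshold_kb=100):
    """Keep (size_str, url) pairs above threshold_kb, largest first."""
    sized = [(size_in_kb(s), url) for s, url in rows]
    big = [(kb, url) for kb, url in sized
           if kb is not None and kb > threshold_kb]
    return sorted(big, reverse=True)
```

Inside the site, the rows could be built from the catalog results, for example as (r.getObjSize, r.getURL()) pairs; sorting largest first puts the likely culprits at the top of the list.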

1 comment:
If you want some insight into your (or a customer's) Data.fs, you can also try mr.inquisition. We had to do this a little too often to keep writing the script again and again.
http://pypi.python.org/pypi/mr.inquisition
Fred van Dijk