Friday, June 27, 2008

How to search in file fields in custom content-types in Plone 2.5

If your custom content type has a file upload field, you have to mark the type as implementing IIndexableContent and provide an indexableContent method. For my custom type, which has a file field called "file", the method looks like this:

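# Assumed import path for the collector class that the snippet calls ICC
from textindexng.content import IndexContentCollector as ICC

def indexableContent(self, fields):
    if 'SearchableText' in fields:
        file = self.getFile()
        if file:
            # file is a file object
            mimetype = file.getContentType()
            icc = ICC()
            # Assumed step (missing from the listing): hand the raw file data
            # to the collector so a converter can extract its text.
            icc.addBinary('SearchableText', str(file), mimetype)
            return icc
    return None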

Tuesday, June 24, 2008

How to search in files (doc, pdf, etc.) in Plone 2.5

Plone 3 will index uploaded files out of the box, even Word docs. In Plone 2.5, you have to go to some extra trouble if you want uploaded files to be searched (indexed).

The key here is the TextIndexNG product. Here are the steps to take (from doc/README):

Installation on Plone:
- follow the steps above
- uncomment all directives in TextIndexNG3/adapters/configure.zcml (by
removing the HTML comments '<!--' and '-->')
- go to "Plone setup" -> "Add/remove programs"
- choose TextIndexNG3 to be added as new product
- a new configlet for TextIndexNG3 will appear on the setup screen (left
- click on the configlet and choose the only option to replace the
existing index setup with TextIndexNG3 indexes

edit 2008/09/23:
It's worth noting that you may replace the first two steps by:

- (possibly, first virtualenv your zope instance)
- easy_install "Products.TextIndexNG3<3.2"

About converters: on Ubuntu you can use wvWare (apt-get install wv) for MS Word and xpdf for PDF.

Friday, June 20, 2008

Plone: Listing all available permissions in the site

For a future PAS plugin, I wanted to create a (multi)select field on a content type that takes the available site permissions as a vocabulary. It took me a while to figure out how to get a list of all available permissions, so I'll put it here:

from Products.CMFCore.utils import getToolByName

portal_url = getToolByName(context, "portal_url")
portal = portal_url.getPortalObject()

all_permissions = []
# the portal object has a method valid_roles
for role in portal.valid_roles():
    # role is a string such as 'Contributor', 'Reader', etc.
    permissions = portal.permissionsOfRole(role)
    # permissions is a list of dicts
    for permission in permissions:
        # permission is a dict with keys 'selected' and 'name'
        if permission['name'] not in all_permissions:
            all_permissions.append(permission['name'])
return all_permissions
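
The loop above can be tried outside Zope with a stand-in object. This is only a sketch: FakePortal and its data are made up for illustration and are not part of Plone. It assumes nothing more than what the comments above describe, namely that valid_roles() returns role names and permissionsOfRole() returns dicts with the keys 'name' and 'selected'. It also uses a set for the "already seen" check, which avoids rescanning the list for every permission.

```python
class FakePortal(object):
    """Hypothetical stand-in for the portal object (not a Plone class)."""

    _table = {
        'Contributor': ['Add portal content', 'View'],
        'Reader': ['View'],
    }

    def valid_roles(self):
        return sorted(self._table)

    def permissionsOfRole(self, role):
        # Simplified: mimics the list-of-dicts shape described above.
        return [{'name': name, 'selected': 'SELECTED'}
                for name in self._table[role]]


def list_all_permissions(portal):
    """Collect each permission name once, in first-seen order."""
    seen = set()
    all_permissions = []
    for role in portal.valid_roles():
        for permission in portal.permissionsOfRole(role):
            name = permission['name']
            if name not in seen:
                seen.add(name)
                all_permissions.append(name)
    return all_permissions


if __name__ == '__main__':
    print(list_all_permissions(FakePortal()))
```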

Thursday, June 12, 2008

Tricks for massive file manipulation

(modified 2010-01-26)

# List all .pyc files (or directories) in current directory and below:
find . -name '*.pyc'

# Move all .pyc files (or directories) from current directory and below to ~/.trash (which must already exist):
find . -name '*.pyc' -exec mv {} ~/.trash \;

# Remove all files from build directory and below, and also remove them from cvs:
find . -type f -wholename './build/*' -exec rm {} \; -exec cvs remove {} \;

# Find all .py.metadata files (or directories) in or below /path/to/directory, print the lines containing 'proxy', also list the filename (-H)
find /path/to/directory -name '*.py.metadata' -exec grep -H proxy {} \;

# Find all files or directories directly in the current directory and print only their names, one per line:
find . -regex './[^/]*' -printf '%f\n'

# Find all files older than 30 days (by modification date) and print the modification date and name:
find . -type f -mtime +30 -printf '%t\t%p\n'
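
When the result has to be processed further, the same search can be done from Python. A rough sketch, not a drop-in replacement: files_older_than is a name made up here, and the cutoff is a plain "older than N * 24 hours", whereas find rounds the age down to whole days before comparing.

```python
import os
import time


def files_older_than(root, days, now=None):
    """Yield (mtime, path) for regular files under root older than `days` days."""
    if now is None:
        now = time.time()
    cutoff = now - days * 86400
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            mtime = os.path.getmtime(path)
            if mtime < cutoff:
                yield mtime, path


if __name__ == '__main__':
    for mtime, path in files_older_than('.', 30):
        print('%s\t%s' % (time.ctime(mtime), path))
```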

# Find directories in current folder not modified in the last 30 days, print modification date and name, and delete
# (-delete only removes empty directories; non-empty ones give an error):
find . -regex './[^/]*' -type d -mtime +30 -printf '%t\t%p\n' -delete

# Find files and directories directly in the current folder
# that have not been accessed in the last month,
# and delete them (directories including their subfolders).
# -mindepth 1 keeps '.' itself out of the result:
find . -mindepth 1 -maxdepth 1 -atime +31 -print0 | xargs -0 rm -rf

# Replace all occurrences of 'foo' by 'bar' in all files below this level
# (-print0 and xargs -0 keep filenames with spaces intact):
find . -type f -print0 | xargs -0 perl -pi -e 's/foo/bar/g'
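
The same replacement can be sketched in Python when you also want to know which files were touched. replace_in_tree is a hypothetical helper, not an existing tool, and it does a literal byte replacement rather than a regular expression substitution like the perl one-liner.

```python
import os


def replace_in_tree(root, old, new):
    """Replace `old` with `new` in every regular file under root.

    Returns the list of files that were changed.
    """
    old_b, new_b = old.encode('utf-8'), new.encode('utf-8')
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.islink(path):
                continue  # leave symlinks alone
            with open(path, 'rb') as handle:
                data = handle.read()
            if old_b not in data:
                continue  # nothing to do, do not rewrite the file
            with open(path, 'wb') as handle:
                handle.write(data.replace(old_b, new_b))
            changed.append(path)
    return changed
```

Working on bytes means the files are not decoded, so binary files and files in unknown encodings pass through unchanged apart from the replaced text.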

# Rename files: replace 'foo' in filename with 'bar'
# (uses the Perl-based rename, as shipped on Debian/Ubuntu).
# This may give errors if 'foo' occurs in more than one place on the path,
# but you can simply run it more than once.
find . -name "*foo*" -print0 | xargs -0 rename 's/(.*)foo(.*)/$1bar$2/'
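
The "more than one place on the path" problem goes away if children are renamed before their parent directories and only the last path component is changed. A minimal sketch of that idea in Python (rename_in_tree is a made-up name, and existing targets are skipped rather than overwritten):

```python
import os


def rename_in_tree(root, old, new):
    """Rename entries under root, replacing `old` with `new` in the name."""
    renamed = []
    # topdown=False visits children before their parent directory,
    # so renaming a directory never invalidates paths still to be visited.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            if old not in name:
                continue
            source = os.path.join(dirpath, name)
            target = os.path.join(dirpath, name.replace(old, new))
            if os.path.exists(target):
                continue  # do not overwrite an existing entry
            os.rename(source, target)
            renamed.append((source, target))
    return renamed
```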