Not a real security geek, and don't even play one on TV, but... I think any approach based on blacklisting will demand constant updating, and you'll never really have confidence that it catches everything. I would lean instead towards an encoding scheme that turns any user-specified name into a clean sequence of letters and numbers, no matter what characters were in the original name. Or do these URLs have to be human-memorable?
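
To make that concrete, here is a rough sketch of one such encoding, assuming Python and plain hex over the UTF-8 bytes. The function names encode_name and decode_name are made up for illustration, and hex is just one option (base32 would work along the same lines). Whatever goes in, only the characters 0-9 and a-f come out, and the original name can be recovered.

```python
def encode_name(name: str) -> str:
    """Map any string to hex digits (0-9, a-f) by way of its UTF-8 bytes."""
    return name.encode("utf-8").hex()


def decode_name(token: str) -> str:
    """Reverse encode_name; raises ValueError if token is not valid hex."""
    return bytes.fromhex(token).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical hostile-looking input, purely for illustration.
    raw = "../etc/passwd"
    token = encode_name(raw)
    print(token)
    print(decode_name(token) == raw)
```

The trade-off is the one raised above: the result is safe to drop into a URL path, but it is twice as long as the input and nobody is going to remember it. An empty name encodes to an empty string, so that case would need its own handling.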