jducoeur: (querki)
[personal profile] jducoeur
(This one is for the programmers out there, and especially for security geeks.)

As I was doing some updates yesterday, it occurred to me that Querki now allows you to name your Things pretty much anything you want. Including "javascript:...do something malicious...". Since we generate relative URLs to pages (and therefore, the URL is basically this name), this is Bad.

I've fixed the obvious hack by the simple expedient of screening out any URLs that begin with "javascript:", but I'm guessing that isn't enough -- that there are other ways to be malicious with a URL.

So I'm looking for suggestions. Take it for granted that Querki allows you to specify URLs, and that those URLs can be *fairly* arbitrary relative URLs, so I can't just whitelist a simple legal syntax -- I probably need to think in terms of blacklisting the badness. Do you know a good comprehensive list of the possible syntaxes that could be used for JavaScript injection when placed inside an href? (Better yet, do you know an existing regex pattern to detect them?)

(no subject)

Date: 2013-12-18 05:47 pm (UTC)
From: [identity profile] goldsquare.livejournal.com
I'd certainly look for ANYTHING that contains a cross-script call of some kind: a colon, a double slash (or backslash), and a standard IP address, IPv6 address, or domain name. Do you want to take special care when referencing something that COULD be suspicious?
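A rough sketch of that kind of check, in Python for illustration only (Querki's own code is not shown here, and `points_off_site` is a hypothetical helper name). It assumes that any link carrying a scheme or a host part should be treated as suspicious, and it leans on the standard library's URL parser rather than a hand-written pattern:

```python
from urllib.parse import urlsplit

def points_off_site(url: str) -> bool:
    """Return True if the URL names a scheme or a host, i.e. is not purely relative."""
    # Browsers treat "\" like "/" in http(s) URLs, so normalize before parsing.
    normalized = url.strip().replace("\\", "/")
    parts = urlsplit(normalized)
    # A scheme ("http:", "javascript:") or a network location ("//host")
    # means the link can leave the current site.
    return bool(parts.scheme or parts.netloc)

for candidate in ["My-Thing", "//evil.example/x", "http://10.0.0.1/", "javascript:alert(1)"]:
    print(candidate, points_off_site(candidate))
```

This only flags the obvious cases; it does not claim to match everything a browser would accept as an absolute URL.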

You may want to look at one of my favorite books on the topic - although it is somewhat aging now (perhaps there is a more-recent version): How To Break Web Software by Mike Andrews and James A. Whittaker. My copy is Copyright 2006.
http://books.google.com/books/about/How_to_Break_Web_Software.html?id=zEWvS-sTiNUC
http://www.qualitytesting.info/forum/topics/pdf-downloadhow-to-break-web

James A Whittaker has a much more recent book which I do not have, called "How Google Tests Software". That might have some interesting information.

And, for fun: http://xkcd.com/327/

(no subject)

Date: 2013-12-18 05:59 pm (UTC)
From: [identity profile] kihou.livejournal.com
I'm confused why you're talking several times about "relative URLs" here. A URL starting with "javascript:" isn't a valid relative URL, so if you escape things properly to ensure that you're always generating a valid relative URL, you should be fine even if people name their Things maliciously. If a URL starts with an alphanumeric (plus + . and -) string followed by a colon, that's interpreted as a scheme; you can fix that by prepending "./" to the start of your relative URL or (in most cases) URL-escaping the :.
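Both fixes can be sketched in a few lines. This is a minimal Python illustration, not Querki's implementation; the helper names (`looks_like_scheme`, `prefix_dot_slash`, `escape_colon`) are made up for the example, and it assumes the Thing name is used as a single path segment:

```python
import re
from urllib.parse import quote

# Scheme syntax: a letter, then letters, digits, "+", "." or "-", then ":".
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")

def looks_like_scheme(url: str) -> bool:
    """True if the start of the string would be parsed as a URL scheme."""
    return SCHEME_RE.match(url) is not None

def prefix_dot_slash(name: str) -> str:
    """Option 1: always anchor the name as a relative path."""
    return "./" + name

def escape_colon(name: str) -> str:
    """Option 2: percent-encode ":" (and every other reserved character)."""
    return quote(name, safe="")

name = "javascript:alert(1)"
print(looks_like_scheme(name))                    # the raw name parses as a scheme
print(looks_like_scheme(prefix_dot_slash(name)))  # "./" is not a valid scheme start
print(escape_colon(name))                         # no literal ":" remains
```

Prepending "./" unconditionally, rather than only when a scheme is detected, avoids depending on the detection pattern being complete.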

(no subject)

Date: 2013-12-20 04:27 am (UTC)
From: [identity profile] hudebnik.livejournal.com
Yes, that sounds simple and straightforward, and preserves human-readability. Could somebody use "../" to step up out of the usual hierarchy of URLs? Is that a problem?
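If it does turn out to be a problem, one way to spot it is to look for ".." as a whole path segment. The sketch below is illustrative Python with a hypothetical helper name (`has_parent_segment`); it assumes that percent-encoded dots and backslashes should be treated the way browsers treat them in http(s) URLs:

```python
from urllib.parse import unquote

def has_parent_segment(path: str) -> bool:
    """Return True if any path segment is "..", including encoded forms."""
    # Decode once so "%2e%2e" is treated like "..".
    decoded = unquote(path)
    # Treat "\" as a separator too, since browsers do so for http(s) URLs.
    segments = decoded.replace("\\", "/").split("/")
    return ".." in segments

for candidate in ["../admin", "a/%2e%2e/b", "my..thing", "a/b"]:
    print(candidate, has_parent_segment(candidate))
```

Checking whole segments means a name such as "my..thing" is still allowed.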

(no subject)

Date: 2013-12-18 07:24 pm (UTC)
From: [identity profile] ilaine-dcmrn.livejournal.com
In general, the reason whitelisting is preferred over blacklisting is that there are so many forms of encoding that it is very difficult to make the blacklist sufficiently comprehensive. For example, you need to know the various Unicode expressions of your blacklist entries as well as the ASCII ones.
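To make that concrete, here is a small Python sketch of how a plain prefix check misses a few well-known variants. The function names are hypothetical, the variant list is far from complete, and `browser_view` is only a rough approximation of what an HTML parser and URL parser do to an attribute value:

```python
import html

def naive_is_blocked(url: str) -> bool:
    """The obvious blacklist: reject anything starting with "javascript:"."""
    return url.startswith("javascript:")

variants = [
    "JavaScript:alert(1)",       # scheme names are case-insensitive
    " javascript:alert(1)",      # leading whitespace is stripped by browsers
    "java\tscript:alert(1)",     # tabs and newlines are removed from URLs
    "&#106;avascript:alert(1)",  # character reference, decoded if not HTML-escaped
]

def browser_view(url: str) -> str:
    """Approximate the normalization a browser applies before picking a scheme."""
    url = html.unescape(url)                               # HTML attribute decoding
    url = url.strip("".join(chr(c) for c in range(0x21)))  # C0 controls and space
    for ch in "\t\n\r":
        url = url.replace(ch, "")
    return url.lower()

missed = [v for v in variants if not naive_is_blocked(v)]
still_javascript = [v for v in variants if browser_view(v).startswith("javascript:")]
print(len(missed), len(still_javascript))
```

Every variant slips past the naive check, yet each one still resolves to a "javascript:" URL after normalization.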

The OWASP ESAPI might be of use to you:
https://www.owasp.org/index.php/Category:OWASP_Enterprise_Security_API

(no subject)

Date: 2013-12-18 07:51 pm (UTC)
From: [identity profile] ilaine-dcmrn.livejournal.com
This costs real money and only supports Java code, but it is cool like whoah.

http://www1.contrastsecurity.com/

(no subject)

Date: 2013-12-19 04:37 pm (UTC)
From: [identity profile] hudebnik.livejournal.com
Not a real security geek, and don't even play one on TV, but... I think any approach based on blacklisting will demand constant updating, and you'll never really have confidence in it. I would lean instead towards an encoding system that turns any user-specified name into a clean sequence of letters and numbers, no matter what characters were in the original name. Or do these URLs have to be human-memorable?
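One very simple encoding of that kind is hexadecimal over the UTF-8 bytes of the name. The Python sketch below is only an illustration with hypothetical function names; it gives up readability entirely, which is exactly the trade-off the question above is about:

```python
def encode_name(name: str) -> str:
    """Encode any name as lowercase hex digits (characters 0-9 and a-f only)."""
    return name.encode("utf-8").hex()

def decode_name(token: str) -> str:
    """Recover the original name from its hex form."""
    return bytes.fromhex(token).decode("utf-8")

token = encode_name("javascript:alert(1)")
print(token)
print(decode_name(token))
```

Because the output can never contain ":", "/", or "." it cannot be read as a scheme or a path step, whatever the original name was.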

(no subject)

Date: 2013-12-20 11:58 pm (UTC)
From: [personal profile] mneme
I think the important thing is not so much policing the names of things, but correctly enclosing things. (the same as the issue with little Bobby Tables).

So javascript:whatever isn't an issue as long as when it's included in a link, it's actually href="html:relative_url", not href="relative_url_or_anything", and properly html encoded so it can't break out of the quote jail.

Similarly, it's not an issue in normal text (like the page title) as long as it is encoded to the point that that's what appears on the page.

Obviously, you also may want to prohibit a -few- things (specifically, ../ due to the dual meaning), but that's still basically an enclosure issue, not putting something ambiguous into the URI.
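The "quote jail" part of this can be sketched with Python's standard `html.escape`. This is an illustrative example only; `render_link` is a hypothetical helper, not anything from Querki:

```python
import html

def render_link(href: str, text: str) -> str:
    """Build an anchor tag with both the attribute value and the text HTML-escaped."""
    safe_href = html.escape(href, quote=True)  # escapes & < > " and '
    safe_text = html.escape(text)
    return '<a href="{}">{}</a>'.format(safe_href, safe_text)

print(render_link('x" onclick="evil()', "<b>"))
```

Escaping keeps the value from breaking out of the attribute, but on its own it does not stop a value that starts with a scheme from being parsed as one, so it would need to be combined with anchoring the URL as relative.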
