Delimiter matching in Perl?
Dec. 30th, 2006 12:58 am
Here's a question for the Perl and/or regexp experts in the audience; all help is solicited.
ProWiki has a query language built in. Simplifying greatly, the syntax looks like this:
    {? [query terms] : [display results] ?}
This translates roughly as "for each page that matches the given query terms, show the display results, interpolating the properties of the page". That all works nicely, and is at the heart of what ProWiki does.
The problem is, I'd really like to be able to do this recursively. That is, I'd like to be able to construct a query like (to take today's example, one of many):
    {?~Faction : %%Name%% -- {?~Character && Faction==%%PAGENAME%% : %%Name%% ?} ?}
That would translate as something like, "For each Faction, display the Faction's Name, and then for each Character in that Faction, display the Character's Name". Basically, nested foreach loops.
That's conceptually straightforward, but I'm stuck on how to parse it. ProWiki, being based on UseMod, uses Perl regex for its parsing. That mostly works fine, but I can't figure out how to get it to work recursively. I need to find the *matching* {? ?} pairs, extracting as plaintext any pairs that might be contained inside them. (The Perl code itself will then deal with the recursion into the plaintext subexpression.)
Can this be done straightforwardly in regex? It seems like a fairly common problem -- it's basically a fancy variant of parenthesis matching -- but I'm not hip enough to regex to see the answer. It's not simply a matter of matching first and last delimiters in the string, since a given page might contain several unrelated top-level expressions; therefore, I need to find the genuinely *matching* delimiters.
I know there are a bunch of Perl gurus out there, so if you can outline the solution to me (even the solution to the basic parenthesis-matching problem would probably show me how to do it), I'd be grateful. Thanks...
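For reference, here is a minimal sketch of one way the matching described above might be done. It does not use one large regex, because classic regular expressions cannot track nesting depth on their own. Instead it uses a small regex to find each delimiter and keeps a depth counter. The function name `extract_top_level_queries` is hypothetical and is not part of ProWiki or UseMod. The sketch assumes that `{?` and `?}` never appear in page text except as query delimiters.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ProWiki or UseMod): return the body of
# every top-level {? ... ?} expression in $text. Nested expressions are
# left inside the body as plain text, so the caller can recurse into them.
sub extract_top_level_queries {
    my ($text) = @_;
    my @bodies;
    my $depth = 0;   # number of {? delimiters currently open
    my $start;       # offset just past the outermost open {?

    # In scalar context, /g resumes from pos($text) on each iteration,
    # so this visits the delimiters one at a time, left to right.
    while ($text =~ /(\{\?|\?\})/g) {
        if ($1 eq '{?') {
            $start = pos($text) if $depth == 0;
            $depth++;
        }
        elsif ($depth > 0) {
            $depth--;
            if ($depth == 0) {
                my $end = pos($text) - 2;   # offset of the closing ?}
                push @bodies, substr($text, $start, $end - $start);
            }
        }
        # A stray ?} at depth 0 is silently ignored in this sketch;
        # real code would probably want to report it.
    }
    return @bodies;
}

# Usage: two unrelated top-level expressions on one page, the first of
# which contains a nested expression.
my $page = 'Intro {?~Faction : %%Name%% -- '
         . '{?~Character && Faction==%%PAGENAME%% : %%Name%% ?} ?} '
         . 'and {?~Item : %%Name%% ?}';

for my $body (extract_top_level_queries($page)) {
    print "top-level: [$body]\n";
    # Recursing into the plaintext body finds the nested expression, if any.
    print "  nested: [$_]\n" for extract_top_level_queries($body);
}
```

Because each top-level body is returned as plain text, the surrounding Perl code can handle the recursion itself by calling the same function on the body, which is the division of labor the post asks for. An unclosed `{?` simply produces no match here, and whether that should be an error is left open.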
(no subject)
Date: 2006-12-30 09:24 pm (UTC)
> there is no *formal* relationship between the Faction objects and the Faction property of the Character objects that happen to point to them
Why not? I would think that asserting all object/property names within a namespace/wikispace must be unique (except some special built-in keywords like "name" and "number" and "length" and "type", perhaps), so that a reference to a class as a property means the property is a foreign key to the object. I'm trying to imagine a case where that is bad, and not coming up with something, though that may be an insufficiency of caffeine on my part.
(no subject)
Date: 2006-12-31 01:20 am (UTC)
> I would think that asserting all object/property names within a namespace/wikispace must be unique
I'll think about it. The language geek in me somewhat rebels at the notion of having type relationships be implicit in the name like that, but I can't deny that there's a strong ease-of-use argument for it, and I'm not sure there are any compelling use cases against it...