[personal profile] jducoeur
My main project while on Christmas break down in Florida (more notes on that anon) was to work on the OP Compiler. That's steaming forward nicely -- the Alpha Lists are now all parsing, nearly all of the Award Lists are doing so, and the Court Reports are parsing back as far as Siegfried and Wanda: there's a chance I could finish the Court Reports tomorrow.

So why am I posting after midnight? Because the bloody Fencing awards were driving me crazy, as I (correctly) realized that I probably had them grouped wrong.

The thing is, the East has had a bunch of fencing awards, including a long-since-closed one named the Guardsman, and they've gone through a lot of names, since some didn't get named immediately. So I just spent about 45 minutes correlating the Court Reports, comparing them with the Award Lists, and have come to the following conclusions:

The Order of the Guardsman can be found in the OP as the "East Kingdom Fencing Order", the "East Kingdom Fencing Award", and the "East Kingdom Order of Fence", as well as the Guardsman.

On the flip side, the Order of the Golden Rapier is found as the "High Merit for Fence", the "Kingdom Order of Fence", and simply the "Order of Fence", before things settled down on the OGR.

You see why I was tearing my hair out over this?

I *think* I have it all straight, and blessedly I don't think there is any term that is used for *both* of the awards. (Yes, I could theoretically distinguish by date, since the Guardsman was closed before the OGR started, but that goes *way* against the architecture, and would be a ton of work -- the notion of two different awards with the same name is not a pretty one. If I found conflicts, I'd actually rewrite the Court Reports instead.)
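For the curious, the grouping above boils down to a synonym table plus a check that no name is claimed by two awards. Here is a minimal sketch in Python, purely for illustration: the award names are the ones listed above, but the `ALIASES` layout and the `normalize` and `build_lookup` helpers are hypothetical and are not how the OP Compiler is actually structured.

```python
# Hypothetical synonym table: canonical award -> names seen in the OP.
# The names come from the post; the data layout is an assumption.
ALIASES = {
    "Order of the Guardsman": [
        "Guardsman",
        "East Kingdom Fencing Order",
        "East Kingdom Fencing Award",
        "East Kingdom Order of Fence",
    ],
    "Order of the Golden Rapier": [
        "OGR",
        "High Merit for Fence",
        "Kingdom Order of Fence",
        "Order of Fence",
    ],
}


def normalize(name):
    """Lower-case and collapse whitespace so trivial variations match."""
    return " ".join(name.lower().split())


def build_lookup(aliases):
    """Return (lookup, conflicts).

    lookup maps each normalized name to one canonical award.
    conflicts maps any name claimed by more than one award to the
    set of awards claiming it.
    """
    lookup = {}
    conflicts = {}
    for canonical, names in aliases.items():
        for name in [canonical, *names]:
            key = normalize(name)
            owner = lookup.setdefault(key, canonical)
            if owner != canonical:
                conflicts.setdefault(key, {owner}).add(canonical)
    return lookup, conflicts


LOOKUP, CONFLICTS = build_lookup(ALIASES)
```

Matching is exact after normalization, which matters here: "East Kingdom Order of Fence" and "Kingdom Order of Fence" belong to different awards, so any prefix or fuzzy matching would get them wrong. An empty `CONFLICTS` is the "no term is used for both" condition.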

Anyway, making progress. With any luck, before terribly long I'll be ready for folks to take the consolidated printout, and tell me about all the synonymous names you find. I suspect that the average person in the OP will turn out to have around two disconnected entries. (I only have one, but I've had the same name since long before my AoA. By contrast, Kenric has something like eight variations of his name, and Darius has at least half a dozen that don't even resemble each other...)

(no subject)

Date: 2013-01-04 01:27 pm (UTC)
From: [identity profile] dlevey.livejournal.com
Yeah, those were *interesting* times, socio-politically, in the fencing community.

(no subject)

Date: 2013-01-04 05:07 pm (UTC)
From: [identity profile] dlevey.livejournal.com
:-) Yes, that would have been, what, 1991 and 1994?

(no subject)

Date: 2013-01-04 05:47 pm (UTC)
From: [identity profile] dlevey.livejournal.com
I'm not sure what the A/C/L means. My recollection is that I was (I think) the 4th Perseus, after Peter, Danulf and... (Sebastian?). The timing was right (late April/early May) but I can't remember at which event it happened.

(no subject)

Date: 2013-01-04 06:02 pm (UTC)
From: [identity profile] dlevey.livejournal.com
Might be easy enough: I think there were others at the time; it should be with the second group awarded. The OGR should be correct; it was under Bjorn and Morgen, in the fall out at the Rod and Gun Club, and could well have been their last court.

(no subject)

Date: 2013-01-04 02:25 pm (UTC)
From: [identity profile] nomadmwe.livejournal.com
Fencers: We ruin everything.

(no subject)

Date: 2013-01-05 03:55 pm (UTC)
From: [identity profile] nomadmwe.livejournal.com
Nah. That's, like, effort and dedication.

(no subject)

Date: 2013-01-04 02:48 pm (UTC)
From: [identity profile] goldsquare.livejournal.com
I think you SHOULD do a double check by date.

Nameless orders are a pain.

(no subject)

Date: 2013-01-04 05:43 pm (UTC)
From: [identity profile] goldsquare.livejournal.com
I still don't know Scala. (I bought the Impatient book when you recommended it - but still have not had the time/will to read it. It stares balefully on my nightstand.) And, even if I did know Scala, I lack the time to fully read and digest such a complex tool.

But I still find myself a bit surprised. The end result of your parsing should contain, at a minimum, some tuple of "who what-award when".

The idea that you can't, in some post-processing step, check that certain "what" items have date constraints on them ("not before", "not after") seems unlikely to me. Surely that won't break the architecture?

Frankly, if your architecture doesn't permit some filtration of the pool of awards in some way, perhaps it could be reasonably extended. So the list of "awards" could also certainly be annotated by dates. No one could be awarded a Pelican before the Pelican was created, no one can be awarded an Athena's Thimble after it was closed. That makes the list of awards a bit dynamic: the list valid on 1/1/2045 is certainly a subset of the list of all awards EVER, and is probably (largely) identical to the list valid the day before.

That entire last paragraph MIGHT be an architectural challenge that your system does not want to withstand, of course. But the former check, as a simple post-processing step, should be trivial: give me a list of all awardings of award X, sorted by date - and the human eye can catch the out of bounds dates.
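A minimal sketch of that post-processing check, in Python purely for illustration. It assumes the parser's output can be reduced to "who, what-award, when" records; the `Awarding` and `Bounds` types, the `out_of_bounds` function, and every name and date in the sample data are hypothetical placeholders, not real OP data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Awarding:
    who: str
    award: str
    when: date


@dataclass(frozen=True)
class Bounds:
    not_before: Optional[date] = None  # award did not exist before this
    not_after: Optional[date] = None   # award was closed after this


def out_of_bounds(awardings, bounds):
    """Return awardings dated outside their award's bounds.

    Results are sorted by award and then date, so a human can scan them.
    Awards with no entry in bounds are not checked.
    """
    flagged = []
    for a in sorted(awardings, key=lambda a: (a.award, a.when)):
        b = bounds.get(a.award)
        if b is None:
            continue
        too_early = b.not_before is not None and a.when < b.not_before
        too_late = b.not_after is not None and a.when > b.not_after
        if too_early or too_late:
            flagged.append(a)
    return flagged


# Placeholder data only: "Award X" and these dates are invented.
SAMPLE_BOUNDS = {
    "Award X": Bounds(not_before=date(1990, 1, 1), not_after=date(2000, 12, 31)),
}
SAMPLE = [
    Awarding("Person A", "Award X", date(2001, 5, 1)),
    Awarding("Person B", "Award X", date(1989, 1, 1)),
    Awarding("Person C", "Award X", date(1995, 3, 3)),
]
FLAGGED = out_of_bounds(SAMPLE, SAMPLE_BOUNDS)
```

This only reports suspicious entries; it never rewrites anything, so it sits entirely outside the parsing step.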

(no subject)

Date: 2013-01-04 06:11 pm (UTC)
From: [identity profile] goldsquare.livejournal.com
How you spend your time is up to you - it is one of life's most precious commodities.

I note that you HAVE this problem, and therefore it is not a hypothetical problem. How much elbow grease this problem merits is, again, your decision to make.

It appears to me that my suggestion would make this parser a bit MORE generalizable than it is. You are trying to use context to map sometimes-ambiguous information to more restricted categories, yes? So that you can import that more canonical data into a repository tool...

Given that date is an available contextual bit of information, it seems to me that it is useful for pruning ambiguity. If in 1994 Fred gets "The Fencing Award", we know it isn't OGR. If it is in 2004, we know it is.
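Something like the following is what I have in mind, sketched in Python purely for illustration. The `resolve` function and the `CANDIDATES` table are hypothetical, and the boundary years in it are PLACEHOLDERS chosen only to fit the 1994/2004 example above; they are not the real open and close dates of either award.

```python
def resolve(name, year, candidates):
    """Return the canonical awards that `name` could mean in `year`.

    candidates maps an ambiguous name to {canonical award: (first, last)},
    where first/last are inclusive years and None means "unbounded".
    One result means the date settled it; zero or several means a human
    needs to look.
    """
    matches = []
    for award, (first, last) in candidates.get(name, {}).items():
        if (first is None or year >= first) and (last is None or year <= last):
            matches.append(award)
    return matches


# PLACEHOLDER years, not real history: picked only so that 1994 falls in
# the first window and 2004 in the second.
CANDIDATES = {
    "The Fencing Award": {
        "Order of the Guardsman": (None, 1999),
        "Order of the Golden Rapier": (2000, None),
    },
}
```

With that table, `resolve("The Fencing Award", 1994, CANDIDATES)` yields only the Guardsman and the same call with 2004 yields only the OGR; a name that is not in the table yields an empty list rather than a guess.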

If the idea I proffer is not sized to fit your resources and the scale of the problem, only you can decide that. If the decision is easy (and the answer is no), that doesn't necessarily make the idea or the offer silly...

And again - adding a simple error checking pass for dates at the END of the process, instead of within the parsing tool, is probably well worth the time to do. Judging from what you told Don about his Perseus, you are going to need to do SOME sort of date comparator anyway - to detect duplicates or near duplicates.
