Work III: The Braid
Jan. 2nd, 2004 06:14 pm
Okay, let's round this out with the task that is arguably the largest, as well as the one I'm least likely to actually do. Still, it's lurking at the back of my mind, so it's worth acknowledging.
First, objects need to be properly classable. The best model I've found is the MOO one. Anyone can create new object classes, and those classes can be instantiated more or less freely. It's a prototype-based object-oriented environment, which I've found works extremely well when building a simulated environment. We independently developed something quite similar at Looking Glass for the Dark Engine, used in the games Thief and System Shock II. Using prototype-based objects, plus a flexible class-based object-relationship system, you can achieve extraordinary simulations pretty easily.
Second, objects need to be controllable. If I create a class, that does not necessarily mean that you can create an object of that class. By putting this limitation directly into the object engine, we permit economies to arise, which otherwise would be more or less impossible. It should be possible to create unlimited classes, which anyone can instantiate at will -- there's no reason to build limitations in at the architectural level. But the world becomes much more interesting if the creators of an object class can control the usage of that class if they so wish.
Third, we need to be able to establish trust in our objects. We need to be able to guarantee that they are unique, and have clear mechanisms for establishing true relationships between those objects that we can have some faith in. This is a fairly complex problem, still in the research arena at this point. But I commend the documentation on the E Language for a lot of very good thoughts on it. (I regard E as a fairly hideous language syntactically -- I just find it inaesthetic. But many of its ideas are really quite innovative and clever, and worth adopting.)
Fourth, objects should be fully programmable. This obviously relates closely to the above points. And for *heaven's* sake, they shouldn't be programmed using a crap language like Javascript. There are a lot of good languages out there, that have reasonably complete semantics while still being fairly simple. At Trenza, our plan was to create a language that combined the semantics of E and Scheme, with a Javascript-like syntax layered on top of it. (In reality, the syntax is the least important part of a language. But most people nowadays go into conniptions if you try to get them to program in a language without curly braces.)
The objective here is to enable the users to do whatever the heck they want in this world. I can't pretend to have the foggiest notion of what folks would do with these tools. That's the joy of it: throwing it open and letting people play.
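As a rough illustration of the first two points (prototype-based objects, plus creators being able to restrict who may instantiate them), here is a minimal Python sketch. It is not MOO's actual object model or anything Trenza built; the `Proto` class, its `open_to_all` flag, and the `grant` method are hypothetical names invented for this example.

```python
class Proto:
    """A prototype object: properties are looked up on the object itself,
    then on its parent, then on the parent's parent, and so on."""

    def __init__(self, name, owner, parent=None, open_to_all=True, **props):
        self.name = name
        self.owner = owner
        self.parent = parent
        # Hypothetical control knob: may anyone instantiate this prototype?
        self.open_to_all = open_to_all
        self.allowed = {owner}
        self.props = dict(props)

    def get(self, key):
        """Walk up the prototype chain until the property is found."""
        obj = self
        while obj is not None:
            if key in obj.props:
                return obj.props[key]
            obj = obj.parent
        raise KeyError(key)

    def grant(self, user):
        """The creator lets a specific user instantiate this prototype."""
        self.allowed.add(user)

    def instantiate(self, user, name, **overrides):
        """Create a child object, if the user is permitted to."""
        if not (self.open_to_all or user in self.allowed):
            raise PermissionError(f"{user} may not instantiate {self.name}")
        return Proto(name, owner=user, parent=self, **overrides)


# An unrestricted prototype: anyone can make chairs.
chair = Proto("chair", owner="alice", seats=1, color="brown")
red_chair = chair.instantiate("bob", "red chair", color="red")

# A restricted prototype: only people alice approves can make thrones.
throne = Proto("throne", owner="alice", parent=chair,
               open_to_all=False, color="gold")
try:
    throne.instantiate("bob", "bob's throne")
    denied = False
except PermissionError:
    denied = True

throne.grant("bob")
bobs_throne = throne.instantiate("bob", "bob's throne")
```

The point of putting the permission check inside `instantiate` is that scarcity becomes something a creator opts into per prototype, rather than something the architecture imposes everywhere.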
First, it needs to be built on entirely open standards. I'm entirely comfortable with having a reference implementation (preferably an open-source one), but folks should be able to reimplement it from scratch. That means open standards for the geometry, the communication protocols, the languages, and so on.
Second, those standards can't suck. This isn't a flip comment: VRML *does* suck, as it turns out. We were babes in the woods while we were designing it, and managed to come up with a format that, while powerful, was also unbelievably bulky, and pretty much behind the times in the 3D graphics world. The geometry formats, in particular, need to be highly optimized, since we're talking a world that is high-bandwidth at best.
Third, it needs tools that are pitifully easy to use. Ideally, the geometric formats will be complex enough that a skilled 3D designer can build spaces that are massively cool. But the ordinary schmuck with no design skill (like me) should be able to create simple spaces that aren't utterly identikit, and easily customize them with objects.
Fourth, the processing of this world probably needs to be massively distributed. This was always my qualm with Trenza: it was centralizing a lot of powerful processing. It makes a lot more sense to have an overall architecture much like that of the Web. Anyone should be able to run a server for their own spaces, or put their space on the server of a service. Absolutely nothing can be centralized here, or it'll prove to be a bottleneck. Ideally, every client would also potentially be a server. At the moment, the big ISPs (especially the cable companies) are doing everything they can to put the thumbscrews on end-user servers, so this architecture can't be assumed. But it would be nice.
So is this thing actually going to happen? I don't actually know. I'm still passionate about the concepts, but I'm not sure that I'm quite passionate enough to create and run an open-source project of this magnitude. Really, the biggest problem is that it's just a little too much like work. I love programming, but I get eight hours a day of it as is. If I wasn't programming for a living, I suspect that I'd throw myself into this whole-heartedly. As it is, I dunno.
But I do know this: the ideas aren't going away. I've been architecting this project in my head for fully ten years now -- it began to gel way back in 1994 with VRML, and I don't see it seeking an exit from my brain any time in the foreseeable future. I keep hoping that someone else will decide to take this project seriously, and I can simply contribute architecture, ideas and some code. If not, who knows. I can imagine myself finally getting fed up with it ten years from now, and taking it on just to get it out of my head and into code...
Introduction
Okay, I have to admit it: the idea of an immersive virtual space has always attracted me. Despite having actually read less cyberpunk than most people, I find the idea intuitively obvious, not to mention useful.
I know enough about memory and mnemonics that I really suspect that a well-done cyberspace would be easier to make one's way around in than the web. Those of you who have poked around my sadly neglected homepage know that I've been following spatial metaphors for many, many years. One of the most ancient mnemonic secrets is that it's much easier (for most people) to remember things if you place them spatially. The text version of Chez Coeur (which is still pretty similar to my original layout circa 1992) was always intended to be a stopgap until good online 3D tools existed to do it right.
The problem is, as far as I can tell, those tools still don't exist. Originally, I expected other people to create them -- I figured it was obvious. From time to time, I've worked on it myself. I was one of the original founders of the VRML project, which sought to build the standards for the 3D Web; however, that got mired in conflicting interests and agendas, and never really took the problem of creating a unified cyberspace at all seriously. (That generated some very good ideas, though: more on this later.) Two jobs ago, I was the client lead at Trenza, an overweeningly ambitious dotcom that sought to build a social 3D environment on top of the Web. (That also generated some key ideas, more on which later.)
Over the years, I've developed more and more ideas and opinions about this immersive 3D world, which I've wound up nicknaming The Braid. (I was originally planning on using that name for Trenza's version of it, but as it happens it's wound up applied to the concept in general.) Here are some of the more interesting elements that have floated into the project over time. They aren't necessarily all critical -- in some cases I may be wildly offbase that they're even good ideas -- but I think they're all probably necessary to really make the thing hum.
One important thing to understand going into this: I'm something of a radical when it comes to cyberspace. In my opinion, most previous attempts have been hamstrung by an implicit assumption that cyberspace should be as much as possible like the real world. I think that's foolish: we already have a real world, and while it works fairly well, it has serious limitations. The interesting question is, how can we build something that works far better than the real world?
Portals, and the Problem of Real Estate
VRML is the "Virtual Reality Modeling Language" (the acronym actually came first; I added the name behind it). It was created in the heady early days of the Web, back in 1994. It was one of those "steam engine time" projects -- the idea got proposed, and took off like wildfire when it turned out that loads of people were already thinking, "hey, wouldn't it be cool if we could build 3D worlds on the Web?". Within days of the initial proposal, there were hundreds of people on the list -- an amazingly fast growth rate back then.
Problem is, there wasn't much agreement on what this would be for. Some folks were mainly in it for scientific visualization; others wanted to be able to show off discrete spaces. I turned out to be in a relatively small minority in wanting to use this to create a connected cyberspace: a single continuous world that people could wander through.
Really, there were two of us who were focused on this problem, myself and Mark Pesce. Mark was known in those days as "the prophet of VRML" -- he was the one touring around, lecturing on the possibilities and so on. He and I both attacked the problem of the connected world, and came to radically different conclusions about how to do it. His approach was known as The Cyberspace Protocol: it was vaguely DNS-like in its concepts, and I always found it horribly confusing. My approach was different, and pretty damned radical, because it was designed to deal with a problem that most 3D worlds have largely ignored: real estate.
Consider: in the real world, real estate is a pain in the ass. If you have a "downtown" -- a desirable place that folks want to be close to -- then there is only a limited amount of land near to it. That finite land area has all sorts of economic impact: people wind up having to pay real money to be near the good spots, and those who can't afford it are out of luck. It's basic economics, due to the fact that you have a finite resource that is in more demand than you have supply.
My question is this: given that "land area" is a completely artificial construct in cyberspace, why does everyone insist upon applying real-world logic, with its resulting problems, to it?
Portals were my solution to the problem. I proposed them back in 1994, and I still think they're the right way to go. A full description can be found in the archives of the VRML mailing list, but the key concept is that Real Estate is a problem only if you assume that cyberspace is globally consistent: that every point in cyberspace maps into a single, coherent, global spatial map. But such a map is largely useless, because that isn't really how people think on a day-to-day basis. What folks really need is local consistency: paths that will always take them from A to B. So long as one can give directions, you don't really need the ability to show a top-level map of the "world". (Indeed, as far as I can tell, most people think innately in terms of directions, not in terms of maps. Most people don't do well with compass-based dead reckoning in the real world.)
So the solution is Portals, which are essentially threespace hyperlinks. A given space declares its "entrances": 2D planes that correspond to anchors in HTML. It also declares "exits", which are also 2D planes, and which lead to entrances in other spaces. Together, these things are called Portals. If a pair of Portals are mutually linked -- each one is both an entrance and an exit pointing to the other -- then the spaces are simply joined together. But it isn't necessarily so: a given entrance can have many exits pointing to it.
So for example, if I have built a nightclub that people like, I can simply have an entrance at the front of the club. Folks from *anywhere* can declare exits that lead to it. If you want your back door to lead to my nightclub, you can do so. Everything is right next door to anything it wants to be next door to.
There are a bunch of subtleties involved, of course. In order for this to make intuitive sense, you need a concept of "binding" -- if I walk from A to B through a Portal, I should be able to walk back to A again through the same portal. You need to be able to pass bindings on to others, so that people can follow each other through their bindings. Etc; see the above-referenced article for more thoughts. (And even that doesn't cover all the ramifications I've thought of over the years. Feel free to chat about this.)
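To make the entrance/exit/binding relationship concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions, not the design from the mailing-list proposal: the `Entrance`, `Exit` and `Traveler` names are hypothetical, spaces are reduced to plain strings, and a binding is modeled as nothing more than "remember which exit I used to reach this entrance".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entrance:
    """A declared way into a space (loosely like an HTML anchor)."""
    space: str
    name: str


@dataclass(frozen=True)
class Exit:
    """A way out of one space that leads to an entrance in another."""
    space: str
    name: str
    target: Entrance


@dataclass
class Traveler:
    name: str
    location: str
    # Binding: for each entrance arrived through, the exit that was used,
    # so walking back through it returns the traveler to where they came from.
    bindings: dict = field(default_factory=dict)

    def go_through(self, exit_):
        if self.location != exit_.space:
            raise ValueError("that exit is not in the current space")
        self.bindings[exit_.target] = exit_
        self.location = exit_.target.space

    def go_back(self, entrance):
        if self.location != entrance.space:
            raise ValueError("that entrance is not in the current space")
        self.location = self.bindings[entrance].space


club_door = Entrance(space="nightclub", name="front door")

# Many unrelated spaces can declare exits that lead to the same entrance.
alice_back_door = Exit("alice_house", "back door", club_door)
bob_closet = Exit("bob_office", "closet", club_door)

alice = Traveler("alice", "alice_house")
bob = Traveler("bob", "bob_office")
alice.go_through(alice_back_door)
bob.go_through(bob_closet)
both_in_club = (alice.location, bob.location)

# Each walks back out the same club door, but ends up somewhere different.
alice.go_back(club_door)
bob.go_back(club_door)
```

Because the binding lives with the traveler rather than with the space, the same front door leads two people back to two different places, which is the locally consistent, globally inconsistent behavior described above. Passing a binding to someone else would amount to copying one entry of `bindings` to another traveler.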
I'll be the first to admit that this scheme has some odd qualities; it doesn't map perfectly to the real world, so there will be some unintuitive aspects. But it results in a world that is non-Euclidean in all the right ways, I believe, avoiding the real-world problems of real-space while preserving enough to be easy to use. It allows each person building a space to hook it to whatever else they feel is appropriate, without having to worry about scarcity of Real Estate, and the resulting economic complications.
Sadly, this idea is still largely untested. One VRML company did implement the Portals concept, but never actually got to market. (Indeed, I found out that they'd implemented my ideas entirely by accident, when I was evaluating implementations for our Windhaven educational-MUD project a couple of years later.)
A curious coda to all this: around a year or so after I proposed this, a new technology emerged in the 3D graphics world. It is called "portalization", and is essentially a limited version of my Portals design -- it is exactly Portals, but with the assumption that every entrance/exit is paired with exactly one other. While it has its limitations, it has become well-established as one of the easier techniques for optimizing 3D. To this day, I haven't figured out where portalization came from, and whether it was inspired by my proposal. Bringing this all full circle, it turns out that the very best of the freeware 3D renderers -- Crystalspace -- is portalization-based. So it appears that actually implementing Portals in freeware wouldn't be all that hard at this point.
Friends, Crowds and Dynamic Winnowing
Okay, so much for the static world. The next problem is an important one, which the VRML project largely ducked for years: how do you make this place social?
I mean, a 3D space is really lonely if you're the only person there. People have observed this about the Web: one of the things that makes it a less compelling experience is that you experience it alone. Various systems have been built to deal with this, but they've generally been commercial enterprises, and thus very fragmented. The Web, and the Braid, are only going to become social experiences if they're built on a common standards-based platform that everyone can buy into.
This has some implications. It means that we need to be able to build social servers, which operate in parallel to the space servers. Whereas the latter serves out the static descriptions of the world (the geometry and objects in it), the former distributes the dynamic information. At the least, this includes information about the other people who are present: their avatars, their positions, and so on. More generally, it might include all of the dynamic objects in the world -- ideally, all objects will be moveable, so *something* has to be keeping track of their locations.
So say we've got a social world, where you can see the other people who are wandering around in it. If I go to a club, I can see all the other people in the club. This brings up another problem: what do we do about crowds?
This isn't a new issue -- programmers have been dealing with crowd control for as long as they've been building 3D multiuser systems. The problem is this: say that you have a club that will hold 300 people comfortably. Now say that something major is happening, and word gets around. 5000 people try to get into the club. How do you deal?
In the real world, of course, this is similar to the Real Estate problem above, and the solution is similar: again, it's essentially an economic problem. Either you make it very expensive, so that only the richest 300 can get in, or you make it first-come-first-serve, so the first 300 get in. But the same question deserves asking as in the Real Estate problem: why should we build our cyberworld so that it artificially preserves the problems of the real world?
Now, most massively multiuser spaces *do* actually address this problem nowadays, using a technique I call "static winnowing" (for historical reasons from Trenza, mostly). Basically, they replicate the club over and over again. The first 300 people arrive, and go into the club. When the 301st arrives, a new *copy* of the club is created, and they go into that. It looks exactly the same, but it has different people in it. This happens over and over again, with new copies being created as needed. The exact details vary, but the high concept is pretty consistent: you aren't in *the* club, you're in a copy of the club.
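A bare-bones Python sketch of that replication scheme follows. Real systems differ in the details; the `InstancedRoom` class and its first-fit placement rule are assumptions made for the example.

```python
class InstancedRoom:
    """Static winnowing: when a copy of the room fills up, open another."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.instances = [[]]  # each instance is a list of occupants

    def enter(self, person):
        """Place the person in the first copy with space; return its index."""
        for index, occupants in enumerate(self.instances):
            if len(occupants) < self.capacity:
                occupants.append(person)
                return index
        # Every existing copy is full, so create a new one.
        self.instances.append([person])
        return len(self.instances) - 1


club = InstancedRoom("Club Foo", capacity=300)
assignments = {f"guest{i}": club.enter(f"guest{i}") for i in range(301)}
```

Here the first 300 guests land in copy 0 and the 301st lands in copy 1, so two friends who arrive on either side of that boundary never see each other.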
But this has its own problems. In particular, it's socially something of a hack. Say that I want to meet my friends at the club. I can't just say "meet me at Club Foo" -- I have to find out which *instance* I'm going to be in, and meet them there. And there's no good way to simply run into friends there -- if I happen to be in a different instance than my friend, even if we're standing in the same location for half the night, we won't see each other.
My proposed solution for this is what I call dynamic winnowing. It goes kinda like this.
When we get into a crowded situation, we don't create replicas of the space. Everyone is in the same room. However, much as in the real world, you can't necessarily *see* everyone in that room. That's fairly straightforward, and very much like what happens in static winnowing. What's different in dynamic winnowing is that you aren't necessarily seeing the same people as everyone else around you -- visibility is based on relationships, not just on location.
For example, say that there are 5000 people in the room. Three of them are friends of mine -- I've declared them to be friends. One is someone I don't like, who I have specifically blocked. Another ten are acquaintances: people who I've been in conversations with, but haven't specifically friended. The club is configured so that I can see 200 people: the owners of the club like it crowded. Of those 200, all three of my friends will be included in the people who I see, as will the lower-priority acquaintances. The rest are chosen randomly by the system, but the person I don't like is specifically excluded.
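The selection in that example can be sketched in a few lines of Python. This is only one plausible reading of the rules in the paragraph above (friends first, then acquaintances, then random filler, never anyone blocked); the `visible_people` function and its parameters are hypothetical, and it ignores conversations and position entirely.

```python
import random


def visible_people(viewer, present, friends, acquaintances, blocked,
                   limit, rng=None):
    """Pick up to `limit` people for `viewer` to see, by relationship."""
    rng = rng or random.Random()
    # Never show the viewer themselves, or anyone they have blocked.
    candidates = [p for p in present if p != viewer and p not in blocked]

    # Friends get the highest priority, acquaintances the next.
    chosen = [p for p in candidates if p in friends]
    chosen += [p for p in candidates
               if p in acquaintances and p not in friends]
    chosen = chosen[:limit]

    # Fill whatever slots remain with randomly chosen strangers.
    already = set(chosen)
    rest = [p for p in candidates if p not in already]
    slots = limit - len(chosen)
    chosen += rng.sample(rest, min(slots, len(rest)))
    return chosen


present = [f"p{i}" for i in range(5000)]
friends = {"p1", "p2", "p3"}
blocked = {"p4"}
acquaintances = {f"p{i}" for i in range(5, 15)}

seen = visible_people("p0", present, friends, acquaintances, blocked,
                      limit=200, rng=random.Random(0))
```

Running this for each viewer gives every person their own list, which is exactly why two people standing side by side can see different crowds.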
Okay, yes -- this has all kinds of complications. To make it work socially, you need a formal concept of conversations -- everyone involved in a conversation has to be able to see everyone else. There are obvious physics problems, since people's locations can overlap with each other -- I might be in exactly the same location as someone else who I can't see, but you can see both of us, implying that we have to fudge locations to a fair degree. This isn't a world designed for first-person shooters, where precise location is everything. But it isn't meant to be: this is a world for socializing, not shooting.
Would it work? I have no damned idea. It's definitely harder in the distributed world that I envision for the Braid -- I originally designed the dynamic winnowing concept at Trenza, which was strictly client/server based, which makes the problem much easier. (Indeed, I wrote a patent, which fortunately died on the vine when Trenza went down, describing the whole process and how to make it scale on the server side.) It's definitely a controversial concept, far moreso than Portals: I cannot say with confidence that it's possible to build a world this way and have it actually make sense. But damn, I'd like to see it tried: combining the social-grouping concepts of IM into a greater social space has huge unexplored potential.
(Note, BTW, that the problems are strictly related to the 3D side of things. I've contemplated building a chat engine built around dynamic winnowing. I'm quite sure that these ideas work *quite* well for generic text chatting, and have real potential to make something like IRC scalable to truly large numbers of users.)
Scripting, Trust and Language Underpinnings
Next problem: objects. For this world to really be fun and interesting, it can't just be a static space of people wandering around. There have to be things in it, and it has to be fairly straightforward to add new ones. I don't have a single coherent design for objects yet, but I do have a number of elements that need to go into this.
First, objects need to be properly classable. The best model I've found is the MOO one. Anyone can create new object classes, and those classes can be instantiated more or less freely. It's a prototype-based object-oriented environment, which I've found works extremely well when building a simulated environment. We independently developed something quite similar at Looking Glass for the Dark Engine, used in the games Thief and System Shock II. Using prototype-based objects, plus a flexible class-based object-relationship system, you can achieve extraordinary simulations pretty easily.
Second, objects need to be controllable. If I create a class, that does not necessarily mean that you can create an object of that class. By putting this limitation directly into the object engine, we permit economies to arise, which otherwise would be more or less impossible. It should be possible to create unlimited classes, which anyone can instantiate at will -- there's no reason to build limitations in at the architectural level. But the world becomes much more interesting if the creators of an object class can control the usage of that class if they so wish.
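To make the first two points concrete, here is a toy Python sketch of prototype-style objects whose creators can restrict instantiation. Everything in it (the `Proto` class, the `open_to_all` flag, the `licensed` set) is a hypothetical illustration; it is not the MOO or Dark Engine object model, just the general shape of the idea.

```python
class Proto:
    """A prototype object: properties not set locally are looked up on the parent."""

    def __init__(self, name, owner, parent=None, open_to_all=True):
        self.name = name
        self.owner = owner
        self.parent = parent
        self.open_to_all = open_to_all  # may anyone instantiate this?
        self.licensed = set()           # users the owner has granted access to
        self.props = {}

    def get(self, key):
        # Walk up the prototype chain until some ancestor defines the property.
        obj = self
        while obj is not None:
            if key in obj.props:
                return obj.props[key]
            obj = obj.parent
        raise KeyError(key)

    def may_instantiate(self, user):
        return self.open_to_all or user == self.owner or user in self.licensed

    def instantiate(self, user, name):
        # The check lives in the object engine itself, not in each script.
        if not self.may_instantiate(user):
            raise PermissionError(f"{user} may not instantiate {self.name}")
        return Proto(name, owner=user, parent=self)


sword = Proto("sword", owner="alice", open_to_all=False)
sword.props["damage"] = 5

mine = sword.instantiate("alice", "alice's sword")
print(mine.get("damage"))  # 5, inherited from the prototype

sword.licensed.add("bob")  # e.g. after bob has paid alice for the right
bobs = sword.instantiate("bob", "bob's sword")
```

Leaving `open_to_all` at its default gives the unlimited, freely instantiable case; the restriction only applies when the creator asks for it.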
Third, we need to be able to establish trust in our objects. We need to be able to guarantee that they are unique, and have clear mechanisms for establishing true relationships between those objects that we can have some faith in. This is a fairly complex problem, still in the research arena at this point. But I commend the documentation on the E Language for a lot of very good thoughts on it. (I regard E as a fairly hideous language syntactically -- I just find it inaesthetic. But many of its ideas are really quite innovative and clever, and worth adopting.)
Fourth, objects should be fully programmable. This obviously relates closely to the above points. And for *heaven's* sake, they shouldn't be programmed using a crap language like Javascript. There are a lot of good languages out there, that have reasonably complete semantics while still being fairly simple. At Trenza, our plan was to create a language that combined the semantics of E and Scheme, with a Javascript-like syntax layered on top of it. (In reality, the syntax is the least important part of a language. But most people nowadays go into conniptions if you try to get them to program in a language without curly braces.)
The objective here is to enable the users to do whatever the heck they want in this world. I can't pretend to have the foggiest notion of what folks would do with these tools. That's the joy of it: throwing it open and letting people play.
Ease of Use and Ubiquity
Finally, it's important that this thing be easy to use, and ubiquitous. This has a few ramifications.
First, it needs to be built on entirely open standards. I'm entirely comfortable with having a reference implementation (preferably an open-source one), but folks should be able to reimplement it from scratch. That means open standards for the geometry, the communication protocols, the languages, and so on.
Second, those standards can't suck. This isn't a flip comment: VRML *does* suck, as it turns out. We were babes in the woods while we were designing it, and managed to come up with a format that, while powerful, was also unbelievably bulky, and pretty much behind the times in the 3D graphics world. The geometry formats, in particular, need to be highly optimized, since we're talking a world that is high-bandwidth at best.
Third, it needs tools that are pitifully easy to use. Ideally, the geometric formats will be complex enough that a skilled 3D designer can build spaces that are massively cool. But the ordinary schmuck with no design skill (like me) should be able to create simple spaces that aren't utterly identikit, and easily customize them with objects.
Fourth, the processing of this world probably needs to be massively distributed. This was always my qualm with Trenza: it was centralizing a lot of powerful processing. It makes a lot more sense to have an overall architecture much like that of the Web. Anyone should be able to run a server for their own spaces, or put their space on the server of a service. Absolutely nothing can be centralized here, or it'll prove to be a bottleneck. Ideally, every client would also potentially be a server. At the moment, the big ISPs (especially the cable companies) are doing everything they can to put the thumbscrews on end-user servers, so this architecture can't be assumed. But it would be nice.
Conclusions
The educated reader will realize that, long though the above is, it's just scratching the surface of the problem. Building a *good* immersive, distributed 3D world is an enormous task -- just getting the design right is huge. But it's one of the most interesting architectural problems I've ever come across, and I find that I keep coming back to it.
So is this thing actually going to happen? I don't actually know. I'm still passionate about the concepts, but I'm not sure that I'm quite passionate enough to create and run an open-source project of this magnitude. Really, the biggest problem is that it's just a little too much like work. I love programming, but I get eight hours a day of it as is. If I wasn't programming for a living, I suspect that I'd throw myself into this whole-heartedly. As it is, I dunno.
But I do know this: the ideas aren't going away. I've been architecting this project in my head for fully ten years now -- it began to gel way back in 1994 with VRML, and I don't see it seeking an exit from my brain any time in the foreseeable future. I keep hoping that someone else will decide to take this project seriously, and I can simply contribute architecture, ideas and some code. If not, who knows. I can imagine myself finally getting fed up with it ten years from now, and taking it on just to get it out of my head and into code...
(no subject)
Date: 2004-01-02 09:41 pm (UTC)
I've heard for a while about how "useful" an immersive 3-D interface will be, and in general I'm not convinced (specialized tasks such as modelling molecules, surgical telepresence, specialized statistical representations, yes; but not general computer use). But for social application of a 3-D interface, I guess I can see there being some merit there. I'm not entirely convinced yet but I'll say that it does not sound silly.
But the notion of cyberspace as something that has a feeling of "space" to it does intrigue me, graphically representational interface or not. Not so much because of reading cyberpunk fiction but because I've experienced that and found it fascinating to be inside of. And my interface at the time was a VT-52 terminal and a 2400 baud modem.
Unsurprisingly in light of what you've written, realtime social interaction within the "space" was one element that drew me into the "experience of space". But what you may find interesting was that I was most conscious of the spatial thinking when I was in more than one "location" at the same time! (Then again, this agrees with your "why artificially replicate the limitations of the real world" argument. Bilocation is easier in cyberspace than in meatspace.)
Participating in a MUD wasn't enough, nor the "channels/rooms" metaphor of IRC; but when I was simultaneously (using the 'screen' program that gave me ten virtual terminals on my VT-52) in the one room of the MUD in one virtual screen, on two channels plus a private chat on IRC on a second, in a 'talk' session on a third, engaging in a rapid-fire email conversation on a fourth, using a 'tail -f' on a fifth screen to note when the next email message had arrived, ignoring a newsreader on a sixth screen, and issuing 'nslookup', telnet, ftp, and 'traceroute' commands in the seventh and eighth virtual screens trying to solve somebody else's urgent problem that they were chatting with me in, uh, I think the IRC screen ... that was when I was most keenly aware of a sense of "location" (along with my presence in several locations at once) and felt the most immersed in The Net that I have ever felt. Although all I saw was text, text, and text, I felt as though I were 'seeing' each of the virtual spaces where I was interacting with people or machines.
I didn't feel like I was flitting from space to space as I switched virtual screens. I felt like I was in multiple "places" at once, and when I noticed this, I was fascinated.
I'm not sure what it'll take to make me feel "present" in the world you propose, but I suspect it'll take some sense of importance-of-action, and location-of-action (I'm not quite sure what I mean by that and am trying to find a better way to phrase it), to make it feel like more than just interesting window-dressing on a chat facility. And if I do get drawn in to the point that I feel I'm "in a location" (or locations) there, it'll probably be addictive.
As long as we're stepping beyond real-world limitations, I want teleportation (save me from retracing the steps once I know the address) and telepathy (don't make me trek all over town to find someone I need to whisper a short message to). And, of course, bi- and tri-location. :-) (Even if I have to put my left hand on one keyboard and my right hand on another -- yes I can type that way -- I'll find a way to be in two places at once anyhow. Might as well make it easier for me.)
I do find your idea of dynamic winnowing very interesting. On the one hand I'd feel I was missing some opportunity for a fascinating random encounter; more realistically, if the space is too crowded, I'm not likely to notice or manage to meaningfully interact with most of the potential random new acquaintances anyhow, so we may as well make it more manageable in a way that ensures "my community" will be visible to me. (And yes, I see what you mean about using that on IRC, though it's been ages since I fired up an IRC client.)
(no subject)
Date: 2004-01-02 11:51 pm (UTC)
"I might be in exactly the same location as someone else who I can't see, but you can see both of us, implying that we have to fudge locations to a fair degree. This isn't a world designed for first-person shooters, where precise location is everything."
Dynamic winnowing is an excellent notion. The above statement feels limiting, though. I'll try to think through why, but my brain is fuzzy, so I may not make much sense at the moment.
One thing that the Web has shown is that even if you say "it's not about precise control of appearance/layout", people will want it to be. (I'd also be unsurprised if some people did want to build first-person shooter areas.)
Between those factors and the complication introduced of trying to manage fudged positions, might it be easier to try and come up with some sort of interface that allows for and/or represents overlap instead of trying to avoid it utterly?
(Hmm - it occurs to me that you can think of this problem as representing a fourspace in threespace, sort of - two people may be overlapping in a 3D environment, but if they're offset in a 4th dimension, they're not occupying the same space. I probably think of this sheerly because I built a "move around in a fourspace" 3D visualization program back in college.
It does raise the thought, though: If you're going to break the "there is a universal, realspace-comparable map in which only one object can occupy a given space at a given time" paradigm for the world at large, why keep the same limitation for the world at small? Admittedly, you've got to figure a way around it, which is much trickier in personal interaction than in travel...but I have the nagging feeling that there's some lateral-thinking means of representing it that wouldn't be too counterintuitive and could make it work.)
Yeah, I'm rambling. Time to go to bed now...
(no subject)
Date: 2004-01-03 11:26 am (UTC)
This can be rendered with some simple convention like when two people overlap, their avatars become a vague grey outline of a person.
But if you have the processing power, I vote you average their appearances. :)
If someone wants to build a FPS game area, very well. They have the choice: a bullet which hits an overlap avatar can have its damage spread proportionally, or everybody in the overlap can take full damage.
Heck, you can leave it up to the author of each space to choose an "index of overlap" -- how much two things have to overlap before they meld.
Another option I'll call Astralism. Posit there are N planes of reality. Objects (including users) have a property which indicates which planes they exist in; users also have a property which indicates which plane they observe. Let's say 10 planes for example's sake, though in reality you might want more like 100. Everything defaults to existing on plane 1. A building (a "set") might typically exist on planes 1-9.
If I go into a crowded cafe looking for my friends, and it's so crowded I can't find anyone for the overlap, I "project" my avatar to some other plane than the first -- ideally one me and my friends have agreed upon. That is, I set my avatar to be on both plane 1 and plane X, but to observe only on plane X. My avatar is still present to all those people on the first plane, still contributing to the big crowd. But I'm not seeing them. I'm only seeing whoever else is on plane X.
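A minimal Python sketch of the plane bookkeeping described so far, assuming each entity carries a set of planes it exists on and a single plane it observes. The `Entity`, `visible_to`, and `project` names are invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    exists_on: set = field(default_factory=lambda: {1})  # everything defaults to plane 1
    observes: int = 1                                     # the one plane this user watches


def visible_to(observer, entities):
    """Names of everything that exists on the plane the observer is watching."""
    return [e.name for e in entities
            if e is not observer and observer.observes in e.exists_on]


def project(entity, plane):
    """Stay present on the current planes, also exist on `plane`, and watch only `plane`."""
    entity.exists_on.add(plane)
    entity.observes = plane


cafe = Entity("cafe", exists_on=set(range(1, 10)))  # a "set" existing on planes 1-9
crowd = [Entity(f"stranger{i}") for i in range(3)]
me, friend = Entity("me"), Entity("friend")
project(me, 7)      # plane 7 stands in for the agreed-upon plane X
project(friend, 7)

world = [cafe, *crowd, me, friend]
print(visible_to(me, world))        # ['cafe', 'friend']
print(visible_to(crowd[0], world))  # the plane-1 crowd still sees me and my friend
```

The asymmetry in the two printed lists is the behaviour described above: the projected avatar still contributes to the crowd on plane 1 but no longer sees it.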
Since people are jerks, and some will just set themselves to exist on as many planes as possible all the time, thereby causing more crowding, I suggest you make astral projection cost something. Use a point system. Say it takes energy. You have to rest from projection once you've exhausted your points until you've amassed enough to do it again.
Oh, and you can reserve a plane, say 10, for emergencies or anything which might need to access a locale completely free of window dressing and random passers-by.
So, with a constraint on usage, you'll get people using astral projection to find one another, but not to hang out. They'll project for comments and quips (must be user-settable on the fly which plane(s) one's voice is on!) and planning, but rarely for hanging out.
Actually, I think maybe the right solution is to have something like 1k planes. Everything is (by default at least) on plane 1. Planes 2-50 are free or trivially cheap to project to. Planes 51+ are rather less cheap.
That way, if I want to have, say, a folksinger perform in my cafe, and 200 more people show up than the space can render, I can set the folksinger+stage to be on planes 1-50. The folksinger's voice will be on 1-50, but the folksinger will only hear on 1. I'll set my space to not allow anyone else to speak on 1-20 (no other voices exist on 1-20.) People can then randomly pick other planes of existence to watch the show from. If they just want a good seat near the stage without crowds blocking their view, they find a place near the stage (maybe already thickly populated) and project into any of 2-50. If they want concert-quiet for listening, they want one of 2-20; if they want to chat with friends and don't mind other people's chat, they pick one of 21-50. If someone wants to be particularly findable/converse-with-able during the performance, they will also exist into some >50 plane; when a friend is trying to find them in the crowd, that friend can project onto the same higher plane, and the singer won't be there, the audience member will be, and they can both talk.
™
(no subject)
Date: 2004-01-03 11:42 am (UTC)E
(no subject)
Date: 2004-01-05 03:43 pm (UTC)
"But if you have the processing power, I vote you average their appearances. :)"
This strikes me as being ... I dunno, cleverly evil. In the sense that it amuses me, and I can't resist trying to imagine ways to entertainingly abuse it.
(no subject)
Date: 2004-01-05 03:54 pm (UTC)
(no subject)
Date: 2004-01-06 01:22 pm (UTC)
think innately in terms of directions, not in terms of maps. Most people don't do well with compass-based dead reckoning in the real world.)
Oh dear. I never really thought of that. I hope it doesn't mean much then that I frequently drive by my own internal compass.... taking turns not based on where I know they go, but based on the direction they're going to take me in...
Admittedly, this only works because I also have a pretty good feel for the 'ley lines of driving' known as major roads. I _know_ that if I keep driving in _this_ direction, I am _going_ to intersect with this major road, or maybe that one which will also meet _this_ one, etc. Cyberspace has no such major routes to use as reference marks, boundary lines, or vector corrections.
Ah well, at least it beats the Dirk Gently method of driving (follow someone who looks like they know where they are going) that pervades cyberspace now (search engines, link sites, reference articles. Heck, Google's algorithm is one mathematical instance of the Gently Protocol).