Slowing the data mining
Mar. 21st, 2008 12:32 pm
I just came across an interesting article in the NY Times, talking about a proposed bill in New York to outlaw some current advertising practices. My initial reaction was, "Uh-oh", but on reading through it, I rather like it.
The thing is, one of the most fraught questions for CommYou is advertising. I'm tentatively inclined to have a *little* of it, at least for non-members, to firm up the income flow. And I really quite like Gmail's model of advertising: just a bit, based on the text of what you're reading, with an eye towards maybe being useful. I intensely dislike the advertising-all-over look of many websites -- frankly, I think it's kind of dumb and counterproductive, with the saturation-bombing by some advertisers leading me to tune them out entirely. If all you're going for is awareness it might be okay, but I rarely click on the ads. By contrast, I *do* occasionally click on Gmail ads, because they're more likely to be quirky, interesting and relevant. (Even sometimes fun.) Seeing an ad for HDMI cables when I'm *talking* about HDMI cables is far more useful to me than yet another advertisement for Microsoft.
The problem, though, is that I don't much trust the advertisement aggregators, and that's where the ads would necessarily come from. I mean, yes, Google (the likely supplier) tries to live by "Don't be evil", and I think that for the moment they're mostly sticking to that. But I have every confidence that it will slowly erode over time, and I think most of their competitors are further along the evil curve than they are.
In particular, the issue that I dislike is tracking. I mean, the data I would send them would be as anonymized as I can make it -- just the text contents, nothing personally identifying. (That probably weakens the value prop slightly, but I don't much care: the whole point of running a lean company is that I can do things like that.) But they can still get at your IP address, and that's gold to advertisers: they try to correlate every bit of data they can get their hands on. Granted, nearly every website participates in this giant data-mining operation, so it's not like CommYou would be *unusual* in it, but I can't say that I love the idea of being part of that. From a privacy viewpoint, it makes me decidedly uncomfortable.
So I like the bill that's being proposed in NY, which (according to the article) targets exactly that -- it would theoretically give you a way to get the data-mining to back off. (Whether it would succeed or not remains to be seen, but I like the principle.) Indeed, the possible consequence that I would *love* to see is a more nuanced attitude towards data retention and mining in general. If it forced Google to have the internal mechanisms to *not* retain any user data in some cases, then it wouldn't be such a big leap for them to permit it on a site-by-site basis. And I would be *much* more comfortable buying ads from Google if I could stipulate that they cannot retain any data that I send to them. Yes, it might mean I get a little less money -- but again, I can live with that.
We'll see where it all goes -- it'll be some months yet before any of this is relevant. But the tension between advertising and user privacy is one I'm going to have to be very careful about...
(no subject)
Date: 2008-03-21 05:03 pm (UTC)
What this means, as far as I know, is that Google (the 'least evil') can't realistically target ads at anything that is not publicly available on the web.
This is true for every 'standard' advertising thing that's out there: anything else requires a 'signed contract' kinda deal with the advertisers. That would change the 'giving away the IP address' equation, but I think it would also be a significant investment for early CommYou development...
Anyway, just thought I'd mention.
(no subject)
Date: 2008-03-21 06:40 pm (UTC)
That matches my original assumptions, but doesn't match what it looked like when I actually poked under the hood. (That is, I took a Google ad and briefly deconstructed what it was doing.) I am by *no* means sure of this, mind, but it appeared that, instead, you include a Google script that actually takes some of the page content directly (at render time), and uses AJAX to send that directly to Google, where it is processed and an ad returned. And it apparently had mechanisms to allow you to say "these are the parts of the page that are fair game".
If that's not true (and like I said, I'm not at all sure), then I'll have to reconsider a bunch of things. But I'm hoping it's correct -- it's a sophisticated and elegant solution, and would give me enough control to balance the equations in a reasonably nuanced way...
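For illustration only, here is a minimal sketch of the "fair game" idea: pull out just the text inside explicitly marked regions of a page, so that only that text would ever be handed to an ad supplier. The marker names (`ad-ok-start` / `ad-ok-end`) and the `fair_game_text` function are made up for this example and are not Google's actual syntax or API; the real thing would presumably run as script in the browser, while this is plain Python to show the filtering step.

```python
import re

# Hypothetical markers, invented for this sketch; not any real ad network's syntax.
START = "<!-- ad-ok-start -->"
END = "<!-- ad-ok-end -->"


def fair_game_text(html: str) -> str:
    """Return only the text inside marked regions, with HTML tags stripped."""
    pattern = re.escape(START) + r"(.*?)" + re.escape(END)
    chunks = re.findall(pattern, html, flags=re.DOTALL)
    # Replace tags with spaces, then collapse runs of whitespace.
    text = " ".join(re.sub(r"<[^>]+>", " ", chunk) for chunk in chunks)
    return " ".join(text.split())


page = (
    "<p>Posted by alice from 10.0.0.1</p>"
    "<!-- ad-ok-start --><p>Anyone know good <b>HDMI</b> cables?</p><!-- ad-ok-end -->"
    "<p>private footer</p>"
)
print(fair_game_text(page))
```

Everything outside the markers (the poster's name, the address, the footer) is simply never included in what gets sent. Note that this only controls the text payload; it does nothing about the IP address the ad supplier would see on the request itself.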
(no subject)
Date: 2008-03-21 09:21 pm (UTC)
(Note that in my entire year of having ads on my 10,000-hits-a-day website, I only got about $35 out of the bargain. I was not too impressed.)
(no subject)
Date: 2008-03-21 11:01 pm (UTC)
And yes, it's going to be interesting to see if ads are even worthwhile. My assertion is that, given ads targeted at what users are talking about, I should get a non-trivial clickthrough, but it's entirely possible that I am Just Plain Wrong...