jducoeur: (Default)

When I am in the middle of reading my Reading page on Dreamwidth, and it's been a day or so, and I refresh the page with the browser's reload button, it keeps me at exactly the same entry (indeed, the same relative page position), and just adds the newer stuff on top. That is very cool, precisely what I want -- and I have no idea offhand how they do it. Any guesses? Far as I can tell, this is newish: I don't remember this working this well until a month or two ago.

(Yes, I could go ask on the official channels, and might do so when it isn't one in the morning, but I toss it out as a random thought experiment for my nerdy friends. The question is mostly idle curiosity, but not entirely: it's a lovely bit of UX, and I might want to steal it in the future...)

jducoeur: (Default)

Following up on my post from last week about mysterious accounts with clearly auto-generated profiles trying to friend me:

The Dreamwidth Anti-spam team (thanks to watersword for pointing me in the right direction), in their reply to me, indicated that they think it's SEO -- Search Engine Optimization -- spam. That makes sense, and I'm kicking myself for not thinking of it.

SEO is all about gaming Google and other search engines, by creating interconnected networks of web pages that look real, which point to some webpage that you're trying to promote. Modern search engines rank pages based on, essentially, a web of trust -- the more pages that, directly or indirectly, point to this one, the more important this one is considered. So SEO companies (and it's a fairly big business) try to build up networks of pages, and get them linked from lots of real pages, in order to raise the ranking of their paid clients. They mostly do this by injecting bogus links anywhere that they can manage it.

These fake DW accounts don't have any content yet, but they're probably playing a long game, building up cross-links. And just the act of following me creates a useful cross-link for them, because it means that my very-real profile has their fake account listed under my "Other Subscribers" section.

So that's a plausible answer to my question of, "What do they get from this?" They are building up plausible-looking networks via those Other Subscribers linkbacks -- even though no sensible person would friend them back, they don't actually care, since they are getting a link anyway. Presumably, once the network is well-established, the 'bot accounts will begin posting spam articles that link to the webpages that are paying them. Google will see that my clearly-real page is linking to my profile, which links to the 'bot, which links to the external pages, and will tend to score those external pages higher since they are part of a real-looking network. That's just a guess (Google's actual algorithms are a well-kept secret, and there is an entire industry devoted to sussing them out and gaming them), but it seems reasonably likely.
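For the nerdy, here is a minimal sketch of that "web of trust" idea, using a simplified version of the published PageRank algorithm. This is emphatically not Google's real ranking, which is secret; the page names (`real_journal`, `bot`, `client_site`, and so on) and the damping value are assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical network: two real friends link to a real journal, which links
# to its profile. The 'bot links to the site that is paying for promotion.
without_backlink = {
    "friend_a": ["real_journal"],
    "friend_b": ["real_journal"],
    "real_journal": ["real_profile"],
    "real_profile": [],
    "bot": ["client_site"],
}

# Same network, except the real profile now lists the 'bot as a subscriber.
with_backlink = dict(without_backlink, real_profile=["bot"])

before = pagerank(without_backlink)["client_site"]
after = pagerank(with_backlink)["client_site"]
print("client_site score without the backlink:", round(before, 4))
print("client_site score with the backlink:", round(after, 4))
```

In this toy model, the single forced link from the real profile is enough to raise the score of the promoted site, which is the whole point of the game.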

The moral of the story is, when somebody friends you on DW, always look at their profile -- if it looks bizarre and computer-generated, report it to the anti-spam team. Vigilance on the part of the community is the only way to keep this game from paying off, and if we let them profit from it, they'll start to play this game in more force. We want to nip this stupidity in the bud, and the 'bots are counting on nobody actually trying to deal with it. (This is why they don't want to have any obvious spam content when they friend you: they want you to just leave it alone until it's too late.)


Thinking about it a little more, a tiny DW enhancement is probably in order. This problem is precisely why external links from Querki are automatically marked nofollow, and you can't turn that off -- that is the signal to search engines to disregard this link in terms of figuring out a page's importance. This makes Querki useless for most SEO, even the relatively benign sorts; I made the conscious decision that that's not a use case we support.

Any online site (especially any free site) that allows end users to contribute data should use nofollow where possible: otherwise the spammers will find you and overrun you. I learned this the hard way many years ago, when The Rolls Ethereal -- the SCA's online phone book, which I ran in the early days of the Internet -- was destroyed by web spammers.

I believe that DW ought to mark "Other Subscribers" links as nofollow, in order to break the network linkages that make this scam particularly useful. They'd still be able to create SEO accounts, but without those backlinks from clearly-real accounts, Google is probably significantly less likely to score them well. It wouldn't entirely break that game, but it would reduce the economic incentive, without impacting DW in any serious way. In particular, it is correct to mark these links as nofollow, because I can't control those backlinks -- they are forced upon me, so my account shouldn't be considered as linking to them.
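To make the suggestion concrete, here is a rough sketch of what "mark these links as nofollow" means at the HTML level. It is not how Dreamwidth or Querki actually do it; the function name `add_nofollow` is invented, and a production system should use a real HTML parser rather than regular expressions, since this sketch only handles double-quoted attributes in simple tags.

```python
import re

# Matches an opening <a> tag and captures its attributes (possibly empty).
ANCHOR_TAG = re.compile(r"<a((?:\s[^>]*)?)>", re.IGNORECASE)
# Matches a double-quoted rel attribute inside the captured attributes.
REL_ATTR = re.compile(r'\brel\s*=\s*"([^"]*)"', re.IGNORECASE)


def add_nofollow(html):
    """Return `html` with rel="nofollow" present on every opening <a> tag."""

    def fix_tag(match):
        attrs = match.group(1)
        rel = REL_ATTR.search(attrs)
        if rel is None:
            # No rel attribute yet: append one.
            return "<a" + attrs + ' rel="nofollow">'
        values = rel.group(1).split()
        if "nofollow" in [value.lower() for value in values]:
            # Already marked: leave the tag untouched.
            return match.group(0)
        values.append("nofollow")
        merged = 'rel="' + " ".join(values) + '"'
        return "<a" + attrs[:rel.start()] + merged + attrs[rel.end():] + ">"

    return ANCHOR_TAG.sub(fix_tag, html)


print(add_nofollow('<a href="https://example.com/">some subscriber</a>'))
```

A search engine that honors the attribute will then disregard the link when working out how important the target page is, while human readers see no difference.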


Oh, and your bonus for reading to the end: today's bit of spambot improvisation, which friended me last night:

Prior to my current job I was analyzing shaving cream in Nigeria. Spent college summers lecturing about mantra to get husband back after separation in Jacksonville, FL. Spent college summers supervising the production of pond scum in Salisbury, MD. Spent 2002-2010 analyzing velcro in Phoenix, AZ. Spent 2002-2009 selling pond scum for the government. Managed a small team researching childrens books in Naples, FL.

There is something singularly appropriate about these SEO scammers supporting the pond scum industry...

jducoeur: (Default)

Over the past month or two, I have gotten several subscribes from the most preposterous bots. I mean, the bio of the one I just got says:

Practiced in the art of working with real estate staging Companies Orange County. Spent 2002-2010 exporting glucose in Deltona, FL. Spent 2001-2004 promoting salsa in Hanford, CA. Spent 2001-2008 buying and selling deodorant in Bethesda, MD. Earned praised for my work deploying squirt guns in Orlando, FL. Prior to my current job I was merchandising dust in Bethesda, MD.

They're all like that, with clearly computer-generated resumes in the profile.

I can't for the life of me figure out the scam. I'm used to this nonsense on FB, but those tend to make at least a nod to sleaze appeal -- the feeds usually are full of bikini-clad models, trying to lure you into friending them -- but here, there are no actual posts, just deeply weird obviously-fake profiles.

Anyone have any guesses what they are up to? I don't generally think of DW as having FB's common ethos of automatically accepting all friend requests, although I suppose some people might do so...

jducoeur: (Default)

Okay -- it isn't by any means perfect, but so far it's the best solution I've come up with.

The only thing I miss, in moving from LiveJournal to Dreamwidth, is the native support for cross-posting from here to FB. So for the past couple of months I've been exploring alternatives. The one I've been using was dlvr.it, as described here -- that's adequate, and makes it fairly easy to post links on FB that point to your posts here.

But the thing is, I don't love that, because not many people actually click through those links. And while I may not love FB, I do have a lot more friends there than on Dreamwidth, so I'd like to be able to actually cross-post, not just link.

For a while, I had thought that the answer was Zapier, and I put in a lot of work getting a true cross-post solution working there. But Zapier has one critical flaw: the approach I'm using for cross-posting requires a feature that only exists in their paid version, and Zapier is insanely expensive. (Like, $20/month.) It's just not worth that kind of money. (Yes, I talked to them about it; they brushed me off and refused to even contemplate a more reasonably-priced tier.) So I gave up and went back to dlvr.it.

But -- as of today IFTTT, the grand old man of the "plug-and-play applications" space, officially opened up their Applet program to all comers: you can build your own tools in it, and yes -- like Zapier, it allows you to insert some JavaScript in the middle.

(Why JavaScript? Because your DW feed is in HTML, and if you just post it directly the results look kind of crappy. I want something better.)

So I've spent a little time in the workday cracks today taking the solution I'd built for Zapier and adjusting it for IFTTT. The experience with IFTTT is a bit different from that of Zapier -- a bit less powerful (in particular, their RSS reader doesn't pick up your DW tags, which Zapier did), but with a much better built-in IDE.

I think that's now working adequately -- it's not The One True Solution, but it mostly works. I've published it as a public Applet on IFTTT; feel free to pick it up and use it. You give it the URL of your Dreamwidth RSS feed, and you need to connect Facebook to IFTTT; once you have that, it should, in theory, quietly check your RSS feed every 15 minutes or so, and cross-post new entries to your Facebook wall. It takes each DW post, translates it into something that looks okay on Facebook (basically, it back-translates the HTML to something vaguely like Markdown), and includes the link to the original DW post at the bottom.
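For anyone curious what "check the feed and pick up new entries" amounts to, here is a small sketch of the idea. IFTTT does this part internally, so this is only an illustration and not the Applet's code; the function name `new_entries`, the sample feed, and the choice of the `guid` element as the entry identifier are all assumptions.

```python
import xml.etree.ElementTree as ET

# A tiny made-up RSS 2.0 feed, newest entry first, standing in for a real one.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example journal</title>
<item><title>Second post</title><link>https://example.org/2.html</link>
<guid>https://example.org/2.html</guid>
<description>&lt;p&gt;Newer&lt;/p&gt;</description></item>
<item><title>First post</title><link>https://example.org/1.html</link>
<guid>https://example.org/1.html</guid>
<description>&lt;p&gt;Older&lt;/p&gt;</description></item>
</channel></rss>"""


def new_entries(rss_xml, seen_ids):
    """Return feed items whose id is not in `seen_ids`, oldest first."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        # Prefer the guid; fall back to the link if the feed has no guid.
        entry_id = item.findtext("guid") or item.findtext("link")
        if entry_id is None or entry_id in seen_ids:
            continue
        fresh.append({
            "id": entry_id,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "html": item.findtext("description", default=""),
        })
    # Feeds usually list the newest entry first; cross-post in written order.
    fresh.reverse()
    return fresh


already_posted = {"https://example.org/1.html"}
for entry in new_entries(SAMPLE_FEED, already_posted):
    print(entry["title"], entry["link"])
```

A poller would run something like this on a timer, cross-post whatever comes back, and add those ids to the seen set so nothing is posted twice.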

Please pass word on to anybody who might care, and tell me about problems. (Hopefully, I can fix any problems -- once I published, IFTTT gave me dire warnings that I could no longer alter my triggers or actions; hopefully I can still edit the critical filter in the middle.)

jducoeur: (Default)

In the comments from my previous post about cross-posting from Dreamwidth to FB and TW, laurion recommended trying Zapier. At the time, I said that it wasn't worth the effort -- that using Zapier in a straightforward way didn't produce any better results than dlvr.it, which is easier to use.

Problem is, that was basically accepting a mediocre solution, and the engineer in me rebelled. In particular, just judging from "likes" on FB, people are reading my direct status updates there, but most probably aren't actually following through to read the links. So what I really want is to repost the text of my post here over there. The only difficulty is that what you have available is either the raw HTML of the post (which looks like crap on FB), or the completely-stripped version (which drops all the links, formatting, and so on). Both are kind of ugly.

But the thing that makes Zapier so particularly interesting is that you can inject semi-arbitrary code into your pipelines -- it's pretty limited, but you can use both JavaScript and Python. And while I may hate JavaScript, I do know it modestly well. So I just spent an hour hacking up a stupid but adequate regular-expression engine that basically takes the HTML output from your RSS feed here, and turns it back into something vaguely like Markdown. Once I figured out the ins and outs, it wasn't terribly hard, and the results are exactly what I want: the text of my post as a decently readable and complete Status Update on Facebook, with the link to the original post here at the bottom.
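To give a flavor of the approach, here is a rough sketch of a regular-expression pass that turns simple feed HTML into something vaguely like Markdown. It is not the script from the Zap (that one is JavaScript); this version is in Python, the function name `html_to_plainish` is invented, and the output conventions, such as writing a link as "label (url)", are assumptions chosen for illustration.

```python
import html
import re


def html_to_plainish(source):
    """Convert simple HTML into readable, vaguely Markdown-like plain text."""
    text = source
    # <a href="url">label</a> becomes "label (url)" so the link survives.
    text = re.sub(r'<a\s[^>]*?href="([^"]*)"[^>]*>(.*?)</a>', r"\2 (\1)",
                  text, flags=re.IGNORECASE | re.DOTALL)
    # Bold and italic tags become Markdown-style asterisks.
    text = re.sub(r"</?(?:b|strong)>", "**", text, flags=re.IGNORECASE)
    text = re.sub(r"</?(?:i|em)>", "*", text, flags=re.IGNORECASE)
    # List items become bullet lines.
    text = re.sub(r"<li(?:\s[^>]*)?>", "* ", text, flags=re.IGNORECASE)
    text = re.sub(r"</li>", "\n", text, flags=re.IGNORECASE)
    # Line breaks and paragraph ends become newlines.
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
    text = re.sub(r"</p>", "\n\n", text, flags=re.IGNORECASE)
    # Drop any remaining tags, then decode entities such as &amp;.
    text = re.sub(r"<[^>]+>", "", text)
    text = html.unescape(text)
    # Collapse runs of blank lines left over from the steps above.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


sample = '<p>Hello <b>world</b>, see <a href="https://example.com/">this</a>.</p>'
print(html_to_plainish(sample))
```

Regular expressions are a blunt instrument for HTML, so nested or unusual markup will trip this up, but for the fairly tame HTML in a journal feed, something of this shape gets readable results.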

Of course, there's a catch (which I didn't figure out until I had all of this working): Zapier's Free plan doesn't include "multi-step Zaps", and as far as I can tell you have to have multiple steps in order to make this work. And their Basic plan is insanely expensive for personal use ($20/month). My solution seems to be working, but I'm still in the "trial" period, and I suspect they're lying to me about the claim that I'm currently on the Free plan. I would bet that, during the trial, I'm secretly upgraded to a higher level, and once the trial is over, they'll tell me I can't use this Zap unless I pay them a fortune.

We'll see how it plays out: for now, I'm going to leave this solution working, but I'm prepared to go back to dlvr.it if this goes up in a puff of extremely-expensive smoke. For reference, in case anybody else wants to play with it, here's the JavaScript that I injected into my Zap to get it working.

jducoeur: (Default)

Since I'm pretty sure some folks care, here are my findings on cross-posting.

Background: my journal is largely public, and I like to have it disseminated to where people want to read it -- Facebook (FB hereafter), Twitter (TW), whatever. (While I think Dreamwidth (DW) is the best place for following one's friends, the reality is that most of my friends are only on FB.) LiveJournal (LJ) has had built-in cross-posting to those services for a long time now, and I've been using that; even after I moved to DW, I've been cross-posting from DW to LJ, and thence to FB and TW. But now that I'm thinking of dropping LJ entirely, the question is how I keep the other services in the loop.

After doing some research (and finding that most of the crosspost-to-FB services have gone away in the past two years), I came to the conclusion that the most robust option seems to be dlvr.it. We are not their primary target market: they are really focused on marketing people who want to be able to write something once and then spew it widely, and their Pro plan is oriented to that. But they do have a Free plan, and their service -- take RSS feeds and post them to social networks -- is more or less what we need. I've been using it for a few days, and it seems to work.

To get this up and running:

  • Go to the Dreamwidth FAQ about RSS feeds, which should show the URLs of your feed.
  • Sign up for dlvr.it. I signed up using my Facebook account.
  • Once you're signed up, it will take you to their "Automate" page. There, you set up an automation with a "Feed" (the URL of your Dreamwidth RSS feed) connected to one or more "socials" (Facebook, Twitter, whatever). The Free plan allows you to take up to 5 Feeds as inputs and 3 Socials as outputs.

That's pretty much it, and it seems to work pretty well; I might even upgrade to Pro eventually, if I decide to use this for official Querki stuff.

That said, some caveats:

  • Most importantly, this is a third-party service, and they conspicuously indirect all links through themselves. This stuff is really only for public posts anyway, but keep in mind that they are probably doing traffic analysis on who clicks through to your DW page.
  • dlvr.it requires slightly more FB permissions than I love. I believe I understand why they require what they do, but basically you have to agree to all of the permissions required by all of their features, even if you aren't using all of those features.
  • dlvr.it is a commercial service, and they support themselves with subscriptions. They really want you to be buying their Pro service, which is expensive ($10/month). They don't seem to be nasty about Free users, but be prepared to see "Try Pro Now -- Free Trial!" buttons on all the screens.
  • They provide a good deal of control over how cross-posts show up, and I had been quite encouraged that there was a "Status Update" option -- that is, cross-post the post in its entirety to Facebook. Sadly, though, it looks like links don't survive the process, so I've backed off to simply posting links.
  • The Free plan is explicitly a bit slow in its cross-post speed -- they only check for new posts every half hour. (Unsurprisingly, the plans that cost real money are quicker.)

So -- that appears to be a viable approach to cross-posting from here without LiveJournal in the middle. There are likely other possibilities. (It looks to me like IFTTT is possible, although I'm not sure how well that would work.)

Anybody have other suggestions?

jducoeur: (Default)
My lady asked me to track down some feeds she'd been reading on LJ. I found some and wound up creating a few of them, and suspect that other friends might be interested, particularly in:

The East Kingdom Gazette

XKCD

What If?

Popehat

Note that not all of these have populated yet...
