software


14
May 10

some data points on facebook

My boss has written a blog post that tries to bring together some recent data points from across the privacy spectrum; it is worth a read. I’ve been noting a few (much smaller, more trivial) things myself over the past few days that suggest to me that privacy concerns in general, but facebook-related privacy concerns in particular, may be reaching a bit of a critical mass.


No Facebook by avlxyz used under CC-BY-SA.

Some anecdata:

These are just anecdotes, and not real data, but to me this feels vaguely different from the ‘rebellion’ in 2006. At that time I said ‘people adjust and things blow over sometimes.’ This one feels different to me, but that is just a vague feeling; it may stem as much from my own facebook fatigue as from any concrete reality. It will be interesting to watch, at any rate.


8
May 10

responding to joindiaspora

The joindiaspora guys, in a generally good response to my questions, conclude by asking:

[W]hat would be un-pragmatic about giving four excited dudes who spent their last semester of school thinking about a problem you are “worried-about-but-can’t-deal-with-now,” twenty bucks so they can take an honest crack at solving it? :)

Lots of people asked some variant on ‘it is just $20’ or whatever. First, I tend to be one of these people who don’t give token amounts to charity- I prefer to give larger amounts to a small number of projects that have very high impact (or very high odds of success if they aren’t having an impact yet.)

But the money is secondary. The important thing is that there are already a fairly good number of projects in this space, with a fairly small number of users, developers, testers, and attention to spread between them. And to be blunt, I don’t want someone coming in with more web design and marketing sense than actual hacking chops and using up all the oxygen in the room. I think DiSo did this to some extent, frankly. So yes, giving a little bit of money to someone can be quite counterproductive and unpragmatic- and I wanted to reassure myself that I wouldn’t be contributing to that problem again.

Given that it looks like they’re going to be doing this crazy thing ($13K raised of their $10K target) that concern is now irrelevant.


(untitled) by Môsieur J. [version 3.0b], used under CC-BY.

So some thoughts on the rest of the responses, again in hopes that they are supportive and constructive:

We plan to “build less.”

Hooray! Most of these questions don’t have right answers, but this one did. And the followup priorities seem reasonable- those probably are the right minimum bits necessary. That said, where people have already built things, consider building less than less by working with other projects. Status.net comes screamingly to mind for the message passing component, but I’m sure there are others. Don’t just build shared specs- where possible, build shared code.

We see all of this communication happening between two Diaspora servers, rather than strictly between peers.

This seems like the very pragmatic solution to me; all the talk of real peer-to-peer is terrific but that is a very hard slog- both technically (getting it working) and socially (getting users to install it.)

With regards to DiSo, the response had one set of great things, and one part that was very ambiguous to me:

It seems to us that all of the previous attempts at solving the problem are trying to create the perfect solution in the first version.

I think this is right, and I’m heartened to hear the talk about building answers that satisfy rather than perfect. These are all signs of excellent taste (not just this sentence, but many of the things both in this specific answer and in the entire blog post.)

Ambiguously:

[DiSo] tried to add on to WordPress, a project which was not designed from the ground up to be a distributed network.

I’d love to hear more elaboration about ‘designed from the ground up to be a distributed network.’ WordPress has proven to be a very flexible platform for a lot of things, and it both publishes structured data to, and consumes it from, that distributed network we call the internet (particularly that subset of the distributed network that consists of Atom/RSS publishers and consumers- I subscribe successfully to many friends’ wordpress blogs in something that looks very much to me like a distributed network.) In addition, things have improved since DiSo started, since there is now PuSH, possibly webfinger, etc. So which features are you looking for in a ‘designed from the ground up’ distributed network that wordpress doesn’t have? I’m not saying that wordpress is the solution, but I’m curious to hear more about what it specifically lacks.

With regards to Mugshot… I wish the Red Hat folks had posted a good post-mortem on that; to the best of my recollection I never saw one. My own sense is that: (1) it was very difficult for others to set up, so it never got an outside development community, and no one looked to it as a distributed solution to the problem. (2) The community it attracted was heavily tech-y, so the community that built on it looked to outsiders (frankly) like it was a bunch of nerds, which made it hard to expand into a more broad-based audience. (e.g., it was a great source of community for linux distributions, not so much for sports. Identica has the same problem relative to twitter; compare a search for lebron on twitter to a search for lebron on identica some time. Ditto Bieber or Gaga. This is very related to Pick The Right Customers.) Both are problems worth being aware of.

Solid answers on specs and services, including a couple projects I hadn’t been aware of- usually a good sign (even if one of them appears to be completely insane :)

We will be constantly sharing our ideas, and 100% of our code at the end of the summer.

I’m still not clear on why no code until the end of the summer. Care to elaborate? I’m not an absolutist on this- mostly for reasons related to bikeshedding and design- but it does seem like an odd default choice.

We think in the future (after the summer), we will work on an easy installation…

Only clearly wrong answer of the whole thing. Easy installation should be baked-in from day one- adding it afterwards is hard. As a bonus, it helps you write automated tests (since automated deployment is easy) and easy installation helps you choose the right customers by helping you attract users who are interested in talking to other people rather than playing with software.

What are your three favorite books on software development? three favorite essays? what about on design?

Is this one of the questions where if I don’t say “Kernighan and Ritchie,” “Getting Real”, “Mythical Man-Month,” “Don’t Make Me Think!” or something like that, you will disapprove? :)

Yeah, sort of. But ‘Getting Real’ was the right answer. ;) (I sort of wish I had the time to write a mashup of Getting Real and Producing OSS, maybe with a dash of The Poignant Guide.) I also highly recommend Rework and Designing From Both Sides of the Screen. Blog-wise, you might find this list interesting, though not necessarily pertinent to this discussion.

Finally:

We bought him some arepas. They were delicious.

I’m sort of bitter that you live near that particular deliciousness. Also that you called me an old dude. But mostly because I miss those arepas. And the yo-yos. Enjoy one or two for me during your hacking breaks. :)


27
Apr 10

Questions for the Diaspora

So lots of friends were tweeting this morning about Diaspora, a project to raise funds to get a summer’s worth of hacking done on a distributed, Libre social network. A distributed, Libre social network would be a terrific thing to have; I’d love to support it. And I love the eager energy I’m seeing around Diaspora.


Questioned Proposal, by Eleaf, used under CC-BY

But I’m also keenly aware that distributed social networks are hard, and so I’d only give of my money (or my time) to someone who looks like they have what it takes to take a serious stab at the problem. They’re hard:

  • as a design question: how do you make a social network whose UI doesn’t suck?
  • as a technical question: the code involved is complex, particularly if you want to interoperate robustly with other platforms, and doubly so if you want to do that with proprietary platforms.
  • as a social question: getting users to migrate is not easy.

So here are some questions for Diaspora, or really for anyone working in this space. These are not questions with right answers, necessarily. But anyone serious about solving this problem probably has at least some answers for them, so showing that you’ve given them some thought will go a long way towards convincing people that you’re serious about attacking the problem. If you haven’t given them thought yet, I could think of worse places to start. :)

  • What do you think are the most important features a social network should have? How would you prioritize them? Do you plan to Build Less or go big? If building less, what is the minimal set of features you can get away with?
  • DiSo is now two-plus years old. Any ideas why it didn’t get off the ground? Bonus points: same question for Mugshot.
  • What standards, if any, do you plan to work with/build on? (just to throw out a couple, all of which have strengths and flaws to consider: webfinger, oauth, xauth, the buzz APIs.)
  • What other services, if any, do you want to interoperate with? why? how will you prioritize?
  • Any other Libre code bases in the same space you’d like to work with? GNU Social? StatusNet? What ones are you aware of, and why will you/won’t you build on/work with those?
  • Would a smarter client (like Mozilla Contacts) be useful to you? If so, how?
  • What is the strategy to get to a critical mass of users (or avoid having to get a critical mass?)
  • What are your three favorite books on software development? three favorite essays? what about on design?

I don’t mean to ask these questions to piss on anyone’s parade; I deeply want to believe. Heck, what I want to do is fly to New York, sit down in a room, and help you brainstorm and plan. But unfortunately I’m a pragmatist with a day job. I can’t directly help out. So instead I offer these questions. Answer these[1] and you’ll begin convincing people that you are also pragmatists: that you’ve thought hard about the questions at hand and you are worth investing in. And I’ll be first in line to do that.

(I should note that unlike some I don’t need code; I think code that is created without much thinking is all too common and frequently damaging. But if you don’t have code, I suggest doing planning- and talking about it- before doing a PR week. :)

  1. or questions like these- you’ll note I skipped some hard ones like ‘business model?’

22
Apr 10

three features I’d love in leechblock

I love leechblock. It really helps keep my life sane. I was working some with it this morning to tweak my settings, and it seemed like a good time to write about three features I’d kill for, even to the point of putting up money for them if there were a way to sponsor features (or sponsor third-party hacking- bounties, if you will):

  1. ‘block all sites except this one for ____ minutes’- a button that blocks the browser from going to any other domain for ____ minutes. Sort of like Freedom but with an out for one specific website (and obviously only for your browser.) Goal would be to allow you to focus on one site for 45-50 minutes, so it should probably close or block access to other tabs as well.
  2. ‘no, I really need this’- give an option (probably a popup of some sort?) that allows you to break through leechblock in the off case you really need something. For example, I like to block nytimes.com all the time, but (when I was editing a journal) I often needed to access a times article to confirm a claim in an article. This would obviously require some sort of time-consuming task in order to prevent you from ‘just’ clicking through all the time- probably something like a series of captchas, or just typing in a specific series of random characters that can’t be copied/pasted.
  3. ‘this page has an RSS feed: subscribe to it and add it to leechblock? [yes/no/don't ask again for this domain]’- one of the strategies that has helped save me over the past few years is aggressively subscribing to feeds, blocking the actual domains, and then only visiting them when I’m in feedreading mode. Making that semi-automatic, and easy to do, would be terrific.
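
To make the first two wishes a bit more concrete, here is a rough sketch of the logic only. It is written in Python purely for illustration (a real LeechBlock feature would live inside the Firefox extension), and every name in it (`FocusSession`, `make_challenge`, `override_granted`) is hypothetical rather than anything LeechBlock actually provides.

```python
import secrets
import string
from datetime import datetime, timedelta
from urllib.parse import urlparse


class FocusSession:
    """Feature 1: allow exactly one domain until the session expires."""

    def __init__(self, allowed_domain, minutes, now=None):
        self.allowed_domain = allowed_domain.lower()
        start = now or datetime.now()
        self.expires = start + timedelta(minutes=minutes)

    def is_allowed(self, url, now=None):
        now = now or datetime.now()
        if now >= self.expires:
            return True  # session is over, so nothing is blocked any more
        host = (urlparse(url).hostname or "").lower()
        # Accept the domain itself and its subdomains (www.example.org, etc.).
        return host == self.allowed_domain or host.endswith("." + self.allowed_domain)


def make_challenge(length=32):
    """Feature 2: a random string the user has to retype by hand."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def override_granted(challenge, typed):
    """Only an exact retyping of the challenge breaks through the block."""
    return typed == challenge
```

Preventing copy/paste of the challenge is a UI problem that this sketch does not try to solve; it only shows that the bookkeeping behind both features is small.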

It may well be time to port leechblock to jetpack and rethink the UI as well, but those are bigger projects that I can’t randomly beg (or bribe) people to do.


4
Apr 10

building software to let me read more and less at the same time?

There are lots of sources of links these days- delicious, twitter, and blogs. Many of these are interesting, but not so interesting that I want to read them all the time. Currently I have to decide either to read or not read these people.

I’d like to add a third option: to have a ‘middle’ pool of sources who I don’t read directly, but who are monitored and serve as pointers to other, interesting things. I think having such a third option would let me read less (because I’d stop skimming these intermediate sources), but still also give me fairly good confidence that I’m not missing important things that I should read.

The outline of the software in my head goes something like this:

Step 1: User provides a list of RSS feeds (a mix of blogs, twitter/identica, and delicious feeds).

Step 2: A harvester collects the contents of said RSS feeds.[1]

Step 3: Parse the content of those feeds for URLs and dump them in a db.[2]

Step 4: Unshorten the URLs if necessary.[3]

Step 5: When a particular url has been mentioned X times in the past Y days[4], fetch the URL[5], find the content within it[6], and jam it in an RSS feed for consumption along with the rest of my top-level RSS feeds.

Bonus step: mash up snippets from the posts/twitters/delicious feeds to provide context for the URL’s content, similar to what Google Reader does when friends comment on a feed item.
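
The counting at the heart of steps 3 through 5 can be sketched in a few lines of Python. This is only an illustration under some loud assumptions: feed entries have already been harvested into `(source, published, text)` tuples, “mentioned X times” is read as “mentioned by X distinct sources” (my interpretation, so one chatty feed can’t trigger it alone), and `unshorten` is a hypothetical hook that defaults to doing nothing. Fetching the page and extracting its content are left out entirely.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

URL_RE = re.compile(r"https?://[^\s<>]+")


def extract_urls(text):
    """Step 3: pull URLs out of an entry's text, trimming trailing punctuation."""
    return [u.rstrip(".,;:!?)'\"") for u in URL_RE.findall(text)]


def popular_urls(entries, min_mentions, days, now, unshorten=lambda u: u):
    """Return (url, mention_count) pairs for URLs mentioned by at least
    `min_mentions` distinct sources within the last `days` days.

    `entries` is an iterable of (source, published, text) tuples.
    `unshorten` is a hypothetical hook for step 4; the default is a no-op.
    """
    cutoff = now - timedelta(days=days)
    sources_by_url = defaultdict(set)
    for source, published, text in entries:
        if published < cutoff:
            continue  # too old to count towards the window
        for url in extract_urls(text):
            sources_by_url[unshorten(url)].add(source)
    hits = [
        (url, len(sources))
        for url, sources in sources_by_url.items()
        if len(sources) >= min_mentions
    ]
    # Most-mentioned first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

Whatever `popular_urls` returns is the list that step 5 would then fetch and republish as a feed.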

I feel like someone must have done this already. If not, the pieces are available (see the footnotes for details on many of the pieces); I sure wish I had the time/skill to put them together myself. :/ This project is one of the things I wish we had more reliable bounty infrastructure for- I’d actually put money up for it if I thought there were a reliable way to get some matching funds and find good developers for it.

Ideas, either about the rough feature sketch, existing software that fits this need, or about methods to make it happen, are all welcome.

  1. Planet is an example of infrastructure that does this.
  2. Planet’s meme plugin can do this.
  3. There are scripts and web services available for this; the basics aren’t that complicated.
  4. again, meme plugin has this concept already implemented
  5. not in memeplugin- it only provides links
  6. not trivial, but source is available that does this via readability

17
Mar 10

Mailing lists are parties. Or they should be.

I can’t go to bed because Mairin is right on the internet and so I want to (1) say she’s awesome and (2) add two cents on mailing lists and using the power of a web interface to make them better. Bear with me; maybe this is completely off-base (probably I should just stick to law), but it has been bouncing around in my head for years and maybe me writing it down will help the lightbulb go off for someone who can actually implement it :)

Here is the thing: I think mailing lists are almost like parties in a lot of ways, and so we can steal ideas from parties to help write better mailing list software. I know this sounds silly, but bear with me.

First, the similarities. At most parties, like most mailing lists, most people want to have interesting conversations, and they understand the shared social standards and interests of the other people at the party. And at most parties and most mailing lists there are a handful of people who are boors: they probably don’t want to spoil the party, but they violate those shared norms- some in very mild ways (boring, talking too loud, posting too much), or maybe some less mild (the guy who doesn’t think he’s a racist, but really is.)[1] If you’ve got similar mixes of people, why then do parties usually handle boors well, while mailing lists often fail and flame out?

At a party, one thing that helps keep conversations functional is that people who lack social graces or are uninteresting get social cues which encourage behavioral change. Sometimes these cues are very explicit- someone saying out loud ‘you’re not interesting, I’m leaving.’ But those direct cues are a pain to send- they are usually considered ‘rude,’ they require a lot of emotional energy, and they often mean more interaction with the boor- which is the last thing anyone wants. And blatant signals are often counter-productive too, since they make well-intentioned people defensive instead of giving them a face-saving way to learn they have a problem. Since direct signals are a pain, at parties we’ve evolved a range of more subtle cues to use- people cough and shuffle their feet, or quietly move to another part of the room, or say ‘how about the weather?’ And this actually works pretty well- worst case, people walk away from the boor and have good conversations elsewhere; best case the boor gets the message, changes their behavior, and becomes more fun to be around.

Mailing lists have no low-cost equivalents to coughing and walking away. There is only silence, or confrontation. Mairin’s mockup excites me since, if implemented, it could provide those more subtle, less confrontational cues by allowing ‘-1’ digg-style votes on posts. You could imagine making the cues even more subtle and non-confrontational than she suggests, perhaps by sending positive cues to everyone but negative cues anonymously and only directly/privately to the boor.

Another way that parties and mailing lists aren’t enough alike: in a party, if you are part of a boring conversation, you just walk away. Besides giving the social cues already discussed, this also has the awesome effect of allowing you not to hear that conversation anymore. In contrast, a mailing list is like a party where you can’t walk away from a conversation. You hear every single conversation whether you like it or not. Some of the best email software allows killing entire threads, but that doesn’t give the social cue to the boor. They think everyone is paying attention and so they keep talking. And for people with less good email clients (most of us), the options are to just tolerate the boors or leave the list altogether. Imagine if you had to leave every party that had even a single boring conversation. You wouldn’t go to many parties. That is what most mailing lists are like, though.

We can fix that. You can easily imagine mailing list software that allows you to tell the server ‘don’t send me this thread anymore.’ As a side-effect, if enough people ignored a thread, you could tell people posting in the thread that ‘X people have walked away from this conversation- maybe you should take this off-list?’ These would probably both require a fair bit of hacking, but it seems like the upside is a more party-like list.
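
For what it is worth, the server-side bookkeeping for ‘walking away’ could be quite small. The Python below is a minimal sketch, not a patch for any real list manager: `ThreadMuter` and its methods are hypothetical names, and the threshold for nudging the posters is an arbitrary assumption.

```python
from collections import defaultdict


class ThreadMuter:
    """Tracks who has asked not to receive a thread any more."""

    def __init__(self, subscribers, notice_threshold=5):
        self.subscribers = set(subscribers)
        self.notice_threshold = notice_threshold  # assumed cut-off for the nudge
        self.muted = defaultdict(set)  # thread_id -> addresses that walked away

    def mute(self, thread_id, address):
        """Record 'don't send me this thread anymore' for one subscriber."""
        if address in self.subscribers:
            self.muted[thread_id].add(address)

    def recipients(self, thread_id):
        """Everyone who should still get new mail in this thread."""
        return self.subscribers - self.muted.get(thread_id, set())

    def notice_for_poster(self, thread_id):
        """The gentle hint for people still posting, or None if too few left."""
        count = len(self.muted.get(thread_id, set()))
        if count >= self.notice_threshold:
            return (f"{count} people have walked away from this conversation- "
                    "maybe you should take this off-list?")
        return None
```

The hard part is presumably not this logic but wiring it into delivery and threading, which is where the ‘fair bit of hacking’ would go.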

On the more positive side (Mairin said she liked to focus on the positive!), at a party it is easy to find the good conversations. Just wander around the room at any decent-sized party; you’ll see a tight knot of people and hear they are talking excitedly. Can’t do that with a mailing list; you’ve got to at least start reading every thread. Once you know which threads people like (maybe via a ‘like’ link in the footer?) you can offer a party-like ‘subscribe only to threads that already have a crowd.’ Twitter/identica sort of do this through the idea of retweets/repeats; you don’t have to follow everyone on earth- some people will just pass the cool stuff along- and that seems like it could be pretty useful for mailing lists.

Note that virtually none of these behaviors require browsing the email through a web interface or a specialized mail interface. All of them could be implemented by ‘click here to mod up/click here to mod down’ links in the footer of each email, so people who live in their mail clients could still participate and benefit, which I think is a must.
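
As an illustration of how little the mail client needs to know, here is a hedged Python sketch of generating those footer links and of picking out the threads that ‘already have a crowd’. The vote endpoint URL, the query parameter names, and the function names are all invented for the example; a real system would also want a per-recipient token so that votes can’t be forged.

```python
from urllib.parse import urlencode


def vote_footer(base_url, message_id):
    """Build a plain-text footer with one link per action."""
    actions = (
        ("mod up", "up"),
        ("mod down", "down"),
        ("stop sending me this thread", "mute"),
    )
    lines = []
    for label, vote in actions:
        query = urlencode({"msg": message_id, "vote": vote})
        lines.append(f"{label}: {base_url}?{query}")
    # "-- " on its own line is the conventional signature separator.
    return "-- \n" + "\n".join(lines)


def crowd_threads(likes_by_thread, min_likes=3):
    """Threads with enough 'likes' to count as having a crowd around them."""
    return sorted(t for t, n in likes_by_thread.items() if n >= min_likes)
```

A subscriber who chose ‘only threads that already have a crowd’ would then simply be sent the threads that `crowd_threads` returns.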

Bottom line: Software can’t save a mailing list full of people who actively dislike each other. Maybe I’m crazy, though, but it seems like software that helped mailing lists function more like parties could really help mailing lists cope better with anti-social people.

  1. There are only a small number who are actively malign and I’ll ignore them for the purposes of this post- if you have too many of them on a list, you have problems software can’t solve. That said, the analogy may have some use in dealing with trolls too.

9
Mar 10

Digging up my old Red Hat/e-voting posts

The DOJ is breaking up ES&S, the country’s largest provider of voting machinery, the OSDV project seems to be gaining some attention, and RHAT stock recently hit a five-year high. This seems like as good a time as any to dig up my ‘Red Hat should be in electronic voting’ post and followup. Take the gamble, Raleigh! Buy the ES&S assets at bargain prices and get into the game.


12
Jan 10

Credit where credit is due (more Google tea leaves to read)

One of the very first things that made me skeptical about Google was their approach to censorship in China, which I thought deeply compromised their supposed ‘don’t be evil’ approach to the world. It struck me that their position- summarized as “the benefits of increased access to information for people in China and a more open Internet outweighed our discomfort in agreeing to censor some results” bespoke a fair amount of arrogance about the value of Google and a discounting of the value of uncensored information. I didn’t mention that issue in my recent post about Google and reading their tea leaves, but it certainly is one of the big tea leaves to be read.

And so they’ve added another layer to the tea leaves with this announcement that Google will be backing out of censorship in China and possibly abandoning China altogether. Go read it.

It is hard to imagine any other American company having the cojones to make a public statement like it, and I have to applaud them for it. Google is different; anyone who tells you otherwise doesn’t understand them very well. The question we must continually ask is ‘how different, and for how long will they remain different?’ Schmidt’s quotes the other day suggest they are becoming more like others, and that is troubling, and worth writing about and reflecting on (not least by people within Google.) But to even post this is a reminder that they are still very different from most of their peer large corporations. I suppose for those of us who continue to read the tea leaves the followthrough after this post will say a lot as well.


29
Dec 09

software for massive document collaboration?

As part of my new role at work I’m going to be working on writing and editing some legal documents that I’d like to get both public and private feedback on.[1]

real text is edited in black and green (picture: Zenith Z-19 Terminal, by ajmexico, used under CC-BY)

I’m trying to wrap my head around the available options, and none of them seem quite ideal. Some thoughts, first, on my requirements:

  • ease of use: I’m going to be collaborating with (among other people) lawyers, managers, etc.- i.e., non-technical people. So the solution should be easy to use, or at least have one face that is easy to use.
  • large-scale collaboration: this has to scale to input from lots of people (at least for commenting- editing will be a smaller group.)
  • maintaining the canonical version: somewhere other than my laptop should hold the canonical version of the text, including revision history.
  • commenting: it should be possible to open up a version of the document to the public, and to have them be able to comment on specific sections of the text- ‘I don’t like this paragraph’, ‘I suggest replacing A with B’, etc.
  • editing: I don’t need a massive multi-user text editor; we want feedback from many people but only a few people will be empowered to actually do edits. Ideally, though, I’d love to be able to review public comments, delete (or respond to) the bad ones, and integrate the good ones, all within the same tool. It should also be possible to do private revisions.
  • diffs/versioning: I need to be able to show the differences between two versions of a document; ideally with commentary on the reasons for the change, and with output that looks less like diff and more like an editor’s redline.

So what options do I have? These are the tools I’ve thought about so far:

  • a markup language + revision control: this would give me a lot of what I want, but it totally fails the ease of use test, and it isn’t clear that it handles the commenting role terribly well. Potentially great for canonical versions and diffs, though, especially if word-level diffs are an option and if I could figure out a way to produce good-looking diffs. With a distributed RCS this approach has the bonus of allowing for some work to exist in a non-canonical branch when changes are still being discussed/debated.
  • traditional word processors: traditional word processors can be great at diffs/versioning, and obviously they exist to edit, but they aren’t very good at scalable commenting and collaboration- things break down very quickly when you’re emailing around files, and expecting someone to merge them all together. odf-svn seems like it deals with some of these problems, at least conceptually, but development seems very stalled. I will also look at abicollab, but many of my collaborators will be on Mac- which AFAICT is not supported for newish versions of Abi. :/
  • stet/co-ment.net: Stet was great at handling mass commenting; its successor, co-ment.net, seems to be similarly good. But they don’t really allow you to do diffs between versions, so at best it could be only part of the solution.
  • wiki: no wiki that I know of can handle commenting like co-ment.net can. This is a shame, since they are great for showing revisions and (small-scale) collaborative editing. Also, doing ‘branches’ to propose changes that may get rejected is not possible in any wiki I’m aware of. Would love to be proven wrong on this one.
  • etherpad: etherpad is even slicker than wikis for showing revisions, and obviously superior for collaborative editing, but no facility for commenting on texts. Also lots of uncertainty about the maintainability/supportability of the code base.
  • bespin: this is so code-focused that it may not pass the ‘user friendly’ test, but hg integration is nice, and it may be sufficient for collaboration on plain text.
  • wave: this is almost exactly the kind of problem wave seems designed for, but it is such a constantly evolving product (not to mention a ‘run on someone else’s server’ problem) that I’m a little reluctant to use it. And of course since it is in semi-private beta it can’t do public commenting.

So far, I’m leaning towards gathering comments via a co-ment.net instance, using hg + markup (or even plain text?) to store the canonical version and generate revisions, and using etherpad, bespin, or a wiki for collaborative editing when necessary. But that still feels like a pretty fragile solution to me- lots of file transitions where things could go wrong, especially between hg and etherpad/wiki. I’d need to find a markup which can transparently/reliably go in and out of the editing tool from hg (or just admit defeat and use plain text), and the diffs from hg would almost certainly need some processing to make them look good.
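
On the ‘diffs would need some processing’ point: a word-level redline is not much code if plain text turns out to be good enough. The sketch below uses Python’s standard `difflib`; the `redline` function is a hypothetical helper, and the `[-removed-]` / `{+added+}` markers are just one plain-text convention (similar to what word-oriented diff tools print), not anything hg produces on its own.

```python
import difflib


def redline(old, new):
    """Word-level comparison of two versions of a passage.

    Removed words are wrapped as [-...-] and added words as {+...+}.
    Whitespace and line breaks are normalised to single spaces.
    """
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(old_words[i1:i2])
        if op in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if op in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)


print(redline("the licensee may copy the work",
              "the licensee may copy and modify the work"))
# the licensee may copy {+and modify+} the work
```

Turning those markers into strikethrough and underline for non-technical reviewers would be a further step, and this does nothing for the commenting or round-tripping problems.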

So does anyone have suggestions on other tools, or specific suggestions on how to make this toolchain more robust and/or powerful?

  1. Sorry, no details quite yet on what the project is, and no prizes for guessing…

23
Dec 09

continued notes on the macbook experiment, week 3

Some more notes on running a mac (original post and explanation here):

  • installing new software is insanely nice. Yes, apt and yum are nice, but I don’t find out about software that way. I find out about software by reading something on the web (for me usually a blog post, but for others a news article) and from there installation on mac is a click, download, and drag away. That is it. That is insanely great.
  • suspend and resume is, like everyone says, perfect. It just works. Every kernel developer should be given a laptop and not allowed to do anything else until suspend and resume works this well.
  • one interesting side-effect that I’ve noticed of controlling the hardware is that you don’t need to fit the OS on a CD, so the OS preload is huge- 13-14 gigs. Which is insanely great![1] As a result, even when I tried to do something 99% of mac users will never do (install a rails app locally) it Just Worked. Rails was there; gems was there; sqlite was there. That is specifically because they don’t have to worry about fitting everything on a CD and can instead rely on the humongous hard drives that every system comes with these days. A very nice luxury, that. (Or to put it another way: emacs is in the default install. And it isn’t in the default install on most linux distros anymore. I understand why it must be so for linux distros, but still, it is sad.)
  • I’ve long suspected that Dashboard + Expose is roughly 1,000x better as a user experience than panel applets. Now I know I’m right.
  • it is great that a lot of the libre software that I love is available on mac; having tomboy and tracks available is already making me more productive. (And obviously I’m using firefox. Sadly it is way more performant on mac than linux- someone who was serious about the linux desktop experience but didn’t know where to start hacking would be well advised to work on firefox performance.)
  • I just saw the following on Krissa’s fresh F12 install:

I am not yet an expert on Mac-style UI design, but I’m pretty sure anyone who put an error message like this in a product shipping from Cupertino would be flogged. Anyone who put it in in such a way that it (as far as I can tell) always comes up on a default install would be fired on the spot.

  1. First commenter who calls it bloat is shot.

This work by Luis Villa is licensed under a Creative Commons Attribution-ShareAlike 3.0 United States.