December, 2009


29
Dec 09

software for massive document collaboration?

As part of my new role at work I’m going to be working on writing and editing some legal documents that I’d like to get both public and private feedback on.[1]

real text is edited in black and green (picture: Zenith Z-19 Terminal, by ajmexico, used under CC-BY)


I’m trying to wrap my head around the available options, and none of them seem quite ideal. Some thoughts, first, on my requirements:

  • ease of use: I’m going to be collaborating with (among other people) lawyers, managers, etc.- i.e., non-technical people. So the solution should be easy to use, or at least have one face that is easy to use.
  • large-scale collaboration: this has to scale to input from lots of people (at least for commenting- editing will be a smaller group.)
  • maintaining the canonical version: somewhere other than my laptop should hold the canonical version of the text, including revision history.
  • commenting: it should be possible to open up a version of the document to the public, and to have them be able to comment on specific sections of the text- ‘I don’t like this paragraph’, ‘I suggest replacing A with B’, etc.
  • editing: I don’t need a massive multi-user text editor; we want feedback from many people but only a few people will be empowered to actually do edits. Ideally, though, I’d love to be able to review public comments, delete (or respond to) the bad ones, and integrate the good ones, all within the same tool. It should also be possible to do private revisions.
  • diffs/versioning: I need to be able to show the differences between two versions of a document; ideally with commentary on the reasons for the change, and with output that looks less like diff and more like an editor’s redline.

So what options do I have? These are the tools I’ve thought about so far:

  • a markup language + revision control: this would give me a lot of what I want, but it totally fails the ease of use test, and it isn’t clear that it handles the commenting role terribly well. Potentially great for canonical versions and diffs, though, especially if word-level diffs are an option and if I could figure out a way to produce good-looking diffs. With a distributed RCS this approach has the bonus of allowing for some work to exist in a non-canonical branch when changes are still being discussed/debated.
  • traditional word processors: traditional word processors can be great at diffs/versioning, and obviously they exist to edit, but they aren’t very good at scalable commenting and collaboration- things break down very quickly when you’re emailing around files, and expecting someone to merge them all together. odf-svn seems like it deals with some of these problems, at least conceptually, but development seems very stalled. I will also look at abicollab, but many of my collaborators will be on Mac- which AFAICT is not supported for newish versions of Abi. :/
  • stet/co-ment.net: Stet was great at handling mass commenting; its successor, co-ment.net, seems to be similarly good. But they don’t really allow you to do diffs between versions, so at best it could be only part of the solution.
  • wiki: no wiki that I know of can handle commenting like co-ment.net can. This is a shame, since they are great for showing revisions and (small-scale) collaborative editing. Also, doing ‘branches’ to propose changes that may get rejected is not possible in any wiki I’m aware of. Would love to be proven wrong on this one.
  • etherpad: etherpad is even slicker than wikis for showing revisions, and obviously superior for collaborative editing, but no facility for commenting on texts. Also lots of uncertainty about the maintainability/supportability of the code base.
  • bespin: this is so code-focused that it may not pass the ‘user friendly’ test, but hg integration is nice, and it may be sufficient for collaboration on plain text.
  • wave: this is almost exactly the kind of problem wave seems designed for, but it is such a constantly evolving product (not to mention a ‘run on someone else’s server’ problem) that I’m a little reluctant to use it. And of course since it is in semi-private beta it can’t do public commenting.

So far, I’m leaning towards gathering comments via a co-ment.net instance, using hg + markup (or even plain text?) to store the canonical version and generate revisions, and using etherpad, bespin, or a wiki for collaborative editing when necessary. But that still feels like a pretty fragile solution to me- lots of file transitions where things could go wrong, especially between hg and etherpad/wiki. I’d need to find a markup which can transparently/reliably go in and out of the editing tool from hg (or just admit defeat and use plain text), and the diffs from hg would almost certainly need some processing to make them look good.
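To make the "processing" step a bit more concrete, here is a minimal sketch of a word-level redline built on Python's standard difflib module. It is an illustration only, not part of any of the tools above: the function name `redline` and the `[-…-]` / `{+…+}` markers are my own assumptions, and it treats each paragraph as a whitespace-separated list of words, so original spacing and line breaks are not preserved.

```python
import difflib


def redline(old: str, new: str) -> str:
    """Return a word-level redline of two versions of a paragraph.

    Deleted words are wrapped as [-...-] and inserted words as {+...+}.
    The function name and marker style are illustrative assumptions.
    """
    old_words = old.split()
    new_words = new.split()
    # Compare word lists rather than characters so changes read like edits.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)


if __name__ == "__main__":
    print(redline("the licensee may copy the work",
                  "the licensee may distribute the work"))
    # prints: the licensee may [-copy-] {+distribute+} the work
```

Something like this could run over the old and new text of each paragraph pulled out of revision control; turning the markers into strikethrough and underline for non-technical readers would be a separate step.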

So does anyone have suggestions on other tools, or specific suggestions on how to make this toolchain more robust and/or powerful?

  1. Sorry, no details quite yet on what the project is, and no prizes for guessing…

23
Dec 09

continued notes on the macbook experiment, week 3

Some more notes on running a mac (original post and explanation here):

  • installing new software is insanely nice. Yes, apt and yum are nice, but I don’t find out about software that way. I find out about software by reading something on the web (for me usually a blog post, but for others a news article) and from there installation on mac is a click, download, and drag away. That is it. That is insanely great.
  • suspend and resume is, like everyone says, perfect. It just works. Every kernel developer should be given a laptop and not allowed to do anything else until suspend and resume works this well.
  • one interesting side-effect that I’ve noticed of controlling the hardware is that you don’t need to fit the OS on a CD, so the OS preload is huge- 13-14 gigs. Which is insanely great![1] Even when I tried to do something 99% of mac users will never do (install a rails app locally) it Just Worked. Rails was there; gems was there; sqlite was there. That is specifically because they don’t have to worry about fitting everything on a CD and can instead rely on the humongous hard drives that every system comes with these days. A very nice luxury, that. (Or to put it another way: emacs is in the default install. And it isn’t in the default install on most linux distros anymore. I understand why it must be so for linux distros, but still, it is sad.)
  • I’ve long suspected that Dashboard + Expose is roughly 1,000x better as a user experience than panel applets. Now I know I’m right.
  • it is great that a lot of the libre software that I love is available on mac; having tomboy and tracks available is already making me more productive. (And obviously I’m using firefox. Sadly it is way more performant on mac than linux- someone who was serious about the linux desktop experience but didn’t know where to start hacking would be well advised to work on firefox performance.)
  • I just saw the following on Krissa’s fresh F12 install:

I am not yet an expert on Mac-style UI design, but I’m pretty sure anyone who put an error message like this in a product shipping from Cupertino would be flogged. Anyone who put it in in such a way that it (as far as I can tell) always comes up on a default install would be fired on the spot.

  1. First commenter who calls it bloat is shot.

15
Dec 09

google is your butler- the tension between utility and privacy

I’ve often defended Google’s thirst to know things about people with a butler analogy. Good software should, like a butler, try hard to understand your preferences and act on them for you without you even realizing they are there. That means learning and remembering things you’ve done in the past, and using that to base recommendations on. When you tell your butler ‘bring me dessert, please’, he should remember that you usually like chocolate, and that all this week you’ve been experimenting with different cakes, and therefore bring you another variant on chocolate cake. If he suddenly forgot that you liked chocolate and that you’d been having cake all week, you’d be irritated when he asked you those things again, or if he just brought you a cannoli out of the blue.

Ideally you want your butler to know at least something about what your friends and co-workers are doing too- if I say ‘bring me a shirt’, and the butler knows I’m going out with the cool kids tonight, then I want my trendiest shirt based on what my friends think is trendy. But if I am going to the office and say ‘bring me a shirt’, I want the butler to know that my workplace is casual, but not too casual, and so on. I could of course tell him all these things every time he brought me a shirt, but it is easier for everyone if he just remembers, and perhaps does some outside research on his own.

Like a butler, you want your tools to work intelligently based on context and history, and Google is without doubt one of those tools- for many of us, the most important single tool in our computing lives. The problem, of course, is that your butler has a lot of incentives to keep your private information private. Sure, a butler can be bribed, but that is why you pay him well and treat him like a human being, to avoid these sorts of problems. Google’s incentives run at least partially the other way- they have strong incentives to mine that data extensively, to share it with others, and to collect well more than most people might think is useful, in the name of being the ultimate butler. And these incentives lead to risks- risks that data might be shared with third parties you might not trust; risks that things might be subpoenaed; risks that they might leak to Google employees or even outside Google; risks that effective advertising might use such information to manipulate your political views. On balance, most of us are going to look at these issues and decide that we’re OK with Google knowing these things, because the risks are remote and the benefits tangible. So we acknowledge there is a tension between privacy and functionality, and move on.

I wish that at this point I could announce some deep new insight about the balance between these two competing forces. I can’t; most of what there is to be said has been said already. The thing that makes me write about it right now is, of course, Eric Schmidt’s recent comment. The thing that bugs me about it is that he doesn’t seem to realize there is a tension. These words don’t speak of ‘we’re wrestling hard with this question every day’ (a reasonable compromise position) or ‘we’re doing everything we can to collect as little data as possible’ (the pragmatic civil libertarian perspective). They speak of a company (or at least a CEO) which doesn’t realize or doesn’t care that there are balances and compromises to be struck and continuously re-considered. And that, to me, is very, very troubling; more troubling than any particular policy position could be.

So I’m experimenting this week with other search engines, and once I finish moving I’ll be looking again at other mail and rss readers. I really don’t ask much of Google in return for trusting them; I’m not an absolutist, I just need to know that they are continuing to treat privacy as a difficult, multi-faceted issue that constantly has to be evaluated and considered. And if Schmidt is any indication, that isn’t what Google is doing right now.


14
Dec 09

hello planet mozilla!

Hello planet mozilla! As the class notice said, I am a long-time moz lurker, first-time poster, and I’m really excited to be joining mozilla.

factoids, possibly relevant:

  • My college next-door neighbor downloaded the first mozilla source release. He couldn’t get it to build. He was (still is) a genius, and I’m not, so if he couldn’t build it, I sure as heck wasn’t going to touch the code with a ten-foot pole. I decided to help out by doing bug triage instead. You can see my first full contribution here.
  • My first job title was ‘bugmaster’; I have also answered to ‘geek in residence.’
  • Two summers during college I was able to say ‘I am paid to play with legos’. The results are here and here.
  • IAAL, as of a couple of weeks ago. My strong, strong advice to you: don’t go to law school. If it is good for you (like it was for me), it can be great, but for the vast majority of people, it isn’t the right thing to do. Trust me on this.
  • My wife spent two years in a village of 800 people in Africa and then three years working for the farmer’s market in New York.
  • After growing up in Miami but spending time in Boston and Manhattan, my goal is to continue to move south. If Harvey will let me work remotely from Costa Rica, that will be just fine, but in the meantime Mountain View and San Francisco will be acceptable.
  • I did Linux desktop stuff for 8 years of my life; it left me with an appreciation of how difficult it is to design good GUIs and maintain large code bases, and with an appreciation of how rewarding it can be to do what you love with good people beside you.
  • I once spent eight weeks in a tent to go to a basketball game. It was totally worth it.
  • I had the horrible realization last week that even though I hate wearing suits I have basically become one.
me, in a suit, at the Supreme Court for oral arguments in Grokster


what I’m doing with Moz

I’m working with Harvey and Julie on a variety of things, possibly including sexy things like the MPL, trademark, and legal community building, and certainly also mundane but important stuff like bandwidth contracts. I hope to blog about a lot of it, but of course lots of it is privileged/confidential, and even more of it is boring, so we’ll see how that goes :)

how to find me

I’m pretty easy to find via email. I’ll also probably lurk in mozilla IRC once I figure out the right places. Say hi; unlike most lawyers I don’t bite.


11
Dec 09

the macbook experiment, day 2

The last time I regularly used an operating system other than Linux was fall of 1997. Windows 98 was all the rage; Mac OS X was not yet (publicly) a glimmer in Steve Jobs’ eye. So this means I have a fairly dysfunctional view of desktop software- I really don’t know what Linux and GNOME are competing against. I’ve read reviews, and played with the competitors from time to time, but I’ve never really seriously forced myself to use them- to learn the keyboard shortcuts, the quirks, and their real benefits. And I think that is a problem- it makes me a less effective part of the software ecosystem if I don’t know how most people experience computers.

'An apple a day', by angermann, used under CC-BY-SA

So when my new employer offered to get me a new laptop, I decided to get a mac, and set myself to using it for a year so that I can learn how the other half lives. It will also have Win7 installed (probably mainly in a VM) as well as Office.

Some thoughts so far:

  • OS X is nice, but has not really jumped out at me as particularly awesome. It gets the job done, and is very polished (very consistent; low effort required), but by and large my experience with the core OS hasn’t felt that radically different from any modern Linux distro- the differences are (so far) probably smaller than I expected.
  • that said, there are definitely nice touches- the multitouch trackpad is definitely leaps and bounds above any other touchpad I’ve ever used, though I’m going to miss the Lenovo/Thinkpad nipple a lot. The hardware in general is just awesome- solid like a rock.
  • the mac software ecosystem seems to be a mixed bag; I’d heard good things about adium, for example, but I’m not very impressed so far, and in my very limited playing with mail.app it seems roughly on par with thunderbird; that is to say, well behind gmail in usability.
  • but some of the software is brilliant- a friend pointed me at scrivener, which may be imperfect (time will tell) but so far impresses me as a rare piece of software which truly seeks (and may actually achieve) fundamental reinvention of a class of software; it just seems like software dedicated to the process of writing rather than primarily to text layout, and that feels to me like a huge, huge leap. The only reason I don’t look forward to using it is because I don’t want to get hooked :/

Anyway, I expect this will be an interesting, and potentially very troubling, year, as I get a better grip on what was accomplished, software-wise, during the time I’ve been working on the Free Software desktop.


10
Dec 09

starting fresh with mozilla

After some bumps in the road which delayed my start by a week, I started today in the legal department at Mozilla. Last night I lost a little sleep worrying if this was the right thing for me, but after a day around the office (during an all-hands meeting, no less) I’m pretty much glowing. The projects I’ve already been charged with are interesting and important (more on those very soon, I expect); the other things going on are relevant (as someone said ‘we get to change the world every day, though some days more than others’); and the energy and enthusiasm are infectious. And of course it doesn’t hurt to be able to work with old friends.

Albino Alligator 2008 by Mila Zinkova, used under CC-BY-SA 3.0 license


Also, there are reports that my boss wrestled an albino alligator after dinner; reports were conflicting over whether he bested the beast with his bare hands or if he merely threatened to subpoena it. So yeah… things are interesting.

Weird moment of the day: get introduced at a meeting. Guy across table: ‘wait, are you the Luis Villa?’ me: ‘probably?’ Meeting then starts immediately. Turns out a sure-fire way to make a meeting seem very long is to leave a statement like that unexplained and hanging over your head the whole meeting… :) Led to a great conversation later, though, as did basically everything else all day.


This work by Luis Villa is licensed under a Creative Commons Attribution-ShareAlike 3.0 United States.