Daytime Running Lightshttp://jchrisa.net/drl/_design/sofa/_list/index/recent-posts?descending=true&limit=5Sofa on CouchDB2009-07-04T16:16:47Zhttp://jchrisa.net/drl/Kings-of-Code-slidesKings of Code slides<p>This talk has a bit more attitude than some of my others. I'm not sure if it was recorded, but that's cool: "you had to be there."</p>
<p>My favorite bit was getting to tell <a href="http://twitter.com/stevenpemberton">Steven Pemberton</a> that "in the fullness of time there is only one CouchDB." I find the first-wave web guys really get it.</p>
<p>Also fun in this slide deck is a new bullet point on my "Why CouchDB?" slide: "Makes Google look old-school."</p>
<p><a href="http://jchrisa.net/drl/Kings-of-Code-slides/CouchDB-KoC.pdf">Kings of Code CouchDB slides here - 6.5MB pdf</a></p>2009-07-04T14:54:18Zjchrishttp://jchrisa.net/drl/The-P2P-Web-part-oneThe P2P Web (part one)<p>(Part one because this is only just some of what I'd like to say on the topic.)</p>
<p>The web was originally designed as a peer-to-peer medium. Tim Berners-Lee needed a way to share physics papers with his friends around the world. Since they were physicists, and the medium was simply published texts, the barriers to entry were low. All you had to do to become an independent publisher was run a copy of the NCSA web server, and point it to an HTML directory of your papers.</p>
<p>To make a long story short, the idea caught on. In the 15 years since, we've seen an explosion of uses for the web, alongside a steady increase in the complexity of web applications and deployments. Google's web index is a long way away from the humble list of all the university web servers of the early years. Even much smaller sites (like the ones you might be building right now) are more likely to consist of service-oriented architecture and complex caching protocols than a directory index of static html files.</p>
<p>This complexity comes for a reason: the web has gotten bigger and users have grown to expect sites to integrate data from all kinds of sources. Even if your site is simple enough to be hosted as plain old html, you're still subject to the traffic spikes that millions of active users can bring without warning.</p>
<p>More importantly, all these changes have made it that much harder for the average user to run their own web sites. At the same time, companies have stepped into the gap to make it easy to publish a blog or connect with your friends (as long as you host your writing or your friendships on the company server). Most users have never considered controlling their own platforms. Instead, we see frustration about control manifest itself in ways that many of us geeks find amusing: clamoring to grab "your" Facebook URL before someone else does, pushing the limits of how ugly one can style a Myspace page, anger at site operators for not implementing your favorite imaginary features. These are all reactions to the very real power site operators have over their users.</p>
<p>There are also more-geeky reactions: the OpenID and Data-Portability movements, for instance. While these geek movements have their heart in the right place, I was never able to get excited about them. For as much as they see the value of giving control to users, they're still steeped in the idea of a web owned by vendors, where users can at best "go on strike" to demand more respect. The data-portability movement may have had users' interests at heart, but the ability to send your photos from Flickr to Picasa isn't a huge step up.</p>
<p>I suppose if there's a similar term to describe the p2p web, it'd be "application portability." When an application is designed to run against a client-based CouchDB node, not only can it (and the data it touches) be replicated from (say) Flickr to Picasa, it can also be replicated from Alice to Bob, or even modified by Alice and then replicated for publishing by any generic CouchDB hosting provider.</p>
<p>I tend to dig into the technical details of how this is accomplished, but I'll skip that part for today. Instead I'll just say it's easy. Once you've got a group of people with CouchDB on their laptops, it only takes a few minutes before they are actively sharing and aggregating data in an ad-hoc way. Since applications are just another form of data, they are replicated along with other changes. The big picture here seems to be a little bit harder for experienced developers to grasp than for new developers and end users.</p>
<p>If you've never jumped through the mental hoops required to build a modern web application cluster, if you've never struggled with defining indexes on a relational database or configuring an HTTP proxy, then you have less invested in the centralized model of application development. I've found that people who haven't yet learned how "to do it right" are quite comfortable with the relaxed model CouchDB provides. "You mean I just save this document and then load it again when I need it?" "I can get this same data onto your computer by clicking that button?" </p>
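<p>For the curious, here is a rough sketch of what "clicking that button" might boil down to: a single POST to CouchDB's <code>_replicate</code> resource naming a source and a target. The helper name <code>buildReplicationRequest</code>, the host <code>bob.example.com</code> and the <code>photos</code> database are made up for illustration, and the sketch only builds the request rather than sending it.</p>

```javascript
// Build (but do not send) the HTTP request that asks a CouchDB node to
// replicate one database into another. The helper name, hosts and database
// names are hypothetical examples.
function buildReplicationRequest(localCouch, source, target) {
  return {
    // Strip any trailing slashes before appending the _replicate resource.
    url: localCouch.replace(/\/+$/, "") + "/_replicate",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ source: source, target: target })
  };
}

// Alice asks her local node to pull Bob's copy of a database into her own.
const request = buildReplicationRequest(
  "http://localhost:5984/",
  "http://bob.example.com:5984/photos",
  "photos"
);
console.log(request.method, request.url);
console.log(request.body);
```

<p>Handing the resulting object to any HTTP client would start the replication. Swapping the source and target pushes changes the other way.</p>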
<p>What is simple for a user requires a lot of relearning on the part of experienced developers. The physics of the web are changing, and a lot of the hard work we had to do to make the centralized model function just doesn't apply to the p2p web. So far I've said more about the centralized web than the peer-to-peer web. I should at least mention that I don't see the centralized web going anywhere anytime soon. There will always be room for online shopping carts and even centralized message routers. But as a general rule of thumb technology has followed the path of least resistance. </p>
<p>When users have the data they care about locally, they can afford to burn much more CPU time pulling interesting patterns from it than (say) Google can use to categorize a particular web page. This in turn frees the centralized services to provide what they do best: message routing and peer discovery. It's nice that Facebook can help me find photos of my friends, but it's frustrating that each time I visit an album I have to wait for image files to cross the wire. Social graph services of the future will be built on the assumption that users have data they may want to share with their friends but not the service provider. It's up to them to figure out how to meet that need.</p>2009-07-03T18:10:48Zjchrishttp://jchrisa.net/drl/NoSQL-SlidesNoSQL Slides<p><a href="http://nosql.net">NoSQL</a> was a rip-roaring good time. It was fun to catch up with old friends as well as get an all day brain-dump of what's going on in the distributed database world. I'm pretty heads-down on CouchDB, so seeing how others have approached a similar problem space was eye opening.</p>
<p>Mostly I was amazed at the level of complexity in the various Big Table clones. My impression is that most of them blend storage concerns with distribution concerns. Mixing it all up in one big distributed porridge allows for optimizations that CouchDB's approach can't yield. However, I think the CouchDB approach of building a solid single-node implementation always with an eye toward distributed uses is just as viable as always.</p>
<p>Meebo's <a href="http://code.google.com/p/couchdb-lounge/">CouchDB-Lounge</a> proxy is a pure HTTP approach to building CouchDB clusters that span multiple machines. It doesn't handle the dynamic nature of truly large clusters, where you can count on nodes continuously leaving and joining. But it makes up for that with a truly simple implementation (just a few hundred lines of Python and C). When the need arises, someone will have an easy time adding facilities for managing dynamic large clusters, because the separation of concerns is so clear.</p>
<p>Maybe I'm underestimating the necessary complexity in large cluster management, but so far betting on simplicity has been a winning strategy and I intend to continue it.</p>
<p>The <a href="http://jchrisa.net/drl/NoSQL-Slides/CouchDB-NoSQL.pdf">slides are available as a PDF here</a>.</p>2009-06-12T21:29:18Zjchrishttp://jchrisa.net/drl/LessLess<p>I like it when people talk about less code. Code slower, all of that. Here's my attempt to talk about less.</p>
<h3>Less Layers</h3>
<p>This is one aspect people find appealing about pure Couch apps. Less layers makes deployment easier. Less layers means less impedance mismatch between layers. This makes applications different.</p>
<p>But which layer do we drop? Model, View, Controller, Client, Server?</p>
<h3>Unlearning</h3>
<p>The hardest thing about switching to a document store is learning to think in documents. Documents are self-contained, so reconstructing their meaning for users is a simpler task, mostly consisting of presentation. CouchDB does enforce some constraints of its own, but they are largely a consequence of its distributed programming model, so we learn to live with them.</p>
<h3>Model</h3>
<p>Validating user input is crucial for security, as well as useful for providing guarantees to your application's views etc. Most frameworks have you doing this sort of thing in an application server, which is scaled and distributed differently from your database. Without an application server, where do the models live?</p>
<p>In CouchDB models feel functional, not object oriented. This can be a mind fuck but after about six months you get used to it. For instance, CouchDB's validation functions can access <em>only one document</em> at a time, and have no side effects other than blocking invalid database updates.</p>
<p>Imagine if your Rails controller had read-only access to your database, and could make only one select query per request (determined by the URL), and was required to return true or fail with a message. It's that weird. Did I mention validation functions are run during CouchDB replication, not just during user access?</p>
<p>But it makes sense, when you understand the physics of distributed computing. Of course you run validations at replication, because replication is accomplished via the same HTTP channels as normal client access. It's not hard to "replicate" into CouchDB from JavaScript running in a browser.</p>
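<p>To make the shape of this concrete, here is a minimal sketch of a validation function. In a design document it would be stored as an anonymous function under <code>validate_doc_update</code>; it is named <code>validate</code> here only so it can be run outside CouchDB. The <code>type</code>, <code>author</code> and <code>body</code> fields are assumptions for the example, not Sofa's actual schema. It sees only the new document and the previous revision, and its only way to act is to throw an object carrying a <code>forbidden</code> message.</p>

```javascript
// Sketch of a CouchDB validation function. The document fields used here
// ("type", "author", "body") are hypothetical.
function validate(newDoc, oldDoc, userCtx) {
  // Deletions carry no content to check.
  if (newDoc._deleted) { return; }
  // Only posts are validated in this sketch.
  if (newDoc.type !== "post") { return; }
  if (typeof newDoc.body !== "string" || newDoc.body.length === 0) {
    throw({ forbidden: "Posts need a non-empty body." });
  }
  // oldDoc is null when the document is being created.
  if (oldDoc && oldDoc.author !== newDoc.author) {
    throw({ forbidden: "The author of a post cannot be changed." });
  }
}

// Small harness for trying the function outside CouchDB.
function check(newDoc, oldDoc) {
  try {
    validate(newDoc, oldDoc, { name: "alice", roles: [] });
    return "ok";
  } catch (e) {
    return e.forbidden;
  }
}

console.log(check({ type: "post", author: "alice", body: "hello" }, null));
console.log(check({ type: "post", author: "alice", body: "" }, null));
```

<p>Because the function has no side effects and looks at one document at a time, it gives the same answer whether the write arrives from a browser or from a replicating peer.</p>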
<h3>View</h3>
<p>Do I have to tell you that CouchDB can transform JSON documents and Map Reduce rows into formatted output, for instance as XML feeds or HTML pages? This blog's Atom feed is generated on CouchDB's server side using a _list function stored in <a href="http://jchrisa.net/drl/_design/sofa">Sofa's design document.</a></p>
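<p>As a rough illustration (not Sofa's actual code), a list function that turns view rows into a feed might look like the sketch below. It uses the <code>start</code>, <code>send</code> and <code>getRow</code> style of CouchDB's documented list function API, which may differ from the API available when this post was written. The three helpers are replaced here with simple stand-ins so the sketch runs on its own, and the row contents and the <code>listPosts</code> name are invented for the example.</p>

```javascript
// Stand-ins for helpers that CouchDB's query server provides to list
// functions. Here they read rows from an array and collect output in a string.
const rows = [
  { id: "Less", value: { title: "Less" } },
  { id: "NoSQL-Slides", value: { title: "NoSQL Slides" } }
];
let output = "";
let cursor = 0;
function start(response) { /* CouchDB would set status and headers here */ }
function send(chunk) { output += chunk; }
function getRow() { return cursor < rows.length ? rows[cursor++] : null; }

// Escape the characters that are unsafe inside XML text nodes.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// The list function itself: stream one <entry> per view row.
function listPosts(head, req) {
  start({ headers: { "Content-Type": "application/atom+xml" } });
  send('<feed xmlns="http://www.w3.org/2005/Atom">');
  let row;
  while ((row = getRow())) {
    send("<entry><title>" + escapeXml(row.value.title) + "</title></entry>");
  }
  send("</feed>");
}

listPosts({}, {});
console.log(output);
```

<p>The function never holds the whole result set in memory; it emits each entry as the row arrives.</p>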
<h3>Client / Server</h3>
<p>When CouchDB is running on localhost, this distinction becomes less important in certain ways. In other ways, the constraints of HTTP (definitely a networking protocol) make local applications easier to deploy and manage. When a web application is local, what makes it a web application?</p>
<h3>Controller</h3>
<p>I think the controller is largely moved to the client; in my experience it's wrapped up in event hooks attached to HTML elements. Good riddance, that bit of my Rails app was always a pain anyway. Keep refactoring the controller layer, eventually it's gone. ;)</p>
<p>There are other application models besides MVC, I'm not going into them right away, but I'm taking requests.</p>
<h3>Karaoke</h3>
<p>I could go on all night, but @amysue's playing <em>Jump</em> really loud on the Piano, which means it's time for out-the-door to Chopsticks III. Little known fact, my karaoke name is Grandpa Chris. Protip: if you've ever heard it before, you can sing Wooly Bully, but you have to #leanintoit.</p>2009-05-31T03:22:46Zjchrishttp://jchrisa.net/drl/First-PrincipalsFirst Principles<p>I'm not particularly concerned with people who take issue with some of the CouchDB demos I've been doing lately. Either they don't get it or they're trying hard not to. If you're on the cusp, and you're not sure whether or not you get it, I encourage you to read Jacob Kaplan-Moss's blog post from a couple of years ago: <a href="http://jacobian.org/writing/of-the-web/">"Of the Web"</a>. For the non link-clicking types I'll quote the bits that make me happiest:</p>
<blockquote>
<p>Let me tell you something: Django may be built <em>for</em> the Web, but CouchDB is built <em>of</em> the Web. I've never seen software that so completely embraces the philosophies behind HTTP. CouchDB makes Django look old-school in the same way that Django makes ASP look outdated.</p>
</blockquote>
<p>And more:</p>
<blockquote>
<p>Look, CouchDB may succeed, and it may fail; who knows. I'm sure of one thing, though - this is what the software of the future looks like.</p>
</blockquote>
<p>So while there may be platforms that can pass bytes around among a particular set of listeners with less overhead than CouchDB, I'd be surprised if they can also subsume that functionality into a web-first database that also has the resilience and flexibility of CouchDB. </p>
<p>Maybe I should characterize all the chat rooms in <a href="http://jchrisa.net/toast/_design/toast/index.html">Toast</a> with word clouds, so you can see which room has the highest signal/noise ratio just by looking at the top terms. Can your MQ do that?</p>2009-05-29T04:42:17Zjchris