More

colonhyphenp · on Sept 12, 2011

Can you go into some more detail why you would not choose it? I am curious.

deweller · on Sept 12, 2011

I don't have facts to back up my opinion - It is formed by memories of articles I've read.

I believe CouchDB is a better choice for very large data sets because of its design.

+ CouchDB uses a Map Reduce design that I believe would scale better over very large data sets. + CouchDB always stores data in a consistent state on disk. You can literally pull the plug on the server at any time and the data will never be inconsistent.

MongoDB is geared for performance and is a great bridge between a relational database and a high-performance No-SQL database. But I don't recall that it's strength is handling large datasets.

ethangunderson · on Sept 12, 2011

Comparing map/reduce in Mongo and Couch is really apples and oranges. They are designe to do two different things. i.e data processing vs building views.

Mongo is designed from the ground up to deal with large datasets. Take a look at their sharding architecture.

catch23 · on Sept 12, 2011

I guess it depends what you consider "very large". If you're talking about multi-petabyte, then I'd probably use hdfs, but otherwise mongodb might fit. I hear craigslist uses mongodb to store their data since 1997, which is a fair amount of data I believe.

jzawodn · on Sept 12, 2011

Uhm. The data goes back that far but the usage is far more recent than that.

Wijnand · on Sept 12, 2011

1997, are you sure?

simonw · on Sept 12, 2011

See http://blog.mongodb.org/post/5545198613/mongodb-live-at-crai...

duggan · on Sept 12, 2011

When he says "store their data since 1997" I believe he means "store data dating from 1997 on" - slight confusion in choice of wording!

alnayyir · on Sept 12, 2011

CouchDB is better'ish for larger datasets, but not for arbitrary scaling. MapReduce in CouchDB requires dumb full-scans if you're not just refreshing an existing view.

Arbitrarily large data is the exclusive domain of hadoop/hypertable/cassandra AFAIK atm.

tmcneal · on Sept 12, 2011

To be fair CouchDB is very explicit that to get any sort of performance, everything must be a view. "Ad-hoc queries" (i.e. queries that are written on the fly instead of uploaded as a view) are clearly stated as "for development only".

Where CouchDB really falls flat is for write-heavy applications. The default configuration in CouchDB is to not reindex a view until it has been read. When a read occurs, any new data in a view that was added since the last read must be re-indexed by executing the map/reduce functions on that data. If you're writing frequently to CouchDB but not reading a lot (as in a data warehouse) the first query you run is going to be extremely slow, since it will need to run map/reduce on a lot of new data. CouchDB doesn't distribute work to multiple nodes like Hadoop, and I've found even simple reduce functions to slow down re-indexing by a factor of 10. I think CouchDB has settings now to update the index on commit, or you could always run a cron job to regularly query the view and force a reindex, but it's still going to be slow.

BigCouch (https://cloudant.com/#!/solutions/bigcouch) might be a potential choice for data warehousing, since it advertises full compatibility with the CouchDB API but offers distributed map/reduce like Hadoop/Hive/etc. I haven't used it though.

alnayyir · on Sept 12, 2011

Couch is definitely a lot more honest about their limitations than mongo or riak, but my experiences make me hesitant to recommend it to anyone not intimately familiar with those limitations.

jchrisa · on Sept 12, 2011

This is part of what we are addressing with Couchbase Server, an autosharding rebalancing Couch fronted by memcached. For K/V read and write we measure microsecond latency.

We are currently optimizing the views for cluster access, but the design goal is to offer at least the query performance CouchDB offers on small datasets, even on very large clusters.

More info: http://blog.couchbase.com/couchbase-server-2-0-tour-and-demo

alnayyir · on Sept 13, 2011

Is it going to have the same/similar multi-replication for enterprise only limitation that Riak has or is this going to be funded in some other way?

noodle · on Sept 13, 2011

i would not use it in future projects, myself, because my company is currently using it in several products in several different ways, and it has been nothing but headaches, problems, etc..

the theory behind the thing is great. in reality, its buggy and not fun to work with.

colonhyphenp · on June 19, 2011

That list is just for .com domains. This Wikipedia entry lists old .com, .org, .edu and .net domains: http://en.wikipedia.org/wiki/List_of_the_oldest_currently_re...

Also, I found it amusing how much overlap there is between the 50 oldest .edu domains and the US News list of top CS grad programs: http://grad-schools.usnews.rankingsandreviews.com/best-gradu...

colonhyphenp · on Oct 8, 2010

Without thinking I tried moving the character around using the vi "hjkl" navigation keys, and it works! Mad props for that.

benologist · on Oct 8, 2010

What kind of keyboard do you have? I've never heard of hjkl as a control scheme (I make Flash games).

Edit: never mind, I found it.

psawaya · on Oct 8, 2010

I was hoping that someone here would notice this! :)

colonhyphenp · on June 27, 2010

Regarding the heat issue you are having with your laptop: I was having similar problems with a different laptop, and I got an "XPad Slim" (http://www.xpad4laptop.com/) cooling pad to put under it. It weighs around a pound, has no fans, and does a pretty decent job of dispersing heat from the computer. No more burning legs!

cperciva · on June 27, 2010

Hmm, I may try that. Thanks!

colonhyphenp · on April 24, 2010

Ha, got that right !

jacquesm · on April 24, 2010

Just out of curiosity, how much do you make on writing an article like that and stuffing it with links?

colonhyphenp · on April 24, 2010

I doubt I'll make anything. I honestly just posted the article to show how easy it is to upgrade the MBP hard drive. It's my first time messing with Amazon Affiliates, and blogspot has a nice plug-in that makes it really easy to go amazon-link-happy.

jacquesm · on April 24, 2010

Ah, that explains a lot. I was wondering why quite a few blogspot articles have amazon links in them. The problem is that it becomes very hard to see the difference between borderline spam and genuine articles. That's also how you got those tor-x links in there, I actually figured that was proof positive it was a spam article, because who would go out of their way to link an affiliate code to a bunch of tools they used.

I'll 'unflag' the article.

Thanks for the explanation!

colonhyphenp · on April 24, 2010

Interesting. I didn't realize amazon affiliate links carried such a negative connotation. I'll definitely keep that in mind for future posts and remove a few of the affiliate links from this post.

jacquesm · on April 24, 2010

Don't worry about it, it's a pet peeve of mine :)

colonhyphenp · on March 18, 2010

Last year a friend taught me about the bash trick "CTRL-R" <start typing 'ssh' or some other previously run command> on the command line for reverse history searching, and it is an amazing time saver. It acts as a great alternative to #8, "Find the last command that begins with “whatever,” but avoid running it"

$ !whatever:p

10ren · on March 18, 2010

I use up-arrow for that. My .inputrc has:

    "\e[A": history-search-backward

If you don't type anything, it acts exactly as the old up-arrow.

imurray · on March 19, 2010

Or in ~/.zshrc:

    bindkey '\e[A'  history-beginning-search-backward

oyving · on March 18, 2010

An important addition to that is to repeatedly hit ctrl-r to cycle through the matching commands in the history.

erlanger · on March 18, 2010

How do you reverse the search? <S-C-R> didn't do it for me. Either way this could be a good substitute for what I do now, tediously grepping .zsh/bash_history...

mziulu · on March 18, 2010

If by reverse you mean going forward instead of backwards, that should be done with Ctrl-S; it does not work in Bash though, since Ctrl-S locks the scrolling of the terminal. I believe this behavior can be overridden, through some .xinputrc settings perhaps, but I haven't still found enough motivation to look it up!

edit: ok, I've found this: http://wiki.archlinux.org/index.php/Bash#History_search

araneae · on March 18, 2010

I'm pretty I learned this from reading an article on HN... anyone got a citation for me?

pkrumins · on March 18, 2010

Perhaps my article "The Definitive Guide to Bash History?"

http://www.catonmat.net/blog/the-definitive-guide-to-bash-co...

colonhyphenp · on Feb 23, 2010

I learned the basics of Haskell by doing some problems from http://projecteuler.net. It's probably not as exciting as actually building something using the language, but it's definitely worth checking out.

colonhyphenp · on Oct 26, 2009

The maintainers of the contraptor project have done some pretty neat demonstrations with an Arduino. I was particularly impressed with this X-Y plotter/drilling assembly: http://www.contraptor.org/fast-drilling-contraption

colonhyphenp · on Aug 3, 2009

Yeah the title is definitely misleading. The word "hacking" isn't even mentioned in the article besides in the title. Replace "hacking contest" with "programming competitions" and "NSA" with "NSA sponsored" and you have a more accurate title.

colonhyphenp · on May 21, 2009

Whole Foods is a good place to go for freshly roasted beans - Most locations I have been to roast their coffee in the store and put roast dates on it.