
I'm really excited for this. I might finally switch to Python now!


Indeed, there's some incredible work put into the Python data analysis toolkit. Pandas is very impressive. The lack of GGplot was holding me back from switching, but this port and the improvements to other graphing libraries really make me want to use Python in my next projects.


As someone who does all my data analysis in Python, I'm curious what it is about GGplot that made its absence a deal breaker for you. It's obviously quite powerful, considering all the excitement this port has generated, but I haven't seen anything that explains what it provides over other plotting solutions.


You need to understand what ggplot is all about, i.e., a structured/formulaic way to think about and build graphs. Explaining it cannot possibly do it justice. You have to dive into it, even a little bit, and the way you see graphs and plots totally changes. You immediately buy into it and would never wish to go back.

I haven't really said anything to give you a sense of comparison. It's kind of like PG's essay (http://www.paulgraham.com/avg.html) about how the merits of higher-level programming (specifically in Lisp) can hardly be explained to those who have not been exposed to it.


The merits of higher level programming are hard to explain to a Blub programmer, but the differences are trivial to explain. I can tell a C programmer that Lisp has automatic memory management, higher order functions, macros, and atoms. She may not know why those matter, but I can tell her they exist. I can tell a Python programmer that Haskell has pure functions, Monads, Arrows, and an advanced type system. He'll think those will make his life harder, but he'll know they're there.

What makes GGplot different? I know that I won't understand why this difference matters until I've played with it, but it will at least tell me where to start playing.


Here's another stab at why ggplot2 is a Good Thing. In the easiest plotting case, there is a function that builds exactly the visualization that you're looking for. This is great as long as you don't need to do anything that deviates from the normal set of bar plots, histograms, line graphs, pie charts (shudder), etc.

But let's say you need something a little different. Maybe you want to add additional dimensions to that dot plot. Maybe you want the dots' size or color to be mapped to different aspects of your data. In a world without ggplot2 (or more generally the grammar of graphics), you're pretty much stuck going to a very primitive drawing system in which you're specifying the virtual path of a pen, or working with basic geometric figures.
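
To make "mapping size or color to aspects of your data" a bit more concrete, here is a rough sketch in plain Python. It is not ggplot2's API or the Python port's; every name in it (`map_aesthetics`, `scale_linear`, `scale_discrete`, `Point`) is made up for illustration, and the car data is invented. The idea is only that you name which column drives which visual property, and a scale turns the values into sizes or colors for you.

```python
# Hypothetical illustration: none of these names come from ggplot2 or its Python port.
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float
    size: float
    color: str


def scale_linear(values, lo, hi):
    """Rescale numbers linearly onto the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0  # avoid dividing by zero when all values are equal
    return [lo + (v - vmin) / span * (hi - lo) for v in values]


def scale_discrete(values, palette):
    """Give each distinct category a color, in order of first appearance."""
    assigned = {}
    for v in values:
        if v not in assigned:
            assigned[v] = palette[len(assigned) % len(palette)]
    return [assigned[v] for v in values]


def map_aesthetics(data, x, y, size, color):
    """Turn column names into drawable points: the 'mapping' step."""
    sizes = scale_linear(data[size], 2.0, 10.0)
    colors = scale_discrete(data[color], ["red", "blue", "green"])
    return [Point(a, b, s, c)
            for a, b, s, c in zip(data[x], data[y], sizes, colors)]


# Invented example data, column-oriented like a tiny data frame.
cars = {
    "weight": [2.6, 3.2, 3.4],
    "mpg": [21.0, 18.7, 14.3],
    "horsepower": [110, 175, 245],
    "cylinders": [6, 8, 8],
}

points = map_aesthetics(cars, x="weight", y="mpg",
                        size="horsepower", color="cylinders")
```

Swapping `size="horsepower"` for another column changes the picture without touching any drawing code, which is roughly the convenience being described.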

The grammar of graphics and ggplot2 occupy a sweet middle ground between being able to simply pick an off-the-shelf visualization, and needing to draw the whole damn works manually. And because the grammar really is consistent, you can also play with different facets of your data and build completely different charts to see which is better at presenting your thesis.

In short, ggplot2 rocks, Hadley/Leland rock, and a port of ggplot2 to Python is nothing but good news for the Python community.


I mentioned the difference:

"a structured/formulaic way to think about and build graphs"

This is what plotting using the "grammar" provides. Each plot is like a sentence. A sentence can be composed from some of the following parts: verb, noun, adverb, adjective, interjection, pronoun, preposition, conjunction. Likewise, ggplot2 allows you to think of a graph in terms of layers, each of which can have different components that describe the "geometry" or "statistics" of the plot. Plus, there is a lot more.
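
As a toy model of the "layers" idea, assuming nothing about how ggplot2 is actually implemented: a plot is a mapping from data columns to axes plus a list of layers, and each layer names a geometry and a statistic. The classes `Plot` and `Layer` and the column names below are hypothetical; the sketch only builds and prints a description, it draws nothing.

```python
# Hypothetical illustration of composing a plot from layers; not ggplot2's real classes.
from dataclasses import dataclass


@dataclass
class Layer:
    geom: str               # what to draw, e.g. "point" or "line"
    stat: str = "identity"  # how to transform the data before drawing


@dataclass
class Plot:
    mapping: dict           # which data column drives which aesthetic
    layers: tuple = ()

    def __add__(self, layer):
        # Adding a layer returns a new plot and leaves the original untouched.
        return Plot(self.mapping, self.layers + (layer,))

    def describe(self):
        aes = ", ".join(f"{k}={v}" for k, v in self.mapping.items())
        lines = [f"plot({aes})"]
        lines += [f"  + {layer.geom} [stat={layer.stat}]" for layer in self.layers]
        return "\n".join(lines)


base = Plot({"x": "weight", "y": "mpg"})
scatter = base + Layer("point")
smoothed = scatter + Layer("line", stat="smooth")

print(smoothed.describe())
```

The same `base` can be reused with a different set of layers, which is the "sentence built from parts" analogy in miniature.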

You could probably describe your favourite plotting package as structured and formulaic, but an experience with ggplot2 would convince you otherwise.



