
PyTorch is a project primarily funded and developed by Facebook/FAIR. Pandas is a fully open-source project, without corporate control.

Corporate control means tighter development schedules and consistent APIs. It also means that if you don't like the path FAIR has chosen, too bad. As a result, there are multiple competing options in the deep learning space: TensorFlow (Google), MXNet (Amazon), CNTK (Microsoft), Paddle (Baidu), etc.

On the other hand, Pandas is something for everyone. The lack of opinionation means that it can be easily adopted anywhere. Can you imagine what data science/analysis would feel like with multiple low-level Pandas competitors, from different corporations? Each one would feel consistent, but none would work together (and imagine building an ML platform which supported multiple dataframe sources).

I do sometimes miss working in R - yes, R takes flexibility to a fault, but there's a consistent set of primitives that mostly get reused. Perhaps R gives off that impression because of the work done by Hadley and others to build tooling according to the tidyverse principles. I wonder if Julia will combine the best of these worlds in the future.



> Can you imagine what data science/analysis would feel like with multiple low-level Pandas competitors, from different corporations? Each one would feel consistent, but none would work together (and imagine building an ML platform which supported multiple dataframe sources).

> I do sometimes miss working in R - yes, R takes flexibility to a fault, but there's a consistent set of primitives that mostly get reused. Perhaps R gives off that impression because of the work done by Hadley and others to build tooling according to the tidyverse principles. I wonder if Julia will combine the best of these worlds in the future.

Funny that you mention R, which has exactly what you criticized before (base R data.frame, tidyverse/tibble, data.table), not to mention at least 6 different packages/data structures to represent time series.


I feel data.frame and tibble are mostly compatible (you can use tidyverse tools on dataframes), and nearly all R users use one or the other, while data.table is used by a few finance folks who grumble about how slow tidyverse is.


Tidyverse tools work on data.frame objects but they do have the unfortunate and undocumented habit of turning them into tibbles.



