Hacker News

That’s a really good question and I’m not sure I fully understand all the reasons. One big one is that Java has intentionally made interop with native libraries quite difficult. One cannot take a language seriously for numerical computing if you can’t easily call BLAS, LAPACK and FFTW just to start, and there’s a never-ending supply of such efficient native libraries. Julia, on the other hand, makes it easy to write efficient code in Julia but also makes it easy to call native libraries written in other languages. Easy interop with numerical libraries was pretty much the first design criterion.

There are also some unfortunate choices Java made, like standardizing one specific semantics for reproducible floating-point code. That’s unfortunate because adjusting for native SIMD widths sacrifices reproducibility but improves both accuracy and speed. The only choice, if you want perfect reproducibility on all hardware that Java supports, is the slowest execution model with the worst accuracy.
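The root of the reproducibility problem is that floating-point addition isn’t associative: a SIMD reduction adds the same numbers in a different order than a scalar loop, which can change the result bit-for-bit. A minimal sketch of the effect (the constants are just illustrative):

```java
// Floating-point addition is not associative, so changing the reduction
// order -- exactly what a SIMD-width-dependent loop does -- can change
// the answer. Pinning one order pins one (slow) execution model.
public class FpOrder {
    public static void main(String[] args) {
        double a = 1e16, b = -1e16, c = 1.0;
        System.out.println((a + b) + c); // prints 1.0
        System.out.println(a + (b + c)); // prints 0.0 (b + c rounds back to -1e16)
    }
}
```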

There’s also the fact that Java integers are 32-bit and Java arrays are limited to 2GB, which was reasonable when Java was designed, but is pretty limiting for modern numerical computing.
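Concretely, Java array lengths and indices are int, so one array tops out at Integer.MAX_VALUE elements and anything larger must be chunked by hand. A sketch (the “minus 8” slack mirrors what some JDK classes use internally, but the exact constant here is an assumption):

```java
// Java array sizes are int, so a single array caps at ~2^31 - 1 elements.
public class ArrayLimit {
    public static void main(String[] args) {
        long wanted = 3L * 1024 * 1024 * 1024;      // 3 Gi elements desired
        // new double[wanted] does not even compile: array sizes must be int.
        int chunk = Integer.MAX_VALUE - 8;          // per-array cap, with header slack
        long chunks = (wanted + chunk - 1) / chunk; // manual chunking is the workaround
        System.out.println(chunks);                 // prints 2
    }
}
```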

I also think that the JVM object model is quite limiting for numerical computing. They still don’t support value types, but value types are precisely what you want to represent efficient numerical values like complex numbers, quaternions, rationals, and so on. Java forces all user-defined types to be heap-allocated reference types. Julia solves this by defaulting to immutable structures, which is exactly what you want for numerical values: the semantics are still referential (if you can’t mutate, you can’t distinguish value semantics from reference semantics), you just can’t change values, which is how you want numbers to behave (you don’t want to be able to change the value of 2).
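For contrast, the closest current Java gets to an immutable numeric struct is a record: immutable and final, but still a reference type, so an array of them is still an array of pointers. A sketch (Complex and its plus method are made up for illustration):

```java
// A record is value-like in spirit but still a heap-allocated reference
// type: Complex[] is an array of pointers, not an inline layout.
record Complex(double re, double im) {
    Complex plus(Complex o) { return new Complex(re + o.re, im + o.im); }
}

public class ComplexDemo {
    public static void main(String[] args) {
        Complex[] zs = { new Complex(1, 2), new Complex(3, 4) }; // 2 heap objects + pointer array
        Complex sum = new Complex(0, 0);
        for (Complex z : zs) sum = sum.plus(z); // replace, never mutate
        System.out.println(sum); // prints Complex[re=4.0, im=6.0]
    }
}
```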

Lack of value types in Java also makes memory management unnecessarily challenging. You can’t make a user-defined type with an efficient C-compatible array layout in Java: since objects are references, an array of them is an array of pointers to individual heap-allocated objects. The ability to subtype classes forces that layout, but even with final classes the ability to mutate objects forces it too, since modifying an object in an array means pulling a reference out and mutating through it (reference semantics), and that’s incompatible with an inline array layout.
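The usual workaround for that pointer-chasing layout is to flatten the data into a primitive array yourself, e.g. interleaving real and imaginary parts. A hand-rolled sketch, not any library’s API:

```java
// Manual "structure of arrays" packing: two complex numbers (1+2i, 3+4i)
// stored inline in one contiguous double[], no per-element heap objects.
public class PackedComplex {
    public static void main(String[] args) {
        double[] zs = {1, 2, 3, 4};          // re, im, re, im, ...
        double re = 0, im = 0;
        for (int i = 0; i < zs.length; i += 2) {
            re += zs[i];
            im += zs[i + 1];
        }
        System.out.println(re + "+" + im + "i"); // prints 4.0+6.0i
    }
}
```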

And finally, this same lack of value types puts a LOT of pressure on the garbage collector.



> I also think that the JVM object model is quite limiting for numerical computing. They still don’t support value types

This is mostly true, but the primitives are value types and you can get some things done with them. (Not enough to make Java good for these use cases, no.) E.g. write float[] instead of Float[] and you have a contiguously allocated region of memory that can be efficiently accessed.
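A sketch of that difference (names are made up): both loops compute the same sum, but the boxed version allocates an object per element and chases a pointer on every read.

```java
// float[] is one contiguous block of 4-byte floats; Float[] is a
// contiguous block of *references* to individual heap-allocated objects.
public class BoxingDemo {
    public static void main(String[] args) {
        int n = 1_000_000;
        float[] prim = new float[n];
        Float[] boxed = new Float[n];
        for (int i = 0; i < n; i++) {
            prim[i] = i;
            boxed[i] = (float) i;  // autoboxing: each store may allocate a Float
        }
        double sPrim = 0, sBoxed = 0;
        for (int i = 0; i < n; i++) {
            sPrim += prim[i];      // direct load
            sBoxed += boxed[i];    // pointer chase + unbox on every read
        }
        System.out.println(sPrim == sBoxed); // same values, very different memory traffic
    }
}
```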


the fact that they couldn't make primitive vs object invisible is just incredibly dumb though. it's one of the many ways c# is purely superior.


C# is certainly more flexible in this regard, but I wouldn’t say they made the distinction between primitive and object invisible; they just allowed user-defined “primitives” in the sense that users can define C-style structs that have value semantics. The type system is still bifurcated between value types and reference types. That’s ugly but livable in a static language where you can just disallow mixing the two kinds of types. And it’s not just theoretically ugly: it means you cannot write generic code that works for both kinds of types. Or you can yolo it like C++ templates and just allow whatever by textual substitution—to hell with the semantic differences. But of course then you get all kinds of surprising differences in fundamental behavior based on what kinds of types you try to apply your generics to. Which is exactly what happens with C++ templates.
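Java’s own generics show the cost of that bifurcation from the other side: type parameters only range over reference types, so primitives have to be boxed before generic code can touch them at all. A small sketch:

```java
import java.util.List;

// Generic code only sees reference types: List<int> is not legal Java,
// so the ints below are boxed to Integer just to pass through first().
public class GenericBoxing {
    static <T> T first(List<T> xs) { return xs.get(0); }

    public static void main(String[] args) {
        List<Integer> boxed = List.of(1, 2, 3); // autoboxed; List<int> won't compile
        int x = first(boxed);                   // auto-unboxed on the way out
        System.out.println(x);                  // prints 1
    }
}
```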

What do you do in a dynamic language where no such separation can be enforced? Or in a static language where you just don’t want that kind of ugly anti-generic bifurcation of your type system? The classic static PL answer is to just make all objects have reference behavior. (Which is the approach Java takes, with the pragmatic, ugly exception of primitive types.) This is the approach that ML and its derivatives take, and it’s why they’re unsuitable for numerical computing: in ML, Haskell, etc. objects have “uniform representation”, which means that all objects are represented as pointers (integers are usually represented as special invalid pointers for efficiency), including floating-point values. In other words, an array of floats is represented in ML et al. as an array of pointers to individual heap-allocated, boxed floats. That makes them incompatible with BLAS, LAPACK, FFTW, and just generally makes float arrays inefficient.

So what does Julia do? Instead of having mutable value and reference types, which have observably different semantics, it has only reference types but allows—and even defaults to—immutable reference types. Why does this help? Because immutable reference types have all the performance and memory benefits of value types! Yet they still have reference-compatible language semantics, because there’s no way to distinguish reference from value semantics without mutation. In other words, immutable structs can be references as far as the language semantics are concerned but values as far as the compiler and interop are concerned. And you can even recover efficient mutation whenever the compiler can see that you’re just replacing a value with a slightly modified copy—and compilers are great at that kind of optimization. So all you give up is the ability to mutate your value types—which you don’t even want to allow for most numeric types anyway—and which you can simulate by replacement with modification. This seems like a really good trade-off, and it’s a little surprising that more languages don’t make it.
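In Java terms, “replacement with modification” looks like building a copy with one field changed rather than assigning to the field; Point and its withX helper below are hypothetical names for illustration:

```java
// Immutable value "updated" by replacement: no field assignment anywhere.
record Point(double x, double y) {
    Point withX(double nx) { return new Point(nx, y); } // copy with one field changed
}

public class Replace {
    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = p.withX(5);            // "modify" by replacing the whole value
        System.out.println(p + " " + q); // prints Point[x=1.0, y=2.0] Point[x=5.0, y=2.0]
    }
}
```

A compiler that can prove p is dead after the replacement is free to reuse its storage, which is the optimization described above.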


> C++ templates and just allow whatever by textual substitution

The Julia approach is indeed elegant. An in-between position taken by F# is statically resolved type parameters, where type safety is maintained and semantics are largely preserved by constraints on members. It’s a pain to write but trivial to consume, though errors can be abstruse. More type-level flexibility on the CLR (which seems to be planned) will further improve things when it comes to generic programming.



