This is the best outline of the Go generics issue I've seen. As an outsider to the language (for now), thanks for writing this.
What's the current conversation on sharing binary code? Unless I'm missing something, this is fundamentally incompatible with generic functions that are optimized at compile time. Either the compiled version has to have implementations for "every valid type" or the shared code must be recompiled at the time it is consumed, right?
It looks like the Go ecosystem tends to discourage distributing precompiled binary packages, so maybe this is not a real issue.
Going meta regarding tech articles, I find that it is easy to read some text-heavy articles (this being one of them) while my eyes glaze over other text-heavy articles (I couldn't easily follow Dave Cheney's article on the same topic of Go generics a few days back, though it had a slightly different focus and was much shorter). I've wondered what it is about specific pages/articles/writing that makes them flow nicely and easy to read -- is it the layout of the page (font/font-size/other styling), or how the writing flows together, or how much thought the author put into presenting what they are thinking, or is it me (time of day, interests, etc.)? Maybe it is just everything!
While the typesetting helps here, I think a lot is due to the writing style. Writing easy-to-read content is difficult, and it’s usually not that obvious when you’re reading it, because you’re absorbing the concepts rather than struggling with the language.
If I remember correctly, Go only allows source packages and does whole-program optimization. They designed their compiler to be fast to enable this.
The author promotes code generation as a practical solution to some problems, but doesn't mention its huge drawbacks for generics, which are why it's generally not used for them.
The reason is that while you can generate a specific implementation when you need it, all the client code will still be using non-generic code. You can't write a function that takes a Set(T) as argument, returns Set(T) as a return value, or receives a Set(T) from a called function — unless you make that part of your code generation pipeline. So the code generation becomes viral, infecting each piece of code that wants to be generic. The entire language needs to be transpiled.
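To make that virality concrete, here is a rough Go sketch of what a generator might emit for T=int (the names SetInt and UnionInt are hypothetical, not from any real tool):

```go
package main

import "fmt"

// Hypothetical output of a code generator run for T=int.
// There is no way to write "func Union(a, b Set(T)) Set(T)"
// generically; the concrete type must appear everywhere.
type SetInt map[int]struct{}

func (s SetInt) Add(v int)      { s[v] = struct{}{} }
func (s SetInt) Has(v int) bool { _, ok := s[v]; return ok }

// This function is now tied to SetInt. To also handle sets of
// strings, the generator must emit a separate UnionString --
// the concrete types leak into every caller's signature.
func UnionInt(a, b SetInt) SetInt {
	out := SetInt{}
	for v := range a {
		out.Add(v)
	}
	for v := range b {
		out.Add(v)
	}
	return out
}

func main() {
	a, b := SetInt{}, SetInt{}
	a.Add(1)
	b.Add(2)
	fmt.Println(len(UnionInt(a, b))) // 2
}
```

Any package that wants to pass sets around generically has to run the same generator with the same settings, which is exactly the "infection" described above.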
You could lower the barrier considerably by supporting some kind of pluggable preprocessing ("//go:preprocess my_generics_syntax_plugin"), but you'd still run into the problem with combining packages from multiple sources that all wanted to share the same generic types. They would all have to be transpiled the same way.
Here it depends on how you want to use generics. For map<T1, T2> this isn't much of an issue, since whenever you use it you'll probably be using a concrete instantiation.
Your point still stands though, which is why proper generics aren't something you 'tack on' to a language. You have to be prepared for them to be used pervasively. This is probably why templates run so deep in the C++ standard library: an attempt to ensure templates really are 'part of the language'.
Sadly, in this case it really expanded the language.
Having a lot of code go through generics isn't necessarily an issue. The potential issue comes from having a lot of different concrete instantiations of your abstract code. That issue might be solvable by smarter compilers.
Within Go this is not an issue, because Go dependencies are source-based. Outside of Go (say, when compiling a C-compatible library) it's trickier, but I don't immediately see why Go generics can't reduce to the intuitive pattern you'd get from selective/sparse template instantiation in your dependency before it's included.
They're source-based, but that still requires that the caller (calling into a package that's using generated code for generics) use the exact same transpilation tech to generate its own code; if it doesn't, it can only interface with the concrete instances of the generic types (e.g. Set_int or whatever).
Generics spreading throughout your app is not the problem; the problem is that a specialized code-generation solution would have to work this way. Both apps and libraries would need the same level of support, and at the end of the day you've invented a new language that happens to transpile to Go, on which you'll want to build everything. (In fact, such a language has already been created [1].) It's the awkwardness of the split between ordinary code and transpilation that causes the issues, along with the resulting interop problems and fragmentation/competition between flavours.
I spent a bit of time thinking about this when I was younger, and got as far as thinking about the compiler or runtime relying on some sort of code reuse to improve the space complexity of generics. Each use would have to have its own metadata but should be able to share many function bodies.
It won’t work for primitives but I find myself using arrays of primitives less and less.
Swift does this. A generic function compiles to a single actual function, which accepts metadata about the types passed in, allowing it to do the things it needs. The optimizer may, if it decides it's worthwhile, generate specialized versions of the function, but that's never necessary.
I'm guessing the final Go solution will be like this, but with both runtime/boxed and static/type-switch implementations generated by default, and some heuristic for choosing between or composing them.
She mentions the C, C++, and Java approaches. But I think she should have added the C# approach: all the advantages of the Java approach, without the slowness of casting.
Casting isn't what's slow; it's the boxing. A generic List<T> isn't much slower than the non-generic List if T is some class. But if T is, say, an integer, then you're paying for the implicit boxing and unboxing.
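The same boxing effect can be shown in Go terms, comparing an interface{}-based container with a type-parameterized one (Go 1.18+); the container types here are just a sketch for illustration:

```go
package main

import "fmt"

// Non-generic container: every int stored in it is boxed into
// an interface value (a type word plus a data pointer).
type AnyList struct{ items []interface{} }

func (l *AnyList) Add(v interface{})     { l.items = append(l.items, v) }
func (l *AnyList) Get(i int) interface{} { return l.items[i] }

// Generic container: ints are stored unboxed, and no runtime
// type assertion is needed on the way out.
type List[T any] struct{ items []T }

func (l *List[T]) Add(v T)     { l.items = append(l.items, v) }
func (l *List[T]) Get(i int) T { return l.items[i] }

func main() {
	boxed := &AnyList{}
	boxed.Add(42)
	n := boxed.Get(0).(int) // unboxing: runtime type assertion

	typed := &List[int]{}
	typed.Add(42)
	m := typed.Get(0) // no assertion, no boxing

	fmt.Println(n, m) // 42 42
}
```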
A generic List<T> isn't much slower than the non-generic List if T is some class.
This is only true for generics using erasure, such as in Java, since Java stores Objects (pointers to objects) in both cases. But in languages that use e.g. monomorphization, a List<T> stores the Ts adjacent in memory, since the size of each element is known, whereas a non-generic list has to store pointers to objects, since the object sizes are not known and may vary per element. A monomorphized List<T> has two large benefits: fewer cache misses due to locality, and more opportunities to inline T's method implementations.
You can only do that if you can prove that T will never have subclasses that are larger.
To check that, the compiler either has to see all code that will run in the process, which means whole program compilation and no plugins, or the language needs a way to make classes that cannot be subclassed, preventing code that the compiler didn’t see from subclassing T (https://en.wikipedia.org/wiki/Class_(computer_programming)#N...)
You can only do that if you can prove that T will never have subclasses that are larger.
Only if generic type parameters are covariant; if type parameters are invariant, this is not a problem. I'd guess that most languages with subtyping use invariance by default.
Also, many languages that do generics through monomorphization do not support subclassing (Rust, Haskell, etc.).
no plugins,
Well, there is always the option of boxing collection elements. (E.g. by using Box in Rust.)
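In Go terms, that boxing fallback means storing a uniform-size interface header per slot instead of the values inline, analogous to a Vec<Box<dyn Trait>> in Rust. A minimal sketch (the Shape types here are made up for illustration):

```go
package main

import "fmt"

// When element sizes could vary, the fallback is to box: each
// slot holds a uniform-size interface header pointing at a
// separately allocated value.
type Shape interface{ Area() float64 }

type Circle struct{ R float64 }
type Rect struct{ W, H float64 }

func (c Circle) Area() float64 { return 3.14159 * c.R * c.R }
func (r Rect) Area() float64   { return r.W * r.H }

func main() {
	// Circle and Rect have different sizes, but every slot in
	// this slice is the same size: an interface header.
	shapes := []Shape{Circle{R: 1}, Rect{W: 2, H: 3}}
	total := 0.0
	for _, s := range shapes {
		total += s.Area()
	}
	fmt.Printf("%.2f\n", total) // 9.14
}
```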
In general, Haskell does not do parametric polymorphism through monomorphization. In particular, higher-rank polymorphism becomes unusable if polymorphism is implemented through monomorphization. On the other hand, if I recall correctly, Rust only supports higher-rank polymorphism for lifetimes, which neither have nor need a runtime representation. This is why monomorphization is a viable implementation strategy for Rust generics.
(Aside: Type checkers essentially see recursive function definitions as applications of a fixed point operator to non-recursive functions. If your function has a rank-1 type but uses polymorphic recursion, the type checker sees it as the application of a rank-2 fixed point operator to a non-recursive function. This is why I see polymorphic recursion as “morally higher-rank polymorphism”, even when the type signatures in your code are ostensibly rank-1 ones. Polymorphic recursion is widely used in Haskell.)
However, IMO, you only need rank-1 polymorphism 95% of the time anyway, so optimizing for the common use case is a good strategy. By far, the main use case for generics is implementing efficient and reasonably reusable data structures and algorithms in a reasonably type-safe way. For this use case, monomorphization and aggressive inlining of small functions are evidently the right things to do. Other uses of generics (say, streaming I/O frameworks) strike me as a lot more questionable.
Pointing to inlining and saying "problem solved" isn't really a solution, because inlining can lead to code bloat just like templates can; both end up generating the same specialized code.
For this to be an implementation detail, the compiler needs to be able to choose not to inline sometimes.