Package Specialization in Go
This is a follow-up to my inital post about Go Generics. In this post I attempt to scale back the approach outlined there to be the simplest possible while still providing the full power of parametrized code, and sketch the semantics more precisely. The problem space I’m trying to explore remains the same, which is to use interfaces as a means to write generic code in Go.
Headline features are:
- use interfaces to express parametric types;
- no new keyword;
- the only new syntax is “package specialization”:
pkg(T=int, Y=*MyType).Func(); - generic code which doesn’t use the package specialization syntax is already valid Go.
Overview
This proposal introduces the ability to specialize a package. This is
achieved by “replacing” a set of interface types defined in that package with
some concrete types. In this section I will go through a basic example, before
attempting to define more precisely the syntax and semantics of this feature and
then giving a more varied set of examples. The slice package below defines a
Reverse function that reverses a slice of type T in place.
package slice
type T interface{}
func Reverse(l []T) {
i, j := 0, len(l) - 1
for i < j {
l[i], l[j] = l[j], l[i]
i++
j--
}
}
Under this proposal, I can use this to reverse a slice of any type like this:
package main
import "github.com/arnodel/slice"
func main() {
hw := []byte("!dlrow ,olleH")
// Type T is "specialized" to byte in the slice package
slice(T=byte).Reverse(hw)
fmt.Println(string(hw))
// ... means the compiler may infer the type specializations
slice(...).Reverse(hw)
fmt.Println(string(hw))
}
In particular, existing packages can now be specialized without change
(including in the standard library). Here is an example from the sort
package.
package main
import sort
func sortStrings(l []string) {
// This is different from sort.Sort(sort.StringSlice(l))
// because the specialized Sort function has type
// func(sort.StringSlice)
// instead of
// func(sort.Interface)
sort(...).Sort(sort.StringSlice(l))
}
Syntax
The only syntactical change to Go required to express this is to the Qualified Identifier rule, which is changed from the current rule
QualifiedIdent = PackageName "." identifier .
to the following rule
QualifiedIdent = PackageName [ PackageSpec ] "." identifier .
where PackageSpec is defined as follows (AliasDecl being the Go Alias
Declaration rule)
PackageSpec = "(" { AliasDecl ";" } [ "..." ] ")" .
Semantics
The following explanation could be made rigorous, but I think that would be at the expense of a lot of readability so I have brushed over details intentionally. I have kept some form of mathematical notation though to avoid being to verbose, although I am aware that the result is far from satisfactory
Package specialization $\sigma$
We define a package specialization as
where
- $P$ is a package that defines public interface types $I = I_1, \ldots, I_n$ at package scope;
- $T = T_1, \ldots, T_n$ are types defined in packages $P_1, \ldots, P_n$ ($P_i$ is allowed to be the “builtin” package if $T_i$ is a builtin type).
Example. In the expression slice(T=byte).Reverse() in the previous
section, the specialization $\sigma$ is slice(T=byte).
The concrete package of $\sigma$
We can define the concrete package $K_\sigma$ of $\sigma$ as follows.
- If there is a package $K\in{P, P_1, \ldots, P_n}$ so that $K$ import all the other packages in this set, then $K_\sigma$ is defined to be $K$ but on a different namespace (to allow dependencies between both but avoid name clashes).
- Otherwise we make a unique package $K_\sigma = K(P, P_1,\ldots,P_n)$ which imports all of $P, P_1, \ldots, P_n$.
Example. The concrete package of the specialization slice(T=byte), is the
slice package itself as the byte type is builtin.
Implementation of $K_\sigma$
The package $K_\sigma$ is populated with
- definitions for $I_1, \ldots, I_n$ as $T_1, \ldots, T_n$ (this is possible because $K_\sigma$ imports all of $P_1, \ldots, P_n$);
- all the package-scope definitions from the original package $P$ that depend on $I_1, \ldots, I_n$;
- package-scope definitions which do not depend on $I_1, \ldots, I_n$ are just aliases for the original ones in $P$.
If this results in valid Go, then we call $\sigma$ a valid specialization.
Example. To implement the slice(T=byte) specialization, we add the
following definitions to the slice package, where __spec__ represents a
namespace prefix unique to this specialization, resulting in the following
definitions.
type __spec__T = byte
func __spec__Reverse(l []__spec__T) {
i, j := 0, len(l) - 1
for i < j {
l[i], l[j] = l[j], l[i]
i++
j--
}
}
Implementation of $\sigma.S$
Now if $S$ is an identifier declared in $P$ at package scope, then $P(I_1=T1, \ldots, I_n=T_n).S$ is simply an alias for $K_\sigma.S$.
Example. The expression slice(T=byte).Reverse() is resolved as
slice.__spec__Reverse().
Examples
In this section, I am trying to give simple but realistic examples of issues that this proposal would be able to solve. I have picked a few use cases from the Why Generics? post from the Go Blog.
A generic set package
Here is a simple generic set package.
package set
type Item interface{}
type Set map[Item]struct{}
func New() Set {
return make(Set)
}
func (s Set) Add(x Item) {
s[x] = struct{}{}
}
func (s Set) Remove(x Item) {
delete(s, x)
}
func (s Set) Has(x Item) bool {
_, ok := s[x]
return ok
}
I can implement a generic Uniq function using it:
package uniq
import "github.com/arnodel/set"
type Item interface{}
func Uniq(items []Item) []Item {
seen := set(Item=Item).New() // <- Specialization of set here!
var uniq []Item
for _, x := range items {
if !seen.Has(x) {
seen.Add(x)
uniq = append(uniq, x)
}
}
return uniq
}
Generic channel operations
This is copied almost verbatim from the Pipelines post in the Go Blog.
package channel
type T interface{}
func Merge(cs ...<-chan T) <-chan T {
var wg sync.WaitGroup
out := make(chan T)
// Start an output goroutine for each input channel in cs. output
// copies values from c to out until c is closed, then calls wg.Done.
output := func(c <-chan T) {
for n := range c {
out <- n
}
wg.Done()
}
wg.Add(len(cs))
for _, c := range cs {
go output(c)
}
// Start a goroutine to close out once all the output goroutines are
// done. This must start after the wg.Add call.
go func() {
wg.Wait()
close(out)
}()
return out
}
It could then be used as follows to make a basic implementation of tail -f
package main
import "github.com/arnodel/channel"
func main() {
files := getInputFiles()
inputChans := make([]chan string, len(files))
for i, f := range files {
inputChans[i] := getLineChannel(f)
}
outputChan := channel(T=string).Merge(intpuChans...)
for l := range outputChan {
fmt.Println(l)
}
}
func getInputFiles() []io.Reader {
// Return the files specified on the command line.
}
func getLineChannel(r io.Reader) chan string {
// Return a channel onto which lines from the reader are pushed.
}
Generic Map and Reduce functions on slices
Imagine I have a slice package that provide some slice manipulation primitives.
package slice
type T interface{}
type U interface{}
func Map(ts []T, f func(T) U) []U {
us := make([]U, len(ts))
for i, t := range ts {
us[i] = f(t)
}
return us
}
func Reduce(ts []T, f func(T, U) U, u U) U {
for _, t := tange ts {
u = f(t, u)
}
return u
}
This package defines “generic” Map and Reduces functions. They could be
used as is, but that wouldn’t be very useful because of the constant interface
wrapping / unwrapping that would be required. But specializing them makes them
more universally applicable.
Now imagine I want to use those in my vector package.
package vector
import "github.com/arnodel/slice"
type Scalar interface {
Add(Scalar) Scalar
Mul(Scalar) Scalar
}
type Vector []Scalar
func (v Vector) Norm() Scalar {
sq := slice(T=Scalar; U=Scalar).Map(v, func (x Scalar) { return x.Mul(x) })
sumsq := slice(T=Scalar; U=Scalar).Reduce(sq, Scalar.Add)
return math.Sqrt(sumsq)
}
// Many more functions!
I can specialize the vector.Vector type to Float
package vec64
import "github.com/arnodel/vector"
type Float64 float64
func (x Float64) Add(y Float64) {
return x + y
}
func (x Float64) Mul(y Float64) {
return x*y
}
type Vec64 = vector(Scalar=Float64).Vector
Mutually referencing type parameters
In the Go Contracts Proposal there is an example of generic types / functions
with mutually referencing type
parameters
which involves a graph package defining a contract over two type parameters:
Node and Edge. It can be expressed using package specialization as follows.
Note that the code is valid Go already!
package graph
type Node interface {
Edges() []Edge
}
type Edge interace {
Nodes() (from Node, to Node)
}
type Graph struct { ... }
func (g *Graph) ShortestPath(from, to Node) []Edge { ... }
// For reference, the contract proposal implements this package as follows:
//
// contract G(Node, Edge) {
// Node Edges() []Edge
// Edge Nodes() (from Node, to Node)
// }
// type Graph(type Node, Edge G) struct { ... }
// func New(type Node, Edge G)(nodes []Node) *Graph(Node, Edge) { ... }
// func (g *Graph(Node, Edge)) ShortestPath(from, to Node) []Edge { ... }
This package can then be used as follows in another package
type Vertex struct { ... }
func (v *Vertex) Edges() []*FromTo { ... }
type FromTo struct { ... }
func (ft *FromTo) Nodes() (*Vertex, *Vertex) { ... }
var g = graph(Node=*Vertex, Edge=*FromTo).New([]*Vertex)
// In the contract proposal, this is how you instanciate a graph:
//
// var g = graph.New(*Vertex, *FromTo)([]*Vertex{ ... })