This package implements the go_marshal utility.

# Overview

`go_marshal` is a code generation utility, similar to `go_stateify`, that
automatically generates code to marshal Go data structures to memory.

`go_marshal` attempts to improve on `binary.Write` and the sentry's
`binary.Marshal` by moving the reflection needed to marshal a struct from
run time to compile time.
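
To make the difference concrete, the sketch below encodes the same struct twice: once with the standard library's `binary.Write`, which inspects the struct through reflection on every call, and once with a hand-written function in the style a code generator could emit. The `Stat` type and the function names are hypothetical and chosen only for illustration; this is not the code `go_marshal` actually generates.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Stat is a hypothetical fixed-size ABI struct, used only for illustration.
type Stat struct {
	Mode uint32
	UID  uint32
	Size uint64
}

// marshalReflect encodes s with binary.Write, which inspects the struct
// layout through reflection on every call.
func marshalReflect(s *Stat) []byte {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, s); err != nil {
		panic(err) // Not expected: Stat contains only fixed-size fields.
	}
	return buf.Bytes()
}

// marshalDirect is hand-written in the style a code generator could emit:
// field offsets and sizes are fixed in the source, so no reflection is needed.
func marshalDirect(s *Stat) []byte {
	dst := make([]byte, 16)
	binary.LittleEndian.PutUint32(dst[0:4], s.Mode)
	binary.LittleEndian.PutUint32(dst[4:8], s.UID)
	binary.LittleEndian.PutUint64(dst[8:16], s.Size)
	return dst
}

// sameEncoding reports whether both approaches produce identical bytes.
func sameEncoding(s *Stat) bool {
	return bytes.Equal(marshalReflect(s), marshalDirect(s))
}

func main() {
	s := &Stat{Mode: 0644, UID: 1000, Size: 4096}
	fmt.Println(sameEncoding(s)) // true
}
```

Both functions produce the same bytes; the second one simply has the layout decided before the program runs.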

`go_marshal` automatically generates implementations for `abi.Marshallable` and
`safemem.{Reader,Writer}`. Call-sites for serialization (typically syscall
implementations) can directly invoke `safemem.Reader.ReadToBlocks` and
`safemem.Writer.WriteFromBlocks`. Data structures that require custom
serialization will have manual implementations for these interfaces.

Data structures can be flagged for code generation by adding a struct-level
comment `// +marshal`.
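
As a minimal sketch, a struct flagged for code generation might look like the following. The `Timespec` type and the `wireSize` helper are hypothetical; the helper uses the standard library only to show that the struct has a fixed size.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Timespec is a hypothetical ABI struct. The "+marshal" line in its doc
// comment is what flags it for code generation.
//
// +marshal
type Timespec struct {
	Sec  int64
	Nsec int64
}

// wireSize returns the number of bytes Timespec occupies when marshalled.
func wireSize() int {
	return binary.Size(Timespec{})
}

func main() {
	fmt.Println(wireSize()) // 16
}
```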

# Usage

See `defs.bzl`: two new rules are provided, `go_marshal` and `go_library`.

The recommended way to generate a Go library with marshalling is to use the
`go_library` rule, configured almost identically to the native `go_library`
rule.

```
load("<PKGPATH>/gvisor/tools/go_marshal:defs.bzl", "go_library")

go_library(
    name = "foo",
    srcs = ["foo.go"],
)
```

Under the hood, the `go_marshal` rule is used to generate a file that will
appear in a Go target; the output file should appear explicitly in a srcs list.
For example (note that the above is the preferred method):

```
load("<PKGPATH>/gvisor/tools/go_marshal:defs.bzl", "go_marshal")

go_marshal(
    name = "foo_abi",
    srcs = ["foo.go"],
    out = "foo_abi.go",
    package = "foo",
)

go_library(
    name = "foo",
    srcs = [
        "foo.go",
        "foo_abi.go",
    ],
    deps = [
        "<PKGPATH>/gvisor/pkg/abi",
        "<PKGPATH>/gvisor/pkg/sentry/safemem/safemem",
        "<PKGPATH>/gvisor/pkg/sentry/usermem/usermem",
    ],
)
```

As part of the interface generation, `go_marshal` also generates some tests for
sanity checking the struct definitions for potential alignment issues, and a
simple round-trip test through Marshal/Unmarshal to verify the implementation.
These tests use reflection to verify properties of the ABI struct, and should be
considered part of the generated interfaces (but are too expensive to execute at
runtime). Ensure these tests run at some point.
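
The kind of check described above can be sketched with the standard library alone. The example below is not the generated test code; it is a hypothetical illustration of a reflection-based layout check (`isPacked`) and a round trip through `binary.Write` and `binary.Read`. The `Header` and `Loose` types are made up for the example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"reflect"
)

// Header is a hypothetical ABI struct with no implicit padding.
type Header struct {
	Type  uint16
	Flags uint16
	Len   uint32
	Seq   uint64
}

// Loose is a hypothetical struct where the compiler pads after A so that B
// is 4-byte aligned.
type Loose struct {
	A uint8
	B uint32
}

// isPacked reports whether the struct value v is laid out in memory without
// implicit padding: every field starts where the previous one ends, and
// there is no padding after the last field.
func isPacked(v interface{}) bool {
	t := reflect.TypeOf(v)
	var next uintptr
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Offset != next {
			return false
		}
		next += f.Type.Size()
	}
	return next == t.Size()
}

// roundTrips encodes h, decodes the bytes into a fresh Header, and reports
// whether the result equals the input.
func roundTrips(h Header) bool {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, h); err != nil {
		return false
	}
	var got Header
	if err := binary.Read(&buf, binary.LittleEndian, &got); err != nil {
		return false
	}
	return got == h
}

func main() {
	fmt.Println(isPacked(Header{})) // true
	fmt.Println(isPacked(Loose{}))  // false
	fmt.Println(roundTrips(Header{Type: 1, Flags: 2, Len: 3, Seq: 4})) // true
}
```

Checks like these rely on reflection, which is why they belong in tests and not in the marshalling path itself.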

```
$ cat BUILD
load("<PKGPATH>/gvisor/tools/go_marshal:defs.bzl", "go_library")

go_library(
    name = "foo",
    srcs = ["foo.go"],
)
$ blaze build :foo
$ blaze query ...
<path-to-dir>:foo_abi_autogen
<path-to-dir>:foo_abi_autogen_test
$ blaze test :foo_abi_autogen_test
<test-output>
```

# Restrictions

Not all valid Go type definitions can be used with `go_marshal`. `go_marshal` is
intended for ABI structs, which have these additional restrictions:

-   At the moment, `go_marshal` only supports struct declarations.

-   Structs are marshalled as packed types. This means no implicit padding is
    inserted between fields shorter than the platform register size. For
    alignment, manually insert padding fields.

-   Structs used with `go_marshal` must have a compile-time static size. This
    means no dynamically sized fields like slices or strings. Use statically
    sized arrays (byte arrays for strings) instead.

-   No pointer, channel, map, or function pointer fields, and no fields that
    are arrays of these types. These don't make sense in an ABI data structure.

-   We could support opaque pointers as `uintptr`, but this is currently not
    implemented. Implementing this would require handling the
    architecture-dependent native pointer size.

-   Fields must either be a primitive integer type (`byte`,
    `[u]int{8,16,32,64}`), or of a type that implements `abi.Marshallable`.

-   `int` and `uint` fields are not allowed. Use an explicitly-sized numeric
    type.

-   `float*` fields are currently not supported, but could be if necessary.
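
A struct that follows these restrictions could look like the sketch below. The `Label` type, its fields, and the `SetName` helper are hypothetical; the example only uses the standard library to confirm that the struct has a static size.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Label is a hypothetical ABI struct that follows the restrictions above.
//
// +marshal
type Label struct {
	ID     uint32   // Explicitly sized integer instead of int or uint.
	Pad    uint32   // Manual padding so that Length is 8-byte aligned.
	Length uint64   // Explicitly sized integer.
	Name   [16]byte // Fixed-size byte array instead of a string.
}

// SetName copies s into Name, truncating it if needed and always leaving a
// trailing NUL byte.
func (l *Label) SetName(s string) {
	l.Name = [16]byte{}
	copy(l.Name[:len(l.Name)-1], s)
}

// labelSize returns the static size of Label in bytes.
func labelSize() int {
	return binary.Size(Label{})
}

func main() {
	var l Label
	l.SetName("a name longer than sixteen bytes")
	fmt.Println(labelSize()) // 32
	fmt.Println(l.Name[15])  // 0
}
```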

# Appendix

## Working with Non-Packed Structs

ABI structs must generally be packed types, meaning they should have no implicit
padding between short fields. However, if a field is tagged
`marshal:"unaligned"`, `go_marshal` will fall back to a safer but slower
mechanism to deal with potentially unaligned fields.

Note that the non-packed property is inherited by any other struct that embeds
this struct, since the `go_marshal` tool currently can't reason about alignments
for embedded structs that are not aligned.

Because of this, it's generally best to avoid using `marshal:"unaligned"` and
insert explicit padding fields instead.
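
The two options can be compared in a short sketch. Both types below are hypothetical, and the choice of which field carries the `marshal:"unaligned"` tag is an assumption made for illustration. The example uses only the standard library to show where the compiler inserts padding; it does not invoke `go_marshal`.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"reflect"
)

// Implicit relies on compiler-inserted padding: Go places Value at offset 4,
// leaving 3 unnamed padding bytes after Kind. Tagging Value here is an
// assumption made for illustration.
type Implicit struct {
	Kind  uint8
	Value uint32 `marshal:"unaligned"`
}

// Explicit spells the padding out, so its in-memory layout and its packed
// encoding have the same size.
type Explicit struct {
	Kind  uint8
	Pad   [3]uint8
	Value uint32
}

// sizes returns the packed size of v (the sum of its field sizes) and its
// in-memory size, which includes any implicit padding.
func sizes(v interface{}) (packed, inMemory int) {
	return binary.Size(v), int(reflect.TypeOf(v).Size())
}

// marshalTag returns the "marshal" struct tag on the named field of Implicit,
// or an empty string if the field does not exist.
func marshalTag(field string) string {
	f, ok := reflect.TypeOf(Implicit{}).FieldByName(field)
	if !ok {
		return ""
	}
	return f.Tag.Get("marshal")
}

func main() {
	fmt.Println(sizes(Implicit{}))   // 5 8
	fmt.Println(sizes(Explicit{}))   // 8 8
	fmt.Println(marshalTag("Value")) // unaligned
}
```

With explicit padding the two sizes agree, which is the property a packed ABI struct needs.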

## Debugging go_marshal

To enable debugging output from the `go_marshal` tool, pass the `-debug` flag
to the tool. When using the build rules from above, add a `debug = True`
attribute to the build rule like this:

```
load("<PKGPATH>/gvisor/tools/go_marshal:defs.bzl", "go_library")

go_library(
    name = "foo",
    srcs = ["foo.go"],
    debug = True,
)
```

## Modifying the `go_marshal` Tool

The following are some guidelines for modifying the `go_marshal` tool:

-   The `go_marshal` tool currently does a single pass over all types requesting
    code generation, in arbitrary order. This means the generated code can't
    directly obtain information about embedded marshallable types at
    compile-time. One way to work around this restriction is to add a new
    Marshallable interface method providing this piece of information, and
    calling it from the generated code. Use this sparingly, as we want to rely
    on compile-time information as much as possible for performance.

-   Do not use runtime reflection in the code generated for the marshallable
    interface. The entire point of the tool is to avoid runtime reflection. The
    generated tests may use reflection.
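
The workaround from the first guideline, adding an interface method that exposes information about an embedded type, can be sketched as follows. The reduced `Marshallable` interface, the `Packed` method, and both struct types are hypothetical and hand-written here; they only illustrate how code for an outer type could query an embedded type at run time without reflection.

```go
package main

import "fmt"

// Marshallable is a reduced, hypothetical stand-in for the real interface.
// Packed is the extra method: it exposes a property of a type that the
// generator could not determine while emitting code for an enclosing type.
type Marshallable interface {
	SizeBytes() int
	Packed() bool
}

// Inner stands in for a type whose generated code reports it as not packed.
type Inner struct {
	Kind  uint8
	Value uint32
}

func (*Inner) SizeBytes() int { return 5 }
func (*Inner) Packed() bool   { return false }

// Outer embeds Inner. Its methods ask Inner for the missing information
// through ordinary method calls, without any reflection.
type Outer struct {
	ID uint64
	Inner
}

func (o *Outer) SizeBytes() int { return 8 + o.Inner.SizeBytes() }
func (o *Outer) Packed() bool   { return o.Inner.Packed() }

// Compile-time check that *Outer satisfies the interface.
var _ Marshallable = (*Outer)(nil)

func main() {
	o := &Outer{}
	fmt.Println(o.Packed(), o.SizeBytes()) // false 13
}
```

Each such method adds a call at run time, which is why the guideline asks for it to be used sparingly.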