avoid checked entry points in closures

Given the code like this:

void main(List<String> args) {
  // Condition here just to avoid inlining of the closure
  int Function(int) x = args.length > 0 ? (x) => x + 1 : (x) => x + 2;
  x(args.length);
}

we produce the following graph for the closure entry:

After AllocateRegisters
==== file:///tmp/x.dart_::_main_<anonymous closure>
  0: B0[graph]:0 {
      v0 <- Constant(#null) T{Null?}
      v1 <- Constant(#<optimized out>) T{_OneByteString}
      v6 <- Constant(#2) [2, 2] T{_Smi}
      v17 <- Constant(#0) [0, 0] T{_Smi}
      v26 <- Constant(#1) [1, 1] T{_Smi}
}
  2: B1[function entry]:2 {
      v31 <- Parameter(0) T{*?}
      v32 <- Parameter(1) T{*?}
      v33 <- SpecialParameter(ArgDescriptor) T{_ImmutableList}
}
  3:     ParallelMove rax <- S+2
  4:     ParallelMove rax <- C, rbx <- rax, r10 <- r10 goto:64 B12
  6: B13[function entry]:66 {
      v2 <- Parameter(0) T{*?}
      v3 <- Parameter(1) T{int?}
      v4 <- SpecialParameter(ArgDescriptor) T{_ImmutableList}
}
  7:     ParallelMove rax <- S+2
  8:     ParallelMove rax <- C, rbx <- rax, r10 <- r10 goto:68 B12
 10: B12[join]:62 pred(B1, B13) {
      v13 <- phi(v17, v6) alive [0, 2] T{_Smi}
      v9 <- phi(v32, v3) alive T{*?}
      v11 <- phi(v33, v4) alive T{_ImmutableList}
}
 11:     ParallelMove S-1 <- rbx
 12:     v15 <- LoadField(v11 . ArgumentsDescriptor.type_args_len {final}) [0, 4611686018427387903] T{_Smi}
 14:     Branch if StrictCompare:8(===, v15, v17) goto (3, 5)
 16: B3[target]:12
 18:     v18 <- LoadField(v11 . ArgumentsDescriptor.count {final}) [0, 4611686018427387903] T{_Smi}
 20:     Branch if StrictCompare:22(===, v18, v6) goto (7, 8)
 22: B7[target]:26
 24:     v20 <- LoadField(v11 . ArgumentsDescriptor.positional_count {final}) [0, 4611686018427387903] T{_Smi}
 26:     Branch if StrictCompare:30(===, v18, v20) goto (11, 10)
 28: B11[target]:34
 30:     Branch if StrictCompare:70(===, v13, v6) goto (14, 15)
 32: B14[target]:74
 34:     ParallelMove rcx <- rbx goto:82 B16
 36: B15[target]:76
 37:     ParallelMove rax <- rbx, rdx <- C, rcx <- C
 38:     AssertAssignable:52(v9, int, 'x', instantiator_type_args(v0), function_type_args(v0)) T{int?}
 40:     ParallelMove rcx <- S-1 goto:80 B16

Notice that checked and unchecked entries merge first and then we perform argument descriptor checks and AssertAssignable against parameter. Though these are guarded by check against an integer which encodes which entry we arrived from.

There are two issues here:

  • Performance: we should not actually need to check arguments descriptor – nor type check the parameter on unchecked entry. Dart 2 type system guarantees that signature matches on typed calls to a closure. This can be addressed by moving this code around in the graph. Note: we don’t actually type check the arguments when entering from unchecked entry, I made a mistake reading the graph.
  • Code size: this prologue adds up to significant code in at the start of each function. For example on X64 the function just returning x + 1 ends up being 268 bytes.

I propose to change to the implementation which penalises dynamic invocations of closures in attempt to save on code size: we already emit direct invocations of function body at all typed call sites. So all dynamic invocations should be going through Closure.call method, which means we can just change the body of that method to perform necessary checks against closure signature (which should be available for type checks anyway).

/cc @alexmarkov @mkustermann @askeksa-google

Author: Fantashit

1 thought on “avoid checked entry points in closures

  1. Applied the following patch, which causes argument type checks to be omitted (which has the side effect of also making it a single entry point):

    diff --git a/runtime/vm/object.h b/runtime/vm/object.h
    index fbeb2c80e1..c2776afa86 100644
    --- a/runtime/vm/object.h
    +++ b/runtime/vm/object.h
    @@ -2774,7 +2774,7 @@ class Function : public Object {
       // Whether this function can receive an invocation where the number and names
       // of arguments have not been checked.
       bool CanReceiveDynamicInvocation() const {
    -    return IsClosureFunction() || IsFfiTrampoline();
    +    return IsFfiTrampoline();
       }
     
       bool HasThisParameter() const {
    @@ -2844,7 +2844,7 @@ class Function : public Object {
         if (!I->should_emit_strong_mode_checks()) {
           return false;
         }
    -    return IsClosureFunction() ||
    +    return !IsClosureFunction() &&
                !(is_static() || (kind() == RawFunction::kConstructor));
       }
    

    Numbers for Flutter gallery:

    CPU Benchmark mode Size comparison Change (percentage)
    arm7 release instructions sections -6.20%%
    uncompressed app.so size -4.10%%
    brotli app.so size -2.40%%
    release-sizeopt instructions sections -6.37%%
    uncompressed size -4.71%%
    brotli size -2.50%%
    arm8 release instructions sections -6.40%%
    uncompressed app.so size -3.93%%
    brotli app.so size -2.38%%
    release-sizeopt instructions sections -6.39%%
    uncompressed size -4.54%%
    brotli size -3.26%%

    So this gives us an approximation of the possible savings by the approach suggested by @mraleph .

Comments are closed.