UTF8.decode is slow (performance)

UTF8.decode seems surprisingly slow for large buffers.

Using this Flutter program:

import 'dart:async';
import 'dart:convert';
import 'dart:typed_data';

import 'package:flutter/services.dart';

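Future<Null> main() async {
  ByteData data = await rootBundle.load('LICENSE');
  // Decode progressively larger prefixes of the asset, in 1 KB steps.
  for (int i = 1024; i < data.lengthInBytes; i += 1024) {
    final Stopwatch watch = new Stopwatch()..start();
    UTF8.decode(data.buffer.asUint8List(0, i));
    // Millisecond resolution; this can round down to 0 for small prefixes.
    int time = watch.elapsedMilliseconds;
    // Columns: prefix size, elapsed time, throughput in KB per second.
    print('${"$i bytes".padLeft(15)} ${"${time}ms".padLeft(10)} ${(1000 / 1024 * i / time).toStringAsFixed(1)} KB per second');
  }
}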

…I find that on a Pixel 2 XL, 4KB is parsed in about 1ms, 40KB in about 12ms, and 400KB in about 135ms. This is only about 3MB per second. Then again, I’m not sure what would be considered good throughput on this device. Maybe this is as good as we can get?
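
For anyone who wants to reproduce the arithmetic above off-device, here is a minimal sketch in plain Dart with no Flutter dependency. It is not the program used for the numbers above: it assumes a current SDK, where the codec is spelled `utf8` rather than `UTF8`, and the helper names `megabytesPerSecond` and `measureDecode`, the synthetic ASCII input, and the iteration count are all made up for illustration.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Throughput in MB per second (1 MB = 1024 * 1024 bytes) for processing
/// [byteCount] bytes in [elapsedMicroseconds]. Hypothetical helper.
double megabytesPerSecond(int byteCount, int elapsedMicroseconds) {
  final seconds = elapsedMicroseconds / Duration.microsecondsPerSecond;
  return byteCount / (1024 * 1024) / seconds;
}

/// Decodes [bytes] [iterations] times and returns the average throughput.
/// Hypothetical helper; repeating the decode smooths out timer resolution.
double measureDecode(Uint8List bytes, {int iterations = 20}) {
  final watch = Stopwatch()..start();
  for (var i = 0; i < iterations; i++) {
    utf8.decode(bytes);
  }
  watch.stop();
  // Avoid dividing by zero if the whole run finishes within one tick.
  final micros =
      watch.elapsedMicroseconds == 0 ? 1 : watch.elapsedMicroseconds;
  return megabytesPerSecond(bytes.length * iterations, micros);
}

void main() {
  // The figure quoted above: 400 KB in about 135 ms is roughly 2.9 MB/s.
  print(megabytesPerSecond(400 * 1024, 135000).toStringAsFixed(1));

  // Synthetic ASCII input; real text with multi-byte sequences may differ.
  final bytes =
      Uint8List.fromList(utf8.encode('The quick brown fox. ' * 20000));
  print('${measureDecode(bytes).toStringAsFixed(1)} MB per second');
}
```

Timing in microseconds and averaging over several runs avoids the 0 ms readings that millisecond resolution gives for small inputs. Numbers from a workstation VM will not be comparable to the phone figures above.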

Presumably this could be fixed in the most radical fashion by supporting UTF-8 as a native data type.

Author: Fantashit

1 thought on “UTF8.decode is slow (performance)”

  1. Recently I’ve worked on unifying the heap object layout of our typed data classes so that our AOT compiler can directly access the bytes of typed data members (e.g. of Uint8List), regardless of whether the object is an internal or external typed data array or a view on one of those (**).

    All of the work combined has led to the following numbers (measured on the benchmark mentioned above in #31954 (comment) on my workstation):

    JIT non-view/view
    executed in 0:00:03.429451 (verify="XXXXXXXXX")
    executed in 0:00:06.607781 (verify="XXXXXXXXX")
    
    AOT (view and non-view roughly the same)
    executed in 0:00:05.855320 (verify="XXXXXXXXX")
    

    So AOT is almost on par with the JIT (for this particular benchmark).

    There is still room for improvement and we continue working on it!

    (**) Note: Our AOT compiler will only directly access the bytes (e.g. in foo(Uint8List bytes) => bytes[0];) if there are no third-party user classes implementing these interfaces.
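
To make the view/non-view distinction in the comment above concrete, here is a small sketch, again assuming a current Dart SDK. It only illustrates what a "view" is at the API level: a `Uint8List` that shares another list's buffer instead of owning a copy, which is also what `data.buffer.asUint8List(0, i)` in the original program produces. It says nothing about how the VM lays these objects out internally. The helper name `viewOf` is hypothetical.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Returns a view onto [length] bytes of [source] starting at [start],
/// without copying. Hypothetical helper; it adds source.offsetInBytes so it
/// also works when [source] is itself a view.
Uint8List viewOf(Uint8List source, int start, int length) {
  return Uint8List.view(source.buffer, source.offsetInBytes + start, length);
}

void main() {
  // Non-view: a list that owns its own storage.
  final plain = Uint8List.fromList(utf8.encode('hello, world'));

  // View: a window onto bytes 7..11 of the same buffer.
  final view = viewOf(plain, 7, 5);

  print(utf8.decode(plain)); // hello, world
  print(utf8.decode(view)); // world

  // A write through the view is visible in the underlying list.
  view[0] = 0x57; // ASCII 'W'
  print(utf8.decode(plain)); // hello, World
}
```

Both objects have the static type `Uint8List`, which is why the decoder has to cope with either kind of input.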
