Dictionary order is not stable in Python < 3.6 so we need to sort
by key to have consistent results.
The LogHandler output is also different on older Python versions.
Also, don't stop running python tests after the first error.
It is a bit more code, but it is more "pythonic" and easier to debug
as the enum values also know their names.
It is also an API break, eg. sudo.RC_OK becomes sudo.RC.OK as sudo.RC will
be the "type" of the enum, but I guess that is acceptable before the
initial release.