BigTable: Cell.from_pb() performance improvement

In a profile of reading 1000 rows, each with 10 cells in one column family, the following class method on Cell accounts for about 10% of the total time, which is more than expected:

    @classmethod
    def from_pb(cls, cell_pb):
        """Create a new cell from a Cell protobuf.

        :type cell_pb: :class:`._generated.data_pb2.Cell`
        :param cell_pb: The protobuf to convert.

        :rtype: :class:`Cell`
        :returns: The cell corresponding to the protobuf.
        """
        timestamp = _datetime_from_microseconds(cell_pb.timestamp_micros)
        if cell_pb.labels:
            return cls(cell_pb.value, timestamp, labels=cell_pb.labels)
        else:
            return cls(cell_pb.value, timestamp)

It turns out that _datetime_from_microseconds is relatively expensive:

    return _EPOCH + datetime.timedelta(microseconds=value)

Tracing into the code behind _EPOCH and datetime.timedelta shows how much work goes into building a proper datetime, and from_pb pays that cost for every cell.

The suggestion is for Cell to store the microseconds from the Cell protobuf and to expose the timestamp through a property that builds the datetime only when requested. This moves the cost to the code that actually reads the timestamp, which may be a small minority of callers. The property would perform the conversion using the saved cell_pb.timestamp_micros:

    @property
    def timestamp(self):
        return _EPOCH + datetime.timedelta(microseconds=self.timestamp_micros)

Separately, the use of labels in the Cell constructor should be reviewed to determine whether this feature is used consistently across languages.

See approved pull request #4745.
