[NFC] Log reader changes for MLGO environments. #242

jacob-hegna · 2023-05-16T02:45:33Z

No description provided.

mtrofin · 2023-05-16T15:46:19Z

compiler_opt/rl/log_reader.py

@@ -86,6 +86,11 @@
 }


+def convert_dtype_to_ctype(dtype: str) -> Any:


-> Tuple[type, tf.DType] or something, instead of Any
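For illustration, a lookup with a narrower return annotation might look like the sketch below. The table `_NAME_TO_TYPES` and the helper `lookup_types` are hypothetical names, not part of this PR, and plain strings stand in for tf.DType so the snippet runs without TensorFlow. The actual body of convert_dtype_to_ctype is not shown in this thread, so this only demonstrates the annotation style being suggested.

```python
import ctypes
from typing import Dict, Tuple

# Hypothetical table: the real one in log_reader.py pairs each ctypes type
# with a tf.DType; a dtype name string is used here as a stand-in.
_NAME_TO_TYPES: Dict[str, Tuple[type, str]] = {
    'int32_t': (ctypes.c_int32, 'int32'),
    'uint32_t': (ctypes.c_uint32, 'uint32'),
    'int64_t': (ctypes.c_int64, 'int64'),
    'uint64_t': (ctypes.c_uint64, 'uint64'),
}


def lookup_types(name: str) -> Tuple[type, str]:
  """Returns the (ctypes type, dtype name) pair for a C type name."""
  return _NAME_TO_TYPES[name]


ctype, dtype_name = lookup_types('int64_t')
size_in_bytes = ctypes.sizeof(ctype)  # 8 for a 64-bit integer
```

With a Tuple annotation, a type checker can flag callers that unpack the result incorrectly, which Any would silently allow.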

mtrofin · 2023-05-16T15:47:18Z

compiler_opt/rl/log_reader.py

@@ -229,7 +252,7 @@ def _add_feature(se: tf.train.SequenceExample, spec: tf.TensorSpec,


 def read_log_as_sequence_examples(
-    fname: str) -> Dict[str, tf.train.SequenceExample]:
+    fname: str,) -> Dict[str, tf.train.SequenceExample]:


why the comma here?

mtrofin · 2023-05-16T15:47:55Z

compiler_opt/rl/log_reader.py

@@ -74,7 +74,7 @@
    'int32_t': (ctypes.c_int32, tf.int32),
    'uint32_t': (ctypes.c_uint32, tf.uint32),
    'int64_t': (ctypes.c_int64, tf.int64),
-    'uint64_t': (ctypes.c_uint64, tf.uint64)
+    'uint64_t': (ctypes.c_uint64, tf.uint64),


is the comma necessary?

mtrofin · 2023-05-16T15:48:05Z

compiler_opt/rl/log_reader.py

@@ -95,7 +100,8 @@ def create_tensorspec(d: Dict[str, Any]) -> tf.TensorSpec:
  return tf.TensorSpec(
      name=name,
      shape=tf.TensorShape(shape),
-      dtype=_element_type_name_to_dtype[element_type_str])
+      dtype=_element_type_name_to_dtype[element_type_str],


is the comma necessary?

mtrofin · 2023-05-16T15:48:15Z

compiler_opt/rl/log_reader.py

@@ -108,6 +114,7 @@ class LogReaderTensorValue:

  Endianness is assumed to be the same as the log producer's.
  """
+


spurious change

mtrofin · 2023-05-16T15:49:10Z

compiler_opt/rl/log_reader.py

@@ -120,6 +127,14 @@ def __init__(self, spec: tf.TensorSpec, buffer: bytes):
  def spec(self):
    return self._spec

+  @property
+  def buffer(self):


why do you need this and the len property? should they have a small unit test, too?

also, maybe buffer -> raw_bytes to make it clear it's that, not another way to dereference the typed view

and then len -> do you need to use it as a length of the raw buffer? if so, maybe call it like that and return the _len multiplied by the scalar type size.

jacob-hegna · 2023-05-16T18:23:07Z

I convert the data given by the log_reader to a numpy array, and the easiest way to do that was to use https://numpy.org/doc/stable/reference/generated/numpy.frombuffer.html.

I use numpy in the environments instead of tensorflow because numpy is used across every ML framework (tf, pytorch, jax) and I want to start decoupling things from TF when they don't fundamentally require it.

I'm not sure if they need unit tests - if you want I can add them, but they're not doing anything surprising. I think if there are any issues here, the other tests should fail and catch them.

mtrofin

But the object itself acts as a buffer, it has an indexer and a len.

jacob-hegna

np has its own notion of a buffer interface which is pretty specific, and if we naively try to pass one of these objects to np.frombuffer we get the error:

TypeError: a bytes-like object is required, not 'LogReaderTensorValue'

mtrofin

ha! interesting. OK. so can we make do with just raw_bytes? looks like len() wouldn't be needed.

jacob-hegna

ok, replaced raw_bytes and len with a to_numpy method directly, and added a unit test for it.

mtrofin

ah, neat. Thanks!
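The exchange above can be sketched roughly as follows. `TensorValueSketch` is a hypothetical stand-in for LogReaderTensorValue (the real class is not reproduced in this thread), and it takes a numpy dtype instead of a tf.TensorSpec to stay free of TensorFlow. It shows that an object with only an indexer and a len is rejected by np.frombuffer, while a to_numpy method that hands over the underlying bytes works.

```python
import numpy as np


class TensorValueSketch:
  """Hypothetical stand-in for LogReaderTensorValue, not the real class."""

  def __init__(self, raw: bytes, dtype):
    self._raw = raw
    self._dtype = np.dtype(dtype)

  def __len__(self) -> int:
    # Number of elements, not number of bytes.
    return len(self._raw) // self._dtype.itemsize

  def __getitem__(self, index: int):
    return self.to_numpy()[index]

  def to_numpy(self) -> np.ndarray:
    # bytes implements the buffer protocol, so np.frombuffer accepts it.
    return np.frombuffer(self._raw, dtype=self._dtype)


raw = np.array([1, 2, 3], dtype=np.int64).tobytes()
value = TensorValueSketch(raw, np.int64)

# Having __len__ and __getitem__ does not make an object bytes-like.
try:
  np.frombuffer(value, dtype=np.int64)
  wrapper_accepted = True
except TypeError:
  wrapper_accepted = False

as_array = value.to_numpy()

# Byte length of the raw buffer: element count times scalar size.
byte_length = len(value) * as_array.itemsize
```

Because the array is built over an immutable bytes object, it is read-only; callers that need to modify it would have to copy it first.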

compiler_opt/rl/log_reader_test.py

jacob-hegna

The reason for all the spurious changes is that my vim was autoformatting the Python files with the internal Google formatting tool rather than yapf. I would save the file thinking it was formatted, upload the commit, see yapf fail in CI, then run yapf and re-upload. So all the spurious changes came from the internal Google formatter and yapf (also a Google Python formatter) fighting each other. I went back through and tried to revert all the spurious changes.

mtrofin · 2023-05-16T20:13:51Z

lgtm

Log reader changes for MLGO environments.

ba8285d

jacob-hegna changed the title ~~Log reader changes for MLGO environments.~~ [NFC] Log reader changes for MLGO environments. May 16, 2023

jacob-hegna requested a review from mtrofin May 16, 2023 02:51

mtrofin reviewed May 16, 2023

Addressing comments.

899e146

jacob-hegna commented May 16, 2023

jacob-hegna added 4 commits May 16, 2023 18:32

Fix a few more spurious changes.

1dc39cd

Tuple -> Union.

97bc654

Replace raw_bytes with to_numpy.

f090675

Type hint for to_numpy.

cdc0eb8

NDArray -> ndarray.

c9bf5a4

jacob-hegna merged commit 9d00bcf into main May 16, 2023

jacob-hegna deleted the log_reader_change branch May 16, 2023 20:21
