diff --git a/README.md b/README.md
index b0423e673..70fcecb20 100644
--- a/README.md
+++ b/README.md
@@ -212,6 +212,15 @@ The format is explicitly designed to separate the metadata from the data.  This
 allows splitting columns into multiple files, as well as having a single metadata
 file reference multiple parquet files.  
 
+## RowGroup Statistics
+In Parquet, the metadata for each RowGroup contains Statistics, which clients can use
+for filtering. An example implementation of the filtering logic can be found
+in [parquet-mr](https://github.com/apache/parquet-mr). Statistics include information
+like the minimum and maximum for primitive types. For binary data, the byte strings can
+additionally be compared as either _signed_ or _unsigned_ bytes, which gives different
+orderings, so the minimum and maximum under each are stored in the optional fields
+`unsigned_min`, `unsigned_max`, `signed_min` and `signed_max`.
+
 ## Configurations
 - Row group size: Larger row groups allow for larger column chunks which makes it 
 possible to do larger sequential IO.  Larger groups also require more buffering in 
diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index ac4d50eb4..011c31823 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -194,6 +194,22 @@ enum FieldRepetitionType {
 /**
  * Statistics per row group and per page
  * All fields are optional.
+ *
+ * Binaries are sorted lexicographically (byte by byte), treating each byte as
+ * an integer.  The signed sorting treats each byte as a signed two's
+ * complement number, and the unsigned treats the byte as an unsigned number.
+ * When one bytestring is a prefix of another, the containing bytestring is
+ * "greater than" the prefix.
+ *
+ * For BinaryStatistics in Parquet, we want to distinguish between the
+ * statistics derived from signed and from unsigned byte comparisons.  The
+ * min and max fields are deprecated for BinaryStatistics in favor of
+ * {unsigned,signed}_{min,max}. The filter API should allow clients to
+ * specify which statistics and which method of comparison to use for
+ * filtering. To maintain backward format compatibility, a reader filtering
+ * on signed statistics checks signed_min and signed_max first; if they
+ * are unset, it falls back to the values in min and max, treating them as
+ * signed bytestrings.
  */
 struct Statistics {
    /** min and max value of the column, encoded in PLAIN encoding */
@@ -203,6 +219,12 @@ struct Statistics {
    3: optional i64 null_count;
    /** count of distinct values occurring */
    4: optional i64 distinct_count;
+   /** Signed min and max for binary fields */
+   5: optional binary signed_max;
+   6: optional binary signed_min;
+   /** Unsigned min and max for binary fields */
+   7: optional binary unsigned_max;
+   8: optional binary unsigned_min;
 }
 
 /**