vitessio · sougou · Jan 12, 2021 · Dec 20, 2020
diff --git a/content/en/docs/user-guides/configuration-advanced/_index.md b/content/en/docs/user-guides/configuration-advanced/_index.md
@@ -1,5 +1,5 @@
 ---
 title: Advanced Configuration
 description: User guides covering advanced configuration concepts
-weight: 3
----
+weight: 5
+---
diff --git a/content/en/docs/user-guides/configuration-advanced/createlookupvindex.md b/content/en/docs/user-guides/configuration-advanced/createlookupvindex.md
@@ -5,7 +5,7 @@ aliases: ['/docs/user-guides/createlookupvindex/']
 ---
 
 {{< info >}}
-This guide follows on from the Get Started guides. Please make sure that you have an [Operator](../../../get-started/operator), [local](../../../get-started/local) or [Helm](../../../get-started/helm) installation ready.  Make sure you are at the point where you have the sharded keyspace called `customer` setup.
+This guide follows on from the Get Started guides. Please make sure that you have an [Operator](../../../get-started/operator) or [local](../../../get-started/local) installation ready. Make sure you are at the point where you have the sharded keyspace called `customer` set up.
 {{< /info >}}
 
 **CreateLookupVindex** is a new VReplication workflow in Vitess 6.  It is used to create **and** backfill a lookup Vindex automatically for a table that already exists, and may have a significant amount of data in it already.
@@ -324,18 +324,13 @@ mysql> select sku, hex(keyspace_id) from corder_lookup;
 +-----------+------------------+
 ```
 
-Basically, this shows exactly what we expected.  Now, we can clean up the
-VReplication streams.  Note these commands will clean up all VReplication
-streams on these tablets. You may want to filter by `id` if there are other
-streams running:
+Basically, this shows exactly what we expected.  Now, we have to clean up
+the artifacts of the backfill. The `ExternalizeVindex` command deletes
+the VReplication streams and also clears the `write_only` flag from the
+vindex, indicating that it is no longer backfilling.
 
 ```sh
-$ vtctlclient -server localhost:15999 VReplicationExec zone1-0000000300 "delete from _vt.vreplication"
-+
-+
-$ vtctlclient -server localhost:15999 VReplicationExec zone1-0000000400 "delete from _vt.vreplication"
-+
-+
+$ vtctlclient -server localhost:15999 ExternalizeVindex customer.corder_lookup
 ```
 
 Next, to confirm the lookup Vindex is doing what we think it should, we can
@@ -475,3 +470,6 @@ mysql> select sku, hex(keyspace_id) from corder_lookup;
 We added a new row to the `corder` table, and now we have a new row in the
 lookup table.
 
+### ExternalizeVindex
+
+Once the backfill is done, 
diff --git a/content/en/docs/user-guides/configuration-basic/_index.md b/content/en/docs/user-guides/configuration-basic/_index.md
@@ -1,5 +1,5 @@
 ---
 title: Configuration
 description: User guides covering basic configuration concepts
-weight: 1
----
+weight: 2
+---
diff --git a/content/en/docs/user-guides/migration/_index.md b/content/en/docs/user-guides/migration/_index.md
@@ -1,5 +1,5 @@
 ---
 title: Migration
 description: User guides covering migration to Vitess
-weight: 2
----
+weight: 3
+---
diff --git a/content/en/docs/user-guides/operating-vitess/_index.md b/content/en/docs/user-guides/operating-vitess/_index.md
@@ -2,5 +2,5 @@
 title: Operational 
 description: User guides for covering operational aspects of Vitess
 description: User guides covering operational aspects of Vitess
-weight: 4
----
+weight: 5
+---
diff --git a/content/en/docs/user-guides/sql/_index.md b/content/en/docs/user-guides/sql/_index.md
@@ -1,5 +1,5 @@
 ---
 title: SQL Statement Analysis
 description: User guides covering analyzing SQL statements
-weight: 3
----
+weight: 4
+---
diff --git a/content/en/docs/user-guides/vschema-guide/_index.md b/content/en/docs/user-guides/vschema-guide/_index.md
@@ -0,0 +1,5 @@
+---
+title: VSchema and Query Serving
+description: Configuring VSchema for serving queries
+weight: 1
+---
diff --git a/content/en/docs/user-guides/vschema-guide/advanced-vschema.md b/content/en/docs/user-guides/vschema-guide/advanced-vschema.md
@@ -0,0 +1,135 @@
+---
+title: Advanced VSchema Properties
+weight: 11
+---
+
+With the exception of Multi-Column Vindexes, advanced VSchema Properties do not have DDL constructs. They can only be updated through `vtctld` CLI commands.
+
+## Multi-Column Vindexes
+
+Multi-Column Vindexes are useful in the following two use cases:
+
+* Grouping customers by their regions so they can be hosted in specific geographical locations. This may be required for compliance, and also to achieve better performance.
+* For a multi-tenant system, grouping all rows of a tenant in a separate set of shards. This limits the fan-out of queries that search only for rows related to a single tenant.
+
+In both cases the leading column is the region or tenant, and is used to form the first few bits of the `keyspace_id`. The second column is used for the bits that follow. Since Vitess shards by keyrange, this approach will naturally group all rows of a region or tenant within the same shard, or within a group of consecutive shards. Since each shard is its own MySQL cluster, these can then be deployed to different regions as needed.
+
+Please refer to [Region-based Sharding](../../configuration-advanced/region-sharding) for an example on how to use the `region_json` vindex.
+
+Currently, the Vindex gets used for assigning a `keyspace_id` at the time of insert and at the time of resharding. Additional vindexes need to be added to the table for routing query constructs that contain WHERE clauses.
+
+Vitess cannot yet route a query based on multiple values of a multi-column vindex in a WHERE clause. This feature will be added soon.
+
+#### Alternate approach
+
+You have the option to pre-combine the region and id bits into a single column and use that as an input for a single column vindex. This approach achieves the same goals as a multi-column vindex. Moreover, you avoid having to define additional vindexes for query routing.
+
+The downside of this approach is that it is harder to migrate an id to a different region.
+
+## Reference Tables
+
+Sharded databases often need the ability to join their tables with smaller “reference” tables. For example, the `product` table could be seen as a reference table. Other use cases are tables that map static information like zipcode to city, etc.
+
+Joining against these tables across keyspaces results in cross-shard joins that may not be very efficient or fast.
+
+Vitess allows you to create a table in a sharded keyspace as a reference table. This means that it will treat the table as having an identical set of rows across all shards. A query that joins a sharded table against such reference tables is then performed locally within each shard.
+
+A reference table should not have any vindex, and is defined in the VSchema as a reference type:
+
+```json
+{
+  "sharded": true,
+  "tables": {
+    "zip_detail": { "type": "reference" }
+  }
+}
+```
+
+It may become a challenge to keep a reference table correctly updated across all shards. Vitess supports the [Materialize](../../migration/materialize) feature that allows you to maintain the original table in an unsharded keyspace and automatically propagate changes to that table in real-time across all shards.
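+
+As a rough sketch, a Materialize specification for this setup might look like the following. It assumes that the source `zip_detail` table lives in the unsharded `product` keyspace and that the target table already exists in `customer`. The workflow name `zip_detail_ref` is a made-up placeholder. Please verify the field names against the Materialize guide for your Vitess version before relying on them:
+
+```json
+{
+  "workflow": "zip_detail_ref",
+  "source_keyspace": "product",
+  "target_keyspace": "customer",
+  "table_settings": [{
+    "target_table": "zip_detail",
+    "source_expression": "select * from zip_detail"
+  }]
+}
+```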
+
+## Column List
+
+The VSchema allows you to specify the list of columns along with their types for every table. This allows Vitess to make optimization decisions where necessary.
+
+For example, specifying that a column contains text allows VTGate to request further collation specific information (`weight_string`) if additional sorting is needed after collecting results from all shards.
+
+Without this information, issuing the following query against `customer` would fail:
+
+```text
+mysql> select customer_id, uname from customer order by uname;
+ERROR 1105 (HY000): vtgate: http://sougou-lap1:12345/: types are not comparable: VARCHAR vs VARCHAR
+```
+
+However, we can modify the VSchema as follows:
+
+```json
+    "customer": {
+      "column_vindexes": [{
+        "column": "customer_id",
+        "name": "hash"
+      }],
+      "auto_increment": {
+        "column": "customer_id",
+        "sequence": "product.customer_seq"
+      },
+      "columns": [{
+        "name": "uname",
+        "type": "VARCHAR"
+      }]
+    }
+```
+
+Re-issuing the same query will now succeed:
+
+```text
+mysql> select customer_id, uname from customer order by uname;
++-------------+---------+
+| customer_id | uname   |
++-------------+---------+
+|           1 | alice   |
+|           2 | bob     |
+|           3 | charlie |
+|           4 | dan     |
+|           5 | eve     |
++-------------+---------+
+5 rows in set (0.00 sec)
+```
+
+Specifying columns against tables also allows VTGate to resolve ambiguous naming of columns against the right tables.
+
+#### Authoritative List
+
+If you have listed all columns of a table in the VSchema, you can add the `column_list_authoritative` flag to the table:
+
+```json
+    "customer": {
+      "column_vindexes": [{
+        "column": "customer_id",
+        "name": "hash"
+      }],
+      "auto_increment": {
+        "column": "customer_id",
+        "sequence": "product.customer_seq"
+      },
+      "columns": [{
+        "name": "uname",
+        "type": "VARCHAR"
+      }],
+      "column_list_authoritative": true
+    }
+```
+
+This flag causes VTGate to automatically expand expressions like `select *` or insert statements that don’t specify the column list.
+
+The caveat with this feature is that you must keep the column list in sync with the underlying schema.
+
+In the future, Vitess will allow you to pull this information from the vttablets and automatically keep it up-to-date.
+
+## Routing Rules
+
+Routing Rules are an advanced method of redirecting queries meant for one table to another. They are just pointers and are analogous to symbolic links in a file system. You should generally not have to use routing rules in Vitess.
+
+Workflows like `MoveTables` use routing rules to make the target tables appear to exist, and they manage the traffic switch from source to target by manipulating these rules.
+
+For more information, please refer to the [Routing Rules](../../../reference/features/schema-routing-rules) section.
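+
+For illustration only, a routing rule that points an unqualified table name at a specific keyspace could be sketched as below. The choice of `corder` and `customer.corder` is just an example taken from the tables in this guide; the linked reference describes the exact format and the rules that workflows generate:
+
+```json
+{
+  "rules": [{
+    "from_table": "corder",
+    "to_tables": ["customer.corder"]
+  }]
+}
+```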
+
diff --git a/content/en/docs/user-guides/vschema-guide/img/vschema1.png b/content/en/docs/user-guides/vschema-guide/img/vschema1.png
diff --git a/content/en/docs/user-guides/vschema-guide/img/vschema2.png b/content/en/docs/user-guides/vschema-guide/img/vschema2.png
diff --git a/content/en/docs/user-guides/vschema-guide/lookup-as-primary.md b/content/en/docs/user-guides/vschema-guide/lookup-as-primary.md
@@ -0,0 +1,146 @@
+---
+title: Lookup as Primary Vindex
+weight: 10
+---
+
+It is likely that a customer order goes through a life cycle of events. This would best be represented in a separate `corder_event` table that will contain a `corder_id` column as a foreign key into `corder.corder_id`. It would also be beneficial to co-locate the event rows with their associated order.
+
+Just like we shared the `hash` vindex between `customer` and `corder`, we can share `corder_keyspace_idx` between `corder` and `corder_event`. We can also make it the Primary Vindex for `corder_event`. When an order is created, the lookup row for it is also created. Subsequently, an insert into `corder_event` will request the vindex to compute the `keyspace_id` for that `corder_id`, and that will succeed because the lookup entry for it already exists. This is where the significance of the owner table comes into play: The owner table creates the entries, whereas other tables only read those entries.
+
+Inserting a `corder_event` row without creating a corresponding `corder` entry will result in an error. This behavior is in line with the traditional foreign key constraint enforced by relational databases.
+
+Sharing the lookup vindex also has the additional benefit of saving space because we avoid creating separate entries for the new table.
+
+We start with creating the sequence table in the `product` keyspace.
+
+Schema:
+
+```sql
+create table corder_event_seq(id bigint, next_id bigint, cache bigint, primary key(id)) comment 'vitess_sequence';
+insert into corder_event_seq(id, next_id, cache) values(0, 1, 3);
+```
+
+VSchema:
+
+```json
+    "corder_event_seq": { "type": "sequence" }
+```
+
+We then create the `corder_event` table in `customer`:
+
+```sql
+create table corder_event(corder_event_id bigint, corder_id bigint, ename varchar(128), primary key(corder_id, corder_event_id));
+```
+
+In the VSchema, there is no need to create a vindex because we are going to reuse the existing one:
+
+```json
+    "corder_event": {
+      "column_vindexes": [{
+        "column": "corder_id",
+        "name": "corder_keyspace_idx"
+      }],
+      "auto_increment": {
+        "column": "corder_event_id",
+        "sequence": "product.corder_event_seq"
+      }
+    }
+```
+
+Alternate VSchema DDL:
+
+```sql
+alter vschema add sequence product.corder_event_seq;
+alter vschema on customer.corder_event add vindex corder_keyspace_idx(corder_id);
+alter vschema on customer.corder_event add auto_increment corder_event_id using product.corder_event_seq;
+```
+
+We can now insert rows in `corder_event` against rows in `corder`:
+
+```text
+mysql> insert into corder(customer_id, product_id, oname) values (1,1,'gift'),(1,2,'gift'),(2,1,'work'),(3,2,'personal'),(4,1,'personal');
+Query OK, 5 rows affected (0.04 sec)
+
+mysql> insert into corder_event(corder_id, ename) values(1, 'paid'), (5, 'delivered');
+Query OK, 2 rows affected (0.01 sec)
+
+mysql> insert into corder_event(corder_id, ename) values(6, 'expect failure');
+ERROR 1105 (HY000): vtgate: http://sougou-lap1:12345/: execInsertSharded: getInsertShardedRoute: could not map [INT64(6)] to a keyspace id
+```
+
+As expected, inserting a row for a non-existent order results in an error.
+
+### Reversible Vindexes
+
+In Vitess, it is insufficient for a table to only have a Lookup Vindex. This is because it is not practical to reshard such a table. The overhead of performing a lookup before redirecting every row event to a new shard would be prohibitively expensive.
+
+To overcome this limitation, we must add a column with a non-lookup vindex, also known as a Functional Vindex, to the table. By rule, the Primary Vindex computes the keyspace id of the row. This means that the value of the new column must be such that its vindex yields the same keyspace id.
+
+A Reversible Vindex is one that can back-compute the column value from a given keyspace id. If such a vindex is used for this new column, then Vitess will automatically perform this work and fill in the correct value for it. Vindex properties such as Functional and Reversible are listed in the [Vindexes Reference](../../../reference/features/vindexes).
+
+In other words, adding a column with a vindex that is both Functional and Reversible allows Vitess to auto-fill the values, thereby avoiding any impact to the application logic.
+
+The `binary` vindex is one that yields the input value itself as the `keyspace_id`, and is naturally reversible. Using this Vindex will generate the `keyspace_id` as the column value. The modified schema for the table will be as follows:
+
+```sql
+create table corder_event(corder_event_id bigint, corder_id bigint, ename varchar(128), keyspace_id varbinary(10), primary key(corder_id, corder_event_id));
+```
+
+We create a vindex instantiation for `binary`:
+
+```json
+    "binary": {
+      "type": "binary"
+    }
+```
+
+Modify the table VSchema:
+
+```json
+    "corder_event": {
+      "column_vindexes": [{
+        "column": "corder_id",
+        "name": "corder_keyspace_idx"
+      }, {
+        "column": "keyspace_id",
+        "name": "binary"
+      }],
+      "auto_increment": {
+        "column": "corder_event_id",
+        "sequence": "product.corder_event_seq"
+      }
+    }
+```
+
+Alternate VSchema DDL:
+
+```sql
+alter vschema on customer.corder_event add vindex `binary`(keyspace_id) using `binary`;
+```
+
+Note that `binary` needs to be backticked because it is a keyword.
+
+After these modifications, we can now observe that the `keyspace_id` column is getting automatically populated:
+
+```text
+mysql> insert into corder(customer_id, product_id, oname) values (1,1,'gift'),(1,2,'gift'),(2,1,'work'),(3,2,'personal'),(4,1,'personal');
+Query OK, 5 rows affected (0.01 sec)
+
+mysql> insert into corder_event(corder_id, ename) values(1, 'paid'), (5, 'delivered');
+Query OK, 2 rows affected (0.01 sec)
+
+mysql> select corder_event_id, corder_id, ename, hex(keyspace_id) from corder_event;
++-----------------+-----------+-----------+------------------+
+| corder_event_id | corder_id | ename     | hex(keyspace_id) |
++-----------------+-----------+-----------+------------------+
+|               1 |         1 | paid      | 166B40B44ABA4BD6 |
+|               2 |         5 | delivered | D2FD8867D50D2DFE |
++-----------------+-----------+-----------+------------------+
+2 rows in set (0.00 sec)
+```
+
+There is no support for backfilling the reversible vindex column yet. This will be added soon.
+
+{{< info >}}
+The original `keyspace_id` for all these rows came from `customer_id`. Since `hash` is also a reversible vindex, reversing the `keyspace_id` using `hash` will yield the `customer_id`. We could instead leverage this knowledge to replace `keyspace_id+binary` with `customer_id+hash`. Vitess will auto-populate the correct value. Using this approach may be more beneficial because `customer_id` is a value the application can understand and make use of.
+{{< /info >}}
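+
+A sketch of what the table VSchema could look like with that alternative is shown below. It assumes that `corder_event` has a `customer_id bigint` column in place of the `keyspace_id` column, and that it reuses the `hash` vindex already defined for `customer`:
+
+```json
+    "corder_event": {
+      "column_vindexes": [{
+        "column": "corder_id",
+        "name": "corder_keyspace_idx"
+      }, {
+        "column": "customer_id",
+        "name": "hash"
+      }],
+      "auto_increment": {
+        "column": "corder_event_id",
+        "sequence": "product.corder_event_seq"
+      }
+    }
+```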