FLASH-275/276/277/278/281/282: Complete DDL implementation #110

Merged on Aug 5, 2019 (135 commits)

Contributor

@zanmato1984 zanmato1984 commented Jul 18, 2019

Doc links: design doc, test doc

This huge PR consists of several relatively independent parts:

  1. Getting schema from TiKV instead of TiDB:
    1.1 TiKV client fixes: contrib/client-c/*
    1.2 Schema syncer: the old TiDB-based syncer is moved to the Debug dir for mock usage only; ./dbms/src/Storages/Transaction/SchemaSyncer.h contains only the interface; the new TiKV-based schema syncer is ./dbms/src/Storages/Transaction/TiDBSchemaSyncer.h (the naming is terrible, though), along with related util classes (SchemaGetter and SchemaBuilder)
    1.3 The table info JSON format changed, so it is reimplemented in dbms/src/Storages/Transaction/TiDB.*
    1.4 Remove everything related to the TiDB schema syncer: the curl dependency, the schema_sync test, TiDBService and related configs; the table-ignoring config is moved to TMTContext
  2. The background schema syncing service is in dbms/src/Storages/Transaction/SchemaSyncService.*, with related Context and Server changes.
  3. Avoiding data rewriting via data widening for potential future column type changes (only int widening is supported in TiDB) is in DataType* files and related calls.
  4. Schema sync calling when reading/writing:
    4.1 Refine flushRegion in RegionTable and the underlying PartitionStream/RegionBlockReader to account for the schema being out of date
    4.2 Refine InterpreterSelectQuery to account for schema alignment between the upper end (TiSpark/chspark/TiDB) and the lower end (CH).
  5. Testing:
    5.1 Lots of debug functions for schema syncing and mock TiDB tables
    5.2 Lots of tests using mock TiDB and schema syncer
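
Item 3 depends on integer widening being lossless. As a rough illustration only, the check below decides whether one integer type can be widened to another without touching stored values. `IntType` and `isLosslessWidening` are hypothetical names invented for this sketch; they are not taken from the DataType* changes in this PR, and the rules TiDB actually enforces may be narrower.

```cpp
#include <cassert>

// Hypothetical descriptor for an integer column type; not a type from this PR.
struct IntType
{
    unsigned bits;  // storage width: 8, 16, 32 or 64
    bool is_signed; // true for signed integers
};

// Returns true when every value representable in `from` is also representable
// in `to`, i.e. changing the column type cannot lose data.
inline bool isLosslessWidening(const IntType & from, const IntType & to)
{
    assert(from.bits > 0 && to.bits > 0);
    if (from.is_signed == to.is_signed)
        return to.bits >= from.bits;
    // Unsigned -> signed needs one extra bit for the sign.
    if (!from.is_signed && to.is_signed)
        return to.bits > from.bits;
    // Signed -> unsigned cannot represent negative values.
    return false;
}
```

Under these assumptions, a column that is stored with a type at least as wide as any type it may later be altered to never needs its existing data rewritten.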

zanmato1984 and others added 30 commits June 11, 2019 15:07
* DDL read and update by tikv's meta

* add dbg back

* refine test

* address comments
@zanmato1984
Contributor Author

/run-integration-tests

contrib/client-c/include/tikv/Region.h (resolved)
}

static ::grpc::Status doRPCCall(grpc::ClientContext * context, std::unique_ptr<tikvpb::Tikv::Stub> stub, const RequestType & req, ResultType * res) {
return stub -> ReadIndex(context, req, res);
Contributor

stub -> ReadIndex => stub->ReadIndex

using ResultType = kvrpcpb::ReadIndexResponse;

static const char* err_msg() {
Contributor

Why not errMsg?

Member

It would cause a compile error. A veteran taught me that!

}

static ::grpc::Status doRPCCall(grpc::ClientContext * context, std::unique_ptr<tikvpb::Tikv::Stub> stub, const RequestType & req, ResultType * res) {
return stub -> KvGet(context, req, res);
Contributor

As above

using RequestType = kvrpcpb::GetRequest;
using ResultType = kvrpcpb::GetResponse;

static const char* err_msg() {
Contributor

As above

contrib/client-c/include/tikv/Rpc.h (outdated, resolved)
@zanmato1984
Contributor Author

/run-integration-tests


-struct Backoff {
+struct Backoff
+{
 int base;
Contributor

size_t may be more accurate, but not a big deal


Logger * log;

-RegionClient(RegionCachePtr cache_, RpcClientPtr client_, const RegionVerID & id) : cache(cache_), client(client_), store_addr("you guess?"), region_id(id), log(&Logger::get("pingcap.tikv")) {}
+RegionClient(RegionCachePtr cache_, RpcClientPtr client_, const RegionVerID & id)
+: cache(cache_), client(client_), store_addr("you guess?"), region_id(id), log(&Logger::get("pingcap.tikv"))
Contributor

"you guess?" => an address that leads to a fast failure would be better.

Contributor

"you guess?" may cause a long network lookup that blocks until timeout.

Member

We'd better refine this in the future.

@@ -59,31 +66,34 @@ struct Backoff {
 break;
 case EqualJitter:
 v = expo(base, cap, attempts);
-sleep_time = v/2 + rand() % (v/2);
+sleep_time = v / 2 + rand() % (v / 2);
Contributor

Is it a little jumpy?

 break;
 case DecorrJitter:
-sleep_time = int(std::min(double(cap), double(base + rand() % (last_sleep*3-base))));
+sleep_time = int(std::min(double(cap), double(base + rand() % (last_sleep * 3 - base))));
Contributor

A comment would be nice, I don't really get it.
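
For readers with the same question, here is a hedged sketch of what the two jitter branches in the diff appear to compute. Equal jitter keeps half of the capped exponential delay and randomizes the other half; decorrelated jitter draws the next delay from a range that grows with the previous sleep, then clamps it to `cap`. The function names, the use of `std::mt19937` instead of `rand()`, and the guard on the upper bound are choices made for this sketch, not code from client-c. The definition of `expo` is assumed to be `min(cap, base * 2^attempts)`, since its body is not shown in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

// Assumed definition: exponential growth capped at `cap`, min(cap, base * 2^attempts).
inline int expo(int base, int cap, int attempts)
{
    return static_cast<int>(std::min(static_cast<double>(cap), base * std::pow(2.0, attempts)));
}

// Equal jitter: half of the exponential delay is fixed, the other half is random,
// so the result lies between v / 2 and v (integer division).
inline int equalJitter(int base, int cap, int attempts, std::mt19937 & rng)
{
    assert(base > 0 && cap >= base);
    const int half = expo(base, cap, attempts) / 2;
    std::uniform_int_distribution<int> dist(0, half);
    return half + dist(rng);
}

// Decorrelated jitter: the next delay is drawn from [base, last_sleep * 3] and
// clamped to `cap`, so it depends on the previous sleep, not on the attempt count.
inline int decorrJitter(int base, int cap, int last_sleep, std::mt19937 & rng)
{
    assert(base > 0 && cap >= base);
    // Keep the range non-empty. The modulo form in the diff would be undefined
    // if last_sleep * 3 ever equalled base (modulo by zero).
    const int upper = std::max(base, last_sleep * 3);
    std::uniform_int_distribution<int> dist(base, upper);
    return std::min(cap, dist(rng));
}
```

With `base = 100` and `cap = 2000`, the third equal-jitter attempt sleeps somewhere between 400 and 800, while decorrelated jitter stays between 100 and 2000 regardless of how many attempts were made.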

pd_server -> stores[1] -> inject_region_not_found = true;
pd_server->addRegion(region, 0, 1);
pd_server->stores[1]->setReadIndex(5);
pd_server->stores[1]->inject_region_not_found = true;

::sleep(1);
Contributor

A more graceful waiting method like join?
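
One possible shape for such a wait, offered only as a sketch: poll a readiness predicate with a deadline instead of sleeping for a fixed second. `waitUntil` is a hypothetical helper, not part of the test code in this PR; joining the worker thread or waiting on a condition variable would be alternatives where the test owns the thread.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Polls `ready` every `interval` until it returns true or `timeout` elapses.
// Returns true if the condition was observed, false on timeout.
inline bool waitUntil(const std::function<bool()> & ready,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds interval = std::chrono::milliseconds(10))
{
    assert(ready && "predicate must be callable");
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!ready())
    {
        if (std::chrono::steady_clock::now() >= deadline)
            return false;
        std::this_thread::sleep_for(interval);
    }
    return true;
}
```

A test would then wait on an observable state change with a generous timeout, returning as soon as the state is reached and failing loudly if it never is.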

try {
RegionPtr RegionCache::loadRegion(Backoffer & bo, std::string key)
{
for (;;)
Contributor

Do we need a stop method, for graceful exit of the TiFlash process?
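
A minimal sketch of what a stoppable retry loop could look like, assuming a shared stop flag that the shutdown path sets. `retryUntilStopped` and the flag are hypothetical and are not part of `RegionCache` in this PR.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Calls `attempt` repeatedly until it succeeds or `stop` is set.
// Returns true if an attempt succeeded, false if the loop was interrupted.
inline bool retryUntilStopped(const std::atomic<bool> & stop, const std::function<bool()> & attempt)
{
    assert(attempt && "attempt must be callable");
    while (!stop.load())
    {
        if (attempt())
            return true;
        // A real client would back off here before the next attempt.
    }
    return false;
}
```

The caller has to treat a `false` return as "shutting down" rather than as a failed lookup, which is the main design cost of replacing an unconditional `for (;;)`.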

dbms/src/Server/Server.cpp (resolved)
dbms/src/Storages/Transaction/Region.cpp (resolved)
@zanmato1984
Contributor Author

/run-integration-tests

}
default:
{
break;
Contributor

throw?

Member

No need to process other ddl operations.

@zanmato1984
Contributor Author

/run-integration-tests

: 10))))))));
return x < (1ULL << 7)
? 1
: (x < (1ULL << 14)
Contributor

::cry::
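
The nested ternary above appears to compute how many bytes a 64-bit value needs as a base-128 varint (1 byte below 2^7, 2 bytes below 2^14, and so on up to 10). Assuming that reading is right, a loop gives the same answer in a more readable form. `varUIntLength` is a name chosen for this sketch and may not match the function in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of bytes needed to encode `x` as a base-128 varint
// (7 payload bits per byte).
inline std::size_t varUIntLength(std::uint64_t x)
{
    std::size_t length = 1;
    while (x >= 0x80)
    {
        x >>= 7;
        ++length;
    }
    // A 64-bit value never needs more than 10 bytes.
    assert(length <= 10);
    return length;
}
```

The ternary chain is branch-ordered for small values and may have been written that way for speed, so this is a readability alternative rather than a drop-in recommendation.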

@@ -10,7 +10,7 @@ rm -rf ./data ./log

docker-compose up -d --scale tics0=0 --scale tiflash0=0 --scale tikv-learner0=0

-sleep 10
+sleep 60
Contributor

More graceful checking to avoid this waiting?

Contributor Author

The bootstrap process can be tricky: we observed several times that TiKV was NOT ready after several tens of seconds. However, TiFlash doesn't have a graceful retry for errors during bootstrap; it just bails out. So we work around this here by waiting longer.

Contributor

@innerr innerr left a comment

LGTM except for the comments and *.test. We can merge this PR after all comments are addressed; I will check *.test later.

@zanmato1984
Contributor Author

/run-integration-tests

@zanmato1984
Contributor Author

/run-integration-tests

@zanmato1984 zanmato1984 merged commit 71e1b04 into master Aug 5, 2019
@zanmato1984 zanmato1984 deleted the ddl branch August 6, 2019 09:27
guo-shaoge pushed a commit to guo-shaoge/tiflash that referenced this pull request Nov 29, 2023
5 participants