Changes from all commits (38 commits)
d3d6e61
add simple support for Parquet
LuciferYang Nov 24, 2020
c7c9736
use a footer copy
LuciferYang Nov 24, 2020
f0ac389
remove copy method
LuciferYang Nov 24, 2020
8357771
rename FileMetaCacheManager
LuciferYang Nov 24, 2020
0b0ecf4
fix format
LuciferYang Nov 24, 2020
8bba51a
Merge branch 'upmaster' into SPARK-33449
LuciferYang Nov 25, 2020
a112791
parquet v2
LuciferYang Nov 25, 2020
3e2db1a
fix format
LuciferYang Nov 25, 2020
92d2f37
add ttl conf
LuciferYang Nov 25, 2020
b8b45ec
add a test case
LuciferYang Nov 25, 2020
c63d7cb
add a test case
LuciferYang Nov 25, 2020
7ff0502
add a test case
LuciferYang Nov 25, 2020
44ca052
add a test case
LuciferYang Nov 25, 2020
7254d88
use table name
LuciferYang Nov 25, 2020
bc25c4e
Add meta cache support for orc
LuciferYang Nov 25, 2020
6079adc
update conf version
LuciferYang Dec 21, 2020
190dc8a
Merge branch 'upmaster' into SPARK-33449
LuciferYang Dec 21, 2020
c485cc5
Merge branch 'upmaster' into SPARK-33449
LuciferYang Jan 6, 2021
b872010
Merge branch 'upmaster' into SPARK-33449
LuciferYang Jan 25, 2021
3f39531
config namespace and comments
LuciferYang Jan 25, 2021
99f18f5
remove guava import
LuciferYang Jan 25, 2021
6360580
add for testing comments
LuciferYang Jan 25, 2021
0a224ff
rename config
LuciferYang Jan 25, 2021
98ef2de
add comments
LuciferYang Jan 25, 2021
120678d
Merge branch 'upmaster' into SPARK-33449
LuciferYang Feb 3, 2021
eb8fa71
Merge branch 'upmaster' into SPARK-33449
LuciferYang Feb 26, 2021
0c93459
Merge branch 'upmaster' into SPARK-33449
LuciferYang Aug 9, 2021
61175ed
fix compile
LuciferYang Aug 9, 2021
0d4f3e8
fix compile
LuciferYang Aug 9, 2021
c5b827e
fix compile
LuciferYang Aug 9, 2021
179e7b0
add test case
LuciferYang Aug 9, 2021
36f502e
add test case
LuciferYang Aug 9, 2021
850c52b
fix java style
LuciferYang Aug 9, 2021
630f8db
Add some helper method
LuciferYang Aug 10, 2021
aace310
Merge branch 'upmaster' into SPARK-33449
LuciferYang Aug 16, 2021
fa75a95
to parquet only pr
LuciferYang Aug 16, 2021
4c022d7
remove unused import
LuciferYang Aug 16, 2021
104b125
remove private sql and add comments
LuciferYang Aug 16, 2021
@@ -967,6 +967,20 @@ object SQLConf {
.booleanConf
.createWithDefault(false)

val FILE_META_CACHE_PARQUET_ENABLED = buildConf("spark.sql.fileMetaCache.parquet.enabled")
.doc("Whether to enable the Parquet file meta cache. It is recommended to enable " +
"this config when multiple queries are performed on the same dataset. Default is false.")
.version("3.3.0")
.booleanConf
.createWithDefault(false)

Member: Hmm, curious whether this can help if your Spark queries are running as separate Spark jobs, where each of them may use different executors.

Contributor Author (@LuciferYang, Aug 19, 2021): Yes, this feature does have limitations; NODE_LOCAL + Thrift Server with interactive analysis should be the best scenario. If the architecture separates storage and compute, we need to consider task scheduling.

In fact, in the OAP project, the fileMetaCache relies on the dataCache (PROCESS_LOCAL).

val FILE_META_CACHE_TTL_SINCE_LAST_ACCESS =
buildConf("spark.sql.fileMetaCache.ttlSinceLastAccess")
.version("3.3.0")
.doc("Time-to-live for a file metadata cache entry after its last access, in seconds.")
.timeConf(TimeUnit.SECONDS)
.createWithDefault(3600L)

Member: Nit: maybe FILE_META_CACHE_TTL_SINCE_LAST_ACCESS_SEC and spark.sql.fileMetaCache.ttlSinceLastAccessSec, so it's easier to know that the unit is seconds?

Contributor Author: Good suggestion.

Contributor Author: Changed the default value to 1 hour (3600s).
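For orientation, here is a minimal usage sketch of the two configs proposed above; the keys follow this PR and may still change during review, and the dataset path is a placeholder:

import org.apache.spark.sql.SparkSession

// Hedged sketch: enable Parquet footer caching and set the TTL for cached entries.
val spark = SparkSession.builder()
  .appName("file-meta-cache-demo")
  .config("spark.sql.fileMetaCache.parquet.enabled", "true")    // cache Parquet footers
  .config("spark.sql.fileMetaCache.ttlSinceLastAccess", "3600") // evict 1 hour after last access
  .getOrCreate()

// Repeated queries over the same Parquet dataset can then reuse cached footers.
spark.read.parquet("/tmp/events").where("id > 5").count()
spark.read.parquet("/tmp/events").where("id < 5").count()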


val HIVE_VERIFY_PARTITION_PATH = buildConf("spark.sql.hive.verifyPartitionPath")
.doc("When true, check all the partition paths under the table\'s root directory " +
"when reading data stored in HDFS. This configuration will be deprecated in the future " +
@@ -3600,6 +3614,8 @@ class SQLConf extends Serializable with Logging {

def parquetVectorizedReaderBatchSize: Int = getConf(PARQUET_VECTORIZED_READER_BATCH_SIZE)

def fileMetaCacheParquetEnabled: Boolean = getConf(FILE_META_CACHE_PARQUET_ENABLED)

def columnBatchSize: Int = getConf(COLUMN_BATCH_SIZE)

def cacheVectorizedReaderEnabled: Boolean = getConf(CACHE_VECTORIZED_READER_ENABLED)
@@ -39,11 +39,16 @@
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.parquet.HadoopReadOptions;
import org.apache.parquet.ParquetReadOptions;
import org.apache.parquet.filter2.compat.FilterCompat;
import org.apache.parquet.filter2.compat.RowGroupFilter;
import org.apache.parquet.format.converter.ParquetMetadataConverter;
import org.apache.parquet.hadoop.BadConfigurationException;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.ParquetInputFormat;
import org.apache.parquet.hadoop.api.InitContext;
import org.apache.parquet.hadoop.api.ReadSupport;
import org.apache.parquet.hadoop.metadata.ParquetMetadata;
import org.apache.parquet.hadoop.metadata.BlockMetaData;
import org.apache.parquet.hadoop.util.ConfigurationUtil;
import org.apache.parquet.hadoop.util.HadoopInputFile;
import org.apache.parquet.schema.MessageType;
@@ -77,28 +82,31 @@ public abstract class SpecificParquetRecordReaderBase<T> extends RecordReader<Vo

protected ParquetFileReader reader;

protected ParquetMetadata cachedFooter;

@Override
public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptContext)
throws IOException, InterruptedException {
Configuration configuration = taskAttemptContext.getConfiguration();
FileSplit split = (FileSplit) inputSplit;
this.file = split.getPath();

ParquetReadOptions options = HadoopReadOptions
.builder(configuration)
.withRange(split.getStart(), split.getStart() + split.getLength())
.build();
this.reader = new ParquetFileReader(HadoopInputFile.fromPath(file, configuration), options);
this.fileSchema = reader.getFileMetaData().getSchema();
Map<String, String> fileMetadata = reader.getFileMetaData().getKeyValueMetaData();
ParquetMetadata footer =
readFooterByRange(configuration, split.getStart(), split.getStart() + split.getLength());

Contributor Author (@LuciferYang, Aug 9, 2021): @dongjoon-hyun The key problem is here: after SPARK-32703, we use the new API to create a ParquetFileReader. However, in order to reuse the file footer, we have to use some deprecated APIs.

Member: Oh, got it. Thank you for pointing out this issue, @LuciferYang.
this.fileSchema = footer.getFileMetaData().getSchema();
FilterCompat.Filter filter = ParquetInputFormat.getFilter(configuration);
List<BlockMetaData> blocks =
RowGroupFilter.filterRowGroups(filter, footer.getBlocks(), fileSchema);
Member: Does this apply all the filter levels? E.g., stats, dictionary, and bloom filter.

Contributor Author: I need to investigate it again.

Map<String, String> fileMetadata = footer.getFileMetaData().getKeyValueMetaData();
ReadSupport<T> readSupport = getReadSupportInstance(getReadSupportClass(configuration));
ReadSupport.ReadContext readContext = readSupport.init(new InitContext(
taskAttemptContext.getConfiguration(), toSetMultiMap(fileMetadata), fileSchema));
this.requestedSchema = readContext.getRequestedSchema();
reader.setRequestedSchema(requestedSchema);
String sparkRequestedSchemaString =
configuration.get(ParquetReadSupport$.MODULE$.SPARK_ROW_REQUESTED_SCHEMA());
this.sparkSchema = StructType$.MODULE$.fromString(sparkRequestedSchemaString);
this.reader = new ParquetFileReader(
configuration, footer.getFileMetaData(), file, blocks, requestedSchema.getColumns());
this.totalRowCount = reader.getFilteredRecordCount();

// For test purpose.
@@ -116,6 +124,28 @@ public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptCont
}
}

public void setCachedFooter(ParquetMetadata cachedFooter) {
this.cachedFooter = cachedFooter;
}

Contributor Author: If we don't want to add a similar API, we can also retrieve the footer from the cache in this file.

private ParquetMetadata readFooterByRange(Configuration configuration,
long start, long end) throws IOException {
if (cachedFooter != null) {
// Reuse the cached footer: keep only the row groups whose starting offset
// falls within this split's byte range.
List<BlockMetaData> filteredBlocks = new ArrayList<>();
List<BlockMetaData> blocks = cachedFooter.getBlocks();
for (BlockMetaData block : blocks) {
long offset = block.getStartingPos();
if (offset >= start && offset < end) {
filteredBlocks.add(block);
}
}
return new ParquetMetadata(cachedFooter.getFileMetaData(), filteredBlocks);
} else {
// No cached footer available: fall back to reading the footer from the file,
// restricted to this split's byte range.
return ParquetFileReader
.readFooter(configuration, file, ParquetMetadataConverter.range(start, end));
}
}

/**
* Returns the list of files at 'path' recursively. This skips files that are ignored normally
* by MapReduce.
@@ -0,0 +1,87 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.sql.execution.datasources

import java.util.concurrent.TimeUnit

import com.github.benmanes.caffeine.cache.{CacheLoader, Caffeine}
import com.github.benmanes.caffeine.cache.stats.CacheStats
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

import org.apache.spark.SparkEnv
import org.apache.spark.internal.Logging
import org.apache.spark.sql.internal.SQLConf

/**
* A singleton cache manager for caching file meta. We cache file meta in order to speed up
* repeated queries over the same dataset; otherwise, each query would have to hit remote
* storage to fetch file meta before reading files.
*
* A specific file format should implement a corresponding `FileMetaKey`, for example
* `ParquetFileMetaKey` or `OrcFileMetaKey`. By default, the file path is used as the identity
* of a `FileMetaKey`, and its `getFileMeta` method returns the file meta for the
* corresponding file format.
*/
object FileMetaCacheManager extends Logging {

private lazy val cacheLoader = new CacheLoader[FileMetaKey, FileMeta]() {
override def load(entry: FileMetaKey): FileMeta = {
logDebug(s"Loading Data File Meta ${entry.path}")
entry.getFileMeta
}
}

private lazy val ttlTime =
SparkEnv.get.conf.get(SQLConf.FILE_META_CACHE_TTL_SINCE_LAST_ACCESS)

private lazy val cache = Caffeine
.newBuilder()
.expireAfterAccess(ttlTime, TimeUnit.SECONDS)
.recordStats()
.build[FileMetaKey, FileMeta](cacheLoader)

/**
* Returns the `FileMeta` associated with the `FileMetaKey` in the `FileMetaCacheManager`,
* obtaining the `FileMeta` from `cacheLoader.load(FileMetaKey)` if necessary.
*/
def get(dataFile: FileMetaKey): FileMeta = cache.get(dataFile)

/**
* This is visible for testing.
*/
def cacheStats: CacheStats = cache.stats()
Member: Is this visible only for testing?

Contributor Author (@LuciferYang, Jan 25, 2021): Yes, added comments.


/**
* This is visible for testing.
*/
def cleanUp(): Unit = cache.cleanUp()
}

abstract class FileMetaKey {
def path: Path
def configuration: Configuration
def getFileMeta: FileMeta
override def hashCode(): Int = path.hashCode
override def equals(other: Any): Boolean = other match {
case df: FileMetaKey => path.equals(df.path)
case _ => false
}
}

Member: What if the same file gets replaced? How do we invalidate the cache? This is very common in my experience, e.g., Hive overwriting a partition.

Contributor Author: This is a very good question; we discussed it in #33748 (comment).

If the file name contains a timestamp, I think we don't have to worry too much: the names of the new file and the old file are different, which ensures we don't read the wrong data.

If a file is manually replaced by one with the same name and the corresponding file meta exists in the cache, an incorrect file meta will be used to read the data. If the read fails, the job fails; but if the read happens to succeed, the job will read wrong data.

In fact, even without `FileMetaCache` there is a similar risk when manually replacing files with the same name, because the offset and length of a PartitionedFile may no longer match after the replacement for a running job.

At the same time, I added a warning for this feature in SQLConf.

For now the Parquet support is a draft because of the deprecated API; we are focusing on ORC (SPARK-36516).
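Purely as an illustration of the invalidation concern raised above (not something this PR proposes), one conceivable mitigation would be to fold the file's length and modification time into the key, so that a path overwritten in place no longer matches its stale cache entry; VersionedFileMetaKey is a hypothetical name:

// Hypothetical sketch only; not part of this PR.
case class VersionedFileMetaKey(
    path: Path,
    configuration: Configuration,
    length: Long,
    modificationTime: Long) extends FileMetaKey {
  // A real implementation would read the format-specific footer, as ParquetFileMetaKey does.
  override def getFileMeta: FileMeta = throw new UnsupportedOperationException("sketch only")
  override def hashCode(): Int = (path, length, modificationTime).hashCode()
  override def equals(other: Any): Boolean = other match {
    case k: VersionedFileMetaKey =>
      path.equals(k.path) && length == k.length && modificationTime == k.modificationTime
    case _ => false
  }
}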

trait FileMeta
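To round out the new file, here is a hedged sketch of how a caller interacts with the cache manager, using the ParquetFileMetaKey added later in this PR (the path is a placeholder and imports are elided):

// Sketch: warm the cache for one Parquet footer and inspect cache statistics.
val hadoopConf = new Configuration()
val key = ParquetFileMetaKey(new Path("/tmp/data/part-00000.parquet"), hadoopConf)
val meta = FileMetaCacheManager.get(key)       // first access loads the footer via getFileMeta
val metaAgain = FileMetaCacheManager.get(key)  // second access is served from the cache
println(FileMetaCacheManager.cacheStats)       // hit/miss counters, visible for testing
FileMetaCacheManager.cleanUp()                 // visible for testing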
@@ -228,6 +228,10 @@ class ParquetFileFormat
SQLConf.PARQUET_INT96_AS_TIMESTAMP.key,
sparkSession.sessionState.conf.isParquetINT96AsTimestamp)

hadoopConf.setBoolean(
SQLConf.FILE_META_CACHE_PARQUET_ENABLED.key,
sparkSession.sessionState.conf.fileMetaCacheParquetEnabled)

val broadcastedHadoopConf =
sparkSession.sparkContext.broadcast(new SerializableConfiguration(hadoopConf))

@@ -263,9 +267,14 @@
val split = new FileSplit(filePath, file.start, file.length, Array.empty[String])

val sharedConf = broadcastedHadoopConf.value.value
val metaCacheEnabled =
sharedConf.getBoolean(SQLConf.FILE_META_CACHE_PARQUET_ENABLED.key, false)

lazy val footerFileMetaData =
lazy val footerFileMetaData = if (metaCacheEnabled) {
ParquetFileMeta.readFooterFromCache(filePath, sharedConf).getFileMetaData
} else {
ParquetFooterReader.readFooter(sharedConf, filePath, SKIP_ROW_GROUPS).getFileMetaData
}
val datetimeRebaseMode = DataSourceUtils.datetimeRebaseMode(
footerFileMetaData.getKeyValueMetaData.get,
datetimeRebaseModeInRead)
@@ -327,6 +336,11 @@
int96RebaseMode.toString,
enableOffHeapColumnVector && taskContext.isDefined,
capacity)
// Set footer before initialize.
if (metaCacheEnabled) {
val footer = ParquetFileMeta.readFooterFromCache(filePath, sharedConf)
vectorizedReader.setCachedFooter(footer)
}
val iter = new RecordReaderIterator(vectorizedReader)
// SPARK-23457 Register a task completion listener before `initialization`.
taskContext.foreach(_.addTaskCompletionListener[Unit](_ => iter.close()))
@@ -0,0 +1,45 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.sql.execution.datasources.parquet

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.format.converter.ParquetMetadataConverter.NO_FILTER
import org.apache.parquet.hadoop.ParquetFileReader
import org.apache.parquet.hadoop.metadata.ParquetMetadata

import org.apache.spark.sql.execution.datasources.{FileMeta, FileMetaCacheManager, FileMetaKey}

case class ParquetFileMetaKey(path: Path, configuration: Configuration)
extends FileMetaKey {
override def getFileMeta: ParquetFileMeta = ParquetFileMeta(path, configuration)
}

class ParquetFileMeta(val footer: ParquetMetadata) extends FileMeta

object ParquetFileMeta {
def apply(path: Path, conf: Configuration): ParquetFileMeta = {
new ParquetFileMeta(ParquetFileReader.readFooter(conf, path, NO_FILTER))
}

def readFooterFromCache(path: Path, conf: Configuration): ParquetMetadata =
readFooterFromCache(ParquetFileMetaKey(path, conf))

def readFooterFromCache(key: ParquetFileMetaKey): ParquetMetadata =
FileMetaCacheManager.get(key).asInstanceOf[ParquetFileMeta].footer
}
@@ -77,6 +77,7 @@ case class ParquetPartitionReaderFactory(
private val pushDownDecimal = sqlConf.parquetFilterPushDownDecimal
private val pushDownStringStartWith = sqlConf.parquetFilterPushDownStringStartWith
private val pushDownInFilterThreshold = sqlConf.parquetFilterPushDownInFilterThreshold
private val parquetMetaCacheEnabled = sqlConf.fileMetaCacheParquetEnabled
private val datetimeRebaseModeInRead = parquetOptions.datetimeRebaseModeInRead
private val int96RebaseModeInRead = parquetOptions.int96RebaseModeInRead

@@ -131,8 +132,11 @@
val filePath = new Path(new URI(file.filePath))
val split = new FileSplit(filePath, file.start, file.length, Array.empty[String])

lazy val footerFileMetaData =
lazy val footerFileMetaData = if (parquetMetaCacheEnabled) {
ParquetFileMeta.readFooterFromCache(filePath, conf).getFileMetaData
} else {
ParquetFooterReader.readFooter(conf, filePath, SKIP_ROW_GROUPS).getFileMetaData
}

Member: What happens if the file is removed and replaced?

Contributor Author: We can discuss it in #33748 first. I'll set this PR to draft for now.
val datetimeRebaseMode = DataSourceUtils.datetimeRebaseMode(
footerFileMetaData.getKeyValueMetaData.get,
datetimeRebaseModeInRead)
@@ -249,6 +253,12 @@ case class ParquetPartitionReaderFactory(
int96RebaseMode.toString,
enableOffHeapColumnVector && taskContext.isDefined,
capacity)
// Set footer before initialize.
if (parquetMetaCacheEnabled) {
val fileMeta =
ParquetFileMeta.readFooterFromCache(split.getPath, hadoopAttemptContext.getConfiguration)
vectorizedReader.setCachedFooter(fileMeta)
}
val iter = new RecordReaderIterator(vectorizedReader)
// SPARK-23457 Register a task completion listener before `initialization`.
taskContext.foreach(_.addTaskCompletionListener[Unit](_ => iter.close()))
@@ -28,7 +28,7 @@ import org.apache.spark.sql._
import org.apache.spark.sql.catalyst.{InternalRow, TableIdentifier}
import org.apache.spark.sql.catalyst.expressions.SpecificInternalRow
import org.apache.spark.sql.execution.FileSourceScanExec
import org.apache.spark.sql.execution.datasources.{SchemaColumnConvertNotSupportedException, SQLHadoopMapReduceCommitProtocol}
import org.apache.spark.sql.execution.datasources.{FileMetaCacheManager, SchemaColumnConvertNotSupportedException, SQLHadoopMapReduceCommitProtocol}
import org.apache.spark.sql.execution.datasources.parquet.TestingUDT.{NestedStruct, NestedStructUDT, SingleElement}
import org.apache.spark.sql.execution.datasources.v2.BatchScanExec
import org.apache.spark.sql.execution.datasources.v2.parquet.ParquetScan
@@ -933,6 +933,39 @@ class ParquetV1QuerySuite extends ParquetQuerySuite {
}
}
}

test("SPARK-33449: simple select queries with file meta cache") {
withSQLConf(SQLConf.FILE_META_CACHE_PARQUET_ENABLED.key -> "true") {
val tableName = "parquet_use_meta_cache"
withTable(tableName) {
(0 until 10).map(i => (i, i.toString)).toDF("id", "value")
.write.saveAsTable(tableName)
try {
val statsBeforeQuery = FileMetaCacheManager.cacheStats
checkAnswer(sql(s"SELECT id FROM $tableName where id > 5"),
(6 until 10).map(Row.apply(_)))
val statsAfterQuery1 = FileMetaCacheManager.cacheStats
// The 1st query triggers 4 file meta reads: 2 related to filter push-down
// and 2 related to reading the files. The 1st query runs twice
// (df.collect() and df.rdd.count()), so it triggers 8 file meta reads in
// total. missCount is 2 because the cache is empty and 2 file metas need
// to be loaded; the other 6 reads are served from the cache.
assert(statsAfterQuery1.missCount() - statsBeforeQuery.missCount() == 2)
assert(statsAfterQuery1.hitCount() - statsBeforeQuery.hitCount() == 6)
checkAnswer(sql(s"SELECT id FROM $tableName where id < 5"),
(0 until 5).map(Row.apply(_)))
val statsAfterQuery2 = FileMetaCacheManager.cacheStats
// The 2nd query also triggers 8 file meta reads in total, all served
// from the meta cache, so missCount does not grow and hitCount
// increases by 8.
assert(statsAfterQuery2.missCount() - statsAfterQuery1.missCount() == 0)
assert(statsAfterQuery2.hitCount() - statsAfterQuery1.hitCount() == 8)
} finally {
FileMetaCacheManager.cleanUp()
}
}
}
}
}

class ParquetV2QuerySuite extends ParquetQuerySuite {