Predicate pushdown

With predicate pushdown, a Parquet reader pulls only the data that survives the query's filter: the selected partitions, the row groups that can contain matching rows, and the column chunks the query actually needs. This makes queries both fast and light on I/O. A natural question is: what is the big deal, when every database does this? The big deal is that Parquet is not a database, just a file format, and its files can sit in an ordinary store such as HDFS or S3.
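
To make the idea concrete, here is a minimal sketch in plain Python of how a reader might skip row groups and columns. It is a toy model, not the Parquet library's API: the `RowGroup` class, the `scan` function, and the sample data are hypothetical names invented for illustration. The one fact it relies on is that Parquet keeps min/max statistics per column chunk, so a reader can rule out a whole row group without reading it.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Toy stand-in for a Parquet row group: one list of values per column."""
    columns: dict[str, list[int]]

    def stats(self, column: str) -> tuple[int, int]:
        # Real Parquet stores min/max in file metadata; this toy computes them.
        values = self.columns[column]
        return min(values), max(values)


def scan(row_groups, wanted_columns, filter_column, lower_bound):
    """Return rows with filter_column >= lower_bound and the number of row groups read."""
    rows, groups_read = [], 0
    for group in row_groups:
        _, col_max = group.stats(filter_column)
        if col_max < lower_bound:
            continue  # no row in this group can match, so skip it entirely
        groups_read += 1
        for i, value in enumerate(group.columns[filter_column]):
            if value >= lower_bound:
                # Column pruning: copy only the columns the query asked for.
                rows.append({c: group.columns[c][i] for c in wanted_columns})
    return rows, groups_read


# Hypothetical data: three row groups, each holding three rows.
row_groups = [
    RowGroup({"year": [2001, 2002, 2003], "sales": [10, 20, 30], "region": [1, 1, 2]}),
    RowGroup({"year": [2010, 2011, 2012], "sales": [40, 50, 60], "region": [2, 3, 3]}),
    RowGroup({"year": [2019, 2020, 2021], "sales": [70, 80, 90], "region": [1, 2, 3]}),
]

# Roughly: SELECT year, sales WHERE year >= 2011
rows, groups_read = scan(row_groups, ["year", "sales"], "year", 2011)
print(groups_read, len(rows))
```

In this sketch the first row group is never touched because its maximum `year` is below the bound, and the `region` column is never copied because the query did not ask for it. A real reader works on bytes in a file rather than Python lists, but the skipping decisions are of the same kind.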
