Dremel: Interactive Analysis of Web-Scale Datasets
Published:
Introduction
Parquet is one of the most important and impactful formats in recent data engineering history, so this post tries to explain how Parquet works at a very basic level. The Parquet format was heavily influenced by the Dremel paper, as the Parquet project's motivation statement mentions. This post is designed to walk you through the key points of the paper in more approachable language. It can be particularly useful if you’ve already read the paper and want clarification on certain parts, or if you simply prefer the conversational tone of a blog over the formal language of an academic paper. That said, the original paper is quite straightforward, and I recommend reading it either before or after this post.