Data Engineering Blogroll
Discover the latest articles, tutorials, and insights curated from DataLakehouseHub.com.
Migrating to Apache Iceberg: Strategies for Every Source System
Migrate to Iceberg from Hive, data warehouses, or raw files using in-place migration, full rewrite, or the zero-downtime view s...
Hands-On with Apache Iceberg Using Dremio Cloud
A practical walkthrough of creating, querying, and optimizing Iceberg tables on Dremio Cloud, from account setup to AI-powered ...
Approaches to Streaming Data into Apache Iceberg Tables
Stream data into Iceberg with Spark Structured Streaming, Flink, or Kafka Connect. Here is how each works and the trade-offs be...
Using Apache Iceberg with Python and MPP Query Engines
Access Iceberg tables from Python with PyIceberg, DuckDB, and Polars, or through MPP engines like Dremio, Spark, and Trino. Her...
Apache Iceberg Metadata Tables: Querying the Internals
Iceberg metadata tables let you query snapshots, files, manifests, and partitions using SQL. Here is every metadata table and h...
Maintaining Apache Iceberg Tables: Compaction, Expiry, and Cleanup
Keep Iceberg tables fast with compaction, snapshot expiry, orphan cleanup, and manifest rewriting. Here is when and how to run ...
Concurrency, Isolation, and MVCC: How Engines Handle Contention
Databases handle concurrent access using locks, MVCC, or optimistic concurrency control. Here is how each approach works and wh...
How Data Lake Table Storage Degrades Over Time
Iceberg tables degrade through small files, orphan files, metadata bloat, sort order decay, and partition skew. Here is how to ...
Hash, Sort-Merge, Broadcast: How Distributed Joins Work
Distributed joins move data across the network using shuffle, broadcast, or co-location strategies. Here is how each works and ...
When Catalogs Are Embedded in Storage
S3 Tables and MinIO AI Stor embed the Iceberg catalog directly in the storage layer. Here is when embedded catalogs make sense ...
Partitioning, Sharding, and Data Distribution Strategies
Hash partitioning distributes data evenly. Range partitioning enables fast range scans. Both create tradeoffs. Here is how data...
What Are Lakehouse Catalogs? The Role of Catalogs in Apache Iceberg
Lakehouse catalogs store metadata pointers, manage namespaces, and enforce access control. Here is the complete catalog landsca...
Buffer Pools, Caches, and the Memory Hierarchy
Databases use buffer pools, column caches, and result caches to keep hot data in RAM. Here is how each caching strategy works a...
Writing to an Apache Iceberg Table: How Commits and ACID Actually Work
Here is exactly how an engine writes to an Iceberg table, step by step, from data files through the atomic commit that makes AC...
Volcano, Vectorized, Compiled: How Engines Execute Your Query
The Volcano model processes one row at a time. Vectorized execution processes batches with SIMD. Code generation fuses operator...
Hidden Partitioning: How Iceberg Eliminates Accidental Full Table Scans
Iceberg's hidden partitioning separates physical layout from user queries using transform functions. Here is how it works and w...
Inside the Query Optimizer: How Engines Pick a Plan
Query optimizers transform SQL into execution plans using rule-based rewrites, cost-based search, and adaptive runtime adjustme...
Partition Evolution: Change Your Partitioning Without Rewriting Data
Iceberg lets you change partition schemes without rewriting data. Here is how partition evolution works internally and why Hive...
B-Trees, LSM Trees, and the Indexing Tradeoff Spectrum
B-trees balance reads and writes for OLTP. LSM trees maximize write throughput. Bitmap indexes accelerate OLAP filtering. Here ...
Performance and Apache Iceberg's Metadata
Iceberg's three-layer metadata tree eliminates directory listing and enables multi-level data skipping. Here is how scan planni...
How Databases Organize Data on Disk: Pages, Blocks, and File Formats
Databases structure data on disk as heap files, sorted files, or LSM trees, then wrap it in formats like Parquet with metadata ...
The Metadata Structure of Modern Table Formats
Iceberg uses a metadata tree, Delta Lake uses a transaction log, Hudi uses a timeline. Here is exactly how each format organize...
Row vs. Column: How Storage Layout Shapes Everything
Row stores keep records together for fast transactions. Column stores keep field values together for fast analytics. Here is ho...
What Are Table Formats and Why Were They Needed?
Table formats like Apache Iceberg solved the ACID, schema, and performance problems that turned data lakes into data swamps. He...
How Query Engines Think: The Tradeoffs Behind Every Data System
Every database is a collection of engineering tradeoffs. Learn the 9 design decisions that shape how query engines store, index...
Agentic Analytics on the Apache Lakehouse
*Read the complete Open Source and the Lakehouse series:* * [Part 1: Apache Software Foundation: History, Purpose, and Process](/blog/2026-04-apache-s...
What is Apache Iceberg? The Table Format Revolution
*Read the complete Open Source and the Lakehouse series:* * [Part 1: Apache Software Foundation](/blog/2026-04-apache-software-foundation) * [Part 2: ...
What is Apache Arrow? Erasing the Serialization Tax
What is Apache Parquet? Columns, Encoding, and Performance
What is Apache Polaris? Unifying the Iceberg Ecosystem
Assembling the Apache Lakehouse: The Modular Architecture
Apache Software Foundation: History, Purpose, and Process
The Model Context Protocol (MCP) Explained: A Complete Guide to How Every Major AI Tool Connects to External Data
The Model Context Protocol (MCP) has become the universal standard for connecting AI models to external tools, data sources, and services. Originally ...
Context Management Strategies for VS Code with LLM Plugins: A Complete Guide to Building Your Own AI-Powered IDE
Visual Studio Code is the most widely used code editor in the world, and its extensibility means you can integrate AI capabilities through a growing e...
Context Management Strategies for T3 Chat: A Complete Guide to the Unified Multi-Model AI Interface
T3 Chat is a modern web-based AI chat interface that gives you access to multiple AI models through a single unified platform. Its primary value propo...
Context Management Strategies for Zed: A Complete Guide to the High-Performance AI Code Editor
Zed is a high-performance code editor built in Rust that prioritizes speed, simplicity, and real-time collaboration. Its AI integration is designed to...
Context Management Strategies for Windsurf: A Complete Guide to the AI Flow IDE
Windsurf is an AI-powered IDE built on the VS Code foundation that introduces the concept of "Flows," a paradigm where the AI maintains deep awareness...
Context Management Strategies for Perplexity AI: A Complete Guide to Research-First AI Conversations
Perplexity AI occupies a unique position in the AI landscape: it is a research-first tool that combines conversational AI with real-time web search to...
Context Management Strategies for Cursor: A Complete Guide to the AI-Native Code Editor
Cursor is an AI-native code editor built on the VS Code foundation that integrates AI deeply into every aspect of the development workflow. Its contex...
Context Management Strategies for OpenWork: A Complete Guide to the Desktop AI Agent Framework
OpenWork is a desktop-native AI agent framework designed for local, multi-step task execution on your computer. Unlike browser-based AI tools or termi...
Context Management Strategies for OpenCode: A Complete Guide to the Open-Source Terminal AI Agent
OpenCode is an open-source terminal-based AI coding agent that prioritizes privacy, local-first operation, and broad model provider support. Built as ...
Context Management Strategies for Google Antigravity: A Complete Guide to the Agent-First IDE
Google Antigravity is an agent-first IDE built by Google DeepMind's Advanced Agentic Coding team. It approaches context management differently from ot...
Context Management Strategies for Gemini CLI: A Complete Guide to Terminal-Native AI Development
Gemini CLI is an open-source terminal agent powered by Gemini models that operates directly in your command line. It brings Google's AI capabilities i...
Context Management Strategies for Gemini Web and NotebookLM: A Complete Guide to Google's AI Knowledge Ecosystem
Google's AI ecosystem for knowledge work consists of two deeply integrated tools: Gemini (the conversational AI at gemini.google.com) and NotebookLM (...
Context Management Strategies for Claude Code: A Complete Guide for Developers
Claude Code is a terminal-native agentic coding assistant that lives in your command line and operates directly on your codebase. Unlike chat-based in...
Context Management Strategies for Claude CoWork: A Complete Guide for Knowledge Workers
Claude CoWork represents a fundamentally different approach to AI context management. Unlike chat interfaces where you send messages and receive respo...
Context Management Strategies for Claude Desktop: A Complete Guide to MCP, Computer Use, and Local File Access
Claude Desktop takes everything available in Claude Web and adds three capabilities that fundamentally change how you manage context: MCP server conne...
Context Management Strategies for Claude Web: A Complete Guide to Projects, Artifacts, and Intelligent Context
Claude's web interface at claude.ai combines one of the largest context windows in the industry with a structured Project system that makes it genuine...
Context Management Strategies for OpenAI Codex: A Complete Guide Across Browser, CLI, and App
OpenAI Codex is not a chatbot. It is an autonomous software engineering agent that runs tasks in isolated cloud sandboxes, operates across a browser i...
Context Management Strategies for ChatGPT: A Complete Guide to Getting Better Results
Getting consistently useful results from ChatGPT requires more than writing good prompts. The real differentiator is how you manage context: the backg...
How to Use Dremio with OpenWork: Connect, Query, and Build Data Apps
OpenWork is an open-source desktop AI agent built on the OpenCode engine. It runs entirely on your machine with your own API keys, giving you full con...
How to Use Dremio with OpenCode: Connect, Query, and Build Data Apps
OpenCode is an open-source, terminal-based AI coding agent released under the MIT license. It provides a TUI with split panes, uses the Language Serve...
How to Use Dremio with Zed: Connect, Query, and Build Data Apps
Zed is an open-source, GPU-accelerated code editor written in Rust. It is designed for speed and collaboration, with a built-in AI assistant that supp...
How to Use Dremio with OpenAI Codex CLI: Connect, Query, and Build Data Apps
OpenAI Codex CLI is a terminal-based coding agent built in Rust. It reads your codebase, writes files, executes commands, and supports MCP for connect...
How to Use Dremio with Amazon Kiro: Connect, Query, and Build Data Apps
Amazon Kiro is an agentic AI IDE from AWS that introduces spec-driven development to the coding workflow. Instead of jumping straight to code, Kiro he...
How to Use Dremio with JetBrains AI Assistant: Connect, Query, and Build Data Apps
JetBrains AI Assistant is built into IntelliJ IDEA, PyCharm, DataGrip, and every JetBrains IDE. It provides AI chat, inline code generation, multi-fil...
How to Use Dremio with Gemini CLI: Connect, Query, and Build Data Apps
Gemini CLI is Google's open-source terminal-based AI agent. It runs directly in your terminal, powered by Gemini models with a 1-million token context...
How to Use Dremio with Google Antigravity: Connect, Query, and Build Data Apps
Google Antigravity is an agent-first IDE built by Google DeepMind. Its autonomous agents plan multi-step tasks, write code, browse documentation, and ...
How to Use Dremio with Windsurf: Connect, Query, and Build Data Apps
Windsurf is an AI-native code editor built as a fork of VS Code. Its standout feature is Cascade, an agentic AI system that plans and executes multi-s...
How to Use Dremio with GitHub Copilot: Connect, Query, and Build Data Apps
GitHub Copilot is the most widely adopted AI coding assistant, integrated into VS Code, JetBrains IDEs, and the GitHub platform. Its agent mode allows...
How to Use Dremio with Claude CoWork: Connect, Query, and Build Data Apps
Claude CoWork is Anthropic's desktop agentic assistant. Unlike Claude Code (a terminal coding agent), CoWork operates as a general-purpose autonomous ...
How to Use Dremio with Claude Code: Connect, Query, and Build Data Apps
Claude Code is Anthropic's terminal-based coding agent. It reads your files, writes code, runs commands, and maintains context across a session. Dremi...
How to Use Dremio with Cursor: Connect, Query, and Build Data Apps
Cursor is an AI-native code editor built as a fork of VS Code. It integrates AI directly into the editing experience with features like Chat, Composer...
The 2025 State of the Apache Iceberg Ecosystem Results
 **Raw Results at Bottom of Post** **Apache Iceberg Literature from Alex Merced and/or Andrew Madsen:**...
Connect Dremio Software to Dremio Cloud: Hybrid Federation Across Deployments
Dremio Cloud can connect to Dremio Software (self-managed) instances as a federated data source. This creates a hybrid deployment where Dremio Cloud s...
Dremio's Built-in Open Catalog: Your Zero-Configuration Apache Iceberg Lakehouse
Every Dremio Cloud account starts with a built-in Open Catalog — a fully managed Apache Iceberg catalog with integrated storage. When you create a Dre...
Connect Any Iceberg REST Catalog to Dremio Cloud: Universal Lakehouse Access
The Apache Iceberg REST Catalog specification defines a standard HTTP API for managing Iceberg table metadata. Any catalog implementation that conform...
Connect Databricks Unity Catalog to Dremio Cloud: Query Delta Lake Tables with Federation and AI
Databricks Unity Catalog is Databricks' governance layer for data and AI assets. It manages Delta Lake tables, machine learning models, feature stores...
Connect Snowflake Open Catalog to Dremio Cloud: Multi-Engine Iceberg Analytics
Snowflake Open Catalog is Snowflake's managed implementation of the Apache Iceberg REST catalog specification, based on the open-source Apache Polaris...
Connect AWS Glue Data Catalog to Dremio Cloud: Query and Manage Your AWS Iceberg Tables
AWS Glue Data Catalog is AWS's managed metadata service for data lakes. It stores table definitions, schemas, partition information, and statistics fo...
Connect Apache Druid to Dremio Cloud: Add SQL Joins, AI, and Governance to Your Real-Time Analytics
Apache Druid is a real-time analytics database designed for sub-second queries on high-ingestion-rate event data. Clickstream analytics, application m...
Connect MongoDB to Dremio Cloud: SQL Analytics on Document Data
MongoDB is the most popular NoSQL document database. It stores data in flexible JSON-like documents, making it ideal for applications with evolving sc...
Connect Vertica to Dremio Cloud: Federation for Analytics-Optimized Data
Vertica is a columnar analytics database engineered for fast aggregate queries on large datasets. It was built from the ground up for analytical workl...
Connect Azure Synapse Analytics to Dremio Cloud: Multi-Cloud Data Warehouse Federation
Microsoft Azure Synapse Analytics combines big data analytics and enterprise data warehousing into a single Azure-integrated platform. If your organiz...
Connect Snowflake to Dremio Cloud: Federate, Govern, and Accelerate Beyond Snowflake
Snowflake is a popular cloud data warehouse known for its separation of storage and compute, near-zero maintenance, and broad ecosystem. Many organiza...
Connect Google BigQuery to Dremio Cloud: Cross-Cloud Analytics Without Data Movement
Google BigQuery is Google Cloud's serverless data warehouse. If your organization uses Google Cloud Platform, BigQuery is where your analytics data, m...
Connect Amazon Redshift to Dremio Cloud: Extend Your Warehouse with Federation and AI Analytics
Amazon Redshift is AWS's managed data warehouse, designed for petabyte-scale analytics. If your organization chose Redshift for analytical workloads, ...
Connect Azure Storage to Dremio Cloud: Query Your Microsoft Data Lake with SQL and AI
Azure Storage is Microsoft's cloud storage platform, spanning Blob Storage, Azure Data Lake Storage Gen2 (ADLS Gen2), and Azure Files. If your organiz...
Connect Amazon S3 to Dremio Cloud: Query Your Data Lake with SQL, Federation, and AI
Amazon S3 is the default landing zone for data in the cloud. Log files, Parquet datasets, CSV exports, JSON events, IoT telemetry, and raw data dumps ...
Connect SAP HANA to Dremio Cloud: Unlock Analytics Beyond the SAP Ecosystem
SAP HANA is the in-memory database platform that powers SAP S/4HANA, SAP BW/4HANA, and custom enterprise applications across finance, manufacturing, l...
Connect IBM Db2 to Dremio Cloud: Modernize Mainframe Analytics with Federation and AI
IBM Db2 is the relational database that powers critical applications across banking, insurance, government, healthcare, and manufacturing. For organiz...
Connect Microsoft SQL Server to Dremio Cloud: Federate Enterprise Data Without ETL
Microsoft SQL Server is one of the most widely deployed enterprise databases in the world. ERP systems, CRM platforms, financial applications, and cus...
Connect Oracle Database to Dremio Cloud: Enterprise Analytics Without Data Movement
Oracle Database runs the most critical enterprise applications in the world — ERP systems, financial ledgers, supply chain management, and HR platform...
Connect MySQL to Dremio Cloud: Federated Analytics Without ETL
MySQL runs more web applications, SaaS platforms, and e-commerce backends than any other database. It's fast for transactional reads and writes, but i...
Connect PostgreSQL to Dremio Cloud: Query, Federate, and Accelerate Your Data
PostgreSQL powers more production applications than almost any other open-source database. It's where your customer records, transaction logs, product...
Extract Structured Data from Text with Dremio's AI_GENERATE Function
Unstructured text is the most underused data in most organizations. Customer emails sit in inboxes. Contract notes live in text fields. Meeting summar...
Generate Summaries and Insights with Dremio's AI_COMPLETE Function
Every data team has a version of this problem: a table full of raw data that needs human-readable summaries, translations, or narrative descriptions. ...
Classify Your Data with SQL: A Hands-On Guide to Dremio's AI_CLASSIFY Function
Most classification workflows require exporting data to Python, running a model, and importing results back into your warehouse. Dremio's `AI_CLASSIFY...
Semantic Layer Best Practices: 7 Mistakes to Avoid
Semantic layers don't fail because t...
How a Self-Documenting Semantic Layer Reduces Data Team Toil
Every data...
Headless BI: How a Universal Semantic Layer Replaces Tool-Specific Models
Your organization uses Tableau for executive d...
Data Virtualization and the Semantic Layer: Query Without Copying
Every da...
The Role of the Semantic Layer in Data Governance
Most organi...
Why Your AI Initiatives Fail Without a Semantic Layer
Your team builds an AI agent. It ...
Semantic Layer vs. Data Catalog: Complementary, Not Competing
"We already have a d...
Semantic Layer vs. Metrics Layer: What's the Difference?
Both terms appear in every mo...
How to Build a Semantic Layer: A Step-by-Step Guide
Most teams start building a seman...
What Is a Semantic Layer? A Complete Guide
Ask three teams in your ...
Data Engineering Best Practices: The Complete Checklist
Best practices documen...
Pipeline Observability: Know When Things Break
An analyst messages you on...
Testing Data Pipelines: What to Validate and When
A table with 50...
Batch vs. Streaming: Choose the Right Processing Model
"We need real-time data." This is o...
Schema Evolution Without Breaking Consumers
A source team renames a column f...
Idempotent Pipelines: Build Once, Run Safely Forever
A pipeline runs, processes 100,000 re...
Data Quality Is a Pipeline Problem, Not a Dashboard Problem
When an a...
How to Design Reliable Data Pipelines
The median ...
Data Modeling Best Practices: 7 Mistakes to Avoid
A bad ...
Data Vault Modeling: Hubs, Links, and Satellites
Dimensional ...
Denormalization: When and Why to Flatten Your Data
Normali...
Data Modeling for Analytics: Optimize for Queries, Not Transactions
The data model that runs yo...
Slowly Changing Dimensions: Types 1-3 with Examples
Dimensions cha...
Dimensional Modeling: Facts, Dimensions, and Grains
Dime...
Data Modeling for the Lakehouse: What Changes
T...
Star Schema vs. Snowflake Schema: When to Use Each
Both star schema...
Conceptual, Logical, and Physical Data Models Explained
Most data tea...
What Is Data Modeling? A Complete Guide
Every databas...
A 2026 Introduction to Apache Iceberg
Apache Iceberg is an open-source table format for large analytic datasets. It defines how data files stored on object storage (S3, ADLS, GCS) are orga...
A Practical Guide to AI-Assisted Coding Tools
**Get Data Lakehouse Books:** - [Apache Iceberg: The Definitive Guide](https://drmevn.fyi/tableformatblog) - [Apache Polaris: The Definitive Guide](ht...
What Are Recursive Language Models?
RAG Isn’t a Modeling Problem. It’s a Data Engineering Problem.
Building Pangolin - My Holiday Break, an AI IDE, and a Lakehouse Catalog for the Curious
2025 Year in Review Apache Iceberg, Polaris, Parquet, and Arrow
dremioframe & iceberg - Pythonic interfaces for Dremio and Apache Iceberg
Modern data teams want simple tools to work with Iceberg tables and Dremio. Two new Python libraries now make that work easier. The first is DremioFra...
Introducing dremioframe - A Pythonic DataFrame Interface for Dremio
If you're a data analyst or Python developer who prefers chaining expressive `.select()` and `.mutate()` calls over writing raw SQL, you're going to l...
Comprehensive Hands-on Walk Through of Dremio Cloud Next Gen (Hands-on with Free Trial)
[Video Playlist of this Walkthrough](https://www.youtube.com/playlist?list=PL-gIUf9e9CCvY0bcRBGu2SzFFR-yJGIB6) On November 13, at the [Subsurface Lake...
2025-2026 Guide to Learning about Apache Iceberg, Data Lakehouse & Agentic AI
The data world is evolving fast. Just a few years ago, building a modern analytics stack meant stitching together tools, ETL pipelines, and compromise...
An Exploration of the Commercial Iceberg Catalog Ecosystem
Building a Universal Lakehouse Catalog - Beyond Iceberg Tables
Intro to Apache Iceberg with Apache Polaris and Apache Spark
The State of Apache Iceberg v4 - October 2025 Edition
The Ultimate Guide to Open Table Formats - Iceberg, Delta Lake, Hudi, Paimon, and DuckLake
The 2025 & 2026 Ultimate Guide to the Data Lakehouse and the Data Lakehouse Ecosystem
- [Join the Data Lakehouse Community](https://www.datalakehousehub.com) - [Data Lakehouse Blog Listings](https://lakehouseblogs.com) *Year-end 2025 r...
Composable Analytics with Agents - Leveraging Virtual Datasets and the Semantic Layer
- **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_external_blog&utm_me...
The Endgame — Building an Autonomous Optimization Pipeline for Apache Iceberg
Managing Large-Scale Optimizations — Parallelism, Checkpointing, and Fail Recovery
Unlocking the Power of Agentic AI with Apache Iceberg and Dremio
## Free Resources - **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_...
Hidden Pitfalls — Compaction and Partition Evolution in Apache Iceberg
Using Iceberg Metadata Tables to Determine When Compaction Is Needed
Designing the Ideal Cadence for Compaction and Snapshot Expiration
Avoiding Metadata Bloat with Snapshot Expiration and Rewriting Manifests
Smarter Data Layout — Sorting and Clustering Iceberg Tables
Optimizing Compaction for Streaming Workloads in Apache Iceberg
The Basics of Compaction — Bin Packing Your Data for Efficiency
The Cost of Neglect — How Apache Iceberg Tables Degrade Without Optimization
Read ArticleHow to Discover or Organize Lakehouse & Apache Iceberg Meetups
## Free Resources - **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_...
Read ArticleWhat is an API? And Why Data Architecture Depends on Them
## Free Resources - **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_...
Read ArticleDecoding AWS EC2 Instance Type Names
## Free Resources - **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_...
Read ArticleIntroduction to Data Engineering Concepts | What is Data Engineering?
## Free Resources - **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_...
Read ArticleIntroduction to Data Engineering Concepts | Understanding Data Sources and Ingestion
## Free Resources - **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_...
Read ArticleIntroduction to Data Engineering Concepts | ETL vs ELT – Understanding Data Pipelines
## Free Resources - **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_...
Read ArticleIntroduction to Data Engineering Concepts | Batch Processing Fundamentals
## Free Resources - **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_...
Read ArticleIntroduction to Data Engineering Concepts | Streaming Data Fundamentals
## Free Resources - **[Free Apache Iceberg Course](https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_source=ev_...
Introduction to Data Engineering Concepts | Data Modeling Basics
Introduction to Data Engineering Concepts | Data Warehousing Fundamentals
Introduction to Data Engineering Concepts | Data Lakes Explained
Introduction to Data Engineering Concepts | Storage Formats and Compression
Introduction to Data Engineering Concepts | Data Quality and Validation
Introduction to Data Engineering Concepts | Metadata, Lineage, and Governance
Introduction to Data Engineering Concepts | Scheduling and Workflow Orchestration
Introduction to Data Engineering Concepts | Building Scalable Pipelines
Introduction to Data Engineering Concepts | Cloud Data Platforms and the Modern Stack
Introduction to Data Engineering Concepts | DevOps for Data Engineering
Introduction to Data Engineering Concepts | Data Lakehouse Architecture Explained
Introduction to Data Engineering Concepts | Apache Iceberg, Arrow, and Polaris
Introduction to Data Engineering Concepts | The Power of Dremio in the Modern Lakehouse
A Journey from AI to LLMs and MCP - 10 - Sampling and Prompts in MCP — Making Agent Workflows Smarter and Safer
A Journey from AI to LLMs and MCP - 9 - Tools in MCP — Giving LLMs the Power to Act
A Journey from AI to LLMs and MCP - 8 - Resources in MCP — Serving Relevant Data Securely to LLMs
A Journey from AI to LLMs and MCP - 7 - Under the Hood — The Architecture of MCP and Its Core Components
Journey from AI to LLMs and MCP - 6 - Enter the Model Context Protocol (MCP) — The Interoperability Layer for AI Agents
A Journey from AI to LLMs and MCP - 5 - AI Agent Frameworks — Benefits and Limitations
A Journey from AI to LLMs and MCP - 4 - What Are AI Agents — And Why They're the Future of LLM Applications
A Journey from AI to LLMs and MCP - 3 - Boosting LLM Performance — Fine-Tuning, Prompt Engineering, and RAG
A Journey from AI to LLMs and MCP - 2 - How LLMs Work — Embeddings, Vectors, and Context Windows
A Journey from AI to LLMs and MCP - 1 - What Is AI and How It Evolved Into LLMs
Building a Basic MCP Server with Python
Using Helm with Kubernetes - A Guide to Helm Charts and Their Implementation
Crash Course on Developing AI Applications with LangChain
The Data Lakehouse - The Benefits and Enhancing Implementation
2025 Comprehensive Guide to Apache Iceberg
- [Free Apache Iceberg Crash Course](https://university.dremio.com/?utm_source=ev_external_blog&utm_medium=influencer&utm_campaign=2025-iceberg-comp-g...
When to use Apache Xtable or Delta Lake Uniform for Data Lakehouse Interoperability
- [Blog: What is a Data Lakehouse and a Table Format?](https://www.dremio.com/blog/apache-iceberg-crash-course-what-is-a-data-lakehouse-and-a-table-fo...
2025 Guide to Architecting an Iceberg Lakehouse
10 Future Apache Iceberg Developments to Look forward to in 2025
Deep Dive into Dremio's File-based Auto Ingestion into Apache Iceberg Tables
Intro to SQL using Apache Iceberg and Dremio
Dremio, Apache Iceberg and their role in AI-Ready Data
Introduction to Cargo and cargo.toml
When working with Rust, Cargo is your go-to tool for managing dependencies, building, and running your projects. Acting as Rust's package manager and ...
Leveraging Python's Pattern Matching and Comprehensions for Data Analytics
Hands-on with Apache Iceberg & Dremio on Your Laptop within 10 Minutes
- [Free Copy of Apache Iceberg the Definitive Guide](https://hello.dremio.com/wp-apache-iceberg-the-definitive-guide-reg.html?utm_source=ev_external_b...
Data Modeling - Entities and Events
Structuring data thoughtfully is critical for both operational efficiency and analytical value. Data modeling helps us define the relationships, const...
All About Parquet Part 01 - An Introduction
- [Free Copy of Apache Iceberg the Definitive Guide](https://hello.dremio.com/wp-apache-iceberg-the-definitive-guide-reg.html?utm_source=alexmerced&ut...
All About Parquet Part 02 - Parquet's Columnar Storage Model
All About Parquet Part 03 - Parquet File Structure | Pages, Row Groups, and Columns
All About Parquet Part 04 - Schema Evolution in Parquet
All About Parquet Part 05 - Compression Techniques in Parquet
All About Parquet Part 06 - Encoding in Parquet | Optimizing for Storage
All About Parquet Part 07 - Metadata in Parquet | Improving Data Efficiency
All About Parquet Part 08 - Reading and Writing Parquet Files in Python
All About Parquet Part 09 - Parquet in Data Lake Architectures
All About Parquet Part 10 - Performance Tuning and Best Practices with Parquet
Orchestrating Airflow DAGs with GitHub Actions - A Lightweight Approach to Data Curation Across Spark, Dremio, and Snowflake
A Deep Dive Into GitHub Actions From Software Development to Data Engineering
- [Free Copy of Apache Iceberg the Definitive Guide](https://hello.dremio.com/wp-apache-iceberg-the-definitive-guide-reg.html?utm_content=alexmerced&u...
A Guide to dbt Macros - Purpose, Benefits, and Usage
- [Apache Iceberg 101](https://www.dremio.com/lakehouse-deep-dives/apache-iceberg-101/?utm_source=ev_external_blog&utm_medium=influencer&utm_campaign=...
Data Lakehouse Roundup 1 - News and Insights on the Lakehouse
I’m excited to kick off a new series called "Data Lakehouse Roundup," where I’ll cover the latest developments in the data lakehouse space, approximat...
Getting Started with Data Analytics Using PyArrow in Python
- [Apache Iceberg Crash Course: What is a Data Lakehouse and a Table Format?](https://www.dremio.com/blog/apache-iceberg-crash-course-what-is-a-data-l...
What is Three-Tier Data (Bronze, Silver, Gold) and How Dremio Simplifies It
A Brief Guide to the Governance of Apache Iceberg Tables
Exploring Data Operations with PySpark, Pandas, DuckDB, Polars, and DataFusion in a Python Notebook
Ultimate Directory of Apache Iceberg Resources
This article is a comprehensive directory of Apache Iceberg resources, including educational materials, tutorials, and hands-on exercises. Whether you...
Change Data Capture (CDC) when there is no CDC
- [Free Copy of Apache Iceberg: The Definitive Guide](https://hello.dremio.com/wp-apache-iceberg-the-definitive-guide-reg.html?utm_source=alexmerced&u...
Virtualization + Lakehouse + Mesh = Data At Scale
- [Free Copy of Apache Iceberg: The Definitive Guide](https://hello.dremio.com/wp-apache-iceberg-the-definitive-guide-reg.html?utm_source=ev_external_...
Deep Dive into Data Apps with Streamlit
The ability to quickly develop and deploy interactive applications is invaluable. **Streamlit** is a powerful tool that enables data s...
A Deep Dive into Docker Compose
## Understanding the Docker Compose File Structure Docker Compose uses a YAML file (`docker-compose.yml`) to define services, networks, and volumes t...
Hands-on with Apache Iceberg on Your Laptop - Deep Dive with Apache Spark, Nessie, Minio, Dremio, Polars and Seaborn
Why Data Analysts, Engineers, Architects and Scientists Should Care about Dremio and Apache Iceberg
5 Trends in the Data Lakehouse Space
Using the alexmerced/datanotebook Docker Image
- [Watch My Intro to Data Playlist](https://www.youtube.com/watch?v=nq8ETrTgT7o&list=PLsLAVBjQJO0p_4Nqz99tIjeoDYE97L0xY&pp=iAQB) - [Download Free Copy...
Understanding Apache Iceberg Delete Files
Understanding the Apache Iceberg Manifest
Understanding the Apache Iceberg Manifest List (Snapshot)
Understanding Apache Iceberg's Metadata.json
What Apache Iceberg REST Catalog is and isn't
ACID Guarantees and Apache Iceberg - Turning Any Storage into a Data Warehouse
Apache Iceberg has become a prominent name in the data world, with numerous platforms integrating support for Iceberg tables as part of the growing op...
Data Lakehouse 101 - The Who, What and Why of Data Lakehouses
- [Sign-up for this free Apache Iceberg Crash Course](https://bit.ly/am-2024-iceberg-live-crash-course-1) - [Get a free copy of Apache Iceberg the Def...
Understanding the Polaris Iceberg Catalog and Its Architecture
NOTE: I am working on a hands-on tutorial for Polaris, so please watch for the [Dremio Blog](https://www.dremio.com/blog) in the coming days. Also, ch...
Apache Iceberg Reliability
- [Get a Free Copy of "Apache Iceberg: The Definitive Guide"](https://bit.ly/am-iceberg-book) - [Sign Up for the Free Apache Iceberg Crash Course](htt...
Upcoming Data Talks from Alex Merced (And how to follow)
In this article, I will provide you with a list of events I'm currently scheduled to speak at. New events are regularly being added, so here are a cou...
Databases Deconstructed - The Value of Data Lakehouses and Table Formats
- [Check out my Apache Iceberg Crash Course](https://bit.ly/am-2024-iceberg-live-crash-course-1) - [Get a free copy of Apache Iceberg the Definitiv...
Video Course - Basics of Lakehouse Engineering - Apache Iceberg, Nessie, Dremio
[Get a Free Copy of "Apache Iceberg: The Definitive Guide"](https://bit.ly/am-iceberg-book) ## #1 - Intro - Basics of Lakehouse Engineering - Apache ...
Partitioning with Apache Iceberg - A Deep Dive
- [Apache Iceberg 101](https://www.dremio.com/blog/apache-iceberg-101-your-guide-to-learning-apache-iceberg-concepts-and-practices/) - [Get Hands-on W...
3 Reasons Data Engineers Should Embrace Apache Iceberg
Data engineers are constantly seeking ways to streamline workflows and enhance data management efficiency. [Apache Iceberg, a high-performance table f...
Running SQL on your Excel Files From Your Laptop with Dremio
Being able to quickly analyze and gain insights from your data is crucial. Excel is widely used for data storage, but when it comes to complex queries...
Understanding the Future of Apache Iceberg Catalogs
[Apache Iceberg](https://www.dremio.com/blog/apache-iceberg-101-your-guide-to-learning-apache-iceberg-concepts-and-practices/) is revolutionizing the ...
A Deep Intro to Apache Iceberg and Resources for Learning More
For a long time, siloed data systems such as databases and data warehouses were sufficient. These systems provided convenient abstractions for various...
End-to-End Basic Data Engineering Tutorial (Spark, Dremio, Superset)
Data engineering aims to make data accessible and usable for data analytics and data science purposes. This involves several key aspects: - Transferr...
5 Open Source Data Projects You Should Be Following
[Follow Me On Social](https://bio.alexmerced.com/data) [Subscribe to my SubStack](https://amdatalakehouse.substack.com) Open source technology signif...
5 Reasons Dremio is the Ideal Apache Iceberg Lakehouse Platform
[The Apache Iceberg table format](https://www.dremio.com/blog/apache-iceberg-101-your-guide-to-learning-apache-iceberg-concepts-and-practices/) has se...
The Apache Iceberg Lakehouse - The Great Data Equalizer
> [Get a Free Copy of "Apache Iceberg: The Definitive Guide"](https://hello.dremio.com/wp-apache-iceberg-the-definitive-guide-reg.html) > [Build an I...
10 Reasons to Make Apache Iceberg and Dremio Part of Your Data Lakehouse Strategy
A deep dive into the concept and world of Apache Iceberg Catalogs
Introduction to ANSI SQL - Understanding the Syntax and Concepts
[Subscribe to my Data Youtube Channel and Podcasts, Links Here](https://bio.alexmerced.com/data) [Subscribe to my web development youtube channel and...
The Role of Ontologies in Data Management
The concept of ontologies plays a pivotal role in organizing and making sense of the vast information available. In data management, ontologies are cr...
What is the Data Lakehouse and the Role of Apache Iceberg, Nessie and Dremio?
Organizations are constantly seeking more efficient, scalable, and flexible solutions to manage their ever-growing data assets. This quest has led to ...
Partitioning Practices in Apache Hive and Apache Iceberg
The efficiency of query execution is paramount. One of the key strategies ...
Columnar vs. Row-based Data Structures in OLTP and OLAP Systems
[Follow my Data Youtube Channel](https://www.youtube.com/@alexmerceddata) The decision between using columnar and row-based data structures can signi...
Introduction to Data Vault Modeling
[Subscribe to my Data Youtube Channel and Podcasts, Links Here](https://bio.alexmerced.com/data) Data Vault modeling is an approach to data warehouse...
Table Format FUD - Thinking Through the Table Format Conversion (Apache Iceberg, Apache Hudi, Delta Lake)
This article is meant to be a sober reflection on the data lakehouse table format conversation I have had as a participant over the last t...
Embracing the Future of Data Management - Why Choose Lakehouse, Iceberg, and Dremio?
Data is not just an asset but the cornerstone of business strategy. The way we manage, store, and process this invaluable resource has evolved dramati...
Open Lakehouse Engineering/Apache Iceberg Lakehouse Engineering - A Directory of Resources
The concept of the **Open Lakehouse** has emerged as a beacon of flexibility and innovation. An Open Lakehouse represents a specialized form of data lake...
Nessie - An Alternative to Hive & JDBC for Self-Managed Apache Iceberg Catalogs
Unlike traditional table formats, Apache Iceberg provides a comprehensive solution for handling big data's complexity, volume, and diversity. It's des...
Apache Iceberg, Git-Like Catalog Versioning and Data Lakehouse Management - Pillars of a Robust Data Lakehouse Platform
Managing vast amounts of data efficiently and effectively is crucial for any organization aiming to leverage its data for strategic decisions. The key...