Hudi-rs with DuckDB, Polars, Daft, DataFusion — Single-node Lakehouse
Dipankar Mazumdar, Jul 26
Using Lakehouse Table formats like Apache Hudi with Python & Rust with no JVM, Spark dependency.

Understanding Compression Codecs in Apache Parquet
Dipankar Mazumdar, Jun 7
Apache Parquet is a columnar storage file format optimized for fast processing and querying with large-scale data volumes. It offers…

Using Apache Hudi & Iceberg tables in Databricks with Apache XTable
Dipankar Mazumdar, Jun 4
How to use Apache Iceberg & Apache Hudi tables in Databricks with Apache XTable

Using XTable to translate from Iceberg to Hudi & Delta Lake with a File System Catalog like S3
Dipankar Mazumdar, May 15
While going through some of the recent issues in the Apache XTable repository, I stumbled upon this error from a user who was attempting to…

Building Analytical Apps on the Lakehouse using Apache Hudi, Daft & Streamlit
Dipankar Mazumdar in apache-hudi-blogs, May 10
Building user-facing analytical apps and dashboards is critical for organizations that want to make decisions actionable. While traditional…

What is Apache XTable (formerly OneTable) — Interoperability for Apache Hudi, Iceberg & Delta Lake
Dipankar Mazumdar, Dec 6, 2023
Apache Hudi, Iceberg, and Delta Lake provide a table-like abstraction on top of the native file formats like Parquet by serving as a…

Apache Hudi (Part 1): History, Getting Started
Dipankar Mazumdar, Nov 29, 2023
I recently joined Onehouse.ai to contribute to Apache Hudi and work on advocacy efforts, helping engineering teams build and scale robust…

New Job: Apache Hudi, Iceberg, what lies ahead?
Dipankar Mazumdar, Nov 23, 2023
Today marks the end of my third week at Onehouse.ai, and these initial weeks have been nothing short of exhilarating. As I navigate through…

Building a Plotly Dashboard on a Lakehouse using Apache Iceberg & Arrow
Dipankar Mazumdar, Oct 4, 2023
In my last blog, I highlighted the advantages of using low-code platforms like Streamlit to build full-stack data applications on a data…

Embracing Diversity in Open Source: Apache Iceberg Community
Dipankar Mazumdar, Jun 23, 2023
Open-source communities are more than just lines of code; they are vibrant ecosystems where diverse individuals come together to share…