
Remediation and Fixes
In Part 1 and Part 2 of this series, we analyzed two different scenarios that led to Iceberg table corruption, from silent overwrites to inconsistent metadata. Since publishing those posts, we have received many requests from readers who encountered these situations, asking how to safely repair their tables.
In this post, we’ll focus on the remediation process: identifying what’s affected, safely cleaning it up, and preventing further damage.
Identifying the Scope of Corruption
Before acting, it’s critical to understand which files and snapshots were affected.
- Collect file paths from errors: if the table is corrupted, the paths of the offending files often surface in the errors thrown by a simple table scan.
- Identify the affected files from metadata: use Iceberg’s metadata tables or your monitoring system to extract the full list of Parquet files that were corrupted or overwritten, for example by finding files that appear more than once in the metadata.
- Identify the affected partitions: use the metadata tables to determine which partitions the corrupted files belong to. Once the table is repaired, those partitions will have data loss, so it’s important to either accept that loss or be able to re-ingest the data. The queries below show one way to do this.
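As a starting point, here is a sketch of these lookups using Spark SQL against Iceberg’s files metadata table. It assumes the glue_catalog.raw_data.customers table from the job shown later in this post; adjust the names and paths to your own environment.
-- Find data files that are referenced more than once in the current snapshot
SELECT file_path, COUNT(*) AS refs
FROM glue_catalog.raw_data.customers.files
GROUP BY file_path
HAVING COUNT(*) > 1;

-- Find the partitions the suspect files belong to
SELECT DISTINCT partition
FROM glue_catalog.raw_data.customers.files
WHERE file_path IN ('s3://lake-bucket/tables/customers/data/...');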
Why You Can’t Just Delete from S3
Deleting the files from object storage might seem like an easy cleanup step, but it will only make things worse.
Iceberg metadata references each data file by its absolute path.
If the file is missing but still tracked in manifests, readers will fail with errors such as:
software.amazon.awssdk.services.s3.model.S3Exception:
The specified key does not exist
Deleting data files without updating the metadata leaves the manifests pointing at missing files, which deepens the corruption instead of fixing it.
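You can see these absolute paths for yourself in the files metadata table (again assuming the example table names):
-- Each row holds the absolute object-store path of a tracked data file
SELECT file_path FROM glue_catalog.raw_data.customers.files LIMIT 5;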
Safe Remediation: Delete via Iceberg API
To safely remove corrupted files, use the Iceberg API itself; this ensures that both the metadata and the manifests are updated consistently.
Below is an example using Spark and the Iceberg Java API:
import org.apache.iceberg.DeleteFiles;
import org.apache.iceberg.Table;
import org.apache.iceberg.spark.Spark3Util;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.connector.catalog.Identifier;

public class DeleteFilesJob {
  public static void main(String[] args) throws Exception {
    String catalog = "glue_catalog";
    String database = "raw_data";
    String table = "customers";
    // Absolute paths of the corrupted data files identified earlier
    String[] filesToDelete = new String[]{"s3://lake-bucket/tables/customers/data/..."};

    SparkSession spark = SparkSession.builder()
        .appName("Delete Iceberg Files")
        .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.glue_catalog.warehouse", "s3://warehouse/")
        .config("spark.sql.catalog.glue_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
        .getOrCreate();

    try {
      // Load the Iceberg table through the Spark catalog
      Identifier ident = Identifier.of(new String[]{database}, table);
      Table icebergTable = Spark3Util.loadIcebergTable(
          spark, Spark3Util.quotedFullIdentifier(catalog, ident));

      // Stage a delete for each corrupted file, then commit them all as a single new snapshot
      DeleteFiles deleteFiles = icebergTable.newDelete();
      for (String path : filesToDelete) {
        deleteFiles.deleteFile(path);
      }
      deleteFiles.commit();

      System.out.println("Committed delete of " + filesToDelete.length + " file(s).");
    } finally {
      spark.stop();
    }
  }
}
This safely:
- Removes the files from Iceberg’s manifests
- Updates snapshot metadata
- Commits a new snapshot reflecting the deletion
After the commit, the next table scan will not read those corrupted files.
Validate and Rebuild
After running the delete operation, verify that the new snapshot no longer references the deleted files:
SELECT COUNT(*) FROM my_table.files WHERE file_path IN (...);
The query should return 0 if the corrupted files are no longer referenced in the current snapshot.
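As an additional sanity check, you can confirm that the delete was committed by inspecting the snapshots metadata table; the most recent row should show a delete operation. A sketch, again using the example table names:
-- The latest snapshot should be the delete we just committed
SELECT snapshot_id, operation, committed_at
FROM glue_catalog.raw_data.customers.snapshots
ORDER BY committed_at DESC;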
Final Notes
- Even though the table is already corrupted, repair it with care and precision; a careless fix can lead to further corruption.
- It’s advised to copy the files targeted for deletion before removing them, both as a backup and for further debugging to understand the origin of the corruption.
- Always prefer to repair the table using native Iceberg APIs to avoid corrupting the metadata.