Ryft Blog

Athena vs. Snowflake on Iceberg: Performance and Cost Comparison on TPC-H

Yuval Yogev
May 26, 2025

Preface

At Ryft, we’re constantly testing different query engines, layouts, and configs to bring the best performance out of lakehouses.

One of the great things about Iceberg as an open table format is that it lets us compare engines objectively, since they operate on the same single copy of the data. This eliminates differences in storage formats and data duplication, making the results more meaningful.

When picking a query engine, Amazon Athena and Snowflake are two popular names that always come up. Both are cloud-based, support SQL, and play nicely with modern data stacks. But how do they compare in terms of performance and cost? We ran a TPC-H benchmark on both systems to find out.

Before diving into the results, a few disclaimers:

Benchmarking is hard. It’s not just about running a few queries and comparing numbers — tiny differences in setup, data layout, or query optimizations can completely change the results. Performance depends on workload, data structure, concurrency, and how well each system is tuned. The numbers we’re sharing reflect our specific setup, not some universal truth. Think of this as a real-world comparison, not a final verdict on which engine is “better.”

Architectural differences matter. Snowflake and Athena have fundamentally different designs that impact cost and scalability. Snowflake runs on virtual warehouses, where you provision dedicated compute resources and manually scale them up or down. This allows for better performance control but requires active management. Athena, being serverless, charges per TB of data scanned and doesn’t let you adjust compute resources directly, making it simpler but limiting tuning options.

Performance isn’t everything. It’s easy to get caught up in benchmark numbers, but real-world decisions aren’t just about raw speed. Cost, ease of use, ecosystem integration, and day-to-day operational complexity all matter just as much, and sometimes even more. There’s a great blog post from the folks at MotherDuck, “Perf is not enough,” that makes a solid point: the fastest system isn’t always the best one. At the end of the day, the best tool is the one that fits your workload, your team, and your budget without making your life harder.

Benchmark Setup

For fairness, we ran the same TPC-H dataset on both engines, stored in Apache Iceberg format. Iceberg is a modern table format that supports ACID transactions and efficient data pruning — features that both Athena and Snowflake can leverage.

Configurations

  • Athena: We used Athena SQL engine version 3 (powered by Trino).
  • Snowflake: We ran the queries on both a Small and a Medium virtual warehouse.
  • Data: The data was generated using the DuckDB TPC-H extension with a scale factor of 1024, which results in roughly 10 billion records in total and about 350GB of compressed data (about 1.7TB uncompressed), stored on S3.
  • Data Format: All tables were stored in Iceberg with Parquet files and zstd compression to ensure that both engines had access to the same performance optimizations.
  • Data Layout: TPC-H does not prescribe how data should be partitioned or clustered, which, as noted above, can drastically change benchmark results. In our dataset, the biggest tables were partitioned by month: orders by o_orderdate, and lineitem by (l_shipdate, l_commitdate). This layout is far from optimal, but it is identical for both engines in this benchmark.
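
To make the monthly layout concrete, here is a minimal Python sketch of how a month-granularity partition value can be derived from a date, and how a date-range filter maps to the partitions an engine would need to read. It follows the Iceberg spec's definition of the month transform (months elapsed since 1970-01). The helper names and the example filter are hypothetical, and real engines resolve this internally from table metadata rather than with code like this.

```python
from datetime import date


def month_partition(d: date) -> int:
    """Months elapsed since 1970-01, as in Iceberg's month transform."""
    return (d.year - 1970) * 12 + (d.month - 1)


def partitions_for_range(start: date, end: date) -> list[int]:
    """Month partitions that can contain rows with start <= value <= end."""
    return list(range(month_partition(start), month_partition(end) + 1))


# Hypothetical filter: o_orderdate within calendar year 1994.
needed = partitions_for_range(date(1994, 1, 1), date(1994, 12, 31))
print(len(needed))  # 12 monthly partitions instead of the whole table
```

A query that filters on the partitioning column only has to touch the matching months, which is why the choice of partition column can swing benchmark results so much.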

Performance Comparison

Figure: Snowflake vs. Athena performance comparison (runtime in seconds)

Cost Comparison

Cost comparison was done using the following parameters:

  • 1 Credit = $3.00 (Enterprise edition cost)
  • Small Warehouse = 0.0006 credits per second
  • Medium Warehouse = 0.0011 credits per second
  • Athena scan cost = $0.005 per GB scanned

Figure: Snowflake vs. Athena cost comparison

Note: Calculating “cost per query” in Snowflake is not trivial, since a warehouse can run multiple queries at once and can also suspend and resume based on activity.
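
With those caveats in mind, per-query figures can be approximated with the simple model below: runtime multiplied by the warehouse's credit rate and the credit price for Snowflake, and gigabytes scanned multiplied by the scan price for Athena. This is a sketch of the arithmetic implied by the parameters above, not necessarily the exact method used for this benchmark. The function names and the sample runtime and scan size are made up for illustration.

```python
CREDIT_PRICE_USD = 3.00  # Enterprise edition, per credit
CREDITS_PER_SECOND = {"small": 0.0006, "medium": 0.0011}
ATHENA_USD_PER_GB = 0.005


def snowflake_query_cost(runtime_seconds: float, warehouse: str) -> float:
    """Cost of a query that has the warehouse to itself for its whole runtime."""
    return runtime_seconds * CREDITS_PER_SECOND[warehouse] * CREDIT_PRICE_USD


def athena_query_cost(gb_scanned: float) -> float:
    """Athena bills on data scanned, regardless of how long the query runs."""
    return gb_scanned * ATHENA_USD_PER_GB


# Illustrative inputs only (not measured values from this benchmark).
print(round(snowflake_query_cost(60, "small"), 3))   # 0.108
print(round(snowflake_query_cost(60, "medium"), 3))  # 0.198
print(round(athena_query_cost(40), 3))               # 0.2
```

The model ignores concurrency and idle time before auto-suspend, so it is a reasonable approximation only when queries run one at a time on an otherwise idle warehouse.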

  • Athena Total Cost: $4.62
  • Snowflake Small Total Cost: $2.65
  • Snowflake Medium Total Cost: $2.38

Key Observations

Athena was faster in 8 out of 22 queries, while Snowflake was cheaper in 18 out of 22. Overall, Snowflake’s total cost was about 49% lower with the Medium warehouse (not including S3 costs, which apply to both engines). Interestingly, opting for a more expensive Snowflake warehouse didn’t increase overall costs; it actually reduced them. The larger warehouse helped eliminate bottlenecks such as spilling to disk in low-memory configurations, so queries finished fast enough to more than offset the higher per-second rate.
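
These observations can be checked against the reported totals. The sketch below recomputes the relative savings, the speedup a Medium warehouse needs in order to break even with a Small one, and the total warehouse time implied by each total under the per-second rates listed earlier. The implied runtimes are back-of-the-envelope estimates derived from rounded dollar figures, not measured values, and the function names are ours.

```python
ATHENA_TOTAL = 4.62
SNOWFLAKE_TOTALS = {"small": 2.65, "medium": 2.38}
CREDIT_PRICE_USD = 3.00
CREDITS_PER_SECOND = {"small": 0.0006, "medium": 0.0011}


def relative_savings(snowflake_total: float) -> float:
    """Fraction by which a Snowflake total undercuts the Athena total."""
    return (ATHENA_TOTAL - snowflake_total) / ATHENA_TOTAL


def implied_seconds(warehouse: str) -> float:
    """Warehouse seconds that would produce the reported total cost."""
    rate_usd = CREDITS_PER_SECOND[warehouse] * CREDIT_PRICE_USD
    return SNOWFLAKE_TOTALS[warehouse] / rate_usd


# Medium costs more per second, so it only pays off if it is this much faster.
break_even_speedup = CREDITS_PER_SECOND["medium"] / CREDITS_PER_SECOND["small"]

for name, total in SNOWFLAKE_TOTALS.items():
    print(f"{name}: {relative_savings(total):.1%} cheaper, "
          f"~{implied_seconds(name):.0f}s of warehouse time")
print(f"break-even speedup: {break_even_speedup:.2f}x")
```

From the rounded totals, the savings come out to roughly 43% on Small and 48.5% on Medium. The implied total runtime drops by about 2x on Medium, which is above the 1.83x break-even point and explains why the larger warehouse ended up cheaper.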

Final Thoughts

The most exciting part of this comparison isn’t just the results — it’s that we could benchmark two different query engines on the exact same Iceberg dataset. Both Athena and Snowflake performed well, each excelling in different scenarios. Iceberg’s support in both engines was solid, proving its potential as an open table format for analytics at scale. That said, there’s still room for improvement, especially in areas like query pruning and pushdown filters. As Iceberg matures, we can expect even better performance and efficiency, making it an even more compelling choice for modern data architectures.

Stay tuned for more benchmarks in the future, and feel free to reach out if there’s an engine or data layout you’d like us to test next!
