AWS

AWS-related content
Feb 13
EMRFS S3 Optimized Committer and Committer Protocol for Improving Spark Write Performance - Why and How?

What are the EMRFS S3 Optimized Committer and the EMRFS S3 Optimized Committer Protocol, how do you use them, and how can you tell whether they are working for your Spark jobs to improve write performance?
30 min read
Jan 24
Copy-on-Write or Merge-on-Read? What, When, and How?

Copy-on-Write or Merge-on-Read? Optimizing row-level updates in an Apache Iceberg table by understanding both approaches, deciding when to use which, and weighing the impact of each on the table's read and write speed. How can you identify the approach in use through Iceberg metadata tables on AWS?
15 min read
Jan 02
Write-Audit-Publish Pattern with Apache Iceberg on AWS using WAP id

Detailed implementation of the Write-Audit-Publish (WAP) data quality pattern on AWS using the Apache Iceberg WAP ID, i.e. for Apache Iceberg < 1.2.0.
11 min read
Dec 29
Write-Audit-Publish Pattern with Apache Iceberg on AWS using Branches

Detailed implementation of the Write-Audit-Publish (WAP) data quality pattern on AWS using Apache Iceberg branches, i.e. for Apache Iceberg > 1.2.0. It also covers the gotchas of using this pattern and of using Athena as the query engine.
10 min read
Dec 24
PyDeequ - Testing Data Quality at Scale

How to use PyDeequ to test your data quality on AWS
12 min read
Dec 18
Building Discord Bot with AWS Serverless - Part 3

This blog post is part of the Let's Build Series, where we pick and build an idea. In
5 min read
Dec 18
Building Discord Bot with AWS Serverless - Part 2

This blog post is part of the Let's Build Series, where we pick and build an idea. In
6 min read
Dec 18
Building Discord Bot with AWS Serverless - Part 1

This blog post is part of the Let's Build Series, where we pick and build an idea. In
4 min read
Sep 02
How I cleared AWS Data Analytics Speciality Certification

On 28th August 2023, after preparing for about 2.5 months, I appeared for the AWS Certified Data Analytics Specialty certification
5 min read