theCodingInterface

Providing quality software engineering content in the form of tutorials, applications, services, and commentary suited for developers.

Posts about AWS Glue

Building Data Lakes in AWS with S3, Lambda, Glue, and Athena from Weather Data

In this article I cover creating a rudimentary Data Lake on AWS S3 filled with historical weather data consumed from a REST API. The S3 Data Lake is populated using traditional serverless technologies like AWS Lambda, DynamoDB, and EventBridge rules along with several modern AWS Glue features such as Crawlers, ETL PySpark Jobs, and Triggers.

Introduction to Redshift using Pagila Sample Dataset Including ETL from Postgres using AWS Glue

In this article I give a practical introductory tutorial on using Amazon Redshift as an OLAP Data Warehouse solution for the popular Pagila Movie Rental dataset. I start with a basic overview of the unique architecture that lets Redshift serve as a scalable and robust enterprise cloud data warehouse. Then, armed with this basic knowledge of Redshift architecture, I move on to a practical example of designing a schema optimized for Redshift based on the Pagila sample dataset.
