Friday, April 5, 2024

AWS Real Estate

Just some notes I've made playing around with AWS real estate.

ECS
Amazon's container orchestration service, which runs and scales Docker containers. Whereas EC2 is simply a remote VM, an ECS cluster is a "logical grouping of EC2 machines" [SO].

Fargate
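Sometimes described as a serverless version of EC2 [SO]. More precisely, it is a serverless compute engine for containers: ECS and EKS can run tasks on Fargate without you managing the underlying EC2 instances.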
 
Kinesis
A proprietary Amazon replacement for Kafka. While Kafka brokers write data to local disk, Kinesis splits a stream into shards and replicates the data across three Availability Zones.
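To make the shard idea a bit more concrete: Kinesis Data Streams hashes each record's partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. Below is a minimal sketch of that routing in plain Python, with no AWS calls. The function names are mine, and the evenly split ranges are an assumption that only holds until shards are split or merged.

```python
import hashlib

HASH_SPACE = 2 ** 128  # Kinesis hash keys are 128-bit integers


def shard_ranges(shard_count):
    """Split the 128-bit hash key space into contiguous, near-equal ranges.

    Assumption: shards divide the key space evenly. Real streams can
    drift from this after shard splits and merges.
    """
    step = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * step
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key, ranges):
    """Return the index of the shard whose range contains MD5(partition_key)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key outside all shard ranges")


ranges = shard_ranges(4)
for key in ["user-1", "user-2", "user-3"]:
    print(key, "-> shard", shard_for_key(key, ranges))
```

The practical upshot is that records sharing a partition key always land on the same shard, which is what preserves their relative order.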

MSK
Amazon also offers a hosted Kafka solution called MSK (Managed Streaming for Apache Kafka).

Lambda
Runs functions in short-lived, container-like environments. An invocation can run for up to 15 minutes, and its storage is ephemeral.
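As a rough illustration of the ephemeral storage, here is a minimal sketch of a Python handler that uses scratch space and cleans up after itself. The `handler(event, context)` signature is the standard shape for Python functions on Lambda; the file name and the event contents are made up for the example.

```python
import json
import os
import tempfile


def handler(event, context):
    """Entry point in the shape Lambda expects for Python functions."""
    # Lambda provides scratch space under /tmp. tempfile.gettempdir()
    # should resolve to /tmp there, and keeps this runnable locally too.
    scratch = os.path.join(tempfile.gettempdir(), "scratch.json")
    with open(scratch, "w") as f:
        json.dump(event, f)
    size = os.path.getsize(scratch)
    # Clean up: nothing written here should be relied on after the invocation.
    os.remove(scratch)
    return {"statusCode": 200, "body": json.dumps({"bytes_written": size})}


if __name__ == "__main__":
    # Local smoke test with a made-up event; Lambda would supply both arguments.
    print(handler({"name": "test"}, None))
```

Anything that needs to outlive the invocation belongs in a durable store such as S3 or DynamoDB rather than on local disk.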

Glue
A little like Hive. Its crawlers are batch jobs that compile metadata, thus doing some of the job of Hive's metastore. In fact, you can configure Spark to use Glue as the backing store for its metastore.

EMR
EMR (Elastic MapReduce) is AWS's managed Hadoop platform, on which we can also run Spark. "You can configure Hive to use the AWS Glue Data Catalog as its metastore." [docs] If you want to run Spark locally but still take advantage of Glue, follow these instructions.
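As I understand the EMR docs, pointing the metastore at Glue comes down to one property, `hive.metastore.client.factory.class`, set under the `hive-site` classification for Hive or `spark-hive-site` for Spark SQL. A small sketch that builds that configuration as JSON follows. The helper name is my own, and it is worth checking the class name against the docs for your EMR release.

```python
import json

# Factory class named in the EMR documentation for the Glue Data Catalog.
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def glue_metastore_configuration(classification="spark-hive-site"):
    """Build an EMR configuration entry pointing the metastore client at Glue.

    Use "hive-site" for Hive itself and "spark-hive-site" for Spark SQL.
    """
    return [
        {
            "Classification": classification,
            "Properties": {"hive.metastore.client.factory.class": GLUE_FACTORY},
        }
    ]


# JSON that could be supplied as the cluster's software configuration.
print(json.dumps(glue_metastore_configuration(), indent=2))
```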

Athena
Athena is AWS's hosted Trino offering. You can make data in S3 buckets available to Athena by using Glue crawlers.

Step Functions
AWS's service for orchestrating workflows across different services within Amazon's cloud.
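Workflows are defined as state machines in the JSON-based Amazon States Language. Here is a minimal sketch of a two-step definition built as a Python dict, plus a tiny helper that walks it. The ARNs and state names are hypothetical placeholders, and the helper only handles a straight chain of states (no Choice or Parallel).

```python
import json

# Hypothetical ARNs: placeholders, not real resources.
EXTRACT_ARN = "arn:aws:lambda:us-east-1:123456789012:function:extract"
LOAD_ARN = "arn:aws:lambda:us-east-1:123456789012:function:load"

state_machine = {
    "Comment": "Two Lambda tasks run in sequence",
    "StartAt": "Extract",
    "States": {
        "Extract": {"Type": "Task", "Resource": EXTRACT_ARN, "Next": "Load"},
        "Load": {"Type": "Task", "Resource": LOAD_ARN, "End": True},
    },
}


def execution_order(definition):
    """Follow Next pointers from StartAt until a state marked End."""
    order = []
    name = definition["StartAt"]
    while True:
        order.append(name)
        state = definition["States"][name]
        if state.get("End"):
            return order
        name = state["Next"]


print(execution_order(state_machine))
print(json.dumps(state_machine, indent=2))
```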

CodePipeline
...is AWS's CI/CD offering.

Databases
DynamoDB is a key/value store and Aurora is a distributed relational DB.
