This is how you get Snowflake to talk to your AWS real estate. Before we start, get your AWS account ID with:
aws sts get-caller-identity
The Account value in the output will be used as your GLUE_CATALOG_ID (see below).
Now create an external volume in Snowflake like this:
CREATE OR REPLACE EXTERNAL VOLUME YOUR_VOLUME_NAME
STORAGE_LOCATIONS = (
( NAME = 'eu-west-2'
STORAGE_PROVIDER = 'S3'
STORAGE_BASE_URL = 's3://ROOT_DIRECTORY_OF_TABLE/'
STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::GLUE_CATALOG_ID:role/ROLE_NAME'
)
)
ALLOW_WRITES = FALSE;
You run this even though you have yet to create the role. Then run:
select system$verify_external_volume('YOUR_VOLUME_NAME');
This will give you some JSON that includes a STORAGE_AWS_IAM_USER_ARN. You never create this user. Snowflake does it itself. Its ARN is what you need to create a role in AWS that allows Snowflake's user to see data.
You create the role with an ordinary aws iam create-role --role-name S3ReadWriteRoleSF --assume-role-policy-document... using the ARN that we got from Snowflake, above (S3ReadWriteRoleSF here stands in for the ROLE_NAME placeholder used elsewhere in this post). That is, your Snowflake instance has its own AWS user and you must give that user access to your real estate.
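The elided --assume-role-policy-document argument points at a trust policy. As a minimal sketch of what that file might look like, the snippet below writes one out. The file name trust-policy.json, the example ARN and the external ID value are placeholders of mine, not values from the setup above. The sts:ExternalId condition assumes you also copy the STORAGE_AWS_EXTERNAL_ID that DESC EXTERNAL VOLUME reports for the volume.

```shell
#!/bin/sh
# Placeholder values (assumptions): substitute what Snowflake reports for your volume.
SNOWFLAKE_USER_ARN='arn:aws:iam::123456789012:user/EXAMPLE_SNOWFLAKE_USER'
SNOWFLAKE_EXTERNAL_ID='EXAMPLE_EXTERNAL_ID'

# Write a trust policy that lets Snowflake's IAM user assume the role.
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "${SNOWFLAKE_USER_ARN}" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "${SNOWFLAKE_EXTERNAL_ID}" }
      }
    }
  ]
}
EOF

# The role would then be created with (not run here):
#   aws iam create-role --role-name ROLE_NAME \
#     --assume-role-policy-document file://trust-policy.json
```

If the same role is reused as the GLUE_AWS_ROLE_ARN, my understanding of Snowflake's documentation is that the catalog integration reports its own IAM user and external ID (via DESCRIBE CATALOG INTEGRATION), and these would need a matching statement in the trust policy too.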
Now, give Snowflake access to your cloud assets with:
aws iam put-role-policy --role-name ROLE_NAME --policy-name GlueReadAccess --policy-document file://glue-read-policy.json
Where glue-read-policy.json just contains the Actions needed to talk to Glue.
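For illustration, here is a rough sketch of how glue-read-policy.json could be generated. The account ID and database name are placeholders. The two Actions and the Resource patterns are my assumption of a minimal read-only set, not necessarily what the original policy contained, so check them against what your tables need.

```shell
#!/bin/sh
# Placeholder values (assumptions): your AWS account ID and Glue database name.
GLUE_CATALOG_ID='123456789012'
GLUE_DB_NAME='YOUR_DB_NAME'

# Write a read-only policy covering the Glue catalog, one database and its tables.
cat > glue-read-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowGlueCatalogTableAccess",
      "Effect": "Allow",
      "Action": [
        "glue:GetTable",
        "glue:GetTables"
      ],
      "Resource": [
        "arn:aws:glue:*:${GLUE_CATALOG_ID}:table/*/*",
        "arn:aws:glue:*:${GLUE_CATALOG_ID}:catalog",
        "arn:aws:glue:*:${GLUE_CATALOG_ID}:database/${GLUE_DB_NAME}"
      ]
    }
  ]
}
EOF
```

The resulting file is what the put-role-policy command above reads through file://glue-read-policy.json.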
Finally, we create the catalog integration (note that this is a Glue catalog, not a REST catalog like Polaris):
CREATE OR REPLACE CATALOG INTEGRATION CATALOG_NAME
CATALOG_SOURCE = GLUE
TABLE_FORMAT = ICEBERG
CATALOG_NAMESPACE = 'YOUR_DB_NAME'
GLUE_CATALOG_ID = 'GLUE_CATALOG_ID'
GLUE_AWS_ROLE_ARN = 'arn:aws:iam::GLUE_CATALOG_ID:role/ROLE_NAME'
GLUE_REGION = 'eu-west-2'
ENABLED = TRUE;
Now you bring all these threads together when you create a table with:
CREATE ICEBERG TABLE arbitrary_name
EXTERNAL_VOLUME = 'YOUR_VOLUME_NAME'
CATALOG = 'CATALOG_NAME'
CATALOG_TABLE_NAME = 'TABLE_NAME';
If you are using a REST catalog such as Polaris rather than Glue, create the catalog integration with:
CREATE OR REPLACE CATALOG INTEGRATION polaris_int
CATALOG_SOURCE = POLARIS
TABLE_FORMAT = ICEBERG
REST_CONFIG = (
CATALOG_URI = 'https://YOUR_HOST:8181/api/catalog/v1/'
)
REST_AUTHENTICATION = (
TYPE = BEARER
BEARER_TOKEN = 'TOKEN'
)
ENABLED = TRUE;
Note that the URI must use HTTPS, not HTTP.