Persisting the metadata
To check which public IP address the wider internet sees you as, run:
curl -s https://checkip.amazonaws.com
Bootstrap Polaris with:
docker run --rm -it --env="polaris.persistence.type=relational-jdbc" --env="quarkus.datasource.username=$DB_USERNAME" --env="quarkus.datasource.password=$DB_PASSWORD" --env="quarkus.datasource.jdbc.url=jdbc:postgresql://$DB_HOST:5432/polaris_db" apache/polaris-admin-tool:latest bootstrap -r POLARIS -c POLARIS,root,s3cr3t
(To purge the database, run the same command but with the arguments purge -r POLARIS in place of bootstrap -r POLARIS -c POLARIS,root,s3cr3t.)
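Since bootstrap and purge differ only in their trailing arguments, the shared docker options can live in a small wrapper. This is a minimal sketch, assuming the same DB_USERNAME, DB_PASSWORD and DB_HOST variables as the command above; the function name polaris_admin is made up for illustration.

```shell
# Hypothetical helper: runs the Polaris admin tool against the Postgres
# database described by DB_USERNAME, DB_PASSWORD and DB_HOST.
# Whatever arguments you pass are forwarded to the admin tool unchanged.
polaris_admin() {
  docker run --rm -it \
    --env="polaris.persistence.type=relational-jdbc" \
    --env="quarkus.datasource.username=$DB_USERNAME" \
    --env="quarkus.datasource.password=$DB_PASSWORD" \
    --env="quarkus.datasource.jdbc.url=jdbc:postgresql://$DB_HOST:5432/polaris_db" \
    apache/polaris-admin-tool:latest "$@"
}
```

With that defined, polaris_admin bootstrap -r POLARIS -c POLARIS,root,s3cr3t sets up the realm and polaris_admin purge -r POLARIS wipes it.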
Once bootstrapped, you can see the new tables in Postgres:
postgres=> \c polaris_db
polaris_db=> SELECT * FROM pg_catalog.pg_tables;
     schemaname     |           tablename           | tableowner | tablespace | hasindexes | hasrules | hastriggers | rowsecurity
--------------------+-------------------------------+------------+------------+------------+----------+-------------+-------------
 polaris_schema     | version                       | postgres   |            | t          | f        | f           | f
 polaris_schema     | entities                      | postgres   |            | t          | f        | f           | f
 polaris_schema     | grant_records                 | postgres   |            | t          | f        | f           | f
 polaris_schema     | principal_authentication_data | postgres   |            | t          | f        | f           | f
Note that the database must already exist before you bootstrap. Create it with:
CREATE DATABASE polaris_db;
CREATE USER polaris_user WITH PASSWORD 'your_secure_password';
GRANT ALL PRIVILEGES ON DATABASE polaris_db TO polaris_user;
\c polaris_db
GRANT ALL ON SCHEMA public TO polaris_user;
If you mess up your Polaris, just restart it, since all the data now lives in the database:
kubectl rollout restart deployment polaris-deployment
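The restart returns immediately, so it can help to wait until the new pods are actually ready before talking to Polaris again. A small sketch, assuming the same deployment name as above; the function name restart_polaris and the 120 second timeout are arbitrary choices for illustration.

```shell
# Restart Polaris, then block until the rollout has finished
# or the (arbitrarily chosen) timeout expires.
restart_polaris() {
  kubectl rollout restart deployment polaris-deployment &&
    kubectl rollout status deployment polaris-deployment --timeout=120s
}
```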
Access Control
For integration tests, you can just use the client_id and client_secret with which you set up Polaris. In production, however, you'll want to create dedicated users (Principals).
"At the most basic level, Polaris' persistence layer stores Entities and Grants, where Grants define the access-control-related relationship between entities." [Apache Polaris Catalog Federation Proposal]
To access Polaris, you need a Principal. A Principal is assigned one or more Principal Roles. Each Principal Role must in turn be assigned Catalog Roles, and every Catalog Role belongs to a single Catalog.
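The chain Principal, Principal Role, Catalog Role, Catalog can be wired up through the management API. The following is a sketch only: the endpoint paths and JSON bodies reflect my understanding of the Polaris management API and should be checked against the spec for your version, the helper names mgmt and grant_catalog_access are invented here, and HOST and POLARIS_TOKEN are assumed to be set as in the REST section.

```shell
# Hypothetical helper: call the management API.
# usage: mgmt METHOD PATH JSON_BODY
mgmt() {
  curl -s -X "$1" "https://$HOST/api/management/v1/$2" \
    -H "Authorization: Bearer ${POLARIS_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$3"
}

# usage: grant_catalog_access PRINCIPAL PRINCIPAL_ROLE CATALOG CATALOG_ROLE
# All four names are placeholders chosen by the caller.
grant_catalog_access() {
  # 1. Create the Principal (the response should carry its new credentials).
  mgmt POST "principals" "{\"principal\": {\"name\": \"$1\"}}"
  # 2. Create the Principal Role and assign it to the Principal.
  mgmt POST "principal-roles" "{\"principalRole\": {\"name\": \"$2\"}}"
  mgmt PUT "principals/$1/principal-roles" "{\"principalRole\": {\"name\": \"$2\"}}"
  # 3. Create the Catalog Role in the Catalog and assign it to the Principal Role.
  mgmt POST "catalogs/$3/catalog-roles" "{\"catalogRole\": {\"name\": \"$4\"}}"
  mgmt PUT "principal-roles/$2/catalog-roles/$3" "{\"catalogRole\": {\"name\": \"$4\"}}"
}
```

A call might look like grant_catalog_access alice analyst aws reader. The Catalog Role still needs privileges granted to it before the Principal can do anything useful; that step is left out of this sketch.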
REST via curl
A good way to debug Polaris is to poke its REST API directly.
First, you need a token
POLARIS_TOKEN=$(curl -X POST "https://$HOST/api/catalog/v1/oauth/tokens" -H "Content-Type: application/x-www-form-urlencoded" -d "grant_type=client_credentials&client_id=$CLIENT_ID&client_secret=$CLIENT_SECRET&scope=PRINCIPAL_ROLE:ALL" | jq -r '.access_token')
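One thing to watch for: if the credentials are wrong, jq -r prints the literal string null for a missing access_token, and every later request then fails with a confusing authorization error. A hedged sketch of a guard, assuming HOST, CLIENT_ID and CLIENT_SECRET are set as above; the function names valid_token and fetch_polaris_token are made up for illustration.

```shell
# Succeeds only for a usable token. An empty string or the literal
# "null" (what jq -r prints for a missing field) counts as a failure.
valid_token() {
  case "$1" in
    "" | null) return 1 ;;
    *) return 0 ;;
  esac
}

# Fetch a token into POLARIS_TOKEN and fail loudly if none came back.
fetch_polaris_token() {
  POLARIS_TOKEN=$(curl -s -X POST "https://$HOST/api/catalog/v1/oauth/tokens" \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -d "grant_type=client_credentials&client_id=$CLIENT_ID&client_secret=$CLIENT_SECRET&scope=PRINCIPAL_ROLE:ALL" |
    jq -r '.access_token')
  valid_token "$POLARIS_TOKEN" || {
    echo "could not obtain a Polaris token" >&2
    return 1
  }
}
```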
View a namespace
curl -X GET "https://$HOST/api/catalog/v1/$CATALOG_NAME/namespaces/$NAMESPACE" -H "Authorization: Bearer ${POLARIS_TOKEN}" | jq
{
"namespace": [
"samples"
],
"properties": {
"owner": "henryp",
"location": "s3a://emrys-afon-bucket/samples/"
}
}
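The request above addresses a single-level namespace. For nested namespaces, the Iceberg REST specification (which this endpoint follows) joins the levels with the unit separator character, URL-encoded as %1F, rather than with a dot. Below is a sketch under the assumption that your namespace levels contain no literal dots; ns_path, list_namespaces and show_namespace are hypothetical helper names, and HOST, CATALOG_NAME and POLARIS_TOKEN are assumed to be set as above.

```shell
# Hypothetical helper: turn a dotted namespace such as "samples.raw" into
# the path form, with the levels joined by %1F.
ns_path() {
  printf '%s' "$1" | sed 's/\./%1F/g'
}

# List the top-level namespaces of the catalog.
list_namespaces() {
  curl -s -X GET "https://$HOST/api/catalog/v1/$CATALOG_NAME/namespaces" \
    -H "Authorization: Bearer ${POLARIS_TOKEN}"
}

# Show one (possibly nested) namespace, given in dotted form.
show_namespace() {
  curl -s -X GET "https://$HOST/api/catalog/v1/$CATALOG_NAME/namespaces/$(ns_path "$1")" \
    -H "Authorization: Bearer ${POLARIS_TOKEN}"
}
```

For example, show_namespace samples.raw | jq would request the path .../namespaces/samples%1Fraw.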
View tables
curl -X GET "https://$HOST/api/catalog/v1/$CATALOG_NAME/namespaces/$NAMESPACE/tables" -H "Authorization: Bearer ${POLARIS_TOKEN}" -H "Accept: application/json" -s | jq .
Or, to view a single table:
curl -X GET "https://$HOST/api/catalog/v1/$CATALOG_NAME/namespaces/$NAMESPACE/tables/$TABLE" -H "Authorization: Bearer ${POLARIS_TOKEN}" | jq
Clean up
Remove leftover properties from a namespace with something like the following (here the catalog is aws and the namespace is samples):
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer ${POLARIS_TOKEN}" "https://$HOST/api/catalog/v1/aws/namespaces/samples/properties" --data '{"removals": ["owner"] }' | jq
Delete Catalog
curl -X DELETE "https://$HOST/api/management/v1/catalogs/$CATALOG_NAME" -H "Authorization: Bearer ${POLARIS_TOKEN}" -H "Content-Type: application/json" -o /dev/null -s -w "%{http_code}\n"
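As far as I understand the management API, a catalog has to be empty before it can be deleted, so tables and namespaces need to go first. The sketch below wraps the three DELETE calls in that order. The table and namespace paths follow the Iceberg REST specification; the helper names are invented for illustration, and HOST, CATALOG_NAME and POLARIS_TOKEN are assumed to be set as above.

```shell
# Hypothetical helper: send a DELETE and print only the HTTP status code.
# usage: polaris_delete PATH_BELOW_/api/
polaris_delete() {
  curl -s -o /dev/null -w "%{http_code}\n" -X DELETE "https://$HOST/api/$1" \
    -H "Authorization: Bearer ${POLARIS_TOKEN}"
}

# usage: drop_table NAMESPACE TABLE
drop_table() { polaris_delete "catalog/v1/$CATALOG_NAME/namespaces/$1/tables/$2"; }

# usage: drop_namespace NAMESPACE
drop_namespace() { polaris_delete "catalog/v1/$CATALOG_NAME/namespaces/$1"; }

# Deletes the catalog named by CATALOG_NAME.
drop_catalog() { polaris_delete "management/v1/catalogs/$CATALOG_NAME"; }
```

A teardown would then run drop_table samples my_table (my_table being a placeholder), followed by drop_namespace samples and finally drop_catalog, checking for a 2xx status code after each step.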