Polaris DevOps
Polaris keeps its integration tests in the integration-tests/src/main/java/ directory, not in the test directory as you might have expected. The reason for this is that they can then be packaged as a JAR and used elsewhere in the codebase.
The advantage of doing it this way is that the same tests can be run against a local Polaris, a Polaris in the cloud, a Polaris running in Docker etc.
So, if we take CatalogFederationIntegrationTest, we can see it subclassed in the spark-tests Gradle package, where it can be run with:
./gradlew :polaris-runtime-spark-tests:intTest
If you try to run the superclass in its own module with Gradle, it cannot be found as it's not in the test directory. If you try to run it with your IDE, you'll find that the classes that should be wired in at runtime are missing. Running the subclass, CatalogFederationIT, starts a MinIO Docker container against which the tests can run.
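The arrangement can be sketched in plain Java as follows. This is only an illustration of the pattern, not Polaris code: ReusableCatalogTest, InMemoryCatalogIT and the Map-backed "server" are hypothetical stand-ins, and JUnit is left out so that the example is self-contained. The abstract class plays the part of the test that lives under src/main/java, and the subclass plays the part of the IT class that supplies the environment.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: every name in this file is hypothetical, not a Polaris class.
class ReusableTestSketch {

  // Stands in for a test shipped under src/main/java: it holds the checks
  // but does not know which server it will run against.
  abstract static class ReusableCatalogTest {
    // Each environment (local, cloud, Docker...) supplies its own backend.
    protected abstract Map<String, String> connect();

    // The shared test logic: create a catalog entry, then read it back.
    boolean createAndReadBack(String name) {
      Map<String, String> server = connect();
      server.put(name, "EXTERNAL");
      return "EXTERNAL".equals(server.get(name));
    }
  }

  // Stands in for the subclass in the consuming module. This is where the
  // environment gets wired in; here it is just an in-memory map.
  static class InMemoryCatalogIT extends ReusableCatalogTest {
    @Override
    protected Map<String, String> connect() {
      return new HashMap<>();
    }
  }

  public static void main(String[] args) {
    boolean passed = new InMemoryCatalogIT().createAndReadBack("demo");
    System.out.println("passed=" + passed);
  }
}
```

Running the abstract class on its own is impossible for the same reason the superclass cannot be run in the blog's case: nothing has supplied the environment yet.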
Federation
The DTOs (Data Transfer Objects) for creating catalogs etc live in
org.apache.polaris.core.admin.model
An example is ExternalCatalog, which can be serialized into JSON.
These are passed across the wire and are turned into DPOs (Data Persistence Objects) that live in
org.apache.polaris.core.connection.iceberg
In the case of the IcebergRestConnectionConfigInfoDpo, this DPO is not a mere anaemic domain model. It has the logic to, for instance, create the properties that will be used to instantiate the class that will govern authentication. It does this by delegating to this factory class:
org.apache.iceberg.rest.auth.AuthManagers
Notice that we have moved from Polaris to the world of Iceberg. The various AuthManagers implement access to OAuth2 providers, Google, SigV4 for AWS etc.
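A rough sketch of that hand-off is below. It is not the Polaris implementation: OAuthConnectionConfig is a hypothetical stand-in for a connection DPO. The property keys (rest.auth.type, oauth2-server-uri, credential) are what I understand Iceberg's REST client to read, so treat them as assumptions and check them against Iceberg's own constants before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a hypothetical DPO-like class that knows how to turn itself
// into the flat property map an Iceberg REST client is configured with.
class DpoSketch {

  static final class OAuthConnectionConfig {
    private final String catalogUri;
    private final String tokenUri;
    private final String clientId;
    private final String clientSecret;

    OAuthConnectionConfig(String catalogUri, String tokenUri,
                          String clientId, String clientSecret) {
      this.catalogUri = catalogUri;
      this.tokenUri = tokenUri;
      this.clientId = clientId;
      this.clientSecret = clientSecret;
    }

    // The "not anaemic" part: the object carries behaviour, not just fields.
    Map<String, String> asIcebergProperties() {
      Map<String, String> props = new LinkedHashMap<>();
      props.put("uri", catalogUri);
      // Assumed key/value: selects which AuthManager the factory creates.
      props.put("rest.auth.type", "oauth2");
      // Assumed keys for the OAuth2 client-credentials configuration.
      props.put("oauth2-server-uri", tokenUri);
      props.put("credential", clientId + ":" + clientSecret);
      return props;
    }
  }

  public static void main(String[] args) {
    OAuthConnectionConfig config = new OAuthConnectionConfig(
        "https://catalog.example.com/api", // placeholder URL
        "https://idp.example.com/token",   // placeholder URL
        "my-client", "my-secret");
    config.asIcebergProperties().forEach((k, v) -> System.out.println(k + "=" + v));
  }
}
```

The point of the sketch is the shape: the choice of authentication scheme ends up as a single string property, so a scheme with no corresponding DTO cannot be selected from the Polaris side.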
However, there is a mismatch. The AuthenticationParameters DTO classes don't fully align with the AuthManager classes. For instance, there doesn't appear to be a way of creating an external catalog with authorisation via org.apache.iceberg.gcp.auth.GoogleAuthManager.
So, after a day of investigating and trying to hack something together, it looks like this:
- Iceberg can talk to Google no problem using org.apache.iceberg.gcp.auth.GoogleAuthManager.
- However, there is currently no Polaris code to use GoogleAuthManager in an external catalog.
- Instead, the only way to do it currently is to use the standard OAuth2 code.
- However, Google does not completely follow the OAuth2 spec, hence this Iceberg ticket that led to the writing of GoogleAuthManager and this StackOverflow post that says GCP does not support the grant_type that Iceberg's OAuth2Util uses.
This has now been raised in this Polaris ticket.
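To make the grant_type mismatch concrete, here is a sketch of two token request bodies. The first uses the client_credentials grant of a standard OAuth2 client-credentials flow. The second uses the jwt-bearer grant with a signed assertion, which is the form Google documents for service accounts. The helper names and placeholder values are hypothetical, the JWT is not really signed, and no request is sent.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch only: builds form-encoded token request bodies, sends nothing.
class GrantTypeSketch {

  // Encodes parameters as application/x-www-form-urlencoded, keeping order.
  static String formBody(Map<String, String> params) {
    return params.entrySet().stream()
        .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
            + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
        .collect(Collectors.joining("&"));
  }

  // Body of a standard OAuth2 client-credentials token request.
  static String clientCredentialsBody(String clientId, String clientSecret) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("grant_type", "client_credentials");
    params.put("client_id", clientId);
    params.put("client_secret", clientSecret);
    return formBody(params);
  }

  // Body of a JWT-bearer token request, as used for service accounts.
  // The caller is assumed to have built and signed the JWT already.
  static String jwtBearerBody(String signedJwt) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("grant_type", "urn:ietf:params:oauth:grant-type:jwt-bearer");
    params.put("assertion", signedJwt);
    return formBody(params);
  }

  public static void main(String[] args) {
    System.out.println(clientCredentialsBody("my-client", "my-secret"));
    System.out.println(jwtBearerBody("header.payload.signature"));
  }
}
```

A client that can only produce the first body cannot authenticate against an endpoint that only accepts the second, which is the gap a dedicated auth manager has to fill.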