Clusters
This guide provides a comprehensive overview of Temporal Clusters.
A Temporal Cluster is the group of services, known as the Temporal Server, combined with persistence stores, that together act as a component of the Temporal Platform.
Persistence
A Temporal Cluster's only required dependency for basic operation is a database. Multiple types of databases are supported.
The database stores the following types of data:
- Tasks: Tasks to be dispatched.
- State of Workflow Executions:
  - Execution table: A capture of the mutable state of Workflow Executions.
  - History table: An append-only log of Workflow Execution History Events.
- Namespace metadata: Metadata of each Namespace in the Cluster.
- Visibility data: Enables operations like "show all running Workflow Executions". For production environments, we recommend using Elasticsearch.
An Elasticsearch database can be added to enable Advanced Visibility.
Dependency versions
Temporal tests compatibility by spanning the minimum and maximum stable non-EOL major versions for each supported database. As of time of writing, these specific versions are used in our test pipelines and actively tested before we release any version of Temporal:
- Cassandra v3.11 and v4.0
- PostgreSQL v10.18 and v13.4
- MySQL v5.7 and v8.0 (specifically 8.0.19+ due to a bug)
We update these support ranges once a year. The release notes of each Temporal Server declare when we plan to drop support for database versions reaching End of Life.
- Because Temporal Server primarily relies on core database functionality, we do not expect compatibility to break often. Temporal has no opinions on database upgrade paths; as long as you can upgrade your database according to each project's specifications, Temporal should work with any version within supported ranges.
- We do not run tests with vendors like Vitess and CockroachDB, so you rely on their compatibility claims if you use them. Feel free to discuss them with fellow users in our forum.
- Temporal is working on official SQLite v3.x persistence, but this is meant only for development and testing, not production usage. Cassandra, MySQL, and PostgreSQL schemas are supported and thus can be used as the Server's database.
Monitoring and observation
Temporal emits metrics by default in a format that Prometheus supports. Monitoring and observing those metrics is optional. Any software that can pull metrics in the same format can be used, but we verify compatibility only with the following Prometheus and Grafana versions:
- Prometheus >= v2.0
- Grafana >= v2.5
Visibility
Temporal has built-in Visibility features. To enhance this feature, Temporal supports an integration with Elasticsearch.
- Elasticsearch v7.10 is supported from Temporal version 1.7.0 onwards
- Elasticsearch v6.8 is supported in all Temporal versions
- Both versions are explicitly supported with AWS Elasticsearch
mTLS encryption
Temporal supports Mutual Transport Layer Security (mTLS) as a method of encrypting network traffic between services within a Temporal Cluster, or between application processes and a Cluster.
Mutual TLS can be enabled in Temporal’s TLS configuration.
This configuration can be passed through WithConfig or WithConfigLoader.
This configuration includes two sections that serve to separate intra-cluster and external traffic. That way, different certificates and settings can be used to encrypt each section of traffic:
- internode: configuration for encrypting communication between nodes within the Cluster.
- frontend: configuration for encrypting the Frontend's public endpoints.
Temporal Client connections
A client's network access can be limited by using certificates issued by a specific Certificate Authority (CA).
To restrict access to Temporal Cluster endpoints, use the clientCAFiles or clientCAData property and the requireClientAuth property.
These properties can be specified in both the internode and frontend sections of the mTLS configuration.
Server name specification
Specify the serverName in the client section of your mTLS configuration to prevent spoofing and MITM attacks.
Entering a value for serverName enables established connections to authenticate the endpoint.
This ensures that the server certificate presented to any connected client has the specified server name in its CN property.
This measure can be used for internode and frontend endpoints.
For more information on mTLS configuration, refer to our TLS configuration guide.
Auth
Authentication is the process of verifying that users who want to access your application are who they claim to be. Authorization is the process of determining which applications and data an authenticated user on your Cluster or application can access.
Temporal has several authentication protocols that can be set to restrict access to your data. These protocols address three areas: servers, client connections, and users.
Server attacks can be prevented by specifying serverName in the client section of your mTLS configuration.
This can be done for both frontend and internode endpoints.
Client connections can be restricted to certain endpoints by requiring certificates from a specific CA.
Modify the clientCaFiles, clientCaData, and requireClientAuth properties in the internode and frontend sections of the mTLS configuration.
User access can be restricted through extensibility points and plugins.
Temporal offers two plugin interfaces for API call authentication and authorization.
The logic of both plugins can be customized to fit a variety of use cases. When provided, the frontend invokes the implementation of the plugins before running the requested operation.
Temporal Server
The Temporal Server consists of four independently scalable services:
- Frontend gateway: for rate limiting, routing, authorizing.
- History subsystem: maintains data (mutable state, queues, and timers).
- Matching subsystem: hosts Task Queues for dispatching.
- Worker Service: for internal background Workflows.
For example, a real-life production deployment can have 5 Frontend, 15 History, 17 Matching, and 3 Worker Services per cluster.
The Temporal Server services can run independently or be grouped together into shared processes on one or more physical or virtual machines. For live (production) environments, we recommend that each service runs independently, because each one has different scaling requirements and troubleshooting becomes easier. The History, Matching, and Worker Services can scale horizontally within a Cluster. The Frontend Service scales differently than the others because it has no sharding or partitioning; it is just stateless.
Each service is aware of the others, including scaled instances, through a membership protocol via Ringpop.
Versions and support
All Temporal Server releases abide by the Semantic Versioning Specification.
Fairly precise upgrade paths and support have been established starting from Temporal v1.7.0.
We provide maintenance support for previously published minor and major versions by continuing to release critical bug fixes related to security, the prevention of data loss, and reliability, whenever they are found.
We aim to publish incremental upgrade guides for each minor and major version, which include specifics about dependency upgrades that we have tested for (such as Cassandra 3.0 -> 3.11).
We offer maintenance support of the last three minor versions after a release and do not plan to "backport" patches beyond that.
We offer maintenance support of major versions for at least 12 months after a GA release, and we provide at least 6 months' notice before EOL/deprecating support.
Dependencies
Temporal offers official support for, and is tested against, dependencies with the exact versions described in the go.mod file of the corresponding release tag.
(For example, v1.5.1 dependencies are documented in the go.mod for v1.5.1.)
Frontend Service
The Frontend Service is a stateless gateway service that exposes a strongly typed Proto API. The Frontend Service is responsible for rate limiting, authorizing, validating, and routing all inbound calls.
Types of inbound calls include the following:
- Namespace CRUD
- External events
- Worker polls
- Visibility requests
- Admin operations via tctl (the Temporal CLI)
- Calls from a remote Cluster related to Multi-Cluster Replication
Every inbound request related to a Workflow Execution must have a Workflow Id, which is hashed for routing purposes. The Frontend Service has access to the hash rings that maintain service membership information, including how many nodes (instances of each service) are in the Cluster.
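As a rough illustration of hash-based routing, the sketch below maps a Workflow Id to a shard number. It is only a sketch under stated assumptions: the function name shardForWorkflow is hypothetical, and FNV-1a is used because it ships with Go's standard library, not because it is the hash the Temporal Server uses.

```go
package main

import "hash/fnv"

// shardForWorkflow is a hypothetical helper for this sketch. It maps a
// Workflow Id to a shard number in the range [1, numShards].
// FNV-1a is a stand-in hash chosen for illustration only.
// It returns 0 when numShards is 0, because no shard can be chosen.
func shardForWorkflow(workflowID string, numShards uint32) uint32 {
	if numShards == 0 {
		return 0
	}
	h := fnv.New32a()
	h.Write([]byte(workflowID)) // Write on a hash.Hash never returns an error.
	return h.Sum32()%numShards + 1
}
```

Because the result depends only on the Workflow Id and the shard count, every request for the same Workflow Execution lands on the same shard, which is also why the shard count cannot change without remapping existing executions.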
Inbound call rate limiting is applied per host and per namespace.
The Frontend Service talks to the Matching Service, History Service, Worker Service, the database, and Elasticsearch (if in use).
- It uses the grpcPort 7233 to host the service handler.
- It uses port 6933 for membership-related communication.
History Service
The History Service tracks the state of Workflow Executions.
The History Service scales horizontally via individual shards, configured during the Cluster's creation. The number of shards remains static for the life of the Cluster (so you should plan to scale and over-provision).
Each shard maintains data (routing identifiers, mutable state) and queues. A History shard maintains four types of queues:
- Transfer queue: transfers internal tasks to the Matching Service. Whenever a new Workflow Task needs to be scheduled, the History Service transactionally dispatches it to the Matching Service.
- Timer queue: durably persists Timers.
- Replicator queue: asynchronously replicates Workflow Executions from active Clusters to other passive Clusters (experimental Multi-Cluster feature).
- Visibility queue: pushes data to the visibility index (Elasticsearch).
The History Service talks to the Matching Service and the database.
- It uses grpcPort 7234 to host the service handler.
- It uses port 6934 for membership-related communication.
Matching Service
The Matching Service is responsible for hosting Task Queues for Task dispatching.
It is responsible for matching Workers to Tasks and routing new Tasks to the appropriate queue. This service can scale internally by having multiple instances.
It talks to the Frontend Service, History Service, and the database.
- It uses grpcPort 7235 to host the service handler.
- It uses port 6935 for membership-related communication.
Worker Service
The Worker Service runs background processing for the replication queue, system Workflows, and (in versions older than 1.5.0) the Kafka visibility processor.
It talks to the Frontend Service.
- It uses port 6939 for membership-related communication.
Retention Period
A Retention Period is the amount of time a Workflow Execution Event History remains in the Cluster's persistence store.
A Retention Period applies to a single Namespace and is set when the Namespace is registered.
If the Retention Period isn't set, it defaults to 2 days. The minimum Retention Period is 1 day. The maximum Retention Period is 30 days. Setting the Retention Period to 0 results in the error "A valid retention period is not set on request."
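The limits above can be expressed as a small validation function. This is a minimal sketch, not the Server's implementation: resolveRetention and the constant names are hypothetical, and reusing the same error for every out-of-range value (not only 0) is an assumption of this example.

```go
package main

import (
	"errors"
	"time"
)

// Limits taken from the text above; the names are hypothetical.
const (
	day              = 24 * time.Hour
	defaultRetention = 2 * day
	minRetention     = 1 * day
	maxRetention     = 30 * day
)

// The message is quoted from the text. Returning it for every out-of-range
// value is an assumption of this sketch.
var errInvalidRetention = errors.New("A valid retention period is not set on request.")

// resolveRetention returns the effective Retention Period.
// A nil pointer means the caller did not set a value at all.
func resolveRetention(requested *time.Duration) (time.Duration, error) {
	if requested == nil {
		return defaultRetention, nil
	}
	if *requested < minRetention || *requested > maxRetention {
		return 0, errInvalidRetention
	}
	return *requested, nil
}
```

A pointer is used so that "not set" (nil, which falls back to the default) stays distinct from an explicit 0, which is rejected.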
Archival
Archival is a feature that automatically backs up Event Histories and Visibility records from Temporal Cluster persistence to a custom blob store.
Workflow Execution Event Histories are backed up after the Retention Period is reached. Visibility records are backed up immediately after a Workflow Execution reaches a Closed status.
Archival enables Workflow Execution data to persist as long as needed, while not overwhelming the Cluster's persistence store.
This feature is helpful for compliance and debugging.
Temporal's Archival feature is considered experimental and not subject to normal versioning and support policy.
Archival is not supported when running Temporal via docker-compose and is disabled by default when installing the system manually and when deploying through helm charts (but can be enabled in the config).
Multi-Cluster Replication
Multi-Cluster Replication is a feature that asynchronously replicates Workflow Executions from active Clusters to other passive Clusters, for backup and state reconstruction. When necessary, for higher availability, Cluster operators can fail over to any of the backup Clusters.
Temporal's Multi-Cluster Replication feature is considered experimental and not subject to normal versioning and support policy.
Temporal automatically forwards Start, Signal, and Query requests to the active Cluster. This feature must be enabled through a Dynamic Config flag per Global Namespace.
When the feature is enabled, Tasks are sent to the Parent Task Queue partition that matches that Namespace, if it exists.
All Visibility APIs can be used against active and standby Clusters. This enables Temporal Web to work seamlessly for Global Namespaces. Applications making API calls directly to the Temporal Visibility API continue to work even if a Global Namespace is in standby mode. However, they might see a lag due to replication delay when querying the Workflow execution state from a standby Cluster.
Namespace Versions
A version is a concept in Multi-Cluster Replication that describes the chronological order of events per Namespace.
With Multi-Cluster Replication, all Namespace change events and Workflow Execution History events are replicated asynchronously for high throughput. This means that data across clusters is not strongly consistent. To guarantee that Namespace data and Workflow Execution data will achieve eventual consistency (especially when there is a data conflict during a failover), a version is introduced and attached to Namespaces. All Workflow Execution History entries generated in a Namespace will also come with the version attached to that Namespace.
All participating Clusters are pre-configured with a unique initial version and a shared version increment:
initial version < shared version increment
When performing failover for a Namespace from one Cluster to another Cluster, the version attached to the Namespace will be changed by the following rule:
- Among all versions that satisfy version % (shared version increment) == (active cluster's initial version), find the smallest version for which version >= old version in namespace.
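The failover rule can be written as a few lines of arithmetic. The sketch below is illustrative only: nextFailoverVersion is a hypothetical name, and the function assumes non-negative versions with the initial version smaller than the shared increment, as stated above.

```go
package main

// nextFailoverVersion is a hypothetical helper for this sketch. It returns
// the smallest version v such that
//   v % increment == initialVersion  and  v >= oldVersion.
// It assumes oldVersion >= 0 and 0 <= initialVersion < increment.
func nextFailoverVersion(oldVersion, initialVersion, increment int64) int64 {
	// Start from the candidate in the same "increment window" as oldVersion.
	v := oldVersion - oldVersion%increment + initialVersion
	// If that candidate is below oldVersion, move to the next window.
	if v < oldVersion {
		v += increment
	}
	return v
}
```

For instance, with a shared increment of 10, a Namespace at version 2 that fails over to a Cluster with initial version 1 moves to version 11, because 1 is smaller than the old version and 11 is the next value congruent to 1.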
When there is a data conflict, a comparison will be made and Workflow Execution History entries with the highest version will be considered the source of truth.
When a cluster is trying to mutate a Workflow Execution History, the version will be checked. A cluster can mutate a Workflow Execution History only if the following is true:
- The version in the Namespace belongs to this cluster, i.e. (version in namespace) % (shared version increment) == (this cluster's initial version)
- The version of this Workflow Execution History's last entry (event) is less than or equal to the version in the Namespace, i.e. (last event's version) <= (version in namespace)
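The two conditions combine into a single predicate. This is a minimal sketch with a hypothetical function name (canMutate); it restates the rules above and is not taken from the Server's code.

```go
package main

// canMutate is a hypothetical helper for this sketch. It reports whether a
// cluster may mutate a Workflow Execution History under the two rules above.
func canMutate(namespaceVersion, lastEventVersion, clusterInitialVersion, increment int64) bool {
	// Rule 1: the Namespace version belongs to this cluster.
	ownsNamespace := namespaceVersion%increment == clusterInitialVersion
	// Rule 2: the last event is not newer than the Namespace version.
	notAhead := lastEventVersion <= namespaceVersion
	return ownsNamespace && notAhead
}
```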
Namespace version change example
Assuming the following scenario:
- Cluster A comes with initial version: 1
- Cluster B comes with initial version: 2
- Shared version increment: 10
T = 0: Namespace α is registered, with active Cluster set to Cluster A
- Namespace α's version is 1
- All Workflow events generated within this Namespace come with version 1
T = 1: Namespace β is registered, with active Cluster set to Cluster B
- Namespace β's version is 2
- All Workflow events generated within this Namespace come with version 2
T = 2: Namespace α is updated, with active Cluster set to Cluster B
- Namespace α's version is 2
- All Workflow events generated within this Namespace come with version 2
T = 3: Namespace β is updated, with active Cluster set to Cluster A
- Namespace β's version is 11
- All Workflow events generated within this Namespace come with version 11
Version history
Version history is a concept that provides a high-level summary of version information for a Workflow Execution History.
Whenever a new Workflow Execution History entry is generated, the version from the Namespace is attached. The Workflow Execution's mutable state keeps track of all history entries (events) and the corresponding version.
Version history example (without data conflict)
- Cluster A comes with initial version: 1
- Cluster B comes with initial version: 2
- Shared version increment: 10
T = 0: adding event with event ID == 1 & version == 1
View in both Cluster A & B
| -------- | ------------- | --------------- | ------- |
| Events | Version History |
| -------- | ------------- | --------------- | ------- |
| Event ID | Event Version | Event ID | Version |
| -------- | ------------- | --------------- | ------- |
| 1 | 1 | 1 | 1 |
| -------- | ------------- | --------------- | ------- |
T = 1: adding event with event ID == 2 & version == 1
View in both Cluster A & B
| -------- | ------------- | --------------- | ------- |
| Events | Version History |
| -------- | ------------- | --------------- | ------- |
| Event ID | Event Version | Event ID | Version |
| -------- | ------------- | --------------- | ------- |
| 1 | 1 | 2 | 1 |
| 2 | 1 | | |
| -------- | ------------- | --------------- | ------- |
T = 2: adding event with event ID == 3 & version == 1
View in both Cluster A & B
| -------- | ------------- | --------------- | ------- |
| Events | Version History |
| -------- | ------------- | --------------- | ------- |
| Event ID | Event Version | Event ID | Version |
| -------- | ------------- | --------------- | ------- |
| 1 | 1 | 3 | 1 |
| 2 | 1 | | |
| 3 | 1 | | |
| -------- | ------------- | --------------- | ------- |
T = 3: Namespace failover triggered, Namespace version is now 2; adding event with event ID == 4 & version == 2
View in both Cluster A & B
| -------- | ------------- | --------------- | ------- |
| Events | Version History |
| -------- | ------------- | --------------- | ------- |
| Event ID | Event Version | Event ID | Version |
| -------- | ------------- | --------------- | ------- |
| 1 | 1 | 3 | 1 |
| 2 | 1 | 4 | 2 |
| 3 | 1 | | |
| 4 | 2 | | |
| -------- | ------------- | --------------- | ------- |
T = 4: adding event with event ID == 5 & version == 2
View in both Cluster A & B
| -------- | ------------- | --------------- | ------- |
| Events | Version History |
| -------- | ------------- | --------------- | ------- |
| Event ID | Event Version | Event ID | Version |
| -------- | ------------- | --------------- | ------- |
| 1 | 1 | 3 | 1 |
| 2 | 1 | 5 | 2 |
| 3 | 1 | | |
| 4 | 2 | | |
| 5 | 2 | | |
| -------- | ------------- | --------------- | ------- |
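The Version History column in the tables above records, for each version, the ID of the last event written with that version. The sketch below derives it from an ordered event list; the type and function names are hypothetical and exist only for this illustration.

```go
package main

// event and versionHistoryItem are hypothetical types for this sketch.
type event struct{ ID, Version int64 }

type versionHistoryItem struct{ EventID, Version int64 }

// buildVersionHistory compacts an ordered list of events into version
// history items: one item per run of equal versions, holding the ID of the
// last event in that run.
func buildVersionHistory(events []event) []versionHistoryItem {
	var items []versionHistoryItem
	for _, e := range events {
		if n := len(items); n > 0 && items[n-1].Version == e.Version {
			// Same version as the previous event: extend the current item.
			items[n-1].EventID = e.ID
			continue
		}
		// Version changed (or first event): start a new item.
		items = append(items, versionHistoryItem{EventID: e.ID, Version: e.Version})
	}
	return items
}
```

Feeding it the five events from T = 4 yields two items, (3, 1) and (5, 2), matching the last table.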
Since Temporal is AP, during failover (change of active Temporal Cluster Namespace), there can exist cases where more than one Cluster can modify a Workflow Execution, causing divergence of Workflow Execution History. The following example shows what the version history looks like under such conditions.
Version history example (with data conflict)
The following shows the version history of the same Workflow Execution in 2 different Clusters.
- Cluster A comes with initial version: 1
- Cluster B comes with initial version: 2
- Cluster C comes with initial version: 3
- Shared version increment: 10
T = 0:
View in both Cluster B & C
| -------- | ------------- | --------------- | ------- |
| Events | Version History |
| -------- | ------------- | --------------- | ------- |
| Event ID | Event Version | Event ID | Version |
| -------- | ------------- | --------------- | ------- |
| 1 | 1 | 2 | 1 |
| 2 | 1 | 3 | 2 |
| 3 | 2 | | |
| -------- | ------------- | --------------- | ------- |
T = 1: adding event with event ID == 4 & version == 2 in Cluster B
| -------- | ------------- | --------------- | ------- |
| Events | Version History |
| -------- | ------------- | --------------- | ------- |
| Event ID | Event Version | Event ID | Version |
| -------- | ------------- | --------------- | ------- |
| 1 | 1 | 2 | 1 |
| 2 | 1 | 4 | 2 |
| 3 | 2 | | |
| 4 | 2 | | |
| -------- | ------------- | --------------- | ------- |
T = 1: namespace failover to Cluster C, adding event with event ID == 4 & version == 3 in Cluster C
| -------- | ------------- | --------------- | ------- |
| Events | Version History |
| -------- | ------------- | --------------- | ------- |
| Event ID | Event Version | Event ID | Version |
| -------- | ------------- | --------------- | ------- |
| 1 | 1 | 2 | 1 |
| 2 | 1 | 3 | 2 |
| 3 | 2 | 4 | 3 |
| 4 | 3 | | |
| -------- | ------------- | --------------- | ------- |
T = 2: replication task from Cluster C arrives in Cluster B
Note: the structures below are trees.
                | -------- | ------------- |
                | Events                   |
                | -------- | ------------- |
                | Event ID | Event Version |
                | -------- | ------------- |
                | 1        | 1             |
                | 2        | 1             |
                | 3        | 2             |
                | -------- | ------------- |
                           |
           | ------------- | ------------ |
           |                              |
| -------- | ------------- |   | -------- | ------------- |
| Event ID | Event Version |   | Event ID | Event Version |
| -------- | ------------- |   | -------- | ------------- |
| 4        | 2             |   | 4        | 3             |
| -------- | ------------- |   | -------- | ------------- |

          | --------------- | ------- |
          | Version History           |
          | --------------- | ------- |
          | Event ID        | Version |
          | --------------- | ------- |
          | 2               | 1       |
          | 3               | 2       |
          | --------------- | ------- |
                            |
                  | ------- | ------------------- |
                  |                               |
| --------------- | ------- |   | --------------- | ------- |
| Event ID        | Version |   | Event ID        | Version |
| --------------- | ------- |   | --------------- | ------- |
| 4               | 2       |   | 4               | 3       |
| --------------- | ------- |   | --------------- | ------- |
T = 2: replication task from Cluster B arrives in Cluster C, same as above
Conflict resolution
When a Workflow Execution History diverges, proper conflict resolution is applied.
In Multi-Cluster Replication, Workflow Execution History Events are modeled as a tree, as shown in the second example in Version History.
Workflow Execution Histories that diverge will have more than one history branch.
Among all history branches, the history branch with the highest version is considered the current branch, and the Workflow Execution's mutable state is a summary of the current branch.
Whenever there is a switch between Workflow Execution History branches, a complete rebuild of the Workflow Execution's mutable state will occur.
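Choosing the current branch can be sketched as a scan over the branches. This is only an illustration: the names are hypothetical, and comparing branches by the version of their last event is this sketch's reading of "the history branch with the highest version".

```go
package main

// historyEvent is a hypothetical type for this sketch.
type historyEvent struct{ ID, Version int64 }

// currentBranch returns the index of the branch whose last event carries
// the highest version, or -1 if every branch is empty.
// Assumption: a branch's version is the version of its last event.
func currentBranch(branches [][]historyEvent) int {
	best, bestVersion := -1, int64(0)
	for i, b := range branches {
		if len(b) == 0 {
			continue
		}
		if v := b[len(b)-1].Version; best == -1 || v > bestVersion {
			best, bestVersion = i, v
		}
	}
	return best
}
```

In the data-conflict example, the branch ending in event 4 with version 3 wins over the branch ending in event 4 with version 2, so mutable state would be rebuilt from the former.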
Temporal Multi-Cluster Replication relies on asynchronous replication of Events across Clusters, so in the case of a failover it is possible to have an Activity Task dispatched again to the newly active Cluster due to a replication task lag. This also means that whenever a Workflow Execution is updated after a failover by the new Cluster, any previous replication tasks for that Execution cannot be applied. This results in loss of some progress made by the Workflow Execution in the previous active Cluster. During such conflict resolution, Temporal re-injects any external Events like Signals in the new Event History before discarding replication tasks. Even though some progress could roll back during failovers, Temporal provides the guarantee that Workflow Executions won’t get stuck and will continue to make forward progress.
Activity Execution completions are not forwarded across Clusters. Any outstanding Activities will eventually time out based on the configuration. Your application should have retry logic in place so that the Activity gets retried and dispatched again to a Worker after the failover to the new Cluster. Handling this is similar to handling an Activity Task timeout caused by a Worker restarting.
Zombie Workflows
There is an existing contract that for any Namespace and Workflow Id combination, there can be at most one run (Namespace + Workflow Id + Run Id) open / executing.
Multi-Cluster Replication aims to keep the Workflow Execution History as up-to-date as possible among all participating Clusters.
Due to the nature of Multi-Cluster Replication (for example, Workflow Execution History events are replicated asynchronously), different Runs (same Namespace and Workflow Id) can arrive at the target Cluster at different times, sometimes out of order, as shown below:
| ------------- |          | ------------- |          | ------------- |
|   Cluster A   |          | Network Layer |          |   Cluster B   |
| ------------- |          | ------------- |          | ------------- |
        |                          |                          |
        | Run 1 Replication Events |                          |
        | -----------------------> |                          |
        |                          |                          |
        | Run 2 Replication Events |                          |
        | -----------------------> |                          |
        |                          |                          |
        |                          |                          |
        |                          |                          |
        |                          | Run 2 Replication Events |
        |                          | -----------------------> |
        |                          |                          |
        |                          | Run 1 Replication Events |
        |                          | -----------------------> |
        |                          |                          |
| ------------- |          | ------------- |          | ------------- |
|   Cluster A   |          | Network Layer |          |   Cluster B   |
| ------------- |          | ------------- |          | ------------- |
Since Run 2 appears in Cluster B first, Run 1 cannot be replicated as "runnable" due to the rule of at most one Run open (see above), thus the "zombie" Workflow Execution state is introduced.
A "zombie" state is one in which a Workflow Execution cannot be actively mutated by a Cluster (assuming the corresponding Namespace is active in this Cluster). A zombie Workflow Execution can only be changed by a replication Task.
Run 1 will be replicated similarly to Run 2, except that Run 1's execution will become a "zombie" before Run 1 reaches completion.
Workflow Task processing
In the context of Multi-Cluster Replication, a Workflow Execution's mutable state is an entity which tracks all pending tasks. Prior to the introduction of Multi-Cluster Replication, Workflow Execution History entries (events) were from a single branch, and the Temporal Server would only append new entries (events) to the Workflow Execution History.
After the introduction of Multi-Cluster Replication, it is possible that a Workflow Execution can have multiple Workflow Execution History branches. Tasks generated according to one history branch may become invalidated by switching history branches during conflict resolution.
Example:
T = 0: task A is generated according to Event Id: 4, version: 2
| -------- | ------------- |
| Events                   |
| -------- | ------------- |
| Event ID | Event Version |
| -------- | ------------- |
| 1        | 1             |
| 2        | 1             |
| 3        | 2             |
| -------- | ------------- |
           |
           |
| -------- | ------------- |
| Event ID | Event Version |
| -------- | ------------- |
| 4        | 2             | <-- task A belongs to this event
| -------- | ------------- |
T = 1: conflict resolution happens, Workflow Execution's mutable state is rebuilt and history Event Id: 4, version: 3 is written down to persistence
                | -------- | ------------- |
                | Events                   |
                | -------- | ------------- |
                | Event ID | Event Version |
                | -------- | ------------- |
                | 1        | 1             |
                | 2        | 1             |
                | 3        | 2             |
                | -------- | ------------- |
                           |
           | ------------- | -------------------------------------------- |
           |                                                              |
| -------- | ------------- |                                   | -------- | ------------- |
| Event ID | Event Version |                                   | Event ID | Event Version |
| -------- | ------------- |                                   | -------- | ------------- |
| 4        | 2             | <-- task A belongs to this event  | 4        | 3             | <-- current branch / mutable state
| -------- | ------------- |                                   | -------- | ------------- |
T = 2: task A is loaded.
At this time, due to the rebuild of a Workflow Execution's mutable state (conflict resolution), Task A is no longer relevant (Task A's corresponding Event belongs to non-current branch). Task processing logic will verify both the Event Id and version of the Task against a corresponding Workflow Execution's mutable state, then discard task A.
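The verification step can be sketched as a lookup against the current branch. The types and the function name taskIsCurrent are hypothetical; the sketch only illustrates the check on Event Id and version described above.

```go
package main

// historyEvent and task are hypothetical types for this sketch.
type historyEvent struct{ ID, Version int64 }

type task struct{ EventID, Version int64 }

// taskIsCurrent reports whether the task's Event Id exists on the current
// branch with the same version. A task that fails this check was generated
// from a non-current branch and can be discarded.
func taskIsCurrent(t task, current []historyEvent) bool {
	for _, e := range current {
		if e.ID == t.EventID {
			return e.Version == t.Version
		}
	}
	return false // The event is not on the current branch at all.
}
```

With the current branch ending in Event Id 4 at version 3, task A (Event Id 4, version 2) fails the check and is dropped.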
Plugins
Temporal Clusters support some pluggable components.
Claim Mapper
The Claim Mapper component is a pluggable component that extracts Claims from JSON Web Tokens (JWTs).
This process is achieved with the method GetClaims, which translates AuthInfo structs from the caller into Claims about the caller's roles within Temporal.
A Role (within Temporal) is a bit mask that combines one or more of the role constants.
In the following example, the role is assigned constants that allow the caller to read and write information.
role := authorization.RoleReader | authorization.RoleWriter
GetClaims is customizable and can be modified with the temporal.WithClaimMapper server option.
Temporal also offers a default JWT ClaimMapper for your use.
A typical approach is for ClaimMapper to interpret custom Claims from a caller's JWT, such as membership in groups, and map them to Temporal roles for the user.
The subject information from the caller's mTLS certificate can also be a parameter in determining roles.
AuthInfo
AuthInfo is a struct that is passed to GetClaims. AuthInfo contains an authorization token extracted from the authorization header of the gRPC request.
AuthInfo includes a pointer to the pkix.Name struct.
This struct contains an X.509 Distinguished Name from the caller's mTLS certificate.
Claims
Claims is a struct that contains information about permission claims granted to the caller.
Authorizer assumes that the caller has been properly authenticated, and trusts the Claims when making an authorization decision.
Default JWT ClaimMapper
Temporal offers a default JWT ClaimMapper that extracts the information needed to form Temporal Claims.
This plugin requires a public key to validate digital signatures.
To get an instance of the default JWT ClaimMapper, call NewDefaultJWTClaimMapper and provide it with the following:
- a TokenKeyProvider instance
- a config.Authorization pointer
- a logger
The code for the default ClaimMapper can also be used to build a custom ClaimMapper.
Token key provider
A TokenKeyProvider obtains public keys from specified issuers' URIs that adhere to a specific format.
The default JWT ClaimMapper uses this component to obtain and refresh public keys over time.
Temporal provides an rsaTokenKeyProvider.
This component dynamically obtains public keys that follow the JWKS format.
rsaTokenKeyProvider uses only the RSAKey and Close methods.
provider := authorization.NewRSAKeyProvider(cfg)
KeySourceURIs are the HTTP endpoints that return public keys of token issuers in the JWKS format.
RefreshInterval defines how frequently keys should be refreshed.
For example, Auth0 exposes endpoints such as https://YOUR_DOMAIN/.well-known/jwks.json.
By default, "permissions" is used to name the permissionsClaimName value.
Configure the plugin with config.Config.Global.Authorization.JWTKeyProvider.
JSON Web Token format
The default JWT ClaimMapper expects authorization tokens to be formatted as follows:
Bearer <token>
The Permissions Claim in the JWT Token is expected to be a collection of Individual Permission Claims. Each Individual Permission Claim must be formatted as follows:
<namespace> : <permission>
These permissions are then converted into Temporal roles for the caller. Each permission must be one of Temporal's four values:
- read
- write
- worker
- admin
Multiple permissions for the same Namespace are overridden by the ClaimMapper.
Example of a payload for the default JWT ClaimMapper
{
"permissions":[
"system:read",
"namespace1:write"
],
"aud":[
"audience"
],
"exp":1630295722,
"iss":"Issuer"
}
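To make the permission format concrete, the sketch below parses entries such as those in the payload above into (Namespace, role) pairs. It is a simplified illustration and not the default ClaimMapper's code: the role type, its constants, and parsePermissions are hypothetical stand-ins, and the sketch deliberately does not merge repeated Namespaces.

```go
package main

import (
	"fmt"
	"strings"
)

// role and its constants are hypothetical stand-ins for Temporal's role
// bit mask; the concrete values are assumptions of this sketch.
type role uint8

const (
	roleWorker role = 1 << iota
	roleReader
	roleWriter
	roleAdmin
)

// namespaceRole pairs a Namespace with the role parsed for it.
type namespaceRole struct {
	Namespace string
	Role      role
}

// parsePermissions converts "<namespace>:<permission>" strings into
// (Namespace, role) pairs, in input order.
func parsePermissions(perms []string) ([]namespaceRole, error) {
	out := make([]namespaceRole, 0, len(perms))
	for _, p := range perms {
		ns, perm, ok := strings.Cut(p, ":")
		if !ok {
			return nil, fmt.Errorf("malformed permission %q", p)
		}
		var r role
		switch strings.ToLower(strings.TrimSpace(perm)) {
		case "read":
			r = roleReader
		case "write":
			r = roleWriter
		case "worker":
			r = roleWorker
		case "admin":
			r = roleAdmin
		default:
			return nil, fmt.Errorf("unknown permission %q", perm)
		}
		out = append(out, namespaceRole{Namespace: strings.TrimSpace(ns), Role: r})
	}
	return out, nil
}
```

Signature validation, expiry checks, and audience checks are out of scope here; a real mapper would perform them before trusting the permissions claim.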
Authorizer Plugin
The Authorizer plugin contains a single Authorize method, which is invoked for each incoming API call.
Authorize receives information about the API call, along with the role and permission claims of the caller.
Authorizer allows for a wide range of authorization logic, including call target, role/permissions claims, and other data available to the system.
Configuration
The following arguments must be passed to Authorizer:
- context.Context: General context of the call.
- authorization.Claims: Claims about the roles assigned to the caller. Its intended use is described in the Claims section earlier on this page.
- authorization.CallTarget: Target of the API call.
Authorizer then returns one of two decisions:
- DecisionDeny: the requested API call is not invoked and an error is returned to the caller.
- DecisionAllow: the requested API call is invoked.
Authorizer allows all API calls to pass by default. Disable the nopAuthority authorizer and configure your own to prevent this behavior.
Configure your Authorizer when you start the server via the temporal.WithAuthorizer server option.
If an Authorizer is not set in the server options, Temporal uses the nopAuthority authorizer that unconditionally allows all API calls to pass through.
a := authorization.NewDefaultAuthorizer()