Activities
This guide provides a comprehensive overview of Temporal Activities.
In day-to-day conversations, the term Activity frequently denotes either an Activity Definition, an Activity Type, or an Activity Execution. Temporal documentation aims to be explicit and differentiate between them.
An Activity is a normal function or object method that executes a single, well-defined action (either short or long running), such as calling another service, transcoding a media file, or sending an email message.
Workflow code orchestrates the execution of Activities, persisting the results. If an Activity Function Execution fails, any future execution starts from initial state (except Heartbeats). Therefore, an Activity function is allowed to contain any code without restrictions.
Activity Functions are executed by Worker Processes.
When the Activity Function returns, the Worker sends the results back to the Temporal Cluster as part of the ActivityTaskCompleted
Event.
The Event is added to the Workflow Execution's Event History.
Activity Definition
An Activity Definition is the code that defines the constraints of an Activity Task Execution.
The term 'Activity Definition' is used to refer to the full set of primitives in any given language SDK that provides an access point to an Activity Function Definition——the method or function that is invoked for an Activity Task Execution. Therefore, the terms Activity Function and Activity Method refer to the source of an instance of an execution.
Activity Definitions are named and referenced in code by their Activity Type.
Constraints
Activity Definitions are executed as normal functions.
In the event of failure, the function begins at its initial state when retried (except when Activity Heartbeats are established).
Therefore, an Activity Definition has no restrictions on the code it contains.
Parameters
An Activity Definition can support as many parameters as needed.
All values passed through these parameters are recorded in the Event History of the Workflow Execution. Return values are also captured in the Event History for the calling Workflow Execution.
Activity Definitions must contain the following parameters:
- Context: an optional parameter that provides Activity context within multiple APIs.
- Heartbeat: a notification from the Worker to the Temporal Cluster that the Activity Execution is progressing. Cancelations are allowed only if the Activity Definition permits Heartbeating.
- Timeouts: intervals that control the execution and retrying of Activity Task Executions.
Other parameters, such as Retry Policies and return values, can be seen in the implementation guides, listed in the next section.
Activity Type
An Activity Type is the mapping of a name to an Activity Definition.
Activity Types are scoped through Task Queues.
Activity Execution
An Activity Execution is the full chain of Activity Task Executions.
An Activity Execution has no time limit. Activity Execution time limits and retries can be optimized for each situation within the Temporal Application.
If for any reason an Activity Execution does not complete (exhausts all retries), the error is returned to the Workflow, which decides how to handle it.
Request Cancellation
A Workflow can request to cancel an Activity Execution.
When an Activity Execution is canceled, or its Workflow Execution has completed or failed, the context passed into its function is canceled, which also sets its channel’s closed state to Done
.
An Activity can use that to perform any necessary cleanup and abort its execution.
Cancellation requests are only delivered to Activity Executions that Heartbeat:
- The Heartbeat request fails with a special error indicating that the Activity Execution is canceled. Heartbeats can also fail when the Workflow Execution that spawned it is in a completed state.
- The Activity should perform all necessary cleanup and report when it is done.
- The Workflow can decide if it wants to wait for the Activity cancellation confirmation or proceed without waiting.
Activity Id
A unique identifier for an Activity Execution. The identifier can be generated by the system, or it can be provided by the Workflow code that spawns the Activity Execution. An Activity Id can be used to complete the Activity asynchronously.
Schedule-To-Start Timeout
A Schedule-To-Start Timeout is the maximum amount of time that is allowed from when an Activity Task is scheduled (that is, placed in a Task Queue) to when a Worker starts (that is, picks up from the Task Queue) that Activity Task. In other words, it's a limit for how long an Activity Task can be enqueued.
How to set a Schedule-To-Start Timeout
The moment that the Task is picked by the Worker from the Task Queue is considered to be the start of the Activity Task for the purposes of the Schedule-To-Start Timeout and associated metrics. This definition of "Start" avoids issues that a clock difference between the Temporal Cluster and a Worker might create.
"Schedule" in Schedule-To-Start and Schedule-To-Close have different frequency guarantees.
The Schedule-To-Start Timeout is enforced for each Activity Task, whereas the Schedule-To-Close Timeout is enforced once per Activity Execution. Thus, "Schedule" in Schedule-To-Start refers to the scheduling moment of every Activity Task in the sequence of Activity Tasks that make up the Activity Execution, while "Schedule" in Schedule-To-Close refers to the first Activity Task in that sequence.
A Retry Policy attached to an Activity Execution retries an Activity Task.
This timeout has two primary use cases:
- Detect whether an individual Worker has crashed.
- Detect whether the fleet of Workers polling the Task Queue is not able to keep up with the rate of Activity Tasks.
The default Schedule-To-Start Timeout is ∞ (infinity).
If this timeout is used, we recommend setting this timeout to the maximum time a Workflow Execution is willing to wait for an Activity Execution in the presence of all possible Worker outages, and have a concrete plan in place to reroute Activity Tasks to a different Task Queue. This timeout does not trigger any retries regardless of the Retry Policy, as a retry would place the Activity Task back into the same Task Queue. We do not recommend using this timeout unless you know what you are doing.
In most cases, we recommend monitoring the temporal_activity_schedule_to_start_latency
metric to know when Workers slow down picking up Activity Tasks, instead of setting this timeout.
Start-To-Close Timeout
A Start-To-Close Timeout is the maximum time allowed for a single Activity Task Execution.
The default Start-To-Close Timeout is the same as the default Schedule-To-Close Timeout.
An Activity Execution must have either this timeout (Start-To-Close) or the Schedule-To-Close Timeout set. We recommend always setting this timeout; however, make sure that it is always set to be longer than the maximum possible time for the Activity Execution to take place. For long running Activity Executions, we recommend also using Activity Heartbeats and Heartbeat Timeouts.
The main use case for the Start-To-Close timeout is to detect when a Worker crashes after it has started executing an Activity Task.
A Retry Policy attached to an Activity Execution retries an Activity Task Execution. Thus, the Start-To-Close Timeout is applied to each Activity Task Execution within an Activity Execution.
If the first Activity Task Execution returns an error the first time, then the full Activity Execution might look like this:
If this timeout is reached, the following actions occur:
- An ActivityTaskTimedOut Event is written to the Workflow Execution's mutable state.
- If a Retry Policy dictates a retry, the Temporal Cluster schedules another Activity Task.
- The attempt count increments by 1 in the Workflow Execution's mutable state.
- The Start-To-Close Timeout timer is reset.
Schedule-To-Close Timeout
A Schedule-To-Close Timeout is the maximum amount of time allowed for the overall Activity Execution, from when the first Activity Task is scheduled to when the last Activity Task, in the chain of Activity Tasks that make up the Activity Execution, reaches a Closed status.
Example Schedule-To-Close Timeout period for an Activity Execution that has a chain Activity Task Executions:
The default Schedule-To-Close Timeout is ∞ (infinity).
An Activity Execution must have either this timeout (Schedule-To-Close) or Start-To-Close set. By default, an Activity Execution Retry Policy dictates that retries will occur for up to 10 years. This timeout can be used to control the overall duration of an Activity Execution in the face of failures (repeated Activity Task Executions), without altering the Maximum Attempts field of the Retry Policy.
Activity Heartbeat
An Activity Heartbeat is a ping from the Worker that is executing the Activity to the Temporal Cluster. Each ping informs the Temporal Cluster that the Activity Execution is making progress and the Worker has not crashed.
Activity Heartbeats work in conjunction with a Heartbeat Timeout.
Activity Heartbeats are implemented within the Activity Definition. Custom progress information can be included in the Heartbeat which can then be used by the Activity Execution should a retry occur.
An Activity Heartbeat can be recorded as often as needed (e.g. once a minute or every loop iteration). It is often a good practice to Heartbeat on anything but the shortest Activity Function Execution. Temporal SDKs control the rate at which Heartbeats are sent to the Cluster.
Heartbeating is not required from Local Activities, and does nothing.
For long-running Activities, we recommend using a relatively short Heartbeat Timeout and a frequent Heartbeat. That way if a Worker fails it can be handled in a timely manner.
A Heartbeat can include an application layer payload that can be used to save Activity Execution progress. If an Activity Task Execution times out due to a missed Heartbeat, the next Activity Task can access and continue with that payload.
Activity Cancellations are delivered to Activities from the Cluster when they Heartbeat. Activities that don't Heartbeat can't receive a Cancellation. Heartbeat throttling may lead to Cancellation getting delivered later than expected.
Throttling
Heartbeats may not always be sent to the Cluster—they may be throttled by the Worker. The throttle interval is the smaller of the following:
- If
heartbeatTimeout
is provided,heartbeatTimeout * 0.8
; otherwise,defaultHeartbeatThrottleInterval
maxHeartbeatThrottleInterval
defaultHeartbeatThrottleInterval
is 30 seconds by default, and maxHeartbeatThrottleInterval
is 60 seconds by default.
Each can be set in Worker options.
Throttling is implemented as follows:
- After sending a Heartbeat, the Worker sets a timer for the throttle interval.
- The Worker stops sending Heartbeats, but continues receiving Heartbeats from the Activity and remembers the most recent one.
- When the timer fires, the Worker:
- Sends the most recent Heartbeat.
- Sets the timer again.
Which Activities should Heartbeat?
Heartbeating is best thought about not in terms of time, but in terms of "How do you know you are making progress?" For short-term operations, progress updates are not a requirement. However, checking the progress and status of Activity Executions that run over long periods is almost always useful.
Consider the following when setting Activity Hearbeats:
Your underlying task must be able to report definite progress. Note that your Workflow cannot read this progress information while the Activity is still executing (or it would have to store it in Event History). You can report progress to external sources if you need it exposed to the user.
Your Activity Execution is long-running, and you need to verify whether the Worker that is processing your Activity is still alive and has not run out of memory or silently crashed.
For example, the following scenarios are suitable for Heartbeating:
- Reading a large file from Amazon S3.
- Running a ML training job on some local GPUs.
And the following scenarios are not suitable for Heartbeating:
- Making a quick API call.
- Reading a small file from disk.
Heartbeat Timeout
A Heartbeat Timeout is the maximum time between Activity Heartbeats.
If this timeout is reached, the Activity Task fails and a retry occurs if a Retry Policy dictates it.
Asynchronous Activity Completion
Asynchronous Activity Completion is a feature that enables an Activity Function to return without causing the Activity Execution to complete. The Temporal Client can then be used to both Heartbeat Activity Execution progress and eventually provide a result.
When to use Async Completion
The intended use-case for this feature is when an external system has the final result of a computation, started by an Activity.
Consider using Asynchronous Activities instead of Signals if the external process is unreliable and might fail to send critical status updates through a Signal.
Consider using Signals as an alternative to Asynchronous Activities to return data back to a Workflow Execution if there is a human in the process loop. The reason is that a human in the loop means multiple steps in the process. The first is the Activity Function that stores state in an external system and at least one other step where a human would “complete” the activity. If the first step fails, you want to detect that quickly and retry instead of waiting for the entire process, which could be significantly longer when humans are involved.
Task Token
A Task Token is a unique Id that correlates to an Activity Execution.
Activity Execution completion calls take either a single Task Token, or the Namespace, Workflow Id, and Activity Id as a set of arguments.
Local Activity
A Local Activity is an Activity Execution that executes in the same process as the Workflow Execution that spawns it.
Some Activity Executions are very short-living and do not need the queuing semantic, flow control, rate limiting, and routing capabilities. For this case, Temporal supports the Local Activity feature.
The main benefit of Local Activities is that they use less Temporal service resources (e.g. lower state transitions) and have much lower latency overhead (because no need to roundtrip to the Cluster) compared to normal Activity Executions. However, Local Activities are subject to shorter durations and a lack of rate limiting.
Consider using Local Activities for functions that are the following:
- can be implemented in the same binary as the Workflow that calls them.
- do not require global rate limiting.
- do not require routing to a specific Worker or Worker pool.
- no longer than a few seconds, inclusive of retries (shorter than the Workflow Task Timeout, which is 10 seconds by default).
Using a Local Activity without understanding its limitations can cause various production issues. We recommend using regular Activities unless your use case requires very high throughput and large Activity fan outs of very short-lived Activities. More guidance in choosing between Local Activity vs Activity is available in our forums.