Alerts notify recipients about status changes to your checks. They can be sent by email, SMS, or an integrated third-party service.
Here’s an example of some configured recipients, some of which are disabled:
Alerts configured for a browser check, all enabled:
Note: All of these alerts are enabled, but because most of their targets are disabled, the alerts are triggered without being sent to those targets.
Apica’s alerting requires that the monitoring checks be defined, the thresholds and conditions for an alert be set, and a destination be specified where this information will ultimately end up.
This section introduces alerting based on a push model, with Webhooks defining where the alert's POST body gets sent. The contents of the alert messages are populated with placeholders that convey the actual values needed by the Webhook-enabled service, the SMS text, or the email.
Alerts
Are fired/triggered when a threshold has been passed, or a condition has been met.
Have Targets (places to send the alert to)
Targets
Can be directly sent to a person/group
Email
SMS
It can be a Webhook-enabled service like OpsGenie or Splunk
Push style to the service ingestion endpoint
Not a REST-API Pull model
Placeholders
Messages: The messages these alerts create can contain information/metric Message Placeholders filled in during message generation.
For Webhooks: WebHook Placeholders are what the service needs to route the incoming alert message correctly. These placeholders can be customized for the service to be integrated.
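The push model outlined above means the receiving service exposes an ingestion endpoint and waits for a POST; it never polls Apica. The following is a minimal sketch of such a receiver using only the Python standard library. The handler class, the `parse_alert_body` helper, and the sample payload are hypothetical illustrations, not a documented Apica schema.

```python
import json
from http.server import BaseHTTPRequestHandler


def parse_alert_body(raw: bytes) -> dict:
    """Decode the JSON POST body of an incoming alert (the shape is an assumption)."""
    return json.loads(raw.decode("utf-8"))


class AlertReceiver(BaseHTTPRequestHandler):
    """Hypothetical ingestion endpoint: alerts are pushed here, nothing is polled."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        alert = parse_alert_body(self.rfile.read(length))
        print("received alert:", alert)
        self.send_response(204)  # acknowledge without a response body
        self.end_headers()


# Illustrative payload only; the real body is whatever the Webhook target defines.
sample = b'{"check": "Homepage", "severity": "Error"}'
parsed = parse_alert_body(sample)
```

To try the receiver locally, it could be served with `http.server.HTTPServer(("", 8080), AlertReceiver).serve_forever()`; the port is arbitrary.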
The Alert Setup Process
From the Manage Alerts view, you can assign alerts for each check to individual users or groups, who are then notified of any status change according to the preferred severity.
Screenshots
In Manage Alerts, checks are displayed by Top Level Group and Subgroup, much as they appear in Manage checks, but with an icon along the right side with which Alerts can be assigned.
In the upper right, you can toggle between Alerts and Recipients.
Workflow
The general workflow for creating alerts is:
Create or Add a User to receive the alerts (Who gets the alert?).
Create Targets for the recipients' delivery method (How are the alerts delivered? e.g., PagerDuty, SMS Text, Email, etc.).
(Optional) Create a Group containing multiple recipients for the alert.
Create the Alerts themselves by selecting checks and assigning recipients.
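The relationship between users, targets, groups, and alerts in the workflow above can be pictured with a small data model. This is only a sketch for reasoning about the setup: the class and field names are hypothetical and do not correspond to an Apica API. It also shows why an enabled alert with disabled targets triggers without anything being sent to those targets. Python 3.9 or later is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    kind: str             # e.g. "email", "sms", "pagerduty", "webhook"
    enabled: bool = True


@dataclass
class User:
    name: str
    targets: list[Target] = field(default_factory=list)


@dataclass
class Group:
    name: str
    members: list[User] = field(default_factory=list)


@dataclass
class Alert:
    check_name: str
    severities: set[str]
    recipients: list[User] = field(default_factory=list)

    def deliveries(self, severity: str) -> list[Target]:
        """Targets that would actually receive a message for this severity."""
        if severity not in self.severities:
            return []
        return [t for u in self.recipients for t in u.targets if t.enabled]


alice = User("alice", [Target("alice-mail", "email"),
                       Target("alice-sms", "sms", enabled=False)])
ops = Group("ops", [alice])
alert = Alert("Homepage", {"Error", "Fatal"}, recipients=ops.members)
sent = alert.deliveries("Error")  # only the enabled email target remains
```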
Step | Screenshot |
---|---|
Add Recipients: The Users and Groups you set to receive Alerts are set up in the Recipients tab. There is a column to define Users and their contact information, as well as Groups. | |
First Step: Add a User. Recipients are the users or groups of users you select to receive the alerts.
| |
Second Step: Add Group. Recipient groups are collections of user targets you select to receive the alerts. Note: You need to create the users and targets before you can add them to a group. Create Group: To add a recipient group:
| |
Targets: When defining alert recipients, you can have the message delivered via various target services. For each User or Group Recipient, you add delivery Targets that define the method of delivery. User: You can add PagerDuty, Email, a WebHook integration, or SMS (text message) as targets. Groups: When you have defined targets for individual users, you can add them to Groups: | |
Alerts Tab: The Alerts tab allows you to set Severity, Targets (individual users), and Groups (of users) to review alerts according to the parameters you prefer. Alerts can be set for individual checks and be delivered to multiple Targets. | |
Add Alert: You can add alerts for any checks and select one or several severities to include in the alert. Each alert can have multiple Recipients, and each recipient can have multiple Targets. Create Alert: To add an alert:
OR:
|
Configuring Different Alerting Types
E-mail Alerts
A standard way of delivering notifications is sending an email. You can send the alert to multiple email addresses, and optionally have a customized message containing Message Placeholders.
An email target is created automatically when you set up a user via Add User.
Add an E-mail Target
To add an Email target:
Click the Email button
Enter a Target Name for identification in Synthetic Monitoring
Enter a list of Email addresses to send the alert to
If you want to use a custom message:
Uncheck Use Default Message
Enter an alert Message (you can use Message Placeholders)
Click the Add Email Target button
The Target is created, containing the selected user/targets.
SMS Alerts
Alerts can be delivered as SMS to mobile phones. You can send the alert to multiple numbers, and optionally have a customized message containing Message Placeholders.
The phone target is created automatically when you set up user alerts via Add User.
The phone number needs to include the international country prefix, for example +1 for the US or +46 for Sweden.
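Before entering numbers in an SMS target, it can help to check that each one carries a country prefix. The sketch below follows the general E.164 shape (a leading `+`, a non-zero first digit, at most 15 digits). That pattern is an assumption about what a sensible number looks like, not a documented ASM validation rule, and `has_country_prefix` is a hypothetical helper.

```python
import re

# "+", a non-zero first digit, then 6 to 14 further digits (15 digits at most).
INTERNATIONAL_NUMBER = re.compile(r"^\+[1-9]\d{6,14}$")


def has_country_prefix(number: str) -> bool:
    """Return True if the number looks like +<country code><subscriber number>."""
    compact = re.sub(r"[\s\-()]", "", number)  # drop spaces, dashes, parentheses
    return bool(INTERNATIONAL_NUMBER.match(compact))


numbers = ["+1 202 555 0143", "+46 70 123 45 67", "070 123 45 67"]
missing_prefix = [n for n in numbers if not has_country_prefix(n)]
```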
Create Target
To add a Text Message target:
Click the Text Message button
Enter a Target Name for identification in Synthetic Monitoring
Enter a list of Phone Numbers to send the alert to
If you want to use a custom message:
Uncheck Use Default Message
Enter an alert Message (you can use Message Placeholders)
Click the Add SMS Target button
The Target is created, containing the selected user/targets.
PagerDuty Alerts
With the PagerDuty Integration, you can have alerts delivered through the PagerDuty platform, offering a rich set of notification delivery options. You need to set up the PagerDuty Integration before you can create a PagerDuty target.
You need to create the users and groups before you can add Targets to them.
To add a PagerDuty target, click the PagerDuty button
Open the Service menu
Choose the desired service
Enter a Target Name for identification in Synthetic Monitoring
Click the Add PagerDuty Target button
The Target is created, containing the selected user/targets.
Other Alert Types
For instructions on how to configure other alert types, refer to the article Configuring Webhook Alerts.
Understanding and Configuring Placeholders
A placeholder is a character, word, or string of characters that temporarily takes the place of the final data.
For example, an operations manager may know that an alert needs to include certain metrics or variables, but cannot enter their values in advance because they are returned dynamically from the monitoring results. A placeholder stands in for each value until the alert (or message) is generated and the actual value is filled in.
At Apica, we use placeholders in the following manners:
Alerts and Messages: used when a customer wants to be alerted about the state of a monitoring check, that is, when a threshold has been passed or a specified set of conditions has been met. When this happens, a message is generated and delivered as an email or SMS text, through an alerting service such as PagerDuty, or POSTed to a Webhook-enabled service that ingests this information.
Webhooks: When a business service needs to ingest information about an Apica monitoring check (status, alert, message), the Webhook, as a push service, lets it receive this information passively. A set of alert integration placeholders has been defined for this purpose and can be customized to your service's needs.
Depending on the purpose, one of two delimiter characters denotes an Apica placeholder.
Message and Alert placeholders are each surrounded by the % character.
Webhook placeholders are each surrounded by the # character.
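The two delimiters make it possible to tell the placeholder kinds apart mechanically. The sketch below lists the placeholders found in a template, which can be handy for reviewing a custom message before saving it. The regular expressions and the `find_placeholders` helper are assumptions made for illustration; ASM's own parsing rules are not documented here.

```python
import re

# Names may contain letters, digits, "_", "-" and ":" (e.g. %UTC-T%, %xmlsafe:CHECK_DESCRIPTION%).
MESSAGE_PLACEHOLDER = re.compile(r"%([A-Za-z][A-Za-z0-9_:\-]*)%")
WEBHOOK_PLACEHOLDER = re.compile(r"#([A-Za-z][A-Za-z0-9_\-]*)#")


def find_placeholders(template: str) -> dict[str, list[str]]:
    """Group the placeholder names in a template by their delimiter."""
    return {
        "message": MESSAGE_PLACEHOLDER.findall(template),
        "webhook": WEBHOOK_PLACEHOLDER.findall(template),
    }


found = find_placeholders("Check %CHECK_NAME% is %SEV%, see #url#")
```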
Message Placeholders
When Apica sends out an alert through its Alerter service, it uses a set of placeholders, required by the various destination targets, to refer to pieces of information associated with the events that triggered (or resolved) the alert.
Placeholders therefore provide a way to customize the layout and contents of email (SMTP) and SMS messages, and to provide event-based information in a POST body to Webhook-enabled services such as OpsGenie or ServiceNow.
A placeholder has the following format:
%placeholder-name%
There is a set of predefined placeholders configured in ASM:
Placeholder | Meaning | Example |
---|---|---|
Event-based Placeholders | ||
%E% | Event symbol. For check-based events, this is the CheckConfig.check_symbol value. | N84_M377_C1000_URL_20090227_013715_307 |
%M% | Event message text. | Message |
%N% | The NETBIOS name of the host that is the source of the event. | Node |
%QM% | Event message text with any double-quotes (") replaced by single-quotes ('). | Message |
%S% | Event severity as one upper-case character, I, W, E, or F. | Severity |
%SEV% | Event severity as one word, Info, Warning, Error, or Fatal. | Severity |
%T% | Agent-local timestamp. Format YYYY-MM-DD HH:MM:SS. | Timestamp |
%UTC-T% | UTC timestamp with a 'T' between the date and time portions. Format YYYY-MM-DDTHH:MM:SS. | Timestamp (UTC) |
%UTC% | The timestamp of the event is expressed in UTC. Format YYYY-MM-DD HH:MM:SS. | Timestamp (UTC) |
For Check-based Events, Use the Following Placeholders | ||
%CHECK_ID% | Check id (32-bit positive integer from CheckConfig.id) | |
%CHECK_GUID% | Check GUID. | A UUID from CheckConfig.check_guid, in the format XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX |
%CHECK_NAME% | Check descriptor (from CheckConfig.check_descriptor) | |
%CHECK_TYPE% | Check type (based on CheckConfig.check_type) | URL, Command, Ping, Port, Scenario, Fullpage (IE), or Fullpage |
%MODULE_NAME% | Check module descriptor (from NodeModule.amm_descriptor) | |
%OLD_SEV_CHAR% | Previous check severity as an uppercase letter | (I, W, E, or F) |
%NEW_SEV_CHAR% | Current check severity as an uppercase letter | (I, W, E, or F) |
%OLD_SEV_WORD% | Previous check severity as a word | (Info, Warning, Error or Fatal) |
%NEW_SEV_WORD% | Current check severity as a word | (Info, Warning, Error or Fatal) |
%RESULT_GUID% | Check Result UUID without dashes. Replaced by an empty string if no result identifier is part of the event. | a8e59d718fa949cb86c9ccfc93ff1876 |
%RESULT_G-U-I-D% | Check Result UUID with dashes. Replaced by an empty string if no result identifier is part of the event. | a8e59d71-8fa9-49cb-86c9-ccfc93ff1876 |
%TT% | Timestamp adjusted to the timezone of the current dispatch target (maybe based on user/customer). Falls back to UTC. | Format YYYY-MM-DD HH:MM:SS (TZ-offset) or YYYY-MM-DD HH:MM:SS if UTC. |
%CHECK_TAGS% | A set of Key, Value pairs assigned to the check. | "Key 1: Value 1, Value 2, Value 3; Key 2: Value 1, Value 2, Value 3" |
Placeholders that may be available if the Alerter uses a check information cache | ||
%CHECK_DESCRIPTION% | Check description (from CheckConfig.check_description). For CLI-targets, any embedded carriage return/newline (CR/LF) character combinations (\r\n) are replaced by a space, then the remaining CR and LF are replaced by empty strings. | |
%xmlsafe:CHECK_DESCRIPTION% | Check description (from CheckConfig.check_description) with any XML-unsafe characters replaced by character entities. | e.g. & -> &amp;. Same rules apply for embedded CR and LF as for %CHECK_DESCRIPTION%. |
%GROUPS% | List of monitor groups to which the check belongs. A comma-separated list of "top group/subgroup" entries. Since a check can be associated with more than one monitor group (possibly belonging to different users), the list can contain more than one entry. | |
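The %CHECK_TAGS% value in the table above arrives as one string in the form "Key 1: Value 1, Value 2; Key 2: Value 1". A receiving service may want it back as structured data. The following sketch splits that string into a dictionary; `parse_check_tags` is a hypothetical helper, and it assumes that keys and values do not themselves contain `;`, `:` or `,`.

```python
def parse_check_tags(text: str) -> dict[str, list[str]]:
    """Turn 'Key 1: A, B; Key 2: C' into {'Key 1': ['A', 'B'], 'Key 2': ['C']}."""
    tags: dict[str, list[str]] = {}
    for entry in text.split(";"):
        if not entry.strip():
            continue  # tolerate an empty string or a trailing ';'
        key, _, values = entry.partition(":")
        tags[key.strip()] = [v.strip() for v in values.split(",")]
    return tags


tags = parse_check_tags("Key 1: Value 1, Value 2, Value 3; Key 2: Value 1, Value 2, Value 3")
```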
Event-Related Placeholders
Placeholder | Description | Example |
---|---|---|
%E% | Event symbol. For check-based events, this is the CheckConfig.check_symbol value. | |
%M% | Event message text. | |
%QM% | Event message text with any double-quotes (") replaced by single-quotes ('). | |
%S% | Event severity as one upper-case character, I, W, E, or F. | |
%SEV% | Event severity as one word, Info, Warning, Error, or Fatal. | |
%UTC% | The timestamp of the event is expressed in UTC. | Format YYYY-MM-DD HH:MM:SS. |
%UTC-T% | UTC timestamp with a 'T' between the date and time portions. | Format YYYY-MM-DDTHH:MM:SS. |
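To make the difference between the severity and timestamp variants in the table concrete, the sketch below produces example values for %S%, %SEV%, %UTC% and %UTC-T% from one event. It only mirrors the formats listed above; `event_values` is a hypothetical helper and not how the Alerter service is implemented.

```python
from datetime import datetime, timezone

SEVERITY_WORDS = {"I": "Info", "W": "Warning", "E": "Error", "F": "Fatal"}


def event_values(severity_char: str, when: datetime) -> dict[str, str]:
    """Build example values for the event-related placeholders."""
    utc = when.astimezone(timezone.utc)
    return {
        "S": severity_char,                              # one upper-case character
        "SEV": SEVERITY_WORDS[severity_char],            # one word
        "UTC": utc.strftime("%Y-%m-%d %H:%M:%S"),        # YYYY-MM-DD HH:MM:SS
        "UTC-T": utc.strftime("%Y-%m-%dT%H:%M:%S"),      # 'T' between date and time
    }


values = event_values("E", datetime(2022, 7, 5, 18, 54, 39, tzinfo=timezone.utc))
```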
Check-related Placeholders
Placeholder | Description | Example |
---|---|---|
%CHECK_DESCRIPTION% | Check description. For CLI-targets, any embedded carriage return/newline (CR/LF) character combinations (\r\n) are replaced by a space, then the remaining CR and LF are replaced by empty strings. | |
%CHECK_GUID% | Check GUID. | A UUID in the format XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX. |
%CHECK_ID% | Check ID. | |
%CHECK_NAME% | Check name. | |
%CHECK_TAGS% | A set of Key: Value pair(s) assigned to the check, with the format: "Key 1: Value 1, Value 2, Value 3; Key 2: Value 1, Value 2, Value 3" | |
%CHECK_TYPE% | Check type. | URL, Command, Ping, Port, Scenario, Fullpage (IE), or Fullpage |
%GROUPS% | List of monitor groups to which the check belongs. | A comma-separated list of "top group/subgroup" entries. Since a check can be associated with more than one monitor group (possibly belonging to different users), the list can contain more than one entry. |
%LOCATION% | The location from which the check is executed. | |
%NEW_SEV_CHAR% | Current check severity as an uppercase letter. | (I, W, E or F) |
%NEW_SEV_WORD% | Current check severity as a word. | (Info, Warning, Error or Fatal) |
%OLD_SEV_CHAR% | Previous check severity as an uppercase letter. | (I, W, E or F) |
%OLD_SEV_WORD% | Previous check severity as a word. | (Info, Warning, Error or Fatal) |
%RESULT_G-U-I-D% | Check Result UUID with dashes. Replaced by an empty string if no result identifier is part of the event. | a8e59d71-8fa9-49cb-86c9-ccfc93ff1876 |
%RESULT_GUID% | Check Result UUID without dashes. Replaced by an empty string if no result identifier is part of the event. | a8e59d718fa949cb86c9ccfc93ff1876. |
%TT% | Timestamp adjusted to the timezone of the current dispatch target (maybe based on user/customer). Falls back to UTC. | Format YYYY-MM-DD HH:MM:SS (TZ-offset) or YYYY-MM-DD HH:MM:SS if UTC. |
%xmlsafe:CHECK_DESCRIPTION% | Check description with any XML-unsafe characters replaced by character entities. | e.g. & -> &amp;. Same rules apply for embedded CR and LF as for %CHECK_DESCRIPTION%. |
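The CR/LF and XML-escaping rules described for %CHECK_DESCRIPTION% and %xmlsafe:CHECK_DESCRIPTION% can be reproduced in a few lines, which is useful for predicting what a multi-line description will look like in a message. This is a sketch of the stated rules using Python's standard `xml.sax.saxutils.escape`. The helper names are hypothetical, and the exact set of characters Apica treats as XML-unsafe is an assumption.

```python
from xml.sax.saxutils import escape


def flatten_description(text: str) -> str:
    """Replace each CR/LF pair with a space, then drop any remaining CR or LF."""
    return text.replace("\r\n", " ").replace("\r", "").replace("\n", "")


def xmlsafe_description(text: str) -> str:
    """Flatten the text, then replace XML-unsafe characters with entities."""
    # escape() handles &, < and >; the extra mapping also covers quotes.
    return escape(flatten_description(text), {'"': "&quot;", "'": "&apos;"})


flat = flatten_description("line one\r\nline two\nend")
safe = xmlsafe_description("A & B <ok>")
```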
Webhook Placeholders
Default placeholders are surrounded by a pound/hashtag # character.
A default set of placeholders has been provided. These can be configured with the Webhook alert integration, or you may customize your Webhook placeholders as necessary.
Placeholder | Used For/Definition/Comment |
---|---|
| |
| Slack, ServiceNow |
| VictorOps, OpsGenie, Datadog |
| |
| |
| |
| |
| |
| HipChat, VictorOps |
| Splunk |
|
Defining Your Own Webhook Placeholders in Custom Webhooks
The default placeholders above should only be considered suggestions. It is also possible to define your own webhook placeholders, which will pull their value from some response content which comes back from an API call. These custom-defined placeholders are also surrounded by a pound/hashtag # character.
Consider the following example:
Here, after the main alert text is sent to Slack (via hooks.slack.com), a second URL call is made to https://api-wpm.apicasystem.com. It is a GET request which returns response data in JSON format:
One of the key/value pairs in the response is “url”. Although ASM asks for an XPath, you must provide the JSON path instead and ASM will find the value at the given path of the response and assign it to the custom placeholder you define. In other words, there is some manual translation which must be done in this instance - the URL property from the Postman screenshot above becomes /url in the webhook definition screenshot.
Essentially, in the above example, when the GET request is resolved, the value of #url# becomes whatever is found in the “url” property of the response body. The recipient of the alert will then immediately see that the check targets https://www.msn.com.
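The mapping from a response property to a custom placeholder can be sketched as follows: a slash-separated path such as `/url` is followed through the JSON response, and the value found there replaces `#url#` in the message. The helper names and the sample response are hypothetical; this illustrates the idea and is not ASM's actual resolver.

```python
import json


def value_at_path(document: dict, path: str):
    """Follow a slash-separated path such as '/url' through nested JSON objects."""
    node = document
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


def fill_webhook_placeholders(template: str, response_body: str, paths: dict[str, str]) -> str:
    """Replace each #name# with the value found at its path in the JSON response."""
    document = json.loads(response_body)
    for name, path in paths.items():
        template = template.replace(f"#{name}#", str(value_at_path(document, path)))
    return template


# Illustrative response body; only the "url" property is taken from the example above.
response = '{"name": "MSN homepage", "url": "https://www.msn.com"}'
message = fill_webhook_placeholders("Check target: #url#", response, {"url": "/url"})
```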
Alert Configuration Examples
The following examples explain various advanced configurations. For explicit instructions on how to set up the Webhook configurations shown below, refer to the article Configuring Webhook Alerts.
API keys/dynamic tokens/etc. are always generated on the recipient alerting platform side. For instructions on how to generate these tokens, refer to the alerting platform’s API documentation.
If you have any questions about alert configuration, please send a support ticket to support@apica.io, and we will help you with the setup and testing process.
Custom Message:
The status changed to %SEV% from %OLD_SEV_WORD% %TT% (%UTC% UTC) for check "%CHECK_NAME%" (id %CHECK_ID% from %LOCATION% ). Message: %M% The check is run from %LOCATION%. http://wpm.apicasystem.com/check/details/%CHECK_ID%
The above email message uses message placeholders; see the Message Placeholders section.
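To preview how a custom message like the one above might read once the placeholders are filled in, the substitution can be simulated locally. The sketch below is an approximation for previewing templates: `render` is a hypothetical helper, and every sample value is invented for illustration.

```python
import re

TEMPLATE = (
    'The status changed to %SEV% from %OLD_SEV_WORD% %TT% (%UTC% UTC) '
    'for check "%CHECK_NAME%" (id %CHECK_ID% from %LOCATION%). Message: %M%'
)


def render(template: str, values: dict[str, str]) -> str:
    """Replace each %NAME% with its value; unknown names are left untouched."""
    def substitute(match: re.Match) -> str:
        return values.get(match.group(1), match.group(0))

    return re.sub(r"%([A-Za-z][A-Za-z0-9_:\-]*)%", substitute, template)


# Invented sample values, used only to preview the layout.
sample = {
    "SEV": "Error",
    "OLD_SEV_WORD": "Info",
    "TT": "2022-07-05 14:54:39 (GMT-04:00)",
    "UTC": "2022-07-05 18:54:39",
    "CHECK_NAME": "Homepage",
    "CHECK_ID": "4711",
    "LOCATION": "Check Location",
    "M": "Time was above upper limit",
}
rendered = render(TEMPLATE, sample)
print(rendered)
```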
OpsGenie
Host: https://api.opsgenie.com
API key: generated on the OpsGenie side
Message: %CHECK_NAME% Status Has Changed
Alias: %CHECK_NAME%
Description: a custom description for the alert which gives identifying information about the Alert.
Here is an example Description (some identifying information removed):
The status changed to *Error* (from Info) at *2022-07-05 14:54:39 (GMT-04:00)* for the check <https://wpm.apicasystem.com/BrowserResult/Details?checkId={checkId}&resultId={resultId}> Message: *Fullpage (FF) check 'test waitForText' failed [Error on 4 URL(s) Time (8272) was above upper limit (2000 ms)]* The check is run from *Check Location*.
Slack
URL: generated on the Slack side
Custom Webhook w/ message placeholder (Microsoft Teams)
For instructions on setting up the URL which will be used as both the “Trigger URL” and the “Resolution URL”, refer to the official Microsoft documentation here.
Trigger URL: https://apicasystem.webhook.office.com/webhookb2/{longGuidString}/IncomingWebhook/{longGuidString}
Resolution URL: https://apicasystem.webhook.office.com/webhookb2/{longGuidString}/IncomingWebhook/{longGuidString}
Data: For both the trigger and resolution request sequences, the placeholder %CHECK_ID% is used along with a custom message to inform the user that the alert has been triggered/resolved.
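As an illustration of what the Data field might contain, the sketch below builds a simple JSON body with a `text` property, the basic shape commonly accepted by Microsoft Teams incoming webhooks. The wording of the message and the `teams_payload` helper are assumptions; the %CHECK_ID% placeholder is left in place so that ASM can substitute it when the alert is sent.

```python
import json


def teams_payload(event: str) -> str:
    """Build a JSON body for the trigger or resolution request (hypothetical wording)."""
    text = f"Alert {event} for check %CHECK_ID%"  # %CHECK_ID% is filled in by ASM
    return json.dumps({"text": text})


trigger_data = teams_payload("triggered")
resolution_data = teams_payload("resolved")
```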
Example alert trigger/resolution within a configured Microsoft Teams channel:
Custom Webhook w/ message placeholder (OpsGenie)
Trigger URL: https://api.opsgenie.com/v2/alerts?apiKey={opsGenieApiKey} (opsGenieApiKey is generated on the OpsGenie end)
Data:
{ "message":"%CHECK_NAME%", "alias":"%CHECK_ID%", "description":"%CHECK_DESCRIPTION%", "responders":[ { "name":"SAT", "type":"team" } ], "visibleTo":[ { "name":"SAT", "type":"team" } ], "priority":"P1", "user":"ASM" }
This JSON data is an example which we’ve created based on your current OpsGenie integration. The payload can be customized by referring to this documentation: https://docs.opsgenie.com/docs/alert-api#create-alert
%CHECK_NAME%, %CHECK_ID%, and %CHECK_DESCRIPTION% are message placeholders. Alias must be %CHECK_ID% in order for the resolution request to function correctly.
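When testing the integration outside ASM, it can help to construct the same trigger request by hand with the placeholder values already filled in. The sketch below builds the request with Python's standard library but does not send it. The function name and argument values are hypothetical; the URL and JSON fields follow the example above, with the responders omitted for brevity.

```python
import json
import urllib.request


def build_trigger_request(api_key: str, check_name: str, check_id: str,
                          description: str) -> urllib.request.Request:
    """Build (but do not send) the create-alert POST with placeholders resolved."""
    payload = {
        "message": check_name,
        "alias": check_id,          # must match the identifier used when closing
        "description": description,
        "priority": "P1",
        "user": "ASM",
    }
    return urllib.request.Request(
        url=f"https://api.opsgenie.com/v2/alerts?apiKey={api_key}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_trigger_request("KEY", "Homepage", "4711", "Homepage availability")
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch can be run without a real API key.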
Resolve URL: https://api.opsgenie.com/v2/alerts/%CHECK_ID%/close?identifierType=alias&apiKey={opsGenieApiKey} (opsGenieApiKey is generated on the OpsGenie end)
%CHECK_ID% is a placeholder that is replaced with the check ID of the check, which has also been set as the Alias of the alert. OpsGenie allows alerts to be closed by Alias, which we defined to be the Check ID in the Trigger Sequence request body. Thus, by using the Check ID both as the alias of the alert and as the identifier in the Resolve URL, we are able to dynamically identify and close the alert we created before. “identifierType=alias” is required.
Data (resolution request):
{"user":"ASM"}
The request body of the resolution request cannot be empty. Thus, at least one property must be present in the JSON body, even though OpsGenie does not require any data for this POST call. It is possible to add a resolution note as well; see https://docs.opsgenie.com/docs/alert-api#close-alert for more details concerning closing alerts.
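The resolution call can be sketched the same way: the check ID that was used as the alias goes into the URL path, `identifierType=alias` goes into the query string, and the body carries at least one property. `build_close_call` is a hypothetical helper and the values are placeholders.

```python
import json
from urllib.parse import quote, urlencode


def build_close_call(check_id: str, api_key: str) -> tuple[str, str]:
    """Return the close URL and the non-empty JSON body for the resolution request."""
    query = urlencode({"identifierType": "alias", "apiKey": api_key})
    url = f"https://api.opsgenie.com/v2/alerts/{quote(check_id, safe='')}/close?{query}"
    body = json.dumps({"user": "ASM"})  # at least one property, as noted above
    return url, body


close_url, close_body = build_close_call("4711", "KEY")
```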
Custom Webhook w/ message placeholder, a custom placeholder (Slack)
Response Parameters: although the UI asks for the XPath for finding XML, it is also capable of finding JSON properties with the XPath. In this instance, we want to capture the value of the URL property of the HTTP response of our Sub Request (the api-wpm.apicasystem.com request).
As such, we will use #url# to display the “url” property of the api-wpm API call in our resolution message.
Trigger Request: the initial API call we make when the Alert is Triggered. Generated from the Slack end. Example: https://hooks.slack.com/services/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
Data: Note that the data contains a Message Placeholder, %CHECK_NAME%, which allows us to display dynamic data in the Slack message body.
Severity: We can have different Payload data (that is, different request messages) for different severities if we so desire. That is not needed for our use case, so we will use one payload for Warning, Error, and Fatal.
Sub Requests: After the initial request to https://hooks.slack.com is made, we want to get ADDITIONAL information about the check from an API call which gives us more check data. We store that data in the Response Parameter we explained above. Example: https://api-wpm.apicasystem.com/v3/checks/{check_id}?auth_ticket={auth_ticket}
Resolution Request: the final API call we make when the Alert is Resolved. Note the usage of both Message and Webhook placeholders to give our resolution message that dynamic data we need. Generated from the Slack end. Example: https://hooks.slack.com/services/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
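The order of the three calls in this configuration can be simulated offline. In the sketch below, the sub request is represented by an injected `fetch_json` function so that nothing is actually sent. The message texts, the `run_sequence` helper, and the fake response are all hypothetical, and the `text` property is the basic shape Slack incoming webhooks accept.

```python
import json
from typing import Callable


def run_sequence(check_name: str, sub_request_url: str,
                 fetch_json: Callable[[str], dict]) -> tuple[str, str]:
    """Return the trigger and resolution bodies that would be POSTed to Slack."""
    # 1. Trigger request: only a message placeholder is needed.
    trigger = "Alert triggered for %CHECK_NAME%".replace("%CHECK_NAME%", check_name)

    # 2. Sub request: fetch extra check data and keep the "url" property (#url#).
    details = fetch_json(sub_request_url)
    url_value = details["url"]

    # 3. Resolution request: message and webhook placeholders combined.
    resolution = ("Alert resolved for %CHECK_NAME%, target #url#"
                  .replace("%CHECK_NAME%", check_name)
                  .replace("#url#", url_value))
    return json.dumps({"text": trigger}), json.dumps({"text": resolution})


def fake_fetch(url: str) -> dict:
    """Stand-in for the sub request; returns an invented response."""
    return {"url": "https://www.msn.com"}


trigger_body, resolution_body = run_sequence(
    "Homepage", "https://api-wpm.apicasystem.com/v3/checks/{check_id}", fake_fetch)
```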
Here are examples of the request/resolution messages that are sent to Slack based on the above configuration: