Monitor Armory Continuous Deployment with Prometheus

Monitor Armory Continuous Deployment using Prometheus and Grafana.

Overview

Armory recommends monitoring the health of Armory Continuous Deployment in every production instance. This document describes how to set up a basic Prometheus and Grafana stack as well as enable monitoring for the Armory Continuous Deployment services.

Additional Prometheus and Grafana configuration is necessary to make them production-grade, but that configuration is outside the scope of this document. Monitoring the Pipelines-as-Code service (Dinghy) and the Terraform Integration service (Terraformer) is also not covered on this page.

Before you begin

Use kube-prometheus to create a monitoring stack

You can skip this section if you already have a monitoring stack.

A quick and easy way to configure a cluster monitoring solution is to use kube-prometheus. This project creates a monitoring stack that includes cluster monitoring with Prometheus and dashboards with Grafana.

To create the stack, follow the kube-prometheus quick start instructions beginning with the Compatibility Matrix section.

After you complete the instructions, you have pods running in the monitoring namespace:

% kubectl get pods --namespace monitoring

NAME                                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0                   2/2     Running   0          44s
alertmanager-main-1                   2/2     Running   0          44s
alertmanager-main-2                   2/2     Running   0          44s
grafana-77978cbbdc-x5rsq              1/1     Running   0          40s
kube-state-metrics-7f6d7b46b4-crzx2   3/3     Running   0          40s
node-exporter-nrc88                   2/2     Running   0          41s
prometheus-adapter-68698bc948-bl7p8   1/1     Running   0          40s
prometheus-k8s-0                      3/3     Running   1          39s
prometheus-k8s-1                      3/3     Running   1          39s
prometheus-operator-6685db5c6-qfpbj   1/1     Running   0          106s

Access the Prometheus web interface by using the kubectl port-forward command. If you want to expose this interface for others to use, create an ingress service. Make sure you enable security controls that follow Prometheus best practices.

% kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090 &

Navigate to http://localhost:9090/targets.

Grant Prometheus RBAC permissions

There are two steps to configure Prometheus to monitor Armory Continuous Deployment:

  • Add permissions for Prometheus to talk to the Spinnaker namespace
  • Configure Prometheus to discover the Armory Continuous Deployment endpoints

Add permissions for Prometheus by applying the following configuration to your cluster. You can learn more about this process on the Prometheus Operator homepage.

Example config:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  # name can be either prometheus or prometheus-k8s depending on the version of the prometheus-operator
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  # name can be either prometheus or prometheus-k8s depending on the version of the prometheus-operator
  name: prometheus
subjects:
  - kind: ServiceAccount
    # name can be either prometheus or prometheus-k8s depending on the version of the prometheus-operator
    name: prometheus-k8s
    namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # name can be either prometheus or prometheus-k8s depending on the version of the prometheus-operator
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: monitoring
  # name can be either prometheus or prometheus-k8s depending on the version of the prometheus-operator
  name: prometheus-k8s
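
One way to confirm that the binding took effect is to ask the API server whether the Prometheus service account can read the resources that service discovery depends on. The following is a minimal sketch, not an official procedure. It assumes the service account is prometheus-k8s in the monitoring namespace (as in the example above), that Armory Continuous Deployment runs in the spinnaker namespace, and that your own user is allowed to impersonate service accounts. The function name is hypothetical.

```shell
# Assumptions: these values match the example manifest; adjust them for your cluster.
SERVICE_ACCOUNT="system:serviceaccount:monitoring:prometheus-k8s"
TARGET_NAMESPACE="spinnaker"

# Print "yes" or "no" for each permission Prometheus needs to discover targets.
check_prometheus_rbac() {
  for resource in services endpoints pods; do
    for verb in get list watch; do
      # "kubectl auth can-i" exits non-zero when the answer is "no",
      # so "|| true" keeps the loop going and still captures the answer.
      answer=$(kubectl auth can-i "$verb" "$resource" \
        --namespace "$TARGET_NAMESPACE" --as "$SERVICE_ACCOUNT" 2>/dev/null || true)
      echo "$verb $resource: ${answer:-unknown}"
    done
  done
}
```

After applying the manifest with kubectl apply, run check_prometheus_rbac. Every line should end in "yes". A "no" usually means the ClusterRoleBinding names do not match the names your Prometheus Operator version uses.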

Add the ServiceMonitor

Prometheus Operator uses a “ServiceMonitor” to add targets that get scraped for monitoring. The following example config shows how to monitor pods that are using the Observability Plugin to expose the aop-prometheus endpoint. Note that the example contains both the exclusion of certain services (such as Redis) and changes to the Gate endpoint to show you different options.

These are examples of potential configurations. Use them as a starting point. Armory recommends that you understand how they operate and find services. Adapt them to your environment.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: spin
    # This label is here to match the prometheus operator serviceMonitorSelector attribute
    # prometheus.prometheusSpec.serviceMonitorSelector. For more information, see
    # https://github.com/helm/charts/tree/master/stable/prometheus-operator
    release: prometheus-operator
  name: spinnaker-all-metrics
  namespace: spinnaker
spec:
  endpoints:
  - interval: 10s
    path: /aop-prometheus
  selector:
    matchExpressions:
    - key: cluster
      operator: NotIn
      values:
      - spin-gate
      - spin-gate-api
      - spin-gate-custom
      - spin-deck
      - spin-deck-custom
      - spin-redis
      - spin-terraformer
      - spin-dinghy
    matchLabels:
      app: spin

The example excludes Gate, the API service, because Gate restricts access to its endpoints (other than health) unless the request is authenticated.
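
To preview which Services the selector in the example would pick up, the same label logic can be written as a kubectl label selector. This is a sketch that assumes your Services carry the app and cluster labels used in the example and live in the spinnaker namespace. The exclusion list mirrors the one in the ServiceMonitor, and the function name is hypothetical.

```shell
# Label selector equivalent to the ServiceMonitor's matchLabels plus its NotIn expression.
EXCLUDED_CLUSTERS="spin-gate,spin-gate-api,spin-gate-custom,spin-deck,spin-deck-custom,spin-redis,spin-terraformer,spin-dinghy"
SELECTOR="app=spin,cluster notin (${EXCLUDED_CLUSTERS})"

# List the Services that the selector matches, together with their labels.
list_monitored_services() {
  kubectl get services --namespace spinnaker --selector "$SELECTOR" --show-labels
}
```

Running list_monitored_services before you apply the ServiceMonitor shows whether the exclusions behave as intended. As with NotIn, the notin operator also matches Services that have no cluster label at all.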

After you apply the ServiceMonitors in this section, port forward Prometheus and validate that it has discovered and scraped the expected targets.

The following example is a ServiceMonitor for Gate that scrapes a different path and uses TLS.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: spinnaker-internal-metrics
  namespace: spinnaker
  labels:
    app: spin
    # This label is here to match the prometheus operator serviceMonitorSelector attribute
    # prometheus.prometheusSpec.serviceMonitorSelector
    # https://github.com/helm/charts/tree/master/stable/prometheus-operator
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      cluster: spin-gate
  endpoints:
  - interval: 10s
    path: "/api/v1/aop-prometheus"
    # If Prometheus returns the error "http: server gave HTTP response to HTTPS client" then
    # replace scheme with targetPort:
    # Note that "port" is string only. "targetPort" is integer or string.
    # For example, targetPort: 8084
    scheme: "https"
    tlsConfig:
      insecureSkipVerify: true

Check for Armory Continuous Deployment targets in Prometheus

After you apply these changes, Armory Continuous Deployment targets should appear in Prometheus. It may take 3 to 5 minutes for them to show up, depending on where Prometheus is in its config polling interval.
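
As an alternative to the web interface, the Prometheus HTTP API can report the scrape state of the new targets. The following sketch assumes the port-forward to localhost:9090 from earlier is still running and that curl is installed. It also assumes the scraped series carry a namespace label set to spinnaker, which is the usual result of the Prometheus Operator's relabeling for ServiceMonitor targets; adjust the label if your setup differs. The function name is hypothetical.

```shell
# Assumption: Prometheus is reachable through the port-forward shown earlier.
PROMETHEUS_URL="${PROMETHEUS_URL:-http://localhost:9090}"

# Run an instant query for the "up" series of every target in the spinnaker namespace.
# In the JSON response, a value of "1" means the last scrape succeeded and "0" means it failed.
query_spinnaker_targets() {
  curl -fsS -G "${PROMETHEUS_URL}/api/v1/query" \
    --data-urlencode 'query=up{namespace="spinnaker"}'
}
```

Call query_spinnaker_targets to print the raw JSON response. An empty result list suggests that Prometheus has not yet picked up the ServiceMonitor or that the selector matches no Services.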

Figure: Prometheus Targets

Access Grafana

Configure port forwarding for Grafana:

$ kubectl --namespace monitoring port-forward svc/grafana 3000

Access the Grafana web interface via http://localhost:3000 and use the default Grafana username and password of admin:admin.

Add Armory dashboards to Grafana

Armory provides some sample dashboards (in JSON format) that you can import into Grafana as a starting point for metrics to graph for monitoring. Armory has additional dashboards that are available to Armory customers. You can skip this section if you are a Grafana expert.

To import the sample dashboards, perform the following steps:

  1. Clone the repo at https://github.com/uneeq-oss/spinnaker-mixin to your local workstation
  2. Access the Grafana web interface (as shown above)
  3. Navigate to Dashboards then Manage
  4. Click on the Import button
  5. Upload one or more of the sample dashboard files from the repo you cloned

After importing the dashboards, you can explore graphs for each service by clicking on Dashboards > Manage > Spinnaker Kubernetes Details.

Figure: Grafana Dashboard

Available metrics by service

Disclaimer: the following tables may not contain every available metric for each service.

Clouddriver

Metric Name | Base Unit | Description
amazonClientProvider_rateLimitDelayMillis | |
authorization | |
aws_request_clientExecuteTime | milliseconds |
aws_request_credentialsRequestTime | milliseconds |
aws_request_httpClientReceiveResponseTime | milliseconds |
aws_request_httpClientSendRequestTime | milliseconds |
aws_request_httpRequestTime | milliseconds |
aws_request_requestCount | |
aws_request_requestMarshallTime | milliseconds |
aws_request_requestSigningTime | milliseconds |
aws_request_responseProcessingTime | milliseconds |
aws_request_retryPauseTime | milliseconds |
aws_request_throttling | |
awsSdkClientSupplier_averageLoadPenalty | |
awsSdkClientSupplier_hitCount | |
awsSdkClientSupplier_loadExceptionCount | |
awsSdkClientSupplier_missRate | |
cats_sqlCache_evict_deleteOperations | |
cats_sqlCache_evict_itemCount | |
cats_sqlCache_evict_itemsDeleted | |
cats_sqlCache_get_itemCount | |
cats_sqlCache_get_relationshipsRequested | |
cats_sqlCache_get_requestedSize | |
cats_sqlCache_get_selectOperations | |
cats_sqlCache_merge_deleteOperations | |
cats_sqlCache_merge_itemCount | |
cats_sqlCache_merge_itemsStored | |
cats_sqlCache_merge_relationshipCount | |
cats_sqlCache_merge_relationshipsStored | |
cats_sqlCache_merge_selectOperations | |
cats_sqlCache_merge_writeOperations | |
cf_okhttp_requests | milliseconds | Timer of OkHttp operation
controller_invocations | |
controller_invocations_contentLength | |
controller_invocations_contentLength_summary | |
executionTime | milliseconds |
health_kubernetes_errors | |
http_server_requests | milliseconds |
jvm_buffer_count | buffers | An estimate of the number of buffers in the pool
jvm_gc_pause | milliseconds | Time spent in GC pause
jvm_memory_committed | bytes | The amount of memory in bytes that is committed for the Java virtual machine to use
jvm_memory_max | bytes | The maximum amount of memory in bytes that can be used for memory management
jvm_threads_daemon | threads | The current number of live daemon threads
jvm_threads_peak | threads | The peak live thread count since the Java virtual machine started or peak was reset
jvm_threads_states | threads | The current number of threads having BLOCKED state
kubernetes_api | milliseconds |
logback_events | events | Number of debug level events that made it to the logs
onDemand_cache | milliseconds |
onDemand_count | |
onDemand_error | |
onDemand_evict | milliseconds |
onDemand_read | milliseconds |
onDemand_store | milliseconds |
onDemand_total | milliseconds |
onDemand_transform | milliseconds |
operations | milliseconds |
orchestrations | milliseconds |
process_files_max | files | The maximum file descriptor count
reservedInstances_surplusByAccountClassic | |
reservedInstances_surplusByAccountVpc | |
reservedInstances_surplusOverall | |
resilience4j_retry_calls | | The number of failed calls after a retry attempt
sql_cacheCleanupAgent_dataTypeCleanupDuration | milliseconds |
sql_cacheCleanupAgent_dataTypeRecordsDeleted | |
sql_healthProvider_invocations | |
sql_taskCleanupAgent_deleted | |
sql_taskCleanupAgent_timing | milliseconds |
system_load_average_1m | | The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
tasks | |
tomcat_sessions_active_current | sessions |
tomcat_sessions_expired | sessions |
tomcat_sessions_rejected | sessions |

Echo

Metric Name | Base Unit | Description
aws_request_httpClientGetConnectionTime | milliseconds |
controller_invocations | |
controller_invocations_contentLength | |
controller_invocations_contentLength_summary | |
echo_events_processed | |
echo_triggers_sync_executionTimeMillis | milliseconds |
fiat_enabled | |
fiat_getPermission | |
fiat_legacyFallback_enabled | |
fiat_permissionsCache_evictions | |
fiat_permissionsCache_evictions-weight | |
fiat_permissionsCache_hits | |
fiat_permissionsCache_loads | milliseconds |
fiat_permissionsCache_loads-failure | |
fiat_permissionsCache_loads-success | |
fiat_permissionsCache_misses | |
front50_lastPoll | |
front50_requests | |
http_server_requests | milliseconds |
jvm_buffer_count | buffers | An estimate of the number of buffers in the pool
jvm_buffer_memory_used | bytes | An estimate of the memory that the Java virtual machine is using for this buffer pool
jvm_buffer_total_capacity | bytes | An estimate of the total capacity of the buffers in this pool
jvm_classes_loaded | classes | The number of classes that are currently loaded in the Java virtual machine
jvm_classes_unloaded | classes | The total number of classes unloaded since the Java virtual machine has started execution
jvm_gc_allocationRate | |
jvm_gc_live_data_size | bytes | Size of old generation memory pool after a full GC
jvm_gc_liveDataSize | |
jvm_gc_max_data_size | bytes | Max size of old generation memory pool
jvm_gc_maxDataSize | |
jvm_gc_memory_allocated | bytes | Incremented for an increase in the size of the young generation memory pool after one GC to before the next
jvm_gc_memory_promoted | bytes | Count of positive increases in the size of the old generation memory pool before GC to after GC
jvm_gc_pause | milliseconds | Time spent in GC pause
jvm_gc_promotionRate | |
jvm_memory_committed | bytes | The amount of memory in bytes that is committed for the Java virtual machine to use
jvm_memory_max | bytes | The maximum amount of memory in bytes that can be used for memory management
jvm_memory_used | bytes | The amount of used memory
jvm_threads_daemon | threads | The current number of live daemon threads
jvm_threads_live | threads | The current number of live threads including both daemon and non-daemon threads
jvm_threads_peak | threads | The peak live thread count since the Java virtual machine started or peak was reset
jvm_threads_states | threads | The current number of threads having NEW state
logback_events | events | Number of info level events that made it to the logs
okhttp_requests | milliseconds |
orca_requests | |
orca_trigger_success | |
pipelines_triggered | |
process_cpu_usage | | The recent cpu usage for the Java Virtual Machine process
process_files_max | files | The maximum file descriptor count
process_files_open | files | The open file descriptor count
process_start_time | milliseconds | Start time of the process since unix epoch
process_uptime | milliseconds | The uptime of the Java virtual machine
quietPeriod_tests | |
resilience4j_circuitbreaker_buffered_calls | | The number of buffered failed calls stored in the ring buffer
resilience4j_circuitbreaker_calls | milliseconds | Total number of calls which failed but the exception was ignored
resilience4j_circuitbreaker_failure_rate | | The failure rate of the circuit breaker
resilience4j_circuitbreaker_slow_call_rate | | The slow call of the circuit breaker
resilience4j_circuitbreaker_state | | The states of the circuit breaker
system_cpu_count | | The number of processors available to the Java virtual machine
system_load_average_1m | | The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
tomcat_sessions_active_current | sessions |
tomcat_sessions_active_max | sessions |
tomcat_sessions_alive_max | milliseconds |
tomcat_sessions_created | sessions |
tomcat_sessions_expired | sessions |
tomcat_sessions_rejected | sessions |

Fiat

Metric Name | Base Unit | Description
controller_invocations | |
controller_invocations_contentLength | |
controller_invocations_contentLength_summary | |
fiat_getUserPermission | |
fiat_userRoles_syncAnonymous | milliseconds |
fiat_userRoles_syncCount | |
fiat_userRoles_syncTime | milliseconds |
fiat_userRoles_syncUsers | milliseconds |
http_server_requests | milliseconds |
jvm_buffer_count | buffers | An estimate of the number of buffers in the pool
jvm_buffer_memory_used | bytes | An estimate of the memory that the Java virtual machine is using for this buffer pool
jvm_buffer_total_capacity | bytes | An estimate of the total capacity of the buffers in this pool
jvm_classes_loaded | classes | The number of classes that are currently loaded in the Java virtual machine
jvm_classes_unloaded | classes | The total number of classes unloaded since the Java virtual machine has started execution
jvm_gc_allocationRate | |
jvm_gc_live_data_size | bytes | Size of old generation memory pool after a full GC
jvm_gc_liveDataSize | |
jvm_gc_max_data_size | bytes | Max size of old generation memory pool
jvm_gc_maxDataSize | |
jvm_gc_memory_allocated | bytes | Incremented for an increase in the size of the young generation memory pool after one GC to before the next
jvm_gc_memory_promoted | bytes | Count of positive increases in the size of the old generation memory pool before GC to after GC
jvm_gc_pause | milliseconds | Time spent in GC pause
jvm_gc_promotionRate | |
jvm_memory_committed | bytes | The amount of memory in bytes that is committed for the Java virtual machine to use
jvm_memory_max | bytes | The maximum amount of memory in bytes that can be used for memory management
jvm_memory_used | bytes | The amount of used memory
jvm_threads_daemon | threads | The current number of live daemon threads
jvm_threads_live | threads | The current number of live threads including both daemon and non-daemon threads
jvm_threads_peak | threads | The peak live thread count since the Java virtual machine started or peak was reset
jvm_threads_states | threads | The current number of threads having TERMINATED state
kork_lock_acquire | |
kork_lock_acquire_duration | |
kork_lock_heartbeat | |
kork_lock_release | |
logback_events | events | Number of debug level events that made it to the logs
okhttp_requests | milliseconds |
permissionsRepository_get1_invocations | |
permissionsRepository_get1_timing | |
permissionsRepository_getAllById_invocations | |
permissionsRepository_getAllById_timing | |
permissionsRepository_put1_invocations | |
permissionsRepository_put1_timing | |
permissionsRepository_putAllById1_invocations | |
permissionsRepository_putAllById1_timing | |
process_cpu_usage | | The recent cpu usage for the Java Virtual Machine process
process_files_max | files | The maximum file descriptor count
process_files_open | files | The open file descriptor count
process_start_time | milliseconds | Start time of the process since unix epoch
process_uptime | milliseconds | The uptime of the Java virtual machine
redis_command_invocation_del | |
redis_command_invocation_eval | |
redis_command_invocation_get | |
redis_command_invocation_hgetAll | |
redis_command_invocation_hmset | |
redis_command_invocation_hscan | |
redis_command_invocation_pipelined | |
redis_command_invocation_rename | |
redis_command_invocation_sadd | |
redis_command_invocation_set | |
redis_command_invocation_sismember | |
redis_command_invocation_srem | |
redis_command_invocation_sscan | |
redis_command_invocation_time | |
redis_command_latency_del | |
redis_command_latency_eval | milliseconds |
redis_command_latency_get | milliseconds |
redis_command_latency_get | |
redis_command_latency_hgetAll | |
redis_command_latency_hmset | |
redis_command_latency_hscan | |
redis_command_latency_pipelined | |
redis_command_latency_rename | |
redis_command_latency_sadd | |
redis_command_latency_set | |
redis_command_latency_sismember | |
redis_command_latency_srem | |
redis_command_latency_sscan | |
redis_command_latency_time | |
redis_command_payloadSize_eval | |
redis_command_payloadSize_eval_summary | |
redis_command_payloadSize_sadd | |
redis_command_payloadSize_sadd_summary | |
redis_command_payloadSize_set | |
redis_command_payloadSize_set_summary | |
resilience4j_circuitbreaker_buffered_calls | | The number of buffered failed calls stored in the ring buffer
resilience4j_circuitbreaker_calls | milliseconds |
resilience4j_circuitbreaker_failure_rate | | The failure rate of the circuit breaker
resilience4j_circuitbreaker_slow_call_rate | | The slow call of the circuit breaker
resilience4j_circuitbreaker_state | | The states of the circuit breaker
resilience4j_retry_calls | | The number of failed calls after a retry attempt
system_cpu_count | | The number of processors available to the Java virtual machine
system_cpu_usage | | The recent cpu usage for the whole system
system_load_average_1m | | The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
tomcat_sessions_active_current | sessions |
tomcat_sessions_active_max | sessions |
tomcat_sessions_alive_max | milliseconds |
tomcat_sessions_created | sessions |
tomcat_sessions_expired | sessions |
tomcat_sessions_rejected | sessions |

Front50

Metric Name | Base Unit | Description
aws_request_clientExecuteTime | milliseconds |
aws_request_credentialsRequestTime | milliseconds |
aws_request_httpClientGetConnectionTime | milliseconds |
aws_request_httpClientReceiveResponseTime | milliseconds |
aws_request_httpClientSendRequestTime | milliseconds |
aws_request_httpRequestTime | milliseconds |
aws_request_requestCount | |
aws_request_requestSigningTime | milliseconds |
aws_request_responseProcessingTime | milliseconds |
aws_request_retryPauseTime | milliseconds |
controller_invocations | |
controller_invocations_contentLength | |
controller_invocations_contentLength_summary | |
fiat_enabled | |
fiat_getPermission | |
fiat_legacyFallback_enabled | |
fiat_permissionsCache_evictions | |
fiat_permissionsCache_evictions-weight | |
fiat_permissionsCache_hits | |
fiat_permissionsCache_loads | milliseconds |
fiat_permissionsCache_loads-failure | |
fiat_permissionsCache_loads-success | |
fiat_permissionsCache_misses | |
http_server_requests | milliseconds |
jvm_buffer_count | buffers | An estimate of the number of buffers in the pool
jvm_buffer_memory_used | bytes | An estimate of the memory that the Java virtual machine is using for this buffer pool
jvm_buffer_total_capacity | bytes | An estimate of the total capacity of the buffers in this pool
jvm_classes_loaded | classes | The number of classes that are currently loaded in the Java virtual machine
jvm_classes_unloaded | classes | The total number of classes unloaded since the Java virtual machine has started execution
jvm_gc_allocationRate | |
jvm_gc_live_data_size | bytes | Size of old generation memory pool after a full GC
jvm_gc_liveDataSize | |
jvm_gc_max_data_size | bytes | Max size of old generation memory pool
jvm_gc_maxDataSize | |
jvm_gc_memory_allocated | bytes | Incremented for an increase in the size of the young generation memory pool after one GC to before the next
jvm_gc_memory_promoted | bytes | Count of positive increases in the size of the old generation memory pool before GC to after GC
jvm_gc_pause | milliseconds | Time spent in GC pause
jvm_gc_promotionRate | |
jvm_memory_committed | bytes | The amount of memory in bytes that is committed for the Java virtual machine to use
jvm_memory_max | bytes | The maximum amount of memory in bytes that can be used for memory management
jvm_memory_used | bytes | The amount of used memory
jvm_threads_daemon | threads | The current number of live daemon threads
jvm_threads_live | threads | The current number of live threads including both daemon and non-daemon threads
jvm_threads_peak | threads | The peak live thread count since the Java virtual machine started or peak was reset
jvm_threads_states | threads | The current number of threads having WAITING state
logback_events | events | Number of error level events that made it to the logs
okhttp_requests | milliseconds |
process_cpu_usage | | The recent cpu usage for the Java Virtual Machine process
process_files_max | files | The maximum file descriptor count
process_files_open | files | The open file descriptor count
process_start_time | milliseconds | Start time of the process since unix epoch
process_uptime | milliseconds | The uptime of the Java virtual machine
resilience4j_circuitbreaker_buffered_calls | |
resilience4j_circuitbreaker_calls | milliseconds |
resilience4j_circuitbreaker_failure_rate | | The failure rate of the circuit breaker
resilience4j_circuitbreaker_slow_call_rate | | The slow call of the circuit breaker
resilience4j_circuitbreaker_slow_calls | | The number of slow failed calls which were slower than a certain threshold
resilience4j_circuitbreaker_state | | The states of the circuit breaker
storageServiceSupport_autoRefreshTime | milliseconds |
storageServiceSupport_cacheAge | |
storageServiceSupport_cacheRefreshTime | milliseconds |
storageServiceSupport_cacheSize | |
storageServiceSupport_mismatchedIds | |
storageServiceSupport_numAdded | |
storageServiceSupport_numRemoved | |
storageServiceSupport_numUpdated | |
storageServiceSupport_scheduledRefreshTime | milliseconds |
system_cpu_count | | The number of processors available to the Java virtual machine
system_cpu_usage | | The recent cpu usage for the whole system
system_load_average_1m | | The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
tomcat_sessions_active_current | sessions |
tomcat_sessions_active_max | sessions |
tomcat_sessions_alive_max | milliseconds |
tomcat_sessions_created | sessions |
tomcat_sessions_expired | sessions |
tomcat_sessions_rejected | sessions |

Gate

Metric Name | Base Unit | Description
controller_invocations | |
controller_invocations_contentLength | |
controller_invocations_contentLength_summary | |
fiat_enabled | |
fiat_getPermission | |
fiat_legacyFallback_enabled | |
fiat_login | |
fiat_permissionsCache_evictions | |
fiat_permissionsCache_evictions-weight | |
fiat_permissionsCache_hits | |
fiat_permissionsCache_loads | milliseconds |
fiat_permissionsCache_loads-failure | |
fiat_permissionsCache_loads-success | |
fiat_permissionsCache_misses | |
http_server_requests | milliseconds |
jvm_buffer_count | buffers | An estimate of the number of buffers in the pool
jvm_buffer_memory_used | bytes | An estimate of the memory that the Java virtual machine is using for this buffer pool
jvm_buffer_total_capacity | bytes | An estimate of the total capacity of the buffers in this pool
jvm_classes_loaded | classes | The number of classes that are currently loaded in the Java virtual machine
jvm_classes_unloaded | classes | The total number of classes unloaded since the Java virtual machine has started execution
jvm_gc_allocationRate | |
jvm_gc_live_data_size | bytes | Size of old generation memory pool after a full GC
jvm_gc_liveDataSize | |
jvm_gc_max_data_size | bytes | Max size of old generation memory pool
jvm_gc_maxDataSize | |
jvm_gc_memory_allocated | bytes | Incremented for an increase in the size of the young generation memory pool after one GC to before the next
jvm_gc_memory_promoted | bytes | Count of positive increases in the size of the old generation memory pool before GC to after GC
jvm_gc_pause | milliseconds | Time spent in GC pause
jvm_gc_promotionRate | |
jvm_memory_committed | bytes | The amount of memory in bytes that is committed for the Java virtual machine to use
jvm_memory_max | bytes | The maximum amount of memory in bytes that can be used for memory management
jvm_memory_used | bytes | The amount of used memory
jvm_threads_daemon | threads | The current number of live daemon threads
jvm_threads_live | threads | The current number of live threads including both daemon and non-daemon threads
jvm_threads_peak | threads | The peak live thread count since the Java virtual machine started or peak was reset
jvm_threads_states | threads | The current number of threads having RUNNABLE state
logback_events | events | Number of error level events that made it to the logs
okhttp_requests | milliseconds |
plugins_deckAssets_hits | |
plugins_deckCache_downloadDuration | milliseconds |
plugins_deckCache_hits | |
plugins_deckCache_misses | |
plugins_deckCache_refreshDuration | milliseconds |
plugins_deckCache_versions | |
process_cpu_usage | | The recent cpu usage for the Java Virtual Machine process
process_files_max | files | The maximum file descriptor count
process_files_open | files | The open file descriptor count
process_start_time | milliseconds | Start time of the process since unix epoch
process_uptime | milliseconds | The uptime of the Java virtual machine
system_cpu_count | | The number of processors available to the Java virtual machine
system_cpu_usage | | The recent cpu usage for the whole system
system_load_average_1m | | The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
tomcat_sessions_active_current | sessions |
tomcat_sessions_active_max | sessions |
tomcat_sessions_alive_max | milliseconds |
tomcat_sessions_created | sessions |
tomcat_sessions_expired | sessions |
tomcat_sessions_rejected | sessions |

Igor

Metric Name | Base Unit | Description
controller_invocations | |
controller_invocations_contentLength | |
controller_invocations_contentLength_summary | |
fiat_enabled | |
fiat_getPermission | |
fiat_legacyFallback_enabled | |
fiat_permissionsCache_evictions | |
fiat_permissionsCache_evictions-weight | |
fiat_permissionsCache_hits | |
fiat_permissionsCache_loads | milliseconds |
fiat_permissionsCache_loads-failure | |
fiat_permissionsCache_loads-success | |
fiat_permissionsCache_misses | |
http_server_requests | milliseconds |
jvm_buffer_count | buffers | An estimate of the number of buffers in the pool
jvm_buffer_memory_used | bytes | An estimate of the memory that the Java virtual machine is using for this buffer pool
jvm_classes_loaded | classes | The number of classes that are currently loaded in the Java virtual machine
jvm_classes_unloaded | classes | The total number of classes unloaded since the Java virtual machine has started execution
jvm_gc_allocationRate | |
jvm_gc_live_data_size | bytes | Size of old generation memory pool after a full GC
jvm_gc_liveDataSize | |
jvm_gc_max_data_size | bytes | Max size of old generation memory pool
jvm_gc_maxDataSize | |
jvm_gc_memory_allocated | bytes | Incremented for an increase in the size of the young generation memory pool after one GC to before the next
jvm_gc_pause | milliseconds | Time spent in GC pause
jvm_gc_promotionRate | |
jvm_memory_committed | bytes | The amount of memory in bytes that is committed for the Java virtual machine to use
jvm_memory_max | bytes | The maximum amount of memory in bytes that can be used for memory management
jvm_memory_used | bytes | The amount of used memory
jvm_threads_daemon | threads | The current number of live daemon threads
jvm_threads_live | threads | The current number of live threads including both daemon and non-daemon threads
jvm_threads_peak | threads | The peak live thread count since the Java virtual machine started or peak was reset
jvm_threads_states | threads | The current number of threads having NEW state
logback_events | events |
okhttp_requests | milliseconds |
pollingMonitor_docker_retrieveImagesByAccount | milliseconds |
pollingMonitor_jenkins_retrieveProjects | milliseconds |
pollingMonitor_pollTiming | milliseconds |
process_cpu_usage | | The recent cpu usage for the Java Virtual Machine process
process_files_max | files | The maximum file descriptor count
process_files_open | files | The open file descriptor count
process_start_time | milliseconds | Start time of the process since unix epoch
process_uptime | milliseconds | The uptime of the Java virtual machine
resilience4j_circuitbreaker_buffered_calls | | The number of buffered failed calls stored in the ring buffer
resilience4j_circuitbreaker_calls | | Total number of not permitted calls
resilience4j_circuitbreaker_failure_rate | | The failure rate of the circuit breaker
resilience4j_circuitbreaker_slow_call_rate | | The slow call of the circuit breaker
resilience4j_circuitbreaker_state | | The states of the circuit breaker
system_cpu_count | | The number of processors available to the Java virtual machine
system_cpu_usage | | The recent cpu usage for the whole system
system_load_average_1m | | The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
tomcat_sessions_active_current | sessions |
tomcat_sessions_alive_max | milliseconds |
tomcat_sessions_created | sessions |
tomcat_sessions_expired | sessions |
tomcat_sessions_rejected | sessions |

Kayenta

Metric Name | Base Unit | Description
canary_pipelines_initiated | |
canary_telemetry_query | |
controller_invocations | milliseconds |
controller_invocations_contentLength | |
controller_invocations_contentLength_summary | |
executions_active | |
executions_completed | |
executions_started | |
http_server_requests | milliseconds |
jvm_gc_allocationRate | |
jvm_gc_liveDataSize | |
jvm_gc_maxDataSize | |
jvm_gc_pause | milliseconds |
jvm_gc_promotionRate | |
okhttp_requests | milliseconds |
orca_task_result | |
queue_acknowledged_messages | |
queue_depth | |
queue_duplicate_messages | |
queue_last_poll_age | |
queue_last_retry_check_age | |
queue_message_lag | milliseconds |
queue_orphaned_messages | |
queue_pushed_messages | |
queue_ready_depth | |
queue_unacked_depth | |
redis_command_invocation_exists | |
redis_command_invocation_hdel | |
redis_command_invocation_hget | |
redis_command_invocation_hgetAll | |
redis_command_invocation_hmset | |
redis_command_invocation_hset | |
redis_command_invocation_multi | |
redis_command_invocation_sadd | |
redis_command_invocation_srem | |
redis_command_invocation_zadd | |
redis_command_latency_exists | |
redis_command_latency_hdel | |
redis_command_latency_hget | |
redis_command_latency_hgetAll | |
redis_command_latency_hmset | milliseconds |
redis_command_latency_hset | |
redis_command_latency_multi | |
redis_command_latency_sadd | |
redis_command_latency_srem | |
redis_command_latency_zadd | |
redis_command_payloadSize_hmset | |
redis_command_payloadSize_hmset_summary | |
redis_command_payloadSize_hset | |
redis_command_payloadSize_hset_summary | |
redis_command_payloadSize_sadd | |
redis_command_payloadSize_sadd_summary | |
redis_command_payloadSize_srem | |
redis_command_payloadSize_srem_summary | |
redis_connectionPool_maxIdle | |
redis_connectionPool_minIdle | |
redis_connectionPool_numActive | |
redis_connectionPool_numIdle | |
redis_connectionPool_numWaiters | |
redis_executionRepository_store1_invocations | |
redis_executionRepository_store1_timing | milliseconds |
redis_executionRepository_storeStage1_invocations | |
redis_executionRepository_storeStage1_timing | |
redis_executionRepository_updateStatus1_invocations | |
redis_executionRepository_updateStatus1_timing | milliseconds |
retrieveById_redis_executionRepository_invocations | |
retrieveById_redis_executionRepository_timing | |
stage_invocations | |
stage_invocations_duration | |
task_completions_duration | milliseconds |
task_completions_duration_withType | milliseconds |
task_invocations_duration | milliseconds |
task_invocations_duration_withType | milliseconds |
threadpool_activeCount | |
threadpool_blockingQueueSize | |
threadpool_corePoolSize | |
threadpool_maximumPoolSize | |
threadpool_poolSize | |
tomcat_sessions_active_current | sessions |
tomcat_sessions_active_max | sessions |
tomcat_sessions_alive_max | milliseconds |
tomcat_sessions_created | sessions |
tomcat_sessions_expired | sessions |
tomcat_sessions_rejected | sessions |

Orca

| Metric Name | Base Unit | Description |
| --- | --- | --- |
| aws_request_httpClientGetConnectionTime | milliseconds |  |
| controller_invocations |  |  |
| controller_invocations_contentLength |  |  |
| controller_invocations_contentLength_summary |  |  |
| executions_active |  |  |
| executions_completed |  |  |
| executions_started |  |  |
| executions_totalTime | milliseconds |  |
| fiat_enabled |  |  |
| fiat_getPermission |  |  |
| fiat_legacyFallback_enabled |  |  |
| fiat_permissionsCache_loads | milliseconds |  |
| fiat_permissionsCache_loads-failure |  |  |
| http_server_requests | milliseconds |  |
| jdbc_connections_active |  |  |
| jdbc_connections_idle |  |  |
| jdbc_connections_max |  |  |
| jvm_gc_allocationRate |  |  |
| jvm_gc_pause | milliseconds |  |
| jvm_gc_promotionRate |  |  |
| mpt_requests |  |  |
| okhttp_requests | milliseconds |  |
| orca_task_result |  |  |
| queue_acknowledged_messages |  |  |
| queue_depth |  |  |
| queue_duplicate_messages |  |  |
| queue_last_poll_age |  |  |
| queue_message_notfound |  |  |
| queue_orphaned_messages |  |  |
| queue_pushed_messages |  |  |
| queue_retried_messages |  |  |
| queue_unacked_depth |  |  |
| redis_connectionPool_maxIdle |  |  |
| redis_connectionPool_numActive |  |  |
| redis_connectionPool_numIdle |  |  |
| resilience4j_retry_calls |  | The number of successful calls after a retry attempt |
| retrieveById_sql_executions_invocations |  |  |
| retrieveById_sql_executions_timing |  |  |
| sql_executions_addStage1_timing |  |  |
| sql_executions_cancel4_invocations |  |  |
| sql_executions_cancel4_timing |  |  |
| sql_executions_countActiveExecutions_invocations |  |  |
| sql_executions_countActiveExecutions_timing |  |  |
| sql_executions_handlesPartition1_invocations |  |  |
| sql_executions_handlesPartition1_timing | milliseconds |  |
| sql_executions_retrieveByCorrelationId2_timing |  |  |
| sql_executions_retrieveOrchestrationsForApplication3_timing |  |  |
| sql_executions_store1_timing |  |  |
| sql_executions_storeStage1_invocations |  |  |
| sql_executions_storeStage1_timing |  |  |
| sql_executions_updateStatus1_invocations |  |  |
| sql_executions_updateStatus1_timing |  |  |
| sql_healthProvider_invocations |  |  |
| sql_pool_default_connectionAcquiredTiming | milliseconds |  |
| sql_queueActivator_invocations |  |  |
| stage_invocations |  |  |
| stage_invocations_duration |  |  |
| task_completions_duration | milliseconds |  |
| task_completions_duration_withType | milliseconds |  |
| task_invocations_duration | milliseconds |  |
| task_invocations_duration_withType | milliseconds |  |
| tasks_serverGroupCacheForceRefresh |  |  |
| threadpool_activeCount |  |  |
| threadpool_blockingQueueSize |  |  |
| threadpool_corePoolSize |  |  |
| threadpool_maximumPoolSize |  |  |
| threadpool_poolSize |  |  |
| tomcat_sessions_active_current | sessions |  |
| tomcat_sessions_active_max | sessions |  |
| tomcat_sessions_alive_max | milliseconds |  |
| tomcat_sessions_rejected | sessions |  |
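
To check one of the Orca metrics listed above outside of Grafana, you can send an instant query to the Prometheus HTTP API (`/api/v1/query`). The following is a minimal sketch that only builds the request URL. The base URL `http://localhost:9090`, the `job` label, and its value `spin-orca` are assumptions for illustration; the labels in your environment depend on your scrape configuration, and the metric name that Prometheus exposes may differ from the name in the table (for example, with a unit or `_total` suffix).

```python
from urllib.parse import urlencode


def instant_query_url(base_url, metric, labels=None):
    """Build a Prometheus HTTP API instant-query URL for a single metric."""
    selector = metric
    if labels:
        # Render label matchers in sorted order, e.g. {job="spin-orca"}
        matchers = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
        selector = f"{metric}{{{matchers}}}"
    # urlencode percent-encodes the braces, quotes, and equals sign in the selector
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': selector})}"


# Hypothetical values: adjust the URL and labels to match your monitoring stack.
url = instant_query_url("http://localhost:9090", "queue_depth", {"job": "spin-orca"})
print(url)
```

You can pass the resulting URL to any HTTP client, such as `curl`, once Prometheus is reachable from your machine.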

Rosco

| Metric Name | Base Unit | Description |
| --- | --- | --- |
| bakesActive |  |  |
| bakesCompleted | milliseconds |  |
| controller_invocations |  |  |
| controller_invocations_contentLength |  |  |
| controller_invocations_contentLength_summary |  |  |
| http_server_requests | milliseconds |  |
| jvm_buffer_count | buffers | An estimate of the number of buffers in the pool |
| jvm_buffer_memory_used | bytes | An estimate of the memory that the Java virtual machine is using for this buffer pool |
| jvm_buffer_total_capacity | bytes | An estimate of the total capacity of the buffers in this pool |
| jvm_classes_loaded | classes | The number of classes that are currently loaded in the Java virtual machine |
| jvm_classes_unloaded | classes | The total number of classes unloaded since the Java virtual machine has started execution |
| jvm_gc_allocationRate |  |  |
| jvm_gc_live_data_size | bytes | Size of old generation memory pool after a full GC |
| jvm_gc_liveDataSize |  |  |
| jvm_gc_max_data_size | bytes | Max size of old generation memory pool |
| jvm_gc_maxDataSize |  |  |
| jvm_gc_memory_allocated | bytes | Incremented for an increase in the size of the young generation memory pool after one GC to before the next |
| jvm_gc_memory_promoted | bytes | Count of positive increases in the size of the old generation memory pool before GC to after GC |
| jvm_gc_pause | milliseconds | Time spent in GC pause |
| jvm_gc_promotionRate |  |  |
| jvm_memory_committed | bytes | The amount of memory in bytes that is committed for the Java virtual machine to use |
| jvm_memory_max | bytes | The maximum amount of memory in bytes that can be used for memory management |
| jvm_memory_used | bytes | The amount of used memory |
| jvm_threads_daemon | threads | The current number of live daemon threads |
| jvm_threads_live | threads | The current number of live threads including both daemon and non-daemon threads |
| jvm_threads_peak | threads | The peak live thread count since the Java virtual machine started or peak was reset |
| jvm_threads_states | threads |  |
| logback_events | events |  |
| okhttp_requests | milliseconds |  |
| process_cpu_usage |  | The recent CPU usage for the Java virtual machine process |
| process_files_max | files | The maximum file descriptor count |
| process_files_open | files | The open file descriptor count |
| process_start_time | milliseconds | Start time of the process since Unix epoch |
| process_uptime | milliseconds | The uptime of the Java virtual machine |
| system_cpu_count |  | The number of processors available to the Java virtual machine |
| system_cpu_usage |  | The recent CPU usage for the whole system |
| system_load_average_1m |  | The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time |
| tomcat_sessions_active_current | sessions |  |
| tomcat_sessions_active_max | sessions |  |
| tomcat_sessions_alive_max | milliseconds |  |
| tomcat_sessions_created | sessions |  |
| tomcat_sessions_expired | sessions |  |
| tomcat_sessions_rejected | sessions |  |

Last modified October 17, 2023: (aa87b671)