Spring Boot for Apache Geode and Pivotal GemFire (SBDG) adds Spring Boot Actuator
support and dedicated HealthIndicators
for Apache Geode and Pivotal GemFire. Equally, the provided HealthIndicators
will even work with Pivotal Cloud Cache, which is backed by Pivotal GemFire, when pushing your Spring Boot applications
to Pivotal CloudFoundry (PCC).
Spring Boot HealthIndicators
provide details about the runtime operation and behavior of your Apache Geode
or Pivotal GemFire based Spring Boot applications. For instance, by querying the right HealthIndicator
endpoint,
you would be able to get the current hit/miss count for your Region.get(key)
data access operations.
In addition to vital health information, SBDG provides basic, pre-runtime configuration meta-data about the Apache Geode / Pivotal GemFire components that are monitored by Spring Boot Actuator. This makes it easier to see how the application was configured all in one place, rather than in properties files, Spring config, XML, etc.
The provided Spring Boot HealthIndicators
fall under one of three categories:
HealthIndicators
that apply to all Apache Geode/Pivotal GemFire, Spring Boot applications, regardless of
cache type, such as Regions, Indexes and DiskStores.
Cache
based HealthIndicators
that are only applicable to peer Cache
applications, such as
AsyncEventQueues
, CacheServers
, GatewayReceivers
and GatewaySenders
.
ClientCache
based HealthIndicators
that are only applicable to ClientCache
applications, such as
ContinuousQueries
and connection Pools
.
The following sections give a brief overview of all the available Spring Boot HealthIndicators
provided for
Apache Geode/Pivotal GemFire, out-of-the-box.
The following section covers Spring Boot HealthIndicators
that apply to both peer Cache
and ClientCache
,
Spring Boot applications. That is, these HealthIndicators
are not specific to the cache type.
In both Apache Geode and Pivotal GemFire, the cache instance is either a peer Cache
instance, which makes your
Spring Boot application part of a GemFire/Geode cluster, or more commonly, a ClientCache
instance that talks to
an existing cluster. Your Spring Boot application can only be one cache type or the other and can only have a single
instance of that cache type.
The GeodeCacheHealthIndicator
provides essential details about the (single) cache instance (Client or Peer) along with
the underlying DistributedSystem
, the DistributedMember
and configuration details of the ResourceManager
.
When your Spring Boot application creates an instance of a peer Cache
,
the DistributedMember
object represents
your application as a peer member/node of the DistributedSystem
formed from a collection of connected peers (i.e. the cluster), to which your application also has
access,
indirectly via the cache instance.
This is no different for a ClientCache
even though the client is technically not part of the peer/server cluster.
But, it still creates instances of the DistributedSystem
and DistributedMember
objects, respectively.
The following configuration meta-data and health details about each object is covered:
Table 13.1. Cache Details
Name | Description |
---|---|
geode.cache.name | Name of the member in the distributed system. |
geode.cache.closed | Determines whether the cache has been closed. |
geode.cache.cancel-in-progress | Cancellation of operations in progress. |
Table 13.2. DistributedMember Details
Name | Description |
---|---|
geode.distributed-member.id | DistributedMember identifier (used in logs internally). |
geode.distributed-member.name | Name of the member in the distributed system. |
geode.distributed-members.groups | Configured groups to which the member belongs. |
geode.distributed-members.host | Name of the machine on which the member is running. |
geode.distributed-members.process-id | Identifier of the JVM process (PID). |
Table 13.3. DistributedSystem Details
Name | Description |
---|---|
geode.distributed-system.member-count | Total number of members in the cluster (1 for clients). |
geode.distributed-system.connected | Indicates whether the member is currently connected to the cluster. |
geode.distributed-system.reconnecting | Indicates whether the member is in a reconnecting state, which happens when a network partition occurs and the member gets disconnected from the cluster. |
geode.distributed-system.properties-location | Location of the standard configuration properties. |
geode.distributed-system.security-properties-location | Location of the security configuration properties. |
Table 13.4. ResourceManager Details
Name | Description |
---|---|
geode.resource-manager.critical-heap-percentage | Percentage of heap at which the cache is in danger of becoming inoperable. |
geode.resource-manager.critical-off-heap-percentage | Percentage of off-heap at which the cache is in danger of becoming inoperable. |
geode.resource-manager.eviction-heap-percentage | Percentage of heap at which eviction begins on Regions configured with a Heap LRU Eviction policy. |
geode.resource-manager.eviction-off-heap-percentage | Percentage of off-heap at which eviction begins on Regions configured with a Heap LRU Eviction policy. |
The GeodeRegionsHealthIndicator
provides details about all the configured and known Regions
in the cache.
If the cache is a client, then details will include all LOCAL, PROXY and CACHING_PROXY Regions
. If the cache
is a peer, then the details will include all LOCAL, PARTITION and REPLICATE Regions
.
While the configuration meta-data details are not exhaustive, essential details along with basic performance metrics are covered:
Table 13.5. Region Details
Name | Description |
---|---|
geode.cache.regions.<name>.cloning-enabled | Whether Region values are cloned on read (e.g. |
geode.cache.regions.<name>.data-policy | Policy used to manage the data in the Region (e.g. PARTITION, REPLICATE, etc). |
geode.cache.regions.<name>.initial-capacity | Initial number of entries that can be held by a Region before it needs to be resized. |
geode.cache.regions.<name>.load-factor | Load factor used to determine when to resize the Region when it nears capacity. |
geode.cache.regions.<name>.key-constraint | Type constraint for Region keys. |
geode.cache.regions.<name>.off-heap | Determines whether this Region will store values in off-heap memory (NOTE: Keys are always kept on Heap). |
geode.cache.regions.<name>.pool-name | If this Region is a client Region, then this property determines
the configured connection |
geode.cache.regions.<name>.pool-name | Determines the |
geode.cache.regions.<name>.value-constraint | Type constraint for Region values. |
Additionally, when the Region is a peer Cache
PARTITION
Region, then the following details are also covered:
Table 13.6. Partition Region Details
Name | Description |
---|---|
geode.cache.regions.<name>.partition.collocated-with | Indicates this Region is collocated with another PARTITION Region, which is necessary when performing equi-joins queries (NOTE: distributed joins are not supported). |
geode.cache.regions.<name>.partition.local-max-memory | Total amount of Heap memory allowed to be used by this Region on this node. |
geode.cache.regions.<name>.partition.redundant-copies | Number of replicas for this PARTITION Region, which is useful in High Availability (HA) use cases. |
geode.cache.regions.<name>.partition.total-max-memory | Total amount of Heap memory allowed to be used by this Region across all nodes in the cluster hosting this Region. |
geode.cache.regions.<name>.partition.total-number-of-buckets | Total number of buckets (shards) that this Region is divided up into (NOTE: defaults to 113). |
Finally, when statistics are enabled (e.g. using @EnableStatistics
,
(see here
for more details), the following details are available:
Table 13.7. Region Statistic Details
Name | Description |
---|---|
geode.cache.regions.<name>.statistics.hit-count | Number of hits for a Region entry. |
geode.cache.regions.<name>.statistics.hit-ratio | Ratio of hits to the number of |
geode.cache.regions.<name>.statistics.last-accessed-time | For an entry, determines the last time it was accessed
with |
geode.cache.regions.<name>.statistics.last-modified-time | For an entry, determines the time a Region’s entry value was last modified. |
geode.cache.regions.<name>.statistics.miss-count | Returns the number of times that a |
The GeodeIndexesHealthIndicator
provides details about the configured Region Indexes
used in OQL query
data access operations.
The following details are covered:
Table 13.8. Index Details
Name | Description |
---|---|
geode.index.<name>.from-clause | Region from which data is selected. |
geode.index.<name>.indexed-expression | The Region value fields/properties used in the Index expression. |
geode.index.<name>.projection-attributes | For all other Indexes, returns "", but for Map Indexes, returns either "" or the specific Map keys that were indexed. |
geode.index.<name>.region | Region to which the Index is applied. |
Additionally, when statistics are enabled (e.g. using @EnableStatistics
;
(see here
for more details), the following details are available:
Table 13.9. Index Statistic Details
Name | Description |
---|---|
geode.index.<name>.statistics.number-of-bucket-indexes | Number of bucket Indexes created in a Partitioned Region. |
geode.index.<name>.statistics.number-of-keys | Number of keys in this Index. |
geode.index.<name>.statistics.number-of-map-indexed-keys | Number of keys in this Index at the highest-level. |
geode.index.<name>.statistics.number-of-values | Number of values in this Index. |
geode.index.<name>.statistics.number-of-updates | Number of times this Index has been updated. |
geode.index.<name>.statistics.read-lock-count | Number of read locks taken on this Index. |
geode.index.<name>.statistics.total-update-time | Total amount of time (ns) spent updating this Index. |
geode.index.<name>.statistics.total-uses | Total number of times this Index has been accessed by an OQL query. |
The GeodeDiskStoresHealthIndicator
provides details about the configured DiskStores
in the system/application.
Remember, DiskStores
are used to overflow and persist data to disk, including type meta-data tracked by PDX
when the values in the Region(s) have been serialized with PDX and the Region(s) are persistent.
Most of the tracked health information pertains to configuration:
Table 13.10. DiskStore Details
Name | Description |
---|---|
geode.disk-store.<name>.allow-force-compaction | Indicates whether manual compaction of the DiskStore is allowed. |
geode.disk-store.<name>.auto-compact | Indicates if compaction occurs automatically. |
geode.disk-store.<name>.compaction-threshold | Percentage at which the oplog will become compactable. |
geode.disk-store.<name>.disk-directories | Location of the oplog disk files. |
geode.disk-store.<name>.disk-directory-sizes | Configured and allowed sizes (MB) for the disk directory storing the disk files. |
geode.disk-store.<name>.disk-usage-critical-percentage | Critical threshold of disk usage proportional to the total disk volume. |
geode.disk-store.<name>.disk-usage-warning-percentage | Warning threshold of disk usage proportional to the total disk volume. |
geode.disk-store.<name>.max-oplog-size | Maximum size (MB) allowed for a single oplog file. |
geode.disk-store.<name>.queue-size | Size of the queue used to batch writes flushed to disk. |
geode.disk-store.<name>.time-interval | Time to wait (ms) before writes are flushed to disk from the queue if the size limit has not be reached. |
geode.disk-store.<name>.uuid | Universally Unique Identifier for the DiskStore across Distributed System. |
geode.disk-store.<name>.write-buffer-size | Size the of write buffer the DiskStore uses to write data to disk. |
The ClientCache
based HealthIndicators
provide additional details specifically for Spring Boot, cache client
applications. These HealthIndicators
are only available when the Spring Boot application creates a ClientCache
instance (i.e. is a cache client), which is the default.
The GeodeContinuousQueriesHealthIndicator
provides details about registered client Continuous Queries (CQ).
CQs enable client applications to receive automatic notification about events that satisfy some criteria. That criteria
can be easily expressed using the predicate of an OQL query (e.g. “SELECT * FROM /Customers c WHERE c.age > 21”).
Anytime data of interests is inserted or updated, and matches the criteria specified in the OQL query predicate,
an event is sent to the registered client.
The following details are covered for CQs by name:
Table 13.11. Continuous Query(CQ) Details
Name | Description |
---|---|
geode.continuous-query.<name>.oql-query-string | OQL query constituting the CQ. |
geode.continuous-query.<name>.closed | Indicates whether the CQ has been closed. |
geode.continuous-query.<name>.closing | Indicates whether the CQ is the process of closing. |
geode.continuous-query.<name>.durable | Indicates whether the CQ events will be remembered between client sessions. |
geode.continuous-query.<name>.running | Indicates whether the CQ is currently running. |
geode.continuous-query.<name>.stopped | Indicates whether the CQ has been stopped. |
In addition, the following CQ query and statistical data is covered:
Table 13.12. Continuous Query(CQ), Query Details
Name | Description |
---|---|
geode.continuous-query.<name>.query.number-of-executions | Total number of times the query has been executed. |
geode.continuous-query.<name>.query.total-execution-time | Total amount of time (ns) spent executing the query. |
geode.continuous-query.<name>.statistics.number-of-deletes |
Table 13.13. Continuous Query(CQ), Statistic Details
Name | Description |
---|---|
geode.continuous-query.<name>.statistics.number-of-deletes | Number of Delete events qualified by this CQ. |
geode.continuous-query.<name>.statistics.number-of-events | Total number of events qualified by this CQ. |
geode.continuous-query.<name>.statistics.number-of-inserts | Number of Insert events qualified by this CQ. |
geode.continuous-query.<name>.statistics.number-of-updates | Number of Update events qualified by this CQ. |
In a more general sense, the GemFire/Geode Continuous Query system is tracked with the following, additional details on the client:
Table 13.14. Continuous Query(CQ), Statistic Details
Name | Description |
---|---|
geode.continuous-query.count | Total count of CQs. |
geode.continuous-query.number-of-active | Number of currently active CQs (if available). |
geode.continuous-query.number-of-closed | Total number of closed CQs (if available). |
geode.continuous-query.number-of-created | Total number of created CQs (if available). |
geode.continuous-query.number-of-stopped | Number of currently stopped CQs (if available). |
geode.continuous-query.number-on-client | Number of CQs that are currently active or stopped (if available). |
The GeodePoolsHealthIndicator
provide details about all the configured client connection Pools
.
This HealthIndicator
primarily provides configuration meta-data for all the configured Pools
.
The following details are covered:
Table 13.15. Pool Details
Name | Description |
---|---|
geode.pool.count | Total number of client connection Pools. |
geode.pool.<name>.destroyed | Indicates whether the Pool has been destroyed. |
geode.pool.<name>.free-connection-timeout | Configured amount of time to wait for a free connection from the Pool. |
geode.pool.<name>.idle-timeout | The amount of time to wait before closing unused, idle connections not exceeding the configured number of minimum required connections. |
geode.pool.<name>.load-conditioning-interval | Controls how frequently the Pool will check to see if a connection to a given server should be moved to a different server to improve the load balance. |
geode.pool.<name>.locators | List of configured Locators. |
geode.pool.<name>.max-connections | Maximum number of connections obtainable from the Pool. |
geode.pool.<name>.min-connections | Minimum number of connections contained by the Pool. |
geode.pool.<name>.multi-user-authentication | Determines whether the Pool can be used by multiple authenticated users. |
geode.pool.<name>.online-locators | Returns a list of living Locators. |
geode.pool.<name>.pending-event-count | Approximate number of pending subscription events maintained at server for this durable client Pool at the time it (re)connected to the server. |
geode.pool.<name>.ping-interval | How often to ping the servers to verify they are still alive. |
geode.pool.<name>.pr-single-hop-enabled | Whether the client will acquire a direct connection to the server containing the data of interests. |
geode.pool.<name>.read-timeout | Number of milliseconds to wait for a response from a server before timing out the operation and trying another server (if any are available). |
geode.pool.<name>.retry-attempts | Number of times to retry a request after timeout/exception. |
geode.pool.<name>.server-group | Configures the group in which all servers this Pool connects to must belong. |
geode.pool.<name>.servers | List of configured servers. |
geode.pool.<name>.socket-buffer-size | Socket buffer size for each connection made in this Pool. |
geode.pool.<name>.statistic-interval | How often to send client statistics to the server. |
geode.pool.<name>.subscription-ack-interval | Interval in milliseconds to wait before sending acknowledgements to the cache server for events received from the server subscriptions. |
geode.pool.<name>.subscription-enabled | Enabled server-to-client subscriptions. |
geode.pool.<name>.subscription-message-tracking-timeout | Time-to-Live period (ms), for subscription events the client has received from the server. |
geode.pool.<name>.subscription-redundancy | Redundancy level for this Pools server-to-client subscriptions, which is used to ensure clients will not miss potentially important events. |
geode.pool.<name>.thread-local-connections | Thread local connection policy for this Pool. |
The peer Cache
based HealthIndicators
provide additional details specifically for Spring Boot, peer cache member
applications. These HealthIndicators
are only available when the Spring Boot application creates a peer Cache
instance.
![]() | Note |
---|---|
The default cache instance created by Spring Boot for Apache Geode/Pivotal GemFire is a |
![]() | Tip |
---|---|
To control what type of cache instance is created, such as a "peer", then you can explicitly declare either the
|
The GeodeCacheServersHealthIndicator
provides details about the configured Apache Geode/Pivotal GemFire CacheServers
.
CacheServer
instances are required to enable clients to connect to the servers in the cluster.
This HealthIndicator
captures basic configuration meta-data and runtime behavior/characteristics of
the configured CacheServers
:
Table 13.16. CacheServer Details
Name | Description |
---|---|
geode.cache.server.count | Total number of configured CacheServer instances on this peer member. |
geode.cache.server.<index>.bind-address | IP address of the NIC to which the CacheServer |
geode.cache.server.<index>.hostname-for-clients | Name of the host used by clients to connect to the CacheServer (useful with DNS). |
geode.cache.server.<index>.load-poll-interval | How often (ms) to query the load probe on the CacheServer. |
geode.cache.server.<index>.max-connections | Maximum number of connections allowed to this CacheServer. |
geode.cache.server.<index>.max-message-count | Maximum number of messages that can be enqueued in a client queue. |
geode.cache.server.<index>.max-threads | Maximum number of Threads allowed in this CacheServer to service client requests. |
geode.cache.server.<index>.max-time-between-pings | Maximum time between client pings. |
geode.cache.server.<index>.message-time-to-live | Time (seconds) in which the client queue will expire. |
geode.cache.server.<index>.port | Network port to which the CacheServer |
geode.cache.server.<index>.running | Determines whether this CacheServer is currently running and accepting client connections. |
geode.cache.server.<index>.socket-buffer-size | Configured buffer size of the Socket connection used by this CacheServer. |
geode.cache.server.<index>.tcp-no-delay | Configures the TCP/IP TCP_NO_DELAY setting on outgoing Sockets. |
In addition to the configuration settings shown above, the CacheServer’s
ServerLoadProbe
tracks additional details
about the runtime characteristics of the CacheServer
, as follows:
Table 13.17. CacheServer Metrics and Load Details
Name | Description |
---|---|
geode.cache.server.<index>.load.connection-load | Load on the server due to client to server connections. |
geode.cache.server.<index>.load.load-per-connection | Estimate of the how much load each new connection will add to this server. |
geode.cache.server.<index>.load.subscription-connection-load | Load on the server due to subscription connections. |
geode.cache.server.<index>.load.load-per-subscription-connection | Estimate of the how much load each new subscriber will add to this server. |
geode.cache.server.<index>.metrics.client-count | Number of connected clients. |
geode.cache.server.<index>.metrics.max-connection-count | Maximum number of connections made to this CacheServer. |
geode.cache.server.<index>.metrics.open-connection-count | Number of open connections to this CacheServer. |
geode.cache.server.<index>.metrics.subscription-connection-count | Number of subscription connections to this CacheServer. |
The GeodeAsyncEventQueuesHealthIndicator
provides details about the configured AsyncEventQueues
. AEQs can be
attached to Regions to configure asynchronous, write-behind behavior.
This HealthIndicator
captures configuration meta-data and runtime characteristics for all AEQs, as follows:
Table 13.18. AsyncEventQueue Details
Name | Description |
---|---|
geode.async-event-queue.count | Total number of configured AEQs. |
geode.async-event-queue.<id>.batch-conflation-enabled | Indicates whether batch events are conflated when sent. |
geode.async-event-queue.<id>.batch-size | Size of the batch that gets delivered over this AEQ. |
geode.async-event-queue.<id>.batch-time-interval | Max time interval that can elapse before a batch is sent. |
geode.async-event-queue.<id>.disk-store-name | Name of the disk store used to overflow & persist events. |
geode.async-event-queue.<id>.disk-synchronous | Indicates whether disk writes are sync or async. |
geode.async-event-queue.<id>.dispatcher-threads | Number of Threads used to dispatch events. |
geode.async-event-queue.<id>.forward-expiration-destroy | Indicates whether expiration destroy operations are forwarded to AsyncEventListener. |
geode.async-event-queue.<id>.max-queue-memory | Maximum memory used before data needs to be overflowed to disk. |
geode.async-event-queue.<id>.order-policy | Order policy followed while dispatching the events to AsyncEventListeners. |
geode.async-event-queue.<id>.parallel | Indicates whether this queue is parallel (higher throughput) or serial. |
geode.async-event-queue.<id>.persistent | Indicates whether this queue stores events to disk. |
geode.async-event-queue.<id>.primary | Indicates whether this queue is primary or secondary. |
geode.async-event-queue.<id>.size | Number of entries in this queue. |
The GeodeGatewayReceiversHealthIndicator
provide details about the configured (WAN) GatewayReceivers
, which are
capable of receiving events from remote clusters when using Apache Geode/Pivotal GemFire’s
multi-site, WAN topology.
This HealthIndicator
captures configuration meta-data along with the running state for each GatewayReceiver
:
Table 13.19. GatewayReceiver Details
Name | Description |
---|---|
geode.gateway-receiver.count | Total number of configured GatewayReceivers. |
geode.gateway-receiver.<index>.bind-address | IP address of the NIC to which the GatewayReceiver
|
geode.gateway-receiver.<index>.end-port | End value of the port range from which the GatewayReceiver’s port will be chosen. |
geode.gateway-receiver.<index>.host | IP address or hostname that Locators will tell clients (i.e. GatewaySenders) that this GatewayReceiver is listening on. |
geode.gateway-receiver.<index>.max-time-between-pings | Maximum amount of time between client pings. |
geode.gateway-receiver.<index>.port | Port on which this GatewayReceiver listens for clients (i.e. GatewaySenders). |
geode.gateway-receiver.<index>.running | Indicates whether this GatewayReceiver is running and accepting client connections (from GatewaySenders). |
geode.gateway-receiver.<index>.socket-buffer-size | Configured buffer size for the Socket connections used by this GatewayReceiver. |
geode.gateway-receiver.<index>.start-port | Start value of the port range from which the GatewayReceiver’s port will be chosen. |
The GeodeGatewaySendersHealthIndicator
provides details about the configured GatewaySenders
. GatewaySenders
are
attached to Regions in order to send Region events to remote clusters in Apache Geode/Pivotal GemFire’s
multi-site, WAN topology.
This HealthIndicator
captures essential configuration meta-data and runtime characteristics for each GatewaySender
:
Table 13.20. GatewaySender Details
Name | Description |
---|---|
geode.gateway-sender.count | Total number of configured GatewaySenders. |
geode.gateway-sender.<id>.alert-threshold | Alert threshold (ms) for entries in this GatewaySender’s queue. |
geode.gateway-sender.<id>.batch-conflation-enabled | Indicates whether batch events are conflated when sent. |
geode.gateway-sender.<id>.batch-size | Size of the batches sent. |
geode.gateway-sender.<id>.batch-time-interval | Max time interval that can elapse before a batch is sent. |
geode.gateway-sender.<id>.disk-store-name | Name of the DiskStore used to overflow and persist queue events. |
geode.gateway-sender.<id>.disk-synchronous | Indicates whether disk writes are sync or async. |
geode.gateway-sender.<id>.dispatcher-threads | Number of Threads used to dispatch events. |
geode.gateway-sender.<id>.max-queue-memory | Maximum amount of memory (MB) usable for this GatewaySender’s queue. |
geode.gateway-sender.<id>.max-parallelism-for-replicated-region | |
geode.gateway-sender.<id>.order-policy | Order policy followed while dispatching the events to GatewayReceivers. |
geode.gateway-sender.<id>.parallel | Indicates whether this GatewaySender is parallel (higher throughput) or serial. |
geode.gateway-sender.<id>.paused | Indicates whether this GatewaySender is paused. |
geode.gateway-sender.<id>.persistent | Indicates whether this GatewaySender persists queue events to disk. |
geode.gateway-sender.<id>.remote-distributed-system-id | Identifier for the remote distributed system. |
geode.gateway-sender.<id>.running | Indicates whether this GatewaySender is currently running. |
geode.gateway-sender.<id>.socket-buffer-size | Configured buffer size for the Socket connections between this GatewaySender and its receiving GatewayReceiver. |
geode.gateway-sender.<id>.socket-read-timeout | Amount of time (ms) that a Socket read between this sending GatewaySender and its receiving GatewayReceiver will block. |