Lab 7 - Discovering Service Targets
Lab Goal
This lab provides an understanding of how service discovery is used in Prometheus for
locating and scraping targets for metrics collection. You'll learn by setting up a
service discovery mechanism to dynamically maintain a list of scraping targets.
Service discovery - It's been all static so far
Up to this point in this workshop, you've been statically configuring your Prometheus
installation to scrape each target in its own job in the static_configs section of your
workshop-prometheus.yml file.
In real-world cloud native infrastructure, things scale dynamically, which makes it impractical
to maintain such a static configuration. You'll be faced with virtual machines on
cloud providers, services and applications on container orchestrators such as Kubernetes, and
microservice architectures constantly changing the observability data targets you need to be
scraping.
Service discovery - Introducing dynamic discovery
Recall that the first introduction showed the Prometheus architecture, where service
discovery appeared as a way to leverage built-in mechanisms for discovering virtual machines
on cloud providers, service and application instances on container orchestrators, and other
generic lookup mechanisms (DNS, Consul, ZooKeeper, etc.).
Service discovery - Supporting dynamic discovery
Any of those methods can be added to a scrape_config section in your
Prometheus configuration file to provide a dynamic list of targets, continuously updating during
runtime. Prometheus automatically stops scraping old instances and starts scraping new ones,
making highly dynamic environments such as services running on Kubernetes manageable.
Service discovery - Functions of discovery data
When Prometheus is configured to use a service discovery mechanism, it uses the provided
discovery information for three purposes:
- knowing what should exist
- knowing how to pull metrics from targets that exist
- knowing how to use associated target metadata
Service discovery - Knowing what should exist
Prometheus, as a monitoring system, needs to know what systems and services should be up and
running at any given point in time. A key function of any service discovery mechanism that
Prometheus uses is to continuously provide that information. With this information
available to Prometheus, it's trivial to leverage the Alertmanager to alert on any unreachable
targets.
Service discovery - Knowing how to pull metrics
Prometheus, as a monitoring system, needs to know more than whether a target system exists or
not; it also needs to know how to pull metrics from it. You can imagine what Prometheus needs to
know, such as:
- host name
- port number
- protocol (http or https)
- ...any other information needed to reach or access the target
Most of this is provided by the different service discovery mechanisms as target metadata, and it
enables Prometheus to fetch data from the target.
Service discovery - How to use target metadata
Many of the service discovery mechanisms you'll use provide metadata about each target
(with things like labels, annotations, service names, ready states, etc). This metadata is used
during the relabeling phase to filter targets, modify how targets are scraped, or
map any metadata into final target labels.
As shown previously in this workshop, relabeling lets you enrich target labels based on
discovery information, which creates more useful time series data for eventual queries.
Service discovery - Target metadata and relabeling
During a previous lab in this workshop you were introduced to target metadata, normal labels,
and how the relabeling phase can be used to modify target labels before persisting. When using
service discovery, target sources provide normal labels and 'hidden' labels prefixed with a
double underscore (__). These 'hidden' labels contain additional metadata about the target.
All 'hidden' labels are removed after the relabeling phase. Their values only make it into the
target's final labels if you use relabeling to copy them into normal labels. Until then, you can
use them to filter targets or to change a target's labels or scrape behavior.
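The fate of 'hidden' labels can be illustrated with a small Python model. This is only a sketch of the behavior described above, not how Prometheus implements relabeling: the function finalize_labels, the mappings argument, and the __meta_example_* labels are hypothetical names made up for illustration, and handling of the instance label is left out.

```python
def finalize_labels(discovered: dict[str, str], mappings: dict[str, str]) -> dict[str, str]:
    """Simplified model of what is left of a target's labels after relabeling.

    mappings is a hypothetical {hidden_label: final_label} set of copy rules,
    standing in for relabeling rules you would write in the configuration.
    """
    labels = dict(discovered)
    # Copy selected hidden labels into normal labels (what a relabel rule would do).
    for hidden, final in mappings.items():
        if hidden in labels:
            labels[final] = labels[hidden]
    # Every label still prefixed with "__" is dropped after the relabeling phase.
    return {name: value for name, value in labels.items() if not name.startswith("__")}


discovered = {
    "__address__": "10.0.0.5:8080",
    "__meta_example_zone": "eu-west-1",  # hypothetical metadata label
    "__meta_example_owner": "team-a",    # hypothetical metadata label
    "job": "services",
}
final = finalize_labels(discovered, {"__meta_example_zone": "zone"})
print(final)  # {'job': 'services', 'zone': 'eu-west-1'}
```

Only the zone value survives, because it was explicitly copied into a normal label; the owner metadata disappears along with every other label that still starts with a double underscore.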
Service discovery - Metadata labels affecting scrape behavior
It's possible to use the following metadata labels, as they are always provided by raw targets,
to modify any target's scrape behavior:
- __address__: Contains the target TCP address that should be scraped.
  It initially defaults to [host]:[port] provided by the service discovery mechanism.
  Prometheus sets the instance label to the value of __address__ after
  relabeling if you don't set the instance label explicitly to another value during
  relabeling.
- __scheme__: Contains the HTTP scheme (http or https) with which the
  target should be scraped. Defaults to http.
- __metrics_path__: Contains the HTTP path to scrape metrics from.
  Defaults to /metrics.
Another interesting label is the __param_[name] label, allowing you to send HTTP
query parameters along with a scrape on any target. For example, you could set the
__param_filter label to the value active to send a
filter=active HTTP query parameter.
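To make the interplay of these labels concrete, here is a minimal Python sketch of how a scrape URL could be assembled from them, using the defaults stated above. The functions scrape_url and instance_label are hypothetical helpers written for this illustration and are not part of Prometheus; the address is a made-up example value.

```python
from urllib.parse import urlencode


def scrape_url(labels: dict[str, str]) -> str:
    """Build the URL a scrape would hit, based on the special target labels."""
    scheme = labels.get("__scheme__", "http")         # defaults to http
    path = labels.get("__metrics_path__", "/metrics")  # defaults to /metrics
    # Every __param_[name] label becomes a [name]=value query parameter.
    prefix = "__param_"
    params = {k[len(prefix):]: v for k, v in labels.items() if k.startswith(prefix)}
    url = f"{scheme}://{labels['__address__']}{path}"
    if params:
        url += "?" + urlencode(params)
    return url


def instance_label(labels: dict[str, str]) -> str:
    """The instance label falls back to __address__ unless set during relabeling."""
    return labels.get("instance", labels["__address__"])


target = {"__address__": "localhost:8080", "__param_filter": "active"}
print(scrape_url(target))      # http://localhost:8080/metrics?filter=active
print(instance_label(target))  # localhost:8080
```

Setting __scheme__ to https or __metrics_path__ to another path in the target dictionary changes the resulting URL accordingly, which mirrors what relabeling these labels does to scrape behavior.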
Service discovery - Metadata (__meta_) labels
Finally, every service discovery mechanism can provide discovery specific metadata about a target
using labels starting with __meta_. For example, service discovery for
Kubernetes provides:
- __meta_kubernetes_pod_name label for each pod target
- __meta_kubernetes_pod_ready label indicating if the pod is in a
  ready state or not
There are numerous metadata labels available for each service discovery mechanism and you are
encouraged to explore their individual configuration documentation to find out more. One
example is the Kubernetes service discovery configuration documentation.
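As an illustration of how such metadata can be used to filter targets, the following Python sketch models a 'keep' style relabeling step that retains only pods reporting a ready state. The keep function and the pod names are hypothetical; in a real setup you would express this as a relabeling rule in the Prometheus configuration rather than in code.

```python
import re


def keep(targets: list[dict[str, str]], source_label: str, regex: str) -> list[dict[str, str]]:
    """Keep only targets whose label value fully matches the regex."""
    pattern = re.compile(regex)
    # A missing label is treated as an empty string in this model.
    return [t for t in targets if pattern.fullmatch(t.get(source_label, ""))]


discovered_pods = [
    {"__meta_kubernetes_pod_name": "web-1", "__meta_kubernetes_pod_ready": "true"},
    {"__meta_kubernetes_pod_name": "web-2", "__meta_kubernetes_pod_ready": "false"},
]
ready_pods = keep(discovered_pods, "__meta_kubernetes_pod_ready", "true")
ready_names = [t["__meta_kubernetes_pod_name"] for t in ready_pods]
print(ready_names)  # ['web-1']
```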
Service discovery - Options for demo environments
For the rest of this lab you will be setting up a custom service discovery integration in
which Prometheus watches a set of local files containing target information. You can then write
custom code to update the target files and Prometheus will automatically adjust to any new
targets. You'll be using the file-based service discovery mechanism to feed a changing list of
custom targets to Prometheus during runtime.
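The custom code mentioned above can be very small. Below is a minimal Python sketch that writes a target file in the JSON layout used by file-based service discovery (a list of groups, each with "targets" and optional "labels"). The function write_targets, the file name targets.json, the addresses, and the env label are assumptions chosen for illustration and are not taken from the workshop project.

```python
import json
import os
import tempfile


def write_targets(path: str, groups: list[dict]) -> None:
    """Write a target file, replacing the old one in a single step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so a reader never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(groups, tmp, indent=2)
    os.replace(tmp_path, path)


# Hypothetical targets; replace with the addresses of your own services.
groups = [
    {"targets": ["localhost:8080", "localhost:8081"], "labels": {"env": "demo"}},
]

with tempfile.TemporaryDirectory() as demo_dir:
    target_file = os.path.join(demo_dir, "targets.json")
    write_targets(target_file, groups)
    with open(target_file) as f:
        loaded = json.load(f)

print(loaded)
```

Writing to a temporary file and then renaming it is a design choice made here to avoid exposing a partially written file to anything watching the directory; calling write_targets again with a different list simulates targets appearing and disappearing.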
The lab exercise uses the services demo project to simulate several infrastructure environments,
all to be monitored by a Prometheus instance. They all can be installed on your local machine in
one of two ways, so please click on the option you want to use to continue with this workshop: