Chronosphere - Get back control of your o11y data
When your scale becomes an issue, both in cost and resources, take back control of your data
with Chronosphere. This is a SaaS platform that is 100% compatible with Prometheus and PromQL,
ingests existing metrics collecting efforts, and puts you back in control of your data costs:
Chronosphere is a full-service, end-to-end, SaaS Observability platform, inclusive of metrics,
events, traces, and logs
We power engineers through the observability lifecycle to: Know about a problem (alerts), Triage
a problem (explore and analyze in dashboards and traces), & Understand the root cause of that
problem (perform root cause analysis)
Data collection:
Breadth and scale Collect and manage metrics, logs, traces, and events in hundreds of open source
or proprietary formats, from Prometheus and OpenTelemetry to Datadog or Splunk, regardless of
scale.
Control Plane:
Cost and Value Our control plane gives you the power to transform your observability data based
on the need, context, and utility.
It analyzes the usage and the value of your data, enabling you to determine what is useful and
what is waste.
It refines your data to improve cost and user experience.
You can refine data at the edge for bandwidth savings or centrally for detailed control.
You can now send log data to any log store, in your own environment (like S3) or to a third
party SaaS (Splunk).
The result:
Operate more efficiently by optimizing costs and solving problems faster by giving teams fast
access to the right data.
Reliability & Performance:
Consistent delivery Our promised SLA is 99.9% uptime.
We have the most efficient engine for collecting and storing your data – we learned how to build
industrial-grade, highly performant, and highly available storage from building M3 at Uber.
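To put the availability figures in this deck into perspective, here is a small arithmetic sketch that converts an uptime percentage into the downtime it allows per year. The function name is made up for illustration, and the calculation assumes a 365-day year; it is not a statement of how any SLA is actually measured.

```python
# Illustrative arithmetic only: converts an availability percentage into
# the downtime it permits over a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("one 9", 90.0), ("three 9s", 99.9), ("four 9s", 99.99)]:
    minutes = allowed_downtime_minutes(pct)
    print(f"{label} ({pct}%): {minutes:,.1f} minutes/year (~{minutes / 60:.1f} hours)")
```

Under these assumptions, three 9s leaves roughly 8.8 hours of downtime per year, while four 9s leaves under an hour.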
Right data with context:
Lens - In most observability platforms, when engineers log in, there is so much data that they
don't know where to go to find what they need to solve problems.
Chronosphere Lens takes the raw data and surfaces it as actionable insights relevant to the
problem at hand.
All with no vendor lock in:
All the ins and outs of our platform are open source compatible, from ingestion to querying, to
dashboards, and alerting.
Chronosphere - Value drivers
Exceed your customers’ expectations:
Observability needs to be AT LEAST as highly available as the product to deliver the best experience.
Chronosphere is 5x more reliable than the competition.
Robinhood:
During the craziness of the GameStop (meme stock) trading frenzy, Robinhood experienced some
serious outages and had 37 days of trade execution impacted.
Their observability platform was not getting the job done: it had only one 9 of availability
and terrible performance, with dashboards taking 15 minutes to load and queries timing out.
Often when they had an outage, their Observability tooling was unavailable or not helpful.
Now, with Chronosphere:
The results for Robinhood in production were impactful to their business.
The dashboard and query performance improvements led to a 4x improvement in MTTD (2 min to 30
seconds - average), and as a result they also saw a 75% reduction of critical incidents.
Control cost:
Fair pricing model not based on hosts or users
Customers determine what data they’re willing to pay for
Results in reduced costs and long-term efficiency
Zillow:
Saved millions of dollars and thousands of engineering hours by consolidating their six
different and fragmented observability solutions and standardizing on Chronosphere.
Their 6 person central observability team was tasked with managing 6 different and fragmented
solutions ranging from OSS to SaaS and 1000s of alerts. Not only was this a management headache,
it also prevented them from being able to track and resolve issues across their distributed
systems quickly, ultimately impacting user experience.
With Chronosphere, they consolidated onto a single platform across their business, reducing total
spend by 90% by optimizing their data volumes by 80%, while improving their observability
reliability to four 9s.
Improve developer productivity:
Deep context provided across data types to speed all phases of observability
Optimized for outcomes, not inputs
Usable by all engineers, not just a select set of power users
Affirm: Prior to using Chronosphere, Affirm struggled with availability, performance and cost
with their legacy observability vendor. With Chronosphere, they are able to achieve their
objective of attracting and retaining highly satisfied customers and engineers.
Every day engineers would click on alerts from their 8-figure observability tool, just to be
faced with extended latencies. Engineers complained about missing MTTR targets but with 2 years
left on their contract, they were stuck.
Affirm's biggest revenue generating event of 2021 was BFCM (Black Friday Cyber Monday). Their
traffic grew 218% to over 1.3B ATS causing their observability solution to tip over and alerts
to not fire, resulting in issues not getting remediated.
In parallel with the incoming traffic, the complaints from engineering and customers rose.
Engineers complained about downtime and latencies while customers complained about slow or
failed transactions.
Since coming on board, Affirm has had over 4 9’s of uptime, saved over 14k eng hrs per year by
improving their ingest and query latencies while also saving $5M a year by using our control
plane to reduce unused data. All in under 4 months.
Chronosphere - Control methodology
Data Explosion -> Control.
Market leaders in control, with products that help our customers make data driven decisions
about what to keep and what to discard with full awareness of cost/value tradeoffs so that they
can make budget-saving decisions without impacting developer workflows.
We pioneered this approach for metrics and it’s been disruptive in the market.
Chronosphere - Metrics control
How do we do it?
Our control plane for metrics offers a few core features.
First, traffic analyzer - see data before you pay for it.
Second, complete toolkit for data shaping. Keep, drop, rollup, transform things like derived
metrics. Best in class tooling for clients to make nuanced decisions.
Third, usage analyzer - for data you’ve chosen to ingest, see how developers are making use of
it. Is it valuable or not? Use this to close the loop and be able to continuously, proactively
tailor your data over time.
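To make the data shaping idea concrete, here is a minimal sketch of two of the operations named above: dropping a metric entirely and rolling up the rest by removing a high-cardinality label. The metric names, label names, and the `shape` function are all hypothetical, and this is not how Chronosphere implements its rules; it only shows the kind of effect such rules have on series count.

```python
from collections import defaultdict

# Hypothetical samples: (metric name, labels, value).
samples = [
    ("http_requests_total", {"service": "checkout", "pod": "a"}, 120.0),
    ("http_requests_total", {"service": "checkout", "pod": "b"}, 80.0),
    ("http_requests_total", {"service": "search", "pod": "c"}, 50.0),
    ("debug_cache_probe", {"service": "search", "pod": "c"}, 1.0),
]


def shape(samples, drop_metrics, rollup_labels):
    """Drop unwanted metrics, then sum the rest after removing the given labels."""
    rolled = defaultdict(float)
    for name, labels, value in samples:
        if name in drop_metrics:
            continue  # "drop": the metric is never stored
        # "rollup": series that differ only in the removed labels are merged
        kept = tuple(sorted((k, v) for k, v in labels.items() if k not in rollup_labels))
        rolled[(name, kept)] += value
    return dict(rolled)


shaped = shape(samples, drop_metrics={"debug_cache_probe"}, rollup_labels={"pod"})
for (name, labels), value in sorted(shaped.items()):
    print(name, dict(labels), value)
```

In this toy input, four series go in and two come out: the per-pod detail is gone, but the per-service totals that dashboards and alerts typically use are preserved.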
That was metrics - and we’ve been expanding this toolkit to other telemetries, as mentioned.
Chronosphere - Trace control
What does Control look like for tracing?
Similarly, we first let you analyze your traces with our Trace Analyzer, to understand what you
have so that you can make decisions about what you actually want.
Then, for tracing, like for metrics, we offer a full suite of traffic shaping tools. For traces
these look like:
- Dynamic head sampling
- Tail sampling
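For readers less familiar with the distinction: head sampling decides whether to keep a trace when it starts, while tail sampling decides after the spans have been seen. The sketch below illustrates a tail sampling decision only. The `Span` type, the threshold, and the "keep errors and slow traces" policy are assumptions chosen for illustration, not Chronosphere's rules.

```python
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str
    service: str
    duration_ms: float
    is_error: bool = False


def tail_sample(spans, slow_threshold_ms=500.0):
    """Return the trace IDs worth keeping once all spans have been seen.

    Hypothetical policy: keep a trace if any of its spans errored or was slow.
    """
    keep = set()
    for span in spans:
        if span.is_error or span.duration_ms >= slow_threshold_ms:
            keep.add(span.trace_id)
    return keep


spans = [
    Span("t1", "frontend", 40.0),
    Span("t1", "checkout", 35.0),
    Span("t2", "frontend", 900.0),                 # slow
    Span("t3", "payments", 20.0, is_error=True),   # failed
]
print(sorted(tail_sample(spans)))
```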
A bit more on our Dynamic Head Sampling feature because it differentiates us from others in the
market and also helps us offer that continuous adjustment loop (similar to continuous
improvement loop).
From our work with customers, we know that traces are really most valuable when something is
going wrong.
Let’s say right now I'm in the middle of an incident, and I want to change the head sampling
rate:
- Before, to change sampling rates, I’d usually have to redeploy my services or OpenTelemetry Collector
- That takes time I don’t really have
- Using Chronosphere, I don’t have to redeploy.
- Customers can now make dynamic head sampling decisions and “turn up” the head sampling rate
when an incident is happening and turn “it back down” when the issue has been identified.
- Customers can set sampling rules in a central location (using Terraform); the OTEL
collector automatically pulls them and applies any rule changes
- These changes can occur as frequently as every 15 seconds
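The idea of changing the head sampling rate without a redeploy can be sketched as a sampler whose rate is a piece of runtime state rather than a build-time constant. Everything here is hypothetical: the `DynamicHeadSampler` class, the hash-based decision, and the `set_rate` call, which stands in for whatever mechanism pulls updated rules from the central location. It is a sketch of the concept, not the OpenTelemetry Collector's or Chronosphere's implementation.

```python
import threading
import zlib


class DynamicHeadSampler:
    """Head sampler whose rate can be changed while the process is running."""

    def __init__(self, rate: float):
        self._lock = threading.Lock()
        self._rate = rate

    def set_rate(self, rate: float) -> None:
        # Stand-in for applying a rule change pulled from a central source.
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        with self._lock:
            self._rate = rate

    def should_sample(self, trace_id: str) -> bool:
        with self._lock:
            rate = self._rate
        # Deterministic per trace ID, so every service makes the same decision.
        bucket = zlib.crc32(trace_id.encode()) % 10_000
        return bucket < rate * 10_000


sampler = DynamicHeadSampler(rate=0.01)   # normal operation: keep about 1%
trace_ids = [f"trace-{i}" for i in range(1000)]
before = sum(sampler.should_sample(t) for t in trace_ids)

sampler.set_rate(1.0)                     # incident: turn sampling all the way up
during = sum(sampler.should_sample(t) for t in trace_ids)
print(f"kept {before} traces before the change, {during} after")
```

Because the decision is derived from the trace ID, raising the rate only adds traces; anything sampled at the lower rate is still sampled at the higher one.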
We have built-in tooling for trace control that helps our customers be proactively responsive
and adjust their trace data as needed to always have the best set of data for the current
incident/problem.
Chronosphere - Log control
And what does Control look like for logs?
We acquired Calyptia to accelerate our ability to offer logs alongside other telemetry in
Chronosphere, with a Control feature set out of the gate.
It allows customers to analyze their log data in real time during collection
It supports thousands of open source and proprietary formats
It helps customers reduce data by an average of 30% with just the 20 TB data transformations it
supports
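As a rough picture of what transforming logs during collection can mean, the sketch below drops debug records and strips a couple of fields before anything is shipped, then compares byte counts. The field names, the two rules, and the sample records are invented for illustration; they are not taken from Calyptia or Chronosphere, and the reduction you would see depends entirely on your own data.

```python
import json
from typing import Optional

NOISY_FIELDS = {"hostname_fqdn", "trace_flags"}  # hypothetical field names


def transform(record: dict) -> Optional[dict]:
    """Apply two illustrative rules: drop debug records, strip noisy fields."""
    if record.get("level") == "debug":
        return None
    return {k: v for k, v in record.items() if k not in NOISY_FIELDS}


def size_in_bytes(records) -> int:
    return sum(len(json.dumps(r).encode()) for r in records)


raw = [
    {"level": "info", "msg": "order placed", "hostname_fqdn": "web-1.example.internal"},
    {"level": "debug", "msg": "cache probe", "hostname_fqdn": "web-1.example.internal"},
    {"level": "error", "msg": "payment failed", "trace_flags": "01"},
]
shaped = [t for t in map(transform, raw) if t is not None]

before, after = size_in_bytes(raw), size_in_bytes(shaped)
print(f"kept {len(shaped)} of {len(raw)} records, {100 * (1 - after / before):.0f}% fewer bytes")
```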
Chronosphere - Lens (dynamic service views)
Developer inefficiency -> Context
Chronosphere Lens is one way that our platform gives developers a familiar platform and
entrypoint for their observability workflows in Chronosphere.
Lens offers customers their data through their business context - instead of metrics, it is
their Business Services. It is the links between their Services and things up and downstream.
It is links from their Services to all relevant Dashboards, Monitors, Traces, Events, and Logs.
Giving developers a clear path to the tooling they need to solve problems relevant to them.
- Service view
- Automatically extracted infra and grpc metrics
- Dependency view generated from trace data
- Links to other services
- Back to original service
- Explore traces
- Integration with change events
Ease of use and next generation of APM.
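One way to picture a dependency view generated from trace data: every span whose parent belongs to a different service implies a call from the parent's service to the child's. The sketch below derives those edges from a handful of made-up spans. The span fields and the `dependency_edges` function are hypothetical and only illustrate the general technique, not how Lens builds its views.

```python
from collections import Counter

# Hypothetical spans from a single trace.
spans = [
    {"span_id": "1", "parent_id": None, "service": "frontend"},
    {"span_id": "2", "parent_id": "1", "service": "checkout"},
    {"span_id": "3", "parent_id": "2", "service": "payments"},
    {"span_id": "4", "parent_id": "2", "service": "checkout"},  # internal call
]


def dependency_edges(spans):
    """Count caller -> callee service pairs implied by parent/child spans."""
    service_by_span = {s["span_id"]: s["service"] for s in spans}
    edges = Counter()
    for s in spans:
        parent_service = service_by_span.get(s["parent_id"])
        # Skip root spans and calls that stay inside one service.
        if parent_service and parent_service != s["service"]:
            edges[(parent_service, s["service"])] += 1
    return edges


for (caller, callee), count in sorted(dependency_edges(spans).items()):
    print(f"{caller} -> {callee} ({count} call)")
```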
Chronosphere - Change event tracking
Centralized Change Visibility - Our customers needed a unified view of changes across the entire
stack and development org. So we provide a centralized timeline that presents who is making what
change, when, and where, to help eliminate friction and knowledge dependencies that occur when
teams’ operational events are in silos.
Customization and Flexibility - We knew early on that 'events' were not standardized, and they’d
be coming in from many sources containing unique properties. This led us to defining a very light
schema making it easy to get data in, and we paired that with a simple but powerful filtering
syntax, so teams and individuals can focus on the events that truly matter for their operations.
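To illustrate what a very light schema paired with simple filtering could look like, here is a sketch with a few required fields, a free-form label map for source-specific properties, and an equality filter over both. The field names, the `ChangeEvent` type, and the `matches` helper are assumptions made for this example; they are not Chronosphere's actual event schema or filter syntax.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

CORE_FIELDS = {"title", "source", "category"}


@dataclass
class ChangeEvent:
    # A small required core...
    title: str
    source: str
    category: str
    happened_at: datetime
    # ...plus free-form labels for whatever each source wants to attach.
    labels: dict = field(default_factory=dict)


def matches(event: ChangeEvent, **criteria) -> bool:
    """True if every criterion equals the core field or label of that name."""
    for key, wanted in criteria.items():
        actual = getattr(event, key) if key in CORE_FIELDS else event.labels.get(key)
        if actual != wanted:
            return False
    return True


now = datetime.now(timezone.utc)
events = [
    ChangeEvent("Deploy checkout v42", "ci", "deploy", now, {"service": "checkout", "team": "payments"}),
    ChangeEvent("Flag new_cart enabled", "feature-flags", "flag", now, {"service": "checkout"}),
    ChangeEvent("Deploy search v7", "ci", "deploy", now, {"service": "search"}),
]

checkout_deploys = [e.title for e in events if matches(e, category="deploy", service="checkout")]
print(checkout_deploys)
```

Keeping the required core small means a new source only has to supply a handful of values, while the labels still give individual teams something to filter on.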
Chronosphere - Key takeaways
Leading innovation for cloud native observability
A proven track record with the largest cloud native organizations
Our differentiators:
Best reliability (99.99% availability)
Complete end to end control
Intuitive developer experience
Chronosphere Observability Demo
Chronosphere is the only cloud native observability platform that helps you quickly resolve
incidents while controlling costs.