tl;dr All metrics are numbers, but that doesn’t mean they all behave the same when you analyze them. Knowing the types of metrics helps you understand a) whether you are using the right metric, b) what the potential pitfalls are and c) what else you could use.
Metrics are one of the - if not the - most important ingredients in data & analytics. When building a report you always start with a single table. Each row contains a fact, an observation or a case. These could be business objects like orders, products and users, or observations (summaries/aggregates) like how many orders happened on a particular day in a particular city. The columns are either dimensions or metrics describing individual attributes of the row objects. Metrics are always numbers, while dimensions are mostly textual/categorical attributes or dates.
Any kind of report on that table, independently of whether it will be a bar chart, line graph, pie chart or just a table itself, would group (aggregate) the rows by certain dimensions in order to show what happened on different days or months or within different cities, etc. The crucial thing about metrics is that if rows are grouped, e.g. when analyzing orders in different cities, metrics can be aggregated using mathematical operations like sums or averages.
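To make this concrete, here is a minimal sketch of grouping rows by a dimension and summing a metric, using plain Python and hypothetical order data (the dates, cities and revenues are made up for illustration):

```python
from collections import defaultdict

# Hypothetical reporting-layer rows: dimensions (date, city) plus a metric (revenue)
rows = [
    {"date": "2024-01-01", "city": "Berlin",  "revenue": 120.0},
    {"date": "2024-01-01", "city": "Hamburg", "revenue":  80.0},
    {"date": "2024-01-02", "city": "Berlin",  "revenue": 200.0},
]

# Group by the "city" dimension and aggregate the metric with a sum
revenue_by_city = defaultdict(float)
for row in rows:
    revenue_by_city[row["city"]] += row["revenue"]

print(dict(revenue_by_city))  # {'Berlin': 320.0, 'Hamburg': 80.0}
```

Any reporting tool does essentially this behind the scenes when you drag a dimension and a metric into a chart.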
Note: you may have more than one table in the beginning, but right before any report you should have joined/prepared everything into one table, the so-called reporting layer.
There can be a lot of metrics in a business, thousands in fact. We have created metric frameworks for customers who want to foster the creation of data-driven marketing initiatives. Team members should be able to select the best goal metrics (KPI) for their initiatives by drilling down through a series of filters. If you look at a metric like cost per Facebook video view in the awareness phase, you can imagine that the combinations can easily exceed 1000. Having a clear understanding of the possible types or categories of metrics will help you select the right ones for your case.
Furthermore, metrics can sometimes be misleading. If you are not sure whether your metric is the right one to use, it helps to determine its category and learn what applications and pitfalls come with that type.
This is exactly what this article tries to do. Please note that the categorization is not along a single dimension but covers multiple aspects. This means that metrics can belong to several categories at once.
Measured metrics are, as the name suggests, directly measured. In general you can only measure extensive metrics (see below). In some cases, if the data comes from sources outside of your control, you may not know this. Be careful in such cases, as these may be hidden calculated or even guessed metrics.
Calculated metrics can’t be measured directly as they are calculated from at least two values. Through this combination of source metrics you can eventually create a large number of calculated metrics, and many of them (unfortunately) do make some sense. Most of the following metric types are in fact also calculated metrics.
Extensive (regular) metrics are the simplest ones. The important property is that they add up if you aggregate them (group several rows). They are metrics like counts, quantities or revenues. Very often they can be directly measured. Note: there is always at least one regular metric in a table, even if there are no numerical columns. It’s the count of rows, which you can make explicit by adding a column and putting a 1 in each row.
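The explicit row-count metric can be sketched like this, with made-up city rows that have no numeric column at all:

```python
# A table with no numeric columns still has one extensive metric: the row count.
# We make it explicit by adding a column holding a 1 for every row.
rows = [
    {"city": "Berlin"},
    {"city": "Berlin"},
    {"city": "Hamburg"},
]
for row in rows:
    row["count"] = 1  # explicit count metric

# Because the metric is extensive, grouping simply sums the 1s
orders_per_city = {}
for row in rows:
    orders_per_city[row["city"]] = orders_per_city.get(row["city"], 0) + row["count"]

print(orders_per_city)  # {'Berlin': 2, 'Hamburg': 1}
```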
There are also pseudo-extensive metrics. Basically they are extensive metrics, but summing them up doesn’t make sense for us. Take age as an example: if you want to know the age of all people in a city, you usually don’t want the sum but rather the average, which is actually a calculated metric already.
Intensive metrics are kind of the opposite of extensive metrics (see regular metrics above). They are calculated by dividing two extensive metrics and therefore don’t sum up if you group several rows together. Instead you would expect something like the average of all rows. Examples are revenue per user, conversion rates (users that did B divided by the users that did A) or age per user. The latter may sound a bit weird, but technically average(age) is nothing else than sum(age) (our pseudo-extensive metric from above) divided by count(user).
Attention: don’t put intensive metrics directly into the tables that you start your report with. The reason is the following: if you have a table containing the average age of users per day per city and you would like to generate a report showing the average age per city, you would be calculating the average of averages, which is not the same as the overall average. In fact it would overweight days with fewer users. You should rather have the reporting tool calculate intensive metrics directly from the extensive metrics at the granularity (in this case city) that the report needs. You do this by creating the intensive metric as a custom metric in the reporting tool.
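The pitfall is easy to demonstrate with two hypothetical days, one with a single young user and one with many older users:

```python
# Daily rows with a pre-aggregated average age (intensive) plus user counts (extensive)
daily = [
    {"day": "Mon", "avg_age": 20.0, "users": 1},   # one young user
    {"day": "Tue", "avg_age": 40.0, "users": 99},  # many older users
]

# Wrong: averaging the averages treats both days as equally important
avg_of_avgs = sum(d["avg_age"] for d in daily) / len(daily)  # 30.0

# Right: rebuild the intensive metric from the extensive source metrics
total_age = sum(d["avg_age"] * d["users"] for d in daily)  # sum(age)
total_users = sum(d["users"] for d in daily)               # count(user)
true_avg = total_age / total_users                         # 39.8

print(avg_of_avgs, true_avg)
```

The naive average of averages is off by almost 10 years because Monday, with a single user, gets the same weight as Tuesday with 99.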
Another warning: the two source metrics that you divide should relate to the same group of objects. For example, if you want the conversion rate from all users doing a signup to users doing an upgrade, you would just divide these two counts. However, if you want this on a daily basis, the group of users signing up on a particular day is certainly different from the group of users upgrading on that day. Dividing these numbers is likely to be wrong. You need cohorted metrics here, even though the simple quotient may be useful as a proxy metric.
These metrics are simply sums or differences of other metrics. An example is revenue = sum(order_revenue) - sum(return_revenue). If these other metrics are regular extensive metrics, then the result is also extensive, and hence the aggregation restrictions of intensive metrics don’t apply, as the sum of sums is equal to the sum. The warning about cohorted metrics, however, does apply.
Filtered metrics are derived from a regular metric by filtering rows on a certain condition, for example analyzing users coming from Facebook separately from LinkedIn users. Another example is new users vs. all users. In a way we are deriving metrics from dimensions here by counting how often they appear. Data scientists call this kind of transformation one-hot encoding.
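Both views, filtering rows and one-hot encoding the dimension into a 0/1 column, give the same count. A minimal sketch with hypothetical user rows:

```python
users = [
    {"id": 1, "channel": "facebook"},
    {"id": 2, "channel": "linkedin"},
    {"id": 3, "channel": "facebook"},
]

# Filtered metric: count only the rows matching a condition
facebook_users = sum(1 for u in users if u["channel"] == "facebook")

# Same idea as one-hot encoding: turn the dimension value into a 0/1 metric column
for u in users:
    u["is_facebook"] = 1 if u["channel"] == "facebook" else 0
facebook_users_via_flag = sum(u["is_facebook"] for u in users)

print(facebook_users, facebook_users_via_flag)  # 2 2
```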
They are kind of a special case of intensive metrics. Here, however, we specifically want to divide metrics that relate to different groups, in particular where one group is a subset of the other. A simple case is the percentage of Facebook visitors among all visitors. This can easily be achieved by dividing a filtered metric by the unfiltered one, or by using a table calculation in the reporting tool like percentage of total.
However, there are more complex cases. Let’s assume you are looking at revenue per user as one of your important metrics, and you would like to know how much each channel contributes to the revenue per user. This is especially important if this metric changed and you want to know where the change came from. If you just take a regular intensive metric (revenue/user) and group it by channel, the reporting tool would calculate sum(revenue, grouped by channel) / count(users, grouped by channel), and these values wouldn’t sum up to the overall revenue per user. In fact there may be channels with an extraordinarily high revenue per user but a low volume, which means they don’t contribute a lot. What you actually want is something like sum(revenue, grouped by channel) / count(users, not grouped by channel). This is called a level of detail (LOD) expression, and some tools like Tableau offer it. In some cases a table calculation can also work.
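The LOD idea can be sketched in a few lines: the numerator is grouped by channel, the denominator deliberately is not. The channel rows here are hypothetical:

```python
# Pre-aggregated rows per channel (hypothetical numbers)
orders = [
    {"channel": "ads",     "revenue": 500.0, "users": 5},
    {"channel": "organic", "revenue": 500.0, "users": 95},
]

# Denominator at a coarser level of detail: users NOT grouped by channel
total_users = sum(o["users"] for o in orders)

# Per-channel contribution to the overall revenue per user (LOD-style)
contribution = {o["channel"]: o["revenue"] / total_users for o in orders}

# These contributions now add up to the overall revenue per user
overall = sum(o["revenue"] for o in orders) / total_users  # 10.0
print(contribution, overall)
```

Note that the naive grouped quotient would show 100.0 revenue per user for ads, even though ads contributes only half of the overall 10.0, which is exactly the distortion described above.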
Cohorts are a complex topic and ultimately beyond the scope of this article. Simply speaking, it’s about what belongs together. The number of signups on a day and the number of upgrades on that day share the same date but are in fact rather unrelated. As an analyst you want to know what affects what, and in this sense we would like to know how well our product can convince users to upgrade after signup, independently of when these two events actually happened. If you still want a temporal resolution (i.e. knowing how this conversion rate evolves over time), you need to recalculate both events to the same date (most likely the signup date) using the user’s identity. An example is the lifetime value (LTV) of a user, which sums up all revenues of a particular user. You can then determine the average LTV based on the signup date of the users to get a time series of this metric.
The key to creating a cohorted metric is in the table that you use for the report. Instead of noting the date of each event (signup, upgrade), you take the signup date of the related user for all events. This may require a bit of joining in the table preparation. You can’t do this in the reporting tool if you haven’t prepared it.
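The join can be sketched as follows, with hypothetical user IDs and dates; every event gets the signup date of its user as the cohort date:

```python
# Signups assign each user a cohort date ...
signups = {"u1": "2024-01-01", "u2": "2024-02-01"}

# ... while later events carry their own event date
events = [
    {"user": "u1", "date": "2024-03-15", "type": "upgrade"},
    {"user": "u2", "date": "2024-03-20", "type": "upgrade"},
]

# "Join" the signup date onto every event via the user's identity
for e in events:
    e["cohort_date"] = signups[e["user"]]

# Now upgrades can be grouped by signup (cohort) date instead of event date
upgrades_by_cohort = {}
for e in events:
    upgrades_by_cohort[e["cohort_date"]] = upgrades_by_cohort.get(e["cohort_date"], 0) + 1

print(upgrades_by_cohort)  # {'2024-01-01': 1, '2024-02-01': 1}
```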
The problem with this type of metric is that it is lagging. Some of the events that will still contribute to users with a signup date in the past lie in the future, so the value of the metric for past dates will still change. While the values saturate over time, it can take several months until they are usable. Unfortunately, if you look at metrics like LTV, it always looks as if your business has been doing badly lately. Proxy metrics or modeled metrics may help here.
In some cases the metric you need can’t be measured or directly calculated from measured metrics. Cohorted metrics like LTV are often of this type because some revenue is still in the future. Another example would be predicting next year’s revenue. In these cases you need a model which calculates your metric based on metrics you can measure and a number of parameters that you need to provide. Generally you would estimate these parameters from historic data and/or make assumptions like “revenue will increase by 10%”. Neither will necessarily hold for the current data, and hence predicted or modeled metrics always come with uncertainty.
A special case is interpolation. For example, if you have a metric only on a monthly basis but need to compare it to other data that is on a daily basis, you could calculate daily numbers from the monthly data by interpolation. A related method is extrapolation, which calculates values outside the current range of your data. It’s mostly used for predicting future values.
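A linear interpolation between two monthly anchor values (the numbers are made up) is enough to illustrate the idea:

```python
# Two monthly anchor values of a metric (hypothetical numbers)
month_start, month_end = 3000.0, 3300.0
days_in_month = 30

# Linear interpolation: spread the monthly change evenly over the days
daily = [
    month_start + (month_end - month_start) * d / days_in_month
    for d in range(days_in_month + 1)
]

print(daily[0], daily[15], daily[30])  # 3000.0 3150.0 3300.0
```

Extrapolation would use the same formula with d beyond days_in_month, i.e. outside the known range, which is why it carries more uncertainty.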
As in the previous paragraph, if you can’t measure or directly calculate the metric that you really need, you could use a proxy metric. In contrast to a modeled metric, a proxy metric does not try to mimic the desired metric; instead it is an actually measured or calculated metric that you think is a good stand-in for the desired one. For example, to measure the performance of workers it may be convenient to measure worked hours instead of work output, as the latter is difficult to determine, or at least difficult to compare when workers are assigned varying tasks. You would assume that there is a factor between worked hours and work output which is potentially different for different workers. Instead of trying to determine these factors (e.g. from historic data, as you would for modeled metrics), you would just use the proxy metric directly, for example for determining salary. Either way, both proxy metrics and modeled metrics will show uncertainty.
Another example would be to proxy LTV by dividing all revenue from a day by the number of new users on that day. Again, this relates to different groups of users, as explained above, but if the distributions of these groups do not change drastically over time, this may still be a good approximation. And it has the big advantage of not being a lagging metric, so you can see ups and downs faster (at the cost of accuracy).
Target metrics are simply defined. Whether they come from pure gut feeling or are actually calculated, for example based on historical data and some assumptions (which makes them effectively predicted metrics), is a matter of taste. When using them in reports, make it clear that they are targets.
Business-related categories:
Scores are probably the weirdest type of metric from a scientist’s perspective, yet they are very often used in a business context. Basically they add up unrelated metrics. Mathematically they are mostly weighted sums like s = a1*M1 + a2*M2 + a3*M3. This way you can add, for example, revenues and users even though they have different units. Scores reflect that different factors (like revenue and users) may contribute to business success, but they require assumptions about their weights (and these may change). An example is an RFM score (recency, frequency, monetization), which is a weighted sum of the revenue of a user, the number of transactions and the time in days since the last transaction.
You should be very careful with scores. They are very opaque as they mix up a lot of things. They may work as a high-level alerting mechanism, but once you need to find out why a score moved, you have to look at the underlying metrics.
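A minimal sketch of an RFM-style score; the weights are entirely hypothetical, and choosing them is exactly the assumption that makes scores hard to interpret:

```python
# Hypothetical weights: recency counts against the score, frequency and
# monetization count for it. Nothing fixes these values; that's the assumption.
w_recency, w_frequency, w_monetary = -0.5, 2.0, 0.25

def rfm_score(days_since_last, transactions, revenue):
    # Weighted sum of otherwise unrelated metrics with different units
    return (w_recency * days_since_last
            + w_frequency * transactions
            + w_monetary * revenue)

print(rfm_score(days_since_last=10, transactions=4, revenue=250.0))  # 65.5
```

Note that a change in any of the three inputs, or in any weight, moves the score, which is why a moving score alone tells you nothing about its cause.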
As said, there are easily thousands of metrics in a business, and it makes no sense to look at all of them. Instead, a business should define a small set of metrics that are most effective for assessing performance and steering decisions. These are the KPI.
Note that KPI exist on different levels. The C-level needs different KPI than the business units, and a channel manager different ones again.
A related term is the vanity metric, a KPI which shouldn’t be one. Vanity in this case means that the metric may look good (e.g. thousands of newly signed-up users), but if those users were all promised free beer, they will most likely not contribute to the business’ success in the future.
Another special case are currency metrics. They are important in larger organizations with lots of business units having their own KPI. Unfortunately, these units often also have their own definitions of what should actually be the same KPI: one department measures revenue from Google Analytics data, another directly from shop data, and a third even subtracts returns. When they talk to each other, everyone has a different value for the same thing, and a single source of truth is missing. Here it is important to identify those currency metrics that matter to more than one business unit and define them centrally. This could be done, for example, in data contracts within a data mesh concept.
Understanding the properties and pitfalls of metrics is essential when using them in analyses and reports. Differences exist with respect to how the underlying data is generated, how to deal with the metrics when grouping and aggregating, and how to interpret them in a business context. If you have any questions or comments, let us know.