Antenna is a new method of gaining seamless access to your log-level data for querying in a SQL interface without the need for ingesting, cleaning and transforming your log files.
In this tutorial, we’ll explore a few ways you can use the log-level data that Antenna makes available to analyze your data in ways that are unavailable to you in Beeswax’s Query Tool. Throughout this process we’ll work our way up from basic queries to queries that are more advanced.
While we’ll work our way up in difficulty, this tutorial assumes a basic working knowledge of SQL. For a tutorial on SQL basics, one resource is W3Schools’ [SQL Tutorial.](🔗)
## Example 1: Calculating Reach and Frequency
Let’s start with a relatively basic analysis: finding the Reach and Frequency of our campaigns. For this analysis we only need data on won impressions, so we’ll use the table that houses this: `
Let’s start with the basic framework of the query, selecting the fields we need and getting a basic impression count before we bring in Reach and Frequency. We’ll aggregate over the `
campaign_id` and `
line_item_id` dimensions for the last 30 days of data. We’re pulling a count of all the rows here, as this table represents impressions.
Now we have all our campaigns broken down by line items sorted by impressions in descending order. This isn’t anything too special though — you can pull this in Query Tool today!
Let’s start getting closer to the analysis we actually want, with Reach and Frequency calculations. To calculate Reach we’ll use Snowflake’s [HLL](🔗) function which runs a HyperLogLog approximate count analysis. This will trade off a small degree of accuracy (+/- 1.62%) for a much more cost-efficient query.
Great! We’ve now got a report calculating reach and frequency across our line items. However, for good measure let’s swap this out for a dimension not available in Query Tool today — `
With a few small steps in Antenna, we’ve now gone from a query you could fully run in Query Tool to one that contains both dimensions and metrics in a frequency analysis that were previously unavailable to you.
## Limiting to Converters
Let’s do one final alteration that may offer a little bit more insight. Let’s find out the distribution of impression frequency for converters. Let’s start by getting the frequency for each `
user_id`, but limited to only impressions that had a conversion. We’ll use a slightly different methodology than before. This analysis doesn’t handle all of the niche cases, but will pull the frequency of impressions for any user who made at least one conversion and saw at least one impression in the last 30 days.
Close, but we’re not done yet to get the frequency distribution. Let’s wrap this in a subquery to take these results and get a distribution. Because there may be a long-tail of users with a large number of impressions. Let’s group all the users with 10+ impressions together into a single bucket. To do this we need to also cast `
impressions_per_user` to the string datatype using `
With these results, we can take the results into a spreadsheet program and create a nice visualization of the frequency distribution (log scale).