Track battery level in Google Tag Manager

Context is very important when analysing data. How about checking how battery level influences the behaviour of web users?

To solve this case I’ve prepared a code modification. It puts the battery level into the dataLayer, so you can track it in any web analytics solution using Google Tag Manager.

Create a Custom HTML tag with this code:
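
A minimal sketch of what such a tag could contain, assuming the browser exposes the Battery Status API (`navigator.getBattery()`); it is not necessarily identical to the code on GitHub. The event name `battery_status` and the helper `buildBatteryMessage` are hypothetical names chosen for this example, while the four `battery_*` keys match the dataLayer variables used in this post. In GTM the snippet has to be wrapped in `<script>` tags.

```javascript
// Map a BatteryManager-like object to a dataLayer message.
// 'battery_status' is a hypothetical event name chosen for this sketch.
function buildBatteryMessage(battery) {
  return {
    event: 'battery_status',
    battery_level: battery.level,                      // 0.0 to 1.0
    battery_is_charging: battery.charging,             // true or false
    battery_time_to_charge: battery.chargingTime,      // seconds, may be Infinity
    battery_time_to_discharge: battery.dischargingTime // seconds, may be Infinity
  };
}

// Run only where the Battery Status API exists (not every browser supports it).
if (typeof navigator !== 'undefined' && typeof navigator.getBattery === 'function') {
  navigator.getBattery().then(function (battery) {
    window.dataLayer = window.dataLayer || [];
    window.dataLayer.push(buildBatteryMessage(battery));
  });
}
```

In browsers without `navigator.getBattery()` the tag does nothing, so those visits simply carry no battery values.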

 

Publish your container and that’s it!

Now create new dataLayer variables in GTM to read these values and send them, e.g. to Google Analytics as session-scoped custom dimensions:

  • battery_level
  • battery_is_charging
  • battery_time_to_charge
  • battery_time_to_discharge

You can also check the source code on my GitHub.

Find signal in noise

How do you find value in messy data?

Sometimes during my projects I hear from my clients: “We’re collecting data. Let’s do something cool with it!” There are two popular ways to deal with this.

Scenario 1: start the analysis by following intuition. To be honest, from the outside this can look like “black magic” and it raises questions: Why did you use this method? Why not that one? The effect is low trust in the results.

The better solution is Scenario 2: a more formal approach.

CRISP-DM – the Data Science framework

My framework of choice is CRISP-DM (Cross Industry Standard Process for Data Mining). It dates back to 1996 but still works, and it is still the most popular Data Science framework (according to a KDnuggets survey).

Why do I use it?

  • This approach keeps the business goal in mind during the whole analysis process.
  • By following every step I create a complete, reproducible and well-documented analysis.
  • I get an answer to the initial question.

6 steps to find value in data

This framework defines 6 phases of the data analysis process:

1. Business Understanding

  • Don’t be afraid to ask questions. Talk with business team members.
  • Check the business context and set the goal of the analysis.
  • What do you want to achieve with this analysis?

2. Data Understanding

  • Collect the dataset.
  • Check what kind of data you have. What does every variable mean? Are there any technical constraints of this data source? (E.g. in the web analytics world the majority of tracking tools are based on cookies and JavaScript. If a user has disabled them in the browser, that user won’t be tracked and won’t appear in the dataset.)
  • Do some exploratory data analysis. Check distributions: some algorithms assume normally distributed variables.
  • What does a missing value mean? Is it an error in data collection or in data processing?
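
A quick way to start on the missing-value question is to count how many records lack each variable. The sketch below is only an illustration: `countMissing` and the column names in the usage note are hypothetical, and it treats `undefined`, `null`, empty strings and `NaN` as missing, which is an assumption you may need to adjust for your data source.

```javascript
// Count missing values per column in an array of row objects.
// Assumption: undefined, null, '' and NaN all count as "missing".
function countMissing(rows, columns) {
  var counts = {};
  columns.forEach(function (col) {
    counts[col] = rows.filter(function (row) {
      var v = row[col];
      return v === undefined || v === null || v === '' ||
        (typeof v === 'number' && isNaN(v));
    }).length;
  });
  return counts;
}
```

For example, `countMissing(rows, ['battery_level', 'pageviews'])` returns an object with one count per column. A variable that is missing in a large share of rows is worth tracing back to the collection or processing step.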

3. Data Preparation

  • Check variable types (factor / continuous).
  • Do some data cleaning.
  • Create new variables if necessary (e.g. convert a numeric variable to a flag).
  • The great book R for Data Science by Garrett Grolemund and Hadley Wickham can be very helpful for data wrangling in R.

4. Modeling

  • Divide the dataset into training and test subsets.
  • Conduct the data analysis with proper methods.
  • It is good practice to train a few models and then decide which method gives the best (most accurate) output.
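
As a rough illustration of the first bullet, the split into training and test subsets can be sketched like this. It is a generic sketch rather than the approach from any particular project: `trainTestSplit`, the test share and the seed are assumptions, and the small seeded generator (mulberry32) is only there so that the split is reproducible.

```javascript
// Small seeded pseudo-random generator (mulberry32), returns values in [0, 1).
function mulberry32(seed) {
  var a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    var t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Shuffle a copy of the rows (Fisher-Yates) and cut it into test and train parts.
function trainTestSplit(rows, testShare, seed) {
  var random = mulberry32(seed);
  var shuffled = rows.slice(); // do not mutate the input
  for (var i = shuffled.length - 1; i > 0; i--) {
    var j = Math.floor(random() * (i + 1));
    var tmp = shuffled[i];
    shuffled[i] = shuffled[j];
    shuffled[j] = tmp;
  }
  var testSize = Math.round(shuffled.length * testShare);
  return { test: shuffled.slice(0, testSize), train: shuffled.slice(testSize) };
}
```

Calling `trainTestSplit(rows, 0.3, 42)` keeps roughly 30% of the rows aside for testing. Using the same seed again gives the same split, which helps keep the analysis reproducible.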

5. Evaluation

  • Build an error matrix (confusion matrix) to compare the error rate of every model. Then either choose the single best model or use all models together with voting (e.g. if 3 of 4 models classify an observation into a group, take that result).
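
Both ideas from this step, the error matrix and the voting, fit in a few lines. The sketch below is a generic illustration under stated assumptions: the function names are hypothetical, predictions are plain arrays of labels, and ties in the vote go to the label that reached the top count first.

```javascript
// matrix[actual][predicted] = number of observations with that combination.
function confusionMatrix(actual, predicted) {
  var matrix = {};
  for (var i = 0; i < actual.length; i++) {
    var a = actual[i];
    var p = predicted[i];
    matrix[a] = matrix[a] || {};
    matrix[a][p] = (matrix[a][p] || 0) + 1;
  }
  return matrix;
}

// Share of observations where the prediction differs from the actual label.
function errorRate(actual, predicted) {
  var wrong = 0;
  for (var i = 0; i < actual.length; i++) {
    if (actual[i] !== predicted[i]) wrong++;
  }
  return wrong / actual.length;
}

// For each observation, pick the label predicted by most models.
function majorityVote(predictionsPerModel) {
  var result = [];
  for (var i = 0; i < predictionsPerModel[0].length; i++) {
    var counts = {};
    var best = null;
    for (var m = 0; m < predictionsPerModel.length; m++) {
      var label = predictionsPerModel[m][i];
      counts[label] = (counts[label] || 0) + 1;
      if (best === null || counts[label] > counts[best]) best = label;
    }
    result.push(best);
  }
  return result;
}
```

Computing `errorRate` for each model and for the output of `majorityVote` on the test subset shows whether the combined vote actually beats the best single model.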

6. Deployment

  • A very important step. Take the result of the analysis and act on it. Make and share a report. Build a data-driven function in your app. Include the results in the business model. Sometimes, if you find new questions, start the next CRISP-DM iteration.

Putting it all together, this chart illustrates the whole process:

crisp-dm_process_diagram

Image: https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining#/media/File:CRISP-DM_Process_Diagram.png

Slides

I ran a session on this topic at MeasureCamp London #9. Slides from my session are now available on SlideShare:


Twitter

And some thoughts from my audience 🙂 Thanks!

Machine learning in action

At the last MeasureCamp (#8) in London I led a session about ML:

Machine Learning in action. Advanced users segmentation using Google Analytics, Google Tag Manager and R.

 

Summary of workflow

  1. Prepare data collection
    1. Google Tag Manager
      1. dataLayer
      2. browser fingerprinting
  2. Data processing and aggregation
    1. Google Analytics
      1. Custom Dimensions
      2. Content Grouping
  3. Advanced data analysis
    1. R + Google Analytics API
    2. Unsupervised Learning – k-Means algorithm
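
To make the last step more concrete, here is a minimal sketch of the k-Means idea (Lloyd's algorithm). The session itself used R, so this JavaScript version is an illustration only and not the code from the talk: the function names are hypothetical, the two features per user (for example sessions and pages per session) are made up, and the starting centroids are passed in by hand instead of being chosen at random.

```javascript
function squaredDistance(a, b) {
  var sum = 0;
  for (var d = 0; d < a.length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
  return sum;
}

// points: array of numeric vectors; initialCentroids: k starting vectors.
function kMeans(points, initialCentroids, maxIterations) {
  var centroids = initialCentroids.map(function (c) { return c.slice(); });
  var assignments = [];
  for (var iter = 0; iter < maxIterations; iter++) {
    // Assignment step: attach every point to its nearest centroid.
    var next = points.map(function (p) {
      var best = 0;
      for (var c = 1; c < centroids.length; c++) {
        if (squaredDistance(p, centroids[c]) < squaredDistance(p, centroids[best])) best = c;
      }
      return best;
    });
    // Stop as soon as no point changes its cluster.
    var changed = next.some(function (a, i) { return a !== assignments[i]; });
    assignments = next;
    if (!changed) break;
    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map(function (old, c) {
      var members = points.filter(function (_, i) { return assignments[i] === c; });
      if (members.length === 0) return old; // keep an empty cluster where it is
      return old.map(function (_, d) {
        return members.reduce(function (s, p) { return s + p[d]; }, 0) / members.length;
      });
    });
  }
  return { centroids: centroids, assignments: assignments };
}

// Made-up users: the first three are light users, the last three heavy users.
var users = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]];
var result = kMeans(users, [users[0], users[3]], 20);
console.log(result.assignments, result.centroids);
```

With real data the features usually need scaling first, because k-Means relies on distances and a variable with a large range would otherwise dominate the clustering.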

Result – clustered users:

clusters

 

Slides

Slides from my session are now available on SlideShare:


R Code

You can find the complete example in my GitHub repository.

Attendees

It was a pleasure to speak to a full room. I hope this session provided some inspiration for a new way of analysing web traffic 🙂