Staying on track 8: ToTD 36-40

17 November 2015 / Tim Holmes

When I started the first round of eye-tracking tips back in May, it was on something of a whim because I’d spotted a few issues that seemed to occur across multiple projects and thought them worth sharing with a wider audience. As Twitter tips they were intended to be snappy little nudges in the right direction, but as this list has progressed the tips have become more like discussion topics aimed at raising awareness of things you might not have thought were an issue, and that is certainly true of this last batch. I think these tips are some of the most interesting and, as such, deserve more discussion than I will have words for today, so I intend to pick them up again in future blog pieces. I know already from Twitter that some of these topics might push a few buttons, in which case I would encourage you to join in the discussion via Twitter, LinkedIn or email, and I’ll be sure to include your contribution in future posts! So enough of my preamble, and for the last time this year, let’s get to the tips…

Tip 36: Think aloud can contaminate results, but retrospective might not be as pure as you think

One of the great things about eye-movements is that they are closely related to the task being performed; indeed, that’s the very reason most of us want to eye-track in the first place! Ironically then, one of the methods often used alongside eye-tracking is something called think aloud, where the person being tracked is encouraged to articulate their thought process whilst they are performing a task such as navigating a website or evaluating some marketing materials. Stop and think about this for a second. One of the reasons we study eye-movements is that they are susceptible to unconscious influences and involuntary distractions which are frequently quite fleeting. Asking someone to articulate their thoughts can only lead to one of two outcomes: either they slow their behaviour down to the speed of self-report and so are no longer behaving naturally, or they report on only part of the process because they cannot, by definition, accurately report their unconscious behaviour. Either way, think aloud is not especially helpful.

One of the often suggested solutions to this is something called retrospective think aloud, which allows the task to be completed without the think aloud part and then asks the participant to articulate their thought process afterwards while watching a playback of their eye-movements. Now, clearly this is better than standard think aloud since the natural behaviour whilst performing the task has been preserved, but how useful is it to play a person’s eye-movements back to them and ask them to comment on them when, as I already highlighted, many of those eye-movements are involuntary or unconscious? In fact, those are the very ones we are often most interested in! This is a topic I’m going to return to in more detail in a future blog for sure, but I already mentioned a study at VSS last year from Jeremy Wolfe’s lab in which radiologists were shown both their own and other radiologists’ scan paths and were no better than chance at identifying their own. Results like these are popping up more and more at vision conferences and suggest that retrospective think aloud is just as susceptible to post-hoc justification of behaviour as any other self-report method. So what’s a researcher to do? Well, personally I think a better option is NOT to present the eye-movements to the participant, but simply to observe them during the original activity using a live-view window and to develop your debrief (deep dive) questions in response to those eye-movements to validate what was missed or what was distracting. This is part of the skill of a good moderator and, unfortunately, only comes with experience of working on a few eye-tracking studies. Other, less qualitative, methods include recall, recognition and comprehension tests, which typically require more effort in the research design but can provide less subjective results.

Tip 37: Pupil data comes for free but imposes design constraints if you want to interpret it

In these post-Daniel-Kahneman days of system 1/system 2 and behavioural economics the question that most commercial researchers seem to be seeking an answer to is whether or not a participant shows an emotional response to a stimulus. This of course makes sense because it underlies much of the automated (system 1) behaviour that we now know influences the majority of decision making. So wouldn’t it be great if an eye-tracker could give us some insight into this? Well, of course, it can through the pupil diameter that is included in the raw data from most decent eye-trackers, but it comes with a lot of small print!

The first problem relates to Tip 39: in most cases you pretty much have to resort to raw data to get at pupil diameter. Moreover, depending on the type of tracker you are using, the absolute accuracy of this measurement might not be as good as you’d like. Fortunately, for most cognitive measures it’s the relative change in pupil size that you need, so providing the level of accuracy is consistent you don’t need to worry too much about it and can treat this as noise in your measure. The second problem, and by far the biggest challenge when dealing with pupil data, is something called the pupillary light response, which describes the constriction of the pupil in response to bright light and the corresponding dilation of the pupil in response to low light levels. There are lots of ways you can deal with this, enough for another blog piece in fact, but for now all you need to know is this: if you haven’t controlled for luminance changes in your stimuli you might not be measuring what you think you are, and this effect is strong enough to call your data into question.
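To make the “relative change” point concrete, here is a minimal sketch of the kind of baseline correction I mean, written in Python with NumPy and assuming you have already exported raw pupil-diameter samples with timestamps; the function name, inputs and window length are my own illustrative choices, not any vendor’s API.

```python
import numpy as np

def relative_pupil_change(timestamps_ms, pupil_diameter, stimulus_onset_ms, baseline_ms=500):
    """Express pupil diameter as percentage change from a pre-stimulus baseline.

    timestamps_ms     -- sample times in milliseconds (1-D array)
    pupil_diameter    -- raw pupil diameter per sample, same length as timestamps_ms
    stimulus_onset_ms -- time (ms) at which the stimulus appeared
    baseline_ms       -- length of the pre-stimulus window used as the baseline
    """
    t = np.asarray(timestamps_ms, dtype=float)
    pupil = np.asarray(pupil_diameter, dtype=float)

    # Average pupil size over the window immediately before stimulus onset.
    baseline_mask = (t >= stimulus_onset_ms - baseline_ms) & (t < stimulus_onset_ms)
    baseline = np.nanmean(pupil[baseline_mask])

    # Relative (percentage) change is what most cognitive measures need,
    # rather than absolute diameter, which varies with tracker accuracy.
    return 100.0 * (pupil - baseline) / baseline
```

Note that this only normalises away absolute-accuracy differences between trackers and participants; it does nothing about the pupillary light response, which still has to be handled in the stimulus design or modelled out separately.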

If you’ve managed to extract your data and you’ve removed all effects of light from your measure, you immediately run into another problem with the pupil response, and that is its inability to tell you anything about the emotional response other than that there was one. What this means is that one person’s “aww” response to a picture of a kitten will look remarkably like another person’s “eeyuw”! Of course, it’s not always an insurmountable problem – sometimes you can use an independent ranking of emotional content to interpret the response, and if it’s sexual attraction you’re interested in then a dilated pupil is generally a positive sign! Finally, or at least finally for now, the latency of the pupil response to an emotional stimulus is relatively slow by eye-movement standards. This means that the change in pupil size and the subject of fixation at that moment may not necessarily correspond, so you need to be really careful when attributing the effect to its source.

What all this means is that you need to plan your research quite carefully if you’re going to use pupil size as a measure of anything. Oh, and I might have mentioned this before, pilot your study and your analysis before you make promises about what you can deliver!

Tip 38: Explore, and eliminate if necessary, salience effects before theorising about cognition

Few topics divide eye-movement researchers more than salience. No one disputes its existence, but its power to predict where someone is going to look at any given moment is hotly debated, and I could fill a book, or at least another blog, with a discussion of the many models that exist. I touched on this subject already this year in my first blog from ECEM, and if you’ve read it you might be wondering “why is Tim telling me to explore salience?” given that I definitely believe the top-down modulation of eye-movements is more important when answering questions of a cognitive nature. After all, if you’ve run an eye-tracking study, you already have a recording of exactly where your participant was looking throughout your study, so why do you need to look at a prediction of where they are most likely to look?

Well in fact, this is one of those situations where the answer is in the question. Salience models are typically based on the visual features of the stimulus: for example, the classic Itti & Koch (2001) model creates maps based on colour, intensity and orientation which, by no small coincidence, are features that are responded to by different parts of the visual cortex. What this means is that the model is telling you something about the likely distribution of attention based purely on properties of the scene – it’s worth noting that eye-movements were not actually discussed in the initial models, but the relationship between attention and (overt) eye-movements was certainly applied to them. The model of course does not know what task your participant was performing, and indeed as a researcher you might not be certain of this yourself. But here’s the thing: if the model is telling you something about the expected distribution of attention in the absence of a task, then deviation from that distribution might be telling you something about the task itself or some response to the stimulus, so if you have controlled the task well enough you can potentially “extract” the salience effect from the eye-movements and make a lot more sense of the data. Disreputable researchers can, of course, also report the level of “engagement” with a product on a shelf, for example, in quite a misleading way if they haven’t considered visual salience in their analysis, something I’ve seen done on more than one occasion!

So my tip boils down to this: if you can, generate a salience map from one of the peer-reviewed models, use it to formulate some hypotheses for your study and then look at the actual data. If the two agree, then salience is as likely to be contributing to the results as any other interpretation you would like to apply!
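If you want to put a number on “agree”, one common way of scoring a salience map against recorded fixations is normalised scanpath saliency (NSS). Here is a minimal Python/NumPy sketch, assuming you already have the salience map as a 2-D array and your fixation coordinates in the same pixel space; the function name and inputs are illustrative, not taken from any particular toolbox.

```python
import numpy as np

def normalised_scanpath_saliency(salience_map, fixations_xy):
    """Score how well a salience map predicts observed fixations (NSS).

    salience_map -- 2-D array (height x width) from your model of choice
    fixations_xy -- iterable of (x, y) fixation coordinates in map pixels

    Values near 0 mean fixations are predicted no better than chance;
    clearly positive values mean salience could explain much of the looking.
    """
    smap = np.asarray(salience_map, dtype=float)
    # Z-score the map so that chance level is 0 and units are standard deviations.
    smap = (smap - smap.mean()) / smap.std()

    scores = []
    for x, y in fixations_xy:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < smap.shape[0] and 0 <= col < smap.shape[1]:
            scores.append(smap[row, col])
    return float(np.mean(scores)) if scores else float("nan")
```

A score close to zero is your cue that salience alone is not doing the work, and that a cognitive interpretation is on firmer ground.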

Tip 39: Raw or filtered data? Those gaze plots might be missing something important

So I already mentioned raw data way back in Tip 16 and I am a huge fan of using it, but I also understand that for many people that is exactly what they want to avoid because they trust the software manufacturers to filter it for them. I haven’t said this for a while, but buy me a pint and I’ll tell you why in some cases you REALLY shouldn’t do that!

The reason for me returning to this topic is the popularity of head-mounted (glasses-type) eye-trackers for commercial research and the soon-to-be-released automated coding tools from the likes of Tobii and SMI. In Tip 30 I talked about the pain of having to manually code eye-movements when wanting to extract AOI-based statistics from glasses recordings, but I kind of skipped over something there (bad Tim!), which was that in most cases what you’re doing is coding fixations. If you’re reading this blog I’m going to assume you are aware that fixations are only part of the data recorded and that an algorithm, called a fixation filter, is eliminating the actual eye-movements (saccades) from your data. In reality this means that you are manually coding 3-4 fixations per second of recording instead of the 30-60 gaze points per second that the raw data contains. Now this would all be fine if the fixation filters used in glasses software had been developed for use with glasses, but this is often not the case. What this means is that they don’t cope so well with the interplay between head movement and eye-movements, and in particular with smooth pursuit eye-movements. These are the eye-movements you make when tracking a moving object, or when you track a stationary object to compensate for your own head movement. In other words, they are part of the process of keeping the image stable on the retina, which is exactly what we are normally interested in when we run a fixation filter! The problem is that some of these algorithms might classify smooth pursuit as a saccade and not make the data available for coding, meaning that the resulting gaze plots or AOI statistics might be a little sparse.
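To make the point concrete, here is a deliberately simplified velocity-threshold classifier of the kind many fixation filters are built on (an I-VT-style approach). The thresholds and function name are my own illustrative choices rather than any vendor’s implementation, but they show where the intermediate-velocity samples end up.

```python
import numpy as np

def classify_gaze_samples(x_deg, y_deg, timestamps_s,
                          fixation_max_vel=30.0, saccade_min_vel=100.0):
    """Toy velocity-threshold (I-VT style) classifier for raw gaze samples.

    x_deg, y_deg -- gaze position per sample, in degrees of visual angle
    timestamps_s -- sample times in seconds
    Thresholds (deg/s) are illustrative, not tuned or vendor values.

    Returns one label per sample: 'fixation', 'saccade' or 'other'.
    """
    x = np.asarray(x_deg, dtype=float)
    y = np.asarray(y_deg, dtype=float)
    t = np.asarray(timestamps_s, dtype=float)

    # Point-to-point angular velocity in degrees per second.
    velocity = np.hypot(np.diff(x), np.diff(y)) / np.diff(t)
    velocity = np.append(velocity, velocity[-1])  # pad back to original length

    labels = np.full(len(x), "other", dtype=object)
    labels[velocity < fixation_max_vel] = "fixation"
    labels[velocity > saccade_min_vel] = "saccade"
    return labels
```

Smooth pursuit, and the compensatory eye-movements you make while the head is moving, tend to fall in the velocity band between the two thresholds, and a pipeline that only passes fixations on for AOI coding will quietly drop exactly those samples.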

Unfortunately, the solutions for this are not easy! You can usually choose to code in raw-mode but with some systems this might actually limit your analysis and will certainly require a LOT more time to prepare for analysis. Automated coding will solve some of the problems because it will at least mean that each gaze point is coded by the software, but without the head movement data the filters will still run into the same issues. The proper solution requires the head movement to be tracked and the eye-movement data to be analysed relative to it as part of the filter. It’s totally doable, it’s just not done yet! So for now, I’m afraid, this is something you’d have to do yourself, but it’s worth being aware of when comparing gaze-replay videos with gaze-plot visualisations and trying to understand why they sometimes differ.

Tip 40: Eye-tracking adds massive value to other biometrics but think about synchronization

The last tip of this batch is also probably the least controversial. These days we increasingly get calls from customers wanting to combine eye-tracking with other biometrics such as EEG, GSR or some form of motion capture. This makes a lot of sense because if you’re using something like EEG to ascertain that someone is encoding a stimulus into memory, you almost certainly want to know precisely where they were looking at the time. The problem is that you are now collecting data from two different sources with different latencies and almost certainly different sampling frequencies. Synchronising these is both important and not necessarily easy. If you’re lucky, your systems will be compatible and provide the necessary timing signals in or out to allow at least a one-way sync pulse, and for now I have to leave it at that because there are way too many combinations of systems to discuss in any detail here.

But what do you do if you’re unlucky? Well, you could write, or invest in, a third-party piece of software that receives both data streams and allows you to synchronise the two – we’ve done that a few times, so I can recommend something if you’re struggling. Alternatively, if you want synchronisation but don’t need millisecond accuracy, and here I’m talking about slower signals like GSR, you could try injecting a visual time sync into the eye-tracking scene camera recording and then using this in playback to create an event in your data. Again, this is another one of those cases where a bit of piloting will certainly pay off!
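Once you do have a shared event in both recordings, the alignment itself is fairly mechanical. Here is a minimal Python/NumPy sketch, assuming a single sync event visible on both clocks and no appreciable clock drift; the function and variable names are mine, not any system’s API.

```python
import numpy as np

def align_streams(eye_t, eye_sync_t, other_t, other_values, other_sync_t):
    """Resample a second biometric stream onto the eye-tracker's timebase.

    eye_t        -- eye-tracker sample timestamps (s, eye-tracker clock)
    eye_sync_t   -- time of the shared sync event on the eye-tracker clock
    other_t      -- timestamps of the other stream (s, on its own clock)
    other_values -- samples of the other stream (e.g. a GSR trace)
    other_sync_t -- time of the same sync event on the other stream's clock
    """
    # Shift the second stream's clock so the shared event lines up.
    offset = eye_sync_t - other_sync_t
    other_t_on_eye_clock = np.asarray(other_t, dtype=float) + offset

    # Linearly interpolate the slower signal at each eye-tracker timestamp.
    return np.interp(np.asarray(eye_t, dtype=float),
                     other_t_on_eye_clock,
                     np.asarray(other_values, dtype=float))
```

If the two clocks drift relative to each other, a single offset is not enough and you would need two or more sync events and a linear fit of one clock against the other.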

And with that I am drawing this series of tips to a close. The next six months should see a rush of innovations to the market which will, no doubt, bring with them plenty more opportunities for learning, so I suspect we haven’t seen the last of tips of the day – in fact I already have quite a few that didn’t make the cut this time! So stay tuned for more, or follow me on Twitter to get the succinct versions!

