Launching a new product in a niche that’s as saturated as website building, we knew that Zyro needed to stand out from the get-go.
Our product was solid, sure, but we needed that special something to make it solve even more of our customers’ problems.
For example, we know that a lot of the people who want to create a website or set up an online shop struggle with writing copy.
Not everyone is a writer, after all. So, we created our AI Writer tool using pre-trained GPT-2 models. Our customers took to it right away, but we were sure we could solve more of the problems that we had come across in our market analysis.
One issue that stood out to the entire Zyro team the most was a design-related one:
How could our customers predict the behavior of their website visitors without spending lots of time or money? How could they tell where a visitor's attention would be concentrated on a page?
After all, knowing where to place what information, especially CTAs, could be incredibly valuable and increase conversion rates.
And what better way to solve this exact problem than with AI?
Since we knew what problem we were focusing on, Zyro’s AI team started by analyzing similar tools and reading up on how others were solving similar problems.
Dozens of academic papers later, we had identified a way forward.
We were going to use a deep neural network with a U-NET architecture, trained on image pairs: a raw image and an attention mask image, where white areas show high potential interest and black areas show none.
Here's what the U-NET architecture looks like:
And here is a pair of sample images:
When it comes to attention masks, there are two main methods that exist to generate them.
First, we could use hardware to track actual eye movements.
Or, we could opt for BubbleView, which uses mouse clicks to measure which information people consciously choose to examine.
How BubbleView works is this: a user sees a blurred image and can move the mouse to unblur a small part of the image.
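To make the mechanic concrete, here is a minimal sketch of a BubbleView-style view using NumPy. It is not the actual BubbleView implementation: the function names `box_blur` and `reveal_bubble`, the box blur itself, and the radius values are all assumptions chosen for illustration.

```python
import numpy as np

def box_blur(image, radius):
    """Blur a 2-D grayscale image by averaging over a (2r+1) x (2r+1) window.

    A simple box blur stands in for whatever blur a real tool uses (assumption).
    """
    padded = np.pad(image.astype(float), radius, mode="edge")
    size = 2 * radius + 1
    height, width = image.shape
    total = np.zeros((height, width), dtype=float)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / (size * size)

def reveal_bubble(sharp, blurred, click_row, click_col, bubble_radius):
    """Show the sharp image inside a circle around the click, blurred elsewhere."""
    rows, cols = np.ogrid[:sharp.shape[0], :sharp.shape[1]]
    inside = (rows - click_row) ** 2 + (cols - click_col) ** 2 <= bubble_radius ** 2
    return np.where(inside, sharp, blurred)

# Hypothetical 20x20 checkerboard "page" and one click in the middle.
sharp = ((np.indices((20, 20)).sum(axis=0) % 2) * 255).astype(float)
blurred = box_blur(sharp, radius=2)
view = reveal_bubble(sharp, blurred, click_row=10, click_col=10, bubble_radius=3)
```

Recording where people click, and so which regions they choose to unblur, is what gives a signal about the areas they care about.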
As with everything in science, there are solid arguments to be made for both methods – hardware-based and BubbleView.
Rather than pick one, we combined data from both methods to keep our feature robust.
We did not reinvent the wheel by collecting data ourselves. Instead, we used openly available datasets and merged them into a single one that we could train our neural networks on.
At this point, we had the dataset necessary for neural network training and the architecture for it too. It was time to decide on a framework that we could use to make this feature a reality.
Here at Zyro, we don't have a strict rule about which framework to use, as it depends entirely on the circumstances.
What we do have is a rule to 'start small and iterate,' and for this feature, we thought that PyTorch would be the best framework to use.
AI Heatmap – Version 1
To put together the first version of our tool, which we had branded as AI Heatmap, we split the data into training, validation, and testing subsets.
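A split like this can be sketched in a few lines of plain Python. The 80/10/10 proportions, the fixed seed, the file names, and the helper name `split_dataset` are assumptions for illustration; the post does not state the ratios we actually used.

```python
import random

def split_dataset(items, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle items reproducibly and split them into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Hypothetical (raw image, attention mask) file pairs.
pairs = [(f"image_{i}.png", f"mask_{i}.png") for i in range(100)]
train, val, test = split_dataset(pairs)
```

Keeping each raw image and its mask together as one item ensures that a pair never ends up split across two subsets.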
Then, we started experimenting with the neural network architecture.
Although we chose U-NET architecture to begin with, we also experimented with different depths of the network.
We also experimented with the downsampling path, swapping the plain U-NET layers for pre-trained networks such as VGG16, VGG19, and ResNet.
And, rather quickly, we had a model that worked really well.
Given an image, our model outputs a black-and-white attention mask, which we convert to an attention heatmap.
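The conversion step could look roughly like the NumPy sketch below. The blue-to-green-to-red color ramp, the blending weight, and the names `mask_to_heatmap` and `overlay` are assumptions for illustration, not the exact colormap used in AI Heatmap.

```python
import numpy as np

def mask_to_heatmap(mask):
    """Map a grayscale mask (0-255, shape H x W) to an RGB heatmap (H x W x 3).

    Low attention becomes blue, medium green, high red (assumed color ramp).
    """
    t = mask.astype(float) / 255.0
    red = np.clip(2.0 * t - 1.0, 0.0, 1.0)
    green = 1.0 - np.abs(2.0 * t - 1.0)
    blue = np.clip(1.0 - 2.0 * t, 0.0, 1.0)
    rgb = np.stack([red, green, blue], axis=-1)
    return np.round(rgb * 255.0).astype(np.uint8)

def overlay(image, heatmap, alpha=0.5):
    """Blend the heatmap over the original RGB image; alpha is the heatmap weight."""
    blended = (1.0 - alpha) * image.astype(float) + alpha * heatmap.astype(float)
    return np.round(blended).astype(np.uint8)

# Tiny hypothetical mask: one pixel with no attention, one with full attention.
mask = np.array([[0, 255]], dtype=np.uint8)
heat = mask_to_heatmap(mask)
```

Blending the heatmap over the original screenshot, rather than showing it alone, keeps the page elements visible underneath the colors.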
Here’s how the very first AI Heatmap outputs looked:
As soon as the feature was live, we started to track our user data, seeing how the feature was used, how it performed, and how accurate it was.
The data we collected allowed us to find weak spots and improve them.
You might be thinking: how can this heatmap help one design a website?
Well, in the image above, one can see which parts of a website are getting the most attention – the heading copy, the woman’s face, and the company logo.
If we were the owners of such a website, this attention heatmap would clearly tell us to change the webpage layout and to move the call-to-action button up.
In just a few clicks, people could change their webpage and get a much better result when it comes to user attention and, hopefully, the conversion rate.
Here’s the improved version of the same webpage:
AI Heatmap – Version 2
When the first version of AI Heatmap was shipped, we started working on a second iteration of it right away.
The data that we were collecting showed us that users were getting familiar with the feature and were starting to test it out.
The usage numbers were rising, but we really wanted to improve the accuracy of our model to make users trust the feature even more.
We made a few small changes, which improved the accuracy of AI Heatmap by approximately 5%, but we knew we could do better. 😎
And the game changer for us was Uber’s CoordConv solution.
Basically, this method assists convolution by letting filters know where they are.
It does this by adding two channels to the input: one with i coordinates and another with j coordinates.
By implementing the CoordConv layers, we increased the accuracy of our own model by nearly 15% 🚀
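The coordinate-channel idea can be sketched with NumPy as follows. This shows only the input augmentation, not our model code: the `(N, C, H, W)` layout, treating i as the row index and j as the column index, the scaling to the range -1 to 1, and the name `add_coord_channels` are assumptions for illustration. In a real network, the result would be fed into an ordinary convolution layer.

```python
import numpy as np

def add_coord_channels(batch):
    """Append i (row) and j (column) coordinate channels to a float batch.

    batch has shape (N, C, H, W); the result has shape (N, C + 2, H, W).
    """
    n, _, h, w = batch.shape
    # Coordinates scaled linearly to [-1, 1] (assumed normalization).
    i_coords = np.linspace(-1.0, 1.0, h).reshape(1, 1, h, 1)
    j_coords = np.linspace(-1.0, 1.0, w).reshape(1, 1, 1, w)
    i_channel = np.broadcast_to(i_coords, (n, 1, h, w)).astype(batch.dtype)
    j_channel = np.broadcast_to(j_coords, (n, 1, h, w)).astype(batch.dtype)
    return np.concatenate([batch, i_channel, j_channel], axis=1)

# Hypothetical batch of two 3-channel images, 4 pixels high and 5 wide.
batch = np.zeros((2, 3, 4, 5), dtype=np.float32)
augmented = add_coord_channels(batch)
```

Because every pixel now carries its own position, a convolution filter applied to the augmented input can learn position-dependent behavior, which a plain convolution cannot.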
Serving the model
When we started to think about deployment, we needed a solution that would allow us to implement AI Heatmap fast.
Again, by sticking to the ‘start small and iterate’ principle, we came up with the following solution:
Because we are using Google Cloud, it made sense to use Google Cloud Storage buckets for storing all of our models and datasets. It took a single GPU-enabled virtual machine to deploy AI Heatmap.
Also, we use Python as our main programming language, Starlette as an ASGI framework, and Uvicorn as an ASGI server.
Here’s a full diagram of how we served AI Heatmap:
This approach currently works just fine.
And in case we ever need to scale it up, we can use Compute Engine Instance Groups with Google Load Balancing.
The future of AI Heatmap
As with every Zyro feature, we are continuing to work on AI Heatmap and improve it further.
First of all, we'd love to expand the dataset that we've used for AI Heatmap's training by adding not just regular images (as we have done so far), but website images specifically.
We expect this to give the accuracy of the feature an immediate boost.
Second, we’d love to experiment further with the model architecture and find an even better one.
But we aren’t solely working on AI Heatmap. We have many different AI projects in the AI team’s pipeline.