Personalisation at Scale: How AI Recommendation Engines Work


By Michael English, Co-Founder & CTO, IMPT.io  ·  Clonmel, Co. Tipperary, Ireland



What Makes Recommendation Engines the Highest-ROI AI Investment

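Amazon's recommendation engine is widely reported to drive around 35% of its sales, a figure that traces back to a McKinsey estimate rather than Amazon's own reporting. Netflix has estimated that its recommendation system saves more than $1 billion a year through improved customer retention. These are among the most commercially valuable AI systems ever deployed.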

Why? Because recommendations address the fundamental economic problem of too many choices: with thousands or millions of products, customers cannot discover everything relevant to them. Good recommendations surface the right product at the right time to the right customer — turning passive browsing into active purchase intent.

This article provides a technical deep-dive into how recommendation engines work, the architectures worth deploying for Irish eCommerce businesses, and how to measure and improve their performance.


The Recommendation Problem

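Formally defined: given a set of users, a set of items, and a matrix R (users × items) of observed interactions such as ratings, purchases, or clicks, predict the unobserved entries of R and recommend to each user the items with the highest predicted scores.

This is an extreme sparse-data problem: most user-item pairs are unobserved (the matrix R is typically 99.9%+ sparse). The challenge is learning from a tiny fraction of interactions to predict many unknowns.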
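
To make the sparsity figure concrete, here is a minimal sketch. The function name and the shop size are hypothetical, chosen only for illustration.

```python
def interaction_sparsity(n_users: int, n_items: int, n_observed: int) -> float:
    """Fraction of user-item pairs with no observed interaction."""
    total_pairs = n_users * n_items
    return 1.0 - n_observed / total_pairs


# Hypothetical shop: 50,000 users, 20,000 products, 400,000 logged interactions
sparsity = interaction_sparsity(50_000, 20_000, 400_000)
print(f"{sparsity:.2%} of the matrix is unobserved")
```

Even with hundreds of thousands of logged interactions, only a tiny fraction of the possible user-item pairs has any signal.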


The Main Algorithmic Approaches

1. Collaborative Filtering: "Users Like You Also Liked..."

Collaborative filtering (CF) uses the pattern of user-item interactions to identify similar users (or items) and make recommendations based on those similarities.

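User-based CF: find the users whose interaction histories most resemble the target user's, then recommend items those neighbours interacted with that the target user has not seen.

Item-based CF (more practical at scale): compute item-item similarity from co-interaction patterns, then recommend items similar to those the user has already interacted with. Item similarities tend to be more stable than user similarities and can be precomputed offline.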


import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def item_based_cf(
    interaction_matrix: np.ndarray,  # Users × Items
    user_id: int,
    top_k: int = 10
) -> list:
    """
    Simple item-based collaborative filtering.
    Returns top-K recommended item indices.
    """
    # Compute item-item similarity matrix
    item_similarity = cosine_similarity(interaction_matrix.T)
    
    # Items the user has interacted with
    user_interactions = interaction_matrix[user_id]
    interacted_items = np.where(user_interactions > 0)[0]
    
    # Score all items based on similarity to interacted items
    scores = np.zeros(interaction_matrix.shape[1])
    for item_idx in interacted_items:
        scores += item_similarity[item_idx] * user_interactions[item_idx]
    
    # Exclude already-interacted items: -inf guarantees they rank last,
    # whereas 0 would tie with unrelated items that also score 0
    scores[interacted_items] = -np.inf
    
    # Return top-K item indices
    return np.argsort(scores)[::-1][:top_k].tolist()

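Limitations: it cannot recommend for new users or new items that have no interactions (the cold-start problem), and the dense item-item similarity matrix grows quadratically with catalogue size. At 100K items it already holds 10 billion entries, so larger catalogues need sparse or approximate nearest-neighbour methods.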


2. Matrix Factorisation: The Foundation of Modern Recommendations

Matrix factorisation decomposes the interaction matrix R into two low-dimensional matrices:


R ≈ U × V^T

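Where U is an n_users × k matrix of user factor vectors, V is an n_items × k matrix of item factor vectors, and k (the number of latent factors) is far smaller than the number of users or items.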

The predicted interaction score for user u and item i is:


R̂[u][i] = U[u] · V[i]^T  (dot product of user and item vectors)

Training: Minimise reconstruction error on observed interactions:


Loss = Σ(u,i observed) (R[u][i] - U[u]·V[i]^T)² + λ(||U||² + ||V||²)

The regularisation term, weighted by λ, penalises large factor values and so reduces overfitting to the few observed interactions.
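
The loss above can be minimised with plain stochastic gradient descent. The following is a minimal pure-Python sketch, not a production trainer: the function names, the toy ratings, and the hyperparameters (`k`, `lr`, `lam`, `epochs`) are assumptions for illustration. It uses the common per-observation update, where the L2 penalty is applied each time a rating is visited.

```python
import random


def predict(U, V, u, i):
    """Dot product of user u's and item i's factor vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))


def regularised_loss(observed, U, V, lam):
    """Squared error on observed entries plus the L2 penalty on U and V."""
    squared_error = sum((r - predict(U, V, u, i)) ** 2 for u, i, r in observed)
    penalty = sum(x * x for row in U for x in row) + sum(x * x for row in V for x in row)
    return squared_error + lam * penalty


def train_mf_sgd(observed, n_users, n_items, k=2, lr=0.02, lam=0.01, epochs=1000, seed=0):
    """Fit U (n_users x k) and V (n_items x k) on (user, item, rating) triples."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in observed:
            err = r - predict(U, V, u, i)
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on the squared error plus the L2 penalty
                U[u][f] += lr * (err * vf - lam * uf)
                V[i][f] += lr * (err * uf - lam * vf)
    return U, V


# Toy data: 3 users, 3 items, 6 observed ratings (hypothetical)
observed = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
U0, V0 = train_mf_sgd(observed, 3, 3, epochs=0)  # untrained factors
U, V = train_mf_sgd(observed, 3, 3)
initial_loss = regularised_loss(observed, U0, V0, lam=0.01)
final_loss = regularised_loss(observed, U, V, lam=0.01)
print(f"loss before: {initial_loss:.2f}, after: {final_loss:.4f}")
```

Unobserved cells such as user 0 and item 2 can then be scored with `predict(U, V, 0, 2)`.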

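Algorithms: the two standard optimisers are stochastic gradient descent (SGD), which updates U and V one observed interaction at a time, and alternating least squares (ALS), which fixes one matrix, solves for the other in closed form, then swaps. ALS parallelises well and is the method used in the example below.

Implicit vs Explicit Feedback: explicit feedback is a direct rating (stars, thumbs up or down); implicit feedback is inferred from behaviour (views, clicks, add-to-cart, purchases). Most eCommerce data is implicit, so each interaction is usually treated as evidence of preference weighted by a confidence value rather than as a rating.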


import implicit
import scipy.sparse as sp

def train_als_recommender(
    interaction_data: list,  # [(user_id, item_id, confidence)]
    n_users: int,
    n_items: int,
    factors: int = 64,
    iterations: int = 20,
    regularization: float = 0.1
) -> tuple[implicit.als.AlternatingLeastSquares, sp.csr_matrix]:
    """
    Train ALS recommendation model using implicit library.
    Efficient for implicit feedback (purchases, clicks).
    """
    
    # Build sparse user-item confidence matrix
    # Confidence = 1 + alpha * frequency_of_interaction
    rows = [uid for uid, _, conf in interaction_data]
    cols = [iid for _, iid, conf in interaction_data]
    data = [conf for _, _, conf in interaction_data]
    
    interaction_matrix = sp.csr_matrix(
        (data, (rows, cols)), 
        shape=(n_users, n_items)
    )
    
    # Train ALS model
    model = implicit.als.AlternatingLeastSquares(
        factors=factors,
        iterations=iterations,
        regularization=regularization,
        use_gpu=False  # Set True if GPU available
    )
    
    model.fit(interaction_matrix)
    
    return model, interaction_matrix

def get_recommendations(
    model: implicit.als.AlternatingLeastSquares,
    interaction_matrix: sp.csr_matrix,
    user_id: int,
    n_recommendations: int = 10,
    filter_already_purchased: bool = True
) -> list:
    """Get top-N recommendations for a user."""
    
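    # The keyword is filter_already_liked_items (implicit >= 0.5), where
    # recommend() returns a pair of arrays: (item_ids, scores)
    ids, scores = model.recommend(
        user_id, 
        interaction_matrix[user_id],
        N=n_recommendations,
        filter_already_liked_items=filter_already_purchased
    )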
    
    return [(int(item_id), float(score)) for item_id, score in zip(ids, scores)]
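
The training function above expects (user_id, item_id, confidence) triples. A minimal sketch of building them from raw event logs, following the "1 + alpha * frequency" rule in the code comment, might look like this. The helper name and the `alpha=40.0` default are assumptions; alpha should be tuned on your own data.

```python
from collections import Counter


def build_confidence_triples(events, alpha=40.0):
    """Turn raw (user_id, item_id) events into (user_id, item_id, confidence)."""
    counts = Counter(events)  # how often each user interacted with each item
    return [(u, i, 1.0 + alpha * n) for (u, i), n in sorted(counts.items())]


# Hypothetical event log: user 0 interacted with item 2 twice
events = [(0, 2), (0, 2), (1, 0), (0, 1)]
triples = build_confidence_triples(events, alpha=40.0)
print(triples)
```

Repeated interactions raise the confidence for that user-item pair, so a product bought twice counts for more than one viewed once.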

3. Deep Learning Recommendation Models

Modern recommendation systems use deep neural networks to capture complex, non-linear patterns in user-item interactions.

Neural Collaborative Filtering (NCF):


import torch
import torch.nn as nn

class NeuralCF(nn.Module):
    """
    Neural Collaborative Filtering — combines matrix factorisation 
    with neural network layers for non-linear interaction modelling.
    """
    
    def __init__(
        self, 
        n_users: int, 
        n_items: int, 
        mf_dim: int = 64,
        mlp_dims: tuple = (128, 64, 32, 16),  # tuple avoids a mutable default argument
        dropout: float = 0.2
    ):
        super().__init__()
        
        # MF embeddings (for matrix factorisation branch)
        self.mf_user_embed = nn.Embedding(n_users, mf_dim)
        self.mf_item_embed = nn.Embedding(n_items, mf_dim)
        
        # MLP embeddings (for neural branch)
        mlp_input_dim = mlp_dims[0]
        self.mlp_user_embed = nn.Embedding(n_users, mlp_input_dim // 2)
        self.mlp_item_embed = nn.Embedding(n_items, mlp_input_dim // 2)
        
        # MLP layers
        layers = []
        for i in range(len(mlp_dims) - 1):
            layers.extend([
                nn.Linear(mlp_dims[i], mlp_dims[i+1]),
                nn.ReLU(),
                nn.Dropout(dropout)
            ])
        self.mlp = nn.Sequential(*layers)
        
        # Prediction layer
        self.prediction = nn.Linear(mf_dim + mlp_dims[-1], 1)
        self.sigmoid = nn.Sigmoid()
    
    def forward(self, user_ids: torch.Tensor, item_ids: torch.Tensor) -> torch.Tensor:
        # MF branch
        mf_user = self.mf_user_embed(user_ids)
        mf_item = self.mf_item_embed(item_ids)
        mf_output = mf_user * mf_item  # Element-wise product
        
        # MLP branch
        mlp_user = self.mlp_user_embed(user_ids)
        mlp_item = self.mlp_item_embed(item_ids)
        mlp_input = torch.cat([mlp_user, mlp_item], dim=-1)
        mlp_output = self.mlp(mlp_input)
        
        # Concatenate and predict
        combined = torch.cat([mf_output, mlp_output], dim=-1)
        prediction = self.sigmoid(self.prediction(combined))
        
        return prediction.squeeze(-1)  # [batch]; keeps the batch dim when batch size is 1

Two-Tower Models (industry standard for large-scale systems):


User Tower                    Item Tower
─────────────                 ─────────────
User ID embedding             Item ID embedding
Demographic features          Item attributes
Behavioural history           Category, price, brand
      ↓                               ↓
[Dense layers]                [Dense layers]
      ↓                               ↓
User embedding vector         Item embedding vector
      \                              /
       \                            /
        [Dot product → Score]

The two-tower architecture enables an efficient serving path:

  1. Pre-compute all item embeddings offline
  2. Compute the user embedding at query time
  3. Run approximate nearest neighbour (ANN) search over the item embeddings to find the top-K items in milliseconds

Because the towers only meet at the final dot product, this scales to millions of items and users while keeping inference latency under roughly 50ms.
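
The serving steps above can be sketched in a few lines. This is an illustration under stated assumptions: the embeddings are made-up two-dimensional vectors, the function names are hypothetical, and the exact dot-product scan stands in for a real ANN index, which a production system would use instead.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(user_vec, item_vecs, k=2, exclude=()):
    """Exact top-K by dot product; a brute-force stand-in for ANN search."""
    candidates = (
        (dot(user_vec, vec), item_id)
        for item_id, vec in item_vecs.items()
        if item_id not in exclude
    )
    return [item_id for _, item_id in heapq.nlargest(k, candidates)]


# Step 1: item embeddings pre-computed offline (hypothetical values)
item_vecs = {
    "boots": [0.9, 0.1],
    "jacket": [0.8, 0.3],
    "kettle": [0.1, 0.9],
    "teapot": [0.2, 0.8],
}

# Step 2: user embedding computed at query time by the user tower
user_vec = [1.0, 0.2]

# Step 3: nearest-neighbour lookup
top = retrieve_top_k(user_vec, item_vecs, k=2)
print(top)
```

The item side never needs to be recomputed per request, which is what keeps query-time cost low.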

4. Transformer-Based Sequential Models

The latest generation of recommendation models uses transformer architectures to model user behaviour as a sequence:

SASRec (Self-Attentive Sequential Recommendation):

Models the sequence of items a user has interacted with (in chronological order), using self-attention to weight which past interactions most influence the current recommendation.


class SASRecTransformer(nn.Module):
    """
    Self-Attentive Sequential Recommendation model.
    Models user interest evolution as a sequence.
    """
    
    def __init__(
        self,
        n_items: int,
        max_sequence_length: int = 50,
        d_model: int = 64,
        n_heads: int = 4,
        n_layers: int = 2,
        dropout: float = 0.2
    ):
        super().__init__()
        
        self.item_embedding = nn.Embedding(n_items + 1, d_model, padding_idx=0)
        self.position_embedding = nn.Embedding(max_sequence_length, d_model)
        
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=n_heads,
            dim_feedforward=d_model * 4,
            dropout=dropout,
            batch_first=True
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, n_layers)
        
        self.dropout = nn.Dropout(dropout)
        self.layer_norm = nn.LayerNorm(d_model)
        
    def forward(self, sequence: torch.Tensor) -> torch.Tensor:
        seq_len = sequence.size(1)
        positions = torch.arange(seq_len, device=sequence.device).unsqueeze(0)
        
        # Item + position embeddings
        x = self.item_embedding(sequence) + self.position_embedding(positions)
        x = self.dropout(x)
        x = self.layer_norm(x)
        
        # Causal mask (can only attend to past items)
        causal_mask = torch.triu(
            torch.ones(seq_len, seq_len, device=sequence.device) * float('-inf'), 
            diagonal=1
        )
        
        output = self.transformer(x, mask=causal_mask)
        
        # Return the last item's representation as user state
        return output[:, -1, :]  # [batch, d_model]
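
Because the model reads the user state from the last position and reserves index 0 for padding, input sequences need to be left-padded so the most recent item always sits at the end. The source does not show this preparation step, so the helper below is an assumption about how inputs would be built; it also assumes item IDs start at 1.

```python
def left_pad_history(item_ids, max_len=50, pad_id=0):
    """Keep the most recent max_len items and left-pad with pad_id."""
    recent = list(item_ids)[-max_len:]
    return [pad_id] * (max_len - len(recent)) + recent


short_history = left_pad_history([7, 3, 9], max_len=5)
long_history = left_pad_history([1, 2, 3, 4, 5, 6], max_len=4)
print(short_history)
print(long_history)
```

Rows built this way can be stacked into the `[batch, seq_len]` tensor the forward pass expects. With right-padding, the last position would be a padding token and the returned user state would carry no signal.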

Handling the Cold-Start Problem

The cold-start problem (making recommendations for new users or new items with no interaction history) is often the most challenging practical issue.

New User Cold-Start Strategies

1. Onboarding preference capture:

Ask users 3-5 quick preference questions during registration. "Do you prefer minimalist or statement styles?" → map to item clusters → immediate personalisation.

2. Session-based recommendations:

Use items browsed/added-to-cart in the current session to generate recommendations, without any user history.

3. Content-based starter:

Use demographic, location, or device signals as proxy features until behaviour history accumulates. "Users in Galway who viewed this category first often also bought..."

4. Progressive personalisation:

Start with popularity-based recommendations, transition to CF/MF as interaction history grows (typically after 5-10 interactions).
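
One way to implement progressive personalisation is to blend the two score sources with a weight that grows with the user's interaction count. The linear ramp and the function names below are assumptions; the 10-interaction endpoint follows the range given above.

```python
def personalisation_weight(n_interactions, ramp_start=0, ramp_end=10):
    """Linear ramp from 0.0 (pure popularity) to 1.0 (pure personalised)."""
    if n_interactions <= ramp_start:
        return 0.0
    if n_interactions >= ramp_end:
        return 1.0
    return (n_interactions - ramp_start) / (ramp_end - ramp_start)


def blend_scores(popularity, personalised, n_interactions):
    """Weighted mix of popularity and personalised scores per item."""
    w = personalisation_weight(n_interactions)
    items = set(popularity) | set(personalised)
    return {
        item: (1 - w) * popularity.get(item, 0.0) + w * personalised.get(item, 0.0)
        for item in items
    }


# Hypothetical scores for a user with 5 interactions so far
popularity = {"a": 1.0, "b": 0.5}
personalised = {"b": 1.0, "c": 0.8}
blended = blend_scores(popularity, personalised, n_interactions=5)
print(sorted(blended.items(), key=lambda kv: -kv[1]))
```

A brand-new user sees bestsellers only, and the personalised model gradually takes over as evidence accumulates.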

New Item Cold-Start Strategies

1. Content-based bootstrapping:

Use item attributes (category, price, brand, material, colour) to position new items in embedding space near similar existing items.

2. Warm-up promotion:

Feature new items in "New Arrivals" placements for 30-90 days. The boosted exposure generates the early interaction data the model needs to place them.
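
Content-based bootstrapping (strategy 1 above) can be sketched as a weighted average: a new item starts at the mean embedding of existing items, weighted by how many attributes they share. The function name, attribute keys, and embeddings here are hypothetical, and matching on exact attribute equality is a simplifying assumption.

```python
def bootstrap_item_embedding(new_item_attrs, catalogue, embeddings, min_shared=1):
    """Average existing item embeddings, weighted by shared attribute count."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for item_id, attrs in catalogue.items():
        shared = sum(1 for key, value in new_item_attrs.items() if attrs.get(key) == value)
        if shared >= min_shared and item_id in embeddings:
            for d in range(dim):
                total[d] += shared * embeddings[item_id][d]
            weight_sum += shared
    if weight_sum == 0:
        return None  # nothing similar; fall back to warm-up promotion
    return [x / weight_sum for x in total]


catalogue = {
    "A": {"category": "knitwear", "brand": "X"},
    "B": {"category": "knitwear", "brand": "Y"},
    "C": {"category": "kitchen", "brand": "Z"},
}
embeddings = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [5.0, 5.0]}

new_item = {"category": "knitwear", "brand": "X"}
vector = bootstrap_item_embedding(new_item, catalogue, embeddings)
print(vector)
```

The new item lands closest to item A, which shares both category and brand, and is unaffected by the unrelated kitchen item.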


A/B Testing Recommendation Engines

Good recommendation engines are built through systematic experimentation. Without A/B testing, you don't know if changes help or hurt.

Experimental Design


import hashlib

def assign_recommendation_experiment(
    user_id: str,
    experiment_name: str,
    variants: list,  # e.g., ['control', 'variant_a', 'variant_b']
    traffic_splits: list = None  # e.g., [0.4, 0.3, 0.3]
) -> str:
    """
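    Deterministically assign users to recommendation experiment variants.
    The same user always gets the same variant for a given experiment,
    which keeps their experience consistent and stops them leaking
    between variants. traffic_splits should sum to 1.0.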
    """
    if traffic_splits is None:
        traffic_splits = [1/len(variants)] * len(variants)
    
    # Hash user_id + experiment to get deterministic bucket
    hash_input = f"{user_id}:{experiment_name}".encode('utf-8')
    hash_value = int(hashlib.md5(hash_input).hexdigest(), 16)
    bucket = (hash_value % 1000) / 1000  # 0.000 to 0.999
    
    # Assign to variant based on bucket
    cumulative = 0
    for variant, split in zip(variants, traffic_splits):
        cumulative += split
        if bucket < cumulative:
            return variant
    
    return variants[-1]  # Fallback to last variant
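
Once users are assigned, the variants need a significance check before a winner is declared. A minimal sketch using a two-proportion z-test on click-through rate follows; the function name and the traffic numbers are hypothetical, and a real experiment would also fix the sample size in advance.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Return (z, two-sided p-value) for the difference in click rates."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no detectable difference
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Hypothetical results: control 5.0% CTR, variant 5.6% CTR, 10,000 impressions each
z, p_value = two_proportion_z_test(500, 10_000, 560, 10_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

In this made-up case a 12% relative lift still falls just short of the conventional 0.05 threshold, which is a useful reminder of how much traffic recommendation experiments need.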

Key Metrics to Track

| Metric | Description | Measurement / notes |
|---|---|---|
| CTR (recommendation click-through rate) | % of recommended items clicked | Clicks / Impressions |
| CVR from recommendations | % of recommendation clicks that lead to a purchase | Purchases from rec / Rec clicks |
| Revenue per session | Total revenue attributable to recommendations | Session revenue (rec touch) / sessions |
| Diversity | Average pairwise dissimilarity of recommendations | Prevents filter-bubble effects |
| Coverage | % of catalogue appearing in any recommendation | Prevents popularity bias |
| Serendipity | Measure of unexpected relevant discoveries | Harder to measure; user survey based |
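
Three of these metrics are straightforward to compute from logs. The sketch below is illustrative only: the function names are hypothetical, and diversity is approximated as the share of recommended pairs from different categories, a crude stand-in for an embedding-based dissimilarity.

```python
from itertools import combinations


def click_through_rate(clicks, impressions):
    return clicks / impressions if impressions else 0.0


def catalogue_coverage(recommendation_lists, catalogue_size):
    """Share of the catalogue that appears in at least one recommendation list."""
    recommended = set()
    for rec_list in recommendation_lists:
        recommended.update(rec_list)
    return len(recommended) / catalogue_size


def intra_list_diversity(items, categories):
    """Share of item pairs in one list that come from different categories."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(categories[a] != categories[b] for a, b in pairs) / len(pairs)


# Hypothetical logs
ctr = click_through_rate(clicks=30, impressions=1000)
coverage = catalogue_coverage([["a", "b"], ["b", "c"]], catalogue_size=10)
diversity = intra_list_diversity(["a", "b", "c"], {"a": "shoes", "b": "shoes", "c": "hats"})
print(ctr, coverage, round(diversity, 3))
```

Tracking coverage and diversity alongside CTR helps catch a variant that wins on clicks by recommending the same few bestsellers to everyone.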

Deployment Architecture for Irish Retailers

Scale-Appropriate Architecture

Small retailers (<10K MAU):

Medium retailers (10K-1M MAU):

Large retailers (1M+ MAU):


Conclusion

Recommendation engines are among the most mature and highest-ROI AI applications in eCommerce. From simple item-based collaborative filtering to transformer-based sequential models, there is an approach to match every scale and level of technical sophistication.

For Irish eCommerce businesses, the entry point is managed services (Amazon Personalize, Nosto) requiring minimal ML expertise. The path forward is custom two-tower models as interaction data grows and the sophistication ceiling of managed services is reached.

The critical investment is in data quality and tracking infrastructure. Models are increasingly a commodity; the differentiation is in the quality and freshness of the features feeding them.


Michael English is Co-Founder & CTO of IMPT.io. He builds and deploys ML recommendation systems for Irish and EU eCommerce. Based in Clonmel, Co. Tipperary, Ireland.

impt.io

