Product Recommendation using Graph Databases

With a wide variety of products available in online marketplace, accurate product recommendation is becoming increasingly important. This is even more evident in a Social Network enabled Marketplace. In this post we will look at how Graph Databases can be used for product recommendations for a Social Network Marketplace.

The following property graph schema stores information about people’s purchases and their social interactions. The CUSTOMER nodes represent users of a social network, who can FOLLOW each other. The PURCHASE relationship indicates that given persons buy given PRODUCT using an electronic marketplace.

How we will using Graph Database queries to mine product recommendations from the Schema

Let’s start by create a sample dataset:

-- Node Table: Customers
CREATE TABLE customers (
  customer_id INT
  , customer_name varchar(120)
  , gender varchar(6)
);

-- Node Table: Products
CREATE TABLE products (
  product_id INT
  , product_type varchar(120)
  , list_price float
);

-- Edge Table: Purchases linking customers and products
CREATE TABLE purchases (
  customer_id INT
  , product_id INT
  , purchase_date DATE
);

-- Edge Table: Purchases linking customers with other customer (Follow relationship)

CREATE TABLE followers (
  follower_customer_id INT
  , followee_customer_id INT
);

insert into customers values 
  (1, 'Sofia', 'Female')
  , (2, 'Chiara', 'Female')
  , (3, 'Fabio', 'Male')
  , (4, 'Luca', 'Male');

insert into products values 
  (111, 'Stiletto Heel', 995.4)
  , (222, 'Dress', 600.34)
  , (333, 'Tuxedo Jacket', 1200)
  , (444, 'Tie', 150)
  , (555, 'Tuxedo Shirt', 300)
  , (666, 'Jean', 200);

insert into purchases values 
  (1, 111, to_date('2025-01-01', 'YYYY-MM-DD'))
  , (1, 666 , to_date('2025-01-01', 'YYYY-MM-DD'))
  , (2, 111, to_date('2025-01-01', 'YYYY-MM-DD'))
  , (2, 222, to_date('2025-01-01', 'YYYY-MM-DD'))
  , (2, 666, to_date('2025-01-01', 'YYYY-MM-DD'))
  , (3, 333, to_date('2025-01-01', 'YYYY-MM-DD'))
  , (3, 444, to_date('2025-01-01', 'YYYY-MM-DD'))
  , (3, 555, to_date('2025-01-01', 'YYYY-MM-DD'))
  , (3, 666, to_date('2025-01-01', 'YYYY-MM-DD'))
  , (4, 333, to_date('2025-01-01', 'YYYY-MM-DD'))

;

insert into followers values 
  (1, 3) -- Sofia follows Fabio
  , (3, 2); -- Fabio follows Chiara

Next let’s create a Property Graph to that reflects the Knowledge in the Graph data:

CREATE OR REPLACE PROPERTY GRAPH recommender_graph
  VERTEX TABLES (
    customers KEY (customer_id) PROPERTIES ARE ALL COLUMNS
    , products KEY(product_id) PROPERTIES ARE ALL COLUMNS
  )
  EDGE TABLES ( 
    purchases 
      key (customer_id, product_id) 
      SOURCE KEY (customer_id) REFERENCES customers(customer_id)
      DESTINATION KEY(product_id) REFERENCES products(product_id)
      PROPERTIES ARE ALL COLUMNS
    
    , followers
      key (follower_customer_id, followee_customer_id)
      SOURCE KEY (follower_customer_id) REFERENCES customers(customer_id)
      DESTINATION KEY (followee_customer_id) REFERENCES customers(customer_id)
      PROPERTIES ARE ALL COLUMNS
);

Let’s start with a simple Graph Database query to recommend products on what CUSTOMER from the same gender group have purchased

SELECT distinct 
  recommend_to_customer_name
  , based_on_purchase_customer_name
  , customer_a_gender
  , recommended_product_type
FROM GRAPH_TABLE (
       recommender_graph
       MATCH   (customer_a  IS customers) - [e1 is purchases] -> 
               (product_a IS products) <- [e2 is purchases] - (customer_b IS customers) - 
               [e3 is purchases ] -> (recommended_product IS products)
       COLUMNS (
               customer_a.customer_name  as recommend_to_customer_name
               , customer_a.customer_id  as recommend_to_customer_id
               , customer_b.customer_name as based_on_purchase_customer_name
               , customer_a.gender as customer_a_gender
               , customer_b.gender as customer_b_gender
               , recommended_product.product_type as recommended_product_type
               , recommended_product.product_id as recommended_product_id
        )
) --as recommendations
where customer_a_gender = customer_b_gender
  and recommend_to_customer_name <> based_on_purchase_customer_name
   and not exists (
     select 1 
     from purchases 
     where recommend_to_customer_id = purchases.customer_id
       and recommended_product_id = purchases.PRODUCT_ID
)
  ;

This will result in

Note this is not taking the Social Network (FOLLOWS relationship) into account. Now we will update the query to utilize multiple paths in the MATCH clause to get to the desired outcome.

In using Graph Query Language (GQL) to query Property Graphs, pattern matching is done by specifying one or more path patterns in the MATCH clause. A single path pattern matches a linear path of vertices and edges, while more complex patterns can be matched by combining multiple path patterns, separated by comma. This is very useful when analyzing Social Network graphs.

SELECT distinct 
  recommend_to_customer_name
  , based_on_purchase_customer_name
  , customer_a_gender
  , recommended_product_type
FROM GRAPH_TABLE (
       recommender_graph
       MATCH   (customer_a  IS customers) - [e1 is purchases] -> 
               (product_a IS products) <- [e2 is purchases] - (customer_b IS customers) - 
               [e3 is purchases ] -> (recommended_product IS products)
              , (customer_a) - [e4 is followers] ->{,10} (customer_b) --- incorporates the FOLLOWS relationship
       COLUMNS (
               customer_a.customer_name  as recommend_to_customer_name
               , customer_a.customer_id  as recommend_to_customer_id
               , customer_b.customer_name as based_on_purchase_customer_name
               , customer_a.gender as customer_a_gender
               , customer_b.gender as customer_b_gender
               , recommended_product.product_type as recommended_product_type
               , recommended_product.product_id as recommended_product_id
        )
) --as recommendations
where customer_a_gender = customer_b_gender
  and recommend_to_customer_name <> based_on_purchase_customer_name
   and not exists (
     select 1 
     from purchases 
     where recommend_to_customer_id = purchases.customer_id
       and recommended_product_id = purchases.PRODUCT_ID
)
  ;


This will output

image

Note that this is incorporating the FOLLOWS relationship. While there is not direct relationship between Sofia and Chiara, Sofia follows Fabio and Fabio follows Chiara. MATCH can find variable lengths patterns. To find relationship between customer up to 5 Levels we use : (customer_a) - [e4 is followers] ->{,10} (customer_b)

Available length patterns:

{n} : exactly n steps
{n.m} : between n and m steps (inclusive)
{,n} : up to n steps (from 0, inclusive)
? : 0 or 1 steps

Acknowledgments:

  1. Property Graph visualizations were created using G.V() – Graph Database Client & Visualization Tool