Search technology has evolved significantly – and so have user expectations, especially since the rise of GenAI chatbots like ChatGPT. Users searching for products in e-commerce used to try to match product keywords. Now they expect to be able to do their product searches in natural language. Just six months after the launch of ChatGPT, Algolia’s CTO reported observing “twice as many keywords per search query” on their search-as-a-service platform as before.
In this article, we’ll show you how to build a modern search system for e-commerce products using natural language queries that leverage context, rather than traditional text-to-SQL approaches that struggle with meaning.
Traditional keyword-based search with filters often fails to capture the meaning of complex queries that combine multiple attributes, such as text descriptions, numerical data (e.g., price), categorical information (e.g., product type), and ratings. For example:
We can build a system that’s more flexible and effective than converting natural language to SQL queries (text-to-SQL), using:
from superlinked import framework as sl
from superlinked_app import constants


class ProductSchema(sl.Schema):
    id: sl.IdField
    type: sl.String
    category: sl.StringList
    title: sl.String
    description: sl.String
    review_rating: sl.Float
    review_count: sl.Integer
    price: sl.Float


product = ProductSchema()
# Create similarity spaces for different attributes
category_space = sl.CategoricalSimilaritySpace(
    category_input=product.category,
    categories=constants.CATEGORIES,
    uncategorized_as_category=True,
    negative_filter=-1,
)
description_space = sl.TextSimilaritySpace(
    text=product.description,
    model="Alibaba-NLP/gte-large-en-v1.5",
)
review_rating_maximizer_space = sl.NumberSpace(
    number=product.review_rating,
    min_value=-1.0,
    max_value=5.0,
    mode=sl.Mode.MAXIMUM,
)
price_minimizer_space = sl.NumberSpace(
    number=product.price,
    min_value=0.0,
    max_value=1000,
    mode=sl.Mode.MINIMUM,
)
product_index = sl.Index(
    spaces=[
        category_space,
        description_space,
        review_rating_maximizer_space,
        price_minimizer_space,
    ],
    fields=[product.type, product.category, product.review_rating, product.price],
)
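To give some intuition for what the `MAXIMUM` and `MINIMUM` modes on the number spaces express, here is a minimal conceptual sketch in plain Python. It is not Superlinked's actual encoding, which may work quite differently; the helper `scale_number` and its `prefer_high` flag are hypothetical names introduced only for illustration. The idea is that a number is clamped to the configured range and turned into a preference score, where higher ratings or lower prices score closer to 1.

```python
def scale_number(value, min_value, max_value, prefer_high=True):
    """Clamp value to [min_value, max_value] and map it to a 0..1 preference score.

    Hypothetical helper: prefer_high=True mimics a "maximize" preference,
    prefer_high=False mimics a "minimize" preference.
    """
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    clamped = max(min_value, min(max_value, value))
    fraction = (clamped - min_value) / (max_value - min_value)
    return fraction if prefer_high else 1.0 - fraction


# Same ranges as the number spaces above: ratings in [-1, 5], prices in [0, 1000].
rating_score = scale_number(4.5, min_value=-1.0, max_value=5.0, prefer_high=True)
price_score = scale_number(250.0, min_value=0.0, max_value=1000.0, prefer_high=False)
out_of_range_price = scale_number(2000.0, min_value=0.0, max_value=1000.0, prefer_high=False)

print(rating_score, price_score, out_of_range_price)
```

Because both attributes end up on the same 0..1 scale, a cheap product and a highly rated product can be compared and weighted against each other instead of being cut off by a hard filter.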
# base_query and the *_param objects are not shown in this excerpt
semantic_query = (
    base_query
    .similar(description_space, text_param)
    .similar(price_minimizer_space, price_param)
    .similar(review_rating_maximizer_space, rating_param)
)
Using a multi-attribute vector search system, the query
"psychology books with a price lower than 100 and a rating bigger than 4"
gets automatically decoded into:
semantic search elements:
    description="psychology"
    price=100
    rating=4
filters:
    type="book"
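To make the shape of this decoding step concrete, here is a toy sketch that produces the same structure for the example query. It is a deliberately naive, rule-based stand-in and not how the system described here decodes queries: the regex patterns, the stop-word list, and the `decode_query` function are all assumptions made for illustration, and they only handle phrasing very close to the example.

```python
import re

# Hypothetical phrase patterns; a real system would not depend on fixed wording.
PRICE_PATTERN = re.compile(r"price (?:lower|less|cheaper) than (\d+(?:\.\d+)?)")
RATING_PATTERN = re.compile(r"rating (?:bigger|higher|greater) than (\d+(?:\.\d+)?)")
TYPE_PATTERN = re.compile(r"\bbooks?\b")
STOP_WORDS = {"with", "a", "an", "and", "the"}


def decode_query(query):
    """Split a query into semantic search elements and a hard type filter."""
    text = query.lower()
    decoded = {"description": None, "price": None, "rating": None, "type": None}

    # Pull out the numeric constraints and remove them from the remaining text.
    for key, pattern in (("price", PRICE_PATTERN), ("rating", RATING_PATTERN)):
        match = pattern.search(text)
        if match:
            decoded[key] = float(match.group(1))
            text = text.replace(match.group(0), " ")

    # Treat the product type as a filter rather than a similarity target.
    if TYPE_PATTERN.search(text):
        decoded["type"] = "book"
        text = TYPE_PATTERN.sub(" ", text)

    # Whatever is left (minus stop words) becomes the free-text description.
    leftover = [word for word in text.split() if word not in STOP_WORDS]
    decoded["description"] = " ".join(leftover) or None
    return decoded


decoded = decode_query(
    "psychology books with a price lower than 100 and a rating bigger than 4"
)
print(decoded)
# {'description': 'psychology', 'price': 100.0, 'rating': 4.0, 'type': 'book'}
```

The fragility of these hand-written rules is the point: any rewording of the query breaks them, which is why this step is worth automating with a model that understands natural language.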
Our system then performs vector similarity search across all attributes simultaneously, providing more nuanced and relevant results than traditional filtering.
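The difference from hard filtering can be illustrated with a small, self-contained sketch. It assumes each product already has a description embedding plus price and rating scores on a 0..1 scale; the vectors, scores, product ids, and the `combined_score` function are made up for illustration and do not reflect Superlinked's internals. Every attribute contributes to one weighted score, so a product that is slightly off on one attribute can still rank well if it is strong on the others.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def combined_score(product, query_vec, weights):
    """Weighted sum of text similarity and precomputed price/rating scores."""
    return (
        weights["description"] * cosine(product["description_vec"], query_vec)
        + weights["price"] * product["price_score"]
        + weights["rating"] * product["rating_score"]
    )


# Made-up catalogue: 2-d "embeddings" and 0..1 scores, for illustration only.
products = [
    {"id": "cookbook", "description_vec": [0.1, 0.9], "price_score": 0.985, "rating_score": 0.98},
    {"id": "psych-pricey", "description_vec": [0.95, 0.05], "price_score": 0.60, "rating_score": 0.97},
    {"id": "psych-budget", "description_vec": [0.9, 0.1], "price_score": 0.98, "rating_score": 0.93},
]

query_vec = [1.0, 0.0]  # stands in for the embedding of "psychology"
weights = {"description": 1.0, "price": 1.0, "rating": 1.0}

ranked = sorted(
    products,
    key=lambda p: combined_score(p, query_vec, weights),
    reverse=True,
)
ranked_ids = [p["id"] for p in ranked]
print(ranked_ids)
# ['psych-budget', 'psych-pricey', 'cookbook']
```

Changing the weights shifts the ranking without rewriting the query logic: raising the `description` weight, for instance, favors topical relevance over price.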
A complete implementation of this system is available on GitHub. It includes:
Combining vector embeddings and natural language processing is far more intuitive and powerful than traditional text-to-SQL methods for handling the complexities of e-commerce search. Using a multi-attribute vector search platform like Superlinked’s, you can build modern search experiences that meet the expectations of 21st-century e-commerce users, without the complexities and deficiencies of manual SQL query generation or the overhead of maintaining a separate search index for each attribute type.
Ready to try it out yourself? Superlinked is here to help you with your implementation. We also offer a fully-managed cloud service. Don’t hesitate to get in touch!
This article is based on Paul Iusztin’s tutorial on Decoding ML - check out the full version here.