The First International Workshop on
Interactive and Scalable Information Retrieval
methods for eCommerce
Feb 25th, 2022, Arizona, USA. Co-located with WSDM 2022.

About the Workshop

Over the past few years, consumer behavior has shifted from traditional in-store shopping to online shopping. For example, eCommerce grew from around 5% of total US sales in 2012 to around 14% in 2021, with more than 25% growth in sales globally. This rapid growth of eCommerce has created new challenges and vital new requirements for intelligent information retrieval systems.

Scalable systems

Since the pandemic hit, eCommerce has become an important part of people’s routine, and they now shop online for everything from the smallest grocery items to large electronics and even cars. With such a large assortment of products and millions of users, achieving higher scalability without losing accuracy is a leading concern for information retrieval systems for eCommerce.

Interactive Systems

The diversity of buyers makes the relevance of results highly subjective, because relevance varies from buyer to buyer. The most suitable and intuitive solution to this problem is to make the system interactive so that it can provide the right relevance for each user. Hence, interactive information retrieval systems are becoming a necessity in eCommerce.

System improvement

To handle the sudden change in buyers’ behavior, industry adopted existing, sub-optimal information retrieval techniques for various eCommerce tasks. In parallel, companies also started researching better solutions and are in dire need of help from the research community.

The objective of this workshop is to bring a diverse set of practitioners and researchers together and encourage them to share their ideas, challenges, solutions, and research. This workshop will provide a forum to discuss and learn the latest trends in interactive and scalable information retrieval approaches for eCommerce.

Call for Papers

This workshop aims to give academic and industrial researchers a platform to present their latest work, share research ideas, and discuss various challenges. Hence, we invite two kinds of contributions: full research papers (up to 10 pages) and short papers (up to 5 pages). Submissions must be in English, as a PDF file, formatted according to the new ACM format published in ACM guidelines, selecting the generic two-column “sigconf” sample (including references and figures). The papers can represent reports of original research, preliminary research results, or proposals for new work. The review process is single-blind. Papers will be evaluated according to their significance, originality, technical content, style, clarity, relevance to the workshop, and likelihood of generating discussion. Authors should note that changes to the author list after the submission deadline are not allowed without permission from the PC Chairs. At least one author of each accepted paper is required to register for, attend, and present the work at the workshop.

Topics of interest include, but are not limited to:
  • Query Understanding
    • Type-ahead/auto-completion, spell correction
    • Query intent understanding
    • Non-text query understanding
    • Attribute understanding
  • Product Understanding
    • Product intent and facets
    • Product knowledge graph
    • Ontology mining for product graph construction
  • Product Retrieval and Ranking
    • Product indexing and recall
    • Scalable and real-time indexing for frequently changing products (offers, auctions, etc.)
    • Recall and Ranking for multi-faceted products and multi-attributed queries
    • Ranking for Relevance vs Popularity vs Business trade-offs
    • Search Re-Ranking
  • Personalization and Recommendation
    • Interactive Search for personalization
    • Context and/or location based personalization
    • User attribute based personalization
    • Personalized and Semantic Retrieval
  • Conversational Search and Recommendation
    • Multi-turn product search and recommendation
    • Conversational query understanding and re-writing
    • Clarification and preference elicitation
    • Conversational result presentation and explanation
    • Multi-modal conversational systems for eCommerce
  • Other Topics
    • Feature learning for eCommerce search
    • Search & Recommendations: Fairness and trust for marketplaces
    • Balancing sponsorship vs relevance trade-off in search results
    • Robust training objective and effective experimental strategy for IR models
    • End-to-End solution for interactive and scalable search framework

All papers must be submitted via EasyChair at:

Important Dates (AoE):

  • Submission deadline: Jan 16, 2022
  • Paper notifications: Jan 23, 2022
  • Camera-ready deadline: Feb 15, 2022
  • Workshop Day: Feb 25, 2022


Registration: WSDM 2022

Time (MST) Talk Title
10:50am - 11:00am Opening Remarks Welcome to ISIR-eCom
11:00am - 11:45am Maarten de Rijke Understanding Multi-channel Customer Behavior. [PDF]
11:45am - 12:30pm Khalifeh Al Jadda Building Multi Modal Search and Recommender Systems at Scale. [PDF]
12:30pm - 12:50pm Full Paper 1 ROSE: Robust Caches for Amazon Product Search. [PDF]
12:50pm - 1:10pm Full Paper 2 ORDSIM: Ordinal Regression for E-Commerce Query Similarity Prediction. [PDF]
1:10pm - 1:30pm Full Paper 3 Embracing Structure in Data for Billion-Scale Semantic Product Search. [PDF]
1:30pm - 1:45pm Break
1:45pm - 2:45pm Panel Discussion Opportunities and Challenges in conversational search and recommendation in eCommerce
2:45pm - 3:00pm Short Paper 1 CatBERT: An Incrementally Trained Language Representation Model for E-Commerce Applications. [PDF]
3:00pm - 3:15pm Short Paper 2 E-commerce Product Attribute Value Validation and Correction Based on Transformers. [PDF]
3:15pm - 4:00pm Yanjie Fu Merging Representation Learning and Interactive Intelligence for User Profiling. [PDF]
4:00pm - 4:45pm Xiaokui Xiao Efficient Network Embeddings for Large Graphs [PDF]
4:45pm - 4:50pm Closing Notes


Maarten de Rijke

Distinguished Professor, University of Amsterdam

Maarten de Rijke is a University Professor of Artificial Intelligence and Information Retrieval at the University of Amsterdam and the director of the national Innovation Center for Artificial Intelligence. He holds MSc degrees in Philosophy and Mathematics (both cum laude), and a PhD in Theoretical Computer Science. He worked as a postdoc at CWI and as a Warwick Research Fellow at the University of Warwick before joining the University of Amsterdam in 1998, where he was appointed full professor in 2004 and Distinguished University Professor in 2018. With an h-index of 80, De Rijke has published over 900 papers and published or edited over a dozen books. He is a former editor-in-chief of ACM Transactions on Information Systems, co-editor-in-chief of Foundations and Trends in Information Retrieval and of Springer’s Information Retrieval book series, (associate) editor for various journals and book series, and a former coordinator of retrieval evaluation tracks at TREC, CLEF and INEX. He has been general (co)chair or program (co)chair for SIGIR, WSDM, WWW, CIKM, ECIR, and ICTIR.

TITLE: Understanding Multi-channel Customer Behavior

Online shopping is gaining popularity. Traditional retailers with physical stores adjust to this trend by allowing their customers to shop online as well as offline, in-store. Increasingly, customers can browse and purchase products across multiple shopping channels. Understanding how customer behavior relates to the availability of multiple shopping channels is an important prerequisite for many downstream machine learning tasks, such as recommendation and purchase prediction. However, previous work in this domain is limited to analyzing single-channel behavior only. In this talk, I share insights into multi-channel customer behavior in retail based on a large sample of 2.8 million transactions originating from 300,000 customers of a food retailer in Europe. Our analysis reveals significant differences in customer behavior across online and offline channels, for example with respect to the repeat ratio of item purchases and basket size. Based on these findings, we investigate the performance of a next basket recommendation model under multi-channel settings. We find that the recommendation performance differs significantly for customers based on their choice of shopping channel, which strongly indicates that future research on recommenders in this area should take into account the particular characteristics of multi-channel retail shopping.

Khalifeh Al Jadda

Sr. Director of Data Science, Home Depot

Khalifeh AlJadda holds a Ph.D. in computer science from the University of Georgia (UGA), with a specialization in machine learning. He has experience implementing large-scale, distributed machine learning algorithms to solve challenging problems in domains ranging from Bioinformatics to search and recommendation engines. He is the Sr. Director of Online Data Science at Home Depot, which is the largest home improvement company in the world. In his current role he oversees the AI transformation of the online business of Home Depot. Before joining Home Depot he was leading the data science R&D organization at CareerBuilder where he led the initiative to design and implement the backend of CareerBuilder’s language-agnostic semantic search engine leveraging NLP, Apache Spark, and the Hadoop ecosystem. He also led the team on building a new AI-based recommendation engine using cutting-edge technologies. Khalifeh is the founder and organizer of the Southern Data Science Conference, which is a major data science conference in Atlanta that aims to promote data science in the Southeast region. He also co-founded a non-profit organization ATLytiCS.

TITLE: Building Multi Modal Search and Recommender Systems at Scale

In recent years we have witnessed a significant increase in e-commerce shopping, especially during the pandemic, when most people shifted to online shopping due to lockdowns or to shop safely from home. As a result, e-commerce platforms have become a crucial part of any retail business, and retailers have started to invest more in improving their digital stores, focusing on providing customers with the best experience, where they can find whatever they need in a frictionless way. Product discoverability is a core function of any e-commerce platform, and building intelligent information retrieval systems, such as search and recommender systems that help customers find what they need without much effort, is very important to the success of any online shopping platform. In this talk I’ll present how we leveraged different AI techniques to build intelligent search and recommendation engines that improve product discoverability and thus help customers find what they need with less effort. The talk will also cover the importance of leveraging different data modalities to solve different discoverability problems.

Yanjie Fu

Assistant Professor, University of Central Florida

Dr. Yanjie Fu is an assistant professor in the Department of Computer Science at the University of Central Florida. He received his Ph.D. degree from Rutgers, the State University of New Jersey in 2016, the B.E. degree from University of Science and Technology of China in 2008, and the M.E. degree from Chinese Academy of Sciences in 2011. His research interests include data mining and big data analytics. He has research experience in industry research labs, such as Microsoft Research Asia and IBM Thomas J. Watson Research Center. He has published prolifically in refereed journals and conference proceedings, such as IEEE TKDE, IEEE TMC, ACM TKDD, ACM SIGKDD, AAAI, IJCAI, VLDB, WWW. He received US NSF CAREER Award (2021), ACM SIGSpatial Best Paper Runner-Up Award (2020), US NSF CRII Award (2018), ACM SIGKDD Best Student Paper Finalist (2018), University of Missouri System Research Board Award (2017), Microsoft Research Azure Research Award (2016), IEEE ICDM Best Paper Finalist (2014). He is committed to data science education. His graduated Ph.D. students have joined academia as tenure-track faculty members.

TITLE: Merging Representation Learning and Interactive Intelligence for User Profiling

The pervasiveness of mobile, IoT, and sensing technologies has connected humans, physical worlds, and cyber worlds into a grand human-social-technological system. This system consists of individual users, devices, infrastructures, and cities that interact and communicate with each other in real time and at different locations. As a result, big spatial-temporal-networked data have been accumulated from mobile devices and App services. Such spatial-temporal-networked data have unprecedented and unique complexity. In this talk, he will introduce different representation techniques and the idea of representation as a tool for user profiling. Later, he will discuss how the structured patterns in user behavioral data can be used to inform the development of advanced representation learning methods. Finally, he will discuss how interactive intelligence can be integrated into representation learning to support user profiling.

Xiaokui Xiao

Dean's Chair Associate Professor, National University of Singapore

Xiaokui Xiao is a Dean's Chair Associate Professor at the School of Computing, National University of Singapore (NUS). He received a Ph.D. in Computer Science from the Chinese University of Hong Kong, and did a postdoctoral stint at the Department of Computer Science, Cornell University. Before joining NUS, he was an associate professor at the Nanyang Technological University, Singapore. Xiaokui’s research focuses on data management and analytics, especially on algorithms for large data, data privacy, and data mining. His research interests also include social network analysis, graph embedding and graph neural networks. He has published extensively in the leading data management conferences and journals, and is serving as an associate editor for the International Journal on Very Large Data Bases (VLDBJ) and the IEEE Transactions on Knowledge and Data Engineering (TKDE). He received the best research paper award in VLDB 2021, and was elected a distinguished member of ACM in 2021.

TITLE: Efficient Network Embeddings for Large Graphs

Given a graph G, network embedding maps each node in G into a compact, fixed-dimensional feature vector, which can be used in downstream machine learning tasks. Most of the existing methods for network embedding fail to scale to large graphs with millions of nodes, as they either incur significant computation cost or generate low-quality embeddings on such graphs. In this talk, we will present two efficient network embedding algorithms for large graphs with and without node attributes, respectively. The basic idea is to first model the affinity between nodes (or between nodes and attributes) based on random walks, and then factorize the affinity matrix to derive the embeddings. The main challenges that we address include (i) the choice of the affinity measure and (ii) the reduction of space and time overheads entailed by the construction and factorization of the affinity matrix. Extensive experiments on large graphs demonstrate that our algorithms outperform the existing methods in terms of both embedding quality and efficiency.
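The general recipe in this abstract (random-walk affinity, then matrix factorization) can be illustrated with a small sketch. This is not the speakers' algorithm: the affinity measure (a truncated random walk with restart), the parameter names `alpha`, `steps`, and `dim`, and the toy graph are all assumptions chosen for illustration, and the dense n-by-n matrix used here ignores exactly the space and time overheads the talk sets out to reduce.

```python
import numpy as np


def random_walk_affinity(adj, alpha=0.15, steps=10):
    """Assumed affinity measure: truncated random walk with restart.

    Returns sum over t = 0..steps of alpha * (1 - alpha)^t * P^t,
    where P is the row-normalized adjacency matrix.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0          # isolated nodes: avoid division by zero
    P = adj / deg                # row-stochastic transition matrix
    n = adj.shape[0]
    affinity = np.zeros((n, n))
    walk = np.eye(n)             # P^0
    for t in range(steps + 1):
        affinity += alpha * (1 - alpha) ** t * walk
        walk = walk @ P          # advance the walk by one step
    return affinity


def embed(adj, dim=2, alpha=0.15, steps=10):
    """Factorize the affinity matrix with a truncated SVD.

    Returns (source, target) embeddings such that source @ target.T
    is the best rank-`dim` approximation of the affinity matrix.
    """
    affinity = random_walk_affinity(adj, alpha, steps)
    U, S, Vt = np.linalg.svd(affinity, full_matrices=False)
    root = np.sqrt(S[:dim])
    source = U[:, :dim] * root
    target = Vt[:dim, :].T * root
    return source, target


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

source, target = embed(adj, dim=2)
```

Each row of `source` is a fixed-dimensional vector for one node. A method meant for graphs with millions of nodes would have to avoid materializing the full affinity matrix, which is the second challenge the abstract names.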

Panel Discussion

Opportunities and Challenges in conversational search and recommendation in eCommerce

Stephen Guo

Director@ads, Walmart Global Tech - Moderator

Max Harper

Senior Applied Research Scientist at Amazon

Pranam Kolari

VP@search, Walmart Global Tech

Rui Li

ML Engineer Manager at Pinterest

Kexin Xie

VP & ML Principal Architect at Salesforce

Lingfei Wu

Principal Scientist at JD.COM Silicon Valley Research Center

Accepted Papers

  1. ROSE: Robust Caches for Amazon Product Search: Chen Luo, Vihan Lakshman, Anshumali Shrivastava, Tianyu Cao, Sreyashi Nag, Rahul Goutam, Hanqing Lu, Yiwei Song and Bing Yin. [PDF] [BibTex]

  2. ORDSIM: Ordinal Regression for E-Commerce Query Similarity Prediction: Md. Ahsanul Kabir, Mohammad Al Hasan, Aritra Mandal, Daniel Tunkelang and Zhe Wu. [PDF] [BibTex]

  3. Embracing Structure in Data for Billion-Scale Semantic Product Search: Vihan Lakshman, Choon Hui Teo, Xiaowen Chu, Abhinandan Patni, Pooja Maknikar and Svn Vishwanathan. [PDF] [BibTex]

  4. CatBERT: An Incrementally Trained Language Representation Model for E-Commerce Applications: Tejaswini Mallavarapu, Ying Xie and Simon Hughes. [PDF] [BibTex]

  5. E-commerce Product Attribute Value Validation and Correction Based on Transformers: Le Yu, Haozheng Tian, Yun Zhu, Simon Hughes and Aleksandar Velkoski. [PDF] [BibTex]


Vachik Dave

Walmart Global Tech

Linsey Pang

Walmart Global Tech

Xiquan Cui

The Home Depot

Lingfei Wu

JD.COM Silicon Valley Research Center

Hamed Zamani

University of Massachusetts Amherst

George Karypis

University of Minnesota Twin Cities

Program Committee:

  • Thomas Packer, The Home Depot, USA
  • Jake Gao, Walmart Global Tech, USA
  • Yaxin Zhu, UMass Amherst, USA
  • Mohit Sharma, Google, USA
  • Nguyen Vo, Walmart Global Tech, USA