Year: 2025
Role: Machine Learning Engineer
Duration: 1 week
Relevant links: GitHub
Summary: A Two-Tower network for document retrieval on the MS MARCO dataset.
To rapidly deepen my expertise in deep learning, I enrolled in an intensive, eight-week bootcamp designed for experienced software engineers. One of my primary projects there involved building a semantic search engine using the MS MARCO dataset.
Initially, I experimented with training custom word2vec embeddings on the text8 Wikipedia corpus. However, after careful evaluation, I opted for the reliable, open-source embeddings provided by Gensim.
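At their core, pretrained word embeddings like those Gensim provides are just a word-to-vector lookup plus cosine similarity. A minimal sketch of that idea, using a hand-made toy vocabulary (the words and vectors here are invented for illustration, not taken from an actual Gensim model):

```python
import numpy as np

# Toy stand-in for a pretrained embedding table. In the real project the
# vectors came from Gensim's pretrained models; these values are made up.
vocab = {"king": 0, "queen": 1, "apple": 2}
vectors = np.array([
    [0.90, 0.80, 0.10],
    [0.85, 0.82, 0.15],
    [0.10, 0.20, 0.95],
])

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

king, queen, apple = (vectors[vocab[w]] for w in ("king", "queen", "apple"))
# Semantically related words get a higher cosine similarity.
assert cosine(king, queen) > cosine(king, apple)
```

Using well-trained pretrained vectors simply replaces this toy table with one learned from billions of tokens, which is why they outperformed my small text8-trained word2vec run.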
The final architecture I implemented was a Two-Tower RNN model, featuring:
Parallel encoding layers: one dedicated to queries and the other to documents (both positive and negative).
Recurrent neural networks (RNNs) for capturing the sequential nature of text, preserving critical information such as word order.
An InfoNCE loss function that encourages queries to be close to relevant documents while distancing them from irrelevant ones, thereby enhancing the overall quality of the learned representations.
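The architecture above can be sketched roughly as follows in PyTorch. The dimensions, the GRU choice, and the in-batch-negatives formulation of InfoNCE are assumptions for illustration; the write-up does not specify the project's actual hyperparameters:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tower(nn.Module):
    """GRU encoder mapping a token-ID sequence to a single unit vector."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        _, h = self.rnn(self.embed(token_ids))    # h: (1, batch, hidden)
        return F.normalize(h.squeeze(0), dim=-1)  # unit-length embeddings

def info_nce(query_emb, doc_emb, temperature=0.07):
    """In-batch InfoNCE: the positive for query i is doc i; every other
    doc in the batch acts as a negative."""
    logits = query_emb @ doc_emb.T / temperature
    targets = torch.arange(query_emb.size(0))
    return F.cross_entropy(logits, targets)

# Two separate towers: one for queries, one for documents.
query_tower, docs_tower = Tower(vocab_size=1000), Tower(vocab_size=1000)
queries = torch.randint(0, 1000, (8, 12))   # 8 queries, 12 tokens each
docs = torch.randint(0, 1000, (8, 40))      # matching positive docs
loss = info_nce(query_tower(queries), docs_tower(docs))
```

Minimizing this loss pulls each query toward its matching document in embedding space while pushing it away from the other documents in the batch.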
This project served as a hands-on introduction to deep learning for semantic search and provided valuable insights into both embedding techniques and contrastive learning.
For inference, I precomputed embeddings for every document in the dataset using the DocsTower network and stored them in a Faiss in-memory index. When an API request arrives, I simply generate the query embedding through the QueryTower network and perform a k-nearest neighbors (kNN) search in Faiss to quickly retrieve the most relevant documents.
The entire solution is modular, fully containerized, and designed for scalability. It's currently deployed on a Hetzner server that I rent, ensuring I have the flexibility and control needed for ongoing experimentation and optimization.