Yalie Search

Semantic search for Yale students using CLIP — find people by describing their appearance, style, or characteristics.

Tags: Machine Learning, Python, Next.js, CLIP

Overview

Yalie Search is a semantic search engine that lets you find Yale students by describing their appearance in natural language. Powered by OpenAI’s CLIP (Contrastive Language-Image Pre-training), it understands the relationship between text descriptions and images to return relevant matches.

Instead of searching by name, you can search by description — “curly red hair and freckles” or “wearing a Yale crew sweatshirt” — and find matching profiles from the Yale directory.

Key Features

  • Natural Language Search — Find people using descriptive queries
  • Advanced Filters — Filter results by residential college, graduation year, and major
  • Find Similar — Click any result to discover visually similar people
  • Leaderboards — See trending searches and most-viewed profiles
  • Anonymous Mode — Search privately without saving queries to your search history
  • Content Moderation — AI-powered filtering using GPT-4o-mini to prevent misuse
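The moderation layer can be sketched as a classification call to GPT-4o-mini. The payload below follows the standard OpenAI chat-completions format, but the helper names, system prompt, and ALLOW/BLOCK protocol are illustrative assumptions, not the project's actual implementation:

```python
# Hypothetical sketch of the moderation step: build a chat-completions
# request asking gpt-4o-mini to classify a search query, then parse a
# one-word verdict. Prompt wording and function names are assumptions.

SYSTEM_PROMPT = (
    "You review search queries for a campus face-search tool. "
    "Reply with exactly ALLOW or BLOCK."
)

def build_moderation_request(query: str) -> dict:
    """Assemble the JSON payload for the OpenAI chat-completions API."""
    return {
        "model": "gpt-4o-mini",
        "temperature": 0,  # deterministic verdicts
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": query},
        ],
    }

def parse_verdict(reply_text: str) -> bool:
    """True if the model's reply allows the query to proceed."""
    return reply_text.strip().upper().startswith("ALLOW")

payload = build_moderation_request("wearing a Yale crew sweatshirt")
```

Keeping the temperature at 0 and constraining the reply to two tokens makes the verdict cheap and easy to parse; blocked queries can then be rejected before they ever reach the CLIP index.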

Tech Stack

Backend:

  • FastAPI for the API layer
  • PyTorch with CLIP ViT-Large-Patch14 for embeddings
  • SQLite for leaderboard persistence
  • OpenAI API for content moderation
  • Yale CAS for authentication
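SQLite is a natural fit for the leaderboard because search counts need only a single upserted table. A minimal stdlib sketch, with the caveat that the table and column names here are assumptions rather than the project's actual schema:

```python
import sqlite3

# Minimal sketch of leaderboard persistence using stdlib sqlite3.
# Schema (table "searches", columns query/hits) is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS searches ("
    "  query TEXT PRIMARY KEY,"
    "  hits  INTEGER NOT NULL DEFAULT 0)"
)

def record_search(query: str) -> None:
    # Upsert: insert the query, or bump its hit counter if it exists.
    conn.execute(
        "INSERT INTO searches (query, hits) VALUES (?, 1) "
        "ON CONFLICT(query) DO UPDATE SET hits = hits + 1",
        (query,),
    )
    conn.commit()

def trending(limit: int = 5) -> list[tuple[str, int]]:
    # Most-searched queries, highest hit count first.
    return conn.execute(
        "SELECT query, hits FROM searches ORDER BY hits DESC LIMIT ?",
        (limit,),
    ).fetchall()

for q in ["red hair", "crew sweatshirt", "red hair"]:
    record_search(q)
```

The `ON CONFLICT ... DO UPDATE` upsert keeps the counter increment atomic, so concurrent searches never race on a read-modify-write.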

Frontend:

  • Next.js 14 with App Router
  • TypeScript for type safety
  • Tailwind CSS with glassmorphism design
  • Framer Motion for animations

Infrastructure:

  • Railway for backend hosting (Docker)
  • Vercel for frontend (Edge network)

How It Works

  1. Embedding Generation: Profile images are encoded into 768-dimensional vectors using CLIP
  2. Query Processing: Natural language queries are encoded into the same vector space
  3. Similarity Search: Cosine similarity finds the closest matches between query and image embeddings
  4. Caching: Popular queries are cached for instant results
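Steps 2 through 4 can be sketched with stdlib Python. In the real system both sides of the comparison are 768-dimensional CLIP ViT-Large-Patch14 embeddings; here tiny hand-made vectors stand in for them, and the profile names and function signatures are illustrative assumptions:

```python
import math
from functools import lru_cache

# Stdlib-only sketch: rank precomputed image embeddings against a query
# embedding by cosine similarity, with a cache for repeated queries.
# Toy 3-d vectors stand in for CLIP's 768-d embeddings.

def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

PROFILES = {  # name -> (toy) image embedding
    "alice": (0.9, 0.1, 0.0),
    "bob":   (0.1, 0.9, 0.1),
    "carol": (0.0, 0.2, 0.9),
}

@lru_cache(maxsize=1024)  # step 4: popular queries hit the cache
def search(query_embedding: tuple[float, ...], k: int = 2) -> list[str]:
    # Step 3: sort profiles by similarity to the query, keep the top k.
    ranked = sorted(
        PROFILES,
        key=lambda name: cosine(query_embedding, PROFILES[name]),
        reverse=True,
    )
    return ranked[:k]

# A query whose embedding points toward alice's
print(search((1.0, 0.0, 0.1)))  # → ['alice', 'bob']
```

Because CLIP places text and image embeddings in a shared vector space, the same `cosine` ranking also powers the "Find Similar" feature: substituting a profile's image embedding for the query embedding returns its visually closest neighbors.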