Yalie Search
Semantic search for Yale students using CLIP — find people by describing their appearance, style, or characteristics.
Machine Learning Python Next.js CLIP
Overview
Yalie Search is a semantic search engine that lets you find Yale students by describing their appearance in natural language. Built on OpenAI’s CLIP (Contrastive Language-Image Pre-training), it embeds text descriptions and profile photos into a shared vector space, so a query can be matched directly against images to return the most relevant profiles.
Instead of searching by name, you can search by description — “curly red hair and freckles” or “wearing a Yale crew sweatshirt” — and find matching profiles from the Yale directory.
Key Features
- Natural Language Search — Find people using descriptive queries
- Advanced Filters — Filter results by residential college, graduation year, and major
- Find Similar — Click any result to discover visually similar people
- Leaderboards — See trending searches and most-viewed profiles
- Anonymous Mode — Search privately without logging to history
- Content Moderation — AI-powered filtering using GPT-4o-mini to prevent misuse
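The moderation step could look roughly like the sketch below. This is an illustration only: the prompt, function names, and ALLOW/BLOCK protocol are assumptions, not the service's actual policy. The OpenAI SDK is imported lazily so the parsing logic can be exercised without it installed.

```python
# Hypothetical sketch of query moderation with GPT-4o-mini.
# The system prompt and ALLOW/BLOCK convention are illustrative assumptions.
SYSTEM_PROMPT = (
    "You are a content moderator for a campus photo search engine. "
    "Reply with exactly ALLOW or BLOCK for the user's search query."
)

def parse_verdict(reply: str) -> bool:
    """True if the model allowed the query."""
    return reply.strip().upper().startswith("ALLOW")

def is_query_allowed(query: str) -> bool:
    # Imported lazily so the rest of the sketch works without the SDK.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": query},
        ],
        max_tokens=1,
        temperature=0,
    )
    return parse_verdict(resp.choices[0].message.content)
```

Keeping the verdict to a single token makes the call cheap and the response trivial to parse.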
Tech Stack
Backend:
- FastAPI for the API layer
- PyTorch with CLIP ViT-Large-Patch14 for embeddings
- SQLite for leaderboard persistence
- OpenAI API for content moderation
- Yale CAS for authentication
Frontend:
- Next.js 14 with App Router
- TypeScript for type safety
- Tailwind CSS with glassmorphism design
- Framer Motion for animations
Infrastructure:
- Railway for backend hosting (Docker)
- Vercel for frontend (Edge network)
How It Works
- Embedding Generation: Profile images are encoded into 768-dimensional vectors using CLIP
- Query Processing: Natural language queries are encoded into the same vector space
- Similarity Search: Cosine similarity finds the closest matches between query and image embeddings
- Caching: Popular queries are cached for instant results
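Steps 2–4 can be sketched with plain NumPy, assuming the image embeddings are precomputed and L2-normalized. All names here are illustrative, and the text encoder is replaced by a deterministic stand-in so the sketch runs without downloading the ViT-L/14 checkpoint:

```python
import numpy as np
from functools import lru_cache

def normalize(v: np.ndarray) -> np.ndarray:
    """L2-normalize along the last axis so dot products equal cosine similarity."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Stand-in corpus: in the real service these would be 768-d CLIP ViT-L/14
# image embeddings, precomputed for every directory photo.
rng = np.random.default_rng(0)
IMAGE_EMBEDDINGS = normalize(rng.normal(size=(1000, 768)))

def encode_query(text: str) -> np.ndarray:
    # Stand-in for CLIP's text encoder (CLIPModel.get_text_features in
    # transformers); derives a deterministic pseudo-embedding from the text
    # so the example is runnable offline.
    seed = abs(hash(text)) % (2**32)
    return normalize(np.random.default_rng(seed).normal(size=768))

def top_k(query_vec: np.ndarray, k: int = 5):
    # On normalized vectors, cosine similarity is just a dot product.
    sims = IMAGE_EMBEDDINGS @ query_vec
    idx = np.argsort(-sims)[:k]  # indices of the k highest-scoring profiles
    return [(int(i), float(sims[i])) for i in idx]

@lru_cache(maxsize=1024)
def search(text: str, k: int = 5):
    # Popular queries hit this in-process cache instead of being re-encoded;
    # a production service might use Redis or similar instead.
    return tuple(top_k(encode_query(text), k))
```

Because queries and images live in the same 768-dimensional space, "Find Similar" falls out for free: run `top_k` with a profile's image embedding as the query vector.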