NLP · Information Retrieval · Full-Stack

CrediCrew

AI-powered recruitment matching that understands fit, not just keywords

The hiring pipeline is broken in a specific, measurable way. A recruiter posting a senior backend role receives hundreds of resumes. Eighty percent are irrelevant. Of the remaining twenty percent, maybe half are genuinely strong fits — but distinguishing them requires reading each one carefully against the job description, weighing trade-offs between experience depth and breadth, evaluating project relevance, and making judgment calls that do not scale.

CrediCrew automates the screening and ranking phase. Not with keyword matching — that is what applicant tracking systems have done for decades, and it is why good candidates get filtered out for not using the exact right buzzwords. Instead, CrediCrew encodes both resumes and job descriptions into the same semantic embedding space and retrieves candidates by meaning, not lexical overlap.

The Two-Stage Architecture

"The interesting technical choice: a bi-encoder gets you speed — sub-second retrieval across thousands of candidates. A cross-encoder gets you accuracy — nuanced understanding of fit. Use both."

Stage 1 — Bi-Encoder Retrieval

Every resume and every job description (JD) is independently encoded into a 384-dimensional vector using MiniLM-L6-v2 via sentence-transformers. These embeddings live in a FAISS index. When a new JD arrives, the system retrieves the top-k candidates by cosine similarity (k = 50 in the pipeline below) in milliseconds, even across a corpus of tens of thousands. This stage prioritizes recall: cast a wide net so that no plausible candidate is dropped early.
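A minimal sketch of the retrieval step, assuming the embeddings have already been computed. Toy 3-dimensional vectors stand in for the 384-dimensional MiniLM embeddings, and an exhaustive scan stands in for the FAISS index. The names `normalize`, `top_k`, and the candidate IDs are illustrative, not CrediCrew's actual code. On unit-length vectors the inner product equals cosine similarity, which is why normalizing before indexing is a common choice.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query, corpus, k):
    """Return the k (candidate_id, cosine) pairs most similar to the query.

    `corpus` maps candidate IDs to embeddings. This is an exhaustive scan;
    a vector index such as FAISS serves the same purpose at larger scale.
    """
    q = normalize(query)
    scored = (
        (cand_id, sum(a * b for a, b in zip(q, normalize(vec))))
        for cand_id, vec in corpus.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy embeddings (hypothetical values, 3-d instead of 384-d).
resumes = {
    "cand_a": [0.9, 0.1, 0.0],
    "cand_b": [0.0, 1.0, 0.0],
    "cand_c": [0.7, 0.7, 0.1],
}
jd_vector = [1.0, 0.2, 0.0]

shortlist = top_k(jd_vector, resumes, k=2)
print([cand_id for cand_id, _ in shortlist])
```

The shortlist is deliberately generous: anything cut here never reaches the re-ranker, so k trades recall against the cost of the second stage.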

Stage 2 — Cross-Encoder Re-Ranking

The top candidates from Stage 1 are then re-ranked by a cross-encoder that processes the resume and JD as a single concatenated input. This allows full attention between every token in both documents — the model can reason about whether three years of Django experience compensates for a missing Kubernetes requirement, something a bi-encoder's independent encoding cannot capture. This stage prioritizes precision: the final top-10 should be genuinely the best fits.
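The re-ranking step can be sketched as below, assuming the scorer is any callable that takes a batch of (JD, resume) text pairs and returns one score per pair. The function `rerank` and the word-overlap `toy_scorer` are hypothetical; the toy scorer is only a placeholder that keeps the example runnable without a model, and it does not reproduce the joint attention described above.

```python
def rerank(jd, resumes, score_pairs, top_n=10):
    """Re-rank shortlisted resumes with a joint (JD, resume) scorer.

    `resumes` maps candidate IDs to resume text. `score_pairs` takes a list
    of (jd, resume) tuples and returns one float per pair.
    """
    ids = list(resumes.keys())
    pairs = [(jd, resumes[cand_id]) for cand_id in ids]
    scores = score_pairs(pairs)
    ranked = sorted(zip(ids, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


def toy_scorer(pairs):
    """PLACEHOLDER scorer: fraction of JD words that appear in the resume.

    A real cross-encoder reads both texts jointly; this stand-in exists
    only so the sketch runs without downloading a model.
    """
    scores = []
    for jd, resume in pairs:
        jd_words = set(jd.lower().split())
        resume_words = set(resume.lower().split())
        scores.append(len(jd_words & resume_words) / len(jd_words) if jd_words else 0.0)
    return scores


# Hypothetical Stage 1 shortlist (candidate ID -> resume text).
shortlist = {
    "cand_a": "python django postgres",
    "cand_b": "java spring",
    "cand_c": "python kubernetes django docker",
}
ranked = rerank("python django kubernetes", shortlist, toy_scorer, top_n=2)
print(ranked)
```

With sentence-transformers, the `predict` method of a `CrossEncoder` model accepts a list of text pairs and could be passed as `score_pairs`. Which cross-encoder checkpoint CrediCrew uses is not stated here.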

Pipeline Architecture

  Resume Upload          Job Description
       |                       |
       v                       v
  +-----------+         +-----------+
  | MiniLM    |         | MiniLM    |
  | Encoder   |         | Encoder   |
  +-----------+         +-----------+
       |                       |
       v                       v
  FAISS Index  ---- cosine ---  Query Vector
       |
       v  (top-50 candidates)
  +-------------------+
  | Cross-Encoder     |
  | Re-Ranking        |
  | (resume + JD      |
  |  jointly attended)|
  +-------------------+
       |
       v  (top-10 ranked)
  +-------------------+
  | Match Explanation |
  | & Score Card      |
  +-------------------+
       |
       v
  Recruiter Dashboard

Why This Matters

The difference between keyword matching and semantic matching is not incremental. A candidate whose resume says "built distributed microservices handling 10k RPS" shares no terms with a JD asking for "scalable backend architecture", so lexical search gives the pair no credit at all. Under semantic search, the connection is obvious. CrediCrew surfaces candidates that keyword-based applicant tracking systems systematically miss.
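To make the lexical side of that comparison concrete, here is a small sketch that measures token overlap between the two phrases quoted above. The `tokens` and `jaccard` helpers are illustrative stand-ins for a keyword matcher, not a description of how any particular ATS works.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens; a simple stand-in for a keyword tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


resume_line = "built distributed microservices handling 10k RPS"
jd_line = "scalable backend architecture"

overlap = jaccard(resume_line, jd_line)
print(overlap)  # 0.0: the two phrases share no tokens
```

Any scoring scheme built purely on shared terms starts from that zero, which is the gap an embedding-based comparison is meant to close.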

Every match comes with an explanation — not just a score, but a breakdown of which skills aligned, where the gaps are, and how confidently the system rates the fit. Recruiters do not have to trust a black box.
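One way such an explanation could be represented is sketched below. This is an assumption-heavy illustration: it presumes skills have already been extracted from both documents as plain strings, and the `ScoreCard` class, its fields, and `build_score_card` are hypothetical names. How CrediCrew actually derives its breakdown and confidence is not described here, so the score is simply passed through.

```python
from dataclasses import dataclass, field


@dataclass
class ScoreCard:
    """Hypothetical per-candidate explanation shown next to the ranking."""
    candidate_id: str
    score: float  # fit score, e.g. from the re-ranking stage
    matched_skills: list = field(default_factory=list)
    missing_skills: list = field(default_factory=list)

    def summary(self):
        """One-line, human-readable breakdown of the match."""
        matched = ", ".join(self.matched_skills) or "none"
        missing = ", ".join(self.missing_skills) or "none"
        return (
            f"{self.candidate_id}: score {self.score:.2f} "
            f"| matched: {matched} | gaps: {missing}"
        )


def build_score_card(candidate_id, score, required_skills, candidate_skills):
    """Split the JD's required skills into those the candidate has and lacks."""
    required = set(required_skills)
    have = set(candidate_skills)
    return ScoreCard(
        candidate_id=candidate_id,
        score=score,
        matched_skills=sorted(required & have),
        missing_skills=sorted(required - have),
    )


card = build_score_card(
    "cand_a",
    0.82,  # illustrative value only
    required_skills={"python", "django", "kubernetes"},
    candidate_skills={"python", "django", "postgres"},
)
print(card.summary())
```

Exact string matching on skills is the simplest possible choice here; it is used only to show the shape of the output, not as a recommendation.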

<1.5s CV Processing Time

<300ms Cached Matching

Top-10 Cross-Encoder Precision

Stack

Python · FastAPI · React · sentence-transformers · MiniLM-L6-v2 · FAISS · Cross-Encoder · PostgreSQL