HuggingFace has launched a new job search assistant designed to help job seekers find relevant roles by analyzing resumes and generating tailored recommendations. The tool uses a combination of Qwen3-8B and DeepSeek V4 Pro to create a shortlist of job postings with defensible reasoning for each entry. Users can see why a particular job is ranked higher than another, providing clarity on the model's decision-making process. The assistant is built using a closed-loop dataset that includes 2,500 resumes and 10,000 job postings scraped from LinkedIn through JobSpy. This data is used to train the model to evaluate job-fit scores across five dimensions, including skills, experience, education, and industry alignment.

The system employs a two-stage training approach, using LoRA SFT runs on a single A100 GPU via Modal. The teacher model, DeepSeek V4 Pro, generates structured labels for the student model, Qwen3-8B, which is optimized for inference with quantization to Q4_K_M. The final model is available in safetensors format and a Q4_K_M base with LoRA-GGUF sidecars for use with llama.cpp. The inference process runs on HuggingFace ZeroGPU Spaces, using llama-cpp-python with a pre-built CUDA wheel to minimize latency and cold start costs.

The project includes an HuggingFace agent-traces dataset that documents the entire development process, including raw JSONL events and trace viewer access. Users can upload their resumes to the live demo at build-small-hackathon/job-search-assistant to experience the tool firsthand. The assistant is designed to reduce the time spent sifting through job postings by providing a curated shortlist with detailed reasoning for each recommendation.

Source: huggingface