JobFitAI: Building a Comprehensive Resume Analysis Project with DeepSeek, DeepInfra, and Gradio


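In today's competitive job market, making your resume stand out is essential. JobFitAI is a tool designed to help both job seekers and recruiters by analyzing resumes and providing actionable feedback. Traditional keyword-based screening can miss important nuances in a candidate's resume. To address this, JobFitAI uses an AI-driven system to analyze resumes, extract key skills, and match them against job descriptions.

Learning Objectives

- Install all required libraries and configure your environment with a DeepInfra API key.
- Learn how to build an AI resume analyzer that handles both PDF and audio files.
- Use DeepSeek-R1 through DeepInfra to extract relevant information from resumes.
- Build an interactive web application with Gradio.
- Apply practical enhancements and troubleshoot common issues to get more value from your resume analyzer.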

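What Is DeepSeek-R1?

DeepSeek-R1 is an open-source large language model (LLM) for natural language processing (NLP) tasks. It is built on the Transformer architecture and trained to understand and generate human-like text, so it can handle tasks such as text summarization, question answering, and language translation. Because it is open source, developers can integrate it into their own applications, fine-tune it for specific needs, and run it on their own hardware without depending on a proprietary system. It is particularly well suited to research, automation, and AI-driven applications.

Understanding Gradio

Gradio is a Python library that helps developers build interactive web interfaces for machine learning models and other applications. With only a few lines of code, you can create a shareable app with input components (such as text boxes, sliders, and image uploads) and output displays (such as text, images, or audio). It is widely used for AI model demos, rapid prototyping, and building friendly interfaces for non-technical users. Gradio also supports simple deployment: developers can share an app through a public link without needing extensive web development skills.

This guide introduces JobFitAI, an end-to-end solution that extracts text from a resume, generates a detailed analysis, and gives feedback on how well the resume matches a given job description. It relies on three technologies:

- DeepSeek-R1: an AI model that extracts key skills, experience, education, and achievements from resume text.
- DeepInfra: provides an OpenAI-compatible API that lets us interact with AI models such as DeepSeek-R1.
- Gradio: a framework for quickly building interactive web interfaces for machine learning applications.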

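Project Architecture

The JobFitAI project uses a modular architecture in which each component plays a specific role in processing a resume. Here is an overview: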

JobFitAI/ 
│── src/
│   ├── __pycache__/  (compiled Python files)
│   ├── analyzer.py
│   ├── audio_transcriber.py
│   ├── feedback_generator.py
│   ├── pdf_extractor.py
│   ├── resume_pipeline.py
│── .env  (environment variables)
│── .gitignore
│── app.py  (Gradio interface)
│── LICENSE
│── README.md
│── requirements.txt  (dependencies)

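Setting Up the Environment

Before diving into the code, you need to set up your development environment.

Create a Virtual Environment and Install Dependencies

First, create a virtual environment in the project folder to manage dependencies. Open a terminal and run the commands for your platform: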

# macOS/Linux
python3 -m venv jobfitai
source jobfitai/bin/activate

# Windows (cmd)
python -m venv jobfitai
jobfitai\Scripts\activate

Next, create a file named requirements.txt and add the following libraries:

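requests
# OpenAI Whisper is published on PyPI as "openai-whisper";
# the package named "whisper" is an unrelated project.
openai-whisper
PyPDF2
python-dotenv
openai
torch
torchvision
torchaudio
gradio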

Run the following command to install the dependencies:

pip install -r requirements.txt

Set Up Environment Variables

The project needs an API token to communicate with the DeepInfra API. Create a .env file in the project root and add your token:

DEEPINFRA_TOKEN="your_deepinfra_api_token_here"

Be sure to replace your_deepinfra_api_token_here with the actual token provided by DeepInfra.

For instructions on obtaining an API key, refer to the DeepInfra documentation.
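A missing or placeholder token otherwise only surfaces later, as a failed API call. The following is a minimal sketch of a fail-fast check, assuming the environment has already been populated (for example by python-dotenv's load_dotenv()). The helper name require_token is hypothetical and is not part of the original project.

```python
import os


def require_token(env=os.environ, name="DEEPINFRA_TOKEN"):
    """Return the API token from `env`, or raise a clear error if it is unset."""
    token = (env.get(name) or "").strip()
    # Treat the placeholder from the example .env file as "not configured".
    if not token or token == "your_deepinfra_api_token_here":
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file before starting the app."
        )
    return token
```

Calling require_token() once at startup, before the analyzer is created, would turn a confusing authentication failure into an immediate, readable error.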

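Project Overview

The project consists of several Python modules. The following sections explain the purpose of each file and where it fits in the project.

src/audio_transcriber.py

Resumes are not always in text format. When the input is an audio resume, the AudioTranscriber class takes over. This file uses OpenAI's Whisper model to transcribe audio files into text, and the analyzer then uses the transcript to extract resume details.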

import whisper


class AudioTranscriber:
    """Transcribe audio files using OpenAI Whisper."""

    def __init__(self, model_size: str = "base"):
        """
        Initializes the Whisper model for transcription.

        Args:
            model_size (str): The size of the Whisper model to load. Defaults to "base".
        """
        self.model_size = model_size
        self.model = whisper.load_model(self.model_size)

    def transcribe(self, audio_path: str) -> str:
        """
        Transcribes the given audio file and returns the text.

        Args:
            audio_path (str): The path to the audio file to be transcribed.

        Returns:
            str: The transcribed text, or an empty string if transcription fails.
        """
        try:
            result = self.model.transcribe(audio_path)
            return result["text"]
        except Exception as e:
            # Errors are logged and swallowed so the caller receives "" instead.
            print(f"Error transcribing audio: {e}")
            return ""
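Because AudioTranscriber loads the Whisper model in its constructor, and ResumePipeline creates one at startup, the model is loaded even when only PDFs are processed. One possible variation, sketched here under that assumption, defers loading until the first transcription. The class name LazyAudioTranscriber and the loader parameter are hypothetical additions for illustration, not part of the original project.

```python
from typing import Any, Callable, Optional


class LazyAudioTranscriber:
    """Hypothetical variant of AudioTranscriber that loads its model on first use."""

    def __init__(self, model_size: str = "base",
                 loader: Optional[Callable[[str], Any]] = None):
        self.model_size = model_size
        self._loader = loader   # injectable for tests; defaults to whisper.load_model
        self._model = None      # nothing is loaded yet

    def _get_model(self):
        if self._model is None:
            loader = self._loader
            if loader is None:
                import whisper  # imported lazily so PDF-only runs never need it
                loader = whisper.load_model
            self._model = loader(self.model_size)
        return self._model

    def transcribe(self, audio_path: str) -> str:
        try:
            return self._get_model().transcribe(audio_path)["text"]
        except Exception as e:
            print(f"Error transcribing audio: {e}")
            return ""
```

The trade-off is that the first audio upload pays the model-loading cost instead of application startup.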

src/pdf_extractor.py

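Most resumes are in PDF format. The PDFExtractor class uses the PyPDF2 library to extract text from PDF files. It loops over every page of the PDF, extracts the text, and concatenates it into a single string for further analysis.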

import PyPDF2


class PDFExtractor:
    """Extract text from PDF files using PyPDF2."""

    def __init__(self):
        """Initialize the PDFExtractor."""
        pass

    def extract_text(self, pdf_path: str) -> str:
        """
        Extract text content from a given PDF file.

        Args:
            pdf_path (str): Path to the PDF file.

        Returns:
            str: Extracted text from the PDF. If the file is missing or cannot
                be read, the error is printed and the text collected so far
                (possibly an empty string) is returned.
        """
        text = ""
        try:
            with open(pdf_path, "rb") as file:
                reader = PyPDF2.PdfReader(file)
                for page in reader.pages:
                    page_text = page.extract_text()
                    # Pages without extractable text (e.g. scanned images) are skipped
                    if page_text:
                        text += page_text + "\n"
        except FileNotFoundError:
            print(f"Error: The file '{pdf_path}' was not found.")
        except Exception as e:
            print(f"An error occurred while extracting text: {e}")
        return text
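Text extracted from PDFs often contains irregular spacing, stray line breaks, and words split by hyphens at line ends. The following is a minimal sketch of a cleanup step that could run on the extracted string before it is sent to the model. The function normalize_text is a hypothetical helper, and the de-hyphenation rule is a heuristic: it would also join a genuinely hyphenated compound that happens to break across lines.

```python
import re


def normalize_text(raw: str) -> str:
    """Collapse whitespace noise that PDF extraction commonly leaves behind."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "experi-\nence" -> "experience".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs, then trim each line.
    text = re.sub(r"[ \t]+", " ", text)
    lines = [line.strip() for line in text.split("\n")]
    # Squeeze three or more consecutive newlines into a single blank line.
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()
```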

src/resume_pipeline.py

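The ResumePipeline module is the coordinator for resume processing. It combines the PDF extractor and the audio transcriber: based on the file type supplied by the caller, it routes the resume to the right processor and returns the extracted text. This modular design makes it straightforward to add support for other resume formats later.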

from src.pdf_extractor import PDFExtractor
from src.audio_transcriber import AudioTranscriber


class ResumePipeline:
    """
    Process resume files (PDF or audio) and return extracted text.
    """

    def __init__(self):
        """Initialize the ResumePipeline with PDFExtractor and AudioTranscriber."""
        self.pdf_extractor = PDFExtractor()
        self.audio_transcriber = AudioTranscriber()

    def process_resume(self, file_path: str, file_type: str) -> str:
        """
        Process a resume file and extract text based on its type.

        Args:
            file_path (str): Path to the resume file.
            file_type (str): Type of the file ('pdf' or 'audio').

        Returns:
            str: Extracted text from the resume. Errors (unsupported type,
                missing file, or anything unexpected) are printed and an
                empty string is returned instead of raising.
        """
        try:
            file_type_lower = file_type.lower()
            if file_type_lower == "pdf":
                return self.pdf_extractor.extract_text(file_path)
            elif file_type_lower in ["audio", "wav", "mp3"]:
                return self.audio_transcriber.transcribe(file_path)
            else:
                raise ValueError("Unsupported file type. Use 'pdf' or 'audio'.")
        except FileNotFoundError:
            print(f"Error: The file '{file_path}' was not found.")
            return ""
        except ValueError as ve:
            print(f"Error: {ve}")
            return ""
        except Exception as e:
            print(f"An unexpected error occurred: {e}")
            return ""
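To illustrate how the pipeline could be extended to other formats, here is a minimal sketch of an extension-to-handler registry. It is an alternative to the if/elif chain above, not the project's actual design. ExtensionRegistry and read_plain_text are hypothetical names introduced only for this example.

```python
import os
from typing import Callable, Dict, Iterable


class ExtensionRegistry:
    """Maps file extensions to text-extraction callables (hypothetical helper)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str], str]] = {}

    def register(self, extensions: Iterable[str], handler: Callable[[str], str]) -> None:
        """Associate each extension (e.g. ".pdf") with a handler function."""
        for ext in extensions:
            self._handlers[ext.lower()] = handler

    def extract(self, file_path: str) -> str:
        """Dispatch to the handler registered for the file's extension."""
        ext = os.path.splitext(file_path)[1].lower()
        handler = self._handlers.get(ext)
        if handler is None:
            supported = ", ".join(sorted(self._handlers))
            raise ValueError(f"Unsupported file type '{ext}'. Supported: {supported}")
        return handler(file_path)


def read_plain_text(path: str) -> str:
    """Example handler for a new format: plain-text resumes."""
    with open(path, encoding="utf-8") as f:
        return f.read()
```

With this approach, the existing processors would be registered once, for example register([".pdf"], PDFExtractor().extract_text) and register([".wav", ".mp3"], AudioTranscriber().transcribe), and a new format only needs one more register call.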

src/analyzer.py

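This module is the backbone of the resume analyzer. It initializes a connection to the DeepInfra API using the DeepSeek-R1 model. The main function in the file is analyze_text, which takes resume text as input and returns an analysis summarizing the key details of the resume. The model itself is general purpose; it is the prompt in this file that tailors it to resume analysis.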

import os
from typing import Optional

from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables from .env file
load_dotenv()


class DeepInfraAnalyzer:
    """
    Calls DeepSeek-R1 model on DeepInfra using an OpenAI-compatible interface.
    This class processes resume text and extracts structured information using AI.
    """

    def __init__(
        self,
        api_key: Optional[str] = None,
        model_name: str = "deepseek-ai/DeepSeek-R1",
    ):
        """
        Initializes the DeepInfraAnalyzer with API key and model name.

        :param api_key: API key for authentication (defaults to DEEPINFRA_TOKEN)
        :param model_name: The name of the model to use
        """
        try:
            self.openai_client = OpenAI(
                # Read the token when the object is created, not at import time
                api_key=api_key or os.getenv("DEEPINFRA_TOKEN"),
                base_url="https://api.deepinfra.com/v1/openai",
            )
            self.model_name = model_name
        except Exception as e:
            raise RuntimeError(f"Failed to initialize OpenAI client: {e}") from e

    def analyze_text(self, text: str) -> str:
        """
        Processes the given resume text and extracts key information in JSON format.
        The response will contain structured details about key skills, experience, education, etc.

        :param text: The resume text to analyze
        :return: JSON string with structured resume analysis
        """
        prompt = (
            "You are an AI job resume matcher assistant. "
            "DO NOT show your chain of thought. "
            "Respond ONLY in English. "
            "Extract the key skills, experiences, education, achievements, etc. from the following resume text. "
            "Then produce the final output as a well-structured JSON with a top-level key called \"analysis\". "
            "Inside \"analysis\", you can have subkeys like \"key_skills\", \"experiences\", \"education\", etc. "
            "Return ONLY the final JSON, with no extra commentary.\n\n"
            f"Resume Text:\n{text}\n\n"
            "Required Format (example):\n"
            "```\n"
            "{\n"
            "  \"analysis\": {\n"
            "    \"key_skills\": [...],\n"
            "    \"experiences\": [...],\n"
            "    \"education\": [...],\n"
            "    \"achievements\": [...],\n"
            "    ...\n"
            "  }\n"
            "}\n"
            "```\n"
        )
        try:
            response = self.openai_client.chat.completions.create(
                model=self.model_name,
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except Exception as e:
            raise RuntimeError(f"Error processing resume text: {e}") from e
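analyze_text returns the model's reply as a raw string, and a model does not always follow a "JSON only" instruction. The following is a minimal sketch of a tolerant parser, assuming the reply may wrap the JSON in Markdown code fences, surround it with prose, or prepend a reasoning block delimited by <think> tags. The tag format is an assumption about the model's output, and parse_model_json is a hypothetical helper, not part of the original project.

```python
import json
import re


def parse_model_json(raw: str) -> dict:
    """Best-effort conversion of a model reply into a dict; {} if no valid JSON."""
    # Drop a reasoning block if one is present (assumed <think>...</think> format).
    text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    # Keep only the outermost {...} span, which also discards code fences and prose.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return {}
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}
```

Returning an empty dict on failure keeps the caller simple; an application that needs to distinguish "no JSON" from "empty analysis" might prefer to raise instead.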

src/feedback_generator.py

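Once the details have been extracted from the resume, the next step is to compare the resume with a specific job description. The FeedbackGenerator module takes the resume text and a job description, and returns a match score together with suggestions for improvement. This is especially useful for job seekers, who can use it to refine their resumes to align more closely with a job description and improve their chances of passing applicant tracking systems (ATS).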

from src.analyzer import DeepInfraAnalyzer


class FeedbackGenerator:
    """
    Generates feedback for resume improvement based on a job description
    using the DeepInfraAnalyzer.
    """

    def __init__(self, analyzer: DeepInfraAnalyzer):
        """
        Initializes the FeedbackGenerator with an instance of DeepInfraAnalyzer.

        Args:
            analyzer (DeepInfraAnalyzer): An instance of the DeepInfraAnalyzer class.
        """
        self.analyzer = analyzer

    def generate_feedback(self, resume_text: str, job_description: str) -> str:
        """
        Generates feedback on how well a resume aligns with a job description.

        Args:
            resume_text (str): The extracted text from the resume.
            job_description (str): The job posting or job description.

        Returns:
            str: A JSON-formatted response whose "job_match" object contains:
                - "match_score" (int): A score from 0-100 indicating job match quality.
                - "job_alignment" (dict): Categorization of strong and weak matches.
                - "missing_skills" (list): Skills missing from the resume.
                - "recommendations" (list): Actionable suggestions for improvement.
                Returns "{}" if the request fails.
        """
        try:
            prompt = (
                "You are an AI job resume matcher assistant. "
                "DO NOT show your chain of thought. "
                "Respond ONLY in English. "
                "Compare the following resume text with the job description. "
                "Calculate a match score (0-100) for how well the resume matches. "
                "Identify keywords from the job description that are missing in the resume. "
                "Provide bullet-point recommendations to improve the resume for better alignment.\n\n"
                f"Resume Text:\n{resume_text}\n\n"
                f"Job Description:\n{job_description}\n\n"
                "Return JSON ONLY in this format:\n"
                "{\n"
                "  \"job_match\": {\n"
                "    \"match_score\": <integer>,\n"
                "    \"job_alignment\": {\n"
                "      \"strong_match\": [...],\n"
                "      \"weak_match\": [...]\n"
                "    },\n"
                "    \"missing_skills\": [...],\n"
                "    \"recommendations\": [\n"
                "      \"<Actionable Suggestion 1>\",\n"
                "      \"<Actionable Suggestion 2>\",\n"
                "      ...\n"
                "    ]\n"
                "  }\n"
                "}"
            )
            # Send this prompt directly through the analyzer's client. Passing it to
            # analyzer.analyze_text() would embed it inside the resume-extraction
            # prompt and ask the model for two conflicting output formats.
            response = self.analyzer.openai_client.chat.completions.create(
                model=self.analyzer.model_name,
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except Exception as e:
            print(f"Error in generating feedback: {e}")
            return "{}"  # Return an empty JSON object string on failure
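The prompt asks for a match score between 0 and 100 and a fixed set of keys, but nothing enforces that on the reply. The following is a minimal sketch of a validation step, assuming the reply has already been parsed into a dict with the "job_match" layout requested in the prompt. normalize_feedback is a hypothetical helper introduced for illustration.

```python
def _as_list(value):
    """Return value if it is a list, otherwise an empty list."""
    return value if isinstance(value, list) else []


def normalize_feedback(data: dict) -> dict:
    """Flatten the job_match section, fill in defaults, and clamp the score."""
    match = data.get("job_match")
    if not isinstance(match, dict):
        match = {}
    try:
        score = int(match.get("match_score", 0))
    except (TypeError, ValueError):
        score = 0  # missing or non-numeric score
    alignment = match.get("job_alignment")
    if not isinstance(alignment, dict):
        alignment = {}
    return {
        "match_score": max(0, min(100, score)),  # keep within the 0-100 range
        "strong_match": _as_list(alignment.get("strong_match")),
        "weak_match": _as_list(alignment.get("weak_match")),
        "missing_skills": _as_list(match.get("missing_skills")),
        "recommendations": _as_list(match.get("recommendations")),
    }
```

Because every key is always present in the result, downstream code such as a UI or a report can read the fields without further checks.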

app.py

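The app.py file is the main entry point of the JobFitAI project. It brings together all of the modules above and builds an interactive web interface with Gradio. Users upload a resume/CV file (PDF or audio) and enter a job description. The application then processes the resume, runs the analysis, generates feedback, and returns a structured JSON response containing the analysis and recommendations.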

from dotenv import load_dotenv

load_dotenv()

import gradio as gr

from src.resume_pipeline import ResumePipeline
from src.analyzer import DeepInfraAnalyzer
from src.feedback_generator import FeedbackGenerator

# Pipeline for PDF/audio
resume_pipeline = ResumePipeline()

# Initialize the DeepInfra analyzer
analyzer = DeepInfraAnalyzer()

# Feedback generator
feedback_generator = FeedbackGenerator(analyzer)


def analyze_resume(resume_path, job_desc):
    """
    Gradio callback function to analyze a resume against a job description.

    Args:
        resume_path (str): Path to the uploaded resume file (PDF or audio).
        job_desc (str): The job description text for comparison.

    Returns:
        dict: The analysis and recommendations, or an "error" message.
    """
    try:
        if not resume_path or not job_desc:
            return {"error": "Please upload a resume and enter a job description."}

        # Determine file type from extension (anything that is not a PDF is treated as audio)
        lower_name = resume_path.lower()
        file_type = "pdf" if lower_name.endswith(".pdf") else "audio"

        # Extract text from the resume
        resume_text = resume_pipeline.process_resume(resume_path, file_type)

        # Analyze extracted text
        analysis_result = analyzer.analyze_text(resume_text)

        # Generate feedback and recommendations
        feedback = feedback_generator.generate_feedback(resume_text, job_desc)

        # Return structured response
        return {
            "analysis": analysis_result,
            "recommendations": feedback,
        }
    except ValueError as e:
        return {"error": f"Unsupported file type or processing error: {str(e)}"}
    except Exception as e:
        return {"error": f"An unexpected error occurred: {str(e)}"}


# Define Gradio interface
demo = gr.Interface(
    fn=analyze_resume,
    inputs=[
        gr.File(label="Resume (PDF/Audio)", type="filepath"),
        gr.Textbox(lines=5, label="Job Description"),
    ],
    outputs="json",
    title="JobFitAI: AI Resume Analyzer",
    description="""
    Upload your resume/cv (PDF or audio) and paste the job description to get a match score,
    missing keywords, and actionable recommendations.""",
)

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=8000)
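The extractors return an empty string when they fail, so the callback above would still send an empty resume to the model. The following is a minimal sketch of the same flow with an early exit for that case. The three steps are passed in as plain functions so the logic can be exercised without Gradio, Whisper, or network access. make_callback and its parameter names are hypothetical, not part of the original project.

```python
from typing import Callable


def make_callback(extract: Callable[[str], str],
                  analyze: Callable[[str], str],
                  give_feedback: Callable[[str, str], str]):
    """Build a Gradio-style callback from three injected functions."""

    def callback(resume_path, job_desc):
        if not resume_path or not job_desc:
            return {"error": "Please upload a resume and enter a job description."}
        resume_text = extract(resume_path)
        # Extraction failures come back as "", so stop before calling the model.
        if not resume_text.strip():
            return {"error": "No text could be extracted from the uploaded file."}
        return {
            "analysis": analyze(resume_text),
            "recommendations": give_feedback(resume_text, job_desc),
        }

    return callback
```

In the real application, the injected functions would wrap resume_pipeline.process_resume, analyzer.analyze_text, and feedback_generator.generate_feedback.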

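Running the Application with Gradio

With the environment set up and all code components reviewed, you can now run the application.

Start the application: in a terminal, navigate to the project directory and run the following command: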

python app.py

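This command starts the Gradio interface locally. Open the URL printed in the terminal in your browser to see the interactive resume analyzer. With the settings in app.py, the server listens on port 8000, so on the same machine the address is http://localhost:8000.

Test JobFitAI:
- Upload a resume: select a PDF file or an audio file containing a recorded resume.
- Enter a job description: paste or type the job description.
- Review the output: the system displays a JSON response that includes a detailed analysis of the resume, a match score, missing keywords, and feedback with suggestions for improvement.

All of the code files are available in the project's GitHub repository.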

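Use Cases and Practical Applications

The JobFitAI resume analyzer can be applied in a range of real-world scenarios:

Improving Resume Quality

- Self-assessment: job seekers can use the tool to evaluate their resumes before applying. By reviewing the match score and the areas that need work, they can tailor a resume more closely to a specific position.
- Feedback loop: the structured JSON feedback the tool generates can be integrated into career counseling platforms to provide personalized resume improvement suggestions.

Education and Training Applications

- Career workshops: educational institutions and career coaching platforms can include JobFitAI in their courses as a practical demonstration of how AI can improve career readiness.
- Coding and AI projects: aspiring data scientists and developers can learn how to combine multiple AI services (such as transcription, PDF extraction, and natural language processing) into one cohesive project.

Troubleshooting and Extensions

This section covers common problems and their fixes, followed by ideas for extending the project.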

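Common Issues and Solutions

- API token problems: if the DeepInfra API token is missing or incorrect, the analyzer module will fail. Always verify that your .env file contains the correct token and that the token is active.
- Unsupported file types: the application currently supports only PDF and audio formats. app.py treats every upload that is not a .pdf as audio, so other file types (such as DOCX) fail during transcription and yield no resume text. Support for additional formats is a possible future extension.
- Transcription latency: audio transcription can take a long time, especially for larger files. If you plan to process many audio resumes, consider using a higher-spec machine or a cloud-based solution.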
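One inexpensive way to reduce repeated transcription time is to cache transcripts, so that re-uploading the same recording does not run the model again. The following is a minimal in-memory sketch keyed by a hash of the file contents. It is not part of the original project, and the names file_digest and CachedTranscriber are hypothetical. The cache is lost when the process exits.

```python
import hashlib
from typing import Callable, Dict


def file_digest(path: str) -> str:
    """SHA-256 of the file contents, read in chunks to keep memory use flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


class CachedTranscriber:
    """Wraps any transcribe(path) -> str callable with an in-memory cache."""

    def __init__(self, transcribe: Callable[[str], str]):
        self._transcribe = transcribe
        self._cache: Dict[str, str] = {}

    def transcribe(self, path: str) -> str:
        key = file_digest(path)
        if key not in self._cache:
            text = self._transcribe(path)
            if not text:  # do not cache failures (empty transcripts)
                return text
            self._cache[key] = text
        return self._cache[key]
```

Hashing the contents rather than the path matters here because uploaded files are typically saved under temporary paths that change between uploads.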

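Ideas for Further Development

- Support more file formats: extend the resume pipeline to handle other file types such as DOCX or plain text.
- Enhance the feedback mechanism: integrate more sophisticated natural language processing models to provide richer, more nuanced feedback beyond the basic match score.
- User authentication: add user accounts so job seekers can save their analysis results and track improvements over time.
- Dashboard integration: build a dashboard where recruiters can manage and compare resume analyses for multiple candidates.
- Performance optimization: speed up audio transcription and PDF extraction so large datasets can be analyzed faster.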

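Conclusion

The JobFitAI resume analyzer is a versatile tool that uses modern AI models to bridge the gap between resumes and job descriptions. By combining DeepSeek-R1 (through DeepInfra) with transcription and PDF extraction, you now have a complete solution that automatically analyzes resumes and generates feedback to improve job matching.

This guide walked through the whole process: setting up the environment, understanding the role of each module, and running the interactive Gradio interface. Whether you are a developer looking to expand your portfolio, an HR professional looking to streamline candidate screening, or a job seeker aiming to improve your resume, the JobFitAI project offers practical insights and a solid starting point for further exploration.

Embrace the power of AI, experiment with new features, and keep refining the project to suit your needs. The future of job applications is here, and it is smarter than ever!

Key takeaways:

- JobFitAI uses DeepSeek-R1 and DeepInfra to extract skills, experience, and achievements from resumes for better job matching.
- The system supports PDF and audio resumes, using PyPDF2 for text extraction and Whisper for audio transcription.
- Gradio provides a user-friendly web interface for interactive resume analysis and feedback.
- The project uses a modular architecture and an environment configured with an API key, which keeps integration smooth and the code extensible.
- Developers can fine-tune DeepSeek-R1, troubleshoot issues, and extend the functionality to build more robust AI resume screening.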
