This project evaluates user-bot conversations against metrics such as relevance, clarity, and coherence. It consists of a backend built with Node.js (Express) and a frontend built with React (Vite). The backend uses OpenAI's gpt-3.5-turbo model to analyze each submitted conversation against the metric the user selects.
- Metric-based evaluation: Allows users to submit conversations and evaluate them using various metrics.
- Dynamic prompt engineering: The backend dynamically generates evaluation prompts based on the selected metric.
- LLM-powered: Uses OpenAI’s gpt-3.5-turbo model to perform evaluations on conversations.
- Frontend-Backend integration: Simple React-based frontend for interacting with the backend service and viewing evaluation results.
- Question Clarity: Evaluates how clear and specific the user's question is.
- Answer Relevance: Assesses whether the bot's response is directly related to the user's question.
- Fluency and Grammar: Checks for grammatical errors and readability in the bot's response.
- Completeness: Ensures that the bot’s response fully answers the user’s question without missing key details.
- Conciseness: Evaluates if the bot's response is concise while still providing necessary information.
- Accuracy: Assesses the factual correctness of the bot's response.
- Coherence: Measures the logical flow of the bot's responses throughout the conversation.
- Contextual Awareness: Evaluates how well the bot remembers and utilizes the context of the conversation in its responses.
- Handling of Uncertainty: Assesses how well the bot handles unclear or ambiguous inputs.
- Creativity and Insightfulness: Evaluates the originality and depth of the bot's responses.
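
As a rough illustration of how a metric could be turned into an evaluation prompt, here is a minimal sketch. The function name `generatePrompt` matches the utility mentioned in this README, but its signature, the instruction wording, and the 1 to 5 scale are assumptions made for this example, not the project's actual implementation. Only three of the metrics are included to keep it short.

```javascript
// Hypothetical metric table: the metric names come from the list above,
// the instruction text and the scoring scale are assumptions.
const METRIC_INSTRUCTIONS = new Map([
  ["Question Clarity", "Rate how clear and specific the user's question is."],
  ["Answer Relevance", "Rate whether the bot's response directly addresses the user's question."],
  ["Coherence", "Rate the logical flow of the bot's responses across the conversation."],
]);

// Builds an evaluation prompt for one metric and one conversation.
// Throws if the metric is not in the table.
function generatePrompt(metric, conversation) {
  const instruction = METRIC_INSTRUCTIONS.get(metric);
  if (!instruction) {
    throw new Error(`Unsupported metric: ${metric}`);
  }
  return [
    `You are evaluating a user-bot conversation on the metric "${metric}".`,
    instruction,
    "Give a score from 1 to 5 and a one-sentence justification.",
    "",
    "Conversation:",
    conversation.trim(),
  ].join("\n");
}

console.log(generatePrompt("Answer Relevance", "User: What is Vite?\nBot: A frontend build tool."));
```

Keeping the per-metric text in a lookup table means adding a metric is a one-line change, and unknown metrics fail loudly instead of producing a vague prompt.
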
- Backend: Node.js with Express
- Frontend: React (Vite)
- LLM Evaluation: OpenAI gpt-3.5-turbo API
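
For orientation, the sketch below shows one way a prompt could be sent to the gpt-3.5-turbo model through OpenAI's Chat Completions HTTP endpoint. The helper names `buildChatRequest` and `evaluateWithOpenAI` and the `temperature` value are assumptions for this example; the project may well use the official `openai` npm package instead. It relies on the built-in `fetch` available in Node 18 and later.

```javascript
// Builds the URL and fetch options for a Chat Completions request.
function buildChatRequest(prompt, apiKey) {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: "gpt-3.5-turbo",
        messages: [{ role: "user", content: prompt }],
        temperature: 0, // assumption: low randomness for more repeatable scores
      }),
    },
  };
}

// Sends the prompt and returns the model's text reply.
// fetchImpl is injectable so the function can be exercised without network access.
async function evaluateWithOpenAI(prompt, apiKey, fetchImpl = fetch) {
  const { url, options } = buildChatRequest(prompt, apiKey);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`OpenAI request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```
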
- Clone the repository and install the backend dependencies:

      git clone [email protected]:nareshNishad/conversation-eval-suite.git
      cd conversation-eval-suite/backend
      npm install
- Update the `.env` file: in the `backend` directory, rename `.env.example` to `.env` and add your OpenAI API key. Example `.env` file content:

      OPENAI_API_KEY=your-openai-api-key
- Run the backend server:

      npm run start

  The backend server should now be running on `http://localhost:3000`.
- Navigate to the frontend directory and install its dependencies:

      cd ../frontend
      npm install
- Run the frontend development server:

      npm run dev

  The frontend server should now be running on `http://localhost:5173`.
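
For the `.env` step in the backend setup, a small startup check can catch a missing or placeholder key before any request reaches OpenAI. This is a sketch only: `readApiKey` is a hypothetical helper, and it assumes `OPENAI_API_KEY` has already been loaded into `process.env` (for example by a package such as dotenv, which this snippet does not require).

```javascript
// Reads the OpenAI key from an environment-like object and fails early
// with a readable message if it is missing or still the example placeholder.
function readApiKey(env = process.env) {
  const key = (env.OPENAI_API_KEY || "").trim();
  if (!key || key === "your-openai-api-key") {
    throw new Error("OPENAI_API_KEY is not set. Rename .env.example to .env and add your key.");
  }
  return key;
}
```
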
- `server.js`: Main server file, sets up API routes and communicates with OpenAI's API.
- `routes`: Contains route definitions for handling evaluation requests.
- `utils`: Utility functions, including the `generatePrompt` function for dynamically creating prompts based on evaluation metrics.
- `src/component`: Contains the React components for the user interface.
- `src/App.jsx`: Main application file.
- `vite.config.js`: Configuration for the Vite bundler.
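
To show how the backend pieces listed above might fit together, here is a framework-agnostic sketch of an evaluation handler. The names `validateEvaluationBody` and `handleEvaluation`, the `metric` and `conversation` field names, and the status codes are assumptions for illustration; the real route code may differ. The prompt builder and the model call are passed in so the handler can be tested without Express or network access.

```javascript
// Returns an error message for an invalid body, or null when it is usable.
function validateEvaluationBody(body) {
  const { metric, conversation } = body || {};
  if (typeof metric !== "string" || !metric.trim()) return "metric is required";
  if (typeof conversation !== "string" || !conversation.trim()) return "conversation is required";
  return null;
}

// Validates the body, builds the prompt, calls the model, and returns
// a status code plus a JSON payload for the caller to send back.
async function handleEvaluation(body, { generatePrompt, callModel }) {
  const error = validateEvaluationBody(body);
  if (error) return { status: 400, json: { error } };
  try {
    const prompt = generatePrompt(body.metric, body.conversation);
    const evaluation = await callModel(prompt);
    return { status: 200, json: { metric: body.metric, evaluation } };
  } catch (err) {
    // Prompt or model failures are reported to the client as a 502 here (assumption).
    return { status: 502, json: { error: err.message } };
  }
}

// In an Express route this could be wrapped roughly as follows
// (the "/evaluate" path is hypothetical, and req.body needs express.json()):
//   router.post("/evaluate", async (req, res) => {
//     const result = await handleEvaluation(req.body, { generatePrompt, callModel });
//     res.status(result.status).json(result.json);
//   });
```
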
- Open the frontend in the browser: after starting both the backend and frontend servers, navigate to `http://localhost:5173`.
- Submit a conversation for evaluation:
  - Select a metric from the dropdown list (e.g., "Answer Relevance").
  - Enter the user-bot conversation into the input field.
  - Click Submit to receive the evaluation result from the backend.
- View the evaluation result: the result is displayed below the form.
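
The submit step above could be implemented on the frontend roughly as follows. This is a sketch under assumptions: the `/evaluate` path, the `metric` and `conversation` field names, and the `error` field in failure responses are hypothetical and should be matched to the actual backend routes.

```javascript
// Assumption: the backend exposes POST /evaluate on the port used in the setup steps.
const API_URL = "http://localhost:3000/evaluate";

// Builds the fetch options for submitting one conversation and one metric.
function buildEvaluationRequest(metric, conversation) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ metric, conversation }),
  };
}

// Posts the conversation and resolves with the parsed JSON result.
async function submitEvaluation(metric, conversation, fetchImpl = fetch) {
  const response = await fetchImpl(API_URL, buildEvaluationRequest(metric, conversation));
  const data = await response.json();
  if (!response.ok) {
    throw new Error(data.error || `Request failed with status ${response.status}`);
  }
  return data;
}
```

A React component would typically call `submitEvaluation` from the form's submit handler and store the returned object in state so it can be rendered below the form.
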
- Multi-metric evaluation: Allow users to select multiple metrics for a single conversation.
- User authentication: Add user accounts and history tracking for previous evaluations.