Deliverable 1 Problem Statement Smart Customer Support Agent for Amazon Product Queries Problem Description

Date	31.10.2023


Deliverable 2
Dataset & Preprocessing
Our chosen dataset contains 10,636 question-answer pairs covering 1,191 unique software products.
Reduction: Our dataset has two types of questions: Yes/No and open-ended. Our current model focuses only on Yes/No questions.

This leaves us with 5000 question-answer lines as the final input to our model. Each of these questions has one of three answer labels: `Yes`, `No`, or `?`. Our objective was to train the model on the lines labeled `Yes` or `No` (2819) and then predict the answer for those labeled `?`.
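The split between labeled and unlabeled lines can be sketched as follows. This is a minimal illustration, not the project's actual code: the example rows, the `(question, label)` tuple layout, and the function name `split_by_label` are assumptions, since the text does not describe the dataset's file format.

```python
from collections import Counter

# Hypothetical rows: (question, answer_label). The real dataset's column
# names and file format are not given in the text.
rows = [
    ("Does this work on Windows 10?", "Yes"),
    ("Is a subscription required?", "No"),
    ("Can I install it on two PCs?", "?"),
    ("Does it include a manual?", "Yes"),
]

def split_by_label(rows):
    """Separate rows with a known Yes/No label from rows labeled '?'."""
    labeled = [(q, a) for q, a in rows if a in ("Yes", "No")]
    unlabeled = [q for q, a in rows if a == "?"]
    return labeled, unlabeled

labeled, unlabeled = split_by_label(rows)

# Class balance of the training portion, useful to check before training.
label_counts = Counter(a for _, a in labeled)
```

The `labeled` list would serve as training data, while `unlabeled` holds the questions whose answer the trained model is asked to predict.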

For preprocessing, we wrote a tokenization function that uses the tokenizer matching each pre-trained model. It converts raw text into the numeric format the model expects, so that the model can learn patterns from the text and classify new inputs.
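The general shape of such a tokenization function is sketched below. The text does not name the pre-trained models or their tokenizers, so this sketch swaps in a toy whitespace tokenizer with a hand-built vocabulary; `build_vocab`, `tokenize`, `max_length`, and the `[PAD]`/`[UNK]` ids are all illustrative assumptions. A real pre-trained tokenizer would typically split text into subwords and use the model's own vocabulary.

```python
def build_vocab(texts):
    """Map each lowercase whitespace token to an integer id (0 = pad, 1 = unknown)."""
    vocab = {"[PAD]": 0, "[UNK]": 1}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def tokenize(text, vocab, max_length=8):
    """Convert text to fixed-length token ids plus an attention mask."""
    ids = [vocab.get(tok, vocab["[UNK]"]) for tok in text.lower().split()]
    ids = ids[:max_length]                 # truncate long inputs
    mask = [1] * len(ids)                  # 1 marks a real token
    padding = max_length - len(ids)
    ids += [vocab["[PAD]"]] * padding      # pad short inputs
    mask += [0] * padding                  # 0 marks padding
    return {"input_ids": ids, "attention_mask": mask}

vocab = build_vocab(["does this work on windows", "is a subscription required"])
encoded = tokenize("Does this work on Linux", vocab)
```

Here "linux" is not in the toy vocabulary, so it maps to the unknown id, and the remaining positions are padded so that every input has the same length.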

Deliverable 3
Algorithm 1 using DQN
The objective of deep reinforcement learning (DRL) here is to improve the decision-making process for generating responses.
Deep Q-Network (DQN): We framed the problem with a discrete action space, so that DQN can be used to learn the Q-values of the different actions.
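Two building blocks of DQN over a discrete action space are sketched below: epsilon-greedy action selection from a vector of Q-values, and the temporal-difference target that the Q-network is trained toward. The action names, the example Q-values, and the `gamma` and `epsilon` settings are assumptions for illustration; the text does not list the agent's actual actions or hyperparameters.

```python
import random

# Hypothetical discrete action set; the source does not list the agent's actions.
ACTIONS = ["answer_yes", "answer_no", "ask_clarification"]

def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else pick argmax Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Target for the Q-network: r + gamma * max_a' Q(s', a'), or r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

q_values = [0.2, 0.7, 0.1]   # made-up network outputs for one state
greedy = select_action(q_values, epsilon=0.0)   # epsilon 0 means always greedy
target = td_target(reward=1.0, next_q_values=[0.5, 0.3, 0.0], gamma=0.9)
```

In a full DQN, `q_values` would come from a neural network evaluated on the encoded query and context, and the network would be trained to reduce the squared difference between its prediction for the chosen action and `target`.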
Training Data: We used training data that consists of user queries, corresponding context, agent actions, and the resulting rewards.
Training Process: We trained the DRL agent using the collected data and the chosen DRL algorithm. The agent learns to select actions that maximize the cumulative expected reward over time. The training process involves fine-tuning the agent's policy to improve its decision-making abilities.
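A training loop of this kind can be sketched as follows, under strong simplifying assumptions. To stay self-contained, the sketch replaces the deep network with a linear Q-function (one weight vector per action) and omits the target network that DQN normally uses; the transitions, feature vectors, and hyperparameters are invented for illustration and are not the project's data.

```python
import random

def q_values(weights, state):
    """Linear Q-function: one weight vector per action (stand-in for a deep network)."""
    return [sum(w * x for w, x in zip(w_a, state)) for w_a in weights]

def train(transitions, n_features, n_actions, gamma=0.9, lr=0.1, steps=200, seed=0):
    """Fit Q by repeatedly sampling logged transitions and applying a TD update."""
    rng = random.Random(seed)
    weights = [[0.0] * n_features for _ in range(n_actions)]
    for _ in range(steps):
        state, action, reward, next_state, done = rng.choice(transitions)
        if done:
            target = reward
        else:
            target = reward + gamma * max(q_values(weights, next_state))
        error = target - q_values(weights, state)[action]
        # Gradient step on the squared TD error for the action that was taken.
        for i, x in enumerate(state):
            weights[action][i] += lr * error * x
    return weights

# Hypothetical logged data: (state features, action, reward, next state, done).
transitions = [
    ([1.0, 0.0], 0, 1.0, [0.0, 0.0], True),    # action 0 was rewarded in state A
    ([1.0, 0.0], 1, -1.0, [0.0, 0.0], True),   # action 1 was penalized in state A
    ([0.0, 1.0], 1, 1.0, [0.0, 0.0], True),    # action 1 was rewarded in state B
    ([0.0, 1.0], 0, -1.0, [0.0, 0.0], True),   # action 0 was penalized in state B
]
weights = train(transitions, n_features=2, n_actions=2)
q_a = q_values(weights, [1.0, 0.0])
q_b = q_values(weights, [0.0, 1.0])
```

After training, the learned Q-values rank the rewarded action above the penalized one in each state, which is the behavior the agent's greedy policy relies on.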
Deployment: Once trained, we aim to deploy the DRL-based customer support agent alongside any other Transformer-based model or Siamese neural network. The DRL agent can decide which action to take based on the user's query and the current context.
Evaluation and Fine-Tuning: We will continuously evaluate the performance of the DRL-based agent by collecting user feedback and monitoring its interactions. We aim to fine-tune the agent's policy as needed to adapt to changing user queries and preferences.
