feat: added dataset recall testing API #9300
1. Much of the code in controllers/service_api/dataset/hit_testing.py and controllers/console/datasets/hit_testing.py is duplicated. Could the shared logic be extracted into a common method?
@hwzhuhao
@gubinjie Maybe you can add a base class, for example:

class HitTestingBase:
    def get_and_validate_dataset(self, dataset_id_str):
        dataset = DatasetService.get_dataset(dataset_id_str)
        if dataset is None:
            raise NotFound("Dataset not found.")
        try:
            DatasetService.check_dataset_permission(dataset, current_user)
        except services.errors.account.NoPermissionError as e:
            raise Forbidden(str(e))
        return dataset

    def parse_args(self):
        parser = reqparse.RequestParser()
        parser.add_argument("query", type=str, location="json")
        parser.add_argument("retrieval_model", type=dict, required=False, location="json")
        parser.add_argument("external_retrieval_model", type=dict, required=False, location="json")
        return parser.parse_args()

    def hit_testing_args_check(self, args):
        HitTestingService.hit_testing_args_check(args)

    def perform_hit_testing(self, dataset, args):
        try:
            response = HitTestingService.retrieve(
                dataset=dataset,
                query=args["query"],
                account=current_user,
                retrieval_model=args["retrieval_model"],
                external_retrieval_model=args["external_retrieval_model"],
                limit=10,
            )
            return {"query": response["query"], "records": marshal(response["records"], hit_testing_record_fields)}
        except services.errors.index.IndexNotInitializedError:
            raise DatasetNotInitializedError()
        except ProviderTokenNotInitError as ex:
            raise ProviderNotInitializeError(ex.description)
        except QuotaExceededError:
            raise ProviderQuotaExceededError()
        except ModelCurrentlyNotSupportError:
            raise ProviderModelCurrentlyNotSupportError()
        except LLMBadRequestError:
            raise ProviderNotInitializeError(
                "No Embedding Model or Reranking Model available. Please configure a valid provider "
                "in the Settings -> Model Provider."
            )
        except InvokeError as e:
            raise CompletionRequestError(e.description)
        except ValueError as e:
            raise ValueError(str(e))
        except Exception as e:
            logging.exception("Hit testing failed.")
            raise InternalServerError(str(e))


class HitTestingApi(Resource, HitTestingBase):
    @setup_required
    @login_required
    @account_initialization_required
    def post(self, dataset_id):
        self.dataset_id_str = str(dataset_id)
        dataset = self.get_and_validate_dataset(self.dataset_id_str)
        args = self.parse_args()
        self.hit_testing_args_check(args)
        return self.perform_hit_testing(dataset, args)
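To illustrate why the base class removes the duplication, here is a minimal, framework-free sketch of two endpoints sharing it. Everything in it apart from the HitTestingBase idea is a hypothetical stand-in: FakeDatasetService, FakeHitTestingService, ConsoleHitTestingApi and ServiceHitTestingApi are invented for the example and do not mirror the real Flask-RESTful resources, decorators, or permission checks.

```python
class NotFound(Exception):
    """Stand-in for werkzeug's NotFound (hypothetical, for illustration only)."""


class FakeDatasetService:
    """Hypothetical in-memory replacement for DatasetService."""

    _datasets = {"ds-1": {"id": "ds-1", "name": "docs"}}

    @classmethod
    def get_dataset(cls, dataset_id):
        return cls._datasets.get(dataset_id)


class FakeHitTestingService:
    """Hypothetical replacement for HitTestingService."""

    @staticmethod
    def retrieve(dataset, query, limit=10):
        records = [f"{dataset['name']}:{query}"]
        return {"query": query, "records": records[:limit]}


class HitTestingBase:
    """Shared logic that does not care which API surface calls it."""

    def get_and_validate_dataset(self, dataset_id_str):
        dataset = FakeDatasetService.get_dataset(dataset_id_str)
        if dataset is None:
            raise NotFound("Dataset not found.")
        return dataset

    def perform_hit_testing(self, dataset, args):
        return FakeHitTestingService.retrieve(dataset=dataset, query=args["query"])


class ConsoleHitTestingApi(HitTestingBase):
    """Console endpoint: only the request plumbing would live here."""

    def post(self, dataset_id, args):
        dataset = self.get_and_validate_dataset(str(dataset_id))
        return self.perform_hit_testing(dataset, args)


class ServiceHitTestingApi(HitTestingBase):
    """Service API endpoint: same shared steps, different auth in the real code."""

    def post(self, dataset_id, args):
        dataset = self.get_and_validate_dataset(str(dataset_id))
        return self.perform_hit_testing(dataset, args)


if __name__ == "__main__":
    args = {"query": "hello"}
    print(ConsoleHitTestingApi().post("ds-1", args))
    print(ServiceHitTestingApi().post("ds-1", args))
```

With this shape, each controller file would keep only its own decorators and routing, and both would behave identically for the shared steps.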
OK, give me a moment, I'll update it right away.
@hwzhuhao Done, please take a look.
@gubinjie The file name might be better as hit_testing_base.py. Additionally, please review the comments and address them. Thank you.
@hwzhuhao I found that the path in the code was wrong; it has now been fixed.
… hit testing of the knowledge base
@hwzhuhao These have been corrected
lgtm, thanks. @crazywoola
Checklist:
Important
Please review the checklist below before submitting your pull request.
dev/reformat (backend) and cd web && npx lint-staged (frontend) to appease the lint gods
Description
Adds a knowledge base recall testing API for external callers.
Fixes #8959
Close #8477
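As a rough sketch of what an external call could look like, the helper below builds (but does not send) a POST request carrying the JSON fields that the request parser in this PR reads: query and, optionally, retrieval_model. The URL path, the Bearer-token header, and the helper name build_hit_testing_request are assumptions made for illustration and are not confirmed by this PR.

```python
import json
import urllib.request


def build_hit_testing_request(base_url, dataset_id, api_key, query, retrieval_model=None):
    """Build a POST request for a recall test (hypothetical helper)."""
    # Assumed path layout; check the service API routes for the real one.
    url = f"{base_url}/datasets/{dataset_id}/hit-testing"

    # Field names follow the request parser discussed in this PR.
    payload = {"query": query}
    if retrieval_model is not None:
        payload["retrieval_model"] = retrieval_model

    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            # Assumed auth scheme for the service API.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_hit_testing_request(
        "https://example.invalid/v1", "my-dataset-id", "my-api-key", "What is recall testing?"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so the sketch runs without a server.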