Originally posted by DeekshithaDPrakash January 23, 2025
Checked other resources
I added a very descriptive title to this question.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
Commit to Help
I commit to help with one of those options 👆
Example Code
```python
import requests
from typing import List, Optional, Dict, Any, Union, Literal
from operator import itemgetter

from langchain.chat_models.base import BaseChatModel
from langchain.schema import (
    BaseMessage,
    AIMessage,
    ChatResult,
    ChatGeneration,
    SystemMessage,
    HumanMessage,
)
from langchain_core.language_models import LanguageModelInput
from langchain_core.runnables import Runnable, RunnablePassthrough, RunnableMap
from langchain_core.output_parsers import JsonOutputParser, PydanticOutputParser
from pydantic import BaseModel, Field


class TRTLLMChat(BaseChatModel):
    url: str = Field(..., description="URL of the Triton inference server endpoint")
    temperature: float = Field(0.0, description="Sampling temperature")
    max_tokens: int = Field(4096, description="Maximum number of tokens to generate")
    format: Optional[Union[Literal["json"], Dict]] = None

    @property
    def _llm_type(self) -> str:
        return "trt-llm-chat"

    def _convert_messages_to_prompt(self, messages: List[BaseMessage]) -> str:
        # Flatten the chat history into a single plain-text prompt.
        prompt = ""
        for message in messages:
            if isinstance(message, SystemMessage):
                prompt += f"System: {message.content}\n"
            elif isinstance(message, HumanMessage):
                prompt += f"Human: {message.content}\n"
            elif isinstance(message, AIMessage):
                prompt += f"Assistant: {message.content}\n"
        return prompt.strip()
    def _call(self, messages: List[BaseMessage], stop: Optional[List[str]] = None) -> str:
        prompt = self._convert_messages_to_prompt(messages)
        payload = {
            "text_input": prompt,
            "parameters": {
                "temperature": float(self.temperature),
                "max_tokens": int(self.max_tokens),
            },
        }
        if self.format is not None:
            payload["format"] = self.format
        if stop and len(stop) > 0:
            payload["parameters"]["stop"] = stop[0]
        try:
            response = requests.post(
                self.url,
                json=payload,
                headers={"Content-Type": "application/json"},
            )
            if response.status_code != 200:
                raise Exception(f"Error from Triton server: {response.text}")
            result = response.json()
            response_text = result["text_output"].strip().lower()
            # For binary yes/no responses
            if self.format == "json" and response_text in ["yes", "no"]:
                return f'{{"binary_score": "{response_text}"}}'
            # For routing responses; the Korean keywords cover the vectorstore's
            # aerospace domain (e.g. "벡터스토어" = vectorstore, "위성" = satellite,
            # "발사체" = launch vehicle, "태양전지" = solar cell).
            elif self.format == "json" and response_text in [
                "not_retrieve", "vectorstore", "벡터스토어", "kari", "항공우주",
                "위성", "발사체", "우주", "항공", "발사", "태양전지", "태양", "전지",
            ]:
                return f'{{"datasource": "{response_text}"}}'
            return response_text
        except Exception as e:
            print(f"Request payload: {payload}")
            raise e

    def with_structured_output(
        self,
        schema: Union[Dict, type],
        *,
        method: Literal["function_calling", "json_mode", "json_schema"] = "function_calling",
        include_raw: bool = False,
        **kwargs: Any,
    ) -> Runnable[LanguageModelInput, Union[Dict, BaseModel]]:
        if kwargs:
            raise ValueError(f"Received unsupported arguments {kwargs}")
        if method == "json_mode":
            llm = TRTLLMChat(
                url=self.url,
                temperature=self.temperature,
                max_tokens=self.max_tokens,
                format="json",
            )
        elif method == "json_schema":
            if isinstance(schema, type):
                llm = TRTLLMChat(
                    url=self.url,
                    temperature=self.temperature,
                    max_tokens=self.max_tokens,
                    format=schema.model_json_schema(),
                )
            else:
                llm = TRTLLMChat(
                    url=self.url,
                    temperature=self.temperature,
                    max_tokens=self.max_tokens,
                    format=schema,
                )
        else:
            llm = self

        output_parser = (
            PydanticOutputParser(pydantic_object=schema)
            if isinstance(schema, type)
            else JsonOutputParser()
        )
        if include_raw:
            parser_assign = RunnablePassthrough.assign(
                parsed=itemgetter("raw") | output_parser,
                parsing_error=lambda _: None,
            )
            parser_none = RunnablePassthrough.assign(parsed=lambda _: None)
            parser_with_fallback = parser_assign.with_fallbacks(
                [parser_none], exception_key="parsing_error"
            )
            return RunnableMap(raw=llm) | parser_with_fallback
        else:
            return llm | output_parser

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[Any] = None,
        **kwargs: Any,
    ) -> ChatResult:
        text = self._call(messages, stop)
        message = AIMessage(content=text)
        generation = ChatGeneration(message=message)
        return ChatResult(generations=[generation])

llm = TRTLLMChat(
    url="http://ip:port/v2/models/ensemble/generate",
    temperature=0,
    max_tokens=8096,
)

from typing import Literal
from pydantic import BaseModel, Field
from langchain.prompts import ChatPromptTemplate


class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["vectorstore", "not_retrieve"] = Field(
        description="Given a user question, choose to route it to a vectorstore or not_retrieve.",
    )


structured_llm_router = llm.with_structured_output(RouteQuery, method="json_mode")

system_prompt = """You are an expert at routing a user question to a vectorstore.
The vectorstore contains documents related to the research and development of NASA, including topics such as aircraft, unmanned vehicles, satellites, space launch vehicles, satellite imagery, space exploration, and satellite navigation.
Output as "vectorstore" for questions on these topics. If the question is not related, respond with "not_retrieve"."""

route_prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("human", "{question}"),
])

question_router = route_prompt | structured_llm_router
result = question_router.invoke({"question": "Tell me about camel"})
print(result)
```
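Before layering structured output on top, it can help to confirm the raw request path works. A minimal smoke test, assuming the Triton endpoint above is reachable (the `ip:port` placeholder must be filled in):

```python
from langchain.schema import HumanMessage

# Bypass the router and call the model directly. No format is set on this
# instance, so _call returns the server's text_output (stripped and lowercased).
raw = llm.invoke([HumanMessage(content="Say hello in one word.")])
print(raw.content)
```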
Error:
```
OutputParserException: Invalid json output: spider.
output: not_retrieve
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE
```
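The failure is consistent with how `_call` post-processes responses: with `method="json_mode"` the router's output parser expects a JSON object, but `_call` only wraps a fixed allow-list of tokens into JSON. Anything else, like the `spider.` text here, is returned as bare text and the JSON parse fails. One possible hardening is to coerce every routing response into JSON instead of matching an allow-list. A sketch only; the `wrap_router_output` helper is hypothetical, not part of LangChain:

```python
import json

def wrap_router_output(text: str) -> str:
    # Hypothetical helper: normalize the model's bare token and always
    # return the JSON shape the parser expects.
    token = text.strip().lower().rstrip(".")
    if token in ("yes", "no"):
        return json.dumps({"binary_score": token})
    # Anything that is not an exact "vectorstore" falls back to
    # "not_retrieve" rather than reaching the parser as invalid JSON.
    if token != "vectorstore":
        token = "not_retrieve"
    return json.dumps({"datasource": token})
```

With this approach, `_call` would return `wrap_router_output(result["text_output"])` whenever `self.format == "json"`, so unexpected completions like `spider.` can no longer trigger an OUTPUT_PARSING_FAILURE.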
Description
I am trying to build a custom chat model so that I can use an LLM served on a Triton Inference Server with LangChain/LangGraph and automate tasks in an agent-like way.
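For the agent-style automation mentioned above, the router could drive a LangGraph conditional edge. A rough sketch, assuming langgraph's `StateGraph` API and the `question_router` from the example code; the `retrieve` and `respond` nodes are hypothetical stubs:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class GraphState(TypedDict):
    question: str
    answer: str

def retrieve(state: GraphState) -> GraphState:
    # Hypothetical stub: query the vectorstore here.
    return {**state, "answer": "retrieved context for: " + state["question"]}

def respond(state: GraphState) -> GraphState:
    # Hypothetical stub: answer without retrieval.
    return {**state, "answer": "direct answer"}

def route(state: GraphState) -> str:
    # question_router returns a RouteQuery instance via PydanticOutputParser,
    # so .datasource is either "vectorstore" or "not_retrieve".
    return question_router.invoke({"question": state["question"]}).datasource

graph = StateGraph(GraphState)
graph.add_node("retrieve", retrieve)
graph.add_node("respond", respond)
graph.set_conditional_entry_point(route, {"vectorstore": "retrieve", "not_retrieve": "respond"})
graph.add_edge("retrieve", END)
graph.add_edge("respond", END)
app = graph.compile()
```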
System Info
System Information
OS: Linux
OS Version: #137~20.04.1-Ubuntu SMP Fri Nov 15 14:46:54 UTC 2024
Python Version: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
Package Information
langchain_core: 0.3.31
langchain: 0.3.12
langchain_community: 0.3.12
langsmith: 0.2.3
langchain_experimental: 0.3.3
langchain_groq: 0.2.1
langchain_nvidia: Installed. No version info available.
langchain_nvidia_ai_endpoints: 0.3.7
langchain_nvidia_trt: 0.0.1rc0
langchain_ollama: 0.2.1
langchain_openai: 0.2.12
langchain_text_splitters: 0.3.3
langgraph_sdk: 0.1.51
Discussed in #29369