Improve website summary quality via browse prompt change #3551
Conversation
Codecov Report
Patch coverage has no change; project coverage changes by -1.01%.

Additional details and impacted files:

@@            Coverage Diff             @@
##           master    #3551      +/-   ##
==========================================
- Coverage   60.99%   59.98%    -1.01%
==========================================
  Files          73       69        -4
  Lines        3310     3099      -211
  Branches      542      513       -29
==========================================
- Hits         2019     1859      -160
+ Misses       1152     1109       -43
+ Partials      139      131        -8

☔ View full report in Codecov by Sentry.
How can I make the linting check pass? Help would be appreciated.
You need to run the linter locally and fix what it reports.
I’d also love to see you add a regression test to prevent this from happening in the future. You can copy-paste tests/integration/goal_oriented/test_browse_website.py and change the agent and the assert statement to fit your needs!
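A rough sketch of what such a regression test could look like (the import path and the exact call are assumptions for illustration, not the contents of the actual test file):

```python
import pytest

# Hypothetical import path; the real browse command lives elsewhere in the codebase.
from autogpt.commands.web_requests import browse_website


def test_browse_website_retains_detail() -> None:
    # Use a fixed, pre-defined page so the check is repeatable.
    summary = browse_website(
        url="https://example.com/cherry-cultivars",
        question="List of commercially available cherry cultivars",
    )
    # Assert on a concrete detail rather than exact wording, since the model's
    # phrasing varies between runs.
    assert "Montmorency" in summary
```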
Hey @ntindle, I've fixed the linting issues, but I'm still struggling with the regression test for my changes. I'm not too experienced in this area, but I'd like to learn. Could you give me some guidance or a specific example of how to create a test for my improvements to the browsing functionality? Sorry if it's a bother, but any help would be awesome. Thanks!
There is only one regression test in "test_browse_website.py" for the moment. It tests whether AutoGPT can correctly find the price of an item in a pre-defined website text. This is something that is repeatable and can be automated.

I am not sure that it is possible to make a regression test for longer, more comprehensive summaries. ChatGPT will often give different summarizations (and we cannot control its random seed via the API). A qualitative benchmark is possible, but that cannot be automated: you basically look at a website and judge whether the summary is good or not. Maybe do that multiple times for the same website, and possibly for a few more websites.

This is a well-known issue with testing LLMs, so I wouldn't worry too much, as long as a couple of qualitative tests show that this prompt indeed produces longer summaries with more relevant detail retained. I remember that one of my issues with browsing was also that it made very small summaries that lost a lot of useful detail.
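As a minimal illustration of that manual spot-check loop (summarize_website here is a placeholder for whatever browse/summarize entry point is being tested, not a real AutoGPT function):

```python
from typing import Callable


def spot_check(
    summarize_website: Callable[[str, str], str],
    url: str,
    question: str,
    runs: int = 3,
) -> None:
    """Run the same summarization several times and print the results for manual review.

    The API gives no seed control, so a few samples are needed; judging whether the
    summaries actually answer the question remains a human call.
    """
    for i in range(runs):
        print(f"--- run {i + 1} ---")
        print(summarize_website(url, question))
```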
Thank you; I can't stress how much that cleared things up for me. I suppose now we'll wait for someone higher up to benchmark this.
Okay, so here is my attempt at a qualitative benchmark. We can discuss if this is the right approach or if you have something else in mind.

Background
As a bit of background, remember that if a website is considered too long (i.e. it would use up too many tokens just to send the contents to ChatGPT), then it is split into sentences using spaCy, and then into chunks that max out the number of tokens allowed (by default we use GPT-3.5-Turbo and BROWSE_CHUNK_MAX_LENGTH=3000 tokens, leaving about 1000 tokens for the answer). Each chunk is summarized, and then the summaries are concatenated by AutoGPT and summarized again.

Method
I have chosen a website that is too large and will be split into 3 chunks, which are then summarized into a final summary. To save space I will not include the intermediate summaries here, only the final, overall summary of the website as returned by the browse command. I will repeat the summary creation 3 times on the same website with the same question, because ChatGPT is probabilistic, so we need more than 1 sample. I will do this once for the current AutoGPT implementation and once for the new prompt proposed by this PR. I will then copy the summaries into the next paragraphs as they are, without modification. The results below were produced at Temperature 0.5 with the same AutoGPT browse command for both versions.
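For context, a minimal sketch of the chunk-and-summarize flow described in the Background above (spaCy sentence splitting, token-bounded chunks, per-chunk summaries, then a summary of the summaries). The helper names and the summarize callback are assumptions, not AutoGPT's actual code:

```python
import spacy
import tiktoken

MAX_CHUNK_TOKENS = 3000  # mirrors BROWSE_CHUNK_MAX_LENGTH, leaving ~1000 tokens for the answer

nlp = spacy.load("en_core_web_sm")
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")


def split_into_chunks(text: str, max_tokens: int = MAX_CHUNK_TOKENS) -> list[str]:
    """Split the page text into sentences, then pack sentences into token-bounded chunks."""
    sentences = [sent.text for sent in nlp(text).sents]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(encoding.encode(candidate)) > max_tokens:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def summarize_website(text: str, query: str, summarize) -> str:
    """Summarize each chunk, then summarize the concatenation of the chunk summaries."""
    chunk_summaries = [summarize(chunk, query) for chunk in split_into_chunks(text)]
    return summarize("\n\n".join(chunk_summaries), query)
```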
RESULTS for original AutoGPT version

Run 1: The text provides a list of commercially available cherry cultivars for Ontario, including recommended sweet and tart cherry cultivars, harvest dates, pollination information, and cherry cultivar descriptions. The cultivars are listed in order of maturity and are grouped into general planting, limited planting, and trial planting categories. The text also includes information on cherry rootstocks and pollen incompatibility groups for sweet cherry cultivars.

Run 2: The text provides a list of commercially available cherry cultivars for Ontario, including recommended sweet and tart cherry cultivars, their harvest dates, pollination requirements, and brief descriptions of each cultivar's characteristics and performance. The cultivars are grouped into general planting, limited planting, and trial planting categories, and recommendations for planting cultivars and adapted areas within the province have been determined by various organizations and consultations with industry stakeholders. The list includes cultivars such as Sunburst, Sweetheart, Tehranivee, Ulster, Valera, Van, and Montmorency, as well as information about different cherry rootstocks. The information is provided by the Ministry of Agriculture, Food and Rural Affairs in Ontario, Canada.

Run 3: The text provides a list of commercially available cherry cultivars, including both sweet and tart varieties, as well as information on cherry rootstocks and species collections. The list is organized by maturity and includes recommended cultivars for Ontario. The text also mentions the importance of maintaining cherry collections for evaluation and breeding programs.

RESULTS for the prompt from this Pull Request

Run 1:
Query: "List of commercially available cherry cultivars"
Data block 1: The article provides a list of commercially available cherry cultivars for Ontario, categorized by recommended general planting, limited planting, and trial planting. The recommended sweet cherry cultivars are Viva, Vista, Hartland, Valera, Vega, Cavalier, Viscount, Venus, Cristalina, Bing, Vic, Kristin, Vogue, Newstar, Vandalay, Stella, Tehranivee, Sonata, and Hedelfingen. The recommended tart cherry cultivars are Montmorency, Northstar, Balaton, Galaxy, and Meteor. The article also includes information on cherry harvest dates, pollination for sweet and tart cherries, cherry cultivar descriptions, and cherry rootstocks.
Data block 2: The data block contains a list of commercially available cherry cultivars, including both sweet and tart varieties. Some of the sweet cherry cultivars mentioned are Colt, Gisela®5, Gisela®6, and Mazzard. The tart cherry cultivars mentioned include Balaton, English Morello, Meteor, and Montmorency. The list also includes descriptions of each cultivar's characteristics, such as fruit size, ripening time, and resistance to certain diseases. There is no specific statistic related to this query, as the data block provides qualitative descriptions rather than quantitative data.
Data block 3: The data block contains a list of commercially available cherry cultivars, including their names and some additional information such as their ripening time, color, and origin. Some of the named cultivars include FrancisIII, Grosse GermersdorferIII, Harlemer Doppelte-Hartland, LambertIII, Lapins, NapoleonIII, RainierIX, and Stella. The block also includes a list of cherry rootstocks and species collections, as well as some miscellaneous information about cherry varieties.
Overall, the block provides a comprehensive overview of the different types of cherries that are available commercially.
Explanation: All three data blocks provide a list of commercially available cherry cultivars, along with additional information such as ripening time, disease resistance, and origin. The first data block provides recommended sweet and tart cherry cultivars for Ontario, while the second data block provides a list of sweet and tart cherry cultivars along with their characteristics. The third data block provides a list of named cherry cultivars along with their ripening time, color, and origin, as well as information on cherry rootstocks and species collections. No specific statistics are provided for this query, as the data blocks provide qualitative descriptions of the cherry cultivars.

Run 2: The data block provides a list of commercially available cherry cultivars for Ontario, categorized by sweet and tart cherries, with recommended cultivars listed in order of maturity. The list includes general planting, limited planting, and trial planting categories. The block also provides information on cherry harvest dates, pollination requirements for sweet and tart cherries, and brief descriptions of major sweet cherry cultivars. The data block does not provide specific statistics on the number of cultivars available or their market share. The cultivars listed include Sunburst, Sweetheart, Tehranivee, Ulster, Valera, Van, Vandalay, Vega, Venus, Vic, Viscount, Vista, Viva, and Vogue for sweet cherries, and Balaton, English Morello, Meteor, and Montmorency for tart cherries. The list also includes information on cherry rootstocks, including Colt, Gisela clones, Mahaleb, Mazzard, and MxM clones. There are no specific statistics provided on the number of commercially available cherry cultivars or their market share.

Run 3: The data block provides a list of commercially available cherry cultivars for Ontario, categorized by sweet and tart cherries, and grouped by recommended cultivars for general planting, limited planting, and trial planting. The list includes recommended cherry cultivars, harvest dates, pollination requirements, cultivar descriptions, and recommended rootstocks. Some sweet cherry cultivars that are recommended for Ontario include Sunburst, Sweetheart, Tehranivee, Ulster, Valera, Van, Vandalay™, Vega, Venus, Vic, Viscount, Viva, and Vogue. Some tart cherry cultivars that are recommended for Ontario include Balaton, English Morello, Meteor, and Montmorency. The data block also provides information on the importance of selecting appropriate cultivars for specific climatic zones, and the need for cross-pollination in sweet cherry cultivars. The list includes average first harvest dates for sweet and tart cherries, and pollen incompatibility groups for sweet cherry cultivars. The data block does not provide specific statistics on the number of commercially available cherry cultivars in Ontario.

DISCUSSION

Original version: Only 1 summary out of 3 actually contains the answer to the question. The other two allude to the fact that the website contains the answer, but do not actually list the requested cherry tree cultivars in the summary.

This PR version: All three summaries actually have the answer to the question "list of commercially available cherry cultivars", and a bit more text and detail. This should help give a better Ada embedding down the line, when AutoGPT is looking at its memories.
CONCLUSION

I think that the summaries are better than before, so I would recommend merging this PR.
@bszollosinagy This is some really professional work. Overall, I think this benchmark effectively demonstrates this PR's improvements. Well done; I'll learn from you.
A test is failing with what seems to be an "Incorrect API key provided" error. Is this an issue with the PR? It happened on the most recent merge from master.
Apparently this is caused by a test that caches its results into something called a cassette; see CONTRIBUTING.md. All you have to do is run the tests locally with your own API key so the missing cassette gets recorded.

If I understand correctly, this is done so that running the tests on GitHub does not need an API key, as long as it finds a "cassette" which acts as a cache. In your case it did not find the cassette, so it tried to use the OpenAI API, hence the error (because nobody is going to give their own API key just so that 100s of unit tests can run on GitHub for each Pull Request). I hope that clears it up.
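A generic illustration of the cassette mechanism with pytest and VCR (this is not the project's actual test or conftest, just how recording and replay typically work):

```python
import pytest
import requests


@pytest.mark.vcr
def test_fetch_page():
    # On the first run (with real network access and credentials) the HTTP exchange is
    # recorded into a YAML "cassette". Subsequent runs replay the cassette, so CI never
    # needs a real OpenAI key, unless the cassette is missing, as happened here.
    response = requests.get("https://example.com/")
    assert response.status_code == 200
```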
This is a mass message from the AutoGPT core team. For more details (and for info on joining our Discord), please refer to:
This PR exceeds the recommended size of 200 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.
@bszollosinagy If it matters, the test failed; the logs are below. I've pushed the updated YAML. Thanks for the patience. I'd appreciate it if you'd let me know whether I did anything wrong. I'll probably look into what else I can do in terms of troubleshooting tomorrow, but right now I'm fairly exhausted.
@bszollosinagy we're currently building challenges and I really liked your work on this PR. Can I talk to you in voice? Please join us on Discord through this link https://discord.gg/autogpt (if you haven't already) and DM me on the Auto-GPT Discord server (my Discord is merwanehamadi).
@onekum If you are running the test on your computer, then somehow the .env file does not have your API key filled in.
This should probably use the same argument structure that the other commands are using, i.e.
PS: please consider leaving your feedback about the idea of supporting "prompt profiles" (directories with different prompt configs) for these types of changes, as per: #1874 (comment) |
This PR exceeds the recommended size of 200 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.
@bszollosinagy How can I switch it from sk-dummy? I made sure that the .env file had my API key, and I even tried updating the env.template file as well, which resulted in the same error. What file is it getting the API key from, if not .env? It keeps trying to use sk-dummy.
This pull request has conflicts with the base branch; please resolve those so we can evaluate the pull request.
@merwanehamadi This seems to be a bug with VCR. If I comment out the @pytest.mark.vcr line, the test passes; if it is present, the dummy key is used even if a correct API key is specified in the .env file. If you check out this PR and try to run the browse test, it will also fail for you. Note: the test was executed from the command line to avoid conflicts with an IDE.
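For what it's worth, a generic sketch of the kind of VCR configuration that produces this behaviour (not AutoGPT's actual conftest): cassettes are commonly recorded with the real Authorization header replaced by a placeholder, and then replayed as-is.

```python
import pytest


@pytest.fixture(scope="module")
def vcr_config():
    # Passed to VCR by pytest-recording; values here are an illustrative assumption.
    return {
        # Scrub the real key from recordings and store a placeholder instead;
        # on replay, requests are matched without needing the real key.
        "filter_headers": [("authorization", "Bearer sk-dummy")],
        # Only hit the network when no cassette exists yet; otherwise replay.
        "record_mode": "once",
    }
```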
This directly conflicts with changes in #4208 (which has priority). Can you test that fork, see if it still needs improvement, and make a PR to that fork, or to master after it is merged?
Hey, I've marked this as don't-merge until the Memory Fixes are in. Sorry to keep it on hold longer; there's just no real way to test/validate the functionality until the fixes are in.
Closing this, as the changes in #4208 seem to fix the previous behavior adequately in my opinion. Additionally, due to a lack of free time and technical knowledge, my motivation to continue work on this PR is waning. If issues with the browse function arise again in the future, or if it's thought that it could still be improved further, someone may attempt an updated implementation of this PR.
Background
I feel that "question" and "summary" are bad words to use when prompting an AI to extract data from a webpage; they're too general. So I reworked the browse prompt to reflect that, which ended up significantly improving GPT-3's performance in summarizing website content. By telling it to use maximum detail, the browse outputs won't become (as) progressively terse and oversimplified, which would otherwise lead to a bad final summary. The caveat is potentially using more tokens, because the summaries will be longer, but at least the AI won't be neutered by bad browsing functionality.
This also makes it so you won't go homeless from needing to use GPT-4 to get decent website summaries.
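A hypothetical sketch of the kind of prompt change being described; both templates are illustrative reconstructions, not the exact strings changed by this PR.

```python
def old_browse_prompt(text: str, question: str) -> str:
    # Generic wording: "question" and "summarize" leave a lot of room for terse answers.
    return (
        f'"""{text}"""\n'
        f'Using the above text, answer the following question: "{question}". '
        "If the question cannot be answered using the text, summarize the text."
    )


def new_browse_prompt(text: str, query: str) -> str:
    # Query-oriented wording that asks for maximum retained detail from the data block.
    return (
        f'"""{text}"""\n'
        f'Using the above data block, extract all information relevant to the query "{query}". '
        "Retain maximum detail: keep names, numbers, and statistics, and do not shorten "
        "the answer at the cost of losing information."
    )
```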
Changes
Changed the browse prompt to be more effective.
Documentation
Nothing is added, only changed. No documentation necessary (I don't think).
Test Plan
I've tested this over the course of many different prompts on my own repo and I can say confidently that this alternate browse prompt significantly increases the quality of the summary returned. You'll see that the summaries are much, much more information dense and character-efficient. Example outputs are below; compare with current performance:
Output 1:
Output 2:
Output 3:
PR Quality Checklist
Things to consider
Perhaps change "question" to "query" in the initial prompt so that the AI treats it more as a query for information and thus makes its input more objective. The fact that I can't find where the initial prompt is anymore prevents me from making this change myself. If anyone wants to make this change or inform me how to do so, it'll happen, but I think this would be a fine PR merge even without that.