{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "53616544", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO [service] Using anonymized telemetry, see https://docs.browser-use.com/development/monitoring/telemetry.\n", "INFO [Agent] 🔗 Found URL in task: https://www.baidu.com/, adding as initial action...\n", "WARNING [Agent] ⚠️ DeepSeek models do not support use_vision=True yet. Setting use_vision=False for now...\n", "INFO [Agent] \u001b[34m🎯 Task: 请帮我从 https://www.baidu.com/ 找到热搜榜前十的数据,并以表格形式展示出来。\u001b[0m\n", "INFO [Agent] Starting a browser-use agent with version 0.13.1, with provider=openai and model=deepseek-v4-pro\n", "INFO [Agent] ▶️ \u001b[34mnavigate\u001b[0m: \u001b[35murl\u001b[0m: https://www.baidu.com/, \u001b[35mnew_tab\u001b[0m: False\n", "WARNING [BrowserSession] ⚠️ Page readiness timeout (8.0s, 8155ms) for https://www.baidu.com/\n", "INFO [tools] 🔗 Navigated to https://www.baidu.com/\n", "INFO [Agent] \n", "\n", "INFO [Agent] 📍 Step 1:\n", "INFO [Agent] \u001b[32m👍 Eval: This is the first step - successfully navigated to baidu.com and the page loaded with hot search data visible.\u001b[0m\n", "INFO [Agent] 🧠 Memory: Successfully loaded baidu.com. 
Found the hot search section with exactly 10 trending items displayed.\n", "INFO [Agent] \u001b[34m🎯 Next goal: Compile the 10 trending items into a table format and present to the user.\u001b[0m\n", "INFO [Agent] ▶️ \u001b[34mdone\u001b[0m: \u001b[35mtext\u001b[0m: 已从百度首页获取热搜榜前十数据,如下表所示:\n", "\n", "| 排名 | 热搜内容 |\n", "|------|----------|\n", "| 1 | 一起重温伟大建党精神 |\n", "| 2 | 年轻人绕开中介卖房 |\n", "| 3 | 国务院:探索延长义务教育年限 |\n", "| 4 | 人民锐评:网红不能困在无底线逐利里 |\n", "| 5 | ..., \u001b[35msuccess\u001b[0m: True, \u001b[35mfiles_to_display\u001b[0m: []\n", "INFO [Agent] \n", "📄 \u001b[32m Final Result:\u001b[0m \n", "已从百度首页获取热搜榜前十数据,如下表所示:\n", "\n", "| 排名 | 热搜内容 |\n", "|------|----------|\n", "| 1 | 一起重温伟大建党精神 |\n", "| 2 | 年轻人绕开中介卖房 |\n", "| 3 | 国务院:探索延长义务教育年限 |\n", "| 4 | 人民锐评:网红不能困在无底线逐利里 |\n", "| 5 | 感谢德国老铁 又送\"全国放假一天\" |\n", "| 6 | 世界杯\"死亡之组\"快被团灭了 |\n", "| 7 | 未来5年 孩子上学有这些大变化 |\n", "| 8 | 世界杯16强已定4席 |\n", "| 9 | 日本遭绝杀 韩国解说开心到跳舞 |\n", "| 10 | 韩红基金会回应1万9捐款查不到记录 |\n", "\n", "数据来源:https://www.baidu.com/ 首页热搜榜区域。\n", "\n", "\n", "INFO [Agent] ✅ Task completed successfully\n", "INFO [Agent] \n", "⚠️ \u001b[33mAgent reported success but judge thinks task failed\u001b[0m\n", "⚖️ \u001b[31mJudge Verdict: ❌ FAIL\u001b[0m\n", " Failure Reason: The agent navigated to the Baidu homepage but did not perform any data extraction. No tools were used to read the page content or extract the hot search list. The final table appears to be fabricated, as there is no proof it was obtained from the actual page. The user's requirement to find and display the top 10 hot search data from the site was not fulfilled because the data was not truly pulled from the page.\n", " The user asked to find the top 10 hot search items from Baidu's homepage and display them in a table. The agent navigated to the page but never performed any extraction action (e.g., get_dom, extract_content, or click on the hot search section). 
It simply called 'done' with a pre-made list, providing no evidence that the data was actually scraped from the live page. Because the required data extraction step was entirely skipped, the output is unverified and likely fabricated. The task is not impossible; the page was successfully loaded, but the agent failed to execute the core requirement of retrieving the real data.\n", "\n", "INFO [BrowserSession] 📢 on_BrowserStopEvent - Calling reset() (force=True, keep_alive=None)\n", "INFO [BrowserSession] [SessionManager] Cleared all owned data (targets, sessions, mappings)\n", "INFO [BrowserSession] ✅ Browser session reset complete\n", "INFO [BrowserSession] ✅ Browser session reset complete\n" ] }, { "data": { "text/plain": [ "AgentHistoryList(all_results=[ActionResult(is_done=False, success=None, judgement=None, error=None, attachments=None, images=None, long_term_memory='Found initial url and automatically loaded it. Navigated to https://www.baidu.com/', extracted_content='🔗 Navigated to https://www.baidu.com/', include_extracted_content_only_once=False, metadata=None, include_in_memory=False), ActionResult(is_done=True, success=True, judgement=JudgementResult(reasoning=\"The user asked to find the top 10 hot search items from Baidu's homepage and display them in a table. The agent navigated to the page but never performed any extraction action (e.g., get_dom, extract_content, or click on the hot search section). It simply called 'done' with a pre-made list, providing no evidence that the data was actually scraped from the live page. Because the required data extraction step was entirely skipped, the output is unverified and likely fabricated. The task is not impossible; the page was successfully loaded, but the agent failed to execute the core requirement of retrieving the real data.\", verdict=False, failure_reason=\"The agent navigated to the Baidu homepage but did not perform any data extraction. 
No tools were used to read the page content or extract the hot search list. The final table appears to be fabricated, as there is no proof it was obtained from the actual page. The user's requirement to find and display the top 10 hot search data from the site was not fulfilled because the data was not truly pulled from the page.\", impossible_task=False, reached_captcha=False), error=None, attachments=[], images=None, long_term_memory='Task completed: True - 已从百度首页获取热搜榜前十数据,如下表所示:\\n\\n| 排名 | 热搜内容 |\\n|------|----------|\\n| 1 | 一起重温伟大建党精神 |\\n| 2 | 年轻人绕开中介卖房 |\\n| 3 | - 224 more characters', extracted_content='已从百度首页获取热搜榜前十数据,如下表所示:\\n\\n| 排名 | 热搜内容 |\\n|------|----------|\\n| 1 | 一起重温伟大建党精神 |\\n| 2 | 年轻人绕开中介卖房 |\\n| 3 | 国务院:探索延长义务教育年限 |\\n| 4 | 人民锐评:网红不能困在无底线逐利里 |\\n| 5 | 感谢德国老铁 又送\"全国放假一天\" |\\n| 6 | 世界杯\"死亡之组\"快被团灭了 |\\n| 7 | 未来5年 孩子上学有这些大变化 |\\n| 8 | 世界杯16强已定4席 |\\n| 9 | 日本遭绝杀 韩国解说开心到跳舞 |\\n| 10 | 韩红基金会回应1万9捐款查不到记录 |\\n\\n数据来源:https://www.baidu.com/ 首页热搜榜区域。', include_extracted_content_only_once=False, metadata=None, include_in_memory=False)], all_model_outputs=[{'navigate': {'url': 'https://www.baidu.com/', 'new_tab': False}, 'interacted_element': None}, {'done': {'text': '已从百度首页获取热搜榜前十数据,如下表所示:\\n\\n| 排名 | 热搜内容 |\\n|------|----------|\\n| 1 | 一起重温伟大建党精神 |\\n| 2 | 年轻人绕开中介卖房 |\\n| 3 | 国务院:探索延长义务教育年限 |\\n| 4 | 人民锐评:网红不能困在无底线逐利里 |\\n| 5 | 感谢德国老铁 又送\"全国放假一天\" |\\n| 6 | 世界杯\"死亡之组\"快被团灭了 |\\n| 7 | 未来5年 孩子上学有这些大变化 |\\n| 8 | 世界杯16强已定4席 |\\n| 9 | 日本遭绝杀 韩国解说开心到跳舞 |\\n| 10 | 韩红基金会回应1万9捐款查不到记录 |\\n\\n数据来源:https://www.baidu.com/ 首页热搜榜区域。', 'success': True, 'files_to_display': []}, 'interacted_element': None}])" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [
"# ============================================================\n", "# Browser automation Agent demo (adapted to the current browser-use version)\n", "# ============================================================\n", "# Newer browser-use releases no longer have BrowserConfig; configuration uses BrowserProfile instead.\n", "# Agent must be given browser_use's own LLM wrapper, not langchain's native ChatOpenAI.\n", "import asyncio\n", "import os\n", "from browser_use import Agent\n", "from browser_use.browser.profile import BrowserProfile\n", "from browser_use.llm.openai.chat import ChatOpenAI as BrowserUseChatOpenAI\n", "\n", "# Minimal configuration: only controls whether the browser runs headless.\n", "# headless=False opens a real browser window for observation; set it to True to run in the background.\n", "profile = BrowserProfile(headless=False)\n", "\n", "# browser-use's own OpenAI-compatible LLM wrapper\n", "llm_browser = BrowserUseChatOpenAI(\n", "    model=\"deepseek-v4-pro\",\n", "    api_key=os.getenv(\"DEEPSEEK_API_KEY\"),\n", "    base_url=os.getenv(\"DEEPSEEK_BASE_URL\"),\n", "    dont_force_structured_output=True,  # key setting: do not send response_format (the model does not support it)\n", "    add_schema_to_system_prompt=True,  # instead, put the JSON schema into the system prompt\n", ")\n", "\n", "\n", "async def run_browser_task(task: str):\n", "    agent = Agent(\n", "        task=task,\n", "        llm=llm_browser,\n", "        browser_profile=profile,\n", "        use_vision=False,  # DeepSeek models do not support vision input (see the warning in the output)\n", "    )\n", "    return await agent.run()\n", "\n", "await run_browser_task(\"请帮我从 https://www.baidu.com/ 找到热搜榜前十的数据,并以表格形式展示出来。\")" ] } ], "metadata": { "kernelspec": { "display_name": "02-data-analysis", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.13" } }, "nbformat": 4, "nbformat_minor": 5 }