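Cathin is an automated testing framework built on OCR, an image classification model, and an image description generation model. It supports Android, iOS (planned), Windows, and Mac (planned).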
Download the model weights from https://huggingface.co/microsoft/Florence-2-base/tree/main and place them in florence_2_weights.
Then run cathin/console_scripts/ai_model_server.py to start the model server. The default port is 8080.
Run cathin/console_scripts/cat_ui/main.py.
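Before launching the UI, it can help to confirm that the model server is accepting connections. The following is a minimal sketch using only Python's standard library. The helper name `is_port_open` and the host `127.0.0.1` are assumptions made for illustration (the steps above state only the default port, 8080), and the check covers TCP reachability only, not whether the models loaded correctly.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper, not part of Cathin.
    """
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For a server started locally with the default settings, the call would be `is_port_open("127.0.0.1", 8080)`.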
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")  # replace "udid" with your device's identifier
Locate an element by its text. This works for any text that OCR can recognize.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text")
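For anything OCR cannot recognize, such as icons, the ID is usually a unique identifier generated by image classification.
(Note: because recognition accuracy is only about 70%, the ID does not fully describe the icon and serves only as a unique identifier. For a more accurate description, use cat(id="id").description, which calls the image description generation model to describe the icon precisely.)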
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(id="id")
You can also locate elements by direction with left(index), right(index), up(index), and down(index), where index starts from 1 and defaults to 1.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").left()
cat(text="text").right()
cat(text="text").up()
cat(text="text").down()
cat(text="text").left(2)
cat(text="text").right(2)
cat(text="text").up(2)
cat(text="text").down(2)
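The Action class provides various methods to interact with UI elements. Here are the available methods:
Click on the element. You can specify offsets if needed.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").click()
cat(text="text").click(x_offset=10, y_offset=20)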
Long click on the element for a specified duration.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").long_click(duration=2)
cat(text="text").long_click(duration=2, x_offset=10, y_offset=20)
Scroll in a specified direction for a given duration.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").scroll(direction='vertical_up', duration=200)
cat(text="text").scroll(direction='horizontal_left', duration=200)
Swipe from the element's center to a specified coordinate.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").swipe(to_x=100, to_y=200, duration=0.5)
Drag from the element's center to a specified coordinate.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").drag(to_x=100, to_y=200, duration=1)
Set text in an input field. You can choose to append the text or replace it.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").set_text("new text")
cat(text="text").set_text("new text", append=True)
Set the value of a seek bar to a specified percentage.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").set_seek_bar(percentage=0.5)
Cathin is an automated testing framework built on OCR, an image classification model, and an image description generation model. It supports Android, iOS (planned), Windows, and Mac (planned).
Run cat_ui from the command line. If it succeeds, it starts a server that serves the UI inspector.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")  # replace "udid" with your device's identifier
Locate an element by its text. This works for any text that OCR can recognize.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text")
For anything OCR cannot recognize, such as icons, the ID is usually a unique identifier generated by image classification.
(Note: because recognition accuracy is only about 70%, the ID does not fully describe the icon and serves only as a unique identifier. For a more accurate description, use cat(id="id").description, which calls the image description generation model to describe the icon precisely.)
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(id="id")
You can also locate elements by direction with left(index), right(index), up(index), and down(index), where index starts from 1 and defaults to 1.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").left()
cat(text="text").right()
cat(text="text").up()
cat(text="text").down()
cat(text="text").left(2)
cat(text="text").right(2)
cat(text="text").up(2)
cat(text="text").down(2)
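To make the 1-based index concrete, here is a minimal sketch of one way directional lookup could work over element bounding boxes. It is an illustration under stated assumptions, not Cathin's implementation: the `Box` type and `neighbor` function are hypothetical, and the rule used (order candidates by how far their center lies from the anchor's center along the requested axis, ignoring perpendicular alignment) is a guess at reasonable behavior.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical element bounds: top-left corner plus width and height, in pixels."""
    name: str
    x: float
    y: float
    w: float
    h: float

    @property
    def center(self) -> Tuple[float, float]:
        return (self.x + self.w / 2, self.y + self.h / 2)


def neighbor(anchor: Box, others: List[Box], direction: str, index: int = 1) -> Optional[Box]:
    """Return the index-th nearest box in `direction` from `anchor`, or None if there is none."""
    if index < 1:
        raise ValueError("index starts from 1")
    ax, ay = anchor.center
    # Distance travelled along the requested direction (screen y grows downward).
    along = {
        "left": lambda c: ax - c[0],
        "right": lambda c: c[0] - ax,
        "up": lambda c: ay - c[1],
        "down": lambda c: c[1] - ay,
    }[direction]
    # Keep only boxes strictly on the requested side, nearest first.
    hits = [b for b in others if b is not anchor and along(b.center) > 0]
    hits.sort(key=lambda b: along(b.center))
    return hits[index - 1] if index <= len(hits) else None
```

Under this sketch, `index=1` selects the closest element on that side and `index=2` the next one out, which matches the default of 1 described above.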
The Action class provides various methods to interact with UI elements. Here are the available methods:
Click on the element. You can specify offsets if needed.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").click()
cat(text="text").click(x_offset=10, y_offset=20)
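As a rough illustration of what the offsets do, the sketch below computes a tap coordinate from an element's bounds. It assumes that offsets are measured in pixels from the element's center; the source does not state the origin, so treat that as an assumption. `tap_point` is a hypothetical helper, not a Cathin function.

```python
from typing import Tuple


def tap_point(bounds: Tuple[int, int, int, int], x_offset: int = 0, y_offset: int = 0) -> Tuple[int, int]:
    """Return the tap coordinate for bounds given as (left, top, right, bottom).

    Assumption: offsets shift the tap away from the element's center.
    """
    left, top, right, bottom = bounds
    center_x = (left + right) // 2
    center_y = (top + bottom) // 2
    return center_x + x_offset, center_y + y_offset
```

With bounds `(100, 200, 300, 260)`, the center is `(200, 230)`, so offsets of 10 and 20 would move the tap to `(210, 250)`.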
Long click on the element for a specified duration.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").long_click(duration=2)
cat(text="text").long_click(duration=2, x_offset=10, y_offset=20)
Scroll in a specified direction for a given duration.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").scroll(direction='vertical_up', duration=200)
cat(text="text").scroll(direction='horizontal_left', duration=200)
Swipe from the element's center to a specified coordinate.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").swipe(to_x=100, to_y=200, duration=0.5)
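A swipe can be pictured as a straight line of touch points from the element's center to the target. The sketch below generates such a path by linear interpolation. `swipe_path` and its `steps` parameter are hypothetical and exist only for illustration; how Cathin actually dispatches the gesture is not described here.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def swipe_path(start: Point, end: Point, steps: int = 10) -> List[Point]:
    """Return steps + 1 evenly spaced points from start to end, both inclusive."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        (round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps))
        for i in range(steps + 1)
    ]
```

Spreading a `duration` evenly over these points would give a delay of `duration / steps` between consecutive touches.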
Drag from the element's center to a specified coordinate.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").drag(to_x=100, to_y=200, duration=1)
Set text in an input field. You can choose to append the text or replace it.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").set_text("new text")
cat(text="text").set_text("new text", append=True)
Set the value of a seek bar to a specified percentage.
from cathin.Android.android_driver import AndroidDriver
cat = AndroidDriver("udid")
cat(text="text").set_seek_bar(percentage=0.5)
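The example above passes 0.5, which suggests the percentage is a fraction between 0 and 1. Under that assumption, and assuming a horizontal bar that fills from left to right, the target touch point could be computed as in this sketch. `seek_bar_point` is a hypothetical helper, not part of Cathin.

```python
from typing import Tuple


def seek_bar_point(bounds: Tuple[int, int, int, int], percentage: float) -> Tuple[int, int]:
    """Map a fraction in [0, 1] to a point on a horizontal bar.

    bounds is (left, top, right, bottom); the bar is assumed to fill left to right.
    """
    if not 0.0 <= percentage <= 1.0:
        raise ValueError("percentage must be between 0 and 1")
    left, top, right, bottom = bounds
    x = left + round((right - left) * percentage)
    y = (top + bottom) // 2  # vertical middle of the bar
    return x, y
```

For a bar spanning `(100, 500, 900, 540)`, a percentage of 0.5 maps to the point `(500, 520)`.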