Desktop Tools

Bom can see your screen, click, type, and control windows on your desktop, just as you would.

Screenshots

Bom can take a screenshot of your screen at any time. It uses these screenshots to understand what is on your screen and decide what to do next.

  • Full screen or a specific area
  • Support for multiple monitors
  • Smart detection of interface elements (buttons, menus, text fields)

Tip

When Bom takes a screenshot, it can identify clickable elements and number them. This is how it knows where to click.
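
To make the numbering idea concrete, here is a minimal sketch in Python. It is an illustration only and not Bom's actual format or code: the `Element` structure, the `number_elements` helper, and the reading-order numbering rule are all assumptions made for this example. It shows how a numbered element with a known position and size can be turned into a click point.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """A detected interface element (hypothetical structure for illustration)."""
    label: str
    left: int
    top: int
    width: int
    height: int

    def center(self):
        """Return the (x, y) point in the middle of the element."""
        return (self.left + self.width // 2, self.top + self.height // 2)


def number_elements(elements):
    """Assign numbers 1..n in reading order: top to bottom, then left to right.

    The ordering rule is an assumption; a real detector may number differently.
    """
    ordered = sorted(elements, key=lambda e: (e.top, e.left))
    return {number: e for number, e in enumerate(ordered, start=1)}


# Made-up detections standing in for what a screenshot analysis might return.
detected = [
    Element("Save", left=200, top=40, width=80, height=30),
    Element("File menu", left=10, top=5, width=50, height=20),
    Element("Search field", left=10, top=40, width=180, height=30),
]

numbered = number_elements(detected)
target = numbered[3]
print(target.label, target.center())
```

With numbers attached, "click element 3" becomes a simple lookup followed by a click at the center of that element.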

Mouse and Keyboard

Bom can control your mouse and keyboard to interact with any application:

  • Click at any position on the screen
  • Type text into any text field
  • Press keyboard shortcuts (such as copy, paste, and save)
  • Scroll up and down
  • Drag and drop

Window Management

Bom can see all open windows and control them:

  • List all open windows
  • Bring a specific window to the front
  • Minimize, maximize, or close a window
  • Get the position and size of any window
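
As a rough illustration of what a window list might look like, the Python sketch below models each window as a record with a title, position, and size, then looks one up by part of its title. The `Window` structure and the `find_window` helper are hypothetical names invented for this example; they are not Bom's actual interface.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """One open window (hypothetical record for illustration)."""
    title: str
    left: int
    top: int
    width: int
    height: int
    minimized: bool = False


def find_window(windows, title_part):
    """Return the first window whose title contains title_part, ignoring case.

    Returns None when no window matches.
    """
    needle = title_part.lower()
    for window in windows:
        if needle in window.title.lower():
            return window
    return None


# Made-up window list standing in for the result of "list all open windows".
open_windows = [
    Window("Inbox - Mail", 0, 0, 1280, 720),
    Window("report.docx - Editor", 100, 80, 900, 600),
    Window("Calculator", 1400, 200, 320, 480, minimized=True),
]

editor = find_window(open_windows, "editor")
print(editor.title, (editor.left, editor.top), (editor.width, editor.height))
```

Matching on part of the title is a design choice for this sketch: window titles often include a changing document name, so an exact match would be brittle.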

Interface Automation

For more precise control, Bom can interact directly with interface elements such as buttons, checkboxes, and text fields, without needing to know their exact screen position.

  • Find a button by its name and click it
  • Read the text in a specific field
  • Check or uncheck a checkbox
  • Wait for an element to appear before continuing
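
Waiting for an element usually comes down to checking repeatedly until it shows up or a time limit passes. The Python sketch below shows that general pattern under stated assumptions: `wait_for` and `find_ok_button` are hypothetical names, and the lookup is a stand-in that "finds" the button on its third check. It does not describe how Bom implements waiting internally.

```python
import time


def wait_for(find, timeout=5.0, interval=0.1):
    """Call find() repeatedly until it returns something other than None.

    Raises TimeoutError if nothing is found within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"element did not appear within {timeout} seconds")
        time.sleep(interval)


# Stand-in for a real element lookup: the button "appears" on the third check.
attempts = {"count": 0}


def find_ok_button():
    attempts["count"] += 1
    return "OK button" if attempts["count"] >= 3 else None


found = wait_for(find_ok_button, timeout=2.0, interval=0.01)
print(found)
```

A time limit matters here: without one, an element that never appears would leave the automation waiting forever.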