Introduction

Seeing AI is a powerful tool by Microsoft for the visually impaired. Its features include describing what is in front of you, converting handwritten or printed text into digital text, recognizing different currencies, describing colors and lighting conditions, and much more. A recognition button sits on the right side of the screen, below the middle of the app, and its function varies with the channel you select. For example, in the 'short text' channel TalkBack will say 'recognizing English,' and in the 'product' channel it will say 'recognizing barcode'. Each page also has a quick help button for your reference. This article gives an overview of the app and then discusses each function.

Overview of the app

First comes the navigation drawer button, which is at the top left corner of the app. Then comes the 'quick help' button, which is at the top right corner of the app. Then there is a recognition or switch camera button, which changes based on the channel. It is located on the right side, below the center of the screen. Then there are nine different channels, all located in a horizontal line below the center of the screen. There is also a 'take picture' button, which appears only for specific channels. Its position is nearly in the center of the screen.

To switch between different buttons, change the TalkBack mode to 'controls' by swiping down and then up (or up and then down).

Button 1: Navigation Drawer

TalkBack will announce it as 'open navigation drawer button'. It has options like

Button 2: Quick Help

Use it to read a description of the currently selected channel. The description also includes a video tutorial explaining how to use the channel.

Button 3: Recognize or switch camera

This is a dynamic button whose name changes with the selected channel. The table below shows the button name for each channel.

Channel name            Button name
Short text              Recognizing English
Document                --
Product                 Recognizing bar codes
Scene (preview)         --
Person                  Switch to front camera
Currency                Recognizing British pounds
Color (preview)         --
Handwriting (preview)   --
Light                   --

Note: These are the default names. You can change the option by double-tapping the button with TalkBack. For instance, in the Short Text channel, this button lets you change the language that Seeing AI should recognize. In the Person channel, the button name is "switch to front camera" or "switch to back camera". Each channel is discussed in detail later in this article.
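
For readers who prefer to see the table as data, here is a minimal Python sketch of the same lookup. It is only an illustration, not part of Seeing AI: the dictionary name and the function `default_button_name` are hypothetical, and `None` stands for the channels where the table shows "--".

```python
# Default recognition-button names per channel, taken from the table above.
# None marks channels for which the table shows "--".
# This mapping is an illustration only; it is not an API of the Seeing AI app.
DEFAULT_BUTTON_NAMES = {
    "Short text": "Recognizing English",
    "Document": None,
    "Product": "Recognizing bar codes",
    "Scene (preview)": None,
    "Person": "Switch to front camera",
    "Currency": "Recognizing British pounds",
    "Color (preview)": None,
    "Handwriting (preview)": None,
    "Light": None,
}


def default_button_name(channel):
    """Return the default button name for a channel, or None if none is listed."""
    if channel not in DEFAULT_BUTTON_NAMES:
        raise KeyError(f"Unknown channel: {channel}")
    return DEFAULT_BUTTON_NAMES[channel]
```

For example, `default_button_name("Product")` returns the default name listed for the Product channel, while a channel such as Light returns `None`.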

Button 4: Take picture

This is also a dynamic button; it appears only in certain channels. Below is the list of channels where this button will appear.

Channel 1: Short Text

This option initiates immediate text reading within the camera's range and is recommended for small amounts of text. When activated, the app will promptly vocalize any text detected by the camera. If the image becomes clearer, it will re-read the text for enhanced clarity. Use this option for quick access to text such as room numbers, bus signage, shop names, short passages in books, or food packaging labels. It's ideal for instances where you need instant auditory feedback for brief text snippets without the need for extensive processing.

Channel 2: Document

Position the camera over a printed page to capture it. Once the text is recognized, you can use TalkBack commands to navigate through it.

Seeing AI provides guidance for camera placement until all edges of the document are visible and a photo is taken. Make necessary adjustments until you hear "Hold steady." A helpful technique involves placing the camera in the center of the page and gradually moving it away while making slight adjustments.

This channel performs optimally when there's a high contrast between the page and the background, such as a white document on a dark surface.

After the text is recognized, you can use the 'Add' button to scan additional pages or tap the 'Play' button to have the text read aloud with synchronized word highlighting. When you tap the 'Play' button, three additional buttons appear: 'Skip Back', 'Play/Pause', and 'Skip Forward'. They disappear when you tap the 'Stop' button.

Besides having the text read from start to finish, you can ask Seeing AI questions about the document by tapping 'Ask Seeing AI' and typing/dictating your question. Note that answers are AI-generated, so errors are possible. To assist Microsoft in improving accuracy, please provide feedback.

There's also a 'More' button that allows you to rescan a page, delete the current page, or delete all pages. Additionally, there's a 'Share' button with two options: one to share the image and the other to send the text. The text option is particularly useful because the app converts the scanned text into an HTML file, which is a convenient format for accessing the document.

Note: The app currently cannot scan mathematical equations, so it's recommended for reading printed media with text or studying literature subjects.

For your reference, the 'Add Page' and 'More Options' buttons are positioned at the top right of this screen; all other buttons are located at the bottom.

Channel 3: Product

Seeing AI can recognize products based on various types of codes printed on the packaging, including barcodes. You can select the type of code to recognize by tapping the button on the main screen. By default, barcodes will be detected, as they are the most common type of product code and may be found on the back or bottom of a container. Some products, including those manufactured by Unilever, feature an accessibility-enhanced QR code with a special border around it, making it easier to detect. These codes are typically found on the front of the packaging.

To scan a product, hold the camera over the item, and Seeing AI will guide you with camera placement until the code is detected. Move the phone over the product until you hear beeps indicating that a code is nearby. Starting farther away and slowly moving the phone closer works best. The faster the beeps, the closer you are to the code. In the case of accessibility-enhanced QR codes, the distance will also be announced. Once the code is detected, Seeing AI will read the product name aloud.
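
The "faster beeps mean closer" idea can be pictured with a small Python sketch. This is an assumption made for illustration, not how Seeing AI is actually implemented: the function name `beep_interval_seconds`, the linear mapping, and all the numbers (50 cm range, 0.1 to 1.0 second intervals) are hypothetical.

```python
def beep_interval_seconds(distance_cm, max_distance_cm=50.0,
                          min_interval=0.1, max_interval=1.0):
    """Return the pause between beeps for a given distance to the code.

    Hypothetical linear mapping: the closer the code, the shorter the pause,
    so the beeps sound faster. All default values are illustrative guesses.
    """
    # Clamp the distance into the range 0..max_distance_cm.
    ratio = min(max(distance_cm / max_distance_cm, 0.0), 1.0)
    return min_interval + ratio * (max_interval - min_interval)
```

With these illustrative defaults, a code that is 10 cm away produces a shorter pause between beeps than one that is 40 cm away.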

When a barcode or QR code is detected, Seeing AI will announce the product name. If additional information about the product is available, you can tap 'More Info' to access it.

Channel 4: Scene (preview)

This channel features the latest Artificial Intelligence for describing an overall scene. This work is still experimental, so please use caution. Take a photo and hear a description of the scene it captured.

To hear a more detailed description of the photo, tap 'More Info'. Image descriptions are AI-generated, so mistakes are possible. To help Microsoft improve them, please send feedback.

Channel 5: Person

Scan your surroundings to determine the number of people nearby, their proximity, and their facial expressions.

To add a specific person, tap the 'Face Recognition' button on the main screen. Then take three photos of the individual, and a pop-up will prompt you to add a name for that person. Finally, tap the 'Add' button to save the information.

To edit or delete the data of saved faces, tap the same 'Face Recognition' button on the main screen. Here, you'll find 'Edit' and 'Delete' buttons corresponding to each face. Additionally, you can tap the 'Add' button to include new faces.

When you use the Person channel, Face Recognition can identify individuals nearby. Instead of hearing generic descriptions like "One face near center, 4 feet away," you'll hear their name announced, for example, "Mohit near center, 4 feet away".

It's advisable to seek permission from individuals before training Seeing AI to recognize them. Once you've taught the app to recognize a specific person, their name will be announced when they appear in view. To hear an estimate of a person's facial characteristics and expressions, tap the 'take picture' button on the main screen to take a photo. If you want to take a selfie, use the button on the main screen to change to the front-facing camera.

Channel 6: Currency (preview)

This option is used to recognize different currencies. Currency recognition is always improving, so please have someone you trust confirm a note's value. Hold the camera over a single note to hear the estimated value. Use the button on the main screen to select which currency should be recognized. Send feedback to Microsoft if you wish to recognize a currency that isn't listed; more will be added over time.

Please note: Seeing AI will not differentiate between real and counterfeit currency.

Channel 7: Color (preview)

Use this channel to hear the perceived color of objects.

Note that the perceived color may depend on several factors: colors appear darker when there is less light or when the object is in shadow, and a white surface may appear slightly yellowish under artificial lighting.

Channel 8: Handwriting (preview)

This experimental channel allows you to recognize handwritten text. Note that this channel requires the text to be the right way up. Recognition accuracy depends on handwriting style, which can differ greatly from one person to another. Please send feedback to Microsoft to help improve it.

Channel 9: Light

Use this channel to detect the amount of light around you. The pitch of the tone is based on how much light your phone sees. The more light there is, the higher the pitch of the tone.
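
To make the relationship between light and pitch concrete, here is a minimal Python sketch. It is an assumption for illustration only and does not describe Seeing AI's actual behavior: the function name `light_to_pitch_hz`, the linear mapping, the normalized light scale from 0.0 to 1.0, and the 200 to 2000 Hz range are all hypothetical.

```python
def light_to_pitch_hz(light_level, min_hz=200.0, max_hz=2000.0):
    """Map a light level to a tone frequency in hertz.

    light_level is a hypothetical normalized reading:
    0.0 means complete darkness and 1.0 means very bright light.
    More light gives a higher pitch, as described above.
    """
    # Clamp the reading so out-of-range values do not produce odd tones.
    level = min(max(light_level, 0.0), 1.0)
    return min_hz + level * (max_hz - min_hz)
```

In this sketch, pointing the phone at a lamp (a level near 1.0) gives a tone near the top of the range, and covering the camera (a level near 0.0) gives a tone near the bottom.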

Using Seeing AI With Other Apps

Seeing AI can also recognize and describe photos from other apps like Mail, Photos, and Twitter. Simply share the photo and select "Recognize with Seeing AI" or "Seeing AI" from the list of actions. If you cannot find this option, tap the "more apps" button, and you should find the Seeing AI app in the list.

Once processing is complete, you will see a description of the image, such as "screenshot of phone". Any text inside the image is displayed below the image; explore the screen to find it. You will also find three buttons here: 'more info', 'share', and 'explore photo'.

Tips & Tricks

Note that the camera uses battery while it is running. Seeing AI will try to save battery when it detects inactivity, but it is best to lock your phone if you won't be using the app for an extended period.