1. Introduction
Photos and images constitute the largest chunk of the Web, and many include recognisable features, such as human faces, QR codes or text. Detecting these features is computationally expensive, but would enable interesting use cases, e.g. face tagging or web URL redirection. This document deals with text detection, whereas the sister document [SHAPE-DETECTION-API] specifies the Face and Barcode detection cases and APIs.
1.1. Text detection use cases
Please see the Readme/Explainer in the repository.
2. Text Detection API
Individual browsers MAY provide Detectors indicating the availability of hardware providing accelerated operation.
2.1. Image sources for detection
Please refer to Accelerated Shape Detection in Images § 2.1 Image sources for detection
2.2. Text Detection API
TextDetector represents an underlying accelerated platform’s component for detection in images of Latin-1 text as defined in [iso8859-1]. It provides a single detect() operation on an ImageBitmapSource, the result of which is a Promise. This method must reject this Promise in the cases detailed in § 2.1 Image sources for detection; otherwise it may queue a task using the OS/Platform resources to resolve the Promise with a sequence of DetectedTexts, each one essentially consisting of a rawValue and delimited by a boundingBox and a series of Point2Ds.
[Exposed=(Window,Worker), SecureContext]
interface TextDetector {
  constructor();
  Promise<sequence<DetectedText>> detect(ImageBitmapSource image);
};
TextDetector()
- Detectors may potentially allocate and hold significant resources. Where possible, reuse the same TextDetector for several detections.
detect(ImageBitmapSource image)
- Tries to detect text blocks in the ImageBitmapSource image.
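As a non-normative illustration of the reuse advice above, the following sketch creates one detector lazily and shares it across several detections. The names `getTextDetector` and `detectAll` are hypothetical helpers introduced here for illustration; they are not part of the API.

```javascript
// Hypothetical helper state: the single detector shared by all callers.
let sharedDetector = null;

// Returns the shared TextDetector, constructing it on first use only.
function getTextDetector() {
  if (!('TextDetector' in globalThis)) {
    throw new Error('TextDetector is not available on this platform');
  }
  if (sharedDetector === null) {
    sharedDetector = new globalThis.TextDetector();
  }
  return sharedDetector;
}

// Runs detection on several images, one after another, with the same detector.
// Resolves with one array of detected text blocks per input image.
async function detectAll(images) {
  const detector = getTextDetector();
  const results = [];
  for (const image of images) {
    results.push(await detector.detect(image));
  }
  return results;
}
```

Running the detections sequentially is a design choice of this sketch, not a requirement; it simply keeps at most one detection in flight on the shared detector.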
2.2.1. DetectedText
dictionary DetectedText {
  required DOMRectReadOnly boundingBox;
  required DOMString rawValue;
  required sequence<Point2D> cornerPoints;
};
boundingBox, of type DOMRectReadOnly
- A rectangle indicating the position and extent of a detected feature aligned to the image.
rawValue, of type DOMString
- Raw string detected from the image, where characters are drawn from [iso8859-1].
cornerPoints, of type sequence<Point2D>
- A sequence of corner points of the detected feature, in clockwise direction and starting with top-left. This is not necessarily a square due to possible perspective distortions.
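As a non-normative illustration of how cornerPoints can differ from boundingBox, the sketch below computes the area enclosed by the corner points and checks whether they describe an upright rectangle. Both function names are hypothetical, and each point is assumed to expose numeric `x` and `y` members.

```javascript
// Hypothetical helper: area of the polygon described by cornerPoints,
// computed with the shoelace formula. Works for either winding direction.
function polygonArea(cornerPoints) {
  let twiceArea = 0;
  for (let i = 0; i < cornerPoints.length; i++) {
    const p = cornerPoints[i];
    const q = cornerPoints[(i + 1) % cornerPoints.length];
    twiceArea += p.x * q.y - q.x * p.y;
  }
  return Math.abs(twiceArea) / 2;
}

// Hypothetical helper: true when the corner points fill their own
// axis-aligned bounds, i.e. the detected region is an upright rectangle
// rather than a perspective-distorted quadrilateral.
function isUprightRectangle(cornerPoints, tolerance = 1e-6) {
  const xs = cornerPoints.map(p => p.x);
  const ys = cornerPoints.map(p => p.y);
  const boundsArea =
    (Math.max(...xs) - Math.min(...xs)) * (Math.max(...ys) - Math.min(...ys));
  const difference = Math.abs(boundsArea - polygonArea(cornerPoints));
  return difference <= tolerance * Math.max(1, boundsArea);
}
```

A caller might use such a check to decide whether cropping to boundingBox is enough or whether the region should first be perspective-corrected using the corner points.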
3. Examples
This section is non-normative.
Slightly modified/extended versions of these examples (and more) can be found in e.g. this codepen collection.
3.1. Platform support for a text detector
if (window.TextDetector == undefined) {
  console.error('Text Detection not supported on this platform');
}
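Because the interface is exposed to both Window and Worker, a check that does not rely on `window` may be useful in worker code. The following is a minimal sketch; `isTextDetectionSupported` is a hypothetical helper, and taking the global object as a parameter is only a convenience for exercising it outside a browser.

```javascript
// Hypothetical helper: reports whether a TextDetector constructor is present
// on the given global object (defaults to globalThis, which is available in
// both window and worker contexts).
function isTextDetectionSupported(globalObject = globalThis) {
  return typeof globalObject.TextDetector === 'function';
}

if (!isTextDetectionSupported()) {
  console.error('Text Detection not supported on this platform');
}
```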
3.2. Text Detection
let textDetector = new TextDetector();
// Assuming |theImage| is e.g. a <img> content, or a Blob.

textDetector.detect(theImage)
  .then(detectedTextBlocks => {
    for (const textBlock of detectedTextBlocks) {
      // Backticks make these template literals, so ${...} is interpolated.
      console.log(
        `text @ (${textBlock.boundingBox.x}, ${textBlock.boundingBox.y}), ` +
        `size ${textBlock.boundingBox.width}x${textBlock.boundingBox.height}`);
    }
  })
  .catch(() => {
    console.error("Text Detection failed, boo.");
  });
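The text above does not state an ordering for the detected blocks, so a page that wants a single string may choose to order them itself. The sketch below sorts blocks top to bottom, then left to right, by boundingBox and joins their rawValue strings. `joinDetectedText` is a hypothetical helper, and this ordering is an assumption that suits simple, unrotated layouts only.

```javascript
// Hypothetical helper: orders detected blocks by the top-left corner of
// their boundingBox (y first, then x) and joins their rawValue strings
// with newlines. The input array is copied, not modified.
function joinDetectedText(detectedTextBlocks) {
  return [...detectedTextBlocks]
    .sort((a, b) =>
      a.boundingBox.y - b.boundingBox.y || a.boundingBox.x - b.boundingBox.x)
    .map(block => block.rawValue)
    .join('\n');
}
```

It could be called from the `then()` handler of a detection such as the one above, for example `console.log(joinDetectedText(detectedTextBlocks))`.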