Namespace VideoProcessor

Introduction

Amazon Chime SDK for JavaScript contains easy-to-use APIs for adding frame-by-frame processing to an outgoing video stream.

Amazon Chime SDK for JavaScript defines a video processing stage as an implementation of the VideoFrameProcessor interface, which takes an array of VideoFrameBuffers, applies builder-defined processing, and outputs an array of VideoFrameBuffers. The outputs of each processor can be linked to the inputs of the next processor, with the last processor in the chain required to implement asCanvasImageSource to return a CanvasImageSource so that the resulting frames can be rendered onto an HTMLCanvasElement and transformed into a MediaStream.

To integrate video processing into a meeting session, use a VideoTransformDevice, which internally uses a VideoFrameProcessorPipeline to complete the aforementioned linking of stages and the final canvas rendering.

A typical workflow would be:

  1. Create an array of custom VideoFrameProcessors.
  2. Create a VideoTransformDevice from a Device and the array of VideoFrameProcessors.
  3. Call meetingSession.audioVideo.startVideoInput with the VideoTransformDevice.

Browser compatibility

The APIs for video processing in Amazon Chime SDK for JavaScript work in Firefox, Chrome, and Chromium-based browsers (including Electron) on desktop, Android, and iOS operating systems. A full compatibility table is below.

Browser                                                                | Minimum supported version
Firefox                                                                | 76
Chromium-based browsers and environments, including Edge and Electron  | 78
Android Chrome                                                         | 78
Safari on macOS                                                        | 13.0
iOS Safari                                                             | 16
iOS Chrome                                                             | 16
iOS Firefox (except on iPad)                                           | 16

Note that there is a known issue with VideoFrameProcessor in Safari 15 (see GitHub issue 1059); this has been fixed in Safari 16.

Video Processing APIs

VideoTransformDevice

VideoTransformDevice allows VideoFrameProcessors to be applied to a Device, providing a new object which can be passed into meetingSession.audioVideo.startVideoInput.

DefaultVideoTransformDevice is the provided implementation of VideoTransformDevice. It takes the aforementioned Device and array of VideoFrameProcessors, then uses VideoFrameProcessorPipeline under the hood and hides its complexity.

Construction and Starting Video Processing

The construction of the DefaultVideoTransformDevice will not start the camera or start processing. The method meetingSession.audioVideo.startVideoInput should be called just like for normal devices. The device controller will use the inner Device to acquire the source MediaStream and start the processing pipeline at the same frame rate. "Inner device" in this context refers to the original video stream coming from the selected camera.

The parameters to chooseVideoInputQuality are used as constraints on the source MediaStream. After the video input is chosen, meetingSession.audioVideo.startLocalVideoTile can be called to start streaming video.

import {
  DefaultVideoTransformDevice
} from 'amazon-chime-sdk-js';

const stages = [new VideoResizeProcessor(4/3)]; // construct the resize processor

const transformDevice = new DefaultVideoTransformDevice(
  logger,
  'foo', // device id string
  stages
);

await meetingSession.audioVideo.startVideoInput(transformDevice);
meetingSession.audioVideo.startLocalVideoTile();

Switching the Inner Device on VideoTransformDevice

To switch the inner Device on DefaultVideoTransformDevice, call DefaultVideoTransformDevice.chooseNewInnerDevice with a new Device. DefaultVideoTransformDevice.chooseNewInnerDevice returns a new DefaultVideoTransformDevice but preserves the state of VideoFrameProcessors. Then call meetingSession.audioVideo.startVideoInput with the new transform device.

const newInnerDevice = 'bar'; // device id of the new camera
if (transformDevice.getInnerDevice() !== newInnerDevice) {
  transformDevice = transformDevice.chooseNewInnerDevice(newInnerDevice);
}
// Start video input with the new transform device
await meetingSession.audioVideo.startVideoInput(transformDevice);

Stopping VideoTransformDevice

To stop video processing for the chosen DefaultVideoTransformDevice, call meetingSession.audioVideo.startVideoInput with a different Device (possibly another DefaultVideoTransformDevice) or call meetingSession.audioVideo.stopVideoInput to stop using previous DefaultVideoTransformDevice.

After the video processing stops, the inner Device will be released by the device controller unless the inner Device is a MediaStream provided by the application, in which case it is the application's responsibility to manage its lifecycle.

After the DefaultVideoTransformDevice is no longer used by the device controller, call DefaultVideoTransformDevice.stop to release the VideoFrameProcessors and the underlying pipeline. After stop is called, discard the DefaultVideoTransformDevice, as it is not reusable. Calling DefaultVideoTransformDevice.stop is necessary to release the internal resources.

await meetingSession.audioVideo.stopVideoInput();
transformDevice.stop();

Applications will need to stop and replace DefaultVideoTransformDevice when they want to change video processors or change the video input quality.
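
For example, a minimal sketch of stopping the current device and swapping in a different processor chain might look like the following; VideoBlurProcessor is a hypothetical custom VideoFrameProcessor, and transformDevice and logger are assumed to be defined as in the earlier examples:

// Stop the current transform device and release its resources
await meetingSession.audioVideo.stopVideoInput();
transformDevice.stop();

// Create a replacement transform device with a different processor chain
const newStages = [new VideoBlurProcessor()]; // hypothetical custom processor
const newTransformDevice = new DefaultVideoTransformDevice(
  logger,
  'foo', // device id string
  newStages
);

await meetingSession.audioVideo.startVideoInput(newTransformDevice);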

Receiving lifecycle notifications with an observer

To receive notifications of lifecycle events, a DefaultVideoTransformDeviceObserver can be added to the DefaultVideoTransformDevice and handlers added for the following:

Observer                 | Description
processingDidStart       | Called when video processing starts.
processingDidFailToStart | Called when video processing could not start due to runtime errors. In this case, developers are expected to call startVideoInput again with a valid VideoInputDevice to continue video sending.
processingDidStop        | Called when video processing stops as expected.
processingLatencyTooHigh | Called when the execution of processors slows the frame rate down by at least half.
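
As a sketch, handlers can be registered on the DefaultVideoTransformDevice through addObserver; the handler bodies below are illustrative only:

import {
  DefaultVideoTransformDeviceObserver
} from 'amazon-chime-sdk-js';

const observer: DefaultVideoTransformDeviceObserver = {
  processingDidStart: () => {
    console.log('video processing started');
  },
  processingDidFailToStart: () => {
    console.log('video processing failed to start');
    // Call startVideoInput again with a valid VideoInputDevice to continue sending video.
  },
  processingDidStop: () => {
    console.log('video processing stopped');
  },
  processingLatencyTooHigh: (latencyMs: number) => {
    console.log(`video processing is slowing the frame rate: ${latencyMs} ms per frame`);
  },
};

transformDevice.addObserver(observer);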

VideoFrameBuffer

VideoFrameBuffer is an abstract interface that can be implemented to represent images or video sources. Implementations are required to provide asCanvasImageSource, which returns a CanvasImageSource; optionally, developers can implement asCanvasElement or asTransferable so that processing algorithms can work with HTMLCanvasElements or Workers, respectively.
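
For instance, the SDK's CanvasVideoFrameBuffer, used in the processor examples below, wraps an HTMLCanvasElement in this interface. A minimal usage sketch, assuming content has already been drawn onto the canvas:

import {
  CanvasVideoFrameBuffer
} from 'amazon-chime-sdk-js';

// Wrap an existing canvas so that it can be consumed as a VideoFrameBuffer
const staticCanvas = document.createElement('canvas');
staticCanvas.width = 1280;
staticCanvas.height = 720;
// ... draw onto staticCanvas ...

const buffer = new CanvasVideoFrameBuffer(staticCanvas);
// buffer.asCanvasElement() returns the wrapped canvas for use in a processing stage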

VideoFrameProcessor

VideoFrameProcessor represents a processing stage. Internally, processors are executed in a completely serial manner; each pass finishes before the next pass begins. The input VideoFrameBuffers are the video sources, so changing properties of the buffers, such as resizing them, will likely modify the video sources themselves and should be performed with care.

Building a simple processor

The following example shows how to build a basic processor to resize the video frames. We first define an implementation of VideoFrameProcessor:

class VideoResizeProcessor implements VideoFrameProcessor { 
  constructor(private displayAspectRatio) {}

  async process(buffers: VideoFrameBuffer[]): Promise<VideoFrameBuffer[]>;
  async destroy(): Promise<void>;
}

To keep the properties of the original video, the processor has to copy the frame onto its own staging buffer in process:

class VideoResizeProcessor implements VideoFrameProcessor { 
  private targetCanvas: HTMLCanvasElement = document.createElement('canvas') as HTMLCanvasElement;
  private targetCanvasCtx: CanvasRenderingContext2D = this.targetCanvas.getContext('2d') as CanvasRenderingContext2D;
  private canvasVideoFrameBuffer = new CanvasVideoFrameBuffer(this.targetCanvas);

  private renderWidth: number = 0;
  private renderHeight: number = 0;
  private sourceWidth: number = 0;
  private sourceHeight: number = 0;
    
  async process(buffers: VideoFrameBuffer[]): Promise<VideoFrameBuffer[]>;
}

During processing, the incoming video frame is painted onto the internal canvas, as in the following abbreviated example:

async process(buffers: VideoFrameBuffer[]): Promise<VideoFrameBuffer[]> {
  const canvas = buffers[0].asCanvasElement();
  const frameWidth = canvas.width;
  const frameHeight = canvas.height;

  // error handling to skip resizing
  if (frameWidth === 0 || frameHeight === 0) {
    return buffers;
  }

  // re-calculate the cropped width and height
  .....

  // copy the frame to the intermediate canvas
  this.targetCanvasCtx.drawImage(canvas, this.dx, 0, this.renderWidth, this.renderHeight,
    0, 0, this.renderWidth, this.renderHeight);

  // replace the video frame with the resized one for subsequent processor
  buffers[0] = this.canvasVideoFrameBuffer;
  return buffers;
}

Building an overlay processor

An overlay processor is a customized processor that loads and composites an external image. Note that this example accounts for Cross-Origin Resource Sharing (CORS) when fetching the image:

class VideoLoadImageProcessor implements VideoFrameProcessor {
  // Create an HTMLCanvasElement to stage the output frame
  private targetCanvas: HTMLCanvasElement = document.createElement('canvas') as HTMLCanvasElement;
  private targetCanvasCtx: CanvasRenderingContext2D = this.targetCanvas.getContext('2d') as CanvasRenderingContext2D;
  // Create an HTMLImageElement to hold the external image
  private image: HTMLImageElement = document.createElement('img') as HTMLImageElement;
  private canvasVideoFrameBuffer = new CanvasVideoFrameBuffer(this.targetCanvas);

  constructor() {
    // Load the image from source, then render it on the canvas
    this.loadImage('https://someurl.any/page/bg.jpg', this.image).then(image => {
      this.targetCanvasCtx.drawImage(image, 0, 0, image.width, image.height);
    });
  }

  // Load an image from an external source (absolute URL) and configure CORS to make sure
  // the image is successfully fetched when it is not hosted on the same server
  private async loadImage(url: string, elem: HTMLImageElement): Promise<HTMLImageElement> {
    return new Promise((resolve, reject) => {
      elem.onload = (): void => resolve(elem);
      elem.onerror = reject;
      // configure CORS access for the fetch of the new image before setting the source
      elem.crossOrigin = 'anonymous';
      elem.src = url;
    });
  }

  async process(buffers: VideoFrameBuffer[]): Promise<VideoFrameBuffer[]> {
    const canvas = buffers[0].asCanvasElement();
    // copy the frame to the intermediate canvas
    this.targetCanvasCtx.drawImage(canvas, 0, 0);

    // replace the video frame with the composited one for the subsequent processor
    buffers[0] = this.canvasVideoFrameBuffer;
    return buffers;
  }

  async destroy(): Promise<void> {}
}

Additional Video Processing Use-Cases

Custom processor usage during meeting preview

Local video with processing applied can be previewed before transmitting to remote clients, just as with a normal device.

import {
  DefaultVideoTransformDevice
} from 'amazon-chime-sdk-js';

const stages = [new VideoResizeProcessor(4/3)]; // construct the resize processor
const videoElement = document.getElementById('video-preview');
const transformDevice = new DefaultVideoTransformDevice(
  logger,
  'foobar', // device id string
  stages
);

await meetingSession.audioVideo.startVideoInput(transformDevice);
meetingSession.audioVideo.startVideoPreviewForVideoInput(videoElement);

Custom video processor usage for content share

The API ContentShareControllerFacade.startContentShare does not currently support passing in a VideoTransformDevice or similar. However, DefaultVideoTransformDevice makes it straightforward to apply transforms to a given MediaStream and output a new MediaStream.

Note that for screen share usage we use MediaDevices.getDisplayMedia directly rather than the helper function ContentShareControllerFacade.startContentShareFromScreenCapture.

import {
  DefaultVideoTransformDevice
} from 'amazon-chime-sdk-js';

const mediaStream = await navigator.mediaDevices.getDisplayMedia({
  audio: true,
  video: true
});

const stages = [new CircularCut()]; // constructs some custom processor
const transformDevice = new DefaultVideoTransformDevice(
  logger,
  undefined, // Not needed when using transform directly
  stages
);

await meetingSession.audioVideo.startContentShare(await transformDevice.transformStream(mediaStream));

// On completion
transformDevice.stop();

The MediaStream can also be from a file input or other source.
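
As a hedged sketch, a stream could be captured from a playing <video> element and transformed the same way; HTMLMediaElement.captureStream support varies by browser, and the element id 'local-file' is a placeholder:

// Capture a MediaStream from a playing <video> element (the cast is used because
// captureStream is not declared in all TypeScript DOM typings)
const videoFileElement = document.getElementById('local-file') as HTMLVideoElement;
const fileStream = (videoFileElement as any).captureStream() as MediaStream;

await meetingSession.audioVideo.startContentShare(await transformDevice.transformStream(fileStream));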
