XVF3615 Voice Processor with Wake Word Engine

The XMOS VocalFusion® XVF3615 is a specialised version of the XVF3610 voice processor.

The XMOS XVF3615 integrates the class-leading 2-microphone far-field DSP pipeline of the XMOS XVF3610 with an Amazon wake word engine (WWE), allowing far-field voice capabilities to be integrated rapidly into AVS client implementations.

The XVF3615 builds on XMOS’s new XVF3610 2-microphone stereo linear far-field voice processor, which delivers the clearest voice capture and best user experience. The device connects to the host using USB Audio Class 1.0, so no driver installation is required, while the AEC reference signals can be delivered either via the USB interface or via an external ADC.

The high-level architecture of the voice processor is shown in the diagram below.


Fig. 38 XMOS XVF3610/3615 Audio Pipeline

The XVF3615 audio processing pipeline is identical to that of the XVF3610; details of this functionality can be found in the XVF3610 Product Description.

The wake word engine is integrated into the processor firmware and monitors the output audio stream for a specific keyword, defined by the wake word model that is built into the executable image. When a wake word is detected, the WWE can signal this to the host processor via an output pin, a USB HID event, or a polled counter. Details of these mechanisms are provided in a later section.
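
As an illustration of the polled counter mechanism, the following minimal Python sketch shows how a host might turn successive counter reads into a count of new detections. It is not taken from the XVF3615 firmware or host tools: the `read_counter` callable, the 8-bit counter width and the function names are all assumptions made for this example, and the actual control interface is described in the later section.

```python
import time
from typing import Callable, Iterator


def new_detections(previous: int, current: int, counter_bits: int = 8) -> int:
    """Return how many detections occurred between two counter reads.

    The difference is taken modulo 2**counter_bits so that a counter which
    wraps from its maximum value back to zero is still handled correctly.
    The 8-bit default width is an assumption, not a documented value.
    """
    modulus = 1 << counter_bits
    return (current - previous) % modulus


def poll_wake_word(
    read_counter: Callable[[], int],  # hypothetical: returns the device counter
    polls: int,
    interval_s: float = 0.0,
    counter_bits: int = 8,
) -> Iterator[int]:
    """Yield the number of new wake word detections seen at each poll."""
    last = read_counter()  # baseline read; earlier detections are ignored
    for _ in range(polls):
        time.sleep(interval_s)
        current = read_counter()
        yield new_detections(last, current, counter_bits)
        last = current


if __name__ == "__main__":
    # Simulated counter readings stand in for a real device read.
    readings = iter([10, 10, 11, 11, 13])
    for count in poll_wake_word(lambda: next(readings), polls=4):
        if count:
            print(f"wake word detected {count} time(s) since last poll")
```

Comparing counter values rather than testing a flag means that detections occurring between two polls are still counted, provided the counter does not wrap more than once within a polling interval.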