Audio Detection

The audio detection functionality provided as part of DMSDK can be used to detect and retrieve information from Digimarc Barcode for Audio on any Apple platform. Detection can be performed on a live audio stream from a source such as a microphone. DMSDK can also process pre-recorded audio sources, such as an audio file.

Detection of Digimarc Barcode for Audio is also extremely CPU efficient. The detector performs well on devices of all sizes, from a MacBook Pro, to an iPhone, and even an Apple Watch. Detection is entirely local, and does not require an active internet connection. Applications with web integration can cache detections made offline and process them later, when the device is online.

DMSDK works best with audio sources sampled at 16 kHz or higher. Only single-channel audio is required, but multichannel input is supported.

For more information on Digimarc Barcode for Audio, and sample media, visit the Digimarc Barcode for Audio web page.

Integrating with AVCaptureSession

If your application already detects Digimarc Barcode for Print or Packaging using a camera and an AVCaptureSession, it’s easy to add audio detection to the capture session. Once an audio device is added to the capture session, a developer can use a DMSAudioCaptureReader/AudioCaptureReader to also process audio.

AVCaptureSession only supports audio capture on iOS, iPadOS, and in native Mac applications. For detecting Digimarc Barcode for Audio in other environments, such as watchOS, Mac Catalyst, and the iPhone Simulator, refer to the next section on integrating with AVAudioEngine. AVCaptureSession is present in the iPhone Simulator and Mac Catalyst, but does not recognize audio devices at this time.

To begin processing audio in an existing capture session, make sure an audio device is attached to the capture session. An Audio Capture Reader can then be configured and added to the session.
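
If the session doesn’t already include an audio input, one can be attached like any other AVFoundation device. The following is a minimal sketch, assuming microphone permission has already been granted (error handling omitted):

Swift
import AVFoundation

//Attach the default microphone if the session doesn't already have audio input
if let audioDevice = AVCaptureDevice.default(for: .audio),
   let audioInput = try? AVCaptureDeviceInput(device: audioDevice),
   captureSession.canAddInput(audioInput)
{
    captureSession.addInput(audioInput)
}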

Swift
//captureSession is an existing capture session with a camera device
captureSession.beginConfiguration()

//Create the audio capture reader
let audioCaptureReader = try AudioCaptureReader(symbologies: [.audioDigimarc])
//The audio capture reader needs a delegate for returning results
//Queue is the dispatch queue results will be returned on
audioCaptureReader.setResultsDelegate(self, queue: DispatchQueue.main)

//Add the audio capture reader to the capture session
captureSession.addOutput(audioCaptureReader.captureOutput)

//Commit the configuration
captureSession.commitConfiguration()

Obj-C
NSError *error = nil;
//captureSession is an existing capture session with a camera device
[captureSession beginConfiguration];

//Create the audio capture reader
DMSAudioCaptureReader *audioCaptureReader = [[DMSAudioCaptureReader alloc] initWithSymbologies:DMSSymbologyAudioDigimarc options:@{} error:&error];
//The audio capture reader needs a delegate for returning results
//Queue is the dispatch queue results will be returned on
[audioCaptureReader setResultsDelegate:self queue:dispatch_get_main_queue()];

//Add the audio capture reader to the capture session
[captureSession addOutput:audioCaptureReader.captureOutput];

//Commit the configuration
[captureSession commitConfiguration];

Future versions of DMSDK may add support for recognizing different kinds of audio. While the Audio Digimarc symbology is the only valid option at this time, developers should still specify it as the type they are interested in detecting.

After capture is set up, a developer should then add the delegate callback into their application to receive results.

Swift
func audioCaptureReader(_ audioCaptureReader: AudioCaptureReader, didOutputResult result: ReaderResult)
{
    // check result.payloads or result.newPayloads for detected payloads
    if result.newPayloads.count > 0
    {
        //handle results here
        print("Result: \(result.newPayloads)")
    }
}
Obj-C
-(void)audioCaptureReader:(DMSAudioCaptureReader *)audioCaptureReader didOutputResult:(DMSReaderResult *)result {
     // check result.payloads or result.newPayloads for detected payloads
     if(result.newPayloads.count > 0) {
          //handle results here
          NSLog(@"Result: %@", result.newPayloads);
     }
}

Result is a DMSReaderResult/ReaderResult type that will contain zero or more Payloads. Each Payload represents a detection. A Reader Result with zero Payloads means that audio was processed but nothing was detected at this time; the callback may be called again later with a result.

DMSDK contains a variety of options for extracting information from Payloads. Payloads can also be delivered to a Digimarc cloud service that can return relevant web content and more. For more information, see the Payload Handling section.

A DMSDemo sample application is also included that demonstrates a complete implementation of an Audio Capture Reader.

Integrating with AVAudioEngine

AVAudioEngine is common in applications that are built specifically for audio, or that target a wider range of platforms. DMSDK can integrate with AVAudioEngine as well. Input from live microphone streams and from prerecorded content in audio files and buffers is supported.

DMSDK provides a DMSAudioReader/AudioReader class that can perform detections on an input of sequential AVAudioBuffers. Setup of an Audio Reader is simple.

Swift
self.audioReader = try AudioReader(symbologies: [.audioDigimarc])
Obj-C
NSError *error = nil;
DMSAudioReader *audioReader = [[DMSAudioReader alloc] initWithSymbologies:DMSSymbologyAudioDigimarc 
                                                      options:@{} 
                                                      error:&error];

It’s important to keep a reference to the Audio Reader. To produce reliable results, the same Audio Reader instance must be used for the entire sequence of audio buffers.
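
An Audio Reader can also consume prerecorded content. The following is a minimal sketch that feeds an AVAudioFile to the reader’s process(audioBuffer:) function in sequential chunks; fileURL is a placeholder for the location of a local audio file:

Swift
import AVFoundation

let file = try AVAudioFile(forReading: fileURL)
guard let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat,
                                    frameCapacity: 4096) else { return }

//Feed the file to the reader in order, one chunk at a time
while file.framePosition < file.length {
    try file.read(into: buffer)
    let result = try self.audioReader.process(audioBuffer: buffer)
    if result.payloads.count > 0 {
        //process the payloads here
    }
}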

To obtain AVAudioBuffers from an AVAudioEngine tree, install an audio tap, and pass the supplied buffers to the Audio Reader for processing. Processing is performed synchronously within the tap, so buffering and threading should be considered if there are performance concerns.

Swift
let inputNode = self.audioEngine.inputNode
let format = inputNode.inputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 2000, format: format) { (buffer, when) in
   let result = try! self.audioReader.process(audioBuffer: buffer)
   if result.payloads.count > 0 {
      //process the payloads here
   }
}
Obj-C
AVAudioInputNode *inputNode = [engine inputNode];
AVAudioFormat *format = [inputNode inputFormatForBus:0];
[inputNode installTapOnBus:0 bufferSize:2000 format:format block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
    NSError *processingError = nil;
    DMSReaderResult *result = [audioReader processAudioBuffer:buffer error:&processingError];
    if(result.payloads.count > 0) {
        //process the payloads here
    }
}];
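
If synchronous processing inside the tap becomes a performance concern, one approach is to hand buffers off to a serial dispatch queue; a serial queue also keeps the buffers in the order the Audio Reader requires. A minimal sketch continuing the Swift example above (the queue label is a placeholder):

Swift
let processingQueue = DispatchQueue(label: "com.example.audio-detection")

inputNode.installTap(onBus: 0, bufferSize: 2000, format: format) { (buffer, when) in
    //Hand the buffer off so the tap returns immediately; the closure
    //retains the buffer until processing completes
    processingQueue.async {
        if let result = try? self.audioReader.process(audioBuffer: buffer),
           result.payloads.count > 0 {
            //process the payloads here
        }
    }
}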

Audio Readers only support a single audio stream, and are not thread safe. Multiple Audio Readers should be used for multiple streams of audio.

If an audio stream is interrupted, or an Audio Reader is about to be used for a new series of sequential audio buffers, the reset function will prepare the Audio Reader for a new stream.

Swift
open func reset()
Obj-C
-(void)reset;
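
On iOS, a natural place to call reset is in response to an audio session interruption. A minimal sketch, assuming the app uses AVAudioSession (a real implementation would inspect the interruption type in the notification):

Swift
import AVFoundation

//Reset the reader when the audio session is interrupted, so the next
//buffers are treated as the start of a new stream
NotificationCenter.default.addObserver(forName: AVAudioSession.interruptionNotification,
                                       object: nil,
                                       queue: .main) { _ in
    self.audioReader.reset()
}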

AVAudioSinkNode can also be used to attach an Audio Reader to an AVAudioEngine. This class is available in Apple’s newer operating system releases; see Apple’s documentation for more information. A sketch combining AVAudioSinkNode with the Core Audio processing function appears at the end of the next section.

Detection works best with the original, unaltered enhanced audio content. Developers who do any audio processing in their AVAudioEngine node tree should carefully consider where the audio tap for Digimarc Barcode for Audio detection is installed.
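
For example, in a hypothetical graph that routes a player through an EQ effect, the tap belongs on the node carrying the unprocessed signal (playerNode and eq are illustrative names):

Swift
import AVFoundation

let engine = AVAudioEngine()
let playerNode = AVAudioPlayerNode()
let eq = AVAudioUnitEQ(numberOfBands: 1)

//Hypothetical graph: playerNode -> eq -> mainMixerNode
engine.attach(playerNode)
engine.attach(eq)
engine.connect(playerNode, to: eq, format: nil)
engine.connect(eq, to: engine.mainMixerNode, format: nil)

//Tap the player's output, before the EQ alters the audio
playerNode.installTap(onBus: 0, bufferSize: 2000,
                      format: playerNode.outputFormat(forBus: 0)) { (buffer, when) in
    //pass the buffer to the Audio Reader here
}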

Integrating with Core Audio

Applications processing audio using AudioBufferList and AudioStreamBasicDescription are also supported by DMSAudioReader/AudioReader.

An Audio Reader integrated with Core Audio behaves much like an Audio Reader integrated with AVAudioEngine. However, a different function is used to process Core Audio buffers.

Swift
open func process(audioBufferList bufferList: UnsafePointer<AudioBufferList>, streamDescription: UnsafePointer<AudioStreamBasicDescription>, frameCount: UInt32) throws -> ReaderResult
Obj-C
-(nullable DMSReaderResult *)processAudioBufferList:(const AudioBufferList *)bufferList streamDescription:(const AudioStreamBasicDescription *)streamDescription frameCount:(UInt32)frameCount error:(NSError  **)error NS_SWIFT_NAME(process(audioBufferList:streamDescription:frameCount:));

See the DMSAudioReader documentation for more information.
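
As an illustration, this processing function pairs naturally with the AVAudioSinkNode approach mentioned earlier, since a sink node delivers AudioBufferList pointers directly. The following is a minimal sketch, assuming an existing engine and audioReader:

Swift
import AVFoundation

//streamDescription comes from the format of the audio feeding the sink node
let format = engine.inputNode.outputFormat(forBus: 0)

let sinkNode = AVAudioSinkNode { (timestamp, frameCount, audioBufferList) -> OSStatus in
    if let result = try? self.audioReader.process(audioBufferList: audioBufferList,
                                                  streamDescription: format.streamDescription,
                                                  frameCount: frameCount),
       result.payloads.count > 0 {
        //process the payloads here
    }
    return noErr
}

engine.attach(sinkNode)
engine.connect(engine.inputNode, to: sinkNode, format: nil)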

  • Audio readers synchronously process audio buffers and will return an array of payload classes that reflect the results found. An audio reader takes audio buffers, such as an AVAudioPCMBuffer or an AudioBufferList, as input.

    Declaration

    Objective-C

    @interface DMSAudioReader : DMSReader

    Swift

    class AudioReader : Reader

  • The audio capture reader interface is ideal for real-time capture from an audio source such as a microphone. It vends a captureOutput that can be hooked into an existing AVFoundation capture tree. An audio capture reader will automatically perform the buffering and optimization necessary for a capture source. For situations in which a developer wants deeper control over how audio is buffered or staged for detection, the synchronous audio reader interface should be used.

    Declaration

    Objective-C

    @interface DMSAudioCaptureReader : DMSCaptureReader

    Swift

    class AudioCaptureReader : CaptureReader