Configuration¶

Configuration settings are both readable and writable and are part of task models; they are saved to Stream by dup and save, and restored by load.

Use these to change or fine-tune model behavior. Models have reasonable defaults, so there's usually no need to modify these settings.

The most frequently used settings are operating-point for wake words and command sets, leading-silence and trailing-silence for VAD templates, partial-result-interval for LVCSR and STT, and stt-profile for STT models.

Use the Session get and set functions that match the type of the setting. Use getInt, for example, to read the int value for operating-point.

0.¶

configuration stream read-write

C/C++Java

#define SNSR_SLOT_0 "0."

public class Snsr {
  public final static String SLOT_0 = "0.";
}

Template slot 0.

The first slot in a template task.

Template slots expect a Stream opened on a .snsr model file.

You can also use this string value as an argument with slot.

slot

1.¶

configuration stream read-write

C/C++Java

#define SNSR_SLOT_1 "1."

public class Snsr {
  public final static String SLOT_1 = "1.";
}

Template slot 1.

The second slot in a template task.

Template slots expect a Stream opened on a .snsr model file.

You can also use this string value as an argument with slot.

slot

ac-prune-top-k¶

configuration int read-write tnl 7.5.0

C/C++Java

#define SNSR_AC_PRUNE_TOP_K "ac-prune-top-k"

public class Snsr {
  public final static String AC_PRUNE_TOP_K = "ac-prune-top-k";
}

Reduce LVCSR decoder CPU use.

This setting trades CPU use for recognition accuracy.

A subset of the recognizers created by VoiceHub, those optimized for low resource use, allow reducing the CPU cycles used in search decoding at the expense of an increased recognition error rate.

Set to 0 to disable.

accuracy¶

configuration double read-write

C/C++Java

#define SNSR_ACCURACY "accuracy"

public class Snsr {
  public final static String ACCURACY = "accuracy";
}

Enrollment accuracy.

Trades accuracy of the enrolled model for enrollment speed. The default accuracy is 1.0, for the best accuracy at the slowest enrollment speed. Valid range is 0.0 to 1.0.

^adapted, ^enrolled

am-size¶

configuration double read-only stt

C/C++Java

#define SNSR_RES_AM_SIZE "am-size"

public class Snsr {
  public final static String RES_AM_SIZE = "am-size";
}

Size of STT acoustic model, in bytes.

Note

Not supported for all STT models.

lm-size, nlu-size, slm-size

audio-stream-size¶

configuration int read-write

C/C++Java

#define SNSR_AUDIO_STREAM_SIZE "audio-stream-size"

public class Snsr {
  public final static String AUDIO_STREAM_SIZE = "audio-stream-size";
}

Input audio buffer size.

The number of audio samples kept in a circular audio history buffer, accessible through audio-stream.

Use this buffer to retrieve segmented audio using alignments (begin-sample, begin-ms, end-sample, end-ms) obtained in the ^result.

Set to 0 to disable audio buffering.
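Because audio-stream-size is expressed in samples, a history length in milliseconds has to be converted using the model sample rate. The helper below is a minimal sketch of that arithmetic. The function name is hypothetical and not part of the SDK, and the sample rate should come from the model (samples-per-second) rather than being hard-coded.

```c
#include <stdint.h>

/* Hypothetical helper, not an SDK function.
 * Returns the number of samples needed to hold duration_ms of audio
 * at sample_rate Hz, rounded up. Returns 0 (buffering disabled) for
 * non-positive arguments. */
static int64_t audio_stream_size_for(int64_t duration_ms, int64_t sample_rate)
{
  if (duration_ms <= 0 || sample_rate <= 0) return 0;
  return (duration_ms * sample_rate + 999) / 1000;
}
```

For example, ten seconds of history at an assumed 16000 Hz works out to 160000 samples.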

audio-stream

backlog-interval¶

configuration double read-write 7.7.0

C/C++Java

#define SNSR_BACKLOG_INTERVAL "backlog-interval"

public class Snsr {
  public final static String BACKLOG_INTERVAL = "backlog-interval";
}

Partial result update interval used while processing an audio backlog.

This setting overrides partial-result-interval when recognizing audio that precedes a wake word, enabled by setting wake-word-at-end to 1 or 2. By default backlog-interval = 0 for the lowest recognition result latency.

partial-result-interval, wake-word-at-end, tpl-opt-spot-vad-lvcsr-type, tpl-spot-vad-lvcsr-type

backoff¶

configuration int read-write

C/C++Java

#define SNSR_BACKOFF "backoff"

public class Snsr {
  public final static String BACKOFF = "backoff";
}

Start point back-off in ms.

Audio margin added before the start point found by a VAD.

begin-ms, begin-sample

cache-file¶

configuration string read-write

C/C++Java

#define SNSR_CACHE_FILE "cache-file"

public class Snsr {
  public final static String CACHE_FILE = "cache-file";
}

Continuous Adaptation cache file name.

When set, enrolled user data will be saved to, and loaded from, this file. If not set, enrolled user data are discarded when the spotter session is released.

This setting is only available in fixed-phrase spotters that support continuous adaptation.

If you need more control over how or when the enrollment context is saved you can do this from the ^adapted callback handler.

^adapted

complete-only¶

configuration int read-write tnl

C/C++Java

#define SNSR_COMPLETE_ONLY "complete-only"

public class Snsr {
  public final static String COMPLETE_ONLY = "complete-only";
}

Controls whether incomplete LVCSR results are accepted.

The text result available in the ^result callback for LVCSR recognizers reports the recognition result that best matches the acoustic evidence the recognizer saw. The default behavior is to show incomplete results, even if they are not accepted by the grammar specification. For example, if a custom recognizer uses

grammar = <s> 1 2 3 4 5 6 7 8 9 10 </s>;

and the audio contains only "1 2 3 4", then the final result will be "1 2 3 4".

If this behavior is not desirable, setting complete-only to 1 will suppress such incomplete results. The ^result callback will still happen, but text will be <no-match/>. The ^nlu-intent and ^nlu-slot events will not be invoked.

^result, text

ctx-enroll¶

configuration int read-write

C/C++Java

#define SNSR_ENROLLMENT_CONTEXT "ctx-enroll"

public class Snsr {
  public final static String ENROLLMENT_CONTEXT = "ctx-enroll";
}

Number of enrollments with trailing context.

The recommended number of enrollments where the phrase is followed by additional speech. For example: "Hey Sensory will it rain tomorrow?"

custom-vocab¶

configuration string read-write stt

C/C++Java

#define SNSR_CUSTOM_VOCAB "custom-vocab"

public class Snsr {
  public final static String CUSTOM_VOCAB = "custom-vocab";
}

Custom STT vocabulary.

STT recognizers occasionally do not have full vocabulary coverage for low-frequency words, proper names, trademarks, and similar terms. Use this custom vocabulary setting to add new words to a recognizer.

Note

Use custom vocabulary to address minor recognition issues. For more than a couple of hundred entries you'll get better performance with a domain-specific STT model. Please contact your Sensory sales representative to explore options.

Map format:

output word or phrase [, incorrect result [, incorrect result [, ...]]]
...

New vocabulary word or phrase,
followed by zero or more mis-recognized examples, each prefixed with a , separator.
Vocabulary entries are separated by \r, \n or ;

Example custom vocabulary

custom-vocab.txt

voice genie #(1)!
voice genie, voice jenny #(2)!
armadillo, i'm adello, amadello #(3)!

1. New vocabulary phrase without any mis-recognized alternates. If "voice genie" is one of the alternates the recognizer is considering, this entry increases the likelihood that it is selected as the result.
2. If the STT engine recognizes "voice jenny", it is rewritten to "voice genie".
3. If the STT engine recognizes "i'm adello" or "amadello", both are rewritten as "armadillo" in the result.
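To make the map format concrete, the sketch below parses a custom-vocab string and looks up the output phrase for a mis-recognized result. It only illustrates the entry and field separators described above. It is not how the STT engine applies the map internally, the function names are hypothetical, and the exact, case-sensitive comparison is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Trim leading and trailing spaces and tabs from [*begin, *end). */
static void trim(const char **begin, const char **end)
{
  while (*begin < *end && (**begin == ' ' || **begin == '\t')) ++*begin;
  while (*end > *begin && ((*end)[-1] == ' ' || (*end)[-1] == '\t')) --*end;
}

/* Hypothetical helper: search the mis-recognized alternates in map for
 * heard. On a match, copy that entry's output phrase into out and
 * return 1. Return 0 if nothing matches or out is too small. */
static int vocab_rewrite(const char *map, const char *heard,
                         char *out, size_t out_size)
{
  const char *entry = map;
  size_t heard_len = strlen(heard);
  if (heard_len == 0) return 0;

  while (*entry) {
    /* Entries are separated by \r, \n or ; */
    const char *entry_end = entry + strcspn(entry, "\r\n;");
    /* The first comma-separated field is the output word or phrase. */
    const char *field_end = entry + strcspn(entry, ",\r\n;");
    const char *word = entry, *word_end = field_end;
    trim(&word, &word_end);

    /* Every remaining field is a mis-recognized alternate. */
    while (field_end < entry_end) {
      const char *alt = field_end + 1;           /* skip the ',' */
      field_end = alt + strcspn(alt, ",\r\n;");
      const char *alt_end = field_end;
      trim(&alt, &alt_end);
      if ((size_t)(alt_end - alt) == heard_len &&
          strncmp(alt, heard, heard_len) == 0) {
        size_t n = (size_t)(word_end - word);
        if (n == 0 || n >= out_size) return 0;
        memcpy(out, word, n);
        out[n] = '\0';
        return 1;
      }
    }
    entry = *entry_end ? entry_end + 1 : entry_end;
  }
  return 0;
}
```

With the three-line custom-vocab.txt above, looking up "amadello" yields "armadillo", while "voice genie" itself yields no rewrite because it never appears as an alternate.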

Example

% snsr-eval -t model/stt-enUS-automotive-medium-2.3.15-pnc.snsr \
            -s partial-result-interval=0 \
            data/enrollments/armadillo-1-4-c.wav
NLU intent: no_command = an anlla record a video
400   1720 an anlla record a video

% snsr-eval -t model/stt-enUS-automotive-medium-2.3.15-pnc.snsr \
            -s partial-result-interval=0 \
            -s 'custom-vocab="armadillo, an anlla; jackalope"' \
            data/enrollments/armadillo-1-4-c.wav
NLU intent: no_command = armadillo record a video
400   1720 armadillo record a video

debug-log-file¶

configuration string read-write

C/C++Java

#define SNSR_DEBUG_LOG_FILE "debug-log-file"

public class Snsr {
  public final static String DEBUG_LOG_FILE = "debug-log-file";
}

Debug log filename.

The name of the log file tpl-spot-debug writes to. This value is required, and no default is defined in the template. The directory the log file is in must exist, and must be writable.

These optional and mutually exclusive character sequences are substituted with the time stamp when the log file is first opened:

%@ - year-month-day_hour-minute-second.milliseconds (UTC)
%# - milliseconds since the epoch.

Example

C/C++Java

snsrSetString(session, SNSR_DEBUG_LOG_FILE, "debug-%#.log");

session.setString(Snsr.DEBUG_LOG_FILE, "debug-%#.log");
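To preview what a log file name might look like after substitution, the sketch below expands %# and %@ for a given time stamp. It is an illustration only: the function name is hypothetical, and the zero-padded digit layout used for %@ is an assumption based on the description above, not the SDK's confirmed output.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper, not an SDK function.
 * Expands the first "%#" (milliseconds since the epoch) or "%@"
 * (UTC date and time) in pattern, using epoch_ms as the time stamp.
 * Returns 0 on success, -1 if the result does not fit in out. */
static int expand_log_name(const char *pattern, long long epoch_ms,
                           char *out, size_t out_size)
{
  char stamp[64] = "";
  const char *mark = strstr(pattern, "%#");
  int n;

  if (mark) {
    snprintf(stamp, sizeof stamp, "%lld", epoch_ms);
  } else if ((mark = strstr(pattern, "%@")) != NULL) {
    time_t secs = (time_t)(epoch_ms / 1000);
    struct tm *utc = gmtime(&secs);
    char date[32];
    /* Assumed layout: year-month-day_hour-minute-second */
    if (!utc || !strftime(date, sizeof date, "%Y-%m-%d_%H-%M-%S", utc))
      return -1;
    snprintf(stamp, sizeof stamp, "%s.%03d", date, (int)(epoch_ms % 1000));
  }

  if (mark) {
    /* Text before the marker, the time stamp, then text after it. */
    n = snprintf(out, out_size, "%.*s%s%s",
                 (int)(mark - pattern), pattern, stamp, mark + 2);
  } else {
    n = snprintf(out, out_size, "%s", pattern);
  }
  return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

A pattern without either sequence is copied unchanged.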

delay¶

configuration int read-only 6.16.0

C/C++Java

#define SNSR_SPOT_DELAY "delay"

public class Snsr {
  public final static String SPOT_DELAY = "delay";
}

Phrase spotter delay in ms.

Deprecated

Support for this setting will be removed from the next major release of the TrulyNatural SDK.

First deprecated in release 6.16.0 (2021-06-06) and made read-only in 7.0.0 (2023-11-20).

The cumulative recognition score for a wake word or command recognizer can exceed the decision threshold before the end of the utterance. This setting controls how long the recognizer will wait while the recognition score is still increasing before reporting the event.

Longer delays can increase the time alignment accuracy of the end of the spotted phrase.

duration-ms¶

configuration double read-write

C/C++Java

#define SNSR_DURATION_MS "duration-ms"

public class Snsr {
  public final static String DURATION_MS = "duration-ms";
}

Low false-reject listening window.

Selects the time window, in ms, following a close false-reject during which smart wake words use low-fr-operating-point instead of operating-point.

Defaults to 10 seconds if not explicitly set.
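The interaction between duration-ms, operating-point, and low-fr-operating-point can be sketched as a small selection function. This is a simplified model of the behavior described above, not the SDK implementation. The function and parameter names are hypothetical, and detecting a "close false-reject" is assumed to happen elsewhere.

```c
/* Hypothetical model, not an SDK function.
 * Returns the operating point in effect at now_ms.
 * last_close_reject_ms is the time of the most recent close
 * false-reject, or a negative value if there has been none. */
static int effective_operating_point(double now_ms,
                                     double last_close_reject_ms,
                                     double duration_ms,
                                     int operating_point,
                                     int low_fr_operating_point)
{
  if (last_close_reject_ms >= 0.0 &&
      now_ms >= last_close_reject_ms &&
      now_ms - last_close_reject_ms < duration_ms) {
    return low_fr_operating_point;   /* inside the listening window */
  }
  return operating_point;            /* window expired or never opened */
}
```

With the default 10-second window, a close false-reject at 1 s means the low false-reject point applies at 5 s but no longer at 12 s.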

low-fr-operating-point, operating-point

enrollment-task-index¶

configuration int read-write

C/C++Java

#define SNSR_ENROLLMENT_TASK_INDEX "enrollment-task-index"

public class Snsr {
  public final static String ENROLLMENT_TASK_INDEX = "enrollment-task-index";
}

The index of the sub-task to enroll.

For enrollment tasks that contain multiple sub-tasks (for example, a user-defined trigger and an enrolled fixed trigger), this integer value selects which of the sub-tasks the enrollments should be applied to.

See the documentation delivered with the task file for the sub-task mapping.

Note

For most enrollment tasks the only supported task index is 0.

fex-hash¶

configuration string read-only pre-release

C/C++Java

#define SNSR_FEATURE_HASH "fex-hash"

public class Snsr {
  public final static String FEATURE_HASH = "fex-hash";
}

Feature extractor hash.

Pre-release

This is an experimental feature. Do not use unless recommended by Sensory.

This is a unique string that identifies the feature type used by the task.

hold-over¶

configuration int read-write

C/C++Java

#define SNSR_HOLD_OVER "hold-over"

public class Snsr {
  public final static String HOLD_OVER = "hold-over";
}

Endpoint hold-over.

Audio margin added after the endpoint found by a VAD. This is the amount of trailing silence to include in the segmentation.
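Together with backoff, hold-over widens the segment the VAD reports. The sketch below shows the arithmetic under the assumption that both margins are in ms (the text above states this only for backoff) and that the start is clamped at the beginning of the audio. The type and function names are hypothetical.

```c
/* Hypothetical types and helper, not part of the SDK. */
typedef struct {
  long begin_ms;
  long end_ms;
} Segment;

/* Widen the raw VAD segment [start_ms, end_ms] by the backoff margin
 * before the start point and the hold-over margin after the endpoint. */
static Segment apply_margins(long start_ms, long end_ms,
                             long backoff_ms, long hold_over_ms)
{
  Segment s;
  s.begin_ms = start_ms > backoff_ms ? start_ms - backoff_ms : 0;
  s.end_ms = end_ms + hold_over_ms;
  return s;
}
```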

end-ms, end-sample

include-leading-silence¶

configuration int read-write

C/C++Java

#define SNSR_INCLUDE_LEADING_SILENCE "include-leading-silence"

public class Snsr {
  public final static String INCLUDE_LEADING_SILENCE = "include-leading-silence";
}

Include leading silence in VAD output.

Set to 1 to include all audio up to the endpoint in the <-audio-pcm output stream. Set to 0 to return to the default behavior, which discards leading silence.

If this setting is used with a spot-VAD template such as tpl-spot-vad, tpl-spot-vad-lvcsr, or tpl-opt-spot-vad-lvcsr the leading silence includes the trigger phrase.

include-wake-word-audio, <-audio-pcm, pass-through

include-model¶

configuration int read-write

C/C++Java

#define SNSR_INCLUDE_MODEL "include-model"

public class Snsr {
  public final static String INCLUDE_MODEL = "include-model";
}

Debug log includes a copy of the model.

This boolean value controls whether the debug-log-file includes a copy of the task model (the .snsr file).

The default value is 1. Set include-model=0 for smaller (but less complete) debug log files.

debug-log-file

include-wake-word-audio¶

configuration int read-write 7.6.0 tnl

C/C++Java

#define SNSR_INCLUDE_WAKE_WORD_AUDIO "include-wake-word-audio"

public class Snsr {
  public final static String INCLUDE_WAKE_WORD_AUDIO = "include-wake-word-audio";
}

Include the wake word audio in VAD output.

When set to 1, VAD templates tpl-spot-vad, tpl-spot-vad-lvcsr, and tpl-opt-spot-vad-lvcsr include the wake word in the audio output. Set to 0 to return to the default behavior, where the output does not include the wake word audio.

Note

This setting is a synonym for include-leading-silence when used with these templates. If you set both include-wake-word-audio and include-leading-silence, include-wake-word-audio takes precedence.

include-leading-silence, <-audio-pcm, pass-through

interactive¶

configuration int read-write

C/C++Java

#define SNSR_INTERACTIVE_MODE "interactive"

public class Snsr {
  public final static String INTERACTIVE_MODE = "interactive";
}

Interactive enrollment mode.

This changes the enrollment task behavior: When set to 0, enrollment for the current phrase will continue until the end of the stream.

^adapted, ^enrolled

leading-silence¶

configuration int read-write

C/C++Java

#define SNSR_LEADING_SILENCE "leading-silence"

public class Snsr {
  public final static String LEADING_SILENCE = "leading-silence";
}

VAD leading silence time-out, in ms.

The VAD will invoke the ^silence event handler if no speech is detected during the first leading-silence ms of processed audio.

^silence, trailing-silence

listen-window¶

configuration int read-write

C/C++Java

#define SNSR_LISTEN_WINDOW "listen-window"

public class Snsr {
  public final static String LISTEN_WINDOW = "listen-window";
}

Phrase spot listening window in seconds or milliseconds.

This is the duration that a spotter will listen for a command before timing out. Spotters with short listening windows are typically optimized to have lower false reject, but higher false accept rates.

If this value is 120 or less it is in seconds. Values larger than 120 are in ms. In wake word spotters tuned for continuous listening this value is 0.

Note

This value is only used when:

Converting models to DSP format for embedded use.
Using the spotter in slot 1 of the tpl-spot-sequential spotter template model.

In all other cases spotters listen continuously, regardless of the value of listen-window.
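Since the unit of listen-window depends on its magnitude, code that reads the value may want to normalize it. The helper below is a minimal sketch of the rule stated above (120 or less means seconds, larger values mean ms, 0 means continuous listening). The function name is hypothetical.

```c
/* Hypothetical helper, not an SDK function.
 * Converts a listen-window value to milliseconds.
 * Returns 0 for continuous listening. */
static long listen_window_to_ms(int listen_window)
{
  if (listen_window <= 0) return 0;                     /* continuous */
  if (listen_window <= 120) return listen_window * 1000L; /* seconds */
  return listen_window;                                 /* already ms */
}
```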

What is a Command Set?

lm-size¶

configuration double read-only stt

C/C++Java

#define SNSR_RES_LM_SIZE "lm-size"

public class Snsr {
  public final static String RES_LM_SIZE = "lm-size";
}

Size of STT language model, in bytes.

Note

Not supported for all STT models.

am-size, nlu-size, slm-size

loop¶

configuration int read-write

C/C++Java

#define SNSR_LOOP "loop"

public class Snsr {
  public final static String LOOP = "loop";
}

Control template looping behavior.

In tpl-spot-sequential, setting this value to 1 changes when the listening focus returns to slot 0. Instead of immediately returning to slot 0 after a spot in slot 1, it resets the expiration timer, and only a timeout returns to slot 0.

This allows for a wake word followed by zero or more commands from a command set. The default behavior (loop = 0) is to allow at most one command before requiring another wake word utterance.

7.6.0 Setting loop = 2 pins the listening focus to slot 1. Use this, for example, if an application needs to gate a command set recognizer with a wake word or an external event such as a push-to-talk button.
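The three loop values can be summarized as a focus-transition function. This is a simplified model of the behavior described above, with hypothetical names. It assumes that loop = 2 keeps the focus on slot 1 even after a timeout, which follows from "pins" but is not spelled out in the text.

```c
/* Hypothetical model of tpl-spot-sequential focus changes. */
typedef enum { EVENT_SPOT, EVENT_TIMEOUT } Event;

/* Returns the slot (0 or 1) that has the listening focus after event
 * occurs while focus is the active slot, for a given loop setting. */
static int next_focus(int focus, Event event, int loop)
{
  if (loop == 2) return 1;                  /* pinned to slot 1 */
  if (focus == 0) {
    return event == EVENT_SPOT ? 1 : 0;     /* wake word moves to slot 1 */
  }
  if (event == EVENT_TIMEOUT) return 0;     /* timeout always returns */
  /* Spot in slot 1: loop = 1 stays (timer reset), loop = 0 returns. */
  return loop == 1 ? 1 : 0;
}
```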

tpl-spot-sequential

low-fr-operating-point¶

configuration int read-write

C/C++Java

#define SNSR_LOW_FR_OPERATING_POINT "low-fr-operating-point"

public class Snsr {
  public final static String LOW_FR_OPERATING_POINT = "low-fr-operating-point";
}

Low false-reject spotter operating point.

Selects the low false-reject fall-back operating point used by smart wake words. This low false-reject operating point is selected for duration-ms if a spot was rejected at operating-point but would have been accepted at low-fr-operating-point.

duration-ms, operating-point

max-recording¶

configuration int read-write

C/C++Java

#define SNSR_MAX_RECORDING "max-recording"

public class Snsr {
  public final static String MAX_RECORDING = "max-recording";
}

VAD maximum record duration, in ms.

The VAD will invoke the ^limit event handler if the detected speech segment exceeds this value.

^limit

max-users¶

configuration int read-write

C/C++Java

#define SNSR_MAX_USERS "max-users"

public class Snsr {
  public final static String MAX_USERS = "max-users";
}

Maximum number of users to adapt to.

Sets a limit to the number of distinct users a continuously adapting fixed-phrase spotter will enroll.

nlu-match-max¶

configuration int read-write tnl

C/C++Java

#define SNSR_NLU_RES_MAX "nlu-match-max"

public class Snsr {
  public final static String NLU_RES_MAX = "nlu-match-max";
}

The maximum number of alternate NLU matches to consider.

Limits the number of ^nlu-slot callbacks issued in case of multiple valid NLU matches to the recognition result. The default value is 1, limiting NLU results to the best-scoring match only.

^nlu-slot, nlu-match-index, nlu-slot-count

nlu-size¶

configuration double read-only stt

C/C++Java

#define SNSR_RES_NLU_SIZE "nlu-size"

public class Snsr {
  public final static String RES_NLU_SIZE = "nlu-size";
}

Size of STT NLU model, in bytes.

Note

Not supported for all STT models.

am-size, lm-size, slm-size

operating-point¶

configuration int read-write

C/C++Java

#define SNSR_OPERATING_POINT "operating-point"

public class Snsr {
  public final static String OPERATING_POINT = "operating-point";
}

Spotter operating point.

Selects the trade-off between false accept and false reject errors for wake word and command set recognizers.

Higher-numbered points are more accepting.

The valid range is from 1 to 21 inclusive.
Lower-numbered points have a lower false accept rate at the expense of a higher false reject rate.
The false accept rate is expressed as the expected number of false accepts (where the recognizer mistakenly spots the trigger phrase) per time unit. For example, 1.2 false accepts per day.
The false reject rate is the percentage of times the actual trigger phrase is spoken, but not recognized. For example, 4.5%.
The default operating point is selected by Sensory during trigger development for a good balance between these two error types.
Not all operating points are necessarily valid. Use operating-point-iterator to find all the available points.

operating-point-iterator, low-fr-operating-point, duration-ms

partial-result-interval¶

configuration double read-write tnl stt

C/C++Java

#define SNSR_PARTIAL_RESULT_INTERVAL "partial-result-interval"

public class Snsr {
  public final static String PARTIAL_RESULT_INTERVAL = "partial-result-interval";
}

Partial result update interval.

The current preliminary result is emitted every partial-result-interval milliseconds. Set to 0 to disable partial result reporting.

Warning

Do not change partial-result-interval from an event handler, or while a model is running.

Note

In STT models this also sets the interval at which the model is evaluated. Less frequent updates trade preliminary result latency for lower average CPU use. Set to 0 for the lowest possible evaluation rate and CPU use.

^result-partial

pass-through¶

configuration int read-write

C/C++Java

#define SNSR_PASS_THROUGH "pass-through"

public class Snsr {
  public final static String PASS_THROUGH = "pass-through";
}

VAD audio pass-through behavior.

If set to 0, no audio from ->audio-pcm will be passed through to <-audio-pcm. The begin- and endpoint handlers will still be invoked. The default value, 1, passes speech-detected samples to <-audio-pcm.

include-leading-silence

push-buffer-backlog¶

configuration int read-write

C/C++Java

#define SNSR_RES_PUSH_BUFFER_BACKLOG "push-buffer-backlog"

public class Snsr {
  public final static String RES_PUSH_BUFFER_BACKLOG = "push-buffer-backlog";
}

Reports the number of bytes of deferred push data.

If push is used with a push-duration-limit, this setting reports the number of bytes deferred for processing in subsequent calls to push.

push, push-buffer-size, push-duration-limit

push-buffer-size¶

configuration int read-write

C/C++Java

#define SNSR_PUSH_BUFFER_SIZE "push-buffer-size"

public class Snsr {
  public final static String PUSH_BUFFER_SIZE = "push-buffer-size";
}

The size of the internal ring buffers used by push.

If push is used with a push-duration-limit, processing will require deferral if the duration limit is reached. In this case, push will allocate a ring buffer to hold these data. This setting configures the size of this buffer, in bytes.

The default buffer size is sufficient to defer up to 250 ms of audio data.
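If you change push-buffer-size, it helps to relate bytes to audio duration. The sketch below does that conversion. The function name is hypothetical, and the 16 kHz, 16-bit mono format used in the example is an assumption. Use the sample rate and sample width of your own audio.

```c
#include <stddef.h>

/* Hypothetical helper, not an SDK function.
 * Returns the number of bytes needed to hold ms milliseconds of
 * single-channel audio at sample_rate Hz with bytes_per_sample bytes
 * per sample. */
static size_t push_buffer_bytes(unsigned ms, unsigned sample_rate,
                                unsigned bytes_per_sample)
{
  return (size_t)ms * sample_rate / 1000u * bytes_per_sample;
}
```

Under those assumptions, 250 ms of audio occupies 8000 bytes.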

push, push-duration-limit, push-buffer-backlog

push-duration-limit¶

configuration double read-write

C/C++Java

#define SNSR_PUSH_DURATION_LIMIT "push-duration-limit"

public class Snsr {
  public final static String PUSH_DURATION_LIMIT = "push-duration-limit";
}

Sets a limit to the maximum processing time push should consume.

This setting is the maximum number of milliseconds any call to push should spend processing data before returning control to the caller.

The default value is 0, which disables the processing limit.

Note

This requires a valid real-time clock function, see CONFIG_CLOCK_FUNC.

TrulyNatural SDK libraries for Android, Linux, macOS, iOS, and Java include real-time clock functions and require no additional configuration.

You should use a push-duration-limit if:

You're using push, and
you collect live audio on the same thread as the recognizer, and
you will drop audio packets if you don't return from push before the next packet is available.

push-duration-limit adds a cap to the amount of CPU used in each call to push. This requires and allocates an additional input ring buffer that's push-buffer-size bytes in size.

If you have a separate thread, or interrupt-driven live audio recording and you want to maximize throughput, increase the size of the audio ring buffer instead of using a push-duration-limit.

Recommendations:

Use 15 ms audio chunks.
The audio recording buffer size determines the longest time the average recognizer throughput can fall behind real time.
With a 30 ms buffer only two 15 ms blocks fit, which means that every SDK processing call must return within 15 ms, or we'll lose a block or partial block.
Using a 300 ms buffer relaxes this. 20 blocks mean that we can fall up to 18 blocks (270 ms) behind before losing audio.
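The buffer arithmetic in these recommendations can be written down as a small helper. It follows the reasoning above, where two blocks are always in use, so the slack is the block count minus two. The function name is hypothetical.

```c
/* Hypothetical helper, not an SDK function.
 * Returns how many whole blocks the recognizer can fall behind real
 * time before audio is lost, for a recording buffer of buffer_ms
 * split into block_ms chunks. */
static int max_lag_blocks(int buffer_ms, int block_ms)
{
  int blocks;
  if (block_ms <= 0) return 0;
  blocks = buffer_ms / block_ms;
  return blocks > 2 ? blocks - 2 : 0;
}
```

For a 300 ms buffer and 15 ms blocks this gives 18 blocks, or 270 ms of slack. For a 30 ms buffer it gives none.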

push, push-buffer-size, push-buffer-backlog, CLOCK_FUNC

ram-limit¶

configuration double read-write tnl 7.5.0

C/C++Java

#define SNSR_RAM_LIMIT "ram-limit"

public class Snsr {
  public final static String RAM_LIMIT = "ram-limit";
}

Limit LVCSR decoder memory use.

The amount of heap RAM to allocate to LVCSR search decoding, in bytes.

A subset of the recognizers created by VoiceHub, those optimized for low resource use, allow limiting the amount of heap RAM allocated to search decoding. This setting modifies this limit. Lower values can increase error rates, so we recommend that you set this to as large a value as constraints allow. Set to 0 to disable the limit.

ac-prune-top-k

req-enroll¶

configuration int read-write

C/C++Java

#define SNSR_ENROLLMENT_TARGET "req-enroll"

public class Snsr {
  public final static String ENROLLMENT_TARGET = "req-enroll";
}

Enrollment target.

The recommended number of enrollments for each user. Using either more or fewer enrollments will reduce overall spotter performance.

user-iterator, enrollment-count

result-max¶

configuration int read-write tnl

C/C++Java

#define SNSR_RESULT_MAX "result-max"

public class Snsr {
  public final static String RESULT_MAX = "result-max";
}

The maximum number of alternate phrase results to consider.

Limits the number of alternate phrases returned by LVCSR models.

If result-max > 1, phrase-iterator will return phrase-level recognition results in order of likelihood.

The default is result-max == 1, which returns only the most likely result.

Limitations

word-iterator and phone-iterator are available for the most likely result only.
Time alignments are accurate for the most likely result only.
score values are not usable when result-max > 1.
Silence markup is elided from all but the top scoring phrase. An empty text result indicates that silence was the best match to the acoustic input.

Warning

N-best processing is computationally expensive, frequently prohibitively so. Contact Sensory for guidance before using this feature in production.

^result, phrase-iterator

samples-per-second¶

configuration int read-only

C/C++Java

#define SNSR_SAMPLE_RATE "samples-per-second"

public class Snsr {
  public final static String SAMPLE_RATE = "samples-per-second";
}

Model sample rate in Hz.

save-enroll-audio¶

configuration int read-write

C/C++Java

#define SNSR_SAVE_ENROLLMENT_AUDIO "save-enroll-audio"

public class Snsr {
  public final static String SAVE_ENROLLMENT_AUDIO = "save-enroll-audio";
}

Include enrollment audio in the enrollment context.

Set to 1 to include the enrollment audio in enrollment contexts, 0 to exclude.

RUNTIME, enrollment-iterator

score-offset¶

configuration double read-write

C/C++Java

#define SNSR_SCORE_OFFSET "score-offset"

public class Snsr {
  public final static String SCORE_OFFSET = "score-offset";
}

Reserved

Do not use unless recommended by Sensory.

search.frame-nota¶

configuration double read-write

C/C++Java

#define SNSR_OOV_REJECT "search.frame-nota"

public class Snsr {
  public final static String OOV_REJECT = "search.frame-nota";
}

Out-of-vocabulary rejection sensitivity.

This setting controls out-of-vocabulary rejection in custom LVCSR recognizers.

Custom LVCSR recognizers report <no-match/> for words or phrases that are not in the grammar. With a search.frame-nota value of 0 the recognizer will never report <no-match/>; it will return the closest match instead. With search.frame-nota at 1.0, almost all input will return <no-match/>.

The optimal value for search.frame-nota depends on the vocabulary used. A reasonable value to start testing with is 0.2.

Note

Do not change search.frame-nota for models that include statistical language model components. These models typically have either -broad- or -background- in the model name, and are configured to use the language model to recognize utterances not covered by the custom grammar.

grammar-stream

show-silence¶

configuration int read-write tnl

C/C++Java

#define SNSR_SHOW_SILENCE "show-silence"

public class Snsr {
  public final static String SHOW_SILENCE = "show-silence";
}

Include silence in recognizer results.

When set to 1, LVCSR recognition results include word-pause <wp>, sentence-begin <s>, and sentence-end </s> markup. The default value is 0, which elides these from results.

^result, ^result-partial

slm-enabled¶

configuration int read-write stt 7.4.0

C/C++Java

#define SNSR_SLM_ENABLED "slm-enabled"

public class Snsr {
  public final static String SLM_ENABLED = "slm-enabled";
}

Enable optional SLM component.

Set to 0 to turn the SLM component off, 1 to turn on.

^slm-start, ^slm-result, ^slm-result-partial, slm-turn-limit

slm-size¶

configuration double read-only stt 7.5.0

C/C++Java

#define SNSR_RES_SLM_SIZE "slm-size"

public class Snsr {
  public final static String RES_SLM_SIZE = "slm-size";
}

Size of STT SLM, in bytes.

Note

Not supported for all STT models.

am-size, lm-size, nlu-size

slm-turn-limit¶

configuration int read-write stt 7.4.0

C/C++Java

#define SNSR_SLM_TURN_LIMIT "slm-turn-limit"

public class Snsr {
  public final static String SLM_TURN_LIMIT = "slm-turn-limit";
}

Configure SLM history behavior.

If slm-turn-limit >= 0 the optional SLM component limits the number of conversational turns in the model history. The default is -1, which keeps all history.

Writing to slm-turn-limit discards existing history.

Note

Values larger than 0 increase the SLM result latency and CPU use.

^slm-result, ^slm-result-partial, slm-enabled

slot¶

configuration string read-write

C/C++Java

#define SNSR_SLOT "slot"

public class Snsr {
  public final static String SLOT = "slot";
}

Template slot selector.

Use with tpl-spot-select and tpl-opt-spot-vad-lvcsr to select the active slot.

0, 1, phrasespot, lvcsr

stt-profile¶

configuration string read-write stt 7.4.0

C/C++Java

#define SNSR_STT_PROFILE "stt-profile"

public class Snsr {
  public final static String STT_PROFILE = "stt-profile";
}

Select STT speed vs accuracy trade-off.

The default value is accurate. Set to fast to reduce CPU load at the expense of recognition accuracy.

^result, ^result-partial

sv-threshold¶

configuration double read-write

C/C++Java

#define SNSR_SV_THRESHOLD "sv-threshold"

public class Snsr {
  public final static String SV_THRESHOLD = "sv-threshold";
}

Enrolled wake word speaker verification threshold.

Enrolled wake word results with an sv-score less than this threshold are not reported. Increase this threshold to reduce the chance that someone other than the enrolled speaker triggers the phrase spotter.

^result, sv-score

task-name¶

configuration string read-only 6.14.0

C/C++Java

#define SNSR_TASK_NAME "task-name"

public class Snsr {
  public final static String TASK_NAME = "task-name";
}

Task name.

Deprecated

Support for this setting will be removed from the next major release of this SDK.

Do not use this in new code.

task-type¶

configuration string read-only

C/C++Java

#define SNSR_TASK_TYPE "task-type"

public class Snsr {
  public final static String TASK_TYPE = "task-type";
}

Task type.

This, together with task-version, describes the model behavior: which setting keys and streams it supports.

Examples include: enroll, lvcsr, phrasespot, phrasespot-vad, and vad.

Values, task-version, require

task-type-and-version-list¶

configuration string write-only

C/C++Java

#define SNSR_TASK_TYPE_AND_VERSION_LIST "task-type-and-version-list"

public class Snsr {
  public final static String TASK_TYPE_AND_VERSION_LIST = "task-type-and-version-list";
}

Verifies that a model matches one of a list of types and versions.

When used with require, the value argument must be a semicolon-separated list of task-type and task-version values. This list must have at least one element.

A task will match the requirement if one of the task-type fields matches and the corresponding task-version is satisfied.

If no task-type matches, require returns REQUIRE_MISMATCH.

If a task-type matches, but the associated task-version is not satisfied, require returns VERSION_MISMATCH.

Example

C/C++Java

snsrRequire(session, SNSR_TASK_TYPE_AND_VERSION_LIST,
            SNSR_PHRASESPOT " ~0.5.0 || 1.0.0;"
            SNSR_LVCSR " 1.0.0");

session.require(Snsr.TASK_TYPE_AND_VERSION_LIST,
                Snsr.PHRASESPOT + " ~0.5.0 || 1.0.0;" +
                Snsr.LVCSR + " 1.0.0");
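To illustrate the list format, the sketch below splits a requirement string on semicolons and finds the version specification for a given task type. It covers only the type-matching step. Evaluating the version range is left to require. The function name is hypothetical, and the literal type names "phrasespot" and "lvcsr" are assumed to be the values of SNSR_PHRASESPOT and SNSR_LVCSR.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an SDK function.
 * Looks for task_type in a semicolon-separated list of
 * "task-type version-spec" entries. On a match, copies the version
 * spec into out and returns 1. Returns 0 if no entry has that type
 * or out is too small. */
static int find_version_spec(const char *list, const char *task_type,
                             char *out, size_t out_size)
{
  size_t type_len = strlen(task_type);
  const char *entry = list;

  while (*entry) {
    size_t entry_len = strcspn(entry, ";");
    /* Skip spaces that may follow a separator. */
    while (entry_len > 0 && *entry == ' ') { ++entry; --entry_len; }
    if (entry_len > type_len &&
        strncmp(entry, task_type, type_len) == 0 &&
        entry[type_len] == ' ') {
      size_t spec_len = entry_len - type_len - 1;
      if (spec_len >= out_size) return 0;
      memcpy(out, entry + type_len + 1, spec_len);
      out[spec_len] = '\0';
      return 1;
    }
    entry += entry_len;
    if (*entry == ';') ++entry;
  }
  return 0;
}
```

A return value of 0 for a task type corresponds to the case where require would report REQUIRE_MISMATCH.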

require

task-version¶

configuration string read-only

C/C++Java

#define SNSR_TASK_VERSION "task-version"

public class Snsr {
  public final static String TASK_VERSION = "task-version";
}

Model task version.

These version strings follow semantic versioning rules.

task-type, require

threshold¶

configuration int read-write 7.4.0

C/C++Java

#define SNSR_THRESHOLD "threshold"

public class Snsr {
  public final static String THRESHOLD = "threshold";
}

Dynamic operating point selection threshold.

Deprecated

Superseded by built-in support for smart wake words in TrulyNatural 7.4.0.

Selects the threshold used by tpl-spot-dynop-1.4.0.snsr to decide whether to select the low-fr-operating-point.

duration-ms, low-fr-operating-point, operating-point

trailing-silence¶

configuration int read-write

C/C++Java

#define SNSR_TRAILING_SILENCE "trailing-silence"

public class Snsr {
  public final static String TRAILING_SILENCE = "trailing-silence";
}

VAD trailing silence time-out, in ms.

The VAD will invoke the ^end event handler once trailing-silence ms of silence has followed the last bit of speech.
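The way leading-silence, trailing-silence, and max-recording interact can be sketched with a toy timer model. The code below scans per-frame speech/non-speech flags and reports which time-out would fire first. It is a simplification for illustration only. The real VAD makes its own speech decisions, and all names here are hypothetical.

```c
/* Hypothetical model of the VAD time-outs, not SDK code. */
typedef enum { VAD_NONE, VAD_SILENCE, VAD_END, VAD_LIMIT } VadEvent;

typedef struct {
  int leading_silence_ms;   /* leading-silence */
  int trailing_silence_ms;  /* trailing-silence */
  int max_recording_ms;     /* max-recording */
} VadConfig;

/* Returns the first time-out event for frame_count frames of
 * frame_ms each, where is_speech[i] is non-zero for speech frames. */
static VadEvent first_vad_event(const int *is_speech, int frame_count,
                                int frame_ms, const VadConfig *cfg)
{
  int elapsed = 0, speech_begin = -1, silence_run = 0;
  int i;

  for (i = 0; i < frame_count; ++i) {
    elapsed += frame_ms;
    if (is_speech[i]) {
      if (speech_begin < 0) speech_begin = elapsed - frame_ms;
      silence_run = 0;
    } else if (speech_begin >= 0) {
      silence_run += frame_ms;
    }
    if (speech_begin < 0) {
      /* No speech yet: check the leading silence time-out. */
      if (elapsed >= cfg->leading_silence_ms) return VAD_SILENCE;
    } else if (silence_run >= cfg->trailing_silence_ms) {
      return VAD_END;
    } else if (elapsed - speech_begin > cfg->max_recording_ms) {
      return VAD_LIMIT;
    }
  }
  return VAD_NONE;
}
```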

^end, hold-over, leading-silence

user¶

configuration string read-write

C/C++Java

#define SNSR_USER "user"

public class Snsr {
  public final static String USER = "user";
}

Enrolling user tag.

Sets the tag for the current enrollment. This should be a unique alphanumeric phrase, without spaces. It is the phrase returned as a recognition result.

If enrolling more than one phrase for any of the users, the tag must contain one / that separates a user-specific part from the phrase part. For example: user1/phrase1, user2/phrase1, user2/phrase2.
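An application can check tags before passing them to the SDK. The validator below is a minimal sketch of the rules above. It takes "alphanumeric" strictly (letters and digits only), which is an assumption, and the function name is hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper, not an SDK function.
 * Returns 1 if tag is non-empty, alphanumeric, and contains at most
 * one '/' with non-empty parts on both sides. When multi_phrase is
 * non-zero, exactly one '/' is required. */
static int is_valid_user_tag(const char *tag, int multi_phrase)
{
  int slashes = 0;
  const char *part_start = tag;
  const char *p;

  for (p = tag; *p; ++p) {
    if (*p == '/') {
      if (p == part_start) return 0;      /* empty part before '/' */
      ++slashes;
      part_start = p + 1;
    } else if (!isalnum((unsigned char)*p)) {
      return 0;                           /* space or other character */
    }
  }
  if (*part_start == '\0') return 0;      /* empty tag or trailing '/' */
  return multi_phrase ? slashes == 1 : slashes <= 1;
}
```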

^adapted, ^enrolled

wake-word-at-end¶

configuration int read-write 7.7.0

C/C++Java

#define SNSR_WAKE_WORD_AT_END "wake-word-at-end"

public class Snsr {
  public final static String WAKE_WORD_AT_END = "wake-word-at-end";
}

Support for trailing wake words.

Setting this to 1 or 2 enables support for recognizing utterances gated by a wake word at the end of an utterance, in addition to gating by a wake word at the start. 1 does not include the wake word audio, but 2 does. 0 turns this feature off.

Note

This feature requires an audio-stream-size large enough to hold the entire utterance. Enabling wake-word-at-end will increase this to ten seconds if it starts out smaller.

If the utterance before the wake word does not fit into the audio-stream-size ring buffer, the VAD will invoke the ^limit event instead of ^end.

backlog-interval, tpl-opt-spot-vad-lvcsr-type, tpl-spot-vad-lvcsr-type, tpl-spot-vad-type, use-trailing-wake-word