New Release: Almond 1.8

As promised a week ago, Almond 1.8 is now here, codenamed the “Stay-At-Home Assistant”.

The highlight of this release is improved support for knowledge-based question answering, together with experimental support for Q&A devices based on schema.org markup. This work is described in this paper.

Release Notes

New Features for Users

  • The default temperature unit (which the word “degrees” maps to) is now dependent on the country the user is in, and defaults to Celsius outside the United States.
    This feature was actually part of 1.7.3 but it was never fully rolled out.
  • The wake word for the server and desktop version of Almond is now “computer”. This is a “universal” model provided by the snowboy library, which should improve
    the reliability of wake word detection, and also greatly reduce resource usage and dependencies compared to the previous implementation.
  • Text to speech on the server and desktop versions of Almond now uses Microsoft Speech Services instead of the mimic library. This greatly improves the quality and
    fluency of the generated speech.
  • The desktop version of Almond now supports controlling the computer volume with voice. It also received a number of fixes around launching apps and around integration
    with the GNOME Shell, which should make it more pleasant to use.
  • Almond Cloud now supports voice input.
  • Almond Server is now officially supported as a native application on Windows, without using Docker or WSL.
  • Google Assistant support has been reintroduced.

New Features for Developers

  • Support for queries and knowledge-based question answering (KBQA) has been greatly expanded, with new parameter-level annotations representing filters and input parameters
    expressed in different ways (see the sketch after this list). This can increase the variety and quality of the generated datasets, and reduces the reliance on dataset.tt files,
    which are tedious to write.
  • Thingpedia queries now receive a “hint” indicating which filters, projections, sorting and slicing operations will be applied to the result. Query implementations can use the hint
    to reduce the amount of data retrieved from the server, and to implement filters using search APIs.
  • All local platforms (command-line, server, desktop) have gained the ability to specify a “developer directory”: a directory containing Thingpedia devices that override
    the devices in Thingpedia. This can be enabled by setting the developer-dir key in the Almond preferences file (located at ~/.config/almond-server/prefs.db for Almond Server,
    ~/.config/almond-cmdline/prefs.db for Almond Command-line, and ~/.config/almond/prefs.db or ~/.var/app/edu.stanford.Almond/config/almond/prefs.db for Almond Desktop).
    Previously, this feature was limited to the command-line platform.
  • Internationalization support for devices has been improved. All devices can now specify their name and description using #_[name] and #_[description] annotations, and these
    annotations are translatable on Thingpedia. Devices can also include a po/ directory with gettext po files, which are loaded automatically.
  • Annotations with placeholders (#_[confirmation], #_[formatted], #[url]) now make use of the new string-interp library,
    which provides a greater set of options, better support for plural and gender variants, and the ability to mark parts of the formatted or confirmation string as optional.
  • The syntax of arithmetic expressions has been simplified, and all places where values are expected (such as input parameters) now support simple arithmetic expressions. Among other things, this allows string concatenation with parameter passing.
  • In Almond Cloud, developers can now create MTurk batches and custom synthetic datasets, including datasets in MTurk mode. This provides a one-click solution for using MTurk to improve the
    accuracy of Thingpedia devices.
  • The voice API has been refactored and merged with the previous NLP API. The new API also includes a combined voice + NLU endpoint. The NLP API is now fully documented.
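
To make the new annotation features above more concrete, here is a minimal sketch of what a Thingpedia device manifest using them might look like. The device (@org.example.restaurant), its parameters, and the specific annotation values are all hypothetical, and the exact set of supported annotation keys may differ; please refer to the Thingpedia documentation for the authoritative syntax.

    // hypothetical manifest: translatable class metadata (#_[name], #_[description])
    // and parameter-level canonical annotations used to generate KBQA sentences
    class @org.example.restaurant
    #_[name="Example Restaurant Search"]
    #_[description="Search restaurants by name, cuisine and rating"]
    {
      import loader from @org.thingpedia.v2();
      import config from @org.thingpedia.config.none();

      list query restaurant(out id: Entity(org.example.restaurant:restaurant)
                            #_[canonical={
                              base=["name"],
                              passive_verb=["called #", "named #"]
                            }],
                            out servesCuisine: String
                            #_[canonical={
                              base=["cuisine", "food type"],
                              verb=["serves # cuisine", "serves # food"],
                              adjective=["#"]
                            }],
                            out rating: Number
                            #_[canonical={
                              base=["rating"],
                              passive_verb=["rated # stars"]
                            }])
      #_[canonical=["restaurant", "diner", "place to eat"]]
      #_[confirmation="restaurants matching your search"]
      #[doc="search for restaurants"];
    }

With annotations along these lines, the synthetic sentence generator can, in principle, produce questions such as “show me restaurants that serve italian food” or “which restaurants are rated 5 stars?” without any hand-written templates in dataset.tt.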

Under-the-Hood Improvements

  • The abstract-syntax-tree (AST) definitions for ThingTalk have been refactored to be modern, simple ES6 classes. This will simplify development of programs that manipulate ThingTalk ASTs, and will allow us to improve the documentation of the ThingTalk internals.
  • Optimizations and normalizations of ThingTalk programs have been improved. This should increase accuracy in semantic parsing.
  • Experimental, never-used and incomplete features have been removed from ThingTalk, reducing the cognitive burden on new developers.
  • The dataset generation algorithm has been rewritten to be faster and to allow significantly more control over the sampling, which in turn should allow for greater variety and greater quality in the resulting datasets.
  • The neural network library has been rebranded from decanlp to genienlp, completing the fork, as the original decanlp library was abandoned upstream. genienlp includes new BERT- and RoBERTa-based models, which leverage unsupervised pretraining to improve generalization to unseen sentences.
  • The exact matcher has been refactored and now uses a dedicated on-disk data structure, which should significantly reduce memory usage for cloud deployments of the NLP inference server.

Experimental and Ongoing Work

  • A new set of tools has been added to quickly build Thingpedia Q&A devices using schema.org structured markup in existing websites.
  • This release includes a new experimental design for multi-turn dialogues. A new dialogue-state language has been introduced in ThingTalk, along with a new dialogue loop in the dialogue agent. The new design exposes the full state of the conversation to the neural network, which enables it to transition to any state and greatly expands the range of conversations that can be understood.
  • It is now possible to generate datasets for languages other than ThingTalk. A multi-domain dialogue state tracking language has been added, which makes it possible to generate datasets for the popular MultiWOZ benchmark.
  • Generation of dialogues has been refactored and now supports generating both agent and user turns, allowing significantly more flexible conversational agent designs.

Module versions

This release comprises the following packages:

  • thingtalk: 1.10.0
  • thingpedia: 2.7.0
  • genie-toolkit: 0.6.0
  • thingengine-core: 1.8.0
  • almond-dialog-agent: 1.8.0
  • genienlp: 0.2.0
  • almond-cloud: 1.8.0
  • almond-cmdline: 1.8.0
  • almond-server: 1.8.0
  • almond-gnome: 1.8.0
  • almond-tokenizer: 3.1.0

Please refer to the HISTORY file on GitHub for details of what changed in each one.

Deployment

The new version has been tagged and is making its way through builds on dockerhub.

On our website, the new version will go live tomorrow. As announced previously, this version includes significant infrastructure migrations (primarily around the genienlp library, and the migration to nodejs 10.*). We will try to keep the downtime minimal, but there will be some downtime throughout the day. Also, please expect general flakiness in the NLP API both tomorrow and Saturday, as the new models are trained and deployed. We will announce here when the migration is complete.

On Flathub, we will release a new build when the new version is live on our website, to ensure that the app can access the new APIs released with this version.

Hope you will enjoy our new version!
Y’all stay safe,

Giovanni
(On behalf of the whole Almond & OVAL team)

Status update: the website, the Thingpedia API and the NLP backend have been updated, so minimal functionality for some commands should be working again.

I am not done with infrastructure updates, so there might still be some downtime in the NLP API. After that, I will start training the new models.

On the Almond desktop, we have successful test builds on Flathub. After I’m done with the cloud updates, I will test those builds, and if everything looks ok they will go public.

And we’re done! The website and API are updated and functional, all our infrastructure is operational, and Almond desktop is available on Flathub.

The models are training (it will take several hours for the new models to complete), but natural language support should already be functional in the meantime.

Please report any issues you encounter, here or on GitHub!