Creating a Bot using Opsdroid and Rasa
by Damien
Objectives of this Tutorial
- Set up a bot using Opsdroid that can use Matrix.
- Use Rasa to train an NLU model.
- Have Opsdroid integrate with Rasa to understand messages better.
Setting up Opsdroid
This tutorial is a little different in a few ways. Firstly, this is not about working on my game. Secondly, there is no video tutorial for this one, and finally, we are working in Python today! Now, why are we using Opsdroid as our bot framework? Well, it is one of the few frameworks that supports end-to-end encryption (e2ee) for the Matrix protocol. It also makes use of async functions to process many requests at once, which is useful for scaling.
The first step to setting up Opsdroid is to install Python. On Debian based distributions, this can be installed in one command in a terminal. If you have never touched a terminal before, don't worry, I will talk you through it.
sudo apt install python3 python3-venv libolm3 libolm-dev
Ok, so the sudo
means that we are escalating privileges from the user to the root, as installing new programs is restricted to the root user by default. Next, apt
is the package manager in Debian based distributions, which is used to install, upgrade and remove packages from your system. Now, we use install to tell apt that we want to install
a new package followed by the name of the packages we want to install (in this case python3 python3-venv libolm3 libolm-dev
).
Now we need a new user, which can be created using the useradd
utility.
sudo useradd -m opsdroid
Again, we need to escalate permission with sudo
, but now we are using useradd
to create a new user. We want to create a home directory for them with -m
and we set their name to opsdroid
. To log into this user, we can use the su
utility.
sudo su opsdroid
Now, we need to create a virtual environment for opsdroid, so that any packages we install cannot interfere with any other python programs/scripts. This is why we also install python3-venv, which allows us to create these virtual environments, or venv for short.
cd
python3 -m venv .venv
Here we have two commands, the first, cd
without any arguments will move us into the home directory of the current user (opsdroid). Then we call python3
and load a module with -m
, choosing to load venv
. This module requires one argument, which is the folder to put this new virtual environment, which I like to keep hidden and call .venv
. This will create all the files necessary to use the environment. If you use the default shell, then you must run the following command to activate it.
source .venv/bin/activate
This tells our shell to run the commands in that file (which was created automatically) in our current session. If you use a non-POSIX compliant shell such as fish, you can use the script tailored for your shell (e.g. .venv/bin/activate.fish
) instead. When you are done with the environment, you can then run deactivate
.
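If you are ever unsure whether the environment is active, you can ask Python itself. The following is a small sketch, not part of opsdroid: inside a venv, sys.prefix points at the environment folder while sys.base_prefix still points at the interpreter the environment was created from. The function name in_virtualenv is my own.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # In a venv, sys.prefix is the environment directory, whereas
    # sys.base_prefix is the installation the venv was created from.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

Run this with the python command after activating; the prefix it prints should end in .venv if you followed the steps above.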
Now, inside this environment, we can use python's own package manager, called pip, to install opsdroid. Since we want to be able to use opsdroid with matrix and e2ee, we need to add some extras using square brackets.
pip install opsdroid[connector_matrix_e2e,database_sqlite]
If you want to use any other connectors, parsers or databases, you can see the extras available on their GitHub. After a while, this should install opsdroid and all its dependencies. Now we need to configure it. Run the following command to open the configuration file of opsdroid.
EDITOR=nano opsdroid config edit
You can change the editor that opens the config by changing nano
to your preferred editor. Note that nano
is one of the easier editors to use. Now, either copy and paste and edit or type out the following configuration.
welcome-message: true
connectors:
  matrix:
    mxid: "@username:matrix.home.server"
    password: "really_secure_password"
    rooms:
      'main': '!internal_room_id:matrix.home.server'
    homeserver: "https://matrix.home.server"
    nick: 'Botty McBotface'
    device_name: 'Opsdroid'
    enable_encryption: True
    store_path: '/home/opsdroid/.matrix_key_store'
skills:
  hello: {}
This config file is written in a syntax called YAML. YAML is designed to be more human readable than the likes of JSON, which makes it a good choice for config syntax. Each level of indentation indicates that a section belongs to the name above it. For example, 'main'
has the value '!internal_room_id:matrix.home.server'
and is a child of rooms
which is a child of matrix which is a child of connectors
.
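If the parent and child idea is new to you, it may help to see the same structure as nested Python dictionaries, which is roughly what a YAML parser produces. This is only an illustration with a trimmed-down copy of the config; the lookup helper is a name I made up for this sketch.

```python
from functools import reduce

# A trimmed copy of the config above, written as the nested
# dictionaries a YAML parser would build from it.
config = {
    "welcome-message": True,
    "connectors": {
        "matrix": {
            "rooms": {"main": "!internal_room_id:matrix.home.server"},
        },
    },
    "skills": {"hello": {}},
}


def lookup(tree, path):
    """Follow a list of keys down through nested dictionaries."""
    return reduce(lambda node, key: node[key], path, tree)


# Each key in the path is one level of indentation in the YAML file.
room_id = lookup(config, ["connectors", "matrix", "rooms", "main"])
print(room_id)
```

Getting the indentation wrong in the file is the same as putting a key in the wrong dictionary here, which is why the bot would then fail to find its settings.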
Walking through this file, we start off with setting welcome-message to true. This just means that when you start opsdroid, you get a friendly message in the logs. Next, we enter the connectors section where we define which services we can talk to. Since we want to talk to the matrix, we need to configure that. Firstly, we need the mxid
of the bot. If you do not have an account for your bot, make one now and put their full username, including the homeserver here. We then need to give opsdroid the password
for the bot account. Following that we can define some rooms
that we want the bot to join on starting up. We can add as many rooms
as we want here, but we need to use the internal room id, which is found in the advanced room settings tab if you use Element.
Next up, we need to define the homeserver
that the bot needs to talk to in order to send messages. Note the inclusion of https://
. Then, we can give our bot a nickname (nick
) that it will change to on starting up. We also need to give a device_name
that will help you recognise that this is a legitimate login if you ever check active logins. It is also important to set enable_encryption
to true in order to be able to have your bot active in encrypted rooms! For that option to work, we also need to define the store_path
which is where encryption keys are stored.
The next section is where we define skills
. We can use the builtin skill called hello
here. Since it is builtin, we do not need to give it any more arguments, so we terminate it with {}
. Now save and quit the editor. In order to check you haven't made any mistakes, use opsdroid config lint
to check your syntax. If all is good, then we can start up opsdroid with opsdroid start
. Once it has started, go ahead and message your bot 'hello' and see their response!
Setting up Rasa
Since some dependencies of Rasa do not work on python3.9, you may need an older version. If the version that you need is not available in your repositories, do not despair: we can compile our own version of python3.8 in a few steps. First, we need to download the source code.
wget https://www.python.org/ftp/python/3.8.12/Python-3.8.12.tar.xz
This will download an archive of the Python source code which we need to decompress. We then need to move it to a known location.
tar -xf Python-3.8.12.tar.xz
sudo mv Python-3.8.12 /opt/Python3.8.12
Now, in preparation for the build, we need to install some dependencies using apt. We can then move into the build directory and use a utility we downloaded with the source code to configure the build for us.
sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libsqlite3-dev libreadline-dev libffi-dev curl libbz2-dev -y
cd /opt/Python3.8.12
./configure --enable-optimizations --enable-shared
Now, we can start the build process. This might take a while based on the speed of your machine. This command will make your computer use as many threads as you have cores to speed up the process.
make -j $(nproc)
Now, we want to install this version of python, but we want it alongside our other installations, so we will use altinstall
to avoid overwriting the current version.
sudo make altinstall
If you try running python3.8
now, it will fail. This is because we told python that we were going to use shared libraries that we have not allowed linux to link to. We will use the utility ldconfig
to allow linux to link to these libraries.
sudo ldconfig /opt/Python3.8.12
Now, try running python3.8 -V
and bask in the gloriousness that is your self-compiled python! Note that all the usual python utilities were created at the same time. However, our work has only just begun. Log into our opsdroid user, move into the home directory again with cd
and create a new folder for Rasa with mkdir rasa
and move into it with cd rasa
. In here we can create a new venv using our new version of python.
python3.8 -m venv .venv
source .venv/bin/activate
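Before installing anything, it is worth confirming that this environment really uses the older interpreter. Here is a minimal sketch of such a check. The 3.9 bound is an assumption taken from the note at the start of this section, so adjust it to whatever your Rasa release supports; is_supported and FIRST_UNSUPPORTED are names I chose.

```python
import sys

# Assumption from this tutorial: Rasa's dependencies need an
# interpreter older than 3.9. Change this for your Rasa release.
FIRST_UNSUPPORTED = (3, 9)


def is_supported(version=None) -> bool:
    """Return True if the (major, minor) version is below the bound."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor < FIRST_UNSUPPORTED


print("running on", sys.version.split()[0])
print("below the assumed bound:", is_supported())
```

Inside the new venv, running this with python should report a 3.8 version.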
Now we need to install the latest pip, because the version that you just compiled has a bug with dependency resolving that causes massive slow down if the dependency tree becomes large, like it is for rasa. To do this, make sure you are in the environment and run the following command.
pip install --upgrade pip
This will tell pip to upgrade itself. Then we can install rasa with the spacy extra which will allow us to include spacy layers in our models.
pip install rasa[spacy]
Once that goes through, we need to install a spacy model so that we can include the layers and create a new folder for rasa to populate with required files.
python -m spacy download en_core_web_md
mkdir models
cd models
rasa init
Now, we can edit the config.yml
file with your favourite text editor and change the pipeline to include spacy layers.
language: en
pipeline:
  - name: SpacyNLP
    model: en_core_web_md
    case_sensitive: False
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: SpacyEntityExtractor
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1
policies:
Running through this YAML file, we start off by defining the language that we are working with. In my case this is English (en). We then define the layers in the pipeline. First we bring in SpacyNLP and pass in the name of the model we downloaded earlier (this is important, or you get an obscure error). Following that we can use the spaCy layers in our model, specifically the Tokenizer, Featurizer and EntityExtractor, which all add meaning to the text. The rest of the layers are from the automatic generation and are fairly standard. We then leave policies blank, so Rasa can put some sane defaults in there. If we want to train our own intents, then we need to add these into data/nlu.yml
. You can add new intents and change the original intents here. For example, you might add the following to the bottom.
- intent: who_are_you
  examples: |
    - who are you?
    - what is your name?
    - who is this?
    - I don't know who you are
Note the | symbol after examples that creates a multiline value and is important for rasa. You can now train and test your model on the command line with the following commands.
rasa train nlu
rasa shell nlu
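When reading the shell output, keep in mind the two FallbackClassifier settings from config.yml. As I understand them, the classifier predicts nlu_fallback when the best intent scores below threshold, or when the top two intents are closer together than ambiguity_threshold. The sketch below is my own simplified reading of that rule, not Rasa's actual code, and apply_fallback is a made-up name.

```python
def apply_fallback(ranking, threshold=0.3, ambiguity_threshold=0.1):
    """Return the intent name to act on, or "nlu_fallback".

    `ranking` is a list of (intent_name, confidence) pairs.
    """
    ordered = sorted(ranking, key=lambda pair: pair[1], reverse=True)
    if not ordered:
        return "nlu_fallback"
    top_name, top_confidence = ordered[0]
    # Rule 1: the best guess is not confident enough.
    if top_confidence < threshold:
        return "nlu_fallback"
    # Rule 2: the two best guesses are too close to call.
    if len(ordered) > 1 and top_confidence - ordered[1][1] < ambiguity_threshold:
        return "nlu_fallback"
    return top_name


print(apply_fallback([("who_are_you", 0.92), ("greet", 0.04)]))
print(apply_fallback([("who_are_you", 0.50), ("greet", 0.45)]))
```

The first call keeps who_are_you, while the second falls back because the two scores are only 0.05 apart.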
If you are not satisfied with the results, give the model more examples to help train it. Once you are happy, we can work on integrating this with opsdroid.
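If you end up adding a lot of examples, typing the indentation by hand gets tedious. As a rough sketch, a few lines of Python can render an intent block in the same layout as the who_are_you entry shown earlier. The helper name intent_block is my own, and the output still needs pasting into data/nlu.yml yourself.

```python
def intent_block(name, examples):
    """Render one intent entry in the layout used by data/nlu.yml."""
    lines = [f"- intent: {name}", "  examples: |"]
    # Every example is a "- " item indented beneath the "examples: |" key.
    lines.extend(f"    - {example}" for example in examples)
    return "\n".join(lines) + "\n"


block = intent_block("who_are_you", ["who are you?", "what is your name?"])
print(block)
```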
Integration
Firstly, we need to set up Rasa to work as a server. We can do this by creating a file with a cryptographically secure key in it and then calling rasa with a couple of arguments.
openssl rand -base64 64 > .rasa_key
nano .rasa_key #edit the file to be on one line and remove special characters
rasa run --enable-api --auth-token $(cat .rasa_key)
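If you would rather skip the manual editing of the key file, Python's secrets module can produce a token that is already on one line and contains no special characters. This is an alternative sketch, not what the commands above do: make_token and write_token are my own names, and 32 bytes is an arbitrary choice.

```python
import secrets
from pathlib import Path


def make_token(num_bytes: int = 32) -> str:
    """Return a random token made only of the characters 0-9 and a-f."""
    return secrets.token_hex(num_bytes)


def write_token(path: str, num_bytes: int = 32) -> str:
    """Write a fresh token to `path` on a single line and return it."""
    token = make_token(num_bytes)
    key_file = Path(path)
    key_file.write_text(token + "\n")
    # The token is a secret, so restrict the file to its owner.
    key_file.chmod(0o600)
    return token


print(len(make_token()), "characters in a 32-byte hex token")
```

Calling write_token(".rasa_key") would create a file that the rasa run command above can read with $(cat .rasa_key) as before.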
Now that that is running, we can configure opsdroid to be able to talk with our rasa server. Make sure to copy the key we just made and open up the opsdroid config again. Then we can add a new root section called parsers for us to put details about rasa.
parsers:
  rasanlu:
    url: http://localhost:5005
    models-path: models
    token: YOUR_RASA_TOKEN
    min-score: 0.8
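To check the server and token independently of opsdroid, you can talk to Rasa's HTTP API directly. The sketch below assumes the /model/parse endpoint with the token passed as a query parameter, and a reply containing an intent with a name and a confidence. The function names are mine, and passes_min_score only imitates the idea behind min-score rather than opsdroid's own logic.

```python
import json
import urllib.parse
import urllib.request


def build_parse_request(base_url, token, text):
    """Build a POST request asking the Rasa server to parse `text`."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/model/parse?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


def passes_min_score(parse_result, min_score=0.8):
    """Keep only intent matches at or above the configured confidence."""
    return parse_result["intent"]["confidence"] >= min_score
```

With the server from the previous step running, send(build_parse_request("http://localhost:5005", "YOUR_RASA_TOKEN", "who are you?")) should return the parsed message, which you can then hand to passes_min_score.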
Now we can create a new python file where we will create a fairly generic skill that will match to a rasa intent and then pick a random line to reply with. Put the following contents into a python file.
from opsdroid.skill import Skill
from opsdroid.matchers import match_rasanlu, match_parse
import random

class Respond(Skill):
    def __init__(self, opsdroid, config):
        super().__init__(opsdroid, config)
        # Read the intent from this skill's config and attach the real matcher.
        self.intent = self.config.get("intent")
        self.respond = match_rasanlu(self.intent)(self.respond)

    # Placeholder pattern that is never expected to match.
    @match_parse("hgftgyhjknbvhgftyuihjkvhgftyuijkbjhgtyyuijhkjghfdtrytyihujkbvncgfxdtrytuyiuhkbjvbhcfdytryuhjkbvhcfdyrtuyiuhkbhj")
    async def respond(self, message):
        # Reply with one of the lines listed under response_options.
        response_options = self.config.get("response_options")
        await message.respond(random.choice(response_options))
Here, we create a new class that inherits from opsdroid's Skill
class. We give it an asynchronous method called respond. We give it a decorator with a match_parse
that will never be matched. This is a work around so we can apply a decorator later without running into an internal error with opsdroid. We then get the possible responses from our config and pick a random one to respond with.
However, in order to get the intent from our config into the decorator, we need to apply the decorator we will actually be matching in our constructor. Therefore we need to override the constructor while keeping the original functionality for opsdroid to handle it, thus the line beginning with super()
. We can then get the intent and apply the match_rasanlu
decorator.
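If applying a decorator by hand looks odd, remember that the @ syntax is just shorthand for calling the decorator with the function as its argument. The sketch below shows the two forms side by side using plain functions. match_intent is a hypothetical stand-in I wrote for the illustration; it is not opsdroid's matcher and does far less.

```python
def match_intent(intent):
    """Hypothetical stand-in for a matcher such as match_rasanlu."""
    def decorator(func):
        # Record the intent on the function, loosely like a matcher
        # noting what the function should respond to.
        func.intent = intent
        return func
    return decorator


# Usual form: the argument must be known when the function is defined.
@match_intent("greet")
def greet():
    return "hello"


# Manual form: the same decorator applied later, once the value is known.
def farewell():
    return "goodbye"


intent_from_config = "say_goodbye"  # in the skill this comes from self.config
farewell = match_intent(intent_from_config)(farewell)

print(greet.intent, farewell.intent)
```

The manual form is what the constructor in our skill does, because the intent name only becomes available once the config has been loaded.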
Now we can create as many skills based off this one script as we like. Enter your opsdroid config and for each instance of this skill, add the following section in the skills section.
  rasa_intent:
    path: '/path/to/python/skill.py'
    intent: "name_of_rasa_intent_to_match"
    response_options:
      - "This is a possible response"
      - "This is another response"
      - "And one more response for good luck"
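One thing to watch out for: if response_options is missing or empty in the config, random.choice will raise an error instead of replying. A small guard like the sketch below avoids that. The pick_response helper and its default text are my own additions, not part of opsdroid.

```python
import random


def pick_response(config, default="Sorry, I have nothing to say to that."):
    """Choose a random reply, falling back to `default` if none are set."""
    options = config.get("response_options") or []
    # random.choice raises IndexError on an empty list, so guard first.
    if not options:
        return default
    return random.choice(options)


print(pick_response({"response_options": ["This is a possible response"]}))
print(pick_response({}))
```

Inside the skill, the respond method could call pick_response(self.config) and pass the result to message.respond.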
Extra Resources
As always, the official documentation is a great place to start when it comes to programming. Both the Rasa and Opsdroid documentation are fantastic!