Train All the Things — Version 0.1
My first commit to on-air is dated March 3, 2020. In the weeks leading up to that commit I spent time reading through the TF Lite documentation, playing with Cloudflare Workers K/V, and getting my first esp-idf setup squared away. After that it was off to the races. I outlined my original goal in the planning post, and I didn't quite get there. The project currently doesn't have a VAD (voice activity detector) to handle the scenario where I forget to activate the display before starting a call or hangout. Additionally, I wasn't able to train a custom keyword, as highlighted in the custom model post. I was, however, able to get a functional implementation of the concept. I can hang the display up and then, from my lab with the ESP-EYE plugged in, use the wake word “visual” followed by “on” or “off” to toggle the display status.
While it’s not quite what I had planned, it’s a foundation. I’ve got a lot more tools and knowledge under my belt. Round 2 will probably involve Skainet, simply because of the limited voice data that’s readily available. Keep an eye out for a couple more posts highlighting some bumps along the way and a summary of lessons learned.
The code, docs, images, etc. for the project can be found here, and I’ll be posting any further updates to HackadayIO. For anybody interested in building this, the instructions below provide a brief outline. Updated versions will be hosted in the repo. If you have any questions or ideas, reach out.
Required Hardware:
- ESP-EYE
- Optional ESP-EYE case
- PyPortal
- Optional PyPortal case
- Two 3.3v USB to outlet adapters and two USB to USB mini cables
OR
- Two 3.3v micro USB wall outlet chargers
Build Steps:
- Clone the on-air repo.
Cloudflare Worker:
- Set up Cloudflare DNS records for your domain and endpoint, or set up a new domain with Cloudflare if you don’t have one to resolve the endpoint.
- Set up a Cloudflare Workers account with Workers K/V.
- Set up the Wrangler CLI tool.
- cd into the on-air/sighandler directory.
- Update wrangler.toml with your account and domain details.
- Run wrangler preview
- Run wrangler publish
- Update the Makefile with your domain and test calling the endpoint.
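As an alternative to the Makefile for a quick check that the published worker responds, a minimal Python sketch could look like the following. The domain `example.com` and the `status` path are placeholders and assumptions, not the routes defined in the repo, and the response is returned as raw text because its format is not described here.

```python
import urllib.request
from urllib.parse import urljoin

# Hypothetical values: replace with the domain and route you deployed to.
BASE_URL = "https://example.com/"
STATUS_PATH = "status"  # assumed endpoint name, not taken from the repo


def status_url(base_url: str = BASE_URL, path: str = STATUS_PATH) -> str:
    """Build the full URL of the (assumed) status endpoint."""
    # Normalise slashes so urljoin appends the path instead of replacing it.
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))


def fetch_status(url: str, timeout: float = 5.0) -> str:
    """GET the endpoint and return the raw response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_status(status_url())` after `wrangler publish` should return whatever the worker currently serves. A connection error or HTTP error here usually points at the DNS record or route rather than the devices.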
PyPortal:
- Set up CircuitPython 5.x on the PyPortal.
- If you’re new to CircuitPython you should read this first.
- Go to the directory where you cloned on-air.
- cd into display.
- Update secrets.py with your WiFi information and status URL endpoint.
- Copy code.py, secrets.py and the bitmap files in screens/ to the root of the PyPortal.
- The display is now good to go.
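For reference, a secrets.py on CircuitPython boards is usually a single dictionary. The sketch below assumes that convention; the `ssid` and `password` keys follow the common CircuitPython pattern, while `status_url` is a hypothetical key name, so check code.py in the repo for the key it actually reads.

```python
# secrets.py -- a sketch; keep this file out of version control.
# 'ssid' and 'password' follow the usual CircuitPython secrets convention.
# 'status_url' is a hypothetical key name; use whatever key code.py expects.
secrets = {
    "ssid": "your-wifi-name",
    "password": "your-wifi-password",
    "status_url": "https://example.com/status",  # placeholder endpoint
}
```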
ESP-EYE:
- Set up esp-idf using the 4.1 release branch.
- Install espeak and sox.
- Set up a Python 3.7 virtual environment and install TensorFlow 1.15.
- cd into on-air/voice-assistant/train
- Run chmod +x orchestrate.sh, then ./orchestrate.sh
- Once training completes cd ../smalltalk
- Activate the esp-idf tooling so that $IDF_PATH is set correctly and all requirements are met.
- Run idf.py menuconfig and set your WiFi settings.
- Update the URL in toggle_status.cc
- This should match the host and endpoint you deployed the Cloudflare worker to above
- idf.py build
- idf.py --port <device port> flash monitor
- You should see the device start, attach to WiFi and begin listening for the wake word “visual” followed by “on” or “off”.