Introduction:
HandMusic uses the webcam to track hand gestures, enabling touch-free music switching and mixing.
This is a simple demo built with handtrack.js and Tone.js. By capturing the position of the hand, it switches the background music and changes a simple filter on the canvas. It also includes a simple drum loop built with Tone.js, so you can mix different sounds together.
You can try my online demo:
CodeSandbox: https://ggddh.csb.app
You can also watch the demo video online:
YouTube: https://youtu.be/8Pg5hY85aWA
You can find this project in my code repository:
GitHub: https://github.com/ShuSQ/CCI_ACC_FP2_handMusic
0.Why did I do this project? 🧐
When I was in high school, I tried audio production software such as FruityLoops and became very interested in how audio is produced. At the time, I wondered whether users could control audio through bodily interaction. I came into contact with TensorFlow this semester, so I decided to explore and experiment in this direction.
Image from:34.4.magnusson.pdf
1.Handtrack 🖐🏼
First, I needed a way to recognize the hand in the video captured by the webcam and use it as a controller. There are many technical tools to choose from, such as TensorFlow, OpenCV and MediaPipe. In the end, I chose Handtrack.js to find hands.
What is Handtrack.js?
Handtrack.js is a library for prototyping realtime hand detection (bounding box), directly in the browser. It frames handtracking as an object detection problem, and uses a trained convolutional neural network to predict bounding boxes for the location of hands in an image.
Why did I choose Handtrack.js?
Because I am more familiar with JavaScript, and case studies of Handtrack.js are easy to find on the Internet. Of course, in terms of speed and accuracy it is not as good as the Python + MediaPipe solution, but given the scale of the project, the current performance is acceptable.
Alternatively, we can process the images captured by the webcam directly in JavaScript, compare consecutive frames, and detect motion from the differences. This is a clever method, and it can also produce video with its own visual style.
You can learn more from this code repository here:
https://github.com/jasonmayes/JS-Motion-Detection
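The frame-comparison idea can be sketched roughly as follows. This is only a minimal illustration and not the code from the repository linked above: the function name `motionRatio` and the default threshold of 30 are assumptions chosen for the example. It expects two RGBA byte arrays of the same size, such as the `data` field of the `ImageData` returned by a canvas 2D context's `getImageData()`.

```javascript
// Hypothetical helper: fraction of pixels whose brightness changed
// noticeably between two frames. `prev` and `curr` are RGBA byte arrays
// of equal length (4 bytes per pixel).
function motionRatio(prev, curr, threshold = 30) {
  if (prev.length !== curr.length) {
    throw new Error("frames must have the same size");
  }
  const pixelCount = prev.length / 4;
  if (pixelCount === 0) return 0;

  let changed = 0;
  for (let i = 0; i < prev.length; i += 4) {
    // Average R, G and B to get a rough brightness value per pixel.
    const before = (prev[i] + prev[i + 1] + prev[i + 2]) / 3;
    const after = (curr[i] + curr[i + 1] + curr[i + 2]) / 3;
    if (Math.abs(before - after) > threshold) changed++;
  }
  return changed / pixelCount;
}
```

A value close to 0 means the scene is still, and a larger value means more of the image changed between the two frames.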
I also tested tensorflow.js with fingerpose. It works great, but we don't need it at the moment.
Handtrack.js is very convenient to use, and its parameters can be configured quickly:
const modelParams = { // load handtrack's default model parameters
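A fuller configuration and detection loop might look like the sketch below. The parameter names and the `startVideo`, `load` and `detect` calls follow the Handtrack.js README, but the values, the names `exampleModelParams` and `startHandTracking`, and the optional `schedule` argument are assumptions made for this example, not the project's actual settings.

```javascript
// Parameter names follow the Handtrack.js README; the values are illustrative.
const exampleModelParams = {
  flipHorizontal: true, // mirror the webcam image
  maxNumBoxes: 1,       // maximum number of boxes to detect
  iouThreshold: 0.5,    // IoU threshold for non-max suppression
  scoreThreshold: 0.7,  // minimum confidence for a prediction
};

// Hypothetical helper: start the webcam, load the model, then keep detecting.
// `handTrack` is the object exposed by the Handtrack.js script, `video` is a
// <video> element, and `onPredictions` receives the prediction array.
async function startHandTracking(handTrack, video, onPredictions, schedule) {
  // In the browser, fall back to requestAnimationFrame for the next detection.
  const nextFrame = schedule || ((fn) => requestAnimationFrame(fn));

  await handTrack.startVideo(video);
  const model = await handTrack.load(exampleModelParams);

  async function detectLoop() {
    const predictions = await model.detect(video);
    onPredictions(predictions);
    nextFrame(detectLoop);
  }

  await detectLoop(); // wait for the first frame; later frames run via nextFrame
  return model;
}
```

In a page that has loaded Handtrack.js, this could be called with the global `handTrack`, the page's video element, and a callback that reads each prediction's `bbox`.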
I learned a lot from these materials:
- Real Time AI GESTURE RECOGNITION with Tensorflow.JS
- Handtrack.js: Hand Tracking Interactions in the Browser using Tensorflow.js
- Controlling 3D Objects with Hands
- Machine learning for everyone: How to implement pose estimation in a browser using your webcam
- Teachable Machine
- Air guitar tutorial
- A modern approach for Computer Vision on the web
- Motion Tracking & Music in < 100 lines of JavaScript
- MediaPipe in JavaScript
2.Binding with audio player
Create an empty audio tag in the .html file:
<audio></audio>
We need to get the page elements in handplay.js:
const audio = document.getElementsByTagName("audio")[0];
The length of predictions tells us whether a hand was detected, and bbox stores the coordinates of the box around each detection. We can read the x and y coordinates of the hand's box from bbox[0] and bbox[1], and then divide the frame into 3 recognition areas:
if (predictions.length > 0) {
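One way to turn the box position into three areas is sketched below. It assumes the frame is split into three vertical strips by the horizontal center of the box, which may differ from how the project divides the frame. The names `ZONE_TRACKS`, `zoneForBox` and `updateTrack` and the audio file names are placeholders.

```javascript
// Hypothetical mapping from zone index to audio file; file names are placeholders.
const ZONE_TRACKS = ["left.mp3", "middle.mp3", "right.mp3"];

// Return 0, 1 or 2 depending on which third of the frame width
// the horizontal center of the box falls in.
function zoneForBox(bbox, frameWidth) {
  const [x, , width] = bbox; // Handtrack.js reports bbox as [x, y, width, height]
  const centerX = x + width / 2;
  const zone = Math.floor((centerX / frameWidth) * 3);
  return Math.min(2, Math.max(0, zone)); // clamp in case the box leaves the frame
}

// Switch the <audio> source only when the zone changes,
// so the track is not restarted on every frame.
let currentZone = -1;
function updateTrack(audio, predictions, frameWidth) {
  if (predictions.length === 0) return currentZone;
  const zone = zoneForBox(predictions[0].bbox, frameWidth);
  if (zone !== currentZone) {
    currentZone = zone;
    audio.src = ZONE_TRACKS[zone];
    audio.play();
  }
  return currentZone;
}
```

Tracking the last zone in `currentZone` is a design choice for the sketch: without it, the music would restart each time a new detection arrives.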
The following articles were very helpful to me:
Making Music In A Browser: Recreating Theremin With JS And Web Audio API
How To Create A Responsive 8-Bit Drum Machine Using Web Audio, SVG And Multitouch
3.Add simple filter effects
In addition to switching music with gestures, I also added simple filters to change the look of the video, mainly through the CSS filter property, using three effects: blur, brightness and contrast. I initially considered implementing something similar to TensorFlow's NST (Neural Style Transfer), but given browser performance and the implementation cost, I chose the more convenient filter solution.
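As a rough sketch of the CSS approach, the three filter functions can be combined into one `filter` value and applied to the video or canvas element. The preset values and the names `ZONE_FILTERS`, `cssFilterFor` and `applyZoneFilter` are assumptions for illustration, and tying one preset to each of three hand positions is only one possible design.

```javascript
// Hypothetical presets, one per hand position (0, 1 or 2); values are illustrative.
const ZONE_FILTERS = [
  { blur: 4, brightness: 1, contrast: 1 },
  { blur: 0, brightness: 1.5, contrast: 1 },
  { blur: 0, brightness: 1, contrast: 2 },
];

// Build a CSS filter string from the three supported effects.
function cssFilterFor(preset) {
  return `blur(${preset.blur}px) brightness(${preset.brightness}) contrast(${preset.contrast})`;
}

// Apply the preset for a given position to any element with a style property.
function applyZoneFilter(element, zone) {
  element.style.filter = cssFilterFor(ZONE_FILTERS[zone]);
}
```

Because the browser applies CSS filters itself, this stays much cheaper than running a style-transfer model on each frame.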
4.Create loops by Tone.js
Tone.js is a Web Audio framework for creating interactive music in the browser. Earlier, I compared Tone.js with magenta.js. The two have many similarities, and magenta.js can also apply machine learning through MusicRNN and MusicVAE. I really wanted to do this and mix ambient sound, background sound and samples together! However, there is not enough material on the Internet about making music with Magenta.js, and after studying it I found that I could not get it working in a short time, so I chose Tone.js. Tone.js loads the audio samples and uses scheduleRepeat() to create the loop effect, and the steps to play are set through <input type="checkbox"> tags.
// .HTML
function sequencer() { // load the audio
function repeat() { // play a single drum
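The checkbox-driven loop can be sketched roughly as below. It is not the project's code: it assumes the Tone.js v14 API (`Tone.Player`, `toDestination()`, `Tone.Transport.scheduleRepeat()`), one row of checkboxes per drum sample, and 8 steps per row. The names `STEPS`, `rowsToPlay`, `readGrid` and `startSequencer` are hypothetical.

```javascript
const STEPS = 8; // hypothetical number of steps per row

// Pure helper: given the checked state of every row, return the indices
// of the rows that should sound at the given step.
function rowsToPlay(grid, step) {
  const hits = [];
  grid.forEach((row, rowIndex) => {
    if (row[step % row.length]) hits.push(rowIndex);
  });
  return hits;
}

// Read each row's checkboxes into an array of booleans.
function readGrid(rowElements) {
  return rowElements.map((row) =>
    Array.from(row.querySelectorAll("input[type=checkbox]"), (box) => box.checked)
  );
}

// Wire the grid to Tone.js. `Tone` is passed in so the sketch has no globals;
// `sampleUrls[i]` is the sample for `rowElements[i]`.
function startSequencer(Tone, sampleUrls, rowElements) {
  const players = sampleUrls.map((url) => new Tone.Player(url).toDestination());
  let step = 0;
  Tone.Transport.scheduleRepeat((time) => {
    const grid = readGrid(rowElements);
    rowsToPlay(grid, step).forEach((rowIndex) => {
      // Skip samples that have not finished loading yet.
      if (players[rowIndex].loaded) players[rowIndex].start(time);
    });
    step = (step + 1) % STEPS;
  }, "8n"); // advance one step every eighth note
  Tone.Transport.start();
}
```

Reading the checkboxes inside the repeated callback means that ticking or clearing a box takes effect the next time that step comes around.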
In addition, because browsers block automatic audio playback, we need to call resume() after a user interaction to work around this:
document.documentElement.addEventListener(
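A minimal sketch of this pattern is shown below. The helper name `unlockAudioOnGesture` and the choice of the `mousedown` event are assumptions. It takes any event target and any object with a `state` property and a `resume()` method, such as a Web Audio `AudioContext` or the context Tone.js exposes as `Tone.context`.

```javascript
// Resume a suspended audio context on the first user gesture,
// then stop listening so the handler only runs once.
function unlockAudioOnGesture(target, audioContext, eventName = "mousedown") {
  function handleGesture() {
    if (audioContext.state !== "running") {
      audioContext.resume();
    }
    target.removeEventListener(eventName, handleGesture);
  }
  target.addEventListener(eventName, handleGesture);
  return handleGesture;
}
```

In the page this could be called once at startup with `document.documentElement` as the target.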
5. Can do more
Now that the whole demo is finished, I have a better understanding of Web Audio and the webcam, but there is still a lot I could add, such as:
- Richer music machine GUI elements, so that you can control the steps and volume and choose more drum types;
- A better-performing tracking library for more detailed interaction, such as using fingers to play audio sources and click checkboxes;
- More experiments with the filter effects, linking their parameters to the mixed audio;
- Using MusicRNN, MusicVAE, etc. to generate audio, so that it can be mixed in more ways;
- Tracking targets other than hands, such as with facemesh.
About this Post
This post is written by Siqi Shu, licensed under CC BY-NC 4.0.