October 15, 2021

Magenta in the Browser with Magenta.js

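This post covers Magenta.js, the JavaScript implementation of Magenta. It runs in the browser and can be distributed as a web page. We look at how to present models and how to combine pretrained models, then build a web application that uses GANSynth and MusicVAE to sample audio and sequences.

In Magenta.js, we use the Music RNN and MusicVAE models to generate MIDI sequences, and GANSynth to generate audio.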

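TensorFlow.js & Magenta.js

First we need to know TensorFlow.js (www.tensorflow.org/js), which lets us use and train models in the browser, as well as import and run pretrained models.

Using TensorFlow.js is simple. We can include it with a script tag: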

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js"></script>
<script>
  const model = tf.sequential();
  model.add(tf.layers.dense({units: 1, inputShape: [1]}));
  model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});
</script>

Alternatively, we can install it with npm or yarn and use the following code:

import * as tf from '@tensorflow/tfjs';
const model = tf.sequential();
model.add(tf.layers.dense({units: 1, inputShape:[1]}));
model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

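Both approaches use the tf variable, which is imported along with the script. We will not go into the details of TensorFlow.js here and will focus on Magenta.js instead.

Another strength of TensorFlow.js is that it uses WebGL for its computations, so it gets GPU acceleration without requiring the CUDA libraries to be installed. We do not have to manage the GPU ourselves, because the TensorFlow.js backend handles it for us.

Next, we need to know what Magenta.js can do. Magenta.js cannot train models itself, but it can load models that have already been trained. Another limitation is that it does not support every Magenta model. The pretrained models that Magenta.js provides are: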

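**Onsets and Frames:** piano transcription, which converts raw audio into MIDI.

**Music RNN (LSTM):** monophonic and polyphonic MIDI generation, including the Melody RNN, Drums RNN, Improv RNN and Performance RNN models.

**MusicVAE:** single or multi-instrument sampling, including GrooVAE.

**Piano Genie:** maps an 8-key input onto an 88-key piano.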

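Generating instruments in the browser with GANSynth

In this section, we use GANSynth to sample single instrument notes, each a short audio clip of 4 seconds. We then layer the clips to get interesting effects. We first create the HTML page and import the required scripts, then write the GANSynth sampling code and explain each part in detail.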

<html lang="en">
<head>
  <title>Music Generation With Magenta.js - GanSynth example</title>
  <style>
    * {
      font-family: monospace;
    }

    canvas {
      width: 100%;
    }
  </style>
</head>
<body>
<div>
  <h1>Music Generation With Magenta.js - GANSynth example</h1>
  <p>
    Press "Sample GANSynth note" to sample a new note using GANSynth and play
    it immediately. You can layer as many notes as you want, each note will
    loop each 4 seconds.
  </p>
  <p>
    Reload the page to stop.
  </p>
  <p>
    <button disabled id="button-sample-gansynth-note">
      Sample GANSynth note
    </button>
  </p>
  <div id="container-plots"></div>
</div>
<script
    src="https://cdn.jsdelivr.net/npm/@magenta/music@1.12.0/dist/magentamusic.min.js"></script>
</body>
</html>

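This page contains a button that triggers GANSynth generation and a container in which the generated spectrograms are drawn.

There are two ways to use Magenta.js in the browser:

- We can import all of Magenta.js from dist/magentamusic.min.js. The Magenta documentation calls this the ES5 bundle. It contains Magenta.js and all of its dependencies, including TensorFlow.js and Tone.js.
- We can import only the parts of Magenta.js that we need, using the ES6 bundles. For example, to use the GANSynth model we need to import Tone.js, TensorFlow.js, Magenta.js core and Magenta.js GANSynth.

Here is how to import the GANSynth model with the ES6 bundles: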

<script src="https://cdn.jsdelivr.net/npm/tone@13.8.25/build/Tone.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.4.0/dist/tf.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@magenta/music@^1.0.0/es6/core.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@magenta/music@^1.0.0/es6/gansynth.js"></script>

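After importing the GANSynth model this way, we instantiate it with new gansynth.GANSynth(...). With the ES6 bundles, each script has to be imported individually.

Now let's write the GANSynth code and walk through it.

First, we get the DOM elements and initialize GANSynth: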

// Get DOM elements
const buttonSampleGanSynthNote = document.getElementById("button-sample-gansynth-note");
const containerPlots = document.getElementById("container-plots");

// Starts the GANSynth model and initializes it. When finished, enables
// the button to start the sampling
async function startGanSynth() {
  const ganSynth = new mm.GANSynth("https://storage.googleapis.com/" +
      "magentadata/js/checkpoints/gansynth/acoustic_only");
  await ganSynth.initialize();
  window.ganSynth = ganSynth;
  buttonSampleGanSynthNote.disabled = false;
}

We instantiate GANSynth with mm.GANSynth(). With the Magenta.js ES6 bundles, we would use the following code instead:

const ganSynth = new gansynth.GANSynth("https://storage.googleapis.com/" +
    "magentadata/js/checkpoints/gansynth/acoustic_only");

Here gansynth.GANSynth replaces mm.GANSynth.

Now let's write an asynchronous function that draws a spectrogram on a canvas:

// Plots the spectrogram of the given channel
// see music/demos/gansynth.ts:28 in magenta.js source code
async function plotSpectra(spectra, channel) {
  const spectraPlot = mm.tf.tidy(() => {
      // Slice a single example.
    let spectraPlot = mm.tf.slice(spectra, [0, 0, 0, channel], [1, -1, -1,  1])
                      .reshape([128, 1024]);
    // Scale to [0, 1].
    spectraPlot = mm.tf.sub(spectraPlot, mm.tf.min(spectraPlot));
    spectraPlot = mm.tf.div(spectraPlot, mm.tf.max(spectraPlot));
    return spectraPlot;
  });
  // Plot on canvas
  const canvas = document.createElement("canvas");
  containerPlots.appendChild(canvas);
  await mm.tf.browser.toPixels(spectraPlot, canvas);
  spectraPlot.dispose();
}

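This function builds a spectrogram plot and draws it on a new canvas element, which it appends to containerPlots. A new canvas is added after every generation.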
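The scaling step inside plotSpectra is a min-max normalization. As a minimal sketch in plain JavaScript, without TensorFlow.js, the same arithmetic on an ordinary array could look like the following. The function name minMaxScale is hypothetical and not part of Magenta.js, and the guard for a constant input is an addition that the tensor version does not have.

```javascript
// Hypothetical helper: rescales numbers so the smallest becomes 0 and the
// largest becomes 1, mirroring the tf.sub / tf.div steps in plotSpectra.
function minMaxScale(values) {
  const min = Math.min(...values);
  const shifted = values.map((v) => v - min);
  const max = Math.max(...shifted);
  // Guard against a constant input, where every shifted value is 0.
  if (max === 0) {
    return shifted;
  }
  return shifted.map((v) => v / max);
}

console.log(minMaxScale([-4, 0, 4])); // [ 0, 0.5, 1 ]
```

Scaling to the range 0 to 1 matters here because toPixels expects float values in that range when it paints the canvas.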

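You may have noticed tf.tidy and dispose. Both are used to avoid memory leaks: because TensorFlow.js computes with WebGL, the WebGL resources have to be released explicitly.

The async and await keywords mark asynchronous code. A function declared with async returns a promise, and await waits for that promise to produce its value. In a classic script, await can only be used inside an async function. In our example, mm.tf.browser.toPixels is asynchronous, so we use await to wait for it to finish. We can also call an asynchronous function with the Promise syntax, for example Promise.all([myAsyncMethod()]).

Next, we write an asynchronous function that samples GANSynth, plays the result and plots it: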
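The two calling styles can be compared in a small standalone sketch. The function loadAfter is a hypothetical stand-in for an asynchronous step such as loading a model; it is not a Magenta.js function.

```javascript
// Hypothetical stand-in for an asynchronous step such as loading a model.
function loadAfter(name, milliseconds) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(name + " ready"), milliseconds);
  });
}

// await pauses this function, not the whole page, until the promise resolves.
async function loadOne() {
  const result = await loadAfter("modelA", 10);
  return result;
}

// Promise.all waits for several promises at once and resolves to an array.
function loadBoth() {
  return Promise.all([loadAfter("modelA", 10), loadAfter("modelB", 20)]);
}

loadOne().then((result) => console.log(result));
loadBoth()
  .then((results) => console.log(results))
  .catch((error) => console.error(error));
```

Note that a try/catch block only catches the rejection of a promise that is awaited inside it. When a promise is started without await, as in the last statement, errors have to be handled with catch on the promise itself.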

// Samples a single note of 4 seconds from GANSynth and plays it repeatedly
async function sampleGanNote() {
  const lengthInSeconds = 4.0;
  const sampleRate = 16000;
  const length = lengthInSeconds * sampleRate;
  
  // The sampling returns a spectrogram, convert that to audio in
  // a tone.js buffer
  const specgrams = await ganSynth.randomSample(60);
  const audio = await ganSynth.specgramsToAudio(specgrams);
  const audioBuffer = mm.Player.tone.context.createBuffer(
      1, length, sampleRate);
  audioBuffer.copyToChannel(audio, 0, 0);

  // Play the sampled audio using tone.js and loop it
  const playerOptions = {"url": audioBuffer, "loop": true, "volume": -25};
  const player = new mm.Player.tone.Player(playerOptions).toMaster();
  player.start();
  
  // Plot the resulting spectrograms
  await plotSpectra(specgrams, 0);
  await plotSpectra(specgrams, 1);
}

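We first call GANSynth's randomSample method with a pitch of 60, which is C4. This tells the model which pitch to generate, and it returns a spectrogram that we convert to audio with specgramsToAudio. Finally, we copy the audio into a Tone.js buffer and play the sample.

The player is instantiated with mm.Player.tone.Player. With the ES6 bundles, it would be:

const player = new Tone.Player(playerOptions).toMaster();

Finally, let's bind the button to the sampling action and initialize GANSynth: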
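The two numbers used in sampleGanNote, pitch 60 and a buffer length of 4 seconds at 16000 Hz, can be checked with a short sketch. Both helper names are hypothetical, and the note naming assumes the common convention in which MIDI pitch 60 is written C4.

```javascript
const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// Hypothetical helper: converts a MIDI pitch to scientific pitch notation,
// using the convention where MIDI pitch 60 is C4.
function midiToNoteName(pitch) {
  const octave = Math.floor(pitch / 12) - 1;
  return NOTE_NAMES[pitch % 12] + octave;
}

// Number of audio samples needed for a clip of the given duration.
function bufferLength(seconds, sampleRate) {
  return seconds * sampleRate;
}

console.log(midiToNoteName(60)); // C4
console.log(bufferLength(4.0, 16000)); // 64000
```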

// Add on click handler to call the GANSynth sampling
buttonSampleGanSynthNote.addEventListener("click", () => {
  sampleGanNote();
});

// Call the initialization of GANSynth. The error handler is attached with
// catch because try/catch does not see the rejection of a promise that is
// not awaited.
Promise.all([startGanSynth()]).catch((error) => {
  console.error(error);
});

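We attach the sampleGanNote function to the button's click event and initialize GANSynth by calling startGanSynth.

Launching the web application

Now that we have our web app, we can test the code.

Each time we sample GANSynth, a plot with two spectrograms is added to the page. The Sample GANSynth note button becomes available once the model has finished initializing.

Go ahead and generate a few sounds: layering different notes can produce some interesting effects.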

Generating a trio using MusicVAE

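We now use the MusicVAE model in Magenta.js to generate sequences and play them in the browser with Tone.js. We use the trio checkpoint, which means we generate three sequences at once: the drum kit, the bass and the lead.

First, we define the page structure and import the script: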

<html lang="en">
<body>
<div>
  <button disabled id="button-sample-musicae-trio">
    Sample MusicVAE trio
  </button>
  <canvas id="canvas-musicvae-plot"></canvas>
</div>
<script
    src="https://cdn.jsdelivr.net/npm/@magenta/music@1.12.0/dist/magentamusic.min.js"></script>
<script>
  // MusicVAE code
</script>
</body>
</html>

Then we initialize the MusicVAE model:

// Get DOM element
const buttonSampleMusicVaeTrio = document.getElementById("button-sample-musicae-trio");
const canvasMusicVaePlot = document.getElementById("canvas-musicvae-plot");

// Starts the MusicVAE model and initializes it. When finished, enables
// the button to start the sampling
async function startMusicVae() {
  const musicvae = new mm.MusicVAE("https://storage.googleapis.com/" +
      "magentadata/js/checkpoints/music_vae/trio_4bar");
  await musicvae.initialize();
  window.musicvae = musicvae;
  buttonSampleMusicVaeTrio.disabled = false;
}

We now create a new Tone.js player to play the three generated sequences:

// Declares a new player that has 3 synths: one for the drum kit (only the
// bass drum), one for the bass and one for the lead.
class Player extends mm.BasePlayer {
  bassDrumSynth = new mm.Player.tone.MembraneSynth().toMaster();
  bassSynth = new mm.Player.tone.Synth({
    volume: 5,
    oscillator: {type: "triangle"}
  }).toMaster();
  leadSynth = new mm.Player.tone.PolySynth(5).toMaster();

  // Plays the note at the proper time using tone.js
  playNote(time, note) {
    let frequency, duration, synth;
    if (note.isDrum) {
      if (note.pitch === 35 || note.pitch === 36) {
        // If this is a bass drum, we use the kick pitch for an eighth note
        // and the bass drum synth
        frequency = "C2";
        duration = "8n";
        synth = this.bassDrumSynth;
      }
    } else {
      // If this is a bass note or lead note, we convert the frequency and
      // the duration for tone.js and fetch the proper synth
      frequency = new mm.Player.tone.Frequency(note.pitch, "midi");
      duration = note.endTime - note.startTime;
      if (note.program >= 32 && note.program <= 39) {
        synth = this.bassSynth;
      } else {
        synth = this.leadSynth;
      }
    }
    if (synth) {
      synth.triggerAttackRelease(frequency, duration, time, 1);
    }
  }
}

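This code extends the mm.BasePlayer class, so we only need to implement the playNote method to play a sequence. We first define three synths: bassDrumSynth, bassSynth and leadSynth:

- The bass drum synth only plays the bass drum, which is identified by note.isDrum together with MIDI pitch 35 or 36. It always plays at the frequency of C2 for the length of an eighth note (8n), using Tone.js's MembraneSynth. For percussion, each instrument is assigned to a specific note pitch. For example, pitch 35 is Acoustic Bass Drum.
- The bass synth only plays programs 32 to 39, using Tone.js's Synth with a triangle waveform. In the MIDI specification, the program designates which instrument plays. For example, program 1 is Acoustic Grand Piano and program 33 is Acoustic Bass in the 1-based General MIDI listing. The program values stored in a note sequence are 0-based, which is why the code checks 32 to 39.
- The lead synth uses Tone.js's PolySynth with 5 voices.

For the bass and the lead, we first convert the MIDI note into a Tone.js frequency using the Frequency class.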
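The pitch conversion and the synth routing described above can be sketched without Tone.js. The formula is the standard equal-temperament conversion with A4 (MIDI pitch 69) at 440 Hz. The helper chooseSynth is hypothetical and only returns the name of the synth that playNote would pick.

```javascript
// Equal-temperament conversion with A4 (MIDI pitch 69) tuned to 440 Hz.
function midiToFrequency(pitch) {
  return 440 * Math.pow(2, (pitch - 69) / 12);
}

// Hypothetical helper mirroring the branches of playNote: returns the name of
// the synth that would play the note, or null if the note is skipped.
function chooseSynth(note) {
  if (note.isDrum) {
    return note.pitch === 35 || note.pitch === 36 ? "bassDrumSynth" : null;
  }
  return note.program >= 32 && note.program <= 39 ? "bassSynth" : "leadSynth";
}

console.log(midiToFrequency(69)); // 440
console.log(midiToFrequency(60).toFixed(2)); // 261.63
console.log(chooseSynth({ isDrum: true, pitch: 36 })); // bassDrumSynth
console.log(chooseSynth({ isDrum: true, pitch: 42 })); // null
console.log(chooseSynth({ isDrum: false, pitch: 40, program: 33 })); // bassSynth
console.log(chooseSynth({ isDrum: false, pitch: 72, program: 0 })); // leadSynth
```

The second drum example shows a consequence of the design: every drum other than the bass drum is silently dropped, because no synth is assigned to it.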

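Another important topic is the note envelope, which is applied by Tone.js's triggerAttackRelease method. An envelope determines how a note's loudness evolves while it is heard: it opens to let the sound through, then closes again to silence it, and the slope controls how quickly that happens. The opening is called the attack and the closing is called the release. Each time we call the trigger method, the synth sounds for the given duration, using the configured slopes.

A related term is ADSR (Attack, Decay, Sustain, Release), which is a more detailed form of envelope.

Let's sample the MusicVAE model: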
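To make the four ADSR stages concrete, here is a minimal sketch of an envelope with linear segments. It is an illustration only and is not how Tone.js implements its envelopes; all function and parameter names are hypothetical. Times are in seconds and sustain is a level between 0 and 1.

```javascript
// Amplitude while the note is held: rise during the attack, fall to the
// sustain level during the decay, then stay at the sustain level.
function heldAmplitude(t, attack, decay, sustain) {
  if (t < attack) return t / attack;
  if (t < attack + decay) return 1 - (1 - sustain) * ((t - attack) / decay);
  return sustain;
}

// Hypothetical linear ADSR envelope: amplitude (0 to 1) at time t for a note
// held for noteDuration seconds.
function adsrAmplitude(t, noteDuration, { attack, decay, sustain, release }) {
  if (t < 0) return 0;
  if (t < noteDuration) return heldAmplitude(t, attack, decay, sustain);
  // Release: fade linearly from the level reached at note-off down to 0.
  const levelAtRelease = heldAmplitude(noteDuration, attack, decay, sustain);
  const sinceRelease = t - noteDuration;
  if (sinceRelease >= release) return 0;
  return levelAtRelease * (1 - sinceRelease / release);
}

const envelope = { attack: 0.25, decay: 0.25, sustain: 0.5, release: 0.5 };
console.log(adsrAmplitude(0.125, 1, envelope)); // 0.5  (halfway through the attack)
console.log(adsrAmplitude(0.375, 1, envelope)); // 0.75 (halfway through the decay)
console.log(adsrAmplitude(0.75, 1, envelope)); // 0.5  (sustain)
console.log(adsrAmplitude(1.25, 1, envelope)); // 0.25 (halfway through the release)
console.log(adsrAmplitude(2, 1, envelope)); // 0    (after the release)
```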

// Samples a trio of drum kit, bass and lead from MusicVAE and plays it
// repeatedly at 120 QPM
async function sampleMusicVaeTrio() {
  const samples = await musicvae.sample(1);
  const sample = samples[0];
  new mm.PianoRollCanvasVisualizer(sample, canvasMusicVaePlot,
      {"pixelsPerTimeStep": 50});
  const player = new Player();
  mm.Player.tone.Transport.loop = true;
  mm.Player.tone.Transport.loopStart = 0;
  mm.Player.tone.Transport.loopEnd = 8;
  player.start(sample, 120);
}

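First, we sample the MusicVAE model by calling sample with an argument of 1, which asks for one sample. We then draw the note sequence with mm.PianoRollCanvasVisualizer on the canvas we declared earlier. Finally, we start playing the sample at 120 QPM and loop over its 8 seconds using Tone.js's Transport class. MusicVAE models have a fixed length: with the 4-bar trio model, each sample lasts 8 seconds at 120 QPM.

Finally, let's bind the button to the sampling action and initialize the MusicVAE model: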
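The 8-second loop length follows from the bar count and the tempo. A minimal sketch of the arithmetic, assuming 4 quarter notes per bar (the function name is hypothetical):

```javascript
// Duration in seconds of a sequence, given its length in bars and the tempo
// in quarter notes per minute (QPM). Assumes 4 quarter notes per bar.
function sequenceDurationSeconds(bars, qpm, quartersPerBar = 4) {
  const totalQuarters = bars * quartersPerBar;
  return (totalQuarters * 60) / qpm;
}

console.log(sequenceDurationSeconds(4, 120)); // 8
console.log(sequenceDurationSeconds(4, 60)); // 16
```

If the tempo passed to player.start changes, the loopEnd value on the Transport has to change with it, otherwise the loop cuts the sequence short or leaves a gap.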

// Add on click handler to call the MusicVAE sampling
buttonSampleMusicVaeTrio.addEventListener("click", (event) => {
  sampleMusicVaeTrio();
  event.target.disabled = true;
});

// Calls the initialization of MusicVAE
Promise.all([startMusicVae()]).catch((error) => {
  console.error(error);
});

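We bind the button to the sampleMusicVaeTrio function, then initialize the MusicVAE model with startMusicVae. As you can see, we use Promise.all to call the asynchronous function we prepared earlier. When the Sample MusicVAE trio button is pressed, MusicVAE samples a sequence, draws it and plays it with the synths we defined. The generated plot is basic and does not distinguish the instruments, but it can be customized through the PianoRollCanvasVisualizer class. Reload the page to generate a new sequence.

Using a SoundFont for more realistic instrument sounds

When listening to the generated sound, you may notice that it is rather basic. That is because we used Tone.js's default synths, which are convenient but not ideal. Tone.js synths can be customized to sound better.

Alternatively, we can use a SoundFont. A SoundFont contains recorded notes of different instruments. In Magenta.js, we can use SoundFontPlayer instead of Player: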

const player = new mm.SoundFontPlayer("https://storage.googleapis.com/" +
 "magentadata/js/soundfonts/salamander");
player.start(sequence, 120);

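Playing the generated instrument in the trio

We now have a three-instrument sequence generated by MusicVAE and audio generated by GANSynth. Let's connect the two.

We define the page structure and import the script: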

<html lang="en">
<body>
<div>
  <button disabled id="button-sample-musicae-trio">
    Sample MusicVAE trio
  </button>
  <button disabled id="button-sample-gansynth-note">
    Sample GANSynth note for the lead synth
  </button>
  <canvas id="canvas-musicvae-plot"></canvas>
  <div id="container-plots"></div>
</div>
<script
    src="https://cdn.jsdelivr.net/npm/@magenta/music@1.12.0/dist/magentamusic.min.js"></script>
<script>
  // MusicVAE + GANSynth code
</script>
</body>
</html>

Let's initialize the MusicVAE and GANSynth models:

// Get DOM elements
const buttonSampleGanSynthNote = document
    .getElementById("button-sample-gansynth-note");
const buttonSampleMusicVaeTrio = document
    .getElementById("button-sample-musicae-trio");
const containerPlots = document
    .getElementById("container-plots");
const canvasMusicVaePlot = document
    .getElementById("canvas-musicvae-plot");

// Starts the MusicVAE model and initializes it. When finished, enables
// the button to start the sampling
async function startMusicVae() {
  const musicvae = new mm.MusicVAE("https://storage.googleapis.com/" +
      "magentadata/js/checkpoints/music_vae/trio_4bar");
  await musicvae.initialize();
  window.musicvae = musicvae;
  buttonSampleMusicVaeTrio.disabled = false;
}

// Starts the GANSynth model and initializes it
async function startGanSynth() {
  const ganSynth = new mm.GANSynth("https://storage.googleapis.com/" +
      "magentadata/js/checkpoints/gansynth/acoustic_only");
  await ganSynth.initialize();
  window.ganSynth = ganSynth;
}

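Here we only enable the MusicVAE sampling button. The GANSynth sampling button becomes available after MusicVAE has generated a sequence.

- Keep the plotSpectra function as written.
- Keep the Player class that holds the synths. We could set leadSynth = null, since it will be replaced by the GANSynth generation, but that is not required.
- Keep the sampleMusicVaeTrio function, but also store the instantiated player as a global variable with window.player = player, because GANSynth needs to change the lead synth later.

We rewrite the sampleGanNote function to add a sample player: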

// Samples a single note of 4 seconds from GANSynth and plays it repeatedly
async function sampleGanNote() {
  const lengthInSeconds = 4.0;
  const sampleRate = 16000;
  const length = lengthInSeconds * sampleRate;
  
  // The sampling returns a spectrogram, convert that to audio in a tone.js buffer
  const specgrams = await ganSynth.randomSample(60);
  const audio = await ganSynth.specgramsToAudio(specgrams);
  const audioBuffer = mm.Player.tone.context.createBuffer(1, length, sampleRate);
  audioBuffer.copyToChannel(audio, 0, 0);
  
  // Plays the sample using tone.js by using C4 as a base note, since this is
  // what we asked the model for (MIDI pitch 60). If the sequence contains
  // other notes, the pitch will be changed automatically.
  // Assumption: the volume node and its -10 dB level are illustrative values.
  const volume = new mm.Player.tone.Volume(-10);
  const instrument = new mm.Player.tone.Sampler({"C4": audioBuffer});
  instrument.chain(volume, mm.Player.tone.Master);
  window.player.leadSynth = instrument;
  
  // Plots the resulting spectrograms
  await plotSpectra(specgrams, 0);
  await plotSpectra(specgrams, 1);
}

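First, we sample a random instrument from GANSynth using randomSample. We then need to play that sample through a Tone.js synth, so we use the Sampler class, which takes a dictionary that maps note names to audio buffers. Because we sampled the model at MIDI pitch 60, we use C4 as the key for the resulting audio buffer. Finally, we put the synth into the player with window.player.leadSynth = instrument.

We bind the sampling functions to the buttons, then initialize the MusicVAE and GANSynth models: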
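When the sequence asks for a pitch other than C4, the sampler has to shift the recording. As a sketch of the arithmetic, assuming the shift is done by changing the playback speed of the recorded buffer, the rate for a given target pitch would be as follows (the function name is hypothetical):

```javascript
// Playback-rate ratio needed to shift a recording made at basePitch so that
// it sounds at targetPitch (both are MIDI pitches). Assumes the shift is
// done by resampling, so the duration changes along with the pitch.
function playbackRate(targetPitch, basePitch = 60) {
  return Math.pow(2, (targetPitch - basePitch) / 12);
}

console.log(playbackRate(60)); // 1
console.log(playbackRate(72)); // 2
console.log(playbackRate(48)); // 0.5
```

Under that assumption, a note one octave above C4 plays at twice the speed, so the 4-second GANSynth clip would last only 2 seconds, and notes far from C4 sound less natural. Sampling GANSynth at several pitches and adding them to the Sampler dictionary would be one way to reduce the effect.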

// Add on click handler to call the MusicVAE sampling
buttonSampleMusicVaeTrio.addEventListener("click", (event) => {
  sampleMusicVaeTrio();
  event.target.disabled = true;
  buttonSampleGanSynthNote.disabled = false;
});

// Add onclick handler to call the GANSynth sampling
buttonSampleGanSynthNote.addEventListener("click", () => {
  sampleGanNote();
});

// Calls the initialization of MusicVAE and GANSynth
Promise.all([startMusicVae(), startGanSynth()]).catch((error) => {
  console.error(error);
});

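After clicking the Sample MusicVAE trio button, MusicVAE should sample a sequence, plot it and play it with the synths.

Using the Web Workers API to offload computation from the UI thread

In the previous example, you may have noticed that the audio cuts out while you use the Sample GANSynth note for the lead synth button.

This happens because concurrency in JavaScript is built on the event loop model, in which all tasks are handled on the UI thread. This works well in general because JavaScript uses non-blocking I/O: most costly operations return immediately and deliver their result later through events and callbacks. However, a long computation that runs synchronously blocks the UI thread. That is exactly what happens while GANSynth generates a sample.

Our solution is to use the Web Workers API, which offloads the computation to another thread so that it does not block the UI thread. A web worker is basically a JavaScript file that is started from the main thread and runs in its own thread. It can send messages to the main thread and receive messages from it. The Web Workers API is mature and well supported across browsers.

Let's write the main-thread part of the JavaScript code: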
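The blocking behavior can be reproduced in isolation. In this minimal sketch, busyWork is a hypothetical stand-in for a long synchronous computation such as the GANSynth generation. The timer is due immediately, yet it cannot fire until the computation has finished.

```javascript
const events = [];

// Scheduled to run as soon as possible, but only once the current task ends.
setTimeout(() => {
  events.push("timer fired");
  console.log(events); // [ 'computation finished', 'timer fired' ]
}, 0);

// Hypothetical stand-in for a long synchronous computation.
function busyWork(iterations) {
  let total = 0;
  for (let i = 0; i < iterations; i++) {
    total += i % 7;
  }
  return total;
}

busyWork(5000000);
events.push("computation finished");
```

Audio scheduling and UI updates are in the same position as the timer callback here: they wait behind whatever the thread is currently computing.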

// Starts a new worker that will load the MusicVAE model
const worker = new Worker("sketch.js");
worker.onmessage = function (event) {
  const message = event.data[0];
  if (message ==="initialized") {
    // When the worker sends the "initialized" message, we enable the button to sample the model
    buttonSampleMusicVaeTrio.disabled = false;
  }
  if (message ==="sample") {
    // When the worker sends the "sample" message, we take the data (the note
    // sequence sample) from the event, then create and start a new player
    // using the sequence
    const data = event.data[1];
    const sample = data[0];
    const player = new mm.Player();
    mm.Player.tone.Transport.loop = true;
    mm.Player.tone.Transport.loopStart = 0;
    mm.Player.tone.Transport.loopEnd = 8;
    player.start(sample, 120);
  }
};

// Add click handler to call the MusicVAE sampling, by posting a message to
// the web worker, which samples and returns the sequence using a message
const buttonSampleMusicVaeTrio = document.getElementById("button-sample-musicvae-trio");
buttonSampleMusicVaeTrio.addEventListener("click", (event) => {
  worker.postMessage([]);
  event.target.disabled = true;
});

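Let's go through the code above and see how the web worker is created and how messages are passed between the main thread and the worker:

- First, we start the worker with new Worker("sketch.js"). This runs the JavaScript file and returns a handle that we assign to a variable.
- Then, we assign the worker's onmessage property, which is called whenever the worker calls postMessage. The data property of the event can carry anything we want:
  - If the worker sends initialized as the first element of the data array, the worker has finished initializing.
  - If the worker sends sample as the first element of the data array, the worker has sampled a MusicVAE sequence and is returning it as the second element of the data array.
- Finally, when the button in the HTML page is clicked, we call postMessage on the worker. A web worker shares no state with the main thread, so all data has to be exchanged through messages, using onmessage and postMessage.

Now let's write the worker's JavaScript code: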
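The message convention used in this section, an array whose first element is the message type and whose second element is the payload, can be isolated in a small dispatcher. This is a sketch only: describeMessage and the action names are hypothetical, and the function returns a description instead of touching the DOM or a player, which makes it easy to test outside the browser.

```javascript
// Hypothetical dispatcher for messages shaped like [type, payload].
function describeMessage(data) {
  const [type, payload] = data;
  switch (type) {
    case "initialized":
      return { action: "enable-button" };
    case "sample":
      // The payload is the array returned by sample(1); take its only sequence.
      return { action: "play", sequence: payload[0] };
    default:
      return { action: "ignore" };
  }
}

console.log(describeMessage(["initialized"]));
console.log(describeMessage(["sample", [{ notes: [] }]]));
console.log(describeMessage(["unknown"]));
```

In the page, the onmessage handler would call such a function with event.data and then act on the result.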

importScripts("https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.4.0/dist/tf.min.js");
importScripts("https://cdn.jsdelivr.net/npm/@magenta/music@^1.12.0/es6/core.js");
importScripts("https://cdn.jsdelivr.net/npm/@magenta/music@^1.12.0/es6/music_vae.js");
            
async function initialize() {
  musicvae = new music_vae.MusicVAE("https://storage.googleapis.com/" + "magentadata/js/checkpoints/music_vae/trio_4bar");
  await musicvae.initialize();
  postMessage(["initialized"]);
}

onmessage = function (event) {
  Promise.all([musicvae.sample(1)])
  	.then(samples => postMessage(["sample", samples[0]]));
};

Promise.all([initialize()]).catch((error) => {
  console.error(error);
});

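First, we send an initialized message to the main thread with postMessage once the model is ready to go. Second, we assign the worker's onmessage property, which is called whenever the main thread sends the worker a message. In that handler, we sample the MusicVAE model and send the result back to the main thread with postMessage. That is all it takes to create a web worker and exchange data with the main thread.

Using other Magenta.js models

We cannot cover every model use case here, but the other models are used in a similar way. Because Magenta.js runs in the browser, it is harder to make it interact with other applications, but that is also what makes it simpler.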

About this Post

This post is written by Siqi Shu, licensed under CC BY-NC 4.0.