{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# This cell is added by sphinx-gallery\n# It can be customized to whatever you like\n%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optimizing a quantum optical neural network\n===========================================\n\n*Author: PennyLane dev team. Last updated: 15 Apr 2021.*\n\nThis tutorial is based on a paper from [Steinbrecher et al.\n(2019)](https://www.nature.com/articles/s41534-019-0174-7) which\nexplores a Quantum Optical Neural Network (QONN) based on Fock states.\nSimilar to the continuous-variable quantum neural network (CV QNN)\nmodel described by [Killoran et al.\n(2018)](https://journals.aps.org/prresearch/abstract/10.1103/PhysRevResearch.1.033063),\nthe QONN attempts to apply neural networks and deep learning theory to\nthe quantum case, using quantum data as well as a quantum hardware-based\narchitecture.\n\nWe will focus on constructing a QONN as described in Steinbrecher et al.\nand training it to work as a basic CNOT gate using a \"dual-rail\" state\nencoding. This tutorial also provides a working example of how to use\nthird-party optimization libraries with PennyLane; in this case,\n[NLopt](https://nlopt.readthedocs.io/en/latest/) will be used.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![A quantum optical neural network using the Reck encoding (green) with\na Kerr non-linear layer\n(red)](../demonstrations/qonn/qonn_thumbnail.png){width=\"100%\"}\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Background\n==========\n\nThe QONN is an optical architecture consisting of layers of linear\nunitaries, using the encoding described in [Reck et al.\n(1994)](https://dx.doi.org/10.1103/PhysRevLett.73.58), and Kerr\nnon-linearities applied on all involved optical modes. This setup can be\nconstructed using arrays of beamsplitters and programmable phase shifts\nalong with some form of Kerr non-linear material.\n\nBy constructing a cost function based on the input-output relationship\nof the QONN, using the programmable phase-shift variables as\noptimization parameters, it can be trained either to act as an arbitrary\nquantum gate or to generalize to previously unseen data. This is very\nsimilar to classical neural networks, and many classical machine\nlearning tasks can in fact also be solved by these types of quantum deep\nneural networks.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Code and simulations\n====================\n\nThe first thing we need to do is to import PennyLane, NumPy, as well as\nan optimizer. Here we use a wrapped version of NumPy supplied by\nPennyLane which uses Autograd to wrap essential functions to support\nautomatic differentiation.\n\nThere are many optimizers to choose from. We could either use an\noptimizer from the `pennylane.optimize` module or we could use a\nthird-party optimizer. In this case we will use the NLopt library, which\nhas several fast implementations of both gradient-free and\ngradient-based optimizers.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import pennylane as qml\nfrom pennylane import numpy as np\n\nimport nlopt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We create a Strawberry Fields simulator device with as many quantum\nmodes (or wires) as we want our quantum-optical neural network to have.\nFour modes are used for this demonstration, due to the use of a\ndual-rail encoding. The cutoff dimension is set to the same value as the\nnumber of wires (a lower cutoff value will cause loss of information,\nwhile a higher value might use unnecessary resources without any\nimprovement).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"dev = qml.device(\"strawberryfields.fock\", wires=4, cutoff_dim=4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **note**\n>\n> You will need to have [Strawberry\n> Fields](https://strawberryfields.ai/) as well as the [Strawberry\n> Fields plugin](https://pennylane-sf.readthedocs.io/en/latest/) for\n> PennyLane installed for this tutorial to work.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Creating the QONN\n=================\n\nCreate a layer function which defines one layer of the QONN, consisting\nof a linear\n[interferometer](https://pennylane.readthedocs.io/en/stable/code/api/pennylane.templates.subroutines.Interferometer.html)\n(i.e., an array of beamsplitters and phase shifts) and a non-linear Kerr\ninteraction layer. Both the interferometer and the non-linear layer are\napplied to all modes. The triangular mesh scheme, described in [Reck et\nal. (1994)](https://dx.doi.org/10.1103/PhysRevLett.73.58) is chosen here\ndue to its use in the paper from Steinbrecher et al., although any other\ninterferometer scheme should work equally well. Some might even be\nslightly faster than the one we use here.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def layer(theta, phi, wires):\n    M = len(wires)\n    phi_nonlinear = np.pi / 2\n\n    qml.templates.Interferometer(\n        theta, phi, np.zeros(M), wires=wires, mesh=\"triangular\",\n    )\n\n    for i in wires:\n        qml.Kerr(phi_nonlinear, wires=i)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we define the full QONN by building each layer one-by-one and then\nmeasuring the mean photon number of each mode. The parameters to be\noptimized are all contained in `var`, where each element in `var` is a\nlist of parameters `theta` and `phi` for a specific layer.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"@qml.qnode(dev)\ndef quantum_neural_net(var, x):\n    wires = list(range(len(x)))\n\n    # Encode input x into a sequence of quantum fock states\n    for i in wires:\n        qml.FockState(x[i], wires=i)\n\n    # \"layer\" subcircuits\n    for i, v in enumerate(var):\n        layer(v[: len(v) // 2], v[len(v) // 2 :], wires)\n\n    return [qml.expval(qml.NumberOperator(w)) for w in wires]"
]
},
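{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a compact summary of the two code cells above, the network can be\nwritten as a product of layer unitaries. This is only a sketch: the\nsymbols $U_{\\mathrm{int}}$, $U_{\\mathrm{layer}}$ and $L$ are notation\nintroduced here, and the form of the Kerr gate assumes PennyLane's\ndocumented convention for `qml.Kerr`.\n\n$$U_{\\mathrm{layer}}(\\theta, \\phi) = \\left[\\bigotimes_{i=1}^{M} K\\left(\\tfrac{\\pi}{2}\\right)\\right] U_{\\mathrm{int}}(\\theta, \\phi), \\qquad K(\\kappa) = e^{i \\kappa \\hat{n}^2}$$\n\n$$\\langle \\hat{n}_w \\rangle = \\langle x | U^\\dagger \\hat{n}_w U | x \\rangle, \\qquad U = U_{\\mathrm{layer}}^{(L)} \\cdots U_{\\mathrm{layer}}^{(1)}$$\n\nHere $M$ is the number of modes, $L$ the number of layers (the length of\n`var`), $|x\\rangle$ the Fock state prepared from the input `x`, and\n$\\hat{n}_w$ the number operator on mode $w$.\n"
]
},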
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Defining the cost function\n==========================\n\nA helper function is needed to calculate the normalized square loss of\ntwo vectors. The square loss function returns a value between 0 and 1,\nwhere 0 means that `labels` and `predictions` are equal and 1 means that\nthe vectors are fully orthogonal.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def square_loss(labels, predictions):\n    term = 0\n    for l, p in zip(labels, predictions):\n        lnorm = l / np.linalg.norm(l)\n        pnorm = p / np.linalg.norm(p)\n\n        term = term + np.abs(np.dot(lnorm, pnorm.T)) ** 2\n\n    return 1 - term / len(labels)"
]
},
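{
"cell_type": "markdown",
"metadata": {},
"source": [
"In formula form, the helper above computes the following (a sketch: the\nsymbols $N$, $l_k$ and $p_k$ are notation introduced here for\n`len(labels)` and for the $k$-th rows of `labels` and `predictions`):\n\n$$\\mathcal{L} = 1 - \\frac{1}{N} \\sum_{k=1}^{N} \\left| \\frac{l_k}{\\lVert l_k \\rVert} \\cdot \\frac{p_k}{\\lVert p_k \\rVert} \\right|^2$$\n\nFor example, a label $l = [0, 1, 0, 1]$ and a prediction\n$p = [0, 1, 1, 0]$ give\n\n$$\\left| \\frac{l \\cdot p}{\\lVert l \\rVert \\lVert p \\rVert} \\right|^2 = \\left( \\frac{1}{\\sqrt{2} \\cdot \\sqrt{2}} \\right)^2 = \\frac{1}{4},$$\n\nso this pair contributes $1/4$ to the sum, whereas a perfect prediction\n$p = l$ contributes $1$.\n"
]
},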
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we define the cost function to be used during optimization. It\ncollects the outputs from the QONN (`predictions`) for each input\n(`data_input`) and then calculates the square loss between the\npredictions and the true outputs (`labels`).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def cost(var, data_input, labels):\n    predictions = np.array([quantum_neural_net(var, x) for x in data_input])\n    sl = square_loss(labels, predictions)\n\n    return sl"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optimizing for the CNOT gate\n============================\n\nFor this tutorial we will train the network to function as a CNOT gate.\nThat is, it should transform the input states in the following way:\n\n![](../demonstrations/qonn/cnot.png){width=\"30%\"}\n\nWe need to choose the inputs `X` and the corresponding labels `Y`. They\nare defined using the dual-rail encoding, meaning that\n$|0\\rangle = [1, 0]$ (as a vector in the Fock basis of a single mode),\nand $|1\\rangle = [0, 1]$. So a CNOT transformation of\n$|1\\rangle|0\\rangle = |10\\rangle = [0, 1, 1, 0]$ would give\n$|11\\rangle = [0, 1, 0, 1]$.\n\nFurthermore, we want to make sure that the gradient isn't calculated\nwith respect to the inputs or the labels. We can do this by marking them\nwith `requires_grad=False`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Define the CNOT input-output states (dual-rail encoding) and initialize\n# them as non-differentiable.\n\nX = np.array([[1, 0, 1, 0],\n              [1, 0, 0, 1],\n              [0, 1, 1, 0],\n              [0, 1, 0, 1]], requires_grad=False)\n\nY = np.array([[1, 0, 1, 0],\n              [1, 0, 0, 1],\n              [0, 1, 0, 1],\n              [0, 1, 1, 0]], requires_grad=False)"
]
},
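{
"cell_type": "markdown",
"metadata": {},
"source": [
"Written out row by row, the arrays `X` and `Y` above encode the\nfollowing mapping (the two-qubit kets are shorthand introduced here,\nwith the first qubit as the control):\n\n$$\\begin{aligned} |00\\rangle = [1, 0, 1, 0] &\\mapsto [1, 0, 1, 0] = |00\\rangle \\\\ |01\\rangle = [1, 0, 0, 1] &\\mapsto [1, 0, 0, 1] = |01\\rangle \\\\ |10\\rangle = [0, 1, 1, 0] &\\mapsto [0, 1, 0, 1] = |11\\rangle \\\\ |11\\rangle = [0, 1, 0, 1] &\\mapsto [0, 1, 1, 0] = |10\\rangle \\end{aligned}$$\n\nEvery row contains exactly two photons, one per qubit.\n"
]
},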
{
"cell_type": "markdown",
"metadata": {},
"source": [
"At this stage we could play around with other input-output combinations;\njust keep in mind that the input states should contain the same total\nnumber of photons as the output, since we want to use the dual-rail\nencoding. Also, since the QONN will act upon the states as a unitary\noperator, there must be a bijection between the inputs and the outputs,\ni.e., two different inputs must have two different outputs, and vice\nversa.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **note**\n>\n> Other example gates we could use include the dual-rail encoded SWAP\n> gate,\n>\n> ``` {.sourceCode .python}\n> X = np.array([[1, 0, 1, 0],\n>               [1, 0, 0, 1],\n>               [0, 1, 1, 0],\n>               [0, 1, 0, 1]])\n>\n> Y = np.array([[1, 0, 1, 0],\n>               [0, 1, 1, 0],\n>               [1, 0, 0, 1],\n>               [0, 1, 0, 1]])\n> ```\n>\n> the single-rail encoded SWAP gate (remember to change the number of\n> modes to 2 in the device initialization above),\n>\n> ``` {.sourceCode .python}\n> X = np.array([[0, 1], [1, 0]])\n> Y = np.array([[1, 0], [0, 1]])\n> ```\n>\n> or a single three-qubit GHZ state in the dual-rail encoding (which\n> needs 6 modes, and thus might be very heavy on both memory and CPU):\n>\n> ``` {.sourceCode .python}\n> X = np.array([1, 0, 1, 0, 1, 0])\n> Y = (np.array([1, 0, 1, 0, 1, 0]) + np.array([0, 1, 0, 1, 0, 1])) / 2\n> ```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we must set the number of layers to use and then calculate the\ncorresponding number of initial parameter values, initializing them with\na random value between $-2\\pi$ and $2\\pi$. For the CNOT gate two layers\nare enough, although for more complex optimization tasks, many more\nlayers might be needed. Generally, the more layers there are, the richer\nthe representational capabilities of the neural network, and the better\nit will be at finding a good fit.\n\nThe number of variables corresponds to the number of transmittivity\nangles $\\theta$ and phase angles $\\phi$, while the Kerr non-linearity is\nset to a fixed strength.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"num_layers = 2\nM = len(X[0])\nnum_variables_per_layer = M * (M - 1)\n\nrng = np.random.default_rng(seed=1234)\n# Uniform random values in [-2*pi, 2*pi)\nvar_init = (4 * rng.random(size=(num_layers, num_variables_per_layer)) - 2) * np.pi\nprint(var_init)"
]
},
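{
"cell_type": "markdown",
"metadata": {},
"source": [
"The count `M * (M - 1)` used above can be checked by hand. This sketch\nassumes, as in PennyLane's `Interferometer` template, that a mesh on $M$\nmodes contains $M(M-1)/2$ beamsplitters, each with one angle $\\theta$\nand one angle $\\phi$; the $M$ final phase shifts are fixed to zero in\n`layer` and are therefore not counted:\n\n$$\\underbrace{\\frac{M(M-1)}{2}}_{\\theta} + \\underbrace{\\frac{M(M-1)}{2}}_{\\phi} = M(M-1) \\quad \\Rightarrow \\quad M = 4: \\; 4 \\cdot 3 = 12 \\text{ per layer}, \\quad 2 \\cdot 12 = 24 \\text{ in total.}$$\n\nThis matches how `quantum_neural_net` splits each row of `var` in half,\npassing the first half as `theta` and the second half as `phi`.\n"
]
},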
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The NLopt library is used for optimizing the QONN. To use\ngradient-based methods, the cost function must be wrapped so that NLopt\ncan access its gradients. This is done by calculating the gradient using\nAutograd and then writing it in place into the `grad` array (via\n`grad[:] = ...`) inside the wrapped function. The variables are\nflattened to conform to NLopt's requirements and reshaped again before\nbeing passed to the above-defined cost function.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"cost_grad = qml.grad(cost)\n\nprint_every = 1\n\n# Wrap the cost so that NLopt can use it for gradient-based optimizations\nevals = 0\n\n\ndef cost_wrapper(var, grad=None):\n    global evals\n    evals += 1\n\n    # NLopt passes an empty array when the algorithm is gradient-free\n    if grad is not None and grad.size > 0:\n        # Get the gradient for `var` by first \"unflattening\" it\n        var_grad = cost_grad(var.reshape((num_layers, num_variables_per_layer)), X, Y)\n        grad[:] = var_grad.flatten()\n    cost_val = cost(var.reshape((num_layers, num_variables_per_layer)), X, Y)\n\n    if evals % print_every == 0:\n        print(f\"Iter: {evals:4d} Cost: {cost_val:.4e}\")\n\n    return float(cost_val)\n\n\n# Choose an algorithm\nopt_algorithm = nlopt.LD_LBFGS  # Gradient-based\n# opt_algorithm = nlopt.LN_BOBYQA  # Gradient-free\n\nopt = nlopt.opt(opt_algorithm, num_layers * num_variables_per_layer)\n\nopt.set_min_objective(cost_wrapper)\n\nopt.set_lower_bounds(-2 * np.pi * np.ones(num_layers * num_variables_per_layer))\nopt.set_upper_bounds(2 * np.pi * np.ones(num_layers * num_variables_per_layer))\n\nvar = opt.optimize(var_init.flatten())\nvar = var.reshape(var_init.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **note**\n>\n> It\u2019s also possible to use any of PennyLane\u2019s built-in gradient-based\n> optimizers:\n>\n> ``` {.sourceCode .python}\n> from pennylane.optimize import AdamOptimizer\n>\n> opt = AdamOptimizer(0.01, beta1=0.9, beta2=0.999)\n>\n> var = var_init\n> for it in range(200):\n>     var = opt.step(lambda v: cost(v, X, Y), var)\n>\n>     if (it+1) % 20 == 0:\n>         print(f\"Iter: {it+1:5d} | Cost: {cost(var, X, Y):0.7f} \")\n> ```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we print the results.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(f\"The optimized parameters (layers, parameters):\\n {var}\\n\")\n\nY_pred = np.array([quantum_neural_net(var, x) for x in X])\nfor i, x in enumerate(X):\n    print(f\"{x} --> {Y_pred[i].round(2)}, should be {Y[i]}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also print the circuit to see how the final network looks.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"quantum_neural_net(var_init, X[0])\nprint(quantum_neural_net.draw())"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.2"
}
},
"nbformat": 4,
"nbformat_minor": 0
}