CN113496281B - Optoelectronic computing system

Info

Publication number: CN113496281B
Application number: CN202110291311.8A
Authority: CN
Inventors: 孟怀宇; Y.徐; G.亨德里; L.欧; J.邓; R.加格农; 卢正观; M.斯坦曼; M.埃文斯; 吴建华; 沈亦晨
Original assignee: Photon Smart Private Technology Co ltd
Current assignee: Photon Smart Private Technology Co ltd
Priority date: 2020-03-19
Filing date: 2021-03-18
Publication date: 2025-03-07
Anticipated expiration: 2041-03-18
Also published as: TW202147060A; CN113496281A

Abstract

An optoelectronic computing system includes: a first semiconductor die having a photonic integrated circuit (PIC) and a second semiconductor die having an electronic integrated circuit (EIC). The PIC includes an optical waveguide, wherein input values are encoded on respective optical signals carried by the optical waveguide. The PIC includes an optical replication distribution network having an optical splitter. The PIC includes an array of optoelectronic circuit portions, each optoelectronic circuit portion receiving an optical wave from one of the output ports of the optical replication distribution network, and each optoelectronic circuit portion includes: at least one photodetector that detects at least one optical wave from the optoelectronic operation. The EIC includes an electrical input port that receives a corresponding electrical value. The first semiconductor die is electrically coupled to the second semiconductor die with a controlled collapse chip connection, wherein the electrical output port of the PIC is connected to one of the electrical input ports of the EIC.

Description

Optoelectronic computing system

Cross Reference to Related Applications

The present application claims priority from PCT application PCT/US2020/023674, filed on 3/19/2020, and from U.S. provisional patent application 63/061,995, filed on 8/6/2020. The entire disclosures of the above applications are incorporated herein by reference.

Technical Field

The present disclosure relates to an optoelectronic computing system.

Background

Neuromorphic computing is an approach to approximating brain operations in the electronic domain. One prominent approach to neuromorphic computing is the artificial neural network (ANN), which is a collection of artificial neurons interconnected in a specific manner to process information in a way similar to how the brain functions. Artificial neural networks have found use in a variety of applications including artificial intelligence, speech recognition, text recognition, natural language processing, and various forms of pattern recognition.

An artificial neural network has an input layer, one or more hidden layers, and an output layer. Each layer has nodes, or artificial neurons, and the nodes are interconnected between layers. Each node of a hidden layer computes a weighted sum of the signals received from the nodes of the previous layer and applies a nonlinear transformation ("activation") to the weighted sum to produce an output. The weighted sums may be calculated by performing a matrix multiplication step. Thus, artificial neural network computations typically involve multiple matrix multiplication steps, which are typically performed using electronic integrated circuits.
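As an illustration of the node computation described above, the following minimal Python sketch computes the outputs of one layer as a matrix-vector product followed by an activation. The function names, the example weights and inputs, and the choice of ReLU as the nonlinear transformation are assumptions made for illustration and are not taken from this disclosure.

```python
def weighted_sums(weights, inputs):
    """Matrix-vector product: one weighted sum per node of the layer."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def relu(value):
    """Example nonlinear transformation (an assumed choice of activation)."""
    return max(0.0, value)


def layer_forward(weights, inputs, activation=relu):
    """Apply the activation to each node's weighted sum."""
    return [activation(s) for s in weighted_sums(weights, inputs)]


# Hypothetical 2-node layer fed by 3 nodes of the previous layer.
example_weights = [[0.5, -1.0, 2.0],
                   [1.0, 1.0, -3.0]]
example_inputs = [1.0, 2.0, 0.5]
example_outputs = layer_forward(example_weights, example_inputs)
```

Each row of `example_weights` holds the weights of one node, so evaluating a whole layer amounts to a single matrix multiplication step followed by an element-wise activation.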

Computation performed on electronic data, encoded in analog or digital form on an electrical signal (e.g., a voltage or current), is typically implemented using electronic computing hardware, such as analog or digital electronics implemented in an integrated circuit (e.g., a processor, an application-specific integrated circuit (ASIC), or a system on a chip (SoC)), an electronic circuit board, or other electronic circuitry. Optical signals have been used to transmit data over long distances and over shorter distances (e.g., within a data center). Operations performed on such optical signals typically take place in the context of optical data transmission, such as within devices for switching or filtering optical signals in a network. The use of optical signals in computing platforms has been more limited. Various components and systems for all-optical computing have been proposed. Such systems may include conversion from and to electrical signals at the input and output, respectively, but may not use both types of signals (electrical and optical) for significant operations performed in the computation.

Disclosure of Invention

In general, in a first aspect, an optoelectronic computing system includes a first semiconductor die and a second semiconductor die. The first semiconductor die includes a photonic integrated circuit (PIC) that includes: a plurality of optical waveguides, wherein a set of multiple input values is encoded on respective optical signals carried by the optical waveguides; an optical replication distribution network including a plurality of optical splitters, wherein each optical splitter transmits half of the power of an input optical wave at an input port to each of two output ports; and an array of optoelectronic circuit portions, each receiving an optical wave from one of the output ports of the optical replication distribution network, and each including at least one photodetector that detects at least one optical wave from an optoelectronic operation and at least one conductor integrated in the photonic integrated circuit that is electrically coupled to the photodetector and to an electrical output port. The second semiconductor die includes an electronic integrated circuit (EIC) that includes a plurality of electrical input ports, each receiving a respective electrical value. The second semiconductor die is electrically coupled to the first semiconductor die, with the electrical output port of the PIC connected to one of the electrical input ports of the EIC.

Embodiments of the system may include one or more of the following features.

Each optoelectronic circuit portion includes an optoelectronic operation module that performs an operation between (1) an optical value based on one of the input values scaled by the optical replication distribution network and (2) an electrical value provided by an electrical input port, at least one photodetector that detects at least one optical wave from the optoelectronic operation, and at least one wire integrated in the photonic integrated circuit that is electrically coupled to the photodetector and to an electrical output port.

The electronic integrated circuit further includes a plurality of digital-to-analog converters (DACs) that provide electrical values to respective electrical output ports, and the electrical input ports of the photonic integrated circuit are connected to the electrical output ports of the electronic integrated circuit.

The optical splitters are arranged as nodes in a binary tree arrangement connected by optical waveguides as links in the binary tree arrangement.

The optical replication distribution network comprises a plurality of binary tree arrangements, each binary tree arrangement distributing a different one of the plurality of input values encoded on the respective optical signal.
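The power reaching each optoelectronic circuit portion through such a binary tree can be sketched as follows. This is a minimal illustration that assumes ideal, lossless splitters that each send exactly half of the input power to each output port; the function names `distribute` and `replication_network` are hypothetical.

```python
def distribute(input_power, depth):
    """Leaf powers of a binary tree of ideal 50/50 splitters with the given depth."""
    powers = [input_power]
    for _level in range(depth):
        # Each splitter sends half of its input power to each of two output ports.
        powers = [p / 2 for p in powers for _port in range(2)]
    return powers


def replication_network(input_powers, depth):
    """One binary tree per encoded input value; returns the leaf powers of each tree."""
    return [distribute(p, depth) for p in input_powers]
```

Under these assumptions a tree of depth d delivers 2^d copies, each carrying 1/2^d of the input power, which is the scaling that the downstream operation modules see.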

The light propagation lengths between the root of the binary tree arrangement and the different optoelectronic circuit portions are all different from each other.

The optical waveguides in the optical replication distribution network are arranged in the first semiconductor die to avoid crossing any optical waveguides in the optical replication distribution network.

The optoelectronic circuit portions are arranged in a plurality of substantially straight lines on the first semiconductor die.

The plurality of substantially straight lines are optically coupled to each other by one or more optical waveguides in the optical replication distribution network.

A portion of the wires integrated in the photonic integrated circuit connects the photodetector to a junction between wires from different optoelectronic circuit portions.

The optoelectronic operation module includes a Mach-Zehnder interferometer (MZI) configured to perform a multiplication operation between (1) an optical value based on one of the input values scaled by the optical replication distribution network and (2) an electrical value provided by an electrical input port.

The electronic integrated circuit also includes a transimpedance amplifier having an input electrically coupled to an electrical output port of the photonic integrated circuit.

In another aspect, a system includes a first unit configured to generate a plurality of modulator control signals, and a processing unit. The processing unit includes a light source or port configured to provide a plurality of optical outputs, and a first set of optical modulators coupled to the light source or port and to the first unit. The optical modulators of the first set are configured to modulate the plurality of optical outputs provided by the light source or port, based on digital input values corresponding to a first set of the plurality of modulator control signals, to produce an optical input vector comprising a plurality of optical signals. The processing unit further includes a matrix multiplication unit that includes a second set of optical modulators. The matrix multiplication unit is coupled to the first unit and is configured to convert the optical input vector into an analog output vector based on a plurality of digital weight values corresponding to a second set of the plurality of modulator control signals applied to the second set of optical modulators. At least one optical modulator of at least one of the first or second sets of optical modulators is configured to modulate an optical signal based on a first modulator control signal of the plurality of modulator control signals, and the first unit is configured to shape the first modulator control signal to include a bandwidth enhancement associated with amplitude changes that correspond to changes between consecutive digital values of the first modulator control signal.

Embodiments of the system may include one or more of the following features. The system may include a second unit coupled to the matrix multiplication unit and configured to convert the analog output vector to a digital output vector, and a controller. The controller may include an integrated circuit configured to perform operations including: receiving an artificial neural network computation request that includes an input data set, the input data set including a first digital input vector; receiving a first plurality of neural network weights; and generating, by the first unit, a first plurality of modulator control signals based on the first digital input vector and a first plurality of weight control signals based on the first plurality of neural network weights.

The first unit may include a digital-to-analog converter (DAC).

The system may include a storage unit configured to store a data set and a plurality of neural network weights.

The integrated circuit of the controller may be further configured to perform operations comprising storing the input data set and the first plurality of neural network weights in the storage unit.

The controller may include an application-specific integrated circuit (ASIC), and receiving the artificial neural network computation request may include receiving the artificial neural network computation request from a general-purpose data processor.

The first unit, the processing unit, the second unit, and the controller may be disposed on at least one of a multi-chip module or an integrated circuit. Receiving the artificial neural network computation request may include receiving the artificial neural network computation request from a second data processor, wherein the second data processor is external to the multi-chip module or integrated circuit and is coupled to the multi-chip module or integrated circuit through a communication channel, and the processing unit may process data at a data rate that is at least an order of magnitude greater than the data rate of the communication channel.

The first unit, the processing unit, the second unit, and the controller may be used in an electro-optical processing loop that is repeated over multiple iterations. The electro-optical processing loop includes (1) at least a first optical modulation operation based on at least one of the modulator control signals and at least a second optical modulation operation based on at least one of the weight control signals, and (2) at least one of (a) an electrical summation operation or (b) an electrical storage operation.

The electro-optical processing loop may include an electrical storage operation, and the electrical storage operation may be performed using a storage unit coupled to the controller. The operations performed by the controller may further include storing the input data set and the first plurality of neural network weights in the storage unit.

The electro-optical processing loop may include an electrical summation operation, and the electrical summation operation may be performed using an electrical summation module within the matrix multiplication unit. The electrical summation module may be configured to generate currents corresponding to elements of the analog output vector, each current representing a sum of respective elements of the optical input vector multiplied by respective neural network weights.

The first modulator control signal may comprise an analog signal associated with a plurality of predetermined amplitude levels, and each of the amplitude levels is associated with a different corresponding digital value.

The first modulator control signal may comprise an analog signal associated with two predetermined amplitude levels, and each of the amplitude levels is associated with a different corresponding binary value.

The consecutive digital values may comprise a plurality of consecutive binary values in a series of binary values.

The controller may be configured to shape the first modulator control signal to include a bandwidth enhancement for an initial portion of a second time interval by increasing the magnitude of the amplitude change between a first predetermined amplitude level associated with a first time interval and a second predetermined amplitude level associated with the second time interval.

A series of binary values may be used to determine an amplitude level of a first modulator control signal for modulating an optical signal according to a non-return-to-zero (NRZ) modulation mode.
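The shaping described above resembles what is commonly called pre-emphasis. The following minimal sketch, under stated assumptions, builds a sampled two-level NRZ control waveform and exaggerates each level change during the initial portion of the new bit interval. The function name `shape_nrz` and the parameters `samples_per_bit`, `boost_samples`, and `boost_fraction` are hypothetical and do not come from this disclosure.

```python
def shape_nrz(bits, low=0.0, high=1.0, samples_per_bit=4,
              boost_samples=1, boost_fraction=0.25):
    """Sampled NRZ waveform with an overshoot after every level change."""
    if not bits:
        return []
    levels = [high if bit else low for bit in bits]
    waveform = []
    previous = levels[0]
    for level in levels:
        step = level - previous  # zero when consecutive values are equal
        for index in range(samples_per_bit):
            if index < boost_samples:
                # Initial portion of the interval: enlarge the amplitude change.
                waveform.append(level + boost_fraction * step)
            else:
                # Remainder of the interval: the predetermined amplitude level.
                waveform.append(level)
        previous = level
    return waveform
```

When consecutive values are equal, the step is zero and the waveform simply holds the predetermined level, so the extra drive is spent only on transitions.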

The first unit may be configured to shape the first modulator control signal to include the bandwidth enhancement by pumping a current between a diode structure of a first modulator in the second set of optical modulators and a capacitance connected in series between the diode structure and a circuit providing the first modulator control signal, and the amount of charge transferred by the pumped current may be determined based at least in part on a constant voltage over a period of time in which the consecutive digital values are provided.

In another general aspect, an apparatus includes a plurality of optical waveguides coupled to a first set of optical amplitude modulators, wherein a set of multiple input values is encoded on respective optical signals carried by the optical waveguides using the first set of optical amplitude modulators. The apparatus includes a plurality of replication modules, and for each of at least two subsets of one or more of the optical signals, a corresponding set of one or more of the replication modules is configured to split the subset of one or more optical signals into two or more copies of the optical signals. The apparatus includes a plurality of multiplication modules, each including an optical amplitude modulator of a second set of optical amplitude modulators, and for each of at least two copies of a first subset of one or more optical signals, a corresponding one of the multiplication modules is configured to multiply the one or more optical signals of the first subset by one or more matrix element values using the optical amplitude modulators of the second set. The apparatus includes one or more summation modules, and for the results of two or more of the multiplication modules, a corresponding one of the summation modules is configured to produce an electrical signal representing a sum of those results. At least one optical amplitude modulator of at least one of the first set or the second set of optical amplitude modulators is configured to modulate an optical signal by a modulation value using power that increases monotonically with the absolute value of the modulation value.

Embodiments of the apparatus may include one or more of the following features. At least one optical amplitude modulator of at least one of the first set of optical amplitude modulators or the second set of optical amplitude modulators may comprise a coherence sensitive optical amplitude modulator configured to modulate the optical signal by a modulation value based on interference between optical waves, the optical waves having a coherence length at least as long as a propagation distance through the coherence sensitive optical amplitude modulator.

The coherence sensitive optical amplitude modulator may include a Mach-Zehnder interferometer (MZI) that splits an optical wave guided by an input optical waveguide into a first optical waveguide arm of the Mach-Zehnder interferometer and a second optical waveguide arm of the Mach-Zehnder interferometer. The first optical waveguide arm may include an active phase shifter that imparts a relative phase shift with respect to a phase delay of the second optical waveguide arm, and the Mach-Zehnder interferometer may combine optical waves from the first optical waveguide arm and the second optical waveguide arm into at least one output optical waveguide.

The power used to modulate the optical signal by the modulation value may include power applied to an active phase shifter.

The input values in the set of multiple input values encoded on the respective optical signals may represent elements of an input vector multiplied by a matrix comprising one or more matrix element values.

A set of the plurality of output values may be encoded on a plurality of respective electrical signals generated by one or more summing modules, and the output values of the set of the plurality of output values may represent elements of an output vector, the output vector being generated by multiplying the input vector by a matrix.
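The copy, multiply, and sum structure described above can be sketched numerically as follows. This is a simplified model that assumes lossless replication in which each copy carries an equal share of the encoded value, matrix element values that an amplitude modulator can apply directly, and a final rescaling to undo the split; the function names are hypothetical.

```python
def copy_signals(input_values, copies):
    """Replication modules: each copy carries an equal share of every input value."""
    return [[value / copies for value in input_values] for _copy in range(copies)]


def vector_matrix_multiply(matrix, input_values):
    """One copy of the input vector per matrix row, multiplied element-wise and summed."""
    rows = len(matrix)
    copied = copy_signals(input_values, rows)
    outputs = []
    for row, signals in zip(matrix, copied):
        # Multiplication modules: one matrix element value per optical signal.
        products = [element * signal for element, signal in zip(row, signals)]
        # Summation module: add the results, then rescale to undo the split.
        outputs.append(rows * sum(products))
    return outputs
```

For example, `vector_matrix_multiply([[1.0, 0.5, 0.0], [0.25, 1.0, 0.5]], [2.0, 4.0, 8.0])` evaluates the two output elements of the matrix-vector product, one per row of the matrix.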

Each of the optical signals carried by the optical waveguides may comprise an optical wave having a common wavelength that is substantially the same for all of the optical signals.

The replication module may include at least one replication module having an optical splitter that transmits power of a predetermined proportion of the optical waves at an input port of the replication module to a first output port of the replication module and transmits power of the remaining proportion of the optical waves at the input port of the replication module to a second output port of the replication module.

The optical splitter may include a waveguide splitter that transmits power of a predetermined proportion of the optical waves guided by the input optical waveguide of the replication module to the first output optical waveguide of the replication module and transmits power of the remaining proportion of the optical waves guided by the input optical waveguide of the replication module to the second output optical waveguide of the replication module.

The guided modes of the input optical waveguide may be adiabatically coupled to a plurality of guided modes of each of the first and second output optical waveguides.

The optical splitter may include a beam splitter including at least one surface that transmits a predetermined proportion of the power of the optical wave at the input port and reflects the remaining proportion of the power of the optical wave at the input port.

At least one of the plurality of optical waveguides may include an optical fiber coupled to an optical coupler that couples a guided mode of the optical fiber to a free-space propagation mode.

The multiplication module may comprise at least one coherence sensitive optical amplitude modulator configured to multiply one or more optical signals of the first subset by one or more matrix element values based on interference between the optical waves, the optical waves having a coherence length at least as long as a propagation distance through the coherence sensitive optical amplitude modulator.

The coherence sensitive optical amplitude modulator may include a Mach-Zehnder interferometer (MZI) that splits an optical wave guided by an input optical waveguide into a first optical waveguide arm of the Mach-Zehnder interferometer and a second optical waveguide arm of the Mach-Zehnder interferometer. The first optical waveguide arm may include a phase shifter that imparts a relative phase shift with respect to a phase delay of the second optical waveguide arm, and the Mach-Zehnder interferometer may combine optical waves from the first optical waveguide arm and the second optical waveguide arm into at least one output optical waveguide.

The Mach-Zehnder interferometer may combine optical waves from the first optical waveguide arm and the second optical waveguide arm into each of a first output optical waveguide and a second output optical waveguide. A first photodetector may receive an optical wave from the first output optical waveguide to produce a first photocurrent, a second photodetector may receive an optical wave from the second output optical waveguide to produce a second photocurrent, and the result of the coherence sensitive optical amplitude modulator may include a difference between the first photocurrent and the second photocurrent.
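The effect of taking the difference between the two photocurrents can be illustrated with the textbook model of an ideal, lossless MZI with balanced 50/50 couplers, which is an assumption and not a description of any specific device in this disclosure. In that model the two output powers are P·cos²(φ/2) and P·sin²(φ/2), so their difference is P·cos(φ), a signed quantity. The function names, the port labeling, and the unit responsivity are hypothetical.

```python
import math


def mzi_output_powers(input_power, phase_shift):
    """Output powers of an ideal, lossless MZI for a relative phase shift in radians."""
    first = input_power * math.cos(phase_shift / 2) ** 2
    second = input_power * math.sin(phase_shift / 2) ** 2
    return first, second


def balanced_result(input_power, phase_shift, responsivity=1.0):
    """Difference between the first and second photocurrents."""
    first, second = mzi_output_powers(input_power, phase_shift)
    return responsivity * first - responsivity * second


def phase_for_weight(weight):
    """Phase shift whose balanced result equals weight * input power, for weight in [-1, 1]."""
    return math.acos(weight)
```

Under these assumptions the phase shift selects a multiplier anywhere between -1 and 1, which is how a detection scheme based on non-negative optical powers can still represent negative values.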

The coherence sensitive optical amplitude modulator can include one or more ring resonators including at least one ring resonator coupled to the first optical waveguide and at least one ring resonator coupled to the second optical waveguide.

A first photodetector may receive an optical wave from the first optical waveguide to produce a first photocurrent, a second photodetector may receive an optical wave from the second optical waveguide to produce a second photocurrent, and the result of the coherence sensitive optical amplitude modulator may include a difference between the first photocurrent and the second photocurrent.

The multiplication module may include at least one coherence-insensitive optical amplitude modulator configured to multiply one or more optical signals of the first subset by one or more matrix element values based on absorption of energy within the optical wave.

The coherence-insensitive optical amplitude modulator may comprise an electroabsorption modulator.

The one or more summing modules may include at least one summing module having (1) two or more input conductors, each of the input conductors carrying an electrical signal in the form of an input current, the magnitude of the input current representing a respective result of a respective one of the multiplication modules, and (2) at least one output conductor carrying an electrical signal representing a sum of the respective results in the form of an output current, the output current being proportional to the sum of the input currents.

The two or more input conductors and the output conductor may comprise wires that contact at one or more junctions between the wires, and the output current may be approximately equal to the sum of the input currents.

At least a first one of the input currents may be provided in the form of at least one photocurrent generated by at least one photodetector that receives the optical signal generated by the first one of the multiplication modules.

The first input current may be provided in the form of a difference between two photocurrents generated by different respective photodetectors receiving different respective optical signals generated by the first multiplication module.
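The summation at a wire junction follows Kirchhoff's current law. A minimal sketch, assuming an ideal junction with no leakage and using hypothetical function names, with each input current formed as the difference of two photocurrents:

```python
def differential_current(first_photocurrent, second_photocurrent):
    """Input current formed as the difference between two photocurrents."""
    return first_photocurrent - second_photocurrent


def junction_sum(input_currents):
    """Output current at an ideal junction: the sum of the input currents."""
    return sum(input_currents)


# Hypothetical photocurrent pairs from three multiplication modules.
photocurrent_pairs = [(3.0, 1.0), (0.5, 2.0), (1.0, 1.0)]
input_currents = [differential_current(a, b) for a, b in photocurrent_pairs]
output_current = junction_sum(input_currents)
```

Because each input current may be positive or negative, the single output current can represent a sum of signed products.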

One of the copies of the first subset of one or more optical signals may be comprised of a single optical signal on which one of the input values is encoded.

The multiplication module corresponding to the copy of the first subset may multiply the encoded input value by a single matrix element value.

One of the copies of the first subset of one or more optical signals may include more than one optical signal and less than all of the optical signals on which the plurality of input values are encoded.

The multiplication module corresponding to the copy of the first subset may multiply the encoded input values by different respective matrix element values.

Different multiplication modules corresponding to different respective copies of the first subset of one or more optical signals may be contained in different devices that are in optical communication with each other to transmit one of the copies of the first subset of one or more optical signals between the different devices.

Two or more of the plurality of optical waveguides, two or more of the plurality of replication modules, two or more of the plurality of multiplication modules, and at least one of the one or more summation modules may be arranged on a substrate of a common device.

The device may perform vector matrix multiplication, where an input vector may be provided as a set of optical signals and an output vector may be provided as a set of electrical signals.

The apparatus may further include an accumulator that integrates an input electrical signal corresponding to the output of a multiplication module or a summation module, wherein the input electrical signal is encoded using time-domain encoding that applies on-off amplitude modulation within each of a plurality of time slots, and the accumulator produces an output electrical signal encoded in more than two amplitude levels, the amplitude levels corresponding to different duty cycles of the time-domain encoding over the plurality of time slots.
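As a rough illustration of how integrating a two-level signal over several time slots yields a multi-level output, the sketch below encodes a value as the number of "on" slots and accumulates a fixed charge per "on" slot. The function names and the `slot_charge` parameter are hypothetical, and an ideal integrator without leakage is assumed.

```python
def encode_duty_cycle(level, num_slots):
    """Time-domain encoding: `level` of the `num_slots` time slots are on."""
    if not 0 <= level <= num_slots:
        raise ValueError("level must be between 0 and num_slots")
    return [1] * level + [0] * (num_slots - level)


def accumulate(on_off_slots, slot_charge=1.0):
    """Integrate an on-off encoded signal over all of its time slots."""
    total = 0.0
    for slot in on_off_slots:
        if slot:
            total += slot_charge
    return total
```

With four time slots, the accumulated output can take five distinct amplitude levels, one for each possible duty cycle from 0/4 to 4/4.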

Each of the two or more multiplication modules may correspond to a different subset of the one or more optical signals.

The apparatus may further include, for each copy of a second subset of one or more optical signals different from the first subset, a multiplication module configured to multiply the one or more optical signals of the second subset by one or more matrix element values using optical amplitude modulation.

In another general aspect, a method includes: encoding a set of multiple input values on respective optical signals using a first set of optical amplitude modulators; for each of at least two subsets of one or more of the optical signals, using a corresponding set of one or more replication modules to split the subset of one or more optical signals into two or more copies of the optical signals; for each of at least two copies of a first subset of one or more optical signals, using a corresponding multiplication module to multiply the one or more optical signals of the first subset by one or more matrix element values using an optical amplitude modulator of a second set of optical amplitude modulators; and for the results of two or more of the multiplication modules, using a summation module configured to produce an electrical signal representing a sum of those results. At least one optical amplitude modulator of at least one of the first set or the second set of optical amplitude modulators is configured to modulate an optical signal by a modulation value using power that increases monotonically with the absolute value of the modulation value.

In another general aspect, a system includes: a storage unit configured to store a data set and a plurality of neural network weights; a digital-to-analog converter (DAC) unit configured to generate a plurality of modulator control signals and a plurality of weight control signals; an optical processor; an analog-to-digital converter (ADC) unit; and a controller. The optical processor includes: a laser unit configured to generate a plurality of optical outputs; a plurality of optical modulators coupled to the laser unit and the DAC unit and configured to generate an optical input vector by modulating the plurality of optical outputs generated by the laser unit based on the plurality of modulator control signals; an optical matrix multiplication unit coupled to the plurality of optical modulators and the DAC unit and configured to convert the optical input vector into an optical output vector based on the plurality of weight control signals; and a photodetection unit coupled to the optical matrix multiplication unit and configured to generate a plurality of output voltages corresponding to the optical output vector. The ADC unit is coupled to the photodetection unit and configured to convert the plurality of output voltages into a plurality of digital optical outputs. The controller includes an integrated circuit configured to perform operations including: receiving, from a computer, an artificial neural network computation request that includes an input data set and a first plurality of neural network weights, the input data set including a first digital input vector; storing the input data set and the first plurality of neural network weights in the storage unit; and generating, by the DAC unit, a first plurality of modulator control signals based on the first digital input vector and a first plurality of weight control signals based on the first plurality of neural network weights.

Embodiments of the system may include one or more of the following features. For example, the operations may further include: obtaining a first plurality of digital optical outputs from the ADC unit corresponding to the optical output vector of the optical matrix multiplication unit, the first plurality of digital optical outputs forming a first digital output vector; performing a nonlinear transformation on the first digital output vector to produce a first transformed digital output vector; and storing the first transformed digital output vector in the storage unit.
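The sequence of operations above (digital-to-analog conversion, optical matrix multiplication, detection, analog-to-digital conversion, nonlinear transformation, storage) can be sketched as a purely numerical model. Everything in this sketch is an assumption made for illustration: the idealized `quantize` converter, the normalization of the matrix-vector product by the vector length, the `offset_relu` nonlinearity, and the use of a Python list as the storage unit.

```python
def quantize(value, bits, full_scale=1.0):
    """Idealized converter: clip to [0, full_scale] and round to 2**bits levels."""
    steps = 2 ** bits - 1
    clipped = min(max(value, 0.0), full_scale)
    return round(clipped / full_scale * steps) / steps * full_scale


def offset_relu(value, offset=0.5):
    """Assumed example of a nonlinear transformation."""
    return max(0.0, value - offset)


def forward_pass(weights, input_vector, storage, bits=8, nonlinearity=offset_relu):
    # DAC unit: digital values become modulator and weight control signals.
    modulator_signals = [quantize(x, bits) for x in input_vector]
    weight_signals = [[quantize(w, bits) for w in row] for row in weights]
    # Optical matrix multiplication and photodetection, idealized as a
    # normalized matrix-vector product so that outputs stay within full scale.
    n = len(modulator_signals)
    output_voltages = [sum(w * x for w, x in zip(row, modulator_signals)) / n
                       for row in weight_signals]
    # ADC unit: output voltages become the digital output vector.
    digital_output = [quantize(v, bits) for v in output_voltages]
    # Controller: nonlinear transformation, then storage of the result.
    transformed = [nonlinearity(v) for v in digital_output]
    storage.append(transformed)
    return transformed
```

Feeding a returned vector back in as the next `input_vector`, together with the weights of the next layer, models how successive layers of an artificial neural network could reuse the same hardware.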

The system may have a first loop period defined as the time elapsed between the step of storing the input data set and the first plurality of neural network weights in the storage unit and the step of storing the first transformed digital output vector in the storage unit. The first loop period may be less than or equal to 1 ns.

In some embodiments, the operations may further include outputting an artificial neural network output generated based on the first transformed digital output vector.

In some embodiments, the operations may further include generating, by the DAC unit, a second plurality of modulator control signals based on the first transformed digital output vector.

In some embodiments, the artificial neural network computation request may further include a second plurality of neural network weights, and the operations may further include, based on obtaining the first plurality of digital optical outputs, generating, by the DAC unit, a second plurality of weight control signals based on the second plurality of neural network weights. The first plurality of neural network weights and the second plurality of neural network weights may correspond to different layers of the artificial neural network.

In some embodiments, the input data set may further include a second digital input vector, and the operations may further include: generating, by the DAC unit, a second plurality of modulator control signals based on the second digital input vector; obtaining a second plurality of digital optical outputs from the ADC unit corresponding to the optical output vector of the optical matrix multiplication unit, the second plurality of digital optical outputs forming a second digital output vector; performing a nonlinear transformation on the second digital output vector to generate a second transformed digital output vector; storing the second transformed digital output vector in the storage unit; and outputting an artificial neural network output generated based on the first transformed digital output vector and the second transformed digital output vector. The optical output vector of the optical matrix multiplication unit results from a second optical input vector, generated based on the second plurality of modulator control signals, that is transformed by the optical matrix multiplication unit based on the first-mentioned plurality of weight control signals.

In some embodiments, the system may further include an analog nonlinear unit disposed between the photodetection unit and the ADC unit, the analog nonlinear unit configured to receive the plurality of output voltages from the photodetection unit, apply a nonlinear transfer function, and output a plurality of converted output voltages to the ADC unit, and the operations further include deriving a first plurality of converted digital output voltages corresponding to the plurality of converted output voltages from the ADC unit, the first plurality of converted digital output voltages forming a first converted digital output vector, and storing the first converted digital output vector in the storage unit.
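The ordering described above, in which the nonlinear transfer function is applied in the analog domain before digitization, can be illustrated with a minimal sketch. The thresholded transfer function, the 8-bit resolution, and the names `analog_nonlinear`, `adc`, `v_threshold`, and `v_full_scale` are assumptions chosen for illustration; the text does not specify them.

```python
import numpy as np

def analog_nonlinear(voltages, v_threshold=0.0):
    """Hypothetical transfer function: pass the part of each voltage above a threshold."""
    return np.maximum(np.asarray(voltages, dtype=float) - v_threshold, 0.0)

def adc(voltages, bits=8, v_full_scale=1.0):
    """Idealized ADC: clip to [0, v_full_scale] and quantize to integer codes."""
    levels = 2**bits - 1
    clipped = np.clip(voltages, 0.0, v_full_scale)
    return np.rint(clipped / v_full_scale * levels).astype(int)

# Output voltages as they might leave the photodetection unit (made-up values).
output_voltages = np.array([-0.2, 0.0, 0.25, 1.0])

# Nonlinearity first (analog), then conversion to digital codes.
converted_voltages = analog_nonlinear(output_voltages)
codes = adc(converted_voltages)
print(codes.tolist())
```

Because the nonlinearity is applied before the ADC, the digital side only has to store the converted codes rather than compute the transformation itself.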

In some embodiments, the integrated circuit of the controller may be configured to generate the first plurality of modulator control signals at a rate greater than or equal to 8 GHz.

In some embodiments, the system may further include an analog storage unit disposed between the DAC unit and the plurality of optical modulators, the analog storage unit configured to store an analog voltage and output the stored analog voltage, and an analog nonlinear unit disposed between the photo detection unit and the ADC unit, the analog nonlinear unit configured to receive the plurality of output voltages from the photo detection unit, apply a nonlinear transfer function, and output the plurality of converted output voltages. The analog storage unit may include a plurality of capacitors.

In some embodiments, the analog storage unit may be configured to receive and store a plurality of converted output voltages of the analog nonlinear unit and output the stored plurality of converted output voltages to the plurality of optical modulators, and the operations may further include storing the plurality of converted output voltages of the analog nonlinear unit in the analog storage unit based on generating the first plurality of modulator control signals and the first plurality of weight control signals, outputting the stored converted output voltages through the analog storage unit, deriving a second plurality of converted digital output voltages from the ADC unit, the second plurality of converted digital output voltages forming a second converted digital output vector, and storing the second converted digital output vector in the storage unit.

In some embodiments, the input data set of the artificial neural network computation request may include a plurality of digital input vectors. The laser unit may be configured to generate a plurality of wavelengths. The plurality of optical modulators may include optical modulator banks configured to generate a plurality of optical input vectors, each optical modulator bank corresponding to one of the plurality of wavelengths and generating a respective optical input vector having a respective wavelength, and an optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector comprising the plurality of wavelengths. The photodetecting unit may be further configured to demultiplex the plurality of wavelengths and generate a plurality of demultiplexed output voltages. The operations may include obtaining a plurality of digital demultiplexed optical outputs from the ADC unit, the plurality of digital demultiplexed optical outputs forming a plurality of first digital output vectors, wherein each of the plurality of first digital output vectors corresponds to one of the plurality of wavelengths, performing a nonlinear transformation on each of the plurality of first digital output vectors to produce a plurality of transformed first digital output vectors, and storing the plurality of transformed first digital output vectors in the storage unit. Each of the plurality of digital input vectors may correspond to one of the plurality of optical input vectors.
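The wavelength-multiplexed arrangement can be pictured numerically as one matrix acting on several input vectors at once. The sketch below is a simplified model under the assumption that the matrix multiplication unit applies the same weights at every wavelength; the array shapes, the random test data, and the ReLU-style `nonlinearity` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
num_wavelengths, n_in, n_out = 3, 4, 2

# One weight matrix shared by all wavelengths (assumed wavelength-independent).
weights = rng.uniform(-1.0, 1.0, size=(n_out, n_in))

# One digital input vector per wavelength / modulator bank.
digital_inputs = rng.uniform(0.0, 1.0, size=(num_wavelengths, n_in))

# "Multiplexing": the combined input carries one vector per wavelength (axis 0).
combined_input = digital_inputs

# A single pass through the matrix unit transforms every wavelength together.
combined_output = combined_input @ weights.T  # shape (num_wavelengths, n_out)

def nonlinearity(v):
    """Hypothetical digital nonlinearity applied after demultiplexing."""
    return np.maximum(v, 0.0)

# "Demultiplexing": one output vector per wavelength, each transformed separately.
transformed = [nonlinearity(combined_output[k]) for k in range(num_wavelengths)]
```

In this model a single pass yields as many output vectors as there are wavelengths, which is the throughput benefit the paragraph describes.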

In some embodiments, the artificial neural network computation request may include a plurality of digital input vectors. The laser unit may be configured to generate a plurality of wavelengths. The plurality of optical modulators may include optical modulator groups configured to generate a plurality of optical input vectors, each optical modulator group corresponding to one of the plurality of wavelengths and generating a respective optical input vector having a respective wavelength, and an optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector comprising the plurality of wavelengths. The operations may include obtaining a first plurality of digital light outputs corresponding to a light output vector from an ADC unit, the light output vector comprising a plurality of wavelengths, the first plurality of digital light outputs forming a first digital output vector, performing a nonlinear transformation on the first digital output vector to produce a first transformed digital output vector, and storing the first transformed digital output vector in a storage unit.

In some embodiments, the DAC unit may comprise a 1-bit DAC subunit configured to generate a plurality of 1-bit modulator control signals. The resolution of the ADC unit may be 1 bit. The resolution of the first digital input vector may be N bits. The operations may include decomposing the first digital input vector into N 1-bit input vectors, each of the N 1-bit input vectors corresponding to one of the N bits of the first digital input vector, generating a sequence of N 1-bit modulator control signals corresponding to the N 1-bit input vectors by the 1-bit DAC subunit, deriving a sequence of N digital 1-bit optical outputs corresponding to the sequence of N 1-bit modulator control signals from the ADC unit, constructing an N-bit digital output vector from the sequence of N digital 1-bit optical outputs, performing a nonlinear transformation on the constructed N-bit digital output vector to generate a transformed N-bit digital output vector, and storing the transformed N-bit digital output vector in the storage unit.
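The bit-serial procedure can be sketched as follows. This is an illustrative model only: it assumes unsigned integer inputs, a fixed comparator threshold of 0.5, and that the k-th 1-bit output becomes bit k of the constructed N-bit output; none of these details are stated in the text, and the function names are hypothetical. For a general weight matrix the constructed vector is not the same as a full-precision matrix product.

```python
import numpy as np

def decompose_bits(x, n_bits):
    """Split unsigned integers into n_bits 1-bit vectors, least significant bit first."""
    x = np.asarray(x, dtype=np.int64)
    return [(x >> k) & 1 for k in range(n_bits)]

def one_bit_adc(values, threshold=0.5):
    """Idealized 1-bit conversion: 1 where the analog value exceeds the threshold."""
    return (np.asarray(values) > threshold).astype(np.int64)

def bit_serial_pass(x, weights, n_bits, threshold=0.5):
    """Send the N 1-bit vectors through the matrix one at a time and reassemble."""
    output = np.zeros(weights.shape[0], dtype=np.int64)
    for k, bit_vector in enumerate(decompose_bits(x, n_bits)):
        one_bit_out = one_bit_adc(weights @ bit_vector, threshold)
        output |= one_bit_out << k  # assumed: k-th pass supplies bit k of the output
    return output

x = np.array([5, 3, 12, 0])
identity = np.eye(4)  # trivial weights so the round trip is easy to check
result = bit_serial_pass(x, identity, n_bits=4)
print(result.tolist())
```

With identity weights the constructed output reproduces the input, which checks that the decomposition and the reassembly are consistent with each other.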

In some embodiments, the memory unit may include a digital input vector memory configured to store a first digital input vector and including at least one SRAM, and a neural network weight memory configured to store a plurality of neural network weights and including at least one DRAM.

In some embodiments, the DAC unit may include a first DAC subunit configured to generate the plurality of modulator control signals, and a second DAC subunit configured to generate the plurality of weight control signals, wherein the first DAC subunit and the second DAC subunit are different.

In some embodiments, a laser unit may include a laser source configured to generate light and an optical power splitter configured to split the light generated by the laser source into a plurality of light outputs, wherein each of the plurality of light outputs has substantially the same power.

In some embodiments, the plurality of optical modulators may comprise one of an MZI modulator, a ring resonator modulator, or an electro-absorption modulator.

In some embodiments, the photo-detection unit may include a plurality of photo-detectors and a plurality of amplifiers configured to convert photocurrents generated by the photo-detectors into a plurality of output voltages.
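As a rough numerical picture of this conversion chain, the sketch below models each photodetector as producing a current proportional to the incident optical power and each amplifier as an ideal current-to-voltage stage. The responsivity and gain values, and the function name `photodetect`, are placeholders rather than values from the text.

```python
import numpy as np

def photodetect(optical_power_w, responsivity_a_per_w=1.0, gain_ohm=1e3):
    """Idealized detector plus amplifier: power -> photocurrent -> voltage."""
    photocurrent_a = responsivity_a_per_w * np.asarray(optical_power_w, dtype=float)
    return photocurrent_a * gain_ohm

# Optical powers in watts at three detectors (made-up values).
voltages = photodetect([1e-3, 0.5e-3, 0.0])
print(voltages)
```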

In some embodiments, the integrated circuit may be an application specific integrated circuit.

In some embodiments, the optical matrix multiplication unit may include an array of input waveguides for receiving an optical input vector, an optical interference unit in optical communication with the array of input waveguides for performing a linear transformation that converts the optical input vector into a second array of optical signals, and an array of output waveguides in optical communication with the optical interference unit for guiding the second array of optical signals, wherein at least one input waveguide in the array of input waveguides is in optical communication with each output waveguide in the array of output waveguides through the optical interference unit.

In some embodiments, the optical interference unit may include a plurality of interconnected Mach-Zehnder interferometers (MZIs), each Mach-Zehnder interferometer of the plurality of interconnected Mach-Zehnder interferometers including a first phase shifter configured to change a splitting ratio of the Mach-Zehnder interferometer, and a second phase shifter configured to shift a phase of one output of the Mach-Zehnder interferometer, wherein the first phase shifter and the second phase shifter are coupled to the plurality of weight control signals.
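The role of the two phase shifters can be illustrated with the textbook transfer matrix of a single lossless MZI. The sketch assumes ideal 50:50 couplers, an internal phase `theta` on one arm (setting the splitting ratio), and an external phase `phi` on one output; the symbol names and the matrix convention are assumptions, not taken from the text.

```python
import numpy as np

# Ideal, lossless 50:50 directional coupler.
BEAM_SPLITTER = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

def phase(shift):
    """Phase shifter acting on the upper arm only."""
    return np.diag([np.exp(1j * shift), 1.0])

def mzi(theta, phi):
    """Coupler, internal phase theta, coupler, then output phase phi."""
    return phase(phi) @ BEAM_SPLITTER @ phase(theta) @ BEAM_SPLITTER

theta, phi = 0.7, 1.3
U = mzi(theta, phi)

# Fraction of power entering the upper input that stays in the upper output.
bar_power = abs(U[0, 0]) ** 2
print(bar_power, np.sin(theta / 2) ** 2)
```

In this convention the internal phase alone determines the power split, sin²(θ/2), while the output phase changes only the relative phase, which is why two control signals per interferometer suffice to set one 2×2 building block.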

In another aspect, a system includes a storage unit configured to store a data set and a plurality of neural network weights, a driver unit configured to generate a plurality of modulator control signals and to generate a plurality of weight control signals, an optical processor including a laser unit configured to generate a plurality of optical outputs, a plurality of optical modulators coupled to the laser unit and the driver unit, the plurality of optical modulators configured to generate an optical input vector by modulating the plurality of optical outputs generated by the laser unit based on the plurality of modulator control signals, an optical matrix multiplication unit coupled to the plurality of optical modulators and the driver unit, the optical matrix multiplication unit configured to convert the optical input vector into an optical output vector based on the plurality of weight control signals, and a photo detection unit coupled to the optical matrix multiplication unit and configured to generate a plurality of output voltages corresponding to the optical output vector, a comparator unit coupled to the photo detection unit and configured to convert the plurality of output voltages into a plurality of digital 1-bit optical outputs, and a controller including an integrated circuit configured to perform operations including receiving, from a computer, an artificial neural network computation request including an input data set and a first plurality of neural network weights, wherein the input data set includes a first digital input vector having a resolution of N bits, storing the input data set and the first plurality of neural network weights in the storage unit, decomposing the first digital input vector into N 1-bit input vectors, each of the N 1-bit input vectors corresponding to one of the N bits of the first digital input vector, generating a sequence of N 1-bit modulator control signals corresponding to the N 1-bit input vectors by the driver unit, deriving a sequence of N digital 1-bit optical outputs corresponding to the sequence of N 1-bit modulator control signals from the comparator unit, constructing an N-bit digital output vector from the sequence of N digital 1-bit optical outputs, performing a nonlinear transformation on the constructed N-bit digital output vector to generate a transformed N-bit digital output vector, and storing the transformed N-bit digital output vector in the storage unit.

In another aspect, a method for performing artificial neural network calculations in a system having an optical matrix multiplication unit configured to convert an optical input vector into an optical output vector based on a plurality of weight control signals includes receiving an artificial neural network calculation request including an input data set and a first plurality of neural network weights from a computer, wherein the input data set includes a first digital input vector, storing the input data set and the first plurality of neural network weights in a storage unit, generating, by a digital-to-analog conversion (DAC) unit, a first plurality of modulator control signals based on the first digital input vector and a first plurality of weight control signals based on the first plurality of neural network weights, deriving, from an analog-to-digital conversion (ADC) unit, a first plurality of digital optical outputs corresponding to the optical output vector of the optical matrix multiplication unit, the first plurality of digital optical outputs forming a first digital output vector, performing a nonlinear transformation on the first digital output vector by a controller to generate a first transformed digital output vector, storing the first transformed digital output vector in the storage unit, and outputting, by the controller, an artificial neural network output generated based on the first transformed digital output vector.

In another aspect, a method includes providing input information in an electronic format, converting at least a portion of the electronic input information into an optical input vector, optically converting the optical input vector into an optical output vector based on optical matrix multiplication, converting the optical output vector into the electronic format, and electronically applying a nonlinear transformation to the electronically converted optical output vector to provide output information in the electronic format.
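The sequence of conversions in this method can be sketched end to end. Everything here is an idealized stand-in: the optical matrix multiplication is modeled as an ordinary matrix product, the DAC and ADC are ideal 8-bit converters, and the offset-and-clamp `nonlinearity` is one arbitrary choice. All function and parameter names are hypothetical.

```python
import numpy as np

def dac(codes, bits=8):
    """Electronic-to-optical step: map integer codes to amplitudes in [0, 1]."""
    return np.asarray(codes, dtype=float) / (2**bits - 1)

def optical_matmul(weights, optical_input):
    """Stand-in for the optical matrix multiplication."""
    return weights @ optical_input

def adc(values, bits=8, full_scale=1.0):
    """Optical-to-electronic step: clip and quantize to integer codes."""
    levels = 2**bits - 1
    return np.rint(np.clip(values, 0.0, full_scale) / full_scale * levels).astype(int)

def nonlinearity(codes, offset=64):
    """Hypothetical electronic nonlinearity: subtract an offset and clamp at zero."""
    return np.maximum(codes - offset, 0)

def forward(input_codes, weights, bits=8):
    optical_input = dac(input_codes, bits)
    optical_output = optical_matmul(weights, optical_input)
    digital_output = adc(optical_output, bits)
    return nonlinearity(digital_output)

weights = np.array([[0.25, 0.25, 0.0],
                    [0.0, 0.0, 1.0]])
result = forward([255, 0, 128], weights)
print(result.tolist())
```

Only the linear step is assigned to optics in this picture; the nonlinear step stays in the electronic domain, matching the split the method describes.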

Embodiments of the method may include one or more of the following features. For example, the method may further include repeating the electronic-to-optical conversion, the optical transformation, the optical-to-electronic conversion, and the electronically applied nonlinear transformation for new electronic input information corresponding to the output information provided in the electronic format.

In some embodiments, the light matrix multiplication for the initial light transformation and the light matrix multiplication for the repeated light transformation may be the same and may correspond to the same layer of the artificial neural network.

In some embodiments, the light matrix multiplication for the initial light transformation and the light matrix multiplication for the repeated light transformation may be different and may correspond to different layers of the artificial neural network.

In some embodiments, the method may further include repeating the electro-optic conversion, the optical transformation, the photoelectric conversion, and the electrically applied nonlinear transformation for different portions of the electronically input information, wherein the optical matrix multiplication for the initial optical transformation and the optical matrix multiplication for the repeated optical transformation are the same and correspond to the first layer of the artificial neural network.

In some embodiments, the method may further include providing the intermediate information in an electronic format based on electronic output information for a plurality of portions of the electronic input information generated by a first layer of the artificial neural network, and repeating the electro-optic conversion, the optical conversion, the photoelectric conversion, and the electrically applied nonlinear conversion for each different portion of the electronic intermediate information, wherein an optical matrix multiplication for the initial optical conversion and an optical matrix multiplication for the repeated optical conversion associated with the different portion of the electronic intermediate information are the same and correspond to a second layer of the artificial neural network.
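The schedule described in the last two paragraphs, in which every portion of the input passes through the first layer before the second layer's matrix is applied to the intermediate results, can be sketched as below. The ReLU nonlinearity, the layer sizes, and the random data are assumptions made for illustration.

```python
import numpy as np

def layer(weights, x):
    """One pass: matrix multiplication followed by an assumed ReLU nonlinearity."""
    return np.maximum(weights @ x, 0.0)

rng = np.random.default_rng(1)
w1 = rng.normal(size=(3, 4))  # first-layer weights (hypothetical sizes)
w2 = rng.normal(size=(2, 3))  # second-layer weights

# Five portions of the electronic input information.
portions = [rng.normal(size=4) for _ in range(5)]

# First layer: the same matrix is reused for every portion.
intermediate = [layer(w1, p) for p in portions]

# Second layer: the matrix is changed once, then reused for every
# portion of the intermediate information.
outputs = [layer(w2, h) for h in intermediate]
```

Processing all portions before changing the matrix means the weights are reconfigured once per layer rather than once per portion.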

In another aspect, a system includes an optical processor including a passive diffractive optical element, wherein the passive diffractive optical element is configured to transform an optical input vector or matrix into an optical output vector or matrix that represents a result of matrix processing applied to the optical input vector or matrix and a predetermined vector defined by an arrangement of the diffractive optical element.

Embodiments of the system may include one or more of the following features. For example, the matrix processing may comprise matrix multiplication between the light input vector or matrix and a predetermined vector defined by the arrangement of diffractive optical elements.

In some embodiments, an optical processor may include an optical matrix processing unit including an input waveguide array for receiving an optical input vector, an optical interference unit including a passive diffractive optical component, wherein the optical interference unit is in optical communication with the input waveguide array and configured to perform a linear transformation that converts the optical input vector into a second array of optical signals, and an output waveguide array in optical communication with the optical interference unit for guiding the second array of optical signals, wherein at least one input waveguide of the input waveguide array is in optical communication with each of the output waveguides of the output waveguide array through the optical interference unit.

In some embodiments, the optical interference unit may include a substrate having at least one of a hole or a stripe, the hole having a size in a range of 100 nm to 10 μm, and the stripe having a width in a range of 100 nm to 10 μm.

In some embodiments, the optical interference unit may include a substrate having passive diffractive optical components arranged in a two-dimensional configuration, and the substrate includes at least one of a planar substrate or a curved substrate.

In some embodiments, the substrate may comprise a planar substrate that is parallel to the direction of light propagation from the input waveguide array to the output waveguide array.

In some embodiments, an optical processor may include an optical matrix processing unit including an input waveguide matrix for receiving an optical input matrix, an optical interference unit including a passive diffractive optical component, wherein the optical interference unit is in optical communication with the input waveguide matrix and configured to perform a linear transformation of the optical input matrix into a second optical signal matrix, and an output waveguide matrix in optical communication with the optical interference unit for guiding the second optical signal matrix, wherein at least one input waveguide of the input waveguide matrix is in optical communication with each of the output waveguides of the output waveguide matrix through the optical interference unit.

In some embodiments, the optical interference unit may include a substrate having at least one of a hole or a stripe, the hole having a size in a range of 100nm to 10 μm, and the stripe having a width in a range of 100nm to 10 μm.

In some embodiments, the optical interference unit may include a substrate having passive diffractive optical components arranged in a three-dimensional configuration.

In some embodiments, the substrate may have a shape of at least one of a cube, a column, a prism, or an irregular volume.

In some embodiments, the light processor may include a light interference unit including a hologram having a passive diffractive optical element, the light processor configured to receive modulated light representing a light input matrix and to continuously convert the light as it passes through the hologram until the light emerges from the hologram as a light output matrix.

In some embodiments, the optical interference unit may include a substrate having a passive diffractive optical element, and the substrate includes at least one of silicon, silicon oxide, silicon nitride, quartz, lithium niobate, a phase change material, or a polymer.

In some embodiments, the optical interference unit may include a substrate having a passive diffractive optical element, and the substrate includes at least one of a glass substrate or an acrylic substrate.

In some embodiments, the passive diffractive optical component may be formed in part from dopants.

In some embodiments, the matrix processing may represent the processing of input data by the neural network, the input data being represented by an optical input vector.

In some embodiments, an optical processor may include a laser unit configured to generate a plurality of light outputs, a plurality of optical modulators coupled to the laser unit and configured to generate an optical input vector by modulating the plurality of light outputs generated by the laser unit based on a plurality of modulator control signals, an optical matrix processing unit coupled to the plurality of optical modulators, the optical matrix processing unit including a passive diffractive optical component configured to convert the optical input vector into an optical output vector based on a plurality of weights defined by the passive diffractive optical component, and a photodetecting unit coupled to the optical matrix processing unit and configured to generate a plurality of output electrical signals corresponding to the optical output vector.

In some embodiments, the passive diffractive optical element may be arranged in a three-dimensional configuration, the plurality of light modulators comprising a two-dimensional array of light modulators, and the photodetecting unit comprising a two-dimensional array of photodetectors.

In some embodiments, an optical matrix processing unit may include a housing module to support and protect an input waveguide array, an optical interference unit, and an output waveguide array, and the optical processor may include a receiving module configured to receive the optical matrix processing unit, the receiving module including a first interface to enable the optical matrix processing unit to receive optical input vectors from a plurality of optical modulators, and a second interface to enable the optical matrix processing unit to transmit the optical output vectors to a photodetecting unit.

In some embodiments, the plurality of output electrical signals may include at least one of a plurality of voltage signals or a plurality of current signals.

In some embodiments, a system may include a storage unit, a digital-to-analog conversion (DAC) unit configured to generate a plurality of modulator control signals, an analog-to-digital conversion (ADC) unit coupled to the photo detection unit and configured to convert a plurality of output electrical signals into a plurality of digital outputs, and a controller including an integrated circuit configured to receive an artificial neural network calculation request from a computer including an input data set, wherein the input data set includes a first digital input vector, store the input data set in the storage unit, and generate, by the DAC unit, the first plurality of modulator control signals based on the first digital input vector.

In another aspect, a method includes 3D printing an optical matrix processing unit including a passive diffractive optical element, wherein the passive diffractive optical element is configured to transform an optical input vector or matrix into an optical output vector or matrix representing a result of matrix processing applied to the optical input vector or matrix and a predetermined vector defined by an arrangement of the diffractive optical element.

In another aspect, a method includes generating a hologram including a passive diffractive optical element using one or more laser beams, wherein the passive diffractive optical element is configured to transform a light input vector or matrix into a light output vector or matrix that represents a result of a matrix process applied to the light input vector or matrix and a predetermined vector defined by an arrangement of the diffractive optical element.

In another aspect, a system includes an optical processor including passive diffractive optical components arranged in a one-dimensional manner, wherein the passive diffractive optical components are configured to convert an optical input into an optical output that represents a result of a matrix process applied to the optical input and a predetermined vector defined by the arrangement of the diffractive optical components.

Embodiments of the system may include one or more of the following features. For example, the matrix processing may comprise matrix multiplication between the light input and a predetermined vector defined by the arrangement of the diffractive optical element.

In some embodiments, an optical processor may include an optical matrix processing unit including an input waveguide for receiving an optical input, an optical interference unit including a passive diffractive optical component, wherein the optical interference unit is in optical communication with the input waveguide and configured to perform a linear transformation of the optical input, and an output waveguide in optical communication with the optical interference unit for guiding an optical output.

In some embodiments, the optical interference unit may include a substrate having at least one of holes or gratings, and the holes or grating elements may have a size in a range of 100 nm to 10 μm.

In another aspect, a system includes a storage unit, a digital-to-analog conversion (DAC) unit configured to generate a plurality of modulator control signals, and an optical processor including a laser unit configured to generate a plurality of light outputs, a plurality of light modulators coupled to the laser unit and the DAC unit, the plurality of light modulators configured to generate light input vectors by modulating the plurality of light outputs generated by the laser unit based on the plurality of modulator control signals, an optical matrix processing unit coupled to the plurality of light modulators, the optical matrix processing unit including a passive diffractive optical component configured to convert the light input vectors into light output vectors based on a plurality of weights defined by the passive diffractive optical component, and a photo detection unit coupled to the optical matrix processing unit and configured to generate a plurality of output electrical signals corresponding to the light output vectors. The system further includes an analog-to-digital conversion (ADC) unit coupled to the photo-detection unit and configured to convert the plurality of output electrical signals into a plurality of digital light outputs, and a controller including an integrated circuit configured to receive an artificial neural network calculation request from a computer including an input data set, wherein the input data set includes a first digital input vector, store the input data set in a storage unit, and generate, by the DAC unit, a first plurality of modulator control signals based on the first digital input vector.

Embodiments of the system may include one or more of the following features. For example, the matrix processing unit may comprise a passive diffractive optical element configured to convert a light input vector into a light output vector, the light output vector representing the product of a matrix multiplication between the light input vector and a predetermined vector defined by the passive diffractive optical element.

In some embodiments, the operations further comprise obtaining a first plurality of digital light outputs from the ADC unit corresponding to the light output vectors of the light matrix processing unit, the first plurality of digital light outputs forming a first digital output vector, performing a nonlinear transformation on the first digital output vector to produce a first transformed digital output vector, and storing the first transformed digital output vector in the storage unit.

In some embodiments, the system may have a first cycle period defined as the time elapsed between the step of storing the input data set in the storage unit and the step of storing the first transformed digital output vector in the storage unit, and the first cycle period may be less than or equal to 1 ns.

In some embodiments, the input data set may further include a second digital input vector, and wherein the operations may further include generating, by the DAC unit, a second plurality of modulator control signals based on the second digital input vector, deriving a second plurality of digital light outputs from the ADC unit corresponding to the light output vectors of the light matrix processing unit, the second plurality of digital light outputs forming a second digital output vector, performing a nonlinear transformation on the second digital output vector to generate a second transformed digital output vector, storing the second transformed digital output vector in the storage unit, and outputting an artificial neural network output generated based on the first transformed digital output vector and the second transformed digital output vector, wherein the light output vector of the light matrix processing unit is generated by the second light input vector generated based on the second plurality of modulator control signals, the second light input vector being transformed by the light matrix processing unit based on the plurality of weights defined by the passive diffractive optical component.

In some embodiments, the system may further include an analog nonlinear unit disposed between the photodetecting unit and the ADC unit, the analog nonlinear unit configured to receive the plurality of output electrical signals from the photodetecting unit, apply a nonlinear transfer function, and output the plurality of converted output electrical signals to the ADC unit, wherein the operations may further include obtaining a first plurality of converted digital output electrical signals from the ADC unit corresponding to the plurality of converted output electrical signals, the first plurality of converted digital output electrical signals forming a first converted digital output vector, and storing the first converted digital output vector in the memory unit.

In some embodiments, the system may further include an analog storage unit disposed between the DAC unit and the plurality of optical modulators, the analog storage unit configured to store an analog voltage and output the stored analog voltage, and an analog nonlinear unit disposed between the photo detection unit and the ADC unit, the analog nonlinear unit configured to receive the plurality of output electrical signals from the photo detection unit, apply a nonlinear transfer function, and output the plurality of converted output electrical signals.

In some embodiments, the analog storage unit may include a plurality of capacitors.

In some embodiments, the analog storage unit may be configured to receive and store a plurality of converted output electrical signals of the analog nonlinear unit and output the stored plurality of converted output electrical signals to the plurality of optical modulators, and wherein the operations may further include storing the plurality of converted output electrical signals of the analog nonlinear unit in the analog storage unit based on generating the first plurality of modulator control signals, outputting the stored converted output electrical signals through the analog storage unit, deriving a second plurality of converted digital output electrical signals from the ADC unit, the second plurality of converted digital output electrical signals forming a second converted digital output vector, and storing the second converted digital output vector in the storage unit.

In some embodiments, the input data set of the artificial neural network computation request may include a plurality of digital input vectors, wherein the laser unit may be configured to generate a plurality of wavelengths, and wherein the plurality of optical modulators may include a plurality of optical modulator groups configured to generate a plurality of optical input vectors, each optical modulator group corresponding to one of the plurality of wavelengths and generating a respective optical input vector having a respective wavelength, and an optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector including the plurality of wavelengths. The photodetecting unit may be further configured to demultiplex a plurality of wavelengths and produce a plurality of demultiplexed output electrical signals, and the operations may include obtaining a plurality of digital demultiplexed optical outputs from the ADC unit, the plurality of digital demultiplexed optical outputs forming a plurality of first digital output vectors, wherein each of the plurality of first digital output vectors corresponds to one of the plurality of wavelengths, performing a nonlinear transformation on each of the plurality of first digital output vectors to produce a plurality of transformed first digital output vectors, and storing the plurality of transformed first digital output vectors in the storage unit, wherein each of the plurality of digital input vectors corresponds to one of the plurality of optical input vectors.

In some embodiments, the artificial neural network computation request may include a plurality of digital input vectors, wherein the laser unit may be configured to generate a plurality of wavelengths, and wherein the plurality of optical modulators may include optical modulator groups configured to generate a plurality of optical input vectors, each optical modulator group corresponding to one of the plurality of wavelengths and generating a respective optical input vector having a respective wavelength, and an optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector including the plurality of wavelengths. The operations may include obtaining, from the ADC unit, a first plurality of digital optical outputs corresponding to an optical output vector that includes the plurality of wavelengths, the first plurality of digital optical outputs forming a first digital output vector, performing a nonlinear transformation on the first digital output vector to produce a first transformed digital output vector, and storing the first transformed digital output vector in the storage unit.

In some embodiments, the DAC unit may comprise a 1-bit DAC unit configured to generate a plurality of 1-bit modulator control signals, wherein the resolution of the ADC unit may be 1 bit, and wherein the resolution of the first digital input vector may be N bits. The operations may include decomposing the first digital input vector into N 1-bit input vectors, each of the N 1-bit input vectors corresponding to one of the N bits of the first digital input vector, generating, by the 1-bit DAC unit, a sequence of N 1-bit modulator control signals corresponding to the N 1-bit input vectors, obtaining, from the ADC unit, a sequence of N digital 1-bit optical outputs corresponding to the sequence of N 1-bit modulator control signals, constructing an N-bit digital output vector from the sequence of N digital 1-bit optical outputs, performing a nonlinear transformation on the constructed N-bit digital output vector to produce a transformed N-bit digital output vector, and storing the transformed N-bit digital output vector in the storage unit.
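
By way of illustration only, the bit-serial operations described above may be modeled as in the following Python sketch. The function names, the least-significant-bit-first ordering, the weight matrix, and the comparison threshold are assumptions introduced for the example and are not part of the disclosure. Each 1-bit input vector is passed through a linear transformation standing in for the optical matrix processing unit, the result is read out at 1-bit resolution, and the N 1-bit outputs are assembled into an N-bit output vector.

```python
import numpy as np

def to_bit_planes(x, n_bits):
    """Split a vector of unsigned n_bits integers into n_bits 1-bit vectors (LSB first)."""
    x = np.asarray(x, dtype=np.int64)
    return [(x >> k) & 1 for k in range(n_bits)]

def from_bit_planes(planes):
    """Assemble integers from a sequence of 1-bit vectors ordered LSB first."""
    return sum(np.asarray(p, dtype=np.int64) << k for k, p in enumerate(planes))

def one_bit_pass(weights, plane, threshold):
    """Model one pass: linear transformation followed by a 1-bit readout.

    The threshold comparison is a hypothetical stand-in for a 1-bit ADC.
    """
    return (weights @ plane > threshold).astype(np.int64)

def bit_serial_process(weights, x, n_bits, threshold=0.75):
    """Process an N-bit input vector as a sequence of N 1-bit passes."""
    planes = to_bit_planes(x, n_bits)
    outputs = [one_bit_pass(weights, p, threshold) for p in planes]
    return from_bit_planes(outputs)
```

In this sketch the nonlinear transformation and the storing step would be applied to the value returned by `bit_serial_process`; they are omitted to keep the example short.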

In some embodiments, the memory unit may include a digital input vector memory configured to store a first digital input vector and including at least one SRAM.

In some embodiments, the plurality of optical modulators may include at least one of MZI modulators, ring resonator modulators, or electro-absorption modulators.

In some embodiments, the photodetection unit may include a plurality of photodetectors and a plurality of amplifiers configured to convert photocurrent generated by the photodetectors into a plurality of output electrical signals.

In some embodiments, the integrated circuit may comprise an application specific integrated circuit.

In some embodiments, the optical matrix processing unit may include an input waveguide array for receiving the optical input vector, an optical interference unit in optical communication with the input waveguide array for performing a linear transformation that converts the optical input vector into a second array of optical signals, wherein the optical interference unit includes a passive diffractive optical component, and an output waveguide array in optical communication with the optical interference unit for guiding the second array of optical signals, wherein at least one input waveguide in the input waveguide array is in optical communication with each output waveguide in the output waveguide array through the optical interference unit.

In another aspect, a system includes a storage unit, a driver unit configured to generate a plurality of modulator control signals, an optical processor including a laser unit configured to generate a plurality of optical outputs, a plurality of optical modulators coupled to the laser unit and the driver unit, the plurality of optical modulators configured to generate optical input vectors by modulating the plurality of optical outputs generated by the laser unit based on the plurality of modulator control signals, an optical matrix processing unit coupled to the plurality of optical modulators and the driver unit, the optical matrix processing unit including a passive diffractive optical component configured to convert the optical input vectors into optical output vectors based on a plurality of weight control signals defined by the passive diffractive optical component, and a photodetection unit coupled to the optical matrix processing unit and configured to generate a plurality of output electrical signals corresponding to the optical output vectors. 
The system further includes a comparator unit coupled to the photodetection unit and configured to convert the plurality of output electrical signals into a plurality of digital 1-bit optical outputs, and a controller including an integrated circuit configured to receive an artificial neural network computation request including an input data set from a computer, wherein the input data set includes a first digital input vector having an N-bit resolution, store the input data set in the storage unit, decompose the first digital input vector into N 1-bit input vectors, each of the N 1-bit input vectors corresponding to one of the N bits of the first digital input vector, generate, by the driver unit, a sequence of N 1-bit modulator control signals corresponding to the N 1-bit input vectors, obtain, from the comparator unit, a sequence of N digital 1-bit optical outputs corresponding to the sequence of N 1-bit modulator control signals, construct an N-bit digital output vector from the sequence of N digital 1-bit optical outputs, perform a nonlinear transformation on the constructed N-bit digital output vector to produce a transformed N-bit digital output vector, and store the transformed N-bit digital output vector in the storage unit.

Embodiments of the system may include one or more of the following features. For example, the optical matrix processing unit may comprise an optical matrix multiplication unit configured to convert an optical input vector into an optical output vector, the optical output vector representing a product of a matrix multiplication between an input vector represented by the optical input vector and a predetermined vector defined by the passive diffractive optical element.

In another aspect, a method for performing artificial neural network computation in a system having an optical matrix processing unit includes receiving an artificial neural network computation request including an input data set from a computer, the input data set including a first digital input vector, storing the input data set in a storage unit, generating, by a digital-to-analog conversion (DAC) unit, a first plurality of modulator control signals based on the first digital input vector, converting, by the optical matrix processing unit using an arrangement including a passive diffractive optical element, an optical input vector into an optical output vector, the optical output vector representing a result of matrix processing applied to the optical input vector and a predetermined vector defined by the arrangement of the diffractive optical element, obtaining, from an analog-to-digital conversion (ADC) unit, a first plurality of digital optical outputs corresponding to the optical output vector of the optical matrix processing unit, the first plurality of digital optical outputs forming a first digital output vector, performing, by a controller, a nonlinear transformation on the first digital output vector to generate a first transformed digital output vector, storing, in the storage unit, the first transformed digital output vector, and outputting, by the controller, an artificial neural network output generated based on the first transformed digital output vector.

Embodiments of the method may include one or more of the following features. For example, converting the optical input vector into the optical output vector may include converting the optical input vector into an optical output vector representing a product of a matrix multiplication between the digital input vector and a predetermined vector defined by the arrangement of diffractive optical elements.

In another aspect, a method includes providing input information in an electronic format, converting at least a portion of the electronic input information into an optical input vector, optically converting, by an optical processor including a passive diffractive optical element, the optical input vector into an optical output vector based on optical matrix processing, converting the optical output vector into the electronic format, and electronically applying a nonlinear transformation to the electronically converted optical output vector to provide output information in the electronic format.

Embodiments of the method may include one or more of the following features. For example, optically converting the optical input vector into the optical output vector may include optically converting the optical input vector into the optical output vector based on an optical matrix multiplication between a digital input vector represented by the optical input vector and a predetermined vector defined by the passive diffractive optical element.

In some embodiments, the method may further include repeating the electro-optic conversion, the optical transformation, the photoelectric conversion, and the electronically applied nonlinear transformation for new electronic input information corresponding to the output information provided in the electronic format.

In some embodiments, the optical matrix processing for the initial optical transformation and the optical matrix processing for the repeated optical transformation may be the same and may correspond to the same layer of the artificial neural network.

In some embodiments, the method may further include repeating the electro-optic conversion, the optical transformation, the photoelectric conversion, and the electrically applied nonlinear transformation for different portions of the electronically input information, wherein the optical matrix process for the initial optical transformation and the optical matrix process for the repeated optical transformation may be the same and correspond to one layer of the artificial neural network.
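
By way of illustration only, the repeated sequence of electro-optic conversion, optical transformation, photoelectric conversion, and electronically applied nonlinear transformation may be sketched in Python as follows. All names are hypothetical. The uniform quantizer standing in for the DAC and ADC, the use of `tanh` as the nonlinear transformation, and the assumption that inputs and matrix elements lie in the range 0 to 1 are choices made for the example, not features stated in the disclosure.

```python
import numpy as np

def quantize(values, n_bits, full_scale):
    """Clip to [0, full_scale] and round to one of 2**n_bits uniform levels."""
    levels = 2 ** n_bits - 1
    clipped = np.clip(values, 0.0, full_scale)
    return np.round(clipped / full_scale * levels) / levels * full_scale

def optical_pass(weights, x, n_bits):
    """One pass: electronic input -> optical -> matrix processing -> electronic -> nonlinearity."""
    optical_in = quantize(x, n_bits, 1.0)              # electro-optic conversion (DAC model)
    optical_out = weights @ optical_in                 # optical matrix processing
    full_scale = float(weights.shape[1])               # largest output if all entries are <= 1
    electronic = quantize(optical_out, n_bits, full_scale)  # photoelectric conversion (ADC model)
    return np.tanh(electronic)                         # assumed nonlinear transformation

def run_layers(weights, x, passes, n_bits=8):
    """Feed the output of each pass back in as the input of the next pass."""
    out = np.asarray(x, dtype=float)
    for _ in range(passes):
        out = optical_pass(weights, out, n_bits)
    return out
```

Because the same `weights` array is reused on every pass, the sketch corresponds to the case in which the repeated optical matrix processing is the same as the initial one; a square matrix is needed so that each output can serve as the next input.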

In another aspect, a system includes an optical matrix processing unit configured to process an input vector of length N, wherein the optical matrix processing unit includes N+2 layers of directional couplers and N layers of phase shifters, and N is a positive integer.

Embodiments of the system may include one or more of the following features. For example, the optical matrix processing unit may include no more than N+2 layers of directional couplers.

In some embodiments, the optical matrix processing unit may include an optical matrix multiplication unit.

In some embodiments, the optical matrix processing unit may include a substrate and interconnected interferometers disposed on the substrate, wherein each interferometer includes an optical waveguide disposed on the substrate, and the directional couplers and the phase shifters are part of the interconnected interferometers.

In some embodiments, the optical matrix processing unit may include a layer of attenuators that follows the last layer of directional couplers.

In some embodiments, the layer of attenuators may include N attenuators.

In some embodiments, the system may include one or more homodyne detectors for detecting outputs from the attenuators.

In some embodiments, N=3, and the optical matrix processing unit may include an input terminal configured to receive the input vector, a first layer of directional couplers coupled to the input terminal, a first layer of phase shifters coupled to the first layer of directional couplers, a second layer of directional couplers coupled to the first layer of phase shifters, a second layer of phase shifters coupled to the second layer of directional couplers, a third layer of directional couplers coupled to the second layer of phase shifters, a third layer of phase shifters coupled to the third layer of directional couplers, a fourth layer of directional couplers coupled to the third layer of phase shifters, and a fifth layer of directional couplers coupled to the fourth layer of directional couplers.

In some embodiments, N=4, and the optical matrix processing unit may include an input terminal configured to receive the input vector, first, second, third, and fourth layers of directional couplers, each layer of directional couplers followed by a layer of phase shifters, wherein the first layer of directional couplers is coupled to the input terminal, a penultimate layer of directional couplers coupled to the fourth layer of phase shifters, and a final layer of directional couplers coupled to the penultimate layer of directional couplers.

In some embodiments, N=8, and the optical matrix processing unit may include an input terminal configured to receive the input vector, eight layers of directional couplers, each layer of directional couplers followed by a layer of phase shifters, wherein the first layer of directional couplers is coupled to the input terminal, a penultimate layer of directional couplers coupled to the eighth layer of phase shifters, and a final layer of directional couplers coupled to the penultimate layer of directional couplers.

In some embodiments, the optical matrix multiplication unit may include an input terminal configured to receive the input vector, N layers of directional couplers, each layer of directional couplers followed by a layer of phase shifters, wherein the first layer of directional couplers is coupled to the input terminal, a penultimate layer of directional couplers coupled to the N-th layer of phase shifters, and a final layer of directional couplers coupled to the penultimate layer of directional couplers.

In some embodiments, N is an even number.

In some embodiments, each i-th layer of directional couplers includes N/2 directional couplers, where i is an odd number, and each j-th layer of directional couplers includes N/2-1 directional couplers, where j is an even number.

In some embodiments, for each i-th layer of directional couplers where i is odd, the k-th directional coupler may be coupled to the (2k-1)-th and 2k-th outputs of the previous layer, k being an integer from 1 to N/2.

In some embodiments, for each j-th layer of directional couplers where j is even, the m-th directional coupler may be coupled to the 2m-th and (2m+1)-th outputs of the previous layer, m being an integer from 1 to N/2-1.

In some embodiments, each i-th layer of phase shifters may include N phase shifters, where i is an odd number, and each j-th layer of phase shifters may include N-2 phase shifters, where j is an even number.

In some embodiments, N may be an odd number.

In some embodiments, each layer of directional couplers may include (N-1)/2 directional couplers.

In some embodiments, each layer of phase shifters may include N-1 phase shifters.
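
By way of illustration only, the layer layout described above may be tabulated with the following Python sketch. The function names are hypothetical, ports are numbered from 1 as in the text, and only the first N layers of directional couplers (each followed by a layer of phase shifters) are covered; the two final layers of directional couplers are not modeled because their port assignments are not stated above.

```python
def coupler_pairs(n, layer):
    """Port pairs (1-indexed) joined by each directional coupler in a layer, for even N."""
    if n < 2 or n % 2:
        raise ValueError("this helper covers even N only")
    if not 1 <= layer <= n:
        raise ValueError("layer must be between 1 and N")
    if layer % 2 == 1:
        # Odd layer: k-th coupler joins outputs (2k-1) and 2k, k = 1 .. N/2.
        return [(2 * k - 1, 2 * k) for k in range(1, n // 2 + 1)]
    # Even layer: m-th coupler joins outputs 2m and (2m+1), m = 1 .. N/2-1.
    return [(2 * m, 2 * m + 1) for m in range(1, n // 2)]

def layer_counts(n):
    """(directional couplers, phase shifters) for each of the first N layers."""
    if n % 2 == 0:
        return [(n // 2, n) if i % 2 else (n // 2 - 1, n - 2)
                for i in range(1, n + 1)]
    # Odd N: every layer has (N-1)/2 couplers and N-1 phase shifters.
    return [((n - 1) // 2, n - 1)] * n
```

For example, with N=4 the first layer joins ports (1, 2) and (3, 4), and the second layer joins ports (2, 3), leaving the outermost ports untouched in even layers.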

In another aspect, a system includes a generator configured to generate a first data set, wherein the generator includes an optical matrix processing unit, and a discriminator configured to receive a second data set including data from the first data set and data from a third data set, the data in the first data set having characteristics similar to those of the data in the third data set, and to classify the data in the second data set as either data from the first data set or data from the third data set.

Embodiments of the system may include one or more of the following features. For example, the optical matrix processing unit may include at least one of (i) the optical matrix multiplication unit described above, (ii) the passive diffractive optical element described above, or (iii) the optical matrix processing unit described above.

In some embodiments, the third data set may include real data, the generator is configured to generate synthetic data similar to the real data, and the discriminator is configured to classify the data as real data or synthetic data.

In some embodiments, the generator may be configured to generate a data set for training at least one of an autonomous vehicle, a medical diagnostic system, a fraud detection system, a weather forecast system, a financial prediction system, a facial recognition system, a speech recognition system, or a product defect detection system.

In some embodiments, the generator may be configured to generate an image that is similar to an image of at least one of a real object or a real scene, and the discriminator is configured to classify a received image as either (i) an image of the real object or the real scene, or (ii) a synthetic image generated by the generator.

In some embodiments, the real object may comprise at least one of a person, an animal, a cell, a tissue, or a product, and the real scene comprises a scene encountered by a vehicle.

In some embodiments, the discriminator may be configured to classify the received image as being (i) an image of a real person, a real animal, a real cell, a real tissue, a real product, or a real scene encountered by the vehicle, or (ii) a synthetic image produced by the generator.

In some embodiments, the vehicle may include at least one of a motorcycle, an automobile, a truck, a train, a helicopter, an airplane, a submarine, a ship, or an unmanned aerial vehicle.

In some embodiments, the generator may be configured to generate an image of tissue or cells associated with at least one of a human disease, an animal disease, or a plant disease.

In some embodiments, the generator may be configured to generate an image of tissue or cells associated with a human disease, and the disease includes at least one of cancer, Parkinson's disease, sickle cell anemia, heart disease, cardiovascular disease, diabetes, chest disease, or skin disease.

In some embodiments, the generator may be configured to generate an image of tissue or cells associated with the cancer, and the cancer may include at least one of skin cancer, breast cancer, lung cancer, liver cancer, prostate cancer, or brain cancer.

In some embodiments, the system may further include a random noise generator configured to generate random noise input to the generator, and the generator is configured to generate the first data set based on the random noise.
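
By way of illustration only, the data flow between the random noise generator, the generator, and the discriminator input may be sketched in Python as follows. The function names, the `tanh` nonlinearity, and the labeling convention (1 for real data, 0 for synthetic data) are assumptions made for the example; the matrix product merely stands in for the optical matrix processing unit, and no training procedure is shown.

```python
import numpy as np

def generator(weights, noise):
    """Map random noise to synthetic samples (the first data set).

    The matrix product is a stand-in for the optical matrix processing unit.
    """
    return np.tanh(noise @ weights)

def build_discriminator_input(real, synthetic, rng):
    """Mix real data (third data set) and synthetic data into the second data set.

    Returns the shuffled samples together with labels the discriminator
    would be trained to predict: 1 for real, 0 for synthetic.
    """
    data = np.concatenate([real, synthetic])
    labels = np.concatenate([np.ones(len(real), dtype=int),
                             np.zeros(len(synthetic), dtype=int)])
    order = rng.permutation(len(data))
    return data[order], labels[order]
```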

In another aspect, a system includes a random noise generator configured to generate random noise and a generator configured to generate data based on the random noise, wherein the generator includes an optical matrix processing unit.

Embodiments of the system may include one or more of the following features. For example, the optical matrix processing unit may include at least one of (i) the optical matrix multiplication unit described above, (ii) the passive diffractive optical element described above, or (iii) the optical matrix processing unit described above.

In another aspect, a system includes an optical circuit configured to perform a logic function on two input signals, the optical circuit including a first directional coupler having two inputs configured to receive the two input signals and two outputs, a first pair of phase shifters configured to modify phases of signals at the two outputs of the first directional coupler, a second directional coupler having two inputs and two outputs, the two inputs configured to receive signals from the first pair of phase shifters, and a second pair of phase shifters configured to modify phases of signals at the two outputs of the second directional coupler.

Embodiments of the system may include one or more of the following features. For example, the phase shifters may be configured to cause the optical circuit to perform a rotation:

In some embodiments, when input signals x1 and x2 are provided to both inputs of the first directional coupler, the phase shifter may be configured to cause the optical circuit to perform the operations of:

In some embodiments, the optical circuit may include a first photodetector configured to generate an absolute value of a signal from the second pair of phase shifters to cause the optical circuit to perform operations:

In some embodiments, the optical circuit may include a comparator configured to compare an output signal of the first photodetector with a threshold value to generate a binary value, causing the optical circuit to generate an output:

In some embodiments, the optical circuit may include a feedback mechanism configured to cause an output signal of the photodetector to be fed back to an input of the first directional coupler, to pass through the first directional coupler, the first pair of phase shifters, the second directional coupler, and the second pair of phase shifters, and to be detected by the photodetector, causing the optical circuit to perform operations:

which produces outputs AND(x1, x2) and OR(x1, x2).

In some embodiments, the optical circuit may include a third directional coupler having two inputs and two outputs, the two inputs configured to receive signals from the second pair of phase shifters, a third pair of phase shifters configured to modify phases of signals at the two outputs of the third directional coupler, a fourth directional coupler having two inputs and two outputs, the two inputs configured to receive signals from the third pair of phase shifters, a fourth pair of phase shifters configured to modify phases of signals at the two outputs of the fourth directional coupler, and a second photodetector configured to generate absolute values of signals from the fourth pair of phase shifters to cause the optical circuit to perform operations of:

which produces outputs AND(x1, x2) and OR(x1, x2).
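
The equations for this circuit are not reproduced in the text above, so the following Python sketch is only one plausible numerical reading of it and should not be taken as the disclosed configuration. It assumes that each directional coupler together with its pair of phase shifters acts as a real 2-by-2 rotation, that the first stage rotates by +45 degrees, that photodetection is modeled as an absolute value, and that the second stage rotates by -45 degrees. Under those assumptions the two outputs equal the minimum and maximum of the inputs, which for binary inputs are AND(x1, x2) and OR(x1, x2).

```python
import math

def rotate(a, b, theta):
    """Apply a 2-by-2 rotation by angle theta to the signal pair (a, b)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * a - s * b, s * a + c * b

def and_or(x1, x2):
    """Return (AND, OR) of two non-negative inputs under the stated assumptions."""
    # Stage 1: ((x1 - x2) / sqrt(2), (x1 + x2) / sqrt(2))
    p, q = rotate(x1, x2, math.pi / 4)
    # Photodetection modeled as an absolute value.
    p, q = abs(p), abs(q)
    # Stage 2: ((q + p) / sqrt(2), (q - p) / sqrt(2)) = (max, min)
    or_out, and_out = rotate(p, q, -math.pi / 4)
    return and_out, or_out
```

The underlying identity is min(a, b) = (a + b - |a - b|) / 2 and max(a, b) = (a + b + |a - b|) / 2, which holds for any real inputs, not only binary ones.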

In some embodiments, the system may include a bitonic sorter configured to perform a sorting function of the bitonic sorter using the optical circuit.

In some embodiments, the system may include a device configured to perform a hash function using the optical circuit.

In some embodiments, the hash function may include Secure Hash Algorithm 2 (SHA-2).
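
By way of illustration only, a bitonic sorting network built from a two-input compare-exchange element may be sketched in Python as follows. The compare-exchange element is written with the arithmetic identity min(a, b) = (a + b - |a - b|) / 2 and max(a, b) = (a + b + |a - b|) / 2; treating that element as the role played by the optical circuit is an assumption of this sketch, and the function names are hypothetical. The network itself is the standard iterative bitonic sort for input lengths that are powers of two.

```python
def compare_exchange(a, b):
    """Return (min, max) of two numbers using only sums, differences, and absolute value."""
    total, diff = a + b, abs(a - b)
    return (total - diff) / 2, (total + diff) / 2

def bitonic_sort(values):
    """Sort a list whose length is a power of two in ascending order."""
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    lo, hi = compare_exchange(a[i], a[partner])
                    ascending = (i & k) == 0
                    a[i], a[partner] = (lo, hi) if ascending else (hi, lo)
            j //= 2
        k *= 2
    return a
```

The sequence of comparisons does not depend on the data, which is the property that makes a sorting network of this kind suitable for a fixed arrangement of two-input elements.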

Generally, systems for performing calculations use different types of operations to produce a calculation result, and each operation may be performed on the type of signal (e.g., electrical or optical) whose underlying physical characteristics best suit that operation (e.g., in terms of energy consumption and/or speed). Three such operations are copying, summation, and multiplication. Copying may be performed using optical power splitting, summation may be performed using electrical current-based summation, and multiplication may be performed using optical amplitude modulation, as described in more detail below. An example of a calculation that may be performed using these three types of operations is multiplying a vector by a matrix (e.g., as employed in artificial neural network calculations). These operations represent a set of general linear operations from which various other calculations may be performed, including, but not limited to, vector-vector dot products, vector-vector element-wise multiplication, vector-scalar element-wise multiplication, or matrix-matrix element-wise multiplication. Some of the examples described herein illustrate techniques and configurations for vector-matrix multiplication, but the corresponding techniques and configurations may be used for any of these types of computations.
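
By way of illustration only, the decomposition of a vector-matrix multiplication into copying, multiplication, and summation may be sketched in Python as follows. The function name is hypothetical, and the sketch is purely numerical: it ignores the division of optical power among the copies and any noise, and it assumes matrix elements already expressed as scaling coefficients.

```python
import numpy as np

def vector_matrix_multiply(matrix, x):
    """Compute matrix @ x as three explicit steps: copy, multiply, sum."""
    matrix = np.asarray(matrix, dtype=float)
    x = np.asarray(x, dtype=float)
    rows, _ = matrix.shape
    copies = np.tile(x, (rows, 1))   # copying: one copy of the input vector per matrix row
    products = copies * matrix       # multiplication: scale each copy element by a matrix element
    return products.sum(axis=1)      # summation: add the scaled values of each row
```

The same three steps cover the other linear operations mentioned above; for example, a vector-vector dot product corresponds to a matrix with a single row.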

Aspects can have one or more of the following advantages.

The optoelectronic computing systems described herein, which use both electrical and optical signals, may facilitate increased flexibility and/or efficiency. In the past, there may have been potential challenges associated with combining optical (or photonic) and electrical (or electronic) integrated devices on a common platform, such as a common semiconductor die, or multiple semiconductor dies combined in a controlled collapse chip connection or "flip-chip" arrangement. Such potential challenges may include, for example, input/output (I/O) packaging or temperature control. For the systems described herein, the potential challenges may be compounded when a relatively large number of optical input/output ports and a relatively large number of electrical input/output ports are used (e.g., 4 or more optical input/output ports, 200 or more electrical input/output ports). For example, in a controlled collapse chip connection, a semiconductor die having a photonic integrated circuit (e.g., implementing the optical processor described below with reference to FIG. 1A) may include electrical input ports and electrical output ports that are connected to corresponding electrical output ports and electrical input ports of an electronic integrated circuit (e.g., implementing the controller 110, the memory unit 120, the digital-to-analog converter (DAC) unit 130, and/or the analog-to-digital converter (ADC) unit 160 described below with reference to FIG. 1A). For example, a controlled collapse chip connection may use solder balls (or "bumps") of an alloy composition in direct contact with metal pads integrated into the dies, eliminating the need for more complex and less compact wire-to-pad bonding. These potential challenges can be alleviated using appropriate system designs. 
For example, the system may use a high-density packaging arrangement that uses temperature control (e.g., thermoelectric cooling) to control thermal expansion between different material types (e.g., semiconductor materials such as silicon, glass materials such as silica, ceramic materials, etc.), and/or may use an enclosing housing that acts as a heat sink and provides a degree of sealing. With such temperature stabilization techniques, mismatches in the coefficients of thermal expansion (CTE), and the resulting misalignment between the ports of the system and the ports of a packaged high-density fiber array, can be limited.

For copy operations, since optical power splitting is passive, no power needs to be consumed to perform the operation. In addition, the frequency bandwidth of an electrical splitter is limited by its RC time constant. In contrast, the frequency bandwidth of an optical splitter is virtually unlimited. Different types of optical power splitters may be used, including waveguide optical splitters or free-space beam splitters, as described in more detail below.

For the multiplication operation, one value may be encoded on an optical signal and the other value may be encoded as an amplitude scaling coefficient (e.g., a multiplier in the range of 0 to 1). After the scaling coefficient has been set, the multiplication can be performed in the optical domain with a reduced (or no) need to modulate an electrical signal, and therefore with reduced constraints due to electrical noise, power consumption, and bandwidth limitations. By appropriate selection of the detection scheme, signed results (e.g., multiplication by a value between -1 and +1) may be obtained, as described in more detail below.

For the summing operation, different techniques may be used to produce a current in a conductor whose magnitude is determined by the sum of different contributions. In the case of input current signals, when two or more conductors carrying those input current signals are combined at a junction, a single conductor leaving the junction carries an output current signal that represents the sum of the input current signals. In the case of input optical signals, when two or more optical waves of different wavelengths impinge on a detector, the photocurrent generated by the detector represents the sum of the powers of the input optical signals. Both techniques produce an electrical signal (e.g., a current) as an output representing the sum, but one uses currents as inputs (current-input-based summation, also referred to as "electrical summation" performed in the "electrical domain"), and the other uses optical waves as inputs (optical-input-based summation, also referred to as "optoelectronic summation" performed in the "optoelectronic domain"). In some embodiments, current-input-based summation is used instead of optical-input-based summation, which enables a single optical wavelength to be used in the system and avoids potentially complex system components that might otherwise be needed to provide and maintain multiple wavelengths.

These basic operations, performed by respective modules, may be combined and arranged to perform linear operations, such as vector-matrix multiplication with arbitrary matrix element magnitudes. Other implementations of matrix multiplication that use optical signals and interferometers to combine signals by optical interference have been limited to vector-matrix multiplication with certain restrictions, such as unitary or diagonal matrices. In addition, some other implementations may rely on large-scale phase alignment of multiple optical signals as they propagate through a relatively large number of optical components (e.g., optical modulators). In contrast, embodiments described herein may relax such phase alignment constraints by converting the optical signals to electrical signals after they have propagated through fewer optical components (e.g., through no more than a single optical amplitude modulator), which allows the use of optical signals with reduced coherence, or even incoherent optical signals when using optical modulators that do not rely on constructive/destructive interference.

For time domain encoding of optical and electrical signals, as described in more detail below, analog electronic circuits may be optimized for operation at a particular power level, which may be helpful when the circuits operate at high speed. Such time domain encoding is useful for reducing challenges that may be associated with precisely controlling a relatively large number of clearly distinguishable intensity levels for each symbol. Instead, a relatively constant amplitude may be used for the "on" level (with a zero or near-zero amplitude for the "off" level), while precise control of the duty cycle is applied in the time domain over multiple time slots within a single symbol duration.
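
By way of illustration only, duty-cycle encoding over the time slots of one symbol may be sketched in Python as follows. The function names, the choice to place all "on" slots first, and the rounding of the value to the nearest whole number of slots are assumptions made for the example.

```python
def encode_duty_cycle(value, n_slots):
    """Encode a value in [0, 1] as n_slots on/off amplitudes (1 = "on", 0 = "off")."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0, 1]")
    on_slots = round(value * n_slots)
    return [1] * on_slots + [0] * (n_slots - on_slots)

def decode_duty_cycle(slots):
    """Recover the encoded value as the fraction of "on" slots in the symbol."""
    return sum(slots) / len(slots)

def scaled_integral(slots, weight):
    """Scale every slot by the same coefficient and average over the symbol duration."""
    return sum(weight * s for s in slots) / len(slots)
```

With 8 slots, a value of 0.75 is sent as six "on" slots followed by two "off" slots, and scaling every slot by 0.5 before averaging yields 0.375, the product of the encoded value and the coefficient. Only two amplitude levels are ever used; the resolution is set by the number of slots.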

By integrating photonics and electronics on a common substrate (e.g., a silicon chip), or by connecting dies using a flip-chip configuration as described above, the modules can be conveniently fabricated at large scale and coupled into a compact system. Routing signals on the substrate as optical signals rather than electrical signals, in a manner that allows the photodetectors to be grouped in a portion of the substrate and/or in a compact die layout (as described in more detail below), can help avoid long electronic wiring and its associated challenges (e.g., parasitic capacitance, inductance, and crosstalk).

For embodiments of a system using submatrix multiplication, elements of the output vector may be computed concurrently using different devices (e.g., different cores, different processors, different computers, different servers), which helps to alleviate some potential limitations (e.g., the memory wall) and helps the overall system scale to very large matrices. In some embodiments, each submatrix may be multiplied by a corresponding subvector using a different device. The sum may then be calculated by collecting or accumulating the summands from the different devices. Intermediate results in the form of optical signals can be conveniently transmitted between devices, even if the devices are separated by relatively large distances.
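
By way of illustration only, submatrix multiplication with accumulation of summands may be sketched in Python as follows. The function name and the fixed square block size are assumptions; in the sketch each submatrix-subvector product is computed in turn within one process, whereas the text above contemplates assigning such products to different devices.

```python
import numpy as np

def blocked_matvec(matrix, x, block):
    """Multiply matrix by x one submatrix at a time, accumulating partial sums."""
    matrix = np.asarray(matrix, dtype=float)
    x = np.asarray(x, dtype=float)
    rows, cols = matrix.shape
    y = np.zeros(rows)
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            # Each product could be computed by a different device;
            # its result is one summand of the output elements in rows r..r+block.
            y[r:r + block] += matrix[r:r + block, c:c + block] @ x[c:c + block]
    return y
```

The result matches a direct matrix-vector product for any block size, including sizes that do not divide the matrix dimensions evenly, because the slices at the edges are simply shorter.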

Other aspects include other combinations of the features described above and other features expressed as methods, apparatus, systems, program products, and in other ways.

Particular embodiments of the subject matter described in this specification can be implemented to realize one or more of the following advantages. The throughput, the latency, or both, of artificial neural network computation may be improved. The power efficiency of artificial neural network computation may be improved.

In another aspect, an apparatus includes a plurality of optical waveguides, wherein a set of multiple input values is encoded on respective optical signals carried by the optical waveguides, a plurality of replication modules including, for each of at least two subsets of one or more optical signals, a respective set of one or more replication modules configured to divide the subset of one or more optical signals into two or more copies of the optical signals, a plurality of multiplication modules including, for each of at least two copies of a first subset of one or more optical signals, a respective multiplication module configured to multiply the one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation, wherein at least one of the multiplication modules includes an optical amplitude modulator that includes one input port and two output ports and provides a pair of correlated optical signals from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying an input value by a signed matrix element value, and one or more summation modules including, for results of two or more of the multiplication modules, a summation module configured to generate an electrical signal representing a sum of the results of the two or more multiplication modules.

Embodiments of the apparatus may include one or more of the following features. For example, an input value in a set of multiple input values encoded on a respective optical signal may represent an element of an input vector multiplied by a matrix comprising one or more matrix element values.

In some embodiments, a set of multiple output values may be encoded on respective electrical signals generated by one or more summing modules, and the output values of the set of multiple output values may represent elements of an output vector, the output vector being generated by multiplying the input vector by a matrix.

In some embodiments, each optical signal carried by the optical waveguide may comprise an optical wave having a common wavelength that is substantially the same for all optical signals.

In some embodiments, the replication module may include at least one replication module having an optical splitter that transmits power of a predetermined proportion of the optical waves at the input port to the first output port and transmits power of the remaining proportion of the optical waves at the input port to the second output port.
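One way to picture how splitters with predetermined ratios can produce several copies is a cascade of taps. The Python sketch below assumes ideal lossless splitters and a hypothetical cascade in which the k-th splitter diverts a fraction 1/(n - k) of the power that reaches it, which yields n copies of equal power. The cascade topology and function names are assumptions for illustration only.

```python
def split(power, ratio):
    """Ideal lossless splitter: `ratio` of the power goes to the first
    output port and the remainder goes to the second output port."""
    return power * ratio, power * (1.0 - ratio)


def equal_copies(power, n):
    """Produce n equal-power copies with a cascade of n - 1 splitters."""
    copies = []
    remaining = power
    for k in range(n - 1):
        tapped, remaining = split(remaining, 1.0 / (n - k))
        copies.append(tapped)
    copies.append(remaining)  # the last copy is whatever power is left
    return copies


print(equal_copies(1.0, 4))  # four copies, each carrying about 0.25
```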

In some embodiments, the optical splitter may include a waveguide splitter that transmits power of a predetermined proportion of the light waves guided by the input light waveguide to the first output light waveguide and transmits power of the remaining proportion of the light waves guided by the input light waveguide to the second output light waveguide.

In some embodiments, the guided mode of the input optical waveguide may be adiabatically coupled to the guided mode of each of the first and second output optical waveguides.

In some embodiments, the optical splitter may include a beam splitter that includes at least one surface that transmits a predetermined proportion of the power of the optical wave at the input port and reflects the remaining proportion of the power of the optical wave at the input port.

In some embodiments, at least one of the plurality of optical waveguides may include an optical fiber coupled to an optical coupler that couples a guided mode of the optical fiber to a free-space propagation mode.

In some embodiments, the multiplication modules may include at least one coherence-sensitive multiplication module configured to multiply one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation based on interference between optical waves, the optical waves having a coherence length at least as long as a propagation distance through the coherence-sensitive multiplication module.

In some embodiments, the coherence-sensitive multiplication module may include a Mach-Zehnder interferometer (MZI) that splits the optical waves guided by an input optical waveguide into a first optical waveguide arm of the Mach-Zehnder interferometer and a second optical waveguide arm of the Mach-Zehnder interferometer, the first optical waveguide arm including a phase shifter that produces a relative phase shift with respect to a phase delay of the second optical waveguide arm, and the Mach-Zehnder interferometer combining the optical waves from the first optical waveguide arm and the second optical waveguide arm into at least one output optical waveguide.

In some embodiments, the Mach-Zehnder interferometer may combine optical waves from the first and second optical waveguide arms into each of a first output optical waveguide and a second output optical waveguide, a first photodetector may receive optical waves from the first output optical waveguide to produce a first photocurrent, a second photodetector may receive optical waves from the second output optical waveguide to produce a second photocurrent, and the result of the coherence-sensitive multiplication module may include a difference between the first and second photocurrents.
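The differential detection described above can be illustrated numerically. The Python sketch below assumes an ideal, lossless, balanced Mach-Zehnder interferometer whose two output powers are P·cos²(φ/2) and P·sin²(φ/2), and photodetectors whose photocurrent is proportional to optical power. Under those assumptions the photocurrent difference is proportional to P·cos(φ), so a signed weight w in [-1, 1] can be set by choosing φ = arccos(w). The function names and the `responsivity` parameter are hypothetical.

```python
import math


def mzi_output_powers(input_power, phase):
    """Output powers of an ideal balanced MZI for a relative arm phase."""
    p1 = input_power * math.cos(phase / 2) ** 2
    p2 = input_power * math.sin(phase / 2) ** 2
    return p1, p2


def phase_for_weight(weight):
    """Arm phase that makes (p1 - p2) / input_power equal to `weight`."""
    if not -1.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [-1, 1]")
    return math.acos(weight)


def signed_product(input_power, weight, responsivity=1.0):
    """Difference of the two photocurrents, proportional to input * weight."""
    p1, p2 = mzi_output_powers(input_power, phase_for_weight(weight))
    return responsivity * (p1 - p2)


print(signed_product(2.0, -0.5))  # close to -1.0
```

Because both outputs carry non-negative power, the sign of the product comes only from which photodetector receives more light.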

In some embodiments, the coherence-sensitive multiplication module may include one or more ring resonators, including at least one ring resonator coupled to a first optical waveguide and at least one ring resonator coupled to a second optical waveguide.

In some embodiments, a first photodetector may receive optical waves from the first optical waveguide to produce a first photocurrent, a second photodetector may receive optical waves from the second optical waveguide to produce a second photocurrent, and the result of the coherence-sensitive multiplication module may include a difference between the first photocurrent and the second photocurrent.

In some embodiments, the multiplication modules may include at least one coherence-insensitive multiplication module configured to multiply the one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation based on absorption of energy within an optical wave.

In some embodiments, the coherence-insensitive multiplication module may include an electro-absorption modulator.

In some embodiments, the one or more summing modules may include at least one summing module having (1) two or more input conductors, each carrying an electrical signal in the form of an input current, the magnitude of the input current representing a respective result of a respective one of the multiplying modules, and (2) at least one output conductor carrying an electrical signal representing a sum of the respective results in the form of an output current, the output current being proportional to the sum of the input currents.

In some embodiments, the two or more input conductors and the output conductor may include a plurality of wires that meet at one or more nodes between the wires, such that the output current is substantially equal to the sum of the input currents.

In some embodiments, at least a first input current of the input currents may be provided in the form of at least one photocurrent generated by at least one photodetector that receives the optical signal generated by the first multiplication module of the multiplication modules.

In some embodiments, the first input current may be provided in the form of a difference between two photocurrents generated by different respective photodetectors that receive different respective optical signals generated by the first multiplication module.

In some embodiments, one of the copies of the first subset of one or more optical signals may be comprised of a single optical signal, wherein one of the input values is encoded on the single optical signal.

In some embodiments, the multiplication module corresponding to the copy of the first subset may multiply the encoded input value by a single matrix element value.

In some embodiments, one of the copies of the first subset of one or more optical signals may include more than one optical signal and less than all of the optical signals on which the plurality of input values are encoded.

In some embodiments, the multiplication module corresponding to the copy of the first subset may multiply the encoded input values by different respective matrix element values.

In some embodiments, different multiplication modules corresponding to different respective copies of the first subset of one or more optical signals may be contained in different devices, the different devices being in optical communication to transmit one of the copies of the first subset of one or more optical signals between the different devices.

In some embodiments, two or more of the plurality of optical waveguides, two or more of the plurality of replication modules, two or more of the plurality of multiplication modules, and at least one of the one or more summation modules may be disposed on a substrate of a common device.

In some embodiments, the device performs vector matrix multiplication, where an input vector may be provided as a set of optical signals and an output vector may be provided as a set of electrical signals.
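To make the replicate, multiply, and sum stages concrete, the following Python sketch models a vector-matrix multiplication in which each input signal is split into one copy per matrix row, each copy is scaled by a matrix element, and the scaled copies for a row are summed. It is a purely numerical sketch: the equal power split, the rescaling by the number of rows, and all names are assumptions, and signed weights are applied directly rather than through differential detection.

```python
def optoelectronic_matvec(matrix, input_vector):
    """Model of replicate -> multiply -> sum for a matrix on nested lists."""
    n_rows = len(matrix)

    # Replication stage: every input is split into n_rows equal copies,
    # so each copy carries 1 / n_rows of the original signal.
    copies = [[x / n_rows for x in input_vector] for _ in range(n_rows)]

    output = []
    for row, row_copies in zip(matrix, copies):
        # Multiplication stage: one modulator per matrix element.
        products = [w * c for w, c in zip(row, row_copies)]
        # Summation stage: the products for a row are added (for example,
        # as currents meeting at a node); rescale to undo the split.
        output.append(n_rows * sum(products))
    return output


W = [[0.5, -1.0, 0.25], [1.0, 0.0, -0.5]]
x = [2.0, 4.0, 8.0]
print(optoelectronic_matvec(W, x))  # [-1.0, -2.0]
```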

In some embodiments, the apparatus may further include an accumulator that combines input electrical signals corresponding to the outputs of the multiplication modules or the summation modules, wherein the input electrical signals may be encoded using time domain encoding that uses on-off amplitude modulation within each of a plurality of time slots, and the accumulator may generate an output electrical signal encoded at more than two amplitude levels, the amplitude levels corresponding to different duty cycles of the time domain encoding over the plurality of time slots.

In some embodiments, each of the two or more of the multiplication modules corresponds to a different subset of the one or more optical signals.

In some embodiments, the apparatus may further comprise a multiplication module for each copy of a second subset of the one or more optical signals different from the optical signals in the first subset of the one or more optical signals, configured to multiply the one or more optical signals of the second subset by the one or more matrix element values using optical amplitude modulation.

In another aspect, a method includes encoding a set of multiple input values on respective optical signals, for each of at least two subsets of one or more optical signals, using a respective set of one or more replication modules to divide the subset of one or more optical signals into two or more copies of the optical signals, for each of at least two copies of a first subset of one or more optical signals, using a respective multiplication module to multiply the one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation, wherein at least one multiplication module includes an optical amplitude modulator that includes one input port and two output ports and provides a pair of correlated optical signals from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying an input value by a signed matrix element value, and, for results of two or more of the multiplication modules, using a summation module to generate an electrical signal representing a sum of the results of the two or more multiplication modules.

In another aspect, a method includes encoding a set of input values representing elements of an input vector on respective optical signals, encoding a set of coefficients representing elements of a matrix as amplitude modulation levels of a set of optical amplitude modulators coupled to the optical signals, wherein at least one optical amplitude modulator including one input port and two output ports provides a pair of correlated optical signals from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying an input value by a signed matrix element value, and encoding a set of output values representing elements of an output vector on respective electrical signals, wherein at least one electrical signal is in the form of a current whose amplitude corresponds to a sum of respective elements of the input vector multiplied by respective elements of a row of the matrix.

Embodiments of the method may include one or more of the following features. For example, at least one optical signal may be provided by a first optical waveguide, and the first optical waveguide may be coupled to an optical splitter that transmits a predetermined proportion of the power of the optical waves guided by the first optical waveguide to a second optical waveguide, and transmits the remaining proportion of the power of the optical waves guided by the first optical waveguide to a third optical waveguide.

In another aspect, an apparatus includes a plurality of optical waveguides, wherein a set of input values representing elements of an input vector is encoded on respective optical signals carried by the optical waveguides, a set of optical amplitude modulators coupled to the optical signals, wherein a set of coefficients representing elements of a matrix is encoded as amplitude modulation levels of the optical amplitude modulators, and at least one optical amplitude modulator including one input port and two output ports provides a pair of correlated optical signals from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying an input value by a signed matrix element value, and a plurality of summation modules that encode a set of output values representing elements of an output vector on respective electrical signals, wherein at least one electrical signal is in the form of a current whose amplitude corresponds to a sum of respective elements of the input vector multiplied by respective elements of a row of the matrix.

In another aspect, a method for multiplying an input vector by a given matrix includes encoding a set of input values representing elements of the input vector on respective optical signals of a set of optical signals, coupling a first set of one or more devices to a first set of one or more waveguides providing a first subset of the set of optical signals and producing a result of multiplying a first submatrix of the given matrix by values encoded on the first subset of the set of optical signals, coupling a second set of one or more devices to a second set of one or more waveguides providing a second subset of the set of optical signals and producing a result of multiplying a second submatrix of the given matrix by values encoded on the second subset of the set of optical signals, coupling a third set of one or more devices to a third set of one or more waveguides providing a copy of the first subset of the set of optical signals produced by a first optical splitter and producing a result of multiplying a third submatrix of the given matrix by values encoded on the first subset of the set of optical signals, coupling a fourth set of one or more devices to a second set of one or more devices coupled to a second set of one or more devices that provide a result of the second submatrix of the set of optical signals by values encoded on the first subset of the set of optical signals, multiplying the first submatrix by the first submatrix and the first set of the first submatrix and the first output device and the first submatrix and the first device and the first submatrix form a result of the first output and the first matrix and the first submatrix.

Embodiments of the method may include one or more of the following features. For example, each pair of the first set of one or more devices, the second set of one or more devices, the third set of one or more devices, and the fourth set of one or more devices may be mutually exclusive.

In another aspect, an apparatus includes a first set of one or more devices configured to receive a first set of optical signals and produce a result of multiplying a first matrix by values encoded on the first set of optical signals, a second set of one or more devices configured to receive a second set of optical signals and produce a result of multiplying a second matrix by values encoded on the second set of optical signals, a third set of one or more devices configured to receive a third set of optical signals and produce a result of multiplying a third matrix by values encoded on the third set of optical signals, a fourth set of one or more devices configured to receive a fourth set of optical signals and produce a result of multiplying a fourth matrix by values encoded on the fourth set of optical signals, and a configurable connection path between two or more of the first set of one or more devices, the second set of one or more devices, the third set of one or more devices, or the fourth set of one or more devices, wherein the first configurable connection path is configured to provide a first signal (1) from the first set of optical signals as a summation module and the sum signal from the second set of one or more optical signals is provided as a sum module (2) of sum signals.

In another aspect, an apparatus includes a first set of one or more devices configured to receive a first set of one or more optical signals and generate a result based on an optical amplitude modulation of one or more optical signals of the first set of one or more optical signals, a second set of one or more devices configured to receive a second set of one or more optical signals and generate a result based on an optical amplitude modulation of one or more optical signals of the second set of one or more optical signals, a third set of one or more devices configured to receive a third set of one or more optical signals and generate a result based on an optical amplitude modulation of one or more optical signals of the third set of one or more optical signals, a fourth set of one or more devices configured to receive a fourth set of one or more optical signals and generate a result based on an optical amplitude modulation of one or more optical signals of the fourth set of one or more optical signals, and a configurable connection path between two or more of the first set of one or more devices, the second set of one or more devices, the third set of one or more devices, wherein the configurable connection path is configured to provide a first signal from a summing module(s) to a summing module(s) as a sum module(s) that is configured to provide a sum signal(s) from a first sum module(s) to a second module(s) that is configured to provide a sum signal(s) of one or more signals.

Embodiments of the apparatus may include one or more of the following features. For example, each pair of the first set of one or more devices, the second set of one or more devices, the third set of one or more devices, and the fourth set of one or more devices may be mutually exclusive.

In some embodiments, the first configuration of the configurable connection path is configured to (1) provide a copy of the first set of optical signals as the third set of optical signals, and (2) provide one or more signals from the first set of one or more devices and one or more signals from the second set of one or more devices to a summation module configured to generate an electrical signal representative of a sum of values encoded on at least two different signals received by the summation module.

In some embodiments, a first configuration of the configurable connection path may be configured to provide a copy of the first set of optical signals as the third set of optical signals, and a second configuration of the configurable connection path may be configured to provide one or more signals from the first set of one or more devices and one or more signals from the second set of one or more devices to a summation module configured to generate an electrical signal representative of a sum of values encoded on the signals received by the summation module.

In another aspect, an apparatus includes a plurality of optical waveguides in which a set of a plurality of input values is encoded on respective optical signals carried by the optical waveguides, a plurality of replica modules including, for each of at least two subsets of one or more optical signals, a respective set of one or more replica modules configured to divide the subset of one or more optical signals into copies of two or more optical signals, a plurality of multiplication modules including, for each of at least two copies of a first subset of one or more optical signals, a respective multiplication module configured to multiply the one or more optical signals of the first subset by one or more values using optical amplitude modulation, and one or more summation modules including, for the result of the two or more multiplication modules, a summation module configured to generate an electrical signal representative of a sum of the results of the two or more multiplication modules, wherein the result includes at least one electrical signal encoded on the electrical signal and the result is not transmitted beyond a single amplitude of the optical signal by a single amplitude modulator.

In another aspect, a system includes a first unit configured to generate a plurality of modulator control signals, and a processor including a light source configured to provide a plurality of light outputs, a plurality of light modulators coupled to the light source and the first unit, the plurality of light modulators configured to generate an optical input vector by modulating the plurality of light outputs provided by the light source based on the plurality of modulator control signals, the optical input vector including a plurality of optical signals, and a matrix multiplication unit coupled to the plurality of light modulators and the first unit, the matrix multiplication unit configured to convert the optical input vector into an analog output vector based on a plurality of weight control signals. The system also includes a second unit coupled to the matrix multiplication unit and configured to convert the analog output vector to a digital output vector, and a controller including an integrated circuit configured to perform operations including receiving an artificial neural network computation request including an input data set that includes a first digital input vector, receiving a first plurality of neural network weights, and generating, by the first unit, a first plurality of modulator control signals based on the first digital input vector and a first plurality of weight control signals based on the first plurality of neural network weights.

Embodiments of the system may include one or more of the following features. For example, the first unit may include a digital-to-analog converter (DAC).

In some embodiments, the second unit may include an analog-to-digital converter (ADC).

In some embodiments, the system may include a storage unit configured to store a data set and a plurality of neural network weights.

In some embodiments, the integrated circuit of the controller may be further configured to perform operations including storing the input data set and the first plurality of neural network weights in the storage unit.

In some embodiments, the first unit may be configured to generate a plurality of weight control signals.

In some embodiments, the controller may include an Application Specific Integrated Circuit (ASIC), and receiving the artificial neural network computation request may include receiving the artificial neural network computation request from a general purpose data processor.

In some embodiments, the first unit, the processing unit, the second unit, and the controller may be disposed on at least one of a multi-chip module or an integrated circuit. Receiving the artificial neural network computation request may include receiving the artificial neural network computation request from a second data processor, wherein the second data processor may be external to the multi-chip module or integrated circuit, the second data processor may be coupled to the multi-chip module or integrated circuit through a communication channel (communication channel), and the processing unit may process the data at a data rate that is at least an order of magnitude greater than a data rate of the communication channel.

In some embodiments, the first unit, the processing unit, the second unit, and the controller may be used for an optoelectronic processing loop that repeats in a plurality of iterations, and the optoelectronic processing loop includes (1) at least a first light modulation operation based on at least one of the plurality of modulator control signals, and at least a second light modulation operation based on at least one of the weight control signals, and (2) at least one of (a) an electrical summation operation or (b) an electrical storage operation.

In some embodiments, the optoelectronic processing loop may include an electrical storage operation, and the electrical storage operation may be performed using a storage unit coupled to the controller, wherein the operations performed by the controller may further include storing the input data set and the first plurality of neural network weights in the storage unit.

In some embodiments, the optoelectronic processing loop may include an electrical summing operation, and the electrical summing operation may be performed using an electrical summing module within the matrix multiplication unit, wherein the electrical summing module may be configured to generate currents corresponding to elements of an analog output vector representing a sum of respective elements of the optical input vector multiplied by respective neural network weights.

In some embodiments, the optoelectronic processing loop may include at least one signal path on which no more than one first optical modulation operation is performed in a single loop iteration based on at least one of the plurality of modulator control signals, and no more than one second optical modulation operation is performed in a single loop iteration based on at least one of the weight control signals.

In some embodiments, the first light modulation operation may be performed by one of the plurality of light modulators coupled to the light source and the matrix multiplication unit, and the second light modulation operation may be performed by a light modulator included in the matrix multiplication unit.

In some embodiments, the optoelectronic processing loop may include at least one signal path on which no more than one electrical storage operation is performed in a single loop iteration.

In some embodiments, the light source may include a laser unit configured to generate a plurality of light outputs.

In some embodiments, the matrix multiplication unit may include an input waveguide array for receiving the optical input vector and the optical input vector includes a first array of optical signals, an optical interference unit in optical communication with the input waveguide array for performing a linear transformation that converts the optical input vector into a second array of optical signals, and an output waveguide array in optical communication with the optical interference unit for guiding the second array of optical signals, wherein at least one input waveguide in the input waveguide array is in optical communication with each output waveguide in the output waveguide array through the optical interference unit.

In some embodiments, the optical interference unit may include a plurality of interconnected Mach-Zehnder interferometers (MZIs), each Mach-Zehnder interferometer of the plurality of interconnected Mach-Zehnder interferometers including a first phase shifter configured to change a splitting ratio of the Mach-Zehnder interferometer, and a second phase shifter configured to shift a phase of one output of the Mach-Zehnder interferometer, wherein the first phase shifter and the second phase shifter are coupled to the plurality of weight control signals.
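The roles of the two phase shifters can be illustrated with a 2x2 transfer matrix. The Python sketch below assumes ideal lossless 50:50 couplers with the common convention (1/√2)·[[1, i], [i, 1]], an internal phase shifter `theta` on one arm, and an external phase shifter `phi` on one output. These conventions and all function names are assumptions for illustration and are not taken from the disclosure.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mzi_transfer(theta, phi):
    """Transfer matrix: coupler, internal phase, coupler, output phase."""
    s = 1 / math.sqrt(2)
    coupler = [[s, 1j * s], [1j * s, s]]
    internal = [[cmath.exp(1j * theta), 0], [0, 1]]
    external = [[cmath.exp(1j * phi), 0], [0, 1]]
    # Light meets the rightmost matrix first.
    return matmul2(external, matmul2(coupler, matmul2(internal, coupler)))


def output_powers(transfer, amplitudes):
    """Optical power at each output port for given input amplitudes."""
    out = [sum(transfer[i][k] * amplitudes[k] for k in range(2))
           for i in range(2)]
    return [abs(a) ** 2 for a in out]


# Light enters on the first input port only.
print(output_powers(mzi_transfer(math.pi / 2, 0.0), [1, 0]))  # about [0.5, 0.5]
print(output_powers(mzi_transfer(math.pi / 2, 1.0), [1, 0]))  # same powers
```

In this model the internal phase sets the splitting ratio (sin²(θ/2) to the first output and cos²(θ/2) to the second), while the external phase changes only the phase of one output and leaves the output powers unchanged.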

In some embodiments, a matrix multiplication unit may include a plurality of replication modules, wherein each replication module corresponds to a subset of one or more optical signals of an optical input vector and is configured to divide the subset of one or more optical signals into two or more copies of the optical signals, a plurality of multiplication modules, wherein each multiplication module corresponds to a subset of one or more optical signals and is configured to multiply the one or more optical signals of the subset by one or more matrix element values using optical amplitude modulation, and one or more summation modules, wherein each summation module is configured to generate an electrical signal that represents a sum of results of two or more of the multiplication modules.

In some embodiments, the at least one multiplication module comprises an optical amplitude modulator comprising one input port and two output ports, and a pair of correlated optical signals may be provided from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a multiplication of the input value by the signed matrix element value.

In some embodiments, the matrix multiplication unit may be configured to multiply the optical input vector by a matrix comprising one or more matrix element values.

In some embodiments, a set of multiple output values may be encoded on respective electrical signals generated by one or more summing modules, and the output values of the set of multiple output values may represent elements of an output vector, the output vector being generated by multiplying the optical input vector by a matrix.

In some embodiments, the system may include a storage unit configured to store the input data set and the neural network weights, the second unit may include an analog-to-digital conversion (ADC) unit, and the operations may further include obtaining a first plurality of digital outputs from the ADC unit corresponding to the analog output vectors of the matrix multiplication unit, the first plurality of digital outputs forming a first digital output vector, performing a nonlinear transformation on the first digital output vector to produce a first transformed digital output vector, and storing the first transformed digital output vector in the storage unit.
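The sequence of operations described above (digital-to-analog conversion, analog matrix multiplication, analog-to-digital conversion, nonlinear transformation, storage) can be sketched as a single loop iteration in Python. The sketch is illustrative only: the ideal uniform quantizer, the 8-bit resolution, the full-scale ranges, the choice of ReLU as the nonlinear transformation, and the dictionary used as the storage unit are all assumptions that the disclosure does not specify.

```python
def quantize(value, bits, full_scale):
    """Ideal uniform converter: clip to [-full_scale, full_scale] and
    round to the nearest of the available signed levels."""
    levels = 2 ** (bits - 1) - 1
    clipped = max(-full_scale, min(full_scale, value))
    return round(clipped / full_scale * levels) / levels * full_scale


def relu(value):
    """Example nonlinear transformation (an assumption of this sketch)."""
    return max(0.0, value)


def loop_iteration(digital_input, weights, storage, bits=8, full_scale=4.0):
    # First unit (DAC): control signals for the modulators and the weights.
    modulator_signals = [quantize(x, bits, 1.0) for x in digital_input]
    weight_signals = [[quantize(w, bits, 1.0) for w in row] for row in weights]

    # Matrix multiplication unit: analog output vector.
    analog_output = [sum(w * x for w, x in zip(row, modulator_signals))
                     for row in weight_signals]

    # Second unit (ADC): digital output vector.
    digital_output = [quantize(v, bits, full_scale) for v in analog_output]

    # Controller: nonlinear transformation, then storage.
    transformed = [relu(v) for v in digital_output]
    storage["activations"] = transformed
    return transformed


storage = {}
result = loop_iteration([1.0, 0.0, -1.0],
                        [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]], storage)
print(result)  # first element close to 1.0, second element 0.0
```

The stored vector can then serve as the digital input for the next iteration, for example with weights of the next neural network layer.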

In some embodiments, the system may have a first loop period defined as the time elapsed between the step of storing the input data set and the first plurality of neural network weights in the storage unit and the step of storing the first transformed digital output vector in the storage unit, wherein the first loop period is less than or equal to 1 ns.

In some embodiments, the first unit may include a digital-to-analog conversion (DAC) unit, and the operations may further include generating, by the DAC unit, a second plurality of modulator control signals based on the first transformed digital output vector.

In some embodiments, the first unit may include a digital-to-analog conversion (DAC) unit, the artificial neural network computation request may further include a second plurality of neural network weights, and the operations may further include, based on the obtaining of the first plurality of digital outputs, generating, by the DAC unit, a second plurality of weight control signals based on the second plurality of neural network weights.

In some embodiments, the first plurality of neural network weights and the second plurality of neural network weights may correspond to different layers of an artificial neural network.
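The sequence of operations described in the preceding embodiments (DAC conversion, matrix multiplication, ADC conversion, nonlinear transformation, storage, and loading of the next layer's weights) can be sketched as follows. This is a minimal numerical model under stated assumptions: the uniform quantizer `quantize`, the choice of ReLU as the nonlinear transformation, the 8-bit resolution, and all function names are hypothetical and are not specified by the disclosure.

```python
def quantize(value, bits, full_scale=1.0):
    """Assumed uniform converter model: clip, then round to the nearest step."""
    step = full_scale / (2 ** (bits - 1))
    clipped = max(-full_scale, min(full_scale, value))
    return round(clipped / step) * step


def relu(vector):
    """Example nonlinear transformation (an assumption; others are possible)."""
    return [max(0.0, v) for v in vector]


def run_layers(input_vector, weight_sets, bits=8):
    """Process one digital input vector through successive weight sets."""
    storage = [list(input_vector)]  # stands in for the storage unit
    vector = list(input_vector)
    for weights in weight_sets:
        # DAC unit: modulator control signals and weight control signals.
        modulator_signals = [quantize(x, bits) for x in vector]
        weight_signals = [[quantize(w, bits) for w in row] for row in weights]
        # Matrix multiplication unit: analog output vector.
        analog = [sum(w * x for w, x in zip(row, modulator_signals))
                  for row in weight_signals]
        # ADC unit followed by the nonlinear transformation.
        vector = relu([quantize(a, bits) for a in analog])
        storage.append(vector)
    return vector, storage


layer1 = [[0.5, -0.5], [0.25, 0.25]]
layer2 = [[1.0, -1.0]]
output, stored = run_layers([1.0, 0.5], [layer1, layer2])
print(output, stored)
```

Each pass through the loop corresponds to one layer: the transformed output of one pass becomes the input vector of the next, while a different set of weights is loaded.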

In some embodiments, the first unit may comprise a digital-to-analog conversion (DAC) unit, and the input data set may further comprise a second digital input vector. The operations may further include generating, by the DAC unit, a second plurality of modulator control signals based on the second digital input vector, obtaining from the ADC unit a second plurality of digital outputs corresponding to the analog output vector of the matrix multiplication unit, the second plurality of digital outputs forming a second digital output vector, performing a nonlinear transformation on the second digital output vector to generate a second transformed digital output vector, storing the second transformed digital output vector in the storage unit, and outputting an artificial neural network output generated based on the first transformed digital output vector and the second transformed digital output vector. The analog output vector of the matrix multiplication unit may result from a second optical input vector, generated based on the second plurality of modulator control signals, that is transformed by the matrix multiplication unit based on the first-mentioned plurality of weight control signals.

In some embodiments, the system may include a storage unit configured to store the input data set and the neural network weights, and the second unit may include an analog-to-digital conversion (ADC) unit. The system may further include an analog nonlinear unit disposed between the matrix multiplication unit and the ADC unit, and the analog nonlinear unit may be configured to receive the plurality of output voltages from the matrix multiplication unit, apply a nonlinear transfer function, and output the plurality of converted output voltages to the ADC unit. The operations performed by the integrated circuit of the controller may further include obtaining from the ADC unit a first plurality of converted digital output voltages corresponding to the plurality of converted output voltages, the first plurality of converted digital output voltages forming a first converted digital output vector, and storing the first converted digital output vector in the storage unit.

In some embodiments, the first unit may include a digital-to-analog conversion (DAC) unit and the second unit may include an analog-to-digital conversion (ADC) unit. The matrix multiplication unit may include an optical matrix multiplication unit coupled to the plurality of optical modulators and the DAC unit, the optical matrix multiplication unit configured to convert an optical input vector into an optical output vector based on the plurality of weight control signals, and a photo detection unit coupled to the optical matrix multiplication unit and configured to generate a plurality of output voltages corresponding to the optical output vector.

In some embodiments, the system may further include an analog storage unit disposed between the DAC unit and the plurality of optical modulators, the analog storage unit configured to store an analog voltage and output the stored analog voltage, and an analog nonlinear unit disposed between the photo detection unit and the ADC unit, the analog nonlinear unit configured to receive the plurality of output voltages from the photo detection unit, apply a nonlinear transfer function, and output the plurality of converted output voltages.

In some embodiments, the analog storage unit may be configured to receive and store a plurality of converted output voltages of the analog nonlinear unit and output the stored plurality of converted output voltages to the plurality of optical modulators. The operations may further include storing a plurality of converted output voltages of the analog nonlinear unit in the analog storage unit based on generating the first plurality of modulator control signals and the first plurality of weight control signals, outputting the stored converted output voltages through the analog storage unit, obtaining a second plurality of converted digital output voltages from the ADC unit, the second plurality of converted digital output voltages forming a second converted digital output vector, and storing the second converted digital output vector in the storage unit.

In some embodiments, the system may include a storage unit configured to store the input data set and the neural network weights, and the input data set of the artificial neural network calculation request may include a plurality of digital input vectors. The light source may be configured to produce a plurality of wavelengths. The plurality of optical modulators may include optical modulator groups configured to generate a plurality of optical input vectors, each optical modulator group corresponding to one of the plurality of wavelengths and generating a respective optical input vector having a respective wavelength, and an optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector comprising the plurality of wavelengths. The photodetecting unit may be further configured to demultiplex a plurality of wavelengths and generate a plurality of demultiplexed output voltages. The operations may include obtaining a plurality of digital demultiplexed optical outputs from an ADC unit, the plurality of digital demultiplexed optical outputs forming a plurality of first digital output vectors, wherein each of the plurality of first digital output vectors corresponds to one of a plurality of wavelengths, performing a nonlinear transformation on each of the plurality of first digital output vectors to produce a plurality of transformed first digital output vectors, and storing the plurality of transformed first digital output vectors in a storage unit. Each of the plurality of digital input vectors corresponds to one of the plurality of optical input vectors.

In some embodiments, the system may include a storage unit configured to store the input data set and the neural network weights, the second unit may include an analog-to-digital conversion (ADC) unit, and the artificial neural network calculation request may include a plurality of digital input vectors. The light source may be configured to produce a plurality of wavelengths. The plurality of optical modulators may include optical modulator groups configured to generate a plurality of optical input vectors, each optical modulator group corresponding to one of the plurality of wavelengths and generating a respective optical input vector having a respective wavelength, and an optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector comprising the plurality of wavelengths. The operations may include obtaining from the ADC unit a first plurality of digital optical outputs corresponding to an optical output vector comprising the plurality of wavelengths, the first plurality of digital optical outputs forming a first digital output vector, performing a nonlinear transformation on the first digital output vector to produce a first transformed digital output vector, and storing the first transformed digital output vector in the storage unit.
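The two wavelength-multiplexed variants above can be contrasted with a short numerical sketch. It assumes, for illustration only, that the matrix multiplication unit applies the same matrix to every wavelength and that a detector without demultiplexing responds to the sum of the contributions at all wavelengths; the function names and the wavelength labels (1550 and 1551) are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def wdm_outputs(matrix, inputs_by_wavelength, demultiplex=True):
    """Model a wavelength-multiplexed pass through one matrix.

    inputs_by_wavelength: dict mapping a wavelength label to an input vector.
    demultiplex=True returns one output vector per wavelength.
    demultiplex=False returns a single vector summed over all wavelengths
    (assumed detector behavior when wavelengths are not separated).
    """
    per_wavelength = {wl: matvec(matrix, vec)
                      for wl, vec in inputs_by_wavelength.items()}
    if demultiplex:
        return per_wavelength
    return [sum(out[i] for out in per_wavelength.values())
            for i in range(len(matrix))]


matrix = [[1, 2], [0, 1]]
inputs = {1550: [1, 0], 1551: [0, 1]}
print(wdm_outputs(matrix, inputs, demultiplex=True))
print(wdm_outputs(matrix, inputs, demultiplex=False))
```

With demultiplexing, several input vectors are processed in a single pass and yield separate output vectors; without it, under the stated assumption, the single output vector equals the matrix applied to the sum of the input vectors.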

In some embodiments, the first unit may include a digital-to-analog conversion (DAC) unit, the second unit may include an analog-to-digital conversion (ADC) unit, and the DAC unit may include a 1-bit DAC subunit configured to generate a plurality of 1-bit modulator control signals. The resolution of the ADC unit may be 1 bit and the resolution of the first digital input vector may be N bits. The operations may include decomposing the first digital input vector into N 1-bit input vectors, each of the N 1-bit input vectors corresponding to one of the N bits of the first digital input vector, generating, by the 1-bit DAC subunit, a sequence of N 1-bit modulator control signals corresponding to the N 1-bit input vectors, obtaining from the ADC unit a sequence of N digital 1-bit optical outputs corresponding to the sequence of N 1-bit modulator control signals, constructing an N-bit digital output vector from the sequence of N digital 1-bit optical outputs, performing a nonlinear transformation on the constructed N-bit digital output vector to generate a transformed N-bit digital output vector, and storing the transformed N-bit digital output vector in the storage unit.
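The bit-serial procedure described above can be sketched as follows. The sketch decomposes an N-bit input vector into N 1-bit vectors, passes each through the matrix, reduces each analog result to one bit, and assembles the N-bit output vector. The comparison against a fixed `threshold` as the model of the 1-bit conversion, the placement of the k-th 1-bit output at bit position k, and all function names are assumptions made for illustration.

```python
def decompose_bits(vector, n_bits):
    """Split unsigned integers into n_bits 1-bit vectors, least significant first."""
    return [[(v >> k) & 1 for v in vector] for k in range(n_bits)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_bit_pipeline(matrix, vector, n_bits, threshold):
    """Run N passes with 1-bit inputs and 1-bit outputs, then rebuild N-bit values."""
    outputs = [0] * len(matrix)
    for k, plane in enumerate(decompose_bits(vector, n_bits)):
        analog = matvec(matrix, plane)                     # one 1-bit pass
        bits = [1 if a > threshold else 0 for a in analog]  # assumed 1-bit conversion
        outputs = [o | (b << k) for o, b in zip(outputs, bits)]
    return outputs


identity = [[1, 0], [0, 1]]
print(decompose_bits([5, 2], 3))
print(one_bit_pipeline(identity, [5, 2], 3, threshold=0.5))
```

The nonlinear transformation and the storage step would then be applied to the constructed N-bit output vector, as in the other embodiments.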

In some embodiments, the system may include a storage unit configured to store the input data set and the neural network weights. The storage unit may include a digital input vector memory configured to store the first digital input vector and including at least one SRAM, and a neural network weight memory configured to store the plurality of neural network weights and including at least one DRAM.

In some embodiments, the first unit may include a digital-to-analog conversion (DAC) unit including a first DAC subunit configured to generate the plurality of modulator control signals, and a second DAC subunit configured to generate the plurality of weight control signals, wherein the first DAC subunit and the second DAC subunit are different.

In some embodiments, the light source may include a laser source configured to generate light and an optical power splitter configured to split the light generated by the laser source into a plurality of light outputs, wherein each of the plurality of light outputs has substantially the same power.

In some embodiments, the system may include a plurality of optical waveguides coupled between the optical modulator and the matrix multiplication unit, wherein the optical input vector may include a set of a plurality of input values encoded on respective optical signals carried by the optical waveguides, and each optical signal carried by one of the optical waveguides may include an optical wave having a common wavelength that is substantially the same for all of the optical signals.

In some embodiments, at least one of the plurality of optical waveguides may include an optical fiber coupled to an optical coupler that couples a guided mode of the optical fiber to a free-space propagation mode.

In some embodiments, the multiplication module may comprise at least one coherence sensitive multiplication module configured to multiply one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation based on interference between the optical waves, the optical waves having a coherence length at least as long as a propagation distance through the coherence sensitive multiplication module.

In some embodiments, the coherence sensitive multiplication module may include a Mach-Zehnder interferometer (MZI) that separates the optical waves guided by the input optical waveguide into a first optical waveguide arm of the Mach-Zehnder interferometer and a second optical waveguide arm of the Mach-Zehnder interferometer, the first optical waveguide arm including a phase shifter that produces a relative phase shift with respect to a phase delay of the second optical waveguide arm, and the Mach-Zehnder interferometer may combine the optical waves from the first optical waveguide arm and the second optical waveguide arm into at least one output optical waveguide.

In some embodiments, the coherence sensitive multiplication module may include one or more ring resonators including at least one ring resonator coupled to the first optical waveguide and at least one ring resonator coupled to the second optical waveguide.

In some embodiments, the multiplication module may include at least one coherence-insensitive multiplication module configured to multiply one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation based on absorption of energy within the optical wave.

In some embodiments, the coherence-insensitive multiplication module may include an electro-absorption modulator.

In some embodiments, the two or more input conductors and the output conductor may include wires that meet at one or more junctions among the wires, and the output current is substantially equal to the sum of the input currents.

In some embodiments, the apparatus may further comprise an accumulator that combines input electrical signals corresponding to the outputs of the multiplication modules or the summation modules, wherein the input electrical signals may be encoded using time domain encoding that uses on-off amplitude modulation within each of a plurality of time slots, and the accumulator may generate an output electrical signal encoded at more than two amplitude levels corresponding to different duty cycles of the time domain encoding over the plurality of time slots.
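The relationship between on-off time domain encoding and a multi-level accumulated output can be illustrated with a short sketch. It assumes an idealized accumulator that averages the signal over the time slots; the function names and the placement of the "on" slots at the start of the sequence are hypothetical choices.

```python
def encode_duty_cycle(level, n_slots):
    """Return an on-off sequence in which `level` of the `n_slots` slots are on."""
    if not 0 <= level <= n_slots:
        raise ValueError("level must lie between 0 and n_slots")
    return [1] * level + [0] * (n_slots - level)


def accumulate(slot_sequence, on_amplitude=1.0):
    """Idealized accumulator: average the on-off signal over all time slots."""
    return sum(on_amplitude * s for s in slot_sequence) / len(slot_sequence)


n_slots = 4
for level in range(n_slots + 1):
    slots = encode_duty_cycle(level, n_slots)
    print(slots, accumulate(slots))
```

With four time slots, the two-level input signal yields five distinct accumulated levels, one for each possible duty cycle.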

In another aspect, a system includes a storage unit configured to store a data set and a plurality of neural network weights, and a driver unit configured to generate a plurality of modulator control signals. The system includes an optoelectronic processor including a light source configured to provide a plurality of light outputs, a plurality of light modulators coupled to the light source and the driver unit, the plurality of light modulators configured to generate light input vectors by modulating a plurality of light outputs generated by the light source based on a plurality of modulator control signals, a matrix multiplication unit coupled to the plurality of light modulators and the driver unit, the matrix multiplication unit configured to convert the light input vectors into analog output vectors based on a plurality of weight control signals, and a comparator unit coupled to the matrix multiplication unit and configured to convert the analog output vectors into a plurality of digital 1-bit outputs. 
The system includes a controller including an integrated circuit configured to receive an artificial neural network computation request including an input data set and a first plurality of neural network weights, wherein the input data set includes a first digital input vector having an N-bit resolution, store the input data set and the first plurality of neural network weights in the storage unit, decompose the first digital input vector into N 1-bit input vectors, each of the N 1-bit input vectors corresponding to one of the N bits of the first digital input vector, generate, by the driver unit, a sequence of N 1-bit modulator control signals corresponding to the N 1-bit input vectors, obtain from the comparator unit a sequence of N digital 1-bit outputs corresponding to the sequence of N 1-bit modulator control signals, construct an N-bit digital output vector from the sequence of N digital 1-bit outputs, perform a nonlinear transformation on the constructed N-bit digital output vector to produce a transformed N-bit digital output vector, and store the transformed N-bit digital output vector in the storage unit.

Embodiments of the system may include one or more of the following features. For example, receiving the artificial neural network calculation request may include receiving the artificial neural network calculation request from a general purpose computer.

In some embodiments, the driver unit may be configured to generate a plurality of weight control signals.

In some embodiments, the matrix multiplication unit may include an optical matrix multiplication unit coupled to the plurality of light modulators and the driver unit, the optical matrix multiplication unit configured to convert the light input vector into the light output vector based on the plurality of weight control signals, and a photo detection unit coupled to the optical matrix multiplication unit and configured to generate a plurality of output voltages corresponding to the light output vector.

In some embodiments, the matrix multiplication unit may include an array of input waveguides for receiving the optical input vector, an optical interference unit in optical communication with the array of input waveguides for performing a linear transformation that converts the optical input vector into a second array of optical signals, and an array of output waveguides in optical communication with the optical interference unit for guiding the second array of optical signals, wherein at least one input waveguide in the array of input waveguides is in optical communication with each output waveguide in the array of output waveguides through the optical interference unit.

In some embodiments, an optical interference unit may include a plurality of interconnected Mach-Zehnder interferometers (MZIs), each Mach-Zehnder interferometer of the plurality of interconnected Mach-Zehnder interferometers including a first phase shifter configured to change a splitting ratio of the Mach-Zehnder interferometer, and a second phase shifter configured to shift a phase of one output of the Mach-Zehnder interferometers, wherein the first phase shifter and the second phase shifter may be coupled to a plurality of weight control signals.
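The roles of the two phase shifters can be illustrated with an idealized 2x2 transfer matrix. The sketch below assumes lossless 50:50 couplers, an internal phase shifter `theta` in one arm between the couplers, and an external phase shifter `phi` on one output; this ordering, the coupler convention, and the function names are assumptions for illustration, and other conventions are possible.

```python
import cmath
import math


def matmul2(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mzi_transfer(theta, phi):
    """Transfer matrix of an ideal MZI: coupler, internal phase, coupler, output phase."""
    s = 1.0 / math.sqrt(2.0)
    coupler = [[s + 0j, 1j * s], [1j * s, s + 0j]]          # assumed 50:50 coupler
    internal = [[cmath.exp(1j * theta), 0j], [0j, 1 + 0j]]  # sets the splitting ratio
    external = [[cmath.exp(1j * phi), 0j], [0j, 1 + 0j]]    # shifts one output phase
    return matmul2(external, matmul2(coupler, matmul2(internal, coupler)))


def is_unitary(t, tol=1e-12):
    """Check that the conjugate transpose of t times t is the identity."""
    for i in range(2):
        for j in range(2):
            inner = sum(t[k][i].conjugate() * t[k][j] for k in range(2))
            if abs(inner - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


t = mzi_transfer(math.pi / 2, 0.3)
print(abs(t[0][0]) ** 2, abs(t[1][0]) ** 2, is_unitary(t))
```

Under these assumptions the fraction of input power reaching the first output is sin²(theta/2), so `theta` alone sets the splitting ratio, while `phi` changes only the phase of one output and leaves the power split unchanged.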

In some embodiments, the matrix multiplication unit may include a plurality of replica modules, for each of at least two subsets of one or more optical signals of the optical input vector, the plurality of replica modules including a respective set of one or more replica modules configured to divide the subset of one or more optical signals into two or more copies of the optical signal, a plurality of multiplication modules, for each of the at least two copies of a first subset of one or more optical signals, the plurality of multiplication modules including a respective multiplication module configured to multiply the one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation, and one or more summation modules, for the results of the two or more multiplication modules, the one or more summation modules including a summation module configured to generate an electrical signal representative of a sum of the results of the two or more multiplication modules.
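The copy, multiply, and sum structure described above can be sketched numerically as follows. The sketch assumes ideal lossless splitters that divide each signal equally among its copies, amplitude modulators that scale each copy by one matrix element, and a final rescaling that compensates for the splitting; the function names and the rescaling step are hypothetical.

```python
def replicate(signal, n_copies):
    """Replica module: split one signal into equal copies (ideal, lossless)."""
    return [signal / n_copies] * n_copies


def copy_multiply_sum(matrix, input_vector):
    """Compute a matrix-vector product by copying, modulating, and summing."""
    n_rows = len(matrix)
    # Replica modules: one copy of every input value for every matrix row.
    copies = [replicate(x, n_rows) for x in input_vector]  # copies[j][i]
    outputs = []
    for i, row in enumerate(matrix):
        # Multiplication modules: scale each copy by one matrix element.
        products = [row[j] * copies[j][i] for j in range(len(row))]
        # Summation module: add the products, then undo the 1/n_rows split.
        outputs.append(n_rows * sum(products))
    return outputs


matrix = [[0.5, 1.0], [0.25, 0.0]]
print(copy_multiply_sum(matrix, [2.0, 4.0]))
```

Each output element is produced by its own summation module, so all rows of the matrix are evaluated in parallel in this arrangement.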

In some embodiments, the at least one multiplication module may include an optical amplitude modulator including one input port and two output ports, and a pair of correlated optical signals may be provided from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying the input value by the signed matrix element value.

In another aspect, a method is provided for performing artificial neural network computations in a system having a matrix multiplication unit configured to convert an optical input vector into an analog output vector based on a plurality of weight control signals. The method includes receiving an artificial neural network calculation request including an input data set and a first plurality of neural network weights, wherein the input data set includes a first digital input vector, storing the input data set and the first plurality of neural network weights in a storage unit, generating a first plurality of modulator control signals based on the first digital input vector and the first plurality of weight control signals based on the first plurality of neural network weights, obtaining a first plurality of digital outputs corresponding to output vectors of a matrix multiplication unit, the first plurality of digital outputs forming a first digital output vector, performing a nonlinear transformation on the first digital output vector by a controller to generate a first transformed digital output vector, storing the first transformed digital output vector in the storage unit, and outputting, by the controller, the artificial neural network output generated based on the first transformed digital output vector.

Embodiments of the method may include one or more of the following features. For example, receiving an artificial neural network calculation request may include receiving an artificial neural network calculation request from a computer over a communication channel.

In some embodiments, generating the first plurality of modulator control signals may include generating the first plurality of modulator control signals by a digital-to-analog conversion (DAC) unit.

In some embodiments, deriving the first plurality of digital outputs may include deriving the first plurality of digital outputs from an analog-to-digital conversion (ADC) unit.

In some embodiments, a method may include applying a first plurality of modulator control signals to a plurality of light modulators coupled to a light source and a DAC unit, and generating an optical input vector using the plurality of light modulators by modulating a plurality of optical outputs generated by a laser unit based on the plurality of modulator control signals.

In some embodiments, a matrix multiplication unit may be coupled to the plurality of light modulators and DAC units, and the method may include converting the light input vector into an analog output vector based on the plurality of weight control signals using the matrix multiplication unit.

In some embodiments, the ADC unit may be coupled to a matrix multiplication unit, and the method may include converting the analog output vector to a first plurality of digital outputs using the ADC unit.

In some embodiments, the matrix multiplication unit may include an optical matrix multiplication unit coupled to a plurality of light modulators and DAC units. Converting the light input vector into the analog output vector may include converting the light input vector into the light output vector based on the plurality of weight control signals using the light matrix multiplication unit. The method may include generating a plurality of output voltages corresponding to the light output vectors using a photo detection unit coupled to the light matrix multiplication unit.

In some embodiments, a method may include receiving an optical input vector at an input waveguide array, performing a linear transformation that converts the optical input vector into a second optical signal array using an optical interference unit in optical communication with the input waveguide array, and directing the second optical signal array using an output waveguide array in optical communication with the optical interference unit, wherein at least one input waveguide in the input waveguide array is in optical communication with each output waveguide in the output waveguide array through the optical interference unit.

In some embodiments, the optical interference unit may include a plurality of interconnected Mach-Zehnder interferometers (MZIs), each of the plurality of interconnected Mach-Zehnder interferometers may include a first phase shifter and a second phase shifter, and the first phase shifter and the second phase shifter may be coupled to a plurality of weight control signals. The method may include changing a splitting ratio of the Mach-Zehnder interferometer using the first phase shifter and shifting a phase of one output of the Mach-Zehnder interferometer using the second phase shifter.

In some embodiments, a method may include, for each of at least two subsets of one or more optical signals of an optical input vector, dividing the subset of one or more optical signals into two or more copies of the optical signal using a respective set of one or more replica modules, for each of the at least two copies of a first subset of one or more optical signals, multiplying the one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation using a respective multiplication module, and for the results of the two or more multiplication modules, using a summation module configured to generate an electrical signal representing a sum of the results of the two or more multiplication modules.

In some embodiments, a method may include multiplying a light input vector by a matrix including one or more matrix element values using a matrix multiplication unit.

In some embodiments, a method may include encoding a set of multiple output values on respective electrical signals produced by one or more summing modules, and representing elements of an output vector using the output values of the set of multiple output values, the output vector produced by multiplying an optical input vector by a matrix.

In another aspect, a method includes providing input information in an electronic format, converting at least a portion of the electronic input information into an optical input vector, photoelectrically converting the optical input vector into an analog output vector based on matrix multiplication, and electronically applying a nonlinear transformation to the analog output vector to provide output information in the electronic format.

Embodiments of the method may include one or more of the following features. For example, the method may further include repeating the electro-optic conversion, the photoelectric conversion, and the electronically applied nonlinear transformation for new electronic input information corresponding to the output information provided in the electronic format.

In some embodiments, the matrix multiplication for the initial photoelectric conversion and the matrix multiplication for the repeated photoelectric conversion may be the same and may correspond to the same layer of the artificial neural network.

In some embodiments, the matrix multiplication for the initial photoelectric conversion and the matrix multiplication for the repeated photoelectric conversion may be different and may correspond to different layers of the artificial neural network.

In some embodiments, the method may further include repeating the electro-optic conversion, the photoelectric conversion, and the electronically applied nonlinear transformation for different portions of the electronic input information, wherein the matrix multiplication for the initial photoelectric conversion and the matrix multiplication for the repeated photoelectric conversion are the same and correspond to a first layer of the artificial neural network.

In some embodiments, the method may further include providing intermediate information in the electronic format based on the electronic output information generated by the first layer of the artificial neural network for the plurality of portions of the electronic input information, and repeating the electro-optic conversion, the photoelectric conversion, and the electronically applied nonlinear transformation for each of different portions of the electronic intermediate information, wherein the matrix multiplication for the initial photoelectric conversion and the matrix multiplication for the repeated photoelectric conversion associated with the different portions of the electronic intermediate information are the same and correspond to a second layer of the artificial neural network.
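The reuse of one matrix across different portions of the input, followed by a second matrix applied to portions of the intermediate information, can be sketched as follows. The sketch assumes, for illustration only, that each portion of the intermediate information is simply the first-layer output for the corresponding input portion and that the nonlinear transformation is a ReLU; the function names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def layer(matrix, vector):
    """One processing pass: matrix multiplication, then an assumed ReLU."""
    return [max(0, v) for v in matvec(matrix, vector)]


def two_layers_over_portions(portions, first_matrix, second_matrix):
    """Apply the first layer to every input portion, then the second layer."""
    # First layer: the same matrix is reused for each portion of the input.
    intermediate = [layer(first_matrix, p) for p in portions]
    # Second layer: the same second matrix is reused for each intermediate portion.
    return [layer(second_matrix, part) for part in intermediate]


portions = [[1, 2], [3, -1]]
first = [[1, -1], [0, 1]]
second = [[1, 1]]
print(two_layers_over_portions(portions, first, second))
```

Because the weights stay fixed while the portions change, only the modulator control signals need to be updated between passes within a layer.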

In another aspect, a system is provided for performing artificial neural network calculations. The system includes a first unit configured to generate a plurality of vector control signals and to generate a plurality of weight control signals, a second unit configured to provide an optical input vector based on the plurality of vector control signals, and a matrix multiplication unit coupled to the second unit and the first unit, the matrix multiplication unit configured to convert the optical input vector into an output vector based on the plurality of weight control signals. The system includes a controller including an integrated circuit configured to receive an artificial neural network calculation request including an input data set and a first plurality of neural network weights, wherein the input data set includes a first digital input vector, and generate, by the first unit, a first plurality of vector control signals based on the first digital input vector and a first plurality of weight control signals based on the first plurality of neural network weights, wherein the first unit, the second unit, the matrix multiplication unit, and the controller are used in an optoelectronic processing loop that is repeated for a plurality of iterations, and the optoelectronic processing loop includes (1) at least two optical modulation operations, and (2) at least one of (a) an electrical summation operation or (b) an electrical storage operation.

In another aspect, a method is provided for performing artificial neural network calculations. The method includes providing input information in an electronic format, converting at least a portion of the electronic input information into an optical input vector, and converting the optical input vector into an output vector based on matrix multiplication using a set of neural network weights. The providing and converting are performed in an optoelectronic processing loop that is repeated for a plurality of iterations using different respective sets of neural network weights and different respective input information, and the optoelectronic processing loop includes (1) at least two optical modulation operations, and (2) at least one of (a) an electrical summation operation or (b) an electrical storage operation.

The details of one or more embodiments of the subject matter described in this disclosure are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the disclosure will become apparent from the description, the drawings, and the claims.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In case of conflict with a patent application or patent application publication incorporated herein by reference, the present disclosure, including definitions, will control.

Drawings

The disclosure is best understood from the following detailed description when read in connection with the accompanying drawings. It is emphasized that, according to common practice, the various features of the drawings are not to scale. On the contrary, the dimensions of the various features are arbitrarily expanded or reduced for clarity.

FIG. 1A is a schematic diagram of an example of an Artificial Neural Network (ANN) computing system.

Fig. 1B is a schematic diagram of an example of an optical matrix multiplication unit.

Figs. 1C and 1D are schematic diagrams of example configurations of interconnected Mach-Zehnder interferometers (MZIs).

Fig. 1E is a schematic diagram of an example of a Mach-Zehnder interferometer.

Fig. 1F is a schematic diagram of an example of a wavelength division multiplexed artificial neural network computing system.

Fig. 2A is a flowchart illustrating an example of a method for performing artificial neural network calculations.

Fig. 2B is a diagram illustrating one aspect of the method of fig. 2A.

Figs. 3A and 3B are schematic diagrams of examples of artificial neural network computing systems.

Fig. 4A is a schematic diagram of an example of an artificial neural network computing system with 1-bit internal resolution.

FIG. 4B is a mathematical representation of the operation of the artificial neural network computing system of FIG. 4A.

FIG. 5 is a schematic diagram of an example of an Artificial Neural Network (ANN) computing system.

Fig. 6 is a schematic diagram of an example of an optical matrix multiplication unit.

FIG. 7 is a schematic diagram of an example of an Artificial Neural Network (ANN) computing system.

Fig. 8 is a diagram of an example of an optical matrix multiplication unit.

FIG. 9 is a schematic diagram of an example of an Artificial Neural Network (ANN) computing system.

Fig. 10 is a diagram of an example of an optical matrix multiplication unit.

Fig. 11 is a diagram of an example of a compact matrix multiplier unit.

Fig. 12A is a diagram of a comparative photonic matrix multiplier unit.

Fig. 12B is a diagram of a compact interconnection interferometer.

Fig. 13 is a diagram of a compact matrix multiplier unit.

Fig. 14 is a diagram of an optical generative adversarial network.

Fig. 15 is a diagram of a Mach-Zehnder interferometer.

Fig. 16, 17A, and 17B are diagrams of photonic circuits.

Fig. 18 is a schematic diagram of an example of an optoelectronic computing system.

Fig. 19A and 19B are schematic diagrams of example system configurations.

Fig. 20A is a schematic diagram of an example of a symmetric differential configuration.

Fig. 20B and 20C are circuit diagrams of examples of system modules.

Fig. 21A is a schematic diagram of an example of a symmetrical differential configuration.

Fig. 21B is a schematic diagram of an example of a system configuration.

Fig. 22A is a schematic diagram of an example optical amplitude modulator.

Fig. 22B to 22D are schematic diagrams of examples of optical amplitude modulators using optical detection in a symmetrical differential configuration.

Fig. 23A to 23C are photoelectric circuit diagrams of an example system configuration.

Figs. 23D-23K are schematic diagrams of an example waveguide system including coupling segments to enable optical energy to transition between waveguides disposed in different layers.

Figs. 24A-24E are schematic diagrams of example computing systems that use multiple optoelectronic systems.

Fig. 25 is a flowchart showing an example of a method for performing artificial neural network computation.

Fig. 26 and 27 are schematic diagrams of examples of artificial neural network computing systems.

Fig. 28 is a schematic diagram of an example of a neural network computing system using passive 2D optical matrix multiplication units.

Fig. 29 is a schematic diagram of an example of a neural network computing system using passive 3D optical matrix multiplication units.

Fig. 30 is a schematic diagram of an example of an artificial neural network computing system with 1-bit internal resolution, where the system uses a passive 2D optical matrix multiplication unit.

Fig. 31 is a schematic diagram of an example of an artificial neural network computing system with 1-bit internal resolution, where the system uses a passive 3D optical matrix multiplication unit.

FIG. 32A is a schematic diagram of an example of an Artificial Neural Network (ANN) computing system.

Fig. 32B is a schematic diagram of an example of a photoelectric matrix multiplication unit.

Fig. 33 is a flowchart showing an example of a method for performing artificial neural network calculations using an optoelectronic processor.

Fig. 34 is a diagram illustrating one aspect of the method of fig. 33.

Fig. 35A is a schematic diagram of an example of a wavelength division multiplexed artificial neural network computing system using an optoelectronic processor.

Fig. 35B and 35C are schematic diagrams of examples of the wavelength division multiplexing photoelectric matrix multiplication unit.

Fig. 36 and 37 are schematic diagrams of examples of artificial neural network computing systems using a photoelectric matrix multiplication unit.

FIG. 38 is a schematic diagram of an example of an artificial neural network computing system with 1-bit internal resolution, where the system uses a photoelectric matrix multiplication unit.

Fig. 39A is a schematic diagram of an example of a Mach-Zehnder modulator.

Fig. 39B is a graph showing the intensity-voltage curve of the Mach-Zehnder modulator of fig. 39A.

Fig. 40 is a schematic diagram of a homodyne detector.

FIG. 41 is a schematic diagram of a computing system including optical fibers, each of which carries signals having multiple wavelengths.

FIG. 42 is a graph of a modulation value probability distribution and an example relationship between modulator power and modulation values.

Fig. 43 is a diagram of an example of a Mach-Zehnder modulator.

Fig. 44 is a diagram of an example of a charge-pump bandwidth enhancement circuit.

Figs. 45A-45H are diagrams of example layouts of portions of a photonic integrated circuit and an electronic integrated circuit on a die in a controlled collapse chip connection configuration.

Figs. 46-49 are diagrams of examples of artificial neural network computing systems, each artificial neural network computing system including at least one semiconductor die having a photonic integrated circuit and at least one semiconductor die having an electronic integrated circuit.

Like reference numbers and designations in the various drawings indicate like elements.

Detailed Description

FIG. 1A shows a diagram of an example of an Artificial Neural Network (ANN) computing system 100. The system 100 includes a controller 110, a storage unit 120, a digital-to-analog conversion (DAC) unit 130, an optical processor 140, and an analog-to-digital conversion (ADC) unit 160. The controller 110 is coupled to the computer 102, the storage unit 120, the DAC unit 130, and the ADC unit 160. The controller 110 includes an integrated circuit configured to control the operation of the artificial neural network computing system 100 to perform artificial neural network computations.

The integrated circuit of the controller 110 may be an application specific integrated circuit specifically configured to perform the steps of the artificial neural network computing process. For example, the integrated circuit may implement microcode or firmware specific to performing the artificial neural network computing process. As such, the controller 110 may have a reduced instruction set relative to a general-purpose processor used in a conventional computer (e.g., computer 102). In some embodiments, the integrated circuit of the controller 110 may include two or more circuits configured to perform different steps of the artificial neural network computing process.

In an example operation of the artificial neural network computing system 100, the computer 102 may issue an artificial neural network computation request to the artificial neural network computing system 100. The request may include neural network weights that define the artificial neural network and an input data set to be processed by that artificial neural network. The controller 110 receives the request and stores the input data set and the neural network weights in the storage unit 120.

The input data set may correspond to various digital information to be processed by the artificial neural network. Examples of input data sets include image files, audio files, LiDAR point clouds, and GPS coordinate sequences. The operation of the artificial neural network computing system 100 will be described based on receiving an image file as the input data set. In general, the size of the input data set can vary widely, from hundreds of data points to millions of data points or more. For example, a digital image file with a resolution of 1 megapixel has approximately one million pixels, and each of those pixels may be a data point to be processed by the artificial neural network. Because of the large number of data points in a typical input data set, the input data set is typically divided into multiple smaller digital input vectors for separate processing by the optical processor 140. As an example, for a grayscale digital image, the elements of the digital input vector may be 8-bit values representing the image intensity, and the digital input vector may have a length ranging from tens of elements (e.g., 32 elements, 64 elements) to hundreds of elements (e.g., 256 elements, 512 elements). In general, an input data set of arbitrary size may be divided into digital input vectors of a size suitable for processing by the optical processor 140. Where the number of elements of the input data set is not divisible by the length of the digital input vector, zero padding may be used to extend the data set so that it is divisible by that length. The outputs obtained for the individual digital input vectors may then be processed to reconstruct a complete output, which is the result of processing the input data set through the artificial neural network. In some embodiments, the division of the input data set into multiple input vectors and the subsequent vector-level processing may be implemented using a block matrix multiplication technique.
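As an illustration of the splitting and zero padding described above, the following is a minimal sketch only. The function name `split_into_vectors` and the sample pixel values are hypothetical and are not part of the disclosure.

```python
def split_into_vectors(data, vector_length):
    """Split a flat sequence into fixed-length vectors, zero-padding the last one."""
    if vector_length <= 0:
        raise ValueError("vector_length must be positive")
    values = list(data)
    remainder = len(values) % vector_length
    if remainder:
        # Pad with zeros so the total length is divisible by vector_length.
        values.extend([0] * (vector_length - remainder))
    return [values[i:i + vector_length]
            for i in range(0, len(values), vector_length)]


# Ten hypothetical 8-bit grayscale intensities, split into vectors of length 4.
pixels = list(range(1, 11))
vectors = split_into_vectors(pixels, 4)
print(vectors)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 0, 0]]
```

Each resulting vector would then be processed separately, and the per-vector outputs recombined into the complete output.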

Neural network weights are a set of values that define the connectivity of the artificial neurons of an artificial neural network, including the relative importance, or weight, of each connection. The artificial neural network may include one or more hidden layers with corresponding sets of nodes. In the case of an artificial neural network with a single hidden layer, the artificial neural network may be defined by two sets of neural network weights, one set corresponding to the connectivity between the input nodes and the nodes of the hidden layer, and a second set corresponding to the connectivity between the hidden layer and the output nodes. Each set of neural network weights describing connectivity corresponds to a matrix implemented by the optical processor 140. For artificial neural networks with two or more hidden layers, additional sets of neural network weights are required to define the connectivity between the additional hidden layers. As such, in a typical scenario, the neural network weights included in the artificial neural network computation request may include multiple sets of neural network weights that represent the connectivity between the various layers of the artificial neural network.

Since the input data set to be processed is typically divided into a plurality of smaller digital input vectors for separate processing, the input data set is typically stored in digital storage. However, memory operations between the memory and the processor of the computer 102 are significantly slower than the rate at which the artificial neural network computing system 100 can perform artificial neural network computations. For example, the artificial neural network computing system 100 may perform tens to hundreds of artificial neural network computations during a typical memory read period of the computer 102. As such, if processing an artificial neural network computation request involves multiple data transfers between the system 100 and the computer 102, the rate at which the artificial neural network computing system 100 can perform artificial neural network computations may be limited to below its full processing rate. For example, if the computer 102 were to access an input data set from its own storage and provide digital input vectors to the controller 110 upon request, the operation of the artificial neural network computing system 100 may be greatly slowed by the time needed for the series of data transfers between the computer 102 and the controller 110. Notably, the memory access latency of the computer 102 is typically non-deterministic, which further complicates and reduces the speed at which digital input vectors can be provided to the artificial neural network computing system 100. Furthermore, processor cycles of the computer 102 may be wasted in managing data transfers between the computer 102 and the artificial neural network computing system 100.

In contrast, in some embodiments, the artificial neural network computing system 100 stores the entire input data set in the storage unit 120, which is part of, and dedicated to, the artificial neural network computing system 100. The dedicated storage unit 120 allows transactions between the storage unit 120 and the controller 110 to be specifically adapted to permit a smooth and uninterrupted flow of data between the storage unit 120 and the controller 110. Such uninterrupted data flow may significantly improve the overall throughput of the artificial neural network computing system 100 by allowing the optical processor 140 to perform matrix multiplication at its full processing rate, without being limited by the slow memory operations of a conventional computer (e.g., computer 102). Further, because all of the data required for the artificial neural network computation is provided by the computer 102 to the artificial neural network computing system 100 in a single transaction, the artificial neural network computing system 100 may perform its artificial neural network computation in a self-contained manner, independent of the computer 102. Such self-contained operation of the artificial neural network computing system 100 reduces the computational burden on the computer 102 and eliminates external dependencies in the operation of the artificial neural network computing system 100, improving the performance of both the system 100 and the computer 102.

The internal operation of the artificial neural network computing system 100 will now be described. The optical processor 140 includes a laser unit 142, a modulator array 144, a detection unit 146, and an optical matrix multiplication (OMM) unit 150. The optical processor 140 operates by encoding a digital input vector of length N onto an optical input vector of length N and propagating the optical input vector through the optical matrix multiplication unit 150. The optical matrix multiplication unit 150 receives the optical input vector of length N and performs an N×N matrix multiplication on it in the optical domain. The N×N matrix multiplication performed by the optical matrix multiplication unit 150 is determined by the internal configuration of the optical matrix multiplication unit 150. The internal configuration of the optical matrix multiplication unit 150 may be controlled by electrical signals, for example, electrical signals generated by the DAC unit 130.
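For reference, the operation that the optical matrix multiplication unit 150 carries out in the optical domain is, numerically, an ordinary N×N matrix-vector product. The sketch below is a purely digital stand-in, not a model of the optics. The function name `matvec` and the sample 2×2 matrix are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply an N x N matrix by a length-N vector."""
    n = len(vector)
    if len(matrix) != n or any(len(row) != n for row in matrix):
        raise ValueError("matrix must be N x N for an input vector of length N")
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


# Hypothetical 2 x 2 weight matrix and length-2 input vector.
weights = [[0.5, -0.5],
           [0.5, 0.5]]
input_vector = [1.0, 3.0]
output_vector = matvec(weights, input_vector)
print(output_vector)  # [-1.0, 2.0]
```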

The optical matrix multiplication unit 150 may be implemented in various ways. Fig. 1B shows a diagram of an example of the optical matrix multiplication unit 150. The optical matrix multiplication unit 150 may include an array of input waveguides 152 to receive an optical input vector, an optical interference unit 154 in optical communication with the array of input waveguides 152, and an array of output waveguides 156 in optical communication with the optical interference unit 154. The optical interference unit 154 linearly transforms the optical input vector into a second array of optical signals. An array of output waveguides 156 guide a second array of optical signals output by optical interference unit 154. At least one input waveguide of the array of input waveguides 152 is in optical communication with each output waveguide of the array of output waveguides 156 through an optical interference unit 154. For example, for an optical input vector of length N, the optical matrix multiplication unit 150 may include N input waveguides 152 and N output waveguides 156.

The optical interference unit may include a plurality of interconnected Mach-Zehnder interferometers (MZIs). Figs. 1C and 1D show diagrams of example configurations 157 and 158 of interconnected Mach-Zehnder interferometers. The Mach-Zehnder interferometers may be interconnected in various ways (e.g., as in configuration 157 or 158) to achieve a linear transformation of the optical input vector received through the array of input waveguides 152.

Fig. 1E shows a diagram of an example of a Mach-Zehnder interferometer 170. The Mach-Zehnder interferometer 170 includes a first input waveguide 171, a second input waveguide 172, a first output waveguide 178, and a second output waveguide 179. In addition, each of the plurality of interconnected Mach-Zehnder interferometers 170 includes a first phase shifter 174 configured to change the splitting ratio of the Mach-Zehnder interferometer 170, and a second phase shifter 176 configured to shift the phase of one output of the Mach-Zehnder interferometer 170, such as the light exiting the Mach-Zehnder interferometer 170 through the second output waveguide 179. The first phase shifter 174 and the second phase shifter 176 of the Mach-Zehnder interferometer 170 are coupled to a plurality of weight control signals generated by the DAC unit 130. The first phase shifter 174 and the second phase shifter 176 are examples of reconfigurable components of the optical matrix multiplication unit 150. Examples of reconfigurable components include thermo-optic phase shifters and electro-optic phase shifters. A thermo-optic phase shifter operates by heating the waveguide to change the refractive index of the waveguide and cladding materials, which translates into a change in phase. An electro-optic phase shifter operates by applying an electric field (e.g., in lithium niobate (LiNbO₃) or in a reverse-biased PN junction) or a current (e.g., in a forward-biased PIN junction), which changes the refractive index of the waveguide material. By varying the weight control signals, the phase delays of the first phase shifter 174 and the second phase shifter 176 of each of the interconnected Mach-Zehnder interferometers 170 may be varied, which reconfigures the optical interference unit 154 of the optical matrix multiplication unit 150 to implement a particular matrix multiplication determined by the phase delays set across the optical interference unit 154. Additional embodiments of the optical matrix multiplication unit 150 and the optical interference unit 154 are disclosed in U.S. patent publication No. US2017/0351293A1, entitled "APPARATUS AND METHODS FOR OPTICAL NEURAL NETWORK," which is incorporated herein by reference in its entirety.
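The effect of the two phase shifters can be illustrated with a simple numerical model. The sketch below assumes ideal, lossless 50:50 couplers with the textbook transfer matrix (1/√2)·[[1, i], [i, 1]], an internal phase `theta` in one arm (playing the role of the first phase shifter 174), and an output phase `phi` on the second output (playing the role of the second phase shifter 176). The coupler convention and all function names are assumptions made for illustration and are not taken from the disclosure.

```python
import cmath
import math


def matmul2(a, b):
    """Multiply two 2 x 2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mzi_transfer(theta, phi):
    """2 x 2 transfer matrix of an idealized MZI (assumed coupler convention)."""
    s = 1 / math.sqrt(2)
    coupler = [[s, 1j * s], [1j * s, s]]               # ideal 50:50 coupler
    internal = [[cmath.exp(1j * theta), 0], [0, 1]]    # sets the splitting ratio
    external = [[1, 0], [0, cmath.exp(1j * phi)]]      # phase on second output
    return matmul2(external, matmul2(coupler, matmul2(internal, coupler)))


def output_powers(theta, phi, inputs=(1, 0)):
    """Optical power at the two outputs for the given input field amplitudes."""
    t = mzi_transfer(theta, phi)
    fields = [t[r][0] * inputs[0] + t[r][1] * inputs[1] for r in range(2)]
    return [abs(f) ** 2 for f in fields]


# Sweeping theta moves power between the outputs; phi leaves the powers unchanged.
for theta in (0.0, math.pi / 2, math.pi):
    p1, p2 = output_powers(theta, phi=0.3)
    print(f"theta={theta:.3f}  P_out1={p1:.3f}  P_out2={p2:.3f}")
```

In this model the power at the first output follows sin²(theta/2) and the power at the second output follows cos²(theta/2), so their sum is conserved.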

The optical input vector is generated by the laser unit 142 and the modulator array 144. The optical input vector of length N has N independent optical signals, each of which has an intensity corresponding to the value of the corresponding element of the digital input vector of length N. As an example, the laser unit 142 may generate N optical outputs. The N optical outputs have the same wavelength and are optically coherent. The optical coherence of the optical outputs allows them to optically interfere with each other, which is a property utilized by the optical matrix multiplication unit 150 (e.g., in the operation of the Mach-Zehnder interferometers). Furthermore, the optical outputs of the laser unit 142 may be substantially identical to each other. For example, the N optical outputs may be substantially uniform in their intensities (e.g., within 5%, within 3%, within 1%, within 0.5%, within 0.1%, or within 0.01%) and in their relative phases (e.g., within 10 degrees, within 5 degrees, within 3 degrees, within 1 degree, or within 0.1 degrees). Uniformity of the optical outputs may improve the faithfulness of the optical input vector to the digital input vector, thereby improving the overall accuracy of the optical processor 140. In some embodiments, the optical outputs of the laser unit 142 may have an optical power of 0.1 mW to 50 mW per output, a wavelength in the near-infrared range (e.g., between 900 nm and 1600 nm), and a linewidth of less than 1 nm. The optical outputs of the laser unit 142 may be single-transverse-mode optical outputs.

In some embodiments, the laser unit 142 includes a single laser source and an optical power splitter. The single laser source is configured to generate laser light. The optical power splitter is configured to split the light generated by the laser source into N optical outputs of substantially the same intensity and phase. By splitting a single laser output into multiple outputs, optical coherence among the multiple optical outputs can be achieved. For example, the single laser source may be a semiconductor laser diode, a vertical-cavity surface-emitting laser (VCSEL), a distributed feedback (DFB) laser, or a distributed Bragg reflector (DBR) laser. For example, the optical power splitter may be a 1:N multimode interference (MMI) splitter, a multi-stage splitter comprising a plurality of 1:2 multimode interference splitters or directional couplers, or a star coupler. In some other embodiments, a master-slave laser configuration may be used, in which the slave lasers are injection locked by the master laser so as to have a stable phase relationship with the master laser.
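As a simple illustration of a multi-stage splitter built from 1:2 splitters, the sketch below computes the per-output power of an idealized binary splitter tree. It assumes lossless, perfectly balanced 1:2 splitters, which real devices only approximate. The function name `splitter_tree` and the 8 mW input power are hypothetical.

```python
def splitter_tree(input_power_mw, stages):
    """Per-output power after `stages` levels of ideal, lossless 1:2 splitters."""
    if stages < 0:
        raise ValueError("stages must be non-negative")
    outputs = [input_power_mw]
    for _stage in range(stages):
        # Every branch is split into two branches carrying half the power each.
        outputs = [power / 2 for power in outputs for _branch in range(2)]
    return outputs


# Three stages of 1:2 splitters give N = 2**3 = 8 outputs.
outputs = splitter_tree(8.0, stages=3)
print(len(outputs), outputs[0])  # 8 1.0
```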

The optical outputs of the laser unit 142 are coupled to the modulator array 144. The modulator array 144 is configured to receive the optical outputs of the laser unit 142 and to modulate the intensity of the received light based on modulator control signals, which are electrical signals. Examples of modulators include Mach-Zehnder interferometer (MZI) modulators, ring resonator modulators, and electro-absorption modulators. The modulator array 144 has N modulators, each of which receives one of the N optical outputs of the laser unit 142. Each modulator receives a control signal corresponding to an element of the digital input vector and modulates the intensity of the light accordingly. The control signals may be generated by the DAC unit 130.

The DAC unit 130 is configured to generate a plurality of modulator control signals and a plurality of weight control signals under the control of the controller 110. For example, the DAC unit 130 receives from the controller 110 a first DAC control signal that corresponds to a digital input vector to be processed by the optical processor 140. Based on the first DAC control signal, the DAC unit 130 generates the modulator control signals, which are analog signals suitable for driving the modulator array 144 and the optical matrix multiplication unit 150. For example, the analog signals may be voltages or currents, depending on the technology and design of the modulators of the array 144. The voltages may have an amplitude ranging from ±0.1 V to ±10 V, and the currents may have an amplitude ranging from 100 μA to 100 mA. In some embodiments, the DAC unit 130 may include modulator drivers configured to buffer, amplify, or condition the analog signals so that the modulators of the array 144 and the optical matrix multiplication unit 150 can be adequately driven. For example, certain types of modulators may be driven with differential control signals. In this case, the modulator driver may be a differential driver that generates a differential electrical output based on a single-ended input signal. As another example, certain types of modulators may have a 3 dB bandwidth that is less than the desired processing rate of the optical processor 140. In this case, the modulator driver may include a pre-emphasis circuit or other bandwidth enhancement circuit designed to extend the operating bandwidth of the modulator. For example, such bandwidth enhancement may be useful for modulators based on PIN diode structures that are forward biased to modulate, by carrier injection, the refractive index of the portion of the waveguide that guides the modulated light wave. For example, if the modulator is a Mach-Zehnder interferometer modulator, a PIN diode structure may be used to implement a phase shifter in one or both waveguide arms of the MZI modulator. Configuring the phase shifter for forward-bias operation facilitates shorter modulator lengths and a more compact overall design, which may be useful for an optical matrix multiplication unit 150 with a large number of modulators.

For example, in the pre-emphasis form of bandwidth enhancement, the analog electrical signal (e.g., a voltage or current) driving the modulator may be shaped to include a transient pulse that overshoots the change in analog signal level representing a given digital data value in the series of digital data values of the DAC control signal. Each digital data value may have any number of bits, including a single bit, as assumed in the remainder of this example. If the value of a bit is the same as the previous value, the analog electrical signal driving the modulator is held at a steady-state level (e.g., signal level X₀ for a bit value of 0 and a higher signal level X₁ for a bit value of 1). However, if the bit changes from 0 to 1, the analog electrical signal used to drive the modulator may include a transient pulse with a peak value of X₁+(X₁-X₀) at the beginning of the bit transition before settling to the steady-state value X₁. Likewise, if the bit changes from 1 to 0, the analog electrical signal used to drive the modulator may include a transient pulse with a peak value of X₀+(X₀-X₁) at the beginning of the bit transition before settling to the steady-state value X₀. The magnitude and duration of the transient pulses may be selected to optimize the bandwidth enhancement (e.g., by maximizing the open area of the eye diagram for a non-return-to-zero (NRZ) modulation format).
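The 1-bit pre-emphasis rule described above can be sketched as follows. For each bit, the sketch returns the transient peak level at the start of the bit period and the steady-state level that follows it. The function name `pre_emphasis_levels`, the assumed initial bit value, and the levels X₀ = 0.0 and X₁ = 1.0 are hypothetical, and the pulse duration is not modeled.

```python
def pre_emphasis_levels(bits, x0, x1, initial_bit=0):
    """Return one (peak, steady) level pair per bit for 1-bit pre-emphasis.

    On a transition the peak overshoots the new steady-state level by the
    size of the level change; with no transition the peak equals the level.
    """
    level = {0: x0, 1: x1}
    pairs = []
    previous = initial_bit
    for bit in bits:
        steady = level[bit]
        if bit == previous:
            peak = steady                      # no transition, no transient pulse
        else:
            peak = steady + (steady - level[previous])
        pairs.append((peak, steady))
        previous = bit
    return pairs


# Hypothetical levels X0 = 0.0 and X1 = 1.0 (arbitrary units).
print(pre_emphasis_levels([0, 1, 1, 0], x0=0.0, x1=1.0))
# [(0.0, 0.0), (2.0, 1.0), (1.0, 1.0), (-1.0, 0.0)]
```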

In the charge-pumping form of bandwidth enhancement, the analog current signal driving the modulator may be shaped to include a transient pulse that moves a precisely determined amount of charge. Fig. 44 shows a charge-pump bandwidth enhancement circuit that uses a capacitor connected in series between the voltage source and the modulator to precisely control the flow of charge. A portion of the circuit shown in fig. 44 may be included in the modulator driver described above. In this embodiment, the modulator is represented by a modulator circuit 4400, which models the electrical characteristics of the phase shifter of the modulator as a PIN diode. The modulator circuit 4400 includes a parallel connection of an ideal diode, a capacitor having a capacitance C_d, and a resistor having a resistance R. The pump capacitor 4402 has a capacitance C_p. A control voltage waveform 4404 is provided to an inverter circuit 4405 to generate a drive voltage waveform 4406, whose amplitude can be precisely calibrated to move a predetermined amount of charge into or out of the modulator circuit 4400 through the pump capacitor 4402. The PIN diode modeled by the modulator circuit 4400 is forward biased by applying a constant voltage VDD_IO at terminal 4408. A charge pump control voltage VCP is applied at terminal 4410 of the inverter 4405 to control the amount of charge pumped at each transition of the drive voltage waveform 4406, and thus the corresponding optical phase shift applied by the modulator.

The value of the charge pump control voltage VCP may be adjusted prior to operation so that the nominal charge Q stored in the pump capacitor 4402 is accurately calibrated based on a measured value of the capacitance C_p, which may vary somewhat because of manufacturing uncertainty. For example, the voltage VCP may be equal to the nominal charge Q divided by the capacitance C_p. The resulting change in refractive index of the portion of the waveguide intersecting the PIN diode may then provide a phase shift of the guided light wave that is linearly proportional to the amount of charge Q that moves between the PIN diode (e.g., as stored by the internal capacitance C_d) and the pump capacitor 4402. If the drive voltage changes from a low value to a high value, the current flowing from the pump capacitor 4402 into the PIN diode delivers a predetermined amount of charge (i.e., the integral of the positive current over time) in a short period of time. If the drive voltage changes from a high value to a low value, the current flowing from the PIN diode into the pump capacitor 4402 removes a predetermined amount of charge (i.e., the integral of the negative current over time) in a short period of time. After this relatively short switching time, a steady-state current is provided by a current source 4412, which is controlled by a switch 4414, to replace the charge lost as the internal capacitance leaks current through the internal resistance R while the drive voltage is held (e.g., during the hold time of a particular digital value). The use of such a charge-pumping configuration may have advantages, such as better accuracy than other techniques, including some pre-emphasis techniques, because the amount of charge moved during the short switching time depends on a constant physical parameter (C_p) and a steady-state control value (VCP), and is therefore precisely controllable and repeatable.
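The calibration relationship VCP = Q / C_p, and the linear dependence of the phase shift on the moved charge, can be sketched numerically. All numeric values below (the nominal charge, the design and measured capacitances, and the phase-per-charge coefficient) are hypothetical placeholders chosen only for illustration, as are the function names.

```python
def charge_pump_voltage(nominal_charge, pump_capacitance):
    """Control voltage VCP = Q / C_p that moves the nominal charge Q."""
    if pump_capacitance <= 0:
        raise ValueError("pump capacitance must be positive")
    return nominal_charge / pump_capacitance


def phase_shift(charge, radians_per_coulomb):
    """Phase shift modeled as linearly proportional to the moved charge."""
    return radians_per_coulomb * charge


NOMINAL_CHARGE = 50e-15      # hypothetical Q, in coulombs
DESIGN_CP = 100e-15          # hypothetical design value of C_p, in farads
MEASURED_CP = 125e-15        # hypothetical measured value of C_p, in farads
RAD_PER_COULOMB = 3.0e13     # hypothetical phase-per-charge coefficient

vcp_design = charge_pump_voltage(NOMINAL_CHARGE, DESIGN_CP)
vcp_calibrated = charge_pump_voltage(NOMINAL_CHARGE, MEASURED_CP)
print(f"VCP from design C_p:   {vcp_design:.3f} V")
print(f"VCP from measured C_p: {vcp_calibrated:.3f} V")
print(f"Phase shift for Q:     {phase_shift(NOMINAL_CHARGE, RAD_PER_COULOMB):.3f} rad")
```

Calibrating against the measured capacitance rather than the design value keeps the moved charge, and therefore the phase shift, at its nominal value despite manufacturing variation.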

In some embodiments, reduced power consumption may be achieved by designing the modulators of the modulator array 144 and/or the optical matrix multiplication unit 150 such that less power is consumed when a modulator is operated to produce modulation values representing more commonly occurring coefficients, and more power is consumed when it is operated to produce modulation values representing less commonly occurring coefficients. For example, power consumption may be reduced for certain data sets known to have certain characteristics. Fig. 42 shows a modulation value probability distribution graph 4200 (dashed line) superimposed on a modulator power graph 4202 (solid line) for a particular design of the modulators of the modulator array 144 and/or the optical matrix multiplication unit 150. Both graphs are functions of the modulation value (on the horizontal axis), expressed in normalized units to represent coefficients between -1 and 1. In this embodiment, the data set includes various coefficients (e.g., vector coefficients and/or matrix coefficients) for the artificial neural network computation such that the probability distribution function (PDF) of the coefficients yields a higher probability (and thus more frequent occurrence) for smaller coefficients (i.e., coefficients with relatively small absolute values). For such data sets ("low coefficient weight data sets"), the modulators may be designed to achieve reduced power consumption by operating in a lower-power state when computing with smaller coefficients (which occur more frequently in the data set) and in a higher-power state when computing with larger coefficients (which occur less frequently in the data set).
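The benefit can be illustrated by computing the average modulator power as the probability-weighted sum of the power drawn at each modulation value. The sketch below uses a hypothetical triangular distribution peaked at zero and two hypothetical normalized power curves. None of these shapes is taken from fig. 42, and the function name `expected_power` is likewise an assumption.

```python
def expected_power(values, weights, power_of):
    """Average power: sum over values of probability(value) * power(value)."""
    total = sum(weights)
    return sum(w / total * power_of(v) for v, w in zip(values, weights))


# Modulation values in normalized units from -1 to 1.
values = [i / 10 for i in range(-10, 11)]

# Hypothetical distribution in which small coefficients are most likely.
weights = [1 - abs(v) for v in values]

# Hypothetical design A: lowest power near zero modulation value.
low_power_at_zero = expected_power(values, weights, lambda v: abs(v))

# Hypothetical design B: highest power near zero modulation value.
high_power_at_zero = expected_power(values, weights, lambda v: 1 - abs(v))

print(f"average power, low-power-at-zero design:  {low_power_at_zero:.2f}")
print(f"average power, high-power-at-zero design: {high_power_at_zero:.2f}")
```

Under these assumptions, the design that assigns its low-power state to the most frequent (small) coefficients draws roughly half the average power of the other design.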

Some optical amplitude modulators use relatively high power to modulate an optical signal with a small modulation value. For example, for coherence-insensitive optical amplitude modulators, a modulation value near zero may require relatively high modulator power. An electro-absorption modulator, for instance, may require a relatively high current to drive a diode-based absorber so that a large amount of optical power is absorbed, reducing the optical amplitude of the modulated optical signal. For coherence-sensitive optical amplitude modulators, a modulation value near zero may likewise require relatively high modulator power. A Mach-Zehnder interferometer modulator, for instance, may require a diode-based phase shifter to be driven with a relatively high current to provide the relative phase shift between the two interferometer arms needed for destructive optical interference, thereby reducing the optical amplitude of the modulated signal.

The optical amplitude modulator may be configured to overcome this power relationship and achieve a modulator power curve, as shown in FIG. 42, that assigns a low power modulator state to modulation values near zero. For example, as shown in FIG. 43, the Mach-Zehnder interferometer (MZI) modulator 4300 may be configured with asymmetric arms that provide a built-in passive relative phase shift (e.g., a phase shift of around 180 degrees) so that a smaller active relative phase shift (and thus lower modulator power) is required for destructive optical interference. The MZI modulator 4300 comprises an input optical splitter 4302 that splits the incoming optical signal to provide 50% of the power to the first arm and 50% of the power to the second arm. The active phase shifter 4304 in the first arm provides a variable phase shift used to vary the modulation value over a range of possible values (in this embodiment, unsigned modulation values between 0 and 1). The variable phase shift is determined by the magnitude of an applied electrical signal, which requires a certain amount of supplied electrical power (e.g., for a diode-based phase shifter formed of doped semiconductor material within or near the waveguide of the first arm). The passive phase shifter 4306 in the second arm provides a relative phase shift between the first arm and the second arm even if no power is applied to the MZI modulator 4300. For example, an optical material with a high refractive index may be configured to impart a 180 degree relative phase shift between the arms such that the output optical combiner 4308 provides destructive optical interference and no significant optical power is coupled to its output. 
Various alternative configurations of active and passive phase shifters may be implemented, including but not limited to the following: both the active and passive phase shifters may be in one arm, with no modulator or phase shifter in the other arm; both arms may have active phase shifters (in a push-pull arrangement); or both arms may have both active and passive phase shifters.
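
The effect of the built-in passive phase shift can be sketched with the textbook model of an ideal, lossless interferometer with 50/50 splitting, in which the output field amplitude is the absolute value of the cosine of half the relative phase between the arms. That model, and the function names below, are assumptions made for illustration; they are not taken from the description of MZI modulator 4300.

```python
import math

def mzi_amplitude(active_phase, passive_phase=0.0):
    """Output field amplitude (0 to 1) of an ideal, lossless MZI whose arms
    differ in phase by active_phase - passive_phase (radians)."""
    return abs(math.cos((active_phase - passive_phase) / 2.0))

def active_phase_needed(amplitude, biased):
    """Smallest non-negative active phase shift that yields the amplitude.
    biased=True assumes a passive 180 degree (pi) shift in the other arm."""
    if biased:
        return 2.0 * math.asin(amplitude)
    return 2.0 * math.acos(amplitude)

for target in (0.0, 0.1, 0.5, 1.0):
    unbiased = active_phase_needed(target, biased=False)
    biased = active_phase_needed(target, biased=True)
    print(f"amplitude {target:.1f}: unbiased {unbiased:.3f} rad, biased {biased:.3f} rad")
```

In this idealized model, the biased modulator needs no active phase shift at all to produce a zero modulation value, whereas the unbiased one needs a full 180 degree active shift.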

Alternatively, a Mach-Zehnder interferometer modulator configured according to the symmetric differential configuration described herein can be used to provide near-zero coefficients using only a small active relative phase shift (and thus low modulator power). For example, FIG. 22A shows an optical amplitude modulator constructed using a Mach-Zehnder interferometer configured according to a symmetric differential configuration, in which the optical output is detected as shown in FIG. 22B. Low modulation power is used to perform multiplication (using optical amplitude modulation) by a modulation value of low magnitude (i.e., absolute value). Specifically, a low power applied to phase modulator 2204 corresponds to a modulation value of low magnitude, producing a near-equal split (e.g., near 50%/50%) at the outputs of coupler 2206 and a low-magnitude current at node 2216 that represents the result of the multiplication. The symmetric differential configuration also has the advantage of being able to provide signed modulation values between -1 and +1 (as described in more detail below). While this implementation uses a phase modulator in a single arm of the Mach-Zehnder interferometer, other implementations may have other configurations, such as a push-pull arrangement with phase modulators in both arms to provide phase shifts of opposite sign.

In the example power curve shown in FIG. 42, zero modulation power is used to achieve a zero modulation value, but in other embodiments there may be a residual low but non-zero modulation power at the zero modulation value. For these low coefficient weight data sets, reduced power consumption may typically be achieved by using a modulator designed such that the power used to modulate the optical signal increases with the absolute value of the modulation value. The exact shape of the modulation power as a function of the modulation value may vary from implementation to implementation, and the power does not necessarily increase linearly as the magnitude of the modulation value increases. There may be different power consuming elements in the optical amplitude modulator that contribute to the overall power consumption. In some embodiments, the modulators are designed such that the power they use to modulate the optical signal increases monotonically with the absolute value of the modulation value.

In some cases, the modulators of modulator array 144 and/or the optical matrix multiplication unit 150 may have a nonlinear transfer function. For example, a Mach-Zehnder interferometer optical modulator can have a nonlinear relationship (e.g., a sinusoidal dependence) between an applied control voltage and its transmission. In this case, the first DAC control signal may be adjusted or compensated based on the nonlinear transfer function of the modulator, such that a linear relationship between the digital input vector and the generated optical input vector is maintained. Maintaining such linearity is generally important to ensure that the input to the optical matrix multiplication unit 150 is an accurate representation of the digital input vector. In some embodiments, the compensation of the first DAC control signal may be performed by the controller 110 through a look-up table that maps the values of the digital input vector to the values to be output by the DAC unit 130 such that the resulting modulated optical signal is linearly proportional to the elements of the digital input vector. The look-up table may be generated by characterizing the nonlinear transfer function of the modulator and calculating the inverse of that transfer function.
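
A look-up table of this kind can be sketched as follows. The sinusoidal response, the half-wave voltage `V_PI`, and the 8-bit input range are hypothetical values assumed for illustration; a real table would be built from the measured transfer function of the modulator.

```python
import math

V_PI = 1.0      # hypothetical half-wave voltage, in volts
LEVELS = 256    # hypothetical 8-bit digital input range

def transfer(voltage):
    """Assumed modulator response: output amplitude versus drive voltage."""
    return math.sin(math.pi * voltage / (2.0 * V_PI))

def build_lut():
    """Map each input code to the drive voltage whose output amplitude
    equals code / (LEVELS - 1), i.e. apply the inverse of transfer()."""
    return [2.0 * V_PI / math.pi * math.asin(code / (LEVELS - 1))
            for code in range(LEVELS)]

lut = build_lut()
outputs = [transfer(v) for v in lut]
worst = max(abs(out - code / (LEVELS - 1)) for code, out in enumerate(outputs))
print(f"worst-case deviation from a linear response: {worst:.2e}")
```

Driving the assumed modulator through the table makes its output amplitude proportional to the input code, which is the linear relationship the compensation is meant to preserve.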

In some embodiments, the nonlinearity of the modulator and the resulting nonlinearity in the generated optical input vector may be compensated for by the artificial neural network computation algorithm.

The optical input vector generated by the modulator array 144 is input to the optical matrix multiplication unit 150. The optical input vector may be N spatially separated optical signals, each of which has an optical power corresponding to an element of the digital input vector. For example, the optical power of each optical signal is typically in the range of 1 μW to 10 mW. The optical matrix multiplication unit 150 receives the optical input vector and performs an N×N matrix multiplication based on its internal configuration. The internal configuration is controlled by electrical signals generated by the DAC unit 130. For example, the DAC unit 130 receives a second DAC control signal from the controller 110, the second DAC control signal corresponding to the neural network weights to be implemented by the optical matrix multiplication unit 150. The DAC unit 130 generates a weight control signal based on the second DAC control signal, which is an analog signal adapted to control the reconfigurable components within the optical matrix multiplication unit 150. For example, the analog signal may be a voltage or a current depending on the type of reconfigurable components of the optical matrix multiplication unit 150. The voltage may have a magnitude ranging from 0.1 V to 10 V and the current may have a magnitude ranging from 100 μA to 10 mA.

The modulator array 144 may operate at a modulation rate that is different from the reconfiguration rate of the reconfigurable optical matrix multiplication unit 150. The optical input vector produced by modulator array 144 propagates through the optical matrix multiplication unit 150 at a substantial fraction of the speed of light (e.g., 80%, 50%, or 25% of the speed of light), depending on the optical characteristics of the optical matrix multiplication unit 150 (e.g., its effective refractive index). For a typical optical matrix multiplication unit 150, the propagation time of the optical input vector is in the range of 1 picosecond to tens of picoseconds, which corresponds to a processing rate of tens to hundreds of GHz. As such, the rate at which the optical processor 140 can perform matrix multiplication operations is limited in part by the rate at which the optical input vectors can be generated. Modulators with bandwidths of tens of GHz are readily available, and modulators with bandwidths exceeding 100 GHz are being developed. As such, for example, the modulation rate of the modulator array 144 may be 5 GHz, 8 GHz, or in the range of tens of GHz to hundreds of GHz. To sustain operation of modulator array 144 at such modulation rates, the integrated circuit of controller 110 may be configured to output control signals for DAC unit 130 at a rate greater than or equal to, for example, 5 GHz, 8 GHz, 10 GHz, 20 GHz, 25 GHz, 50 GHz, or 100 GHz.
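
The link between propagation time and processing rate can be made concrete with a short calculation. This sketch assumes that one input vector is processed per transit time, so that the rate is the reciprocal of the propagation time, and it uses a hypothetical 1 mm optical path; neither assumption is stated in the text.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def propagation_time_s(path_length_m, velocity_fraction):
    """Transit time through a path traversed at a fraction of the speed of light."""
    return path_length_m / (velocity_fraction * SPEED_OF_LIGHT_M_PER_S)

def rate_ghz(transit_time_s):
    """Rate, in GHz, if one input vector is processed per transit time."""
    return 1e-9 / transit_time_s

# Hypothetical 1 mm optical path at the three velocity fractions named in the text.
for fraction in (0.80, 0.50, 0.25):
    t = propagation_time_s(1e-3, fraction)
    print(f"{fraction:.0%} of c: {t * 1e12:.2f} ps, {rate_ghz(t):.0f} GHz")
```

Under the reciprocal assumption, a 10 ps transit corresponds to 100 GHz and a 100 ps transit to 10 GHz.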

Depending on the type of reconfigurable components implemented by the optical matrix multiplication unit 150, the reconfiguration rate of the optical matrix multiplication unit 150 may be significantly slower than the modulation rate. For example, the reconfigurable components of the optical matrix multiplication unit 150 may be of the thermo-optic type, which uses micro-heaters to adjust the temperature of the optical waveguides of the optical matrix multiplication unit 150. The temperature in turn affects the phase of the optical signals within the optical matrix multiplication unit 150 and thereby implements the matrix multiplication. Due to the thermal time constants associated with heating and cooling of the structures, the reconfiguration rate may be limited to hundreds of kHz to tens of MHz. As such, the modulator control signals used to control the modulator array 144 and the weight control signals used to reconfigure the optical matrix multiplication unit 150 may have significantly different speed requirements. Furthermore, the electrical characteristics of modulator array 144 may be significantly different from the electrical characteristics of the reconfigurable components of optical matrix multiplication unit 150.

To accommodate the different characteristics of the modulator control signals and the weight control signals, in some embodiments, DAC unit 130 may include a first DAC subunit 132 and a second DAC subunit 134. The first DAC subunit 132 may be specifically configured to generate the modulator control signals and the second DAC subunit 134 may be specifically configured to generate the weight control signals. For example, the modulation rate of the modulator array 144 may be 25 GHz, and the first DAC subunit 132 may have a per-channel output update rate of 25 gigasamples per second (GSPS) and a resolution of 8 bits or more. The reconfiguration rate of the optical matrix multiplication unit 150 may be 1 MHz, and the second DAC subunit 134 may have an output update rate of 1 megasample per second (MSPS) and a resolution of 10 bits. Implementing separate first DAC subunit 132 and second DAC subunit 134 allows the DAC subunits to be independently optimized for their respective signals, which may reduce the overall power consumption, complexity, cost, or a combination thereof, of DAC unit 130. It is noted that although the first DAC subunit 132 and the second DAC subunit 134 are described as subcomponents of the DAC unit 130, in general, the first DAC subunit 132 and the second DAC subunit 134 may be integrated on a common chip or may be implemented as separate chips.

Based on the different characteristics of the first DAC subunit 132 and the second DAC subunit 134, in some embodiments, the memory unit 120 may include a first memory subunit and a second memory subunit. The first memory subunit may be a memory dedicated to storing the input data set and the digital input vectors and may have an operating speed sufficient to support the modulation rate. The second memory subunit may be a memory dedicated to storing neural network weights and may have an operating speed sufficient to support the reconfiguration rate of the optical matrix multiplication unit 150. In some embodiments, the first memory subunit may be implemented using SRAM and the second memory subunit may be implemented using DRAM. In some embodiments, both the first memory subunit and the second memory subunit may be implemented using DRAM. In some embodiments, the first memory subunit may be implemented as part of the controller 110 or as a cache of the controller 110. In some embodiments, the first and second memory subunits may be implemented as different address spaces of a single physical memory device.

The optical matrix multiplication unit 150 outputs an optical output vector of length N, which corresponds to the result of an N×N matrix multiplication of the optical input vector and the neural network weights. The optical matrix multiplication unit 150 is coupled to the detection unit 146, and the detection unit 146 is configured to generate N output voltages corresponding to the N optical signals of the optical output vector. For example, the detection unit 146 may include an array of N photodetectors configured to absorb the optical signals and generate photocurrents, and an array of N transimpedance amplifiers configured to convert the photocurrents into the output voltages. The bandwidths of the photodetectors and the transimpedance amplifiers may be set based on the modulation rate of the modulator array 144. The photodetectors may be formed of various materials based on the wavelength of the optical output vector being detected. Examples of materials for photodetectors include germanium, silicon-germanium alloys, and indium gallium arsenide (InGaAs).

The detection unit 146 is coupled to the ADC unit 160. The ADC unit 160 is configured to convert the N output voltages into N digitized optical outputs, which are quantized digital representations of the output voltages. For example, the ADC unit 160 may be an N-channel ADC. The controller 110 may obtain, from the ADC unit 160, the N digitized optical outputs corresponding to the optical output vector of the optical matrix multiplication unit 150. The controller 110 may form, from the N digitized optical outputs, a digital output vector of length N that corresponds to the result of the N×N matrix multiplication of the digital input vector of length N.

The various electronic components of the artificial neural network computing system 100 may be integrated in various ways. For example, the controller 110 may be an application specific integrated circuit fabricated on a semiconductor die. Other electronic components (e.g., memory unit 120, DAC unit 130, ADC unit 160, or a combination thereof) may be monolithically integrated on the semiconductor die on which controller 110 is fabricated. As another example, two or more electronic components may be integrated into a System-on-Chip (SoC). In an SoC embodiment, the controller 110, the memory unit 120, the DAC unit 130, and the ADC unit 160 may be fabricated on respective dies, and the respective dies may be integrated on a common platform (e.g., an interposer) that provides electrical connections between the integrated components. Such an SoC approach may allow faster data transfer between the electronic components of the artificial neural network computing system 100 than an approach in which the components are separately placed and routed on a printed circuit board (PCB), thereby increasing the operating speed of the artificial neural network computing system 100. Furthermore, the SoC approach may allow the use of different manufacturing technologies optimized for different electronic components, which may improve the performance of the different components and reduce overall cost relative to a monolithic integration approach. Although the integration of the controller 110, the memory unit 120, the DAC unit 130, and the ADC unit 160 has been described, in general, a subset of the components may be integrated while other components are implemented as separate components for various reasons (e.g., performance or cost). For example, in some embodiments, the memory unit 120 may be integrated with the controller 110 as a functional block within the controller 110.

The various optical components of the artificial neural network computing system 100 may also be integrated in various ways. Examples of optical components of the artificial neural network computing system 100 include the laser unit 142, the modulator array 144, the optical matrix multiplication unit 150, and the photodetectors of the detection unit 146. These optical components may be integrated in various ways to improve performance and/or reduce cost. For example, the laser unit 142, modulator array 144, optical matrix multiplication unit 150, and photodetectors may be monolithically integrated on a common semiconductor substrate as a photonic integrated circuit (PIC). On photonic integrated circuits formed based on compound semiconductor material systems, such as group III-V compound semiconductors (e.g., indium phosphide (InP)), lasers, modulators (e.g., electro-absorption modulators), waveguides, and photodetectors may be monolithically integrated on a single die. Such monolithic integration may reduce the complexity of aligning the inputs and outputs of various separate optical components, which may require alignment accuracy ranging from sub-micron to several microns. As another example, the laser source of the laser unit 142 may be fabricated on a compound semiconductor die, and the optical power splitter of the laser unit 142, the modulator array 144, the optical matrix multiplication unit 150, and the photodetectors of the detection unit 146 may be fabricated on a silicon die. Photonic integrated circuits fabricated on silicon wafers (which may be referred to as silicon photonics technology) generally offer greater integration density, higher lithographic resolution, and lower cost than III-V based photonic integrated circuits. 
Such greater integration density may be beneficial in the fabrication of the optical matrix multiplication unit 150 because the optical matrix multiplication unit 150 typically includes tens to hundreds of optical components, such as power splitters and phase shifters. Furthermore, the higher lithographic resolution of silicon photonics technology may reduce manufacturing variations of the optical matrix multiplication unit 150, thereby improving the accuracy of the optical matrix multiplication unit 150.

The artificial neural network computing system 100 may be implemented in a variety of form factors. For example, the artificial neural network computing system 100 may be implemented as a co-processor plugged into a host computer. Such an artificial neural network computing system 100 may have a form factor such as a PCI Express (PCIe) card and communicate with the host computer over a PCIe bus. The host computer may host a plurality of co-processor type artificial neural network computing systems 100 and be connected to the computer 102 via a network. This type of embodiment may be suitable for cloud data centers, where server racks may be dedicated to processing artificial neural network computation requests received from other computers or servers. As another example, the co-processor type artificial neural network computing system 100 may be plugged directly into the computer 102 that issues the artificial neural network computation request.

In some embodiments, the artificial neural network computing system 100 may be integrated into a physical system that requires real-time artificial neural network computing capability. For example, systems that rely heavily on real-time artificial intelligence tasks, such as autonomous vehicles, autonomous drones, object-recognition or face-recognition security cameras, and various Internet-of-Things (IoT) devices, may benefit from integrating the artificial neural network computing system 100 directly with other subsystems of such systems. A directly integrated artificial neural network computing system 100 may enable real-time artificial intelligence in devices with poor or no network connectivity and enhance the reliability and availability of mission-critical artificial intelligence systems.

Although DAC unit 130 and ADC unit 160 are shown coupled to controller 110, in some embodiments DAC unit 130, ADC unit 160, or both may alternatively or additionally be coupled to memory unit 120. For example, direct memory access (DMA) operations by the DAC unit 130 or the ADC unit 160 may reduce the computational burden on the controller 110 and reduce the latency of reading from and writing to the memory unit 120, thereby further increasing the operating speed of the artificial neural network computing system 100.

FIG. 2 illustrates a flow chart of an example of a process 200 for performing artificial neural network calculations. The steps of process 200 may be performed by controller 110. In some embodiments, the various steps of process 200 may be run in parallel, in combination, in a loop, or in any order.

In step 210, an artificial neural network (ANN) computation request is received that includes an input data set and a first plurality of neural network weights. The input data set includes a first digital input vector. The first digital input vector is a subset of the input data set. For example, it may be a sub-region of an image. The artificial neural network computation request may be generated by various entities (e.g., computer 102). The computer may include one or more of various types of computing devices, such as personal computers, server computers, vehicle computers, and flight computers. An artificial neural network computation request generally refers to an electrical signal that notifies or informs the artificial neural network computing system 100 that an artificial neural network computation is to be performed. In some embodiments, the artificial neural network computation request may be split into two or more signals. For example, a first signal may query the artificial neural network computing system 100 to check whether the system 100 is ready to receive the input data set and the first plurality of neural network weights. In response to an acknowledgement by the system 100, the computer may transmit a second signal comprising the input data set and the first plurality of neural network weights.

In step 220, the input data set and the first plurality of neural network weights are stored. The controller 110 may store the input data set and the first plurality of neural network weights in the storage unit 120. Storing the input data set and the first plurality of neural network weights in the storage unit 120 may allow flexibility in the operation of the artificial neural network computing system 100, which may, for example, improve the overall performance of the system. For example, the input data set may be divided into digital input vectors of a set size and format by retrieving a desired portion of the input data set from the storage unit 120. Different portions of the input data set may be processed in various orders, or shuffled, to allow various types of artificial neural network computations to be performed. For example, where the input and output matrices are of different sizes, shuffling may allow matrix multiplication to be performed by a block matrix multiplication technique. As another example, storing the input data set and the first plurality of neural network weights in the storage unit 120 may allow the artificial neural network computing system 100 to queue a plurality of artificial neural network computation requests, which may allow the artificial neural network computing system 100 to sustain operation at its full speed without periods of inactivity.
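
The block matrix multiplication technique mentioned above can be sketched as follows: a larger product is assembled from products that each fit a fixed n×n unit. The function names, and the requirement that both matrix dimensions be multiples of n, are assumptions made for illustration; `matvec` stands in for one pass through the multiplication unit.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product; stands in for one n x n multiplication."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def block_matvec(matrix, vector, n):
    """Multiply an (R x C) matrix by a length-C vector using only n x n
    products. R and C are assumed to be multiples of n."""
    rows, cols = len(matrix), len(vector)
    result = [0.0] * rows
    for r0 in range(0, rows, n):
        for c0 in range(0, cols, n):
            block = [row[c0:c0 + n] for row in matrix[r0:r0 + n]]
            partial = matvec(block, vector[c0:c0 + n])
            # Accumulate the partial sums for this block row.
            for i, value in enumerate(partial):
                result[r0 + i] += value
    return result

big_matrix = [[r * 6 + c for c in range(6)] for r in range(4)]  # 4 x 6 example
big_vector = [1, 2, 3, 4, 5, 6]
print(block_matvec(big_matrix, big_vector, 2))
print(matvec(big_matrix, big_vector))
```

Each inner iteration retrieves one sub-vector and one block of weights, which is the kind of portion-by-portion retrieval that storing the data first makes possible.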

In some embodiments, the input data set may be stored in a first storage subunit and the first plurality of neural network weights may be stored in a second storage subunit.

In step 230, a first plurality of modulator control signals is generated based on the first digital input vector and a first plurality of weight control signals is generated based on the first plurality of neural network weights. The controller 110 may transmit the first DAC control signal to the DAC unit 130 to generate a first plurality of modulator control signals. DAC unit 130 generates a first plurality of modulator control signals based on the first DAC control signals and modulator array 144 generates an optical input vector representing a first digital input vector.

The first DAC control signal may include a plurality of digital values to be converted by DAC unit 130 into a first plurality of modulator control signals. The plurality of digital values generally corresponds to the first digital input vector and may be associated by various mathematical relationships or look-up tables. For example, the plurality of digital values may be linearly proportional to the values of the elements of the first digital input vector. As another example, the plurality of digital values may be associated with elements of the first digital input vector through a lookup table configured to maintain a linear relationship between the digital input vector and the optical input vector produced by the modulator array 144.

The controller 110 may transmit the second DAC control signal to the DAC unit 130 to generate the first plurality of weight control signals. The DAC unit 130 generates the first plurality of weight control signals based on the second DAC control signal, and the optical matrix multiplication unit 150 is reconfigured according to the first plurality of weight control signals to implement a matrix corresponding to the first plurality of neural network weights.

The second DAC control signal may include a plurality of digital values to be converted into the first plurality of weight control signals by the DAC unit 130. The plurality of digital values generally correspond to a first plurality of neural network weights and may be associated by various mathematical relationships or look-up tables. For example, the plurality of digital values may be linearly proportional to the first plurality of neural network weights. As another example, the plurality of digital values may be calculated by performing various mathematical operations on the first plurality of neural network weights to generate the weight control signal, which may configure the optical matrix multiplication unit 150 to perform matrix multiplication corresponding to the first plurality of neural network weights.

In some embodiments, the first plurality of neural network weights representing the matrix M may be decomposed as M = USV* by a singular value decomposition (SVD) method, where U is an m×m unitary matrix, S is an m×n diagonal matrix having non-negative real numbers on the diagonal, and V* is the complex conjugate of an n×n unitary matrix V. In this case, the first plurality of weight control signals may include a first plurality of optical matrix multiplication unit control signals corresponding to the matrix V, and a second plurality of optical matrix multiplication unit control signals corresponding to the matrix S. Further, the optical matrix multiplication unit 150 may be configured to have a first optical matrix multiplication subunit configured to implement the matrix V, a second optical matrix multiplication subunit configured to implement the matrix S, and a third optical matrix multiplication subunit configured to implement the matrix U, such that the optical matrix multiplication unit 150 implements the matrix M as a whole. The SVD process is further described in U.S. patent publication No. US2017/0351293A1, entitled "APPARATUS AND METHODS FOR OPTICAL NEURAL NETWORK," which is incorporated herein by reference in its entirety.
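
The decomposition and the order in which the three subunits act on an input vector can be written out as below. This is a restatement in standard notation rather than part of the original text: the symbols x and y for the input and output vectors are introduced here for illustration, and in the standard SVD V* denotes the conjugate transpose of V.

```latex
M = U S V^{*}, \qquad
U \in \mathbb{C}^{m \times m}, \quad
S \in \mathbb{R}_{\ge 0}^{m \times n}, \quad
V \in \mathbb{C}^{n \times n}

y = M x = U \bigl( S \, ( V^{*} x ) \bigr)
```

Read from right to left, the input vector first passes through the subunit associated with V, then through the diagonal stage S, and finally through the subunit implementing U.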

In step 240, a first plurality of digitized optical outputs corresponding to the optical output vector of the optical matrix multiplication unit is obtained. The optical input vector produced by modulator array 144 is processed by optical matrix multiplication unit 150 and converted into an optical output vector. The optical output vector is detected by the detection unit 146 and converted into electrical signals, which may be converted into digital values by the ADC unit 160. The controller 110 may, for example, transmit a conversion request to the ADC unit 160 to start converting the voltages output by the detection unit 146 into digitized optical outputs. Once the conversion is completed, the ADC unit 160 may transmit the conversion result to the controller 110. Alternatively, the controller 110 may retrieve the conversion result from the ADC unit 160. The controller 110 may form a digital output vector from the digitized optical outputs, the digital output vector corresponding to the result of the matrix multiplication of the digital input vector. For example, the digitized optical outputs may be organized or concatenated into a vector format.

In some embodiments, ADC unit 160 may be set or controlled to perform ADC conversion based on the DAC control signals issued by controller 110 to DAC unit 130. For example, the ADC conversion may be set to start at a preset time after the DAC unit 130 generates the modulator control signals. Such control of the ADC conversion may simplify the operation of the controller 110 and reduce the number of necessary control operations.

In step 250, a nonlinear transformation is performed on the first digital output vector to produce a first transformed digital output vector. The nodes, or artificial neurons, of an artificial neural network operate by first computing a weighted sum of the signals received from the nodes of the previous layer, and then performing a nonlinear transformation ("activation") of the weighted sum to produce an output. Various types of artificial neural networks may implement various types of differentiable nonlinear transformations. Examples of nonlinear transformation functions include the rectified linear unit (ReLU) function, the sigmoid function, the hyperbolic tangent function, the x² function, and the |x| function. This nonlinear transformation is performed on the first digital output vector by the controller 110 to produce the first transformed digital output vector. In some embodiments, the nonlinear transformation may be performed by an application specific digital integrated circuit within the controller 110. For example, the controller 110 may include one or more modules or circuit blocks that are specifically adapted to accelerate the computation of one or more types of nonlinear transformations.
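
The listed transformation functions can be sketched in software as element-wise operations on a digital output vector. The dictionary keys and the `transform` helper are hypothetical names chosen for illustration; the text does not specify how the controller 110 implements these functions.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def square(x):
    return x * x

# Hypothetical registry of the transformations named in the text.
ACTIVATIONS = {
    "relu": relu,
    "sigmoid": sigmoid,
    "tanh": math.tanh,
    "square": square,
    "abs": abs,
}

def transform(vector, name):
    """Apply the named nonlinear transformation to every element."""
    func = ACTIVATIONS[name]
    return [func(x) for x in vector]

digital_output_vector = [-1.0, 0.0, 2.0]
for name in ACTIVATIONS:
    print(name, transform(digital_output_vector, name))
```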

In step 260, the first transformed digital output vector is stored. The controller 110 may store the first transformed digital output vector in the storage unit 120. In the case where the input data set is divided into a plurality of digital input vectors, the first transformed digital output vector corresponds to the artificial neural network computation result for a portion of the input data set, such as the first digital input vector. As such, storing the first transformed digital output vector allows the artificial neural network computing system 100 to perform additional computations on other digital input vectors of the input data set and store their results, to be later aggregated into a single artificial neural network output.

In step 270, an artificial neural network output generated based on the first transformed digital output vector is output. The controller 110 generates an artificial neural network output that is the result of processing the input data set through an artificial neural network defined by the first plurality of neural network weights. In the case where the input data set is split into a plurality of digital input vectors, the artificial neural network output produced is an aggregate output comprising the first transformed digital output vector, and may further comprise additional transformed digital output vectors corresponding to other portions of the input data set. Once the artificial neural network output is generated, the generated output is transmitted to the computer (e.g., computer 102) that initiated the artificial neural network calculation request.

Various performance metrics may be defined for the artificial neural network computing system 100 implementing process 200. Defining performance metrics may allow the performance of the artificial neural network computing system 100, which implements the light processor 140, to be compared with the performance of other artificial neural network computing systems that instead implement an electronic matrix multiplication unit. In one aspect, the rate at which an artificial neural network computation can be performed may be indicated in part by a first cycle period, defined as the time elapsed between step 220 of storing the input data set and the first plurality of neural network weights in the storage unit and step 260 of storing the first transformed digital output vector in the storage unit. Thus, the first cycle period includes the time it takes to convert the electrical signals to optical signals (e.g., step 230), perform the matrix multiplication in the optical domain, and convert the result back to the electrical domain (e.g., step 240). Steps 220 and 260 both involve storing data in the storage unit 120, a step shared between the artificial neural network computing system 100 and a conventional artificial neural network computing system without the light processor 140. As such, measuring the first cycle period as a memory-to-memory transaction time may allow a realistic or fair comparison of artificial neural network computational throughput between the artificial neural network computing system 100 and an artificial neural network computing system without the light processor 140 (e.g., a system implementing an electronic matrix multiplication unit).

Because of the rate at which the modulator array 144 can generate the optical input vector (e.g., at 25 GHz) and the processing rate of the optical matrix multiplication unit 150 (e.g., greater than 100 GHz), the first cycle period of the artificial neural network computing system 100 for performing a single artificial neural network computation of a single digital input vector can approximate the inverse of the speed of the modulator array 144 (e.g., 40 ps). The first cycle period may be, for example, less than or equal to 100ps, less than or equal to 200ps, less than or equal to 500ps, less than or equal to 1ns, less than or equal to 2ns, less than or equal to 5ns, or less than or equal to 10ns after considering the delays associated with the signal generation of DAC unit 130 and the ADC conversion of ADC unit 160.

By comparison, the run time of an electronic matrix multiplication unit for multiplying an M × 1 vector by an M × M matrix is generally proportional to M^2 − 1 processor clock cycles. For M = 32, this multiplication would take about 1024 cycles, which results in a run time exceeding 300 ns at a 3 GHz clock speed, several orders of magnitude slower than the first cycle period of the artificial neural network computing system 100.
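The arithmetic behind this comparison can be checked with a short Python sketch. It assumes exactly one processor clock cycle per unit of the M^2 − 1 scaling (a proportionality constant of 1, which the text does not state) and uses the 25 GHz modulation rate mentioned earlier as the optical reference. The function names are illustrative.

```python
def electronic_runtime_s(m, clock_hz):
    """Run time of an M x 1 by M x M multiplication, assuming M^2 - 1 clock cycles."""
    cycles = m * m - 1
    return cycles / clock_hz

def optical_cycle_period_s(modulation_rate_hz):
    """Idealized first cycle period: the inverse of the modulator array rate."""
    return 1.0 / modulation_rate_hz

t_electronic = electronic_runtime_s(32, 3e9)   # 1023 cycles at 3 GHz, about 341 ns
t_optical = optical_cycle_period_s(25e9)       # 40 ps at 25 GHz
speed_ratio = t_electronic / t_optical         # between 10^3 and 10^4
```

The ratio shrinks when the DAC and ADC delays are included; with a first cycle period of 10 ns, for example, the same electronic run time is only about 34 times longer.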

In some embodiments, process 200 further includes the step of generating a second plurality of modulator control signals based on the first transformed digital output vector. In some types of artificial neural network calculations, a single digital input vector may be repeatedly propagated through or processed by the same artificial neural network. An artificial neural network implementing multi-pass processing may be referred to as a recurrent neural network (RNN). A recurrent neural network is a neural network in which the output of the network during the k-th pass is recycled back to the input of the neural network and used as the input during the (k+1)-th pass. Recurrent neural networks may have various applications in pattern recognition tasks, such as speech or handwriting recognition. Once the second plurality of modulator control signals are generated, process 200 may proceed from step 240 to step 260 to complete a second pass of the first digital input vector through the artificial neural network. In general, the recycling of the transformed digital output into the digital input vector may be repeated for a predetermined number of cycles, depending on the characteristics of the recurrent neural network received in the artificial neural network calculation request.

In some embodiments, process 200 further includes the step of generating a second plurality of weight control signals based on the second plurality of neural network weights. In some cases, the artificial neural network computation request further includes a second plurality of neural network weights. In general, an artificial neural network has one or more hidden layers in addition to an input layer and an output layer. For an artificial neural network having two hidden layers, the second plurality of neural network weights may correspond to connectivity between a first layer of the artificial neural network and a second layer of the artificial neural network. In order to process the first digital input vector through the two hidden layers of the artificial neural network, the first digital input vector may first be processed according to the process 200 up to step 260, in which the result of processing the first digital input vector through the first hidden layer of the artificial neural network is stored in the storage unit 120. The controller 110 then reconfigures the optical matrix multiplication unit 150 to perform matrix multiplication corresponding to the second plurality of neural network weights associated with the second hidden layer of the artificial neural network. Once the optical matrix multiplication unit 150 is reconfigured, the process 200 may generate a plurality of modulator control signals based on the first transformed digital output vector, which generate an updated optical input vector corresponding to the output of the first hidden layer. The updated optical input vector is then processed by the reconfigured optical matrix multiplication unit 150, which now corresponds to the second hidden layer of the artificial neural network. In general, the steps described may be repeated until the digital input vector has been processed through all hidden layers of the artificial neural network.

As described above, in some embodiments of the optical matrix multiplication unit 150, the reconfiguration rate of the optical matrix multiplication unit 150 may be significantly slower than the modulation rate of the modulator array 144. In this case, the throughput of the artificial neural network computing system 100 may be adversely affected by the time spent reconfiguring the optical matrix multiplication unit 150, during which artificial neural network computations cannot be performed. To mitigate the effects of the relatively slow reconfiguration time of the optical matrix multiplication unit 150, batch processing techniques may be utilized, in which two or more digital input vectors propagate through the optical matrix multiplication unit 150 without a configuration change, so as to amortize the reconfiguration time over a greater number of digital input vectors.

Fig. 2B shows a diagram 290 illustrating aspects of the process 200 of fig. 2A. For an artificial neural network with two hidden layers, instead of processing the first digital input vector through the first hidden layer, reconfiguring the optical matrix multiplication unit 150 for the second hidden layer, processing the first digital input vector through the reconfigured optical matrix multiplication unit 150, and repeating the same operations for the remaining digital input vectors, all digital input vectors of the input data set may first be processed through the optical matrix multiplication unit 150 configured for the first hidden layer (configuration #1), as shown in the upper part of diagram 290. Once all digital input vectors have been processed by the optical matrix multiplication unit 150 with configuration #1, the optical matrix multiplication unit 150 is reconfigured to configuration #2, which corresponds to the second hidden layer of the artificial neural network. This reconfiguration may be significantly slower than the rate at which the optical matrix multiplication unit 150 can process input vectors. Once the optical matrix multiplication unit 150 is reconfigured for the second hidden layer, the output vectors from the previous hidden layer may be processed in a batch by the optical matrix multiplication unit 150. For large input data sets with tens or hundreds of thousands of digital input vectors, the impact of the reconfiguration time can be reduced by approximately the same factor, which can significantly reduce the fraction of time that the artificial neural network computing system 100 spends in reconfiguration.
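The amortization effect can be sketched numerically. The Python example below estimates the fraction of total time spent on reconfiguration with and without batching. The per-vector processing time (1 ns) and the reconfiguration time (1 µs) are hypothetical values chosen only for illustration, and `reconfig_fraction` is an assumed helper name.

```python
def reconfig_fraction(num_vectors, num_layers, t_vector_s, t_reconfig_s, batched):
    """Fraction of total time spent reconfiguring the matrix multiplication unit."""
    compute = num_vectors * num_layers * t_vector_s
    if batched:
        # One reconfiguration per layer, shared by the whole batch.
        reconfigs = num_layers
    else:
        # One reconfiguration per layer for every individual vector.
        reconfigs = num_vectors * num_layers
    overhead = reconfigs * t_reconfig_s
    return overhead / (overhead + compute)

# Hypothetical numbers: 10,000 vectors, 2 hidden layers,
# 1 ns per vector pass, 1 microsecond per reconfiguration.
unbatched = reconfig_fraction(10_000, 2, 1e-9, 1e-6, batched=False)
batched = reconfig_fraction(10_000, 2, 1e-9, 1e-6, batched=True)
```

Under these assumed timings, reconfiguration dominates the unbatched schedule (over 99% of the time) and falls to roughly 9% with batching, since the number of reconfigurations drops by a factor equal to the number of input vectors.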

To implement batch processing, in some embodiments, process 200 further includes the steps of: generating, by the DAC unit, a second plurality of modulator control signals based on the second digital input vector; obtaining, from the ADC unit, a second plurality of digital light outputs corresponding to the light output vector of the optical matrix multiplication unit, the second plurality of digital light outputs forming a second digital output vector; performing a nonlinear transformation on the second digital output vector to generate a second transformed digital output vector; and storing the second transformed digital output vector in the storage unit. For example, generating the second plurality of modulator control signals may follow step 260. Further, the artificial neural network output of step 270 is in this case based on both the first transformed digital output vector and the second transformed digital output vector. The obtaining, performing, and storing steps are similar to steps 240 through 260.

Batch processing is one of many techniques for improving the throughput of the artificial neural network computing system 100. Another technique for improving the throughput of the artificial neural network computing system 100 is to process multiple digital input vectors in parallel by utilizing wavelength division multiplexing (WDM). WDM is a technique of simultaneously propagating a plurality of optical signals of different wavelengths through a common propagation channel (e.g., a waveguide of the optical matrix multiplication unit 150). Unlike electrical signals, optical signals of different wavelengths may propagate through a common channel without affecting other optical signals of different wavelengths on the same channel. In addition, optical signals may be added to (multiplexed onto) or dropped from (demultiplexed from) a common propagation channel using well-known structures such as optical multiplexers and demultiplexers.

In the context of the artificial neural network computing system 100, multiple light input vectors of different wavelengths may be independently generated, propagated through the optical matrix multiplication unit 150 at the same time, and independently detected to enhance the throughput of the artificial neural network computing system 100. Referring to fig. 1F, a diagram of an example of a wavelength division multiplexed (WDM) artificial neural network computing system 104 is shown. The WDM artificial neural network computing system 104 is similar to the artificial neural network computing system 100 unless otherwise described. To implement WDM technology, in some embodiments of the artificial neural network computing system 104, the laser unit 142 is configured to generate multiple wavelengths, such as λ₁, λ₂, and λ₃. The multiple wavelengths may preferably be separated by a sufficiently large wavelength spacing to allow for easy multiplexing onto and demultiplexing from common propagation channels. For example, wavelength spacings greater than 0.5 nm, 1.0 nm, 2.0 nm, 3.0 nm, or 5.0 nm may allow for simple multiplexing and demultiplexing. On the other hand, the range between the shortest and longest wavelengths of the plurality of wavelengths ("WDM bandwidth") may preferably be small enough that the characteristics or performance of the optical matrix multiplication unit 150 remains substantially the same across the plurality of wavelengths. Optical components are typically dispersive, meaning that their optical properties vary with wavelength. For example, the power splitting ratio of a Mach-Zehnder interferometer may vary with wavelength. However, by designing the optical matrix multiplication unit 150 to have a sufficiently large operating wavelength window, and by limiting the wavelengths to within the operating wavelength window, the light output vector output by the optical matrix multiplication unit 150 at each wavelength may be a sufficiently accurate result of the matrix multiplication implemented by the optical matrix multiplication unit 150. The operating wavelength window may be, for example, 1 nm, 2 nm, 3 nm, 4 nm, 5 nm, 10 nm, or 20 nm.

Fig. 39A shows a diagram of an example of a Mach-Zehnder modulator 3900 that may be used to modulate the amplitude of an optical signal. The Mach-Zehnder modulator 3900 includes two 1x2 port multimode interference couplers (MMI_1x2) 3902a and 3902b, two balanced arms 3904a and 3904b, and a phase shifter 3906 in one arm (or one phase shifter in each arm). When a voltage is applied to the phase shifter in one arm through the signal line 3908, a phase difference arises between the two arms 3904a and 3904b, which is converted into amplitude modulation. The 1x2 port multimode interference couplers 3902a and 3902b and the phase shifter 3906 are configured as broadband photonic components, and the optical path lengths of the two arms 3904a and 3904b are configured to be equal. This enables the Mach-Zehnder modulator 3900 to operate over a wide wavelength range.

Fig. 39B is a graph 3910 showing the intensity-voltage curves of a Mach-Zehnder modulator 3900 using the configuration shown in fig. 39A for wavelengths of 1530 nm, 1550 nm, and 1570 nm. Graph 3910 shows that the Mach-Zehnder modulator 3900 has similar intensity-voltage characteristics for different wavelengths in the 1530 nm to 1570 nm range.

Referring back to fig. 1F, the modulator array 144 of the WDM artificial neural network computing system 104 includes banks of optical modulators configured to generate a plurality of optical input vectors, each of the banks corresponding to one of the plurality of wavelengths and generating a respective optical input vector having the respective wavelength. For example, for a system having light input vectors of length 32 and 3 wavelengths (e.g., λ₁, λ₂, and λ₃), the modulator array 144 may have 3 banks of 32 modulators each. In addition, the modulator array 144 also includes an optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector comprising the plurality of wavelengths. For example, the optical multiplexer may combine the outputs of the three modulator banks at three different wavelengths into a single propagation channel (e.g., a waveguide) for each element of the optical input vector. As such, returning to the example above, the combined optical input vector will have 32 optical signals, each signal comprising 3 wavelengths.

In addition, the detection unit 146 of the WDM artificial neural network computing system 104 is further configured to demultiplex the plurality of wavelengths and produce a plurality of demultiplexed output voltages. For example, the detection unit 146 may include a demultiplexer configured to demultiplex the three wavelengths included in each of the 32 signals of the multi-wavelength light output vector and route the 3 single-wavelength light output vectors to three banks of photodetectors coupled to three banks of transimpedance amplifiers.

Furthermore, the ADC unit 160 of the WDM artificial neural network computing system 104 includes an ADC group configured to convert the plurality of demultiplexed output voltages of the detection unit 146. Each of the ADC groups corresponds to one of a plurality of wavelengths and produces a respective digital demultiplexed light output. For example, the ADC group may be coupled to a transimpedance amplifier group of the detection unit 146.

Controller 110 may implement a method similar to process 200, but extended to support multi-wavelength operation. For example, the method may include the steps of obtaining a plurality of digital demultiplexed light outputs from the ADC unit 160, the plurality of digital demultiplexed light outputs forming a plurality of first digital output vectors, wherein each of the plurality of first digital output vectors corresponds to one of the plurality of wavelengths, performing a nonlinear transformation on each of the plurality of first digital output vectors to produce a plurality of transformed first digital output vectors, and storing the plurality of transformed first digital output vectors in a memory unit.

In some cases, the artificial neural network may be specifically designed, and the digital input vectors may be specifically formed, such that the multi-wavelength light output vector can be detected without demultiplexing. In this case, the detection unit 146 may be a wavelength-insensitive detection unit that does not demultiplex the multiple wavelengths of the multi-wavelength light output vector. In this way, each photodetector of the detection unit 146 effectively sums the multiple wavelengths of an optical signal into a single photocurrent, and each voltage output by the detection unit 146 corresponds to an element-by-element sum of the matrix multiplication results of the multiple digital input vectors.
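The element-by-element sum can be illustrated with an idealized Python sketch. It assumes each wavelength carries an exact, noise-free matrix multiplication result and that the photodetector output is simply the sum of the per-wavelength results; real detection responds to optical power, which this model does not attempt to capture. The names `matvec` and `wavelength_insensitive_detect` are hypothetical.

```python
def matvec(u, v):
    """Multiply matrix u (list of rows) by vector v."""
    return [sum(u_ij * v_j for u_ij, v_j in zip(row, v)) for row in u]

def wavelength_insensitive_detect(u, vectors_by_wavelength):
    """Idealized detector: sum the per-wavelength results element by element."""
    results = [matvec(u, v) for v in vectors_by_wavelength]
    return [sum(elements) for elements in zip(*results)]

u = [[1, 2],
     [3, 4]]
# One input vector per wavelength (three wavelengths in this example).
inputs = [[1, 0], [0, 1], [1, 1]]
detected = wavelength_insensitive_detect(u, inputs)
```

Because the idealized model is linear, the detected vector equals the result of multiplying the matrix by the element-wise sum of the input vectors, which is why the network and the inputs would need to be designed so that this sum is the quantity of interest.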

Up to now, the nonlinear transformation of the weighted sum performed as part of the artificial neural network calculation has been performed in the digital domain by the controller 110. In some cases, the nonlinear transformation may be computationally intensive or power consuming, may significantly increase the complexity of the controller 110, or may limit the performance of the artificial neural network computing system 100 in terms of throughput or power efficiency. As such, in some embodiments of the artificial neural network computing system, the nonlinear transformation may be performed in the analog domain by analog electronics.

Fig. 3A illustrates a diagram of an example of an artificial neural network computing system 300. The artificial neural network computing system 300 is similar to the artificial neural network computing system 100, except that an analog nonlinear unit 310 is added. The analog nonlinear unit 310 is disposed between the detection unit 146 and the ADC unit 160. The analog nonlinear unit 310 is configured to receive the output voltage from the detection unit 146, apply a nonlinear transfer function, and output the converted output voltage to the ADC unit 160.

When the ADC unit 160 receives the voltage that has been non-linearly converted by the analog non-linear unit 310, the controller 110 may obtain a converted digital output voltage corresponding to the converted output voltage from the ADC unit 160. Since the digital output voltage obtained from the ADC unit 160 has been non-linearly transformed ("activated"), the non-linear transformation step of the controller 110 may be omitted, thereby reducing the computational burden of the controller 110. Next, the first converted voltage directly obtained from the ADC unit 160 may be stored in the storage unit 120 as a first converted digital output vector.

The analog nonlinear unit 310 may be implemented in various ways. For example, a high-gain amplifier in a feedback configuration, a comparator with an adjustable reference voltage, the nonlinear I-V characteristic of a diode, the breakdown behavior of a diode, the nonlinear C-V characteristic of a variable capacitor, or the nonlinear I-V characteristic of a variable resistor may be used to implement the analog nonlinear unit 310.

The use of the analog nonlinear unit 310 may improve the performance, such as throughput or power efficiency, of the artificial neural network computing system 300 by reducing the steps performed in the digital domain. Shifting the nonlinear transformation step out of the digital domain may allow for additional flexibility and improvement in the operation of the artificial neural network computing system. For example, in a recurrent neural network, the output of the optical matrix multiplication unit 150 is activated and recycled back to the input of the optical matrix multiplication unit 150. The activation step is performed by the controller 110 in the artificial neural network computing system 100, which requires digitizing the output voltage of the detection unit 146 each time the optical matrix multiplication unit 150 is passed. However, since the activation step is now performed before the digitization of the ADC unit 160, the number of ADC conversions required in performing the recurrent neural network calculations can be reduced.

In some embodiments, the analog nonlinear unit 310 may be integrated into the ADC unit 160 as a nonlinear ADC unit. For example, the nonlinear ADC unit may be a linear ADC unit having a nonlinear look-up table that maps the linear digital output of the linear ADC unit to a desired nonlinear transformed digital output.
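A nonlinear ADC of this kind can be sketched as a linear quantizer followed by a table lookup. The Python example below is only a sketch under stated assumptions: an ideal 8-bit linear ADC with a 1.0 V reference, and an x^2 transfer function used to fill the table. The function names and parameter values are illustrative and do not come from the disclosure.

```python
def linear_adc(voltage, v_ref=1.0, bits=8):
    """Ideal linear ADC: clamp to [0, v_ref] and quantize to an integer code."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * levels)

def build_lut(activation, bits=8):
    """Precompute the nonlinear output code for every linear ADC code."""
    levels = (1 << bits) - 1
    return [round(activation(code / levels) * levels) for code in range(levels + 1)]

def nonlinear_adc(voltage, lut, v_ref=1.0, bits=8):
    """Linear conversion followed by a lookup of the transformed digital output."""
    return lut[linear_adc(voltage, v_ref, bits)]

# Assumed activation for the table: x^2 on the normalized range [0, 1].
square_lut = build_lut(lambda x: x * x)
code_full_scale = nonlinear_adc(1.0, square_lut)
code_zero = nonlinear_adc(0.0, square_lut)
```

The table has one entry per linear code (256 entries for 8 bits), so the nonlinear transformation costs a single memory read per conversion rather than an arithmetic evaluation in the controller.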

Fig. 3B shows a diagram of an example of an artificial neural network computing system 302. The artificial neural network computing system 302 is similar to the system 300 of FIG. 3A, except that it further includes an analog storage unit 320. The analog storage unit 320 is coupled to the DAC unit 130 (e.g., via the first DAC subunit 132), the modulator array 144, and the analog nonlinear unit 310. The analog storage unit 320 includes a multiplexer having a first input coupled to the DAC unit 130 and a second input coupled to the analog nonlinear unit 310. This allows the analog storage unit 320 to receive signals from either the DAC unit 130 or the analog nonlinear unit 310. The analog storage unit 320 is configured to store an analog voltage and output the stored analog voltage.

The analog storage unit 320 may be implemented in various ways. For example, a capacitor array may be used as an analog voltage storage component. A capacitor of the analog storage unit 320 may be charged to the input voltage by a charging circuit. The storage of the input voltage may be controlled based on a control signal received from the controller 110. The capacitor may be electrically isolated from the surrounding environment to reduce charge leakage that leads to undesirable capacitor discharge. Additionally (or alternatively), a feedback amplifier may be used to maintain the voltage stored on the capacitor. The voltage stored on the capacitor can be read out by a buffer amplifier, which allows the charge stored on the capacitor to be retained while the stored voltage is output. These aspects of the analog storage unit 320 may be similar to the operation of a sample-and-hold circuit. The buffer amplifier may perform the function of a modulator driver for driving the modulator array 144.

The operation of the artificial neural network computing system 302 will now be described. The first plurality of modulator control signals output by the DAC unit 130 (e.g., by the first DAC subunit 132) are first input to the modulator array 144 through the analog storage unit 320. In this step, the analog storage unit 320 may simply pass or buffer the first plurality of modulator control signals. The modulator array 144 generates an optical input vector based on the first plurality of modulator control signals, which propagates through the optical matrix multiplication unit 150 and is detected by the detection unit 146. The output voltage of the detection unit 146 is nonlinearly transformed by the analog nonlinear unit 310. At this point, instead of being digitized by the ADC unit 160, the transformed output voltage is stored by the analog storage unit 320 and then output to the modulator array 144 to be converted into the next optical input vector to be propagated through the optical matrix multiplication unit 150. This recurrent processing may be performed under the control of the controller 110 for a preset amount of time or for a preset number of cycles. Once the recurrent processing is completed for a given digital input vector, the transformed output voltage of the analog nonlinear unit 310 is converted by the ADC unit 160.

The use of analog storage unit 320 may significantly reduce the number of ADC conversions during the recurrent neural network calculations, for example, to one single ADC conversion per recurrent neural network calculation for a given digital input vector. Each ADC conversion takes a period of time and consumes some energy. As such, the throughput of the recurrent neural network computation of the artificial neural network computing system 302 may be higher than the throughput of the recurrent neural network computation of the artificial neural network computing system 100.
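The reduction in ADC conversions can be illustrated by counting them in two idealized loops. The Python sketch below models the matrix multiplication as exact arithmetic, uses a ReLU as the assumed activation, and counts one ADC conversion per digitized output vector. The function names are hypothetical, and analog noise, leakage, and timing are not modeled.

```python
def matvec(u, v):
    """Multiply matrix u (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in u]

def relu(x):
    return max(0, x)

def recurrent_digital_activation(u, v0, passes):
    """Digitize after every pass and apply the activation in the controller."""
    v, adc_conversions = list(v0), 0
    for _ in range(passes):
        detected = matvec(u, v)
        adc_conversions += 1              # one vector conversion per pass
        v = [relu(x) for x in detected]   # activation in the digital domain
    return v, adc_conversions

def recurrent_analog_storage(u, v0, passes):
    """Activate and hold in the analog domain, then digitize once at the end."""
    held = list(v0)
    for _ in range(passes):
        detected = matvec(u, held)
        held = [relu(x) for x in detected]  # analog nonlinear unit, then stored
    return held, 1                          # a single conversion after the last pass

u = [[1, -1],
     [0, 2]]
digital_result, digital_conversions = recurrent_digital_activation(u, [1, 1], 3)
analog_result, analog_conversions = recurrent_analog_storage(u, [1, 1], 3)
```

In this idealized model both loops produce the same vector, but the loop with analog storage performs one conversion regardless of the number of passes.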

The execution of the recurrent neural network calculation may be controlled by controlling the analog storage unit 320. For example, the controller may control the analog storage unit 320 to store voltages at specific times and output the stored voltages at different times. As such, the cycling of the signal from the analog storage unit 320 to the modulator array 144, through the analog nonlinear unit 310, and back to the analog storage unit 320 may be controlled by the controller 110 through its control of the storage and readout of the analog storage unit 320.

As such, in some embodiments, the controller 110 of the artificial neural network computing system 302 may perform the steps of: based on generating the first plurality of modulator control signals and the first plurality of weight control signals, storing, through the analog storage unit, a plurality of transformed output voltages of the analog nonlinear unit; outputting, through the analog storage unit, the stored transformed output voltages; obtaining, from the ADC unit, a second plurality of transformed digital output voltages, the second plurality of transformed digital output voltages forming a second transformed digital output vector; and storing the second transformed digital output vector in the storage unit.

The input data set processed by the artificial neural network computing system typically includes data having a resolution of greater than 1 bit. For example, a typical pixel of a grayscale digital image may have a resolution of 8 bits, i.e., 256 different levels. One way to represent and process this data in the optical domain is to encode the 256 different intensity levels of a pixel as 256 different power levels of the optical signal input to the optical matrix multiplication unit 150. The optical signal is analog in nature and is therefore susceptible to noise and detection errors. Referring back to fig. 1A, to maintain the 8-bit resolution of the digital input vector throughout the artificial neural network computing system 100 and produce a true 8-bit digital light output at the output of the ADC unit 160, each portion of the signal chain may preferably be designed to reproduce and maintain 8-bit resolution.

For example, the DAC unit 130 may preferably be designed to support conversion of an 8-bit digital input vector into modulator control signals of at least 8-bit resolution, so that the modulator array 144 may produce an optical input vector that faithfully represents the 8 bits of the digital input vector. In general, the modulator control signals may need to have additional resolution beyond the 8 bits of the digital input vector to compensate for the nonlinear response of the modulator array 144. Furthermore, the internal configuration of the optical matrix multiplication unit 150 may preferably be stable enough to ensure that the values of the light output vector are not corrupted by any fluctuations in the configuration of the optical matrix multiplication unit 150. For example, the temperature of the optical matrix multiplication unit 150 may need to be stabilized to within 5 degrees, 2 degrees, 1 degree, or 0.1 degree. Furthermore, the detection unit 146 may preferably have sufficiently low noise so as not to corrupt the 8-bit resolution of the light output vector, and the ADC unit 160 may preferably be designed to support the digitization of analog voltages with a resolution of at least 8 bits.

The power consumption and design complexity of various electronic components generally increase with bit resolution, operating speed, and bandwidth. For example, as a first-order approximation, the power consumption of the ADC unit 160 may scale linearly with the sampling rate and by a factor of 2^N, where N is the bit resolution of the conversion result. Furthermore, design considerations of the DAC unit 130 and the ADC unit 160 typically result in a tradeoff between sampling rate and bit resolution. As such, in some cases, it may be desirable for the artificial neural network computing system to operate internally at a lower bit resolution than the resolution of the input data set, while maintaining the resolution of the artificial neural network computing output.
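The first-order scaling can be made concrete with a small Python calculation. The sketch assumes the relative power is exactly the sampling rate multiplied by 2^N, with the proportionality constant set to 1; the sampling rates are hypothetical, and actual converters will deviate from this model.

```python
def relative_adc_power(sample_rate_hz, bits):
    """First-order model: power proportional to sampling rate times 2^N."""
    return sample_rate_hz * 2 ** bits

# Same sampling rate, 8-bit versus 1-bit conversion.
ratio_8bit_vs_1bit = relative_adc_power(1e9, 8) / relative_adc_power(1e9, 1)

# A 1-bit converter running four times faster versus a 4-bit converter.
ratio_fast_1bit_vs_4bit = relative_adc_power(4e9, 1) / relative_adc_power(1e9, 4)
```

Under this model, an 8-bit conversion costs 128 times the power of a 1-bit conversion at the same rate, and a 1-bit converter running at four times the rate still uses half the power of a 4-bit converter.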

Referring to fig. 4A, a diagram of an example of an artificial neural network computing system 400 with 1-bit internal resolution is shown. The artificial neural network computing system 400 is similar to the artificial neural network computing system 100, except that the DAC unit 130 is now replaced by the driver unit 430 and the ADC unit 160 is now replaced by the comparator unit 460. The driver unit 430 includes a first driver subunit 432 and a second driver subunit 434.

The driver unit 430 is configured to generate 1-bit modulator control signals and multi-bit weight control signals. For example, the driver circuitry of the driver unit 430 may receive a binary digital output directly from the controller 110 and condition the binary signal into a two-level voltage or current output suitable for driving the modulator array 144. Similarly, the second driver subunit 434 of the driver unit 430 may receive a binary digital output directly from the controller 110 and condition the binary signal into a two-level voltage or current output suitable for driving the modulators in the optical matrix multiplication unit 150.

The comparator unit 460 is configured to convert the output voltage of the detection unit 146 into a digital 1-bit optical output. For example, the comparison circuit of the comparator unit 460 may receive the voltage from the detection unit 146, compare the voltage with a preset threshold voltage, and output a digital 0 or 1 when the received voltage is less than or greater than the preset threshold voltage, respectively.
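The thresholding performed by the comparator unit can be sketched in a few lines of Python. The threshold value and the treatment of a voltage exactly equal to the threshold (mapped to 0 here) are assumptions, since the text only specifies the less-than and greater-than cases; `comparator_unit` is an illustrative name.

```python
def comparator_unit(voltages, threshold):
    """Convert each detected voltage to a 1-bit digital output."""
    # Assumption: a voltage exactly equal to the threshold yields 0.
    return [1 if v > threshold else 0 for v in voltages]

# Hypothetical detection-unit voltages and a hypothetical 0.5 V threshold.
digital_1bit_output = comparator_unit([0.1, 0.7, 0.5, 0.9], threshold=0.5)
```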

Referring to fig. 4B, a mathematical representation of the operation of the artificial neural network computing system 400 is shown. The operation of the artificial neural network computing system 400 will now be described with reference to fig. 4B. For a given artificial neural network calculation to be performed by the artificial neural network computing system 400, there is a corresponding digital input vector V and neural network weight matrix U. In this embodiment, the input vector V is a vector of length 4 having elements V₀ to V₃, and the matrix U is a 4×4 matrix having weights U₀₀ to U₃₃. Each element of the vector V has a resolution of 4 bits. Each 4-bit vector element has bits 0 (bit₀) to 3 (bit₃) corresponding to the 2^0 to 2^3 positions, respectively. Thus, the decimal (base-10) value of a 4-bit vector element is given by the sum 2^0*bit₀ + 2^1*bit₁ + 2^2*bit₂ + 2^3*bit₃. Accordingly, as shown, the input vector V may be analogously decomposed into V_bit0 to V_bit3 by the controller 110.
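As a non-limiting illustration, the decomposition of a 4-bit input vector into four 1-bit vectors can be sketched as follows. The function names `decompose_bit_planes` and `recombine_bit_planes` and the example values are hypothetical and are not taken from the figures.

```python
from typing import List


def decompose_bit_planes(vector: List[int], n_bits: int = 4) -> List[List[int]]:
    """Split each n_bits-wide element into bits; plane k holds bit k of every element."""
    for value in vector:
        if not 0 <= value < 2 ** n_bits:
            raise ValueError(f"{value} does not fit in {n_bits} bits")
    return [[(value >> k) & 1 for value in vector] for k in range(n_bits)]


def recombine_bit_planes(planes: List[List[int]]) -> List[int]:
    """Rebuild each element as the sum of 2**k * bit_k over all planes."""
    length = len(planes[0])
    return [sum(plane[i] << k for k, plane in enumerate(planes)) for i in range(length)]


V = [5, 12, 3, 9]                 # example 4-bit elements (assumed values)
planes = decompose_bit_planes(V)  # planes[0] plays the role of V_bit0, and so on
print(planes)
print(recombine_bit_planes(planes))
```

Recombining the bit planes with weights 2^0 to 2^3 returns the original vector, which is the property the bit-serial scheme relies on.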

The particular artificial neural network computation may then be performed by performing a series of matrix multiplications of the 1-bit vectors, followed by summing the individual matrix multiplication results. For example, each of the decomposed input vectors V_bit0 to V_bit3 may be multiplied by the matrix U by having the driver unit 430 generate a sequence of 4 1-bit modulator control signals corresponding to the 4 1-bit input vectors. This in turn produces a sequence of 4 1-bit optical input vectors that propagate through the optical matrix multiplication unit 150, which is configured by the driver unit 430 to implement matrix multiplication by the matrix U. Next, the controller 110 may obtain from the comparator unit 460 a sequence of 4 digital 1-bit optical outputs corresponding to the sequence of 4 1-bit modulator control signals.

In the case of 4-bit vectors decomposed into 4 1-bit vectors, each 1-bit vector should be processed by the artificial neural network computing system 400 at four times the speed at which other artificial neural network computing systems (e.g., system 100) can process a single 4-bit vector, in order to maintain the same effective artificial neural network computing throughput. This increased internal processing speed can be seen as time-division multiplexing of the 4 1-bit vectors into a single time slot for processing a 4-bit vector. The required increase in processing speed may be achieved at least in part by the increased operating speed of the driver unit 430 and the comparator unit 460 relative to the DAC unit 130 and the ADC unit 160, as a reduction in resolution of the signal conversion process generally results in an increase in the achievable signal conversion rate.

Although the signal conversion rate in 1-bit operation is increased by a factor of four, the resulting power consumption can be significantly reduced relative to 4-bit operation. As described above, the power consumption of the signal conversion process typically scales exponentially with bit resolution, while scaling linearly with the conversion rate. As such, a 16-fold reduction in power per conversion may result from the 4-fold reduction in bit resolution, followed by a 4-fold increase in power due to the increased conversion rate. In summary, a 4-fold reduction in operating power may be achieved by the artificial neural network computing system 400 over, for example, the artificial neural network computing system 100, while maintaining the same effective artificial neural network computing throughput.

Next, the controller 110 may construct a 4-bit digital output vector from the 4 digital 1-bit optical outputs by multiplying each digital 1-bit optical output by a respective weight 2^0 to 2^3. Once the 4-bit digital output vector is constructed, an artificial neural network calculation may be performed by performing a nonlinear transformation on the constructed 4-bit digital output vector to generate a converted 4-bit digital output vector, and the converted 4-bit digital output vector may be stored in the storage unit 120.
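As a non-limiting illustration, the sequence described above (bit decomposition, one matrix multiplication per 1-bit vector, comparison against a threshold, and weighted reconstruction) can be sketched as follows. All names, the threshold value of 0.5, and the choice to output 0 when the voltage equals the threshold are assumptions made for illustration. The optical matrix multiplication is modeled as an ordinary real-valued matrix-vector product.

```python
from typing import List, Sequence


def matvec(matrix: Sequence[Sequence[float]], vector: Sequence[float]) -> List[float]:
    """Stand-in for the optical matrix multiplication (plain matrix-vector product)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def comparator(voltages: Sequence[float], threshold: float) -> List[int]:
    """Output 1 when a voltage exceeds the threshold, otherwise 0 (ties assumed to give 0)."""
    return [1 if v > threshold else 0 for v in voltages]


def one_bit_pipeline(U: Sequence[Sequence[float]], V: Sequence[int],
                     n_bits: int = 4, threshold: float = 0.5) -> List[int]:
    """Process V one bit plane at a time and rebuild an n_bits-wide output vector."""
    output = [0] * len(U)
    for k in range(n_bits):
        plane = [(v >> k) & 1 for v in V]      # 1-bit input vector V_bit<k>
        bits = comparator(matvec(U, plane), threshold)
        output = [acc + (b << k) for acc, b in zip(output, bits)]  # weight by 2**k
    return output


U_identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
result = one_bit_pipeline(U_identity, [5, 12, 3, 9])
print(result)
```

With an identity weight matrix, each comparator output equals the corresponding input bit, so the reconstructed vector equals the input. For a general matrix U, the comparator quantizes each partial result to a single bit before the weights 2^0 to 2^3 are applied.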

Alternatively (or additionally), in some embodiments, each of the 4 digital 1-bit optical outputs may be non-linearly transformed. For example, a step-function nonlinearity may be used for the nonlinear transformation. A converted 4-bit digital output vector can then be constructed from the non-linearly transformed digital 1-bit optical outputs.

While a separate artificial neural network computing system 400 has been shown and described, in general, the artificial neural network computing system 100 of FIG. 1A may be designed to implement functions similar to the artificial neural network computing system 400. For example, DAC unit 130 may include a 1-bit DAC subunit configured to generate a 1-bit modulator control signal, and ADC unit 160 may be designed to have a resolution of 1 bit. Such a 1-bit ADC may be similar to or effectively identical to a comparator.

Furthermore, while the operation of an artificial neural network computing system having a 1-bit internal resolution has been described, in general, the internal resolution of an artificial neural network computing system may be reduced to an intermediate level below the N-bit resolution of the input data set. For example, the internal resolution may be reduced to 2^Y bits, where Y is an integer greater than or equal to 0.
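As a non-limiting illustration, decomposition to an intermediate internal resolution of 2^Y bits can be sketched as follows. The function names and example values are hypothetical. The sketch assumes that each N-bit element is split into consecutive groups of 2^Y bits, starting from the least significant bits.

```python
from typing import List


def decompose_into_chunks(vector: List[int], n_bits: int, y: int) -> List[List[int]]:
    """Split each n_bits-wide element into groups of 2**y bits, least significant first."""
    width = 2 ** y
    mask = (1 << width) - 1
    n_chunks = -(-n_bits // width)  # ceiling division
    for value in vector:
        if not 0 <= value < 2 ** n_bits:
            raise ValueError(f"{value} does not fit in {n_bits} bits")
    return [[(value >> (c * width)) & mask for value in vector] for c in range(n_chunks)]


def recombine_chunks(chunks: List[List[int]], y: int) -> List[int]:
    """Rebuild each element by weighting chunk c with 2**(c * 2**y)."""
    width = 2 ** y
    length = len(chunks[0])
    return [sum(chunk[i] << (c * width) for c, chunk in enumerate(chunks))
            for i in range(length)]


chunks = decompose_into_chunks([182, 7], n_bits=8, y=1)  # 2-bit internal resolution
print(chunks)
print(recombine_chunks(chunks, y=1))
```

With Y = 0 this reduces to the 1-bit case described above. With Y = 1, an 8-bit element is processed as four 2-bit values.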

Embodiments of the subject matter and the functional operations described in this disclosure can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this disclosure and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this disclosure can be implemented using one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, a data processing apparatus. The computer-readable medium may be a manufactured product (e.g., a hard disk drive in a computer system or an optical disc sold through retail channels) or an embedded system. The computer-readable medium may be acquired separately and later encoded with the one or more modules of computer program instructions, for example, by delivery of the one or more modules of computer program instructions over a wired or wireless network. The computer-readable medium may be a machine-readable storage device, a machine-readable storage substrate, a memory device, or a combination of one or more of them.

A computer program (also known as a program, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages and declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this disclosure can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special-purpose logic circuitry, e.g., a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).

While this disclosure contains many implementation details, these should not be construed as limitations on the scope of the disclosure or of the claims, but rather as descriptions of specific features of specific embodiments of the disclosure. Certain features that are described in this disclosure in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described herein should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated in a single software product or packaged into multiple software products.

Thus, particular embodiments of the present disclosure have been described. Other embodiments are within the scope of the following claims. In addition, the actions recited in the claims can be performed in a different order and still achieve desirable results. For example, the optical matrix multiplication unit 150 in fig. 1A includes an optical interference unit 154 that includes a plurality of interconnected Mach-Zehnder interferometers. In some embodiments, the optical interference unit may be implemented using one-, two-, or three-dimensional passive diffractive optical elements that consume little power. Compared with an optical interference unit including Mach-Zehnder interferometers, an optical interference unit using passive diffractive optical elements may have a smaller size if the number of inputs/outputs remains unchanged, or may process a larger number of inputs/outputs for the same chip size. Passive diffractive optical elements can be manufactured at lower cost than Mach-Zehnder interferometers.

Referring to fig. 5, in some embodiments, the artificial neural network computing system 500 includes a controller 110, a storage unit 120, a DAC unit 506, an optical processor 504, and an ADC unit 160. The storage unit 120 and the ADC unit 160 are similar to the corresponding components of the system 100 in fig. 1A. The optical processor 504 is configured to perform matrix calculations using optical components. In system 500, the weights of the optical matrix multiplication unit 502 are fixed. The DAC unit 506 is similar to the first DAC subunit 132 of system 100 of fig. 1A.

In an example operation of the artificial neural network computing system 500, the computer 102 may issue an artificial neural network computing request to the artificial neural network computing system 500. The artificial neural network computation request may include an input data set to be processed by the provided artificial neural network. The controller 110 receives the artificial neural network calculation request and stores the input data set in the storage unit 120.

In some embodiments, a hybrid approach is used in which one portion of the optical matrix multiplication unit 150 includes Mach-Zehnder interferometers and another portion of the optical matrix multiplication unit 150 includes passive diffractive components.

The internal operation of the artificial neural network computing system 500 will now be described. The optical processor 504 includes a laser unit 142, a modulator array 144, a detection unit 146, and an optical matrix multiplication (OMM) unit 502. The laser unit 142, modulator array 144, and detection unit 146 are similar to the corresponding components of the system 100 in fig. 1A. In this example, the optical matrix multiplication unit 502 includes a two-dimensional diffractive optical element, and may be implemented as a passive integrated silicon photonic chip. The optical matrix multiplication unit 502 may be configured to implement a diffractive neural network, and may perform matrix multiplication with almost zero power consumption.

The optical processor 504 operates by encoding a digital input vector of length N onto an optical input vector of length N and propagating the optical input vector through the optical matrix multiplication unit 502. The optical matrix multiplication unit 502 receives an optical input vector of length N, and performs N×N matrix multiplication on the received optical input vector in the optical domain. The N×N matrix multiplication performed by the optical matrix multiplication unit 502 is determined by the internal configuration of the optical matrix multiplication unit 502. The internal configuration of the optical matrix multiplication unit 502 includes the size, location, and geometry of the diffractive optical elements, as well as the doping of impurities, if any.
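As a non-limiting illustration, a matrix multiplication whose matrix is fixed by the internal configuration of the unit can be sketched as follows. The class name `PassiveOMMUnit`, its method names, and the example matrix are hypothetical. The sketch uses real-valued amplitudes and omits modulation, optical loss, and detection.

```python
from typing import List, Sequence, Tuple


class PassiveOMMUnit:
    """Toy model of a passive unit: the N x N matrix is set once and cannot be changed."""

    def __init__(self, matrix: Sequence[Sequence[float]]) -> None:
        n = len(matrix)
        if n == 0 or any(len(row) != n for row in matrix):
            raise ValueError("matrix must be square and non-empty")
        # Stored as nested tuples so the weights stay fixed after construction.
        self._matrix: Tuple[Tuple[float, ...], ...] = tuple(
            tuple(float(w) for w in row) for row in matrix
        )

    @property
    def size(self) -> int:
        return len(self._matrix)

    def propagate(self, optical_input: Sequence[float]) -> List[float]:
        """Return the length-N output for a length-N optical input vector."""
        if len(optical_input) != self.size:
            raise ValueError(f"expected an input vector of length {self.size}")
        return [sum(w * x for w, x in zip(row, optical_input)) for row in self._matrix]


unit = PassiveOMMUnit([[0.5, 0.5], [0.5, -0.5]])
output = unit.propagate([1.0, 0.0])
print(output)
```

Implementing a different matrix in this model requires constructing a different unit, which mirrors the fixed weights of the optical matrix multiplication unit 502.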

The optical matrix multiplication unit 502 may be implemented in various ways. Fig. 6 shows a diagram of an example of an optical matrix multiplication unit 502 using a two-dimensional array of diffraction elements. The optical matrix multiplication unit 502 may include an array of input waveguides 602 to receive an optical input vector, a two-dimensional optical interference unit 600 in optical communication with the array of input waveguides 602, and an array of output waveguides 604 in optical communication with the optical interference unit 600. The optical interference unit 600 includes a plurality of diffractive optical components and performs a conversion (e.g., a linear transformation) of the optical input vector into a second array of optical signals. The array of output waveguides 604 guides the second array of optical signals output by the optical interference unit 600. At least one input waveguide of the array of input waveguides 602 is in optical communication with each output waveguide of the array of output waveguides 604 through the optical interference unit 600. For example, for an optical input vector of length N, the optical matrix multiplication unit 502 may include N input waveguides 602 and N output waveguides 604.

In some embodiments, optical interference unit 600 includes a substrate with diffraction elements arranged in two dimensions (e.g., in a 2D array). For example, a plurality of circular holes may be drilled or etched in the substrate. The size of these holes may be on the order of the wavelength of the input light, such that the light is diffracted by the holes (or by the structures defining the holes). For example, the size of the holes may be in the range of 100 nm to 2 μm. The holes may be of the same or different sizes. The holes may also have other cross-sectional shapes, such as triangular, square, rectangular, hexagonal, or irregular shapes. The substrate may be made of a material that is transparent or translucent to the input light, for example having a transmittance of 1% to 99% with respect to the input light. For example, the substrate may be made of silicon, silicon oxide, silicon nitride, quartz, crystals (e.g., lithium niobate (LiNbO3)), III-V materials (e.g., gallium arsenide or indium phosphide), erbium-modified semiconductors, or polymers.

In some embodiments, a holographic method may be used to form a two-dimensional diffractive optical element in a substrate. The substrate may be made of glass, crystal, or a photorefractive material.

When designing the optical matrix multiplication unit 502, the dimensions and positions of the diffraction elements are considered in two dimensions (e.g., the X-direction and the Y-direction), without regard to the relative positions of the diffraction elements in a third dimension (e.g., the Z-direction). Each diffraction element may be a three-dimensional structure formed in the substrate, such as a hole with a depth, a column, or a stripe.

In fig. 6, the diffractive optical elements are represented by circles. The diffractive optical elements may also have other shapes, such as triangular, square, rectangular, or irregular shapes, and can have various dimensions. The diffractive optical elements do not have to be located on grid points; their positions can vary. The diagram in fig. 6 is for illustration purposes only, and the actual diffractive optical elements may differ from those shown in the figure. Different arrangements of diffractive optical elements may be used to implement different matrix calculations, e.g., different matrix multiplication functions.

An optimization process may be used to determine the configuration of the diffractive optical element. For example, the substrate may be divided into an array of pixels, and each pixel may be filled with substrate material (no hole) or filled with air (hole). The configuration of the pixels may be modified iteratively, and for each configuration of pixels, a simulation may be performed by passing light through the diffractive optical element and evaluating the output. After simulating all possible configurations of pixels, the configuration that provides the result closest to the desired matrix processing is selected as the diffractive optical element configuration of the optical matrix multiplication unit 502.
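As a non-limiting illustration, the exhaustive search over pixel configurations can be sketched as follows. The table `PIXEL_EFFECTS` and the function `simulate` are hypothetical placeholders: a practical design flow would use an electromagnetic simulation of light passing through the diffractive optical element, whereas this toy model assumes each hole adds a fixed contribution to a 2×2 transfer matrix.

```python
from itertools import product
from typing import List, Sequence, Tuple

# Hypothetical contribution of each pixel to a 2 x 2 transfer matrix when that
# pixel is a hole (1). A pixel filled with substrate material (0) adds nothing.
PIXEL_EFFECTS = (
    ((0.3, 0.0), (0.0, 0.0)),
    ((0.0, 0.2), (0.0, 0.0)),
    ((0.0, 0.0), (0.4, 0.0)),
    ((0.0, 0.0), (0.0, 0.1)),
)


def simulate(pixels: Sequence[int]) -> List[List[float]]:
    """Toy stand-in for an optical simulation of one pixel configuration."""
    matrix = [[0.0, 0.0], [0.0, 0.0]]
    for is_hole, effect in zip(pixels, PIXEL_EFFECTS):
        if is_hole:
            for i in range(2):
                for j in range(2):
                    matrix[i][j] += effect[i][j]
    return matrix


def mismatch(matrix: Sequence[Sequence[float]], target: Sequence[Sequence[float]]) -> float:
    """Sum of squared differences between the simulated and desired matrices."""
    return sum((matrix[i][j] - target[i][j]) ** 2 for i in range(2) for j in range(2))


def exhaustive_search(target: Sequence[Sequence[float]]) -> Tuple[int, ...]:
    """Simulate every hole/no-hole configuration and keep the closest match."""
    candidates = product((0, 1), repeat=len(PIXEL_EFFECTS))
    return min(candidates, key=lambda pixels: mismatch(simulate(pixels), target))


best = exhaustive_search([[0.25, 0.05], [0.35, 0.0]])
print(best)
```

The number of configurations doubles with each added pixel, so a search of this kind is practical only for small pixel arrays.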

As another example, the diffraction elements are initially configured as an array of holes. The locations, sizes, and shapes of the holes may then be varied slightly from this initial configuration. The parameters of each hole may be iteratively adjusted, and simulations may be performed to find an optimal configuration of the holes.

In some embodiments, a machine learning process is used to design the diffractive optical element. An analytic function describing how the pixels affect the input light to produce the output light is determined, and an optimization process (e.g., gradient descent) is used to determine the optimal configuration of the pixels.
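As a non-limiting illustration, a gradient-descent search over continuous pixel parameters can be sketched as follows. The linear model `forward`, the sensitivity matrix, the learning rate, and the restriction of each pixel parameter to the range 0 to 1 are assumptions made for illustration. A practical analytic function relating pixels to output light would generally be nonlinear.

```python
from typing import List, Sequence


def forward(sensitivity: Sequence[Sequence[float]], pixels: Sequence[float]) -> List[float]:
    """Assumed analytic model: the output depends linearly on the pixel parameters."""
    return [sum(a * p for a, p in zip(row, pixels)) for row in sensitivity]


def gradient(sensitivity: Sequence[Sequence[float]], pixels: Sequence[float],
             target: Sequence[float]) -> List[float]:
    """Gradient of the squared error sum((output - target)**2) with respect to each pixel."""
    residual = [y - t for y, t in zip(forward(sensitivity, pixels), target)]
    return [2.0 * sum(sensitivity[i][k] * residual[i] for i in range(len(residual)))
            for k in range(len(pixels))]


def optimize_pixels(sensitivity: Sequence[Sequence[float]], target: Sequence[float],
                    steps: int = 500, learning_rate: float = 0.1) -> List[float]:
    """Projected gradient descent: step downhill, then clip each pixel to [0, 1]."""
    pixels = [0.5] * len(sensitivity[0])
    for _ in range(steps):
        grad = gradient(sensitivity, pixels, target)
        pixels = [min(1.0, max(0.0, p - learning_rate * g)) for p, g in zip(pixels, grad)]
    return pixels


solution = optimize_pixels([[1.0, 0.5], [0.0, 1.0]], [0.8, 0.6])
print([round(p, 6) for p in solution])
```

For this small example the pixel parameters converge to approximately 0.5 and 0.6, which reproduce the target output under the assumed model.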

In some embodiments, the optical matrix multiplication unit 502 may be implemented as a user-changeable component, and different optical matrix multiplication units 502 with different optical interference units 600 may be installed for different applications. For example, system 500 may be configured as an optical character recognition system, and optical interference unit 600 may be configured to implement a neural network for performing optical character recognition. For example, a first optical matrix multiplication unit may have a first optical interference unit comprising passive diffractive optical components configured to implement a first neural network for an optical character recognition engine for a first set of written languages and fonts. A second optical matrix multiplication unit may have a second optical interference unit comprising passive diffractive optical components configured to implement a second neural network for an optical character recognition engine for a second set of written languages and fonts, and so on. When a user wants to apply optical character recognition to the first set of written languages and fonts using the system 500, the user can insert the first optical matrix multiplication unit into the system. When the user wants to apply optical character recognition to the second set of written languages and fonts using the system 500, the user can swap out the first optical matrix multiplication unit and insert the second optical matrix multiplication unit into the system.

For example, the system 500 may be configured as a speech recognition system, and the optical interference unit 600 may be configured to implement a neural network for performing speech recognition. For example, a first optical matrix multiplication unit may have a first optical interference unit comprising passive diffractive optical components configured to implement a first neural network for a speech recognition engine for a first spoken language. A second optical matrix multiplication unit may have a second optical interference unit comprising passive diffractive optical components configured to implement a second neural network for a speech recognition engine for a second spoken language, and so on. When a user wants to use the system 500 to recognize speech in the first spoken language, the user can insert the first optical matrix multiplication unit into the system. When the user wants to use the system 500 to recognize speech in the second spoken language, the user can swap out the first optical matrix multiplication unit and insert the second optical matrix multiplication unit into the system.

For example, the system 500 may be part of a control unit of an autonomous vehicle, and the optical interference unit 600 may be configured to implement a neural network for performing road condition recognition. For example, a first optical matrix multiplication unit may have a first optical interference unit comprising passive diffractive optical elements configured to implement a first neural network for recognizing road conditions (including road signs) in the United States. A second optical matrix multiplication unit may have a second optical interference unit comprising passive diffractive optical elements configured to implement a second neural network for recognizing road conditions (including road signs) in Canada. A third optical matrix multiplication unit may have a third optical interference unit comprising passive diffractive optical elements configured to implement a third neural network for recognizing road conditions (including road signs) in Mexico, and so on. When the autonomous vehicle is used in the United States, the first optical matrix multiplication unit is inserted into the system. When the autonomous vehicle crosses the border and enters Canada, the first optical matrix multiplication unit is swapped out and the second optical matrix multiplication unit is inserted into the system. On the other hand, when the autonomous vehicle crosses the border and enters Mexico, the first optical matrix multiplication unit is swapped out and the third optical matrix multiplication unit is inserted into the system.

For example, system 500 may be used for gene sequencing. DNA sequences may be classified using a convolutional neural network implemented using a system 500 that includes passive diffractive optical components. For example, the system 500 may implement a neural network for distinguishing tumor types, predicting tumor grade, and predicting patient survival from gene expression patterns. For example, system 500 may implement a neural network for identifying a subset of genes or features that are most predictive of the analyzed characteristics. For example, system 500 may implement a neural network for predicting or inferring the expression levels of all genes from a profile of a subset of genes. For example, system 500 may implement a neural network for epigenomic analyses, such as predicting transcription factor binding sites, enhancer regions, and chromatin accessibility from gene sequences. For example, system 500 may implement a neural network for capturing structures within a gene sequence.

For example, the system 500 may be configured as a medical diagnostic system, and the optical matrix multiplication unit 502 may be configured to implement a neural network for analyzing physiological parameters to perform screening for disease. For example, system 500 may be configured as a bacterial detection system, and the optical matrix multiplication unit 502 may be configured to implement a multiplication function for analyzing DNA sequences to detect certain bacterial strains.

In some embodiments, optical matrix multiplication unit 502 includes a housing (e.g., a cassette) that protects a substrate having diffractive optical components. The housing supports an input interface coupled to the input waveguides 602 and an output interface coupled to the output waveguides 604. The input interface is configured to receive the output from the modulator array 144, and the output interface is configured to transmit the output of the optical matrix multiplication unit 502 to the detection unit 146. The optical matrix multiplication unit 502 may be designed as a module suitable for handling by ordinary consumers, allowing a user to easily switch from one optical matrix multiplication unit 502 to another. As machine learning techniques improve over time, a user may upgrade the system 500 by swapping out an old optical matrix multiplication unit 502 and inserting a new, upgraded version.

Similar to the way an optical compact disc can store digital information that can be retrieved by a CD player, an optical matrix multiplication unit can store a neural network configuration that can be used in an optical processor. Just as optical compact discs are low-cost media for distributing digital information (including audio, video, and software programs) to consumers, optical matrix multiplication units may be low-cost media for distributing pre-configured neural networks or matrix processing functions (e.g., multiplication, convolution, or any other linear operation) to consumers.

In some embodiments, system 500 is an optical computing platform configured to operate with optical matrix multiplication units provided by different companies. This allows different companies to develop different passive optical neural networks for various applications. The passive optical neural networks are sold to end users in standardized packages that can be installed in the optical computing platform to allow the system 500 to perform various intelligent functions.

In some embodiments, the system may have a holder mechanism for supporting a plurality of optical matrix multiplication units 502, and a mechanical handling mechanism may be provided for automatically swapping the optical matrix multiplication units 502. The system determines which optical matrix multiplication unit 502 is needed for the current application and uses the mechanical handling mechanism to automatically take the appropriate optical matrix multiplication unit from the holder mechanism and insert it into the optical processor 504.

For an optical chip of a particular size, more passive diffraction components can be placed on the substrate than if active interferometers (e.g., Mach-Zehnder interferometers) were used. For example, optical interference unit 154 in fig. 1B using Mach-Zehnder interferometers may be configured to handle 200×200 matrix multiplication, while optical interference unit 600 having the same overall dimensions and using passive diffraction components (each having dimensions of about 100 nm × 100 nm) may be configured to handle 5000×5000 matrix multiplication.
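As a non-limiting illustration, the difference between the two example matrix sizes can be expressed in terms of the number of matrix elements. The helper name `matrix_elements` is hypothetical, and the comparison uses only the example figures given above.

```python
def matrix_elements(n: int) -> int:
    """Number of entries in an n x n matrix."""
    return n * n


mzi_elements = matrix_elements(200)            # interferometer-based example
diffractive_elements = matrix_elements(5000)   # passive diffractive example
ratio = diffractive_elements // mzi_elements
print(mzi_elements, diffractive_elements, ratio)
```

For these example figures, the passive diffractive unit represents 625 times as many matrix elements within the same overall dimensions.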

The passive diffractive optical components consume little power, so the optical matrix multiplication unit 502 can be used in low-power devices, such as battery-operated devices. The optical matrix multiplication unit 502 is suitable for edge computing. For example, the optical matrix multiplication unit 502 may be used in a smart sensor, where raw data from the sensor is processed by an optical processor that uses the optical matrix multiplication unit 502. The smart sensor may be configured to transmit the processed data to a central computer server, thereby reducing the amount of raw data transmitted to the central computer server. By placing intelligent processing functions on the smart sensors, faults and anomalies can be detected earlier and handled more efficiently. The optical matrix multiplication unit 502 is suitable for applications that require large matrix multiplications, and for applications in which the neural network has already been trained, the weights have been determined, and no modification is required.

The substrate in which the diffractive optical element is formed may be planar or curved. In the example of fig. 6, input light enters the optical interference unit 600 from the left side, and output light exits the optical interference unit 600 from the right side (the terms "left", "right", "upper" and "lower" refer to directions shown in the drawings). In some embodiments, the passive diffractive optical elements may be configured such that some of the output light exits the optical interference unit 600 from the upper or lower side, or from any combination of the left, right, upper, and lower sides. The substrate for the optical interference unit 600 may have various shapes, such as square, rectangular, triangular, circular, or elliptical. The optical interference unit 600 may include a reflective component or mirror to redirect the direction of light propagation.

In some embodiments, the artificial neural network computing system 500 may be modified by adding an analog nonlinear unit 310 between the detection unit 146 and the ADC unit 160. The analog nonlinear unit 310 is configured to receive the output voltage from the detection unit 146, apply a nonlinear transfer function, and output the converted output voltage to the ADC unit 160. The controller 110 may obtain a converted digital output voltage corresponding to the converted output voltage from the ADC unit 160. Since the digital output voltage obtained from the ADC unit 160 has already been non-linearly transformed ("activated"), the nonlinear transformation step of the controller 110 may be omitted, thereby reducing the computational burden of the controller 110. Next, the converted digital output voltages obtained directly from the ADC unit 160 may be stored in the storage unit 120 as a first converted digital output vector.
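As a non-limiting illustration, the signal chain with an analog nonlinear stage ahead of the converter can be sketched as follows. The rectifier-type transfer function, the ideal uniform quantizer, the 4-bit resolution, and the 1.0 V reference are assumptions made for illustration and are not properties of the analog nonlinear unit 310 or the ADC unit 160.

```python
from typing import List, Sequence


def analog_nonlinear_unit(voltages: Sequence[float]) -> List[float]:
    """Assumed transfer function: pass positive voltages, block negative ones."""
    return [max(0.0, v) for v in voltages]


def adc(voltages: Sequence[float], n_bits: int, v_ref: float) -> List[int]:
    """Idealized converter: clamp to [0, v_ref], then map to codes 0 .. 2**n_bits - 1."""
    max_code = 2 ** n_bits - 1
    codes = []
    for v in voltages:
        clamped = min(max(v, 0.0), v_ref)
        codes.append(round(clamped / v_ref * max_code))
    return codes


detected = [-0.2, 0.25, 0.8]  # example detector output voltages (assumed values)
activated_codes = adc(analog_nonlinear_unit(detected), n_bits=4, v_ref=1.0)
print(activated_codes)
```

Because the nonlinearity is applied before conversion, the digital codes can be stored directly without a further activation step in the controller.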

The optical interference unit may be implemented using passive diffractive optical components arranged in three dimensions. Referring to fig. 7, in some embodiments, an artificial neural network computing system 700 has an optical processor 702 that includes a three-dimensional OMM unit 708. The system 700 includes a storage unit 120 and an ADC unit 160, which are similar to the corresponding components of the system 500 in fig. 5. The optical processor 702 is configured to perform matrix calculations using diffractive optical elements arranged in three dimensions.

The optical processor 702 includes a laser unit 704 configured to output a two-dimensional array of light beams 714, and a two-dimensional modulator array 706 configured to modulate the two-dimensional array of light beams 714 to produce a modulated two-dimensional array of light beams 716. The optical processor 702 includes an optical matrix multiplication (OMM) unit 708 that has diffractive optical elements arranged in three dimensions and is configured to process the modulated two-dimensional array of light beams 716 and produce a two-dimensional array of output light beams 718. The optical processor 702 includes a detection unit 710 having a two-dimensional array of light sensors to detect the two-dimensional array of output light beams 718. The ADC unit 160 converts the output of the detection unit 710 into a digital signal.

For example, the 3D optical matrix multiplication unit 708 may be implemented as a passive integrated silicon photonic column or cube. The optical matrix multiplication unit 708 may be configured to implement a diffractive neural network and may perform matrix multiplication with almost zero power consumption.

There are many ways to encode input data for use by the optical processor 702. For example, a digital input vector of length n×n may be encoded onto an optical input matrix of size n×n, which propagates through the optical matrix multiplication unit 708. The optical matrix multiplication unit 708 performs an (n×n)×(n×n) matrix multiplication on the received optical input matrix in the optical domain. The (n×n)×(n×n) matrix multiplication performed by the optical matrix multiplication unit 708 is determined by the internal configuration of the optical matrix multiplication unit 708, including the size, position, and geometry of the diffractive optical elements arranged in three dimensions, and the doping of impurities, if any.
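The encoding described above can be sketched numerically as follows. This is only an illustrative model: the row-major layout, the helper names, and the permutation weights are assumptions chosen for the example, and the optical propagation is replaced by an ordinary matrix-vector product.

```python
def encode_as_grid(vector, n):
    # Arrange a length n*n vector as an n x n grid (row-major order is assumed).
    assert len(vector) == n * n
    return [vector[r * n:(r + 1) * n] for r in range(n)]

def flatten(grid):
    return [value for row in grid for value in row]

def apply_omm(weights, grid):
    # Multiply the flattened n*n input by an (n*n) x (n*n) matrix,
    # then arrange the result as an n x n output grid.
    n = len(grid)
    x = flatten(grid)
    y = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return encode_as_grid(y, n)

n = 2
digital_input = [1.0, 2.0, 3.0, 4.0]
# Hypothetical fixed weights: a permutation that reverses the flattened order.
weights = [[1.0 if c == n * n - 1 - r else 0.0 for c in range(n * n)]
           for r in range(n * n)]
output_grid = apply_omm(weights, encode_as_grid(digital_input, n))
```

In a passive unit the `weights` matrix would be fixed by the physical structure rather than supplied at run time.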

The optical matrix multiplication unit 708 may be implemented in various ways. Fig. 8 shows a diagram of an example of an optical matrix multiplication unit 708 using a three-dimensional arrangement of diffraction elements. The optical matrix multiplication unit 708 may include an input waveguide matrix for receiving the optical input matrix 802, a three-dimensional optical interference unit 804 in optical communication with the input waveguide matrix, and an output waveguide matrix in optical communication with the optical interference unit 804 for providing an optical output matrix 806. The optical interference unit 804 includes a plurality of diffractive optical components, and performs conversion (e.g., linear transformation) of an optical input (e.g., an n×n vector or matrix) to an optical output (e.g., an n×n vector or matrix). The output waveguide matrix guides the optical signal output by the optical interference unit 804. At least one input waveguide of the matrix of input waveguides is in optical communication with each output waveguide of the matrix of output waveguides by an optical interference unit 804. For example, for an optical input vector of length n×n, the optical matrix multiplication unit 708 may include n×n input waveguides and n×n output waveguides.

In some embodiments, the optical interference unit 804 includes a substrate block with diffraction elements arranged in three dimensions (e.g., in a 3D matrix). For example, a plurality of holes may be drilled or etched in each of a plurality of substrate slices, and the plurality of substrate slices may be combined to form a substrate block. The size of these holes may be on the order of the size of the wavelength of the input light such that the light is diffracted by the holes (or structures defining the holes). The holes may be of the same or different sizes. The holes may also have other cross-sectional shapes, such as triangular, square, rectangular, hexagonal or irregular shapes. In some embodiments, holographic methods may be used to form a three-dimensional diffractive optical element in the entire substrate block. The substrate may be made of a material transparent or translucent to the input light, for example having a transmission rate of 1% to 99% with respect to the input light.

The dimensions and positions of the diffraction elements in the x, y, and z directions are taken into account when designing the optical matrix multiplication unit 708. An optimization process may be used to determine the configuration of the diffractive optical elements. For example, the substrate block may be divided into a three-dimensional matrix of pixels, and each pixel may be filled with substrate material (no hole) or filled with air (hole). The configuration of the pixels may be modified iteratively, and for each configuration of pixels, a simulation may be performed by passing light through the diffractive optical elements and evaluating the output. After simulating all possible configurations of pixels, the configuration that provides the result closest to the desired matrix processing is selected as the diffractive optical element configuration of the optical matrix multiplication unit 708.
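The exhaustive search over pixel configurations may be sketched as below. The forward model `simulate` is a deliberately trivial stand-in (an assumption for illustration); an actual design flow would call an electromagnetic solver at that point. The per-pixel weights and the target values are likewise hypothetical.

```python
from itertools import product

WEIGHTS = (0.5, 0.25, 0.125, 0.0625)  # hypothetical per-pixel contributions

def simulate(config):
    # Stand-in forward model; config[i] is 1 for a pixel filled with
    # substrate material and 0 for a pixel filled with air (a hole).
    out_a = sum(bit * w for bit, w in zip(config, WEIGHTS))
    out_b = sum((1 - bit) * w for bit, w in zip(config, WEIGHTS))
    return (out_a, out_b)

def error(output, target):
    # Squared distance between the simulated and the desired output.
    return sum((o - t) ** 2 for o, t in zip(output, target))

def exhaustive_search(num_pixels, target):
    # Simulate every binary configuration and keep the closest one.
    return min(product((0, 1), repeat=num_pixels),
               key=lambda cfg: error(simulate(cfg), target))

target = (0.625, 0.3125)
best_config = exhaustive_search(4, target)
```

The number of configurations grows as 2 to the power of the pixel count, so this brute-force form is practical only for very small pixel grids.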

As another example, the diffraction element is initially configured as a three-dimensional matrix of holes. The location, size and shape of the holes may be slightly different from their original configuration. Parameters of each hole may be iteratively adjusted and simulations may be performed to find an optimal configuration of holes.
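One possible reading of this iterative adjustment is a simple coordinate search that perturbs one hole parameter at a time and keeps any change that reduces the simulated error. The sketch below assumes a made-up scalar forward model `simulate` and a step-halving schedule; neither is specified in the disclosure.

```python
def simulate(radii):
    # Hypothetical forward model mapping hole radii to one scalar response.
    return sum((i + 1) * r * r for i, r in enumerate(radii))

def refine(radii, target, step=0.05, min_step=1e-4):
    radii = list(radii)
    best = abs(simulate(radii) - target)
    while step > min_step:
        improved = False
        for i in range(len(radii)):
            for delta in (step, -step):
                trial = radii[:]
                trial[i] = max(0.0, trial[i] + delta)  # radii cannot be negative
                err = abs(simulate(trial) - target)
                if err < best:
                    radii, best, improved = trial, err, True
        if not improved:
            step /= 2  # no single adjustment helped; try finer adjustments
    return radii, best

initial_radii = [0.2, 0.2, 0.2]
target = 0.30
initial_error = abs(simulate(initial_radii) - target)
final_radii, final_error = refine(initial_radii, target)
```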

In some embodiments, a machine learning process is used to design the three-dimensional diffractive optical elements. An analytical function describing how the pixels affect the input light is determined, and a gradient descent method is used to determine the optimal configuration of the pixels.
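A minimal sketch of gradient-based design follows. It assumes that each pixel is relaxed to a continuous fill fraction between 0 and 1 and that the analytical function is a linear sensitivity model `A`; both are simplifying assumptions for illustration only, as are the target values and the learning rate.

```python
A = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]   # hypothetical sensitivity of 2 outputs to 3 pixels
TARGET = [0.9, 0.6]     # hypothetical desired outputs

def forward(p):
    return [sum(a * x for a, x in zip(row, p)) for row in A]

def loss(p):
    return sum((o - t) ** 2 for o, t in zip(forward(p), TARGET))

def gradient(p):
    # Analytical gradient of the squared error: 2 * A^T (A p - target).
    residual = [o - t for o, t in zip(forward(p), TARGET)]
    return [2.0 * sum(A[j][i] * residual[j] for j in range(len(A)))
            for i in range(len(p))]

def gradient_descent(p, rate=0.1, steps=200):
    for _ in range(steps):
        g = gradient(p)
        # Keep each fill fraction between 0 (air) and 1 (substrate).
        p = [min(1.0, max(0.0, x - rate * gx)) for x, gx in zip(p, g)]
    return p

initial = [0.5, 0.5, 0.5]
optimized = gradient_descent(initial)
```

A fabricated device would still need the continuous values mapped back to realizable structures, which this sketch does not attempt.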

In some embodiments, the optical matrix multiplication unit 708 may be implemented as a user-replaceable component, and different optical matrix multiplication units 708 with different optical interference units 804 may be installed for different applications. For example, the system 700 may be configured as a medical diagnostic system, and the optical interference unit 804 may be configured to implement a neural network for analyzing physiological parameters to perform screening for disease. For example, a first optical matrix multiplication unit may have a first optical interference unit comprising 3D passive diffractive optical components configured to implement a first neural network for screening for a first set of diseases. A second optical matrix multiplication unit may have a second optical interference unit comprising 3D passive diffractive optical components configured to implement a second neural network for screening for a second set of diseases, and so on. The first and second optical matrix multiplication units may be developed by different companies that specialize in techniques for screening for different diseases. When a user wants to use the system 700 to screen for the first set of diseases, the user may insert the first optical matrix multiplication unit into the system. When the user wants to use the system 700 to screen for the second set of diseases, the user can swap out the first optical matrix multiplication unit and insert the second optical matrix multiplication unit into the system.

For example, the system 700 may be configured as an optical character recognition system, and the optical interference unit 804 may be configured to implement a neural network for performing optical character recognition. For example, the system 700 may be configured as a speech recognition system, and the optical interference unit 804 may be configured to implement a neural network for performing speech recognition. For example, the system 700 may be part of a control unit of an autonomous vehicle, and the optical interference unit 804 may be configured to implement a neural network for performing road condition recognition.

For example, system 700 may be used for gene sequencing. The DNA sequences may be classified using a convolutional neural network implemented using a system 700 that includes passive diffractive optical components. For example, system 700 can implement a neural network for differentiating tumor types, predicting tumor grade, and predicting patient survival from gene expression patterns. For example, the system 700 may implement a neural network for identifying a subset of genes or features that are most predictive of the analyzed characteristics. For example, system 700 may implement a neural network for predicting or inferring expression levels of all genes from a data map of a subset of genes. For example, system 700 may implement neural networks for epigenetic analysis, such as predicting transcription factor binding sites, enhancer regions, and chromatin accessibility from gene sequences. For example, system 700 may implement a neural network for capturing structures within a gene sequence. For example, system 700 may be configured as a bacterial detection system, and optical interference unit 804 may be configured to implement a multiplicative function for analyzing DNA sequences to detect certain bacterial strains.

In some embodiments, the light matrix multiplication unit 708 includes a housing (e.g., a cassette) that protects the substrate with the 3D diffractive optical components. The housing supports an input interface coupled to the input waveguide and an output interface coupled to the output waveguide. The input interface is configured to receive the output from the modulator array 706 and the output interface is configured to transmit the output of the optical matrix multiplication unit 708 to the detection unit 710. The light matrix multiplication units 708 may be designed as modules suitable for handling by ordinary consumers, allowing a user to easily switch from one light matrix multiplication unit 708 to another light matrix multiplication unit 708. Machine learning techniques improve over time. The user may upgrade the system 700 by swapping out the old optical matrix multiplication unit 708 and inserting a new upgrade version.

In some embodiments, system 700 is an optical computing platform configured to operate with light matrix multiplication units provided by different companies. This allows different companies to develop different 3D passive optical neural networks for various applications. The 3D passive optical neural network is sold to end users in standardized packages that can be installed in an optical computing platform to allow the system 700 to perform various intelligent functions.

In some embodiments, the system may have a holder mechanism for supporting a plurality of optical matrix multiplication units 708, and a mechanical handling mechanism may be provided for automatically swapping out the optical matrix multiplication units 708. The system determines which optical matrix multiplication unit 708 is needed for the current application and uses the mechanical handling mechanism to automatically take the appropriate optical matrix multiplication unit 708 from the holder mechanism and insert it into the optical processor 702.

In some embodiments, the artificial neural network computing system 700 may be modified by adding an analog nonlinear unit between the detection unit 710 and the ADC unit 160. The analog nonlinear unit is configured to receive the output voltage from the detection unit 710, apply a nonlinear transfer function, and output the converted output voltage to the ADC unit 160. The controller 110 may obtain a converted digital output voltage corresponding to the converted output voltage from the ADC unit 160. Since the digital output voltage obtained from the ADC unit 160 has been non-linearly transformed ("activated"), the non-linear transformation step of the controller 110 may be omitted, thereby reducing the computational burden of the controller 110. Next, the first converted voltage directly obtained from the ADC unit 160 may be stored in the storage unit 120 as a first converted digital output vector.

The optical interference unit may be implemented using passive diffractive optical components arranged in one dimension. Referring to fig. 9, in some embodiments, the artificial neural network computing system 900 has an optical processor 906 that includes a one-dimensional optical multiplication unit 916. The system 900 includes a storage unit 120 that is similar to the corresponding components of the system 100 in fig. 1A. The light processor 906 is configured to perform multiplication calculations using diffractive optical elements in a one-dimensional arrangement (along the light propagation axis).

The optical processor 906 includes a laser unit 908 configured to output a laser beam 910 and a modulator 912 configured to modulate the laser beam 910 to produce a modulated beam 914. The optical processor 906 includes a one-dimensional optical multiplication unit 916 having a one-dimensional arrangement of diffractive optical elements and is configured to process the modulated optical beam 914 and to generate an output optical beam 918. The light processor 906 comprises a detection unit 920, the detection unit 920 having a light sensor for detecting the output light beam 918. The output of the detection unit 920 is converted into a digital signal by the ADC unit 930.

For example, the optical multiplication unit 916 may be implemented as a passive integrated silicon photonic waveguide with diffractive optical components (e.g., gratings or holes). The optical multiplication unit 916 may be configured to perform multiplication operations with almost zero power consumption.

There are many ways to encode input data for use by the light processor 906. For example, the digital input vector may be encoded as an optical input propagating through the optical multiplication unit 916. The optical multiplication unit 916 performs multiplication on the received optical input in the optical domain. The multiplication performed by the optical multiplication unit 916 is determined by the internal configuration of the optical multiplication unit 916, including, for example, the size, position, and geometry of the diffractive optical element arranged in one dimension along the optical propagation path, and the doping of impurities (if any).

The optical multiplication unit 916 may be implemented in various ways. Fig. 10 shows a diagram of an example of an optical multiplication unit 916 using a one-dimensional arrangement of diffraction elements. The optical multiplication unit 916 may include an input waveguide for receiving the optical input 1002, a one-dimensional optical interference unit 1004 in optical communication with the input waveguide, and an output waveguide in optical communication with the optical interference unit 1004 for providing the optical output 1006. The optical interference unit 1004 includes a plurality of diffractive optical elements, and performs a transformation (e.g., a linear transformation) of the optical input to the optical output. The output waveguide guides the optical signal output by the optical interference unit 1004.

In some embodiments, the optical interference unit 1004 includes an elongated substrate having diffraction elements arranged in one dimension along an optical propagation path. For example, a plurality of holes may be drilled or etched in the substrate. The size of these holes may be on the order of the size of the wavelength of the input light such that the light is diffracted by the holes (or structures defining the holes). The holes may be of the same or different sizes. The substrate may be made of a material transparent or translucent to the input light, for example having a transmission rate of 1% to 99% with respect to the input light. In some embodiments, holographic methods may also be used to form diffractive optical elements in a substrate.

When designing the optical interference unit 1004, the size and position of the diffraction element along the propagation path of the light beam are considered. An optimization process may be used to determine the configuration of the diffractive optical element. For example, the substrate may be divided into a series of pixels, and each pixel may be filled with substrate material (no holes) or filled with air (holes). The configuration of the pixels may be modified iteratively and for each configuration of pixels, a simulation may be performed by passing light through the diffractive optical element and evaluating the output. After performing a simulation of all possible configurations of pixels, the configuration that provides the closest result to the desired multiplication process is selected as the diffractive optical element configuration of the optical interference unit 1004.

As another embodiment, the diffraction element is initially configured as a series of holes. The locations and sizes of the holes may be slightly different from their original configuration. Parameters of each hole may be iteratively adjusted and simulations may be performed to find an optimal configuration of holes.

In some embodiments, a machine learning process is used to design the one-dimensional diffractive optical elements. An analytical function describing how the pixels affect the input light is determined, and a gradient descent method is used to determine the optimal configuration of the pixels.

In some embodiments, the optical multiplication unit 916 may be implemented as a user-replaceable component, and different optical multiplication units 916 with different optical interference units 1004 may be installed for different applications. For example, the system 900 may be configured as a bacterial detection system, and the optical interference unit 1004 may be configured to implement a multiplication function for analyzing DNA sequences to detect certain bacterial strains. For example, a first optical multiplication unit may have a first optical interference unit comprising 1D passive diffractive optical components configured to implement a first multiplication function for detecting a first set of bacteria. A second optical multiplication unit may have a second optical interference unit comprising 1D passive diffractive optical components configured to implement a second multiplication function for detecting a second set of bacteria, and so on. The first and second optical multiplication units may be developed by different companies that specialize in techniques for detecting different bacteria. When a user wants to detect the first set of bacteria using the system 900, the user can insert the first optical multiplication unit into the system. When the user wants to detect the second set of bacteria using the system 900, the user can swap out the first optical multiplication unit and insert the second optical multiplication unit into the system. By using one-dimensional diffractive optical elements, the laser unit 908, the modulator 912, the detection unit 920, and the ADC unit 930 can be manufactured at low cost.

In some embodiments, the optical multiplication unit 916 includes a housing (e.g., a cassette) that protects the substrate with the 1D diffractive optical element. The housing supports an input interface coupled to the input waveguide and an output interface coupled to the output waveguide. The input interface is configured to receive the output from the modulator 912 and the output interface is configured to transmit the output of the optical multiplication unit 916 to the detection unit 920. The optical multiplication units 916 may be designed as modules suitable for processing by an average consumer, allowing a user to easily switch from one optical multiplication unit 916 to another optical multiplication unit 916. Machine learning techniques improve over time. The user may upgrade the system 900 by swapping out the old optical multiplication unit 916 and inserting a new upgrade version.

In some embodiments, system 900 is an optical computing platform configured to operate with optical multiplication units provided by different companies. This allows different companies to develop different 1D passive optical multiplication functions for various applications. The 1D passive optical multiplication functions are sold to end users in standardized packages that can be installed in an optical computing platform to allow the system 900 to perform various intelligent functions.

In some embodiments, the system may have a holder mechanism for supporting a plurality of optical multiplication units 916, and a mechanical handling mechanism may be provided for automatically swapping out the optical multiplication units 916. The system determines which optical multiplication unit 916 is required for the current application and uses the mechanical handling mechanism to automatically take the appropriate optical multiplication unit 916 from the holder mechanism and insert it into the optical processor 906.

In some embodiments, the artificial neural network computing system 900 may be modified by adding an analog nonlinear unit between the detection unit 920 and the ADC unit 930. The analog nonlinear unit is configured to receive the output voltage from the detection unit 920, apply a nonlinear transfer function, and output the converted output voltage to the ADC unit 930. The controller 902 may obtain a converted digital output voltage corresponding to the converted output voltage from the ADC unit 930. Since the digital output voltage obtained from the ADC unit 930 has been non-linearly transformed ("activated"), the non-linear transformation step of the controller 902 may be omitted, thereby reducing the computational burden of the controller 902. The first converted voltage directly obtained from the ADC unit 930 may then be stored in the memory unit 120 as a first converted digital output vector.

Passive chips with passive diffractive optical components have a number of advantages. First, a chip of any given size may contain a larger neural network because the active components (typically the bulkiest components) have been eliminated. Useful neural networks can typically include millions of weights, which are challenging to implement on an active chip and may require multiple passes of data through the chip and reprogramming of the chip. In contrast, a single passive chip may be able to support the entire neural network. Second, the very low power consumption of passive chips is important for "edge" applications, as such applications may require a small footprint and low power consumption. Third, passive chips can be manufactured at very low cost because they do not contain active components.

An optical matrix multiplication unit with passive diffractive optical components may also be used in a wavelength division multiplexed artificial neural network computing system. For example, the optical matrix multiplication unit 150 of the system 104 in fig. 1F may be replaced with an optical matrix multiplication unit using passive diffractive optical components. In this example, the second DAC subunit 134 may be removed.

In some embodiments, the optical processor (e.g., 504, 702) may perform matrix processing other than matrix multiplication. The optical matrix multiplication units 502 and 708 may be replaced by optical matrix processing units that perform other types of matrix processing.

Fig. 25 shows a flowchart of an example of a method 2500 of performing artificial neural network calculations using an artificial neural network computing system 500, 700, or 900, the artificial neural network computing system 500, 700, or 900 including one or more optical matrix multiplication units or optical multiplication units with passive diffraction components, such as the 2D optical matrix multiplication unit 502, the 3D optical matrix multiplication unit 708, or the 1D OM unit 916. The steps of process 2500 may be performed at least in part by controller 110 or 902. In some embodiments, the various steps of method 2500 may be run in parallel, in combination, in a loop, or in any order.

In step 2510, an artificial neural network (ANN) computation request comprising an input data set is received. The input data set includes a first digital input vector. The first digital input vector is a subset of the input data set. For example, it may be a sub-region of an image. The artificial neural network computation request may be generated by various entities (e.g., computer 102). The computer may include one or more of various types of computing devices, such as personal computers, server computers, vehicle computers, and flight computers. An artificial neural network computation request generally refers to an electrical signal that notifies or informs the artificial neural network computing system 500, 700, or 900 that an artificial neural network computation is to be performed. In some embodiments, the artificial neural network computation request may be split into two or more signals. For example, a first signal may query the artificial neural network computing system 500, 700, or 900 to check whether the system 500, 700, or 900 is ready to receive the input data set. In response to an acknowledgement by the system 500, 700, or 900, the computer may transmit a second signal comprising the input data set.

In step 2520, the input data set is stored. The controller 110 may store the input data set in the storage unit 120. Storing the input data set in the memory unit 120 may allow flexibility in the operation of the artificial neural network computing system 500, 700, or 900, for example, may improve the overall performance of the system. For example, the input data set may be split into digital input vectors of a set size and format by retrieving a desired portion of the input data set from the storage unit 120. The different portions of the input data set may be processed in various orders or shuffled to allow for various types of artificial neural network calculations to be performed. For example, where the input and output matrices are of different sizes, shuffling may allow matrix multiplication to be performed by a block matrix multiplication technique. As another example, storing the input data set in the storage unit 120 may allow for queuing of multiple artificial neural network computing requests by the artificial neural network computing system 500, 700, or 900, which may allow the system 500, 700, or 900 to maintain operation at its full speed without periods of inactivity.
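The block matrix multiplication technique mentioned above can be illustrated as follows: a matrix larger than the multiplication unit is processed as k×k blocks, with one slice of the stored input vector fetched per pass and the partial results accumulated. The function `unit_multiply` is a software stand-in for one pass through a fixed-size unit, and the sizes and values are arbitrary assumptions.

```python
def unit_multiply(block, vec):
    # Stand-in for one pass through a fixed-size k x k multiplication unit.
    return [sum(a * x for a, x in zip(row, vec)) for row in block]

def block_matvec(matrix, vector, k):
    rows, cols = len(matrix), len(matrix[0])
    assert rows % k == 0 and cols % k == 0 and len(vector) == cols
    result = [0.0] * rows
    for r0 in range(0, rows, k):
        for c0 in range(0, cols, k):
            # Fetch one k x k block and the matching slice of the stored input.
            block = [row[c0:c0 + k] for row in matrix[r0:r0 + k]]
            partial = unit_multiply(block, vector[c0:c0 + k])
            # Accumulate the partial products for this block row.
            for i, p in enumerate(partial):
                result[r0 + i] += p
    return result

matrix = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0]]
vector = [1.0, 1.0, 1.0, 1.0]
product_vector = block_matvec(matrix, vector, k=2)
```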

In step 2530, a first plurality of modulator control signals are generated based on the first digital input vector. The controller 110 may transmit the first DAC control signal to the DAC unit 506, 712, or 904 to generate a first plurality of modulator control signals. DAC unit 506, 712, or 904 generates a first plurality of modulator control signals based on the first DAC control signal and modulator array 144, 706 or modulator 912 generates an optical input vector representing the first digital input vector.

The first DAC control signal may comprise a plurality of digital values to be converted by DAC unit 506, 712, or 904 to a first plurality of modulator control signals. The plurality of digital values generally corresponds to the first digital input vector and may be associated by various mathematical relationships or look-up tables. For example, the plurality of digital values may be linearly proportional to the values of the elements of the first digital input vector. As another example, the plurality of digital values may be associated with elements of the first digital input vector through a lookup table configured to maintain a linear relationship between the digital input vector and the optical input vector produced by modulator array 144, 706 or modulator 912.
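As an illustration of the lookup-table approach, the sketch below assumes a modulator with a sin² transfer curve, a 12-bit DAC, and 8-bit input values; none of these specifics come from the disclosure. The table is built by inverting the assumed curve so that the optical output is approximately proportional to the digital input value.

```python
import math

DAC_MAX = 4095     # hypothetical 12-bit DAC driving the modulator
INPUT_MAX = 255    # hypothetical 8-bit digital input values

def modulator_output(code):
    # Assumed sin^2 transfer curve, normalized to the range [0, 1].
    return math.sin(0.5 * math.pi * code / DAC_MAX) ** 2

def build_lut():
    # For each input value, choose a DAC code by inverting the assumed
    # transfer curve, so the optical output follows a linear ramp.
    lut = []
    for value in range(INPUT_MAX + 1):
        desired = value / INPUT_MAX
        x = (2.0 / math.pi) * math.asin(math.sqrt(desired))
        lut.append(round(x * DAC_MAX))
    return lut

lut = build_lut()
# Largest deviation of the modulated output from the ideal linear ramp.
worst_error = max(abs(modulator_output(lut[v]) - v / INPUT_MAX)
                  for v in range(INPUT_MAX + 1))
```

In practice such a table would more likely be filled from calibration measurements of the actual modulator than from a closed-form inverse.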

In some embodiments, the 2D optical matrix multiplication unit 502, the 3D optical matrix multiplication unit 708, or the 1D OM unit 916 is configured to perform optical matrix processing or optical multiplication based on the optical input vector and a plurality of neural network weights implemented using passive diffractive components. The matrix M representing the plurality of neural network weights may be decomposed as M = USV* by a singular value decomposition (SVD) method, where U is an m×m unitary matrix, S is an m×n diagonal matrix having non-negative real numbers on the diagonal, and V* is the complex conjugate of an n×n unitary matrix V. In this case, the passive diffractive components may be configured to implement the matrix V*, the matrix S, and the matrix U, such that the optical matrix multiplication unit 502 or 708 implements the matrix M as a whole.
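The following sketch checks, for a small hand-picked example, that applying the three factors in sequence gives the same result as applying M directly. The particular matrices U, S, and V are assumptions chosen for readability, with m = n = 2, and V* is taken here as the conjugate transpose of V, which is the usual SVD convention.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def conj_transpose(a):
    return [[a[r][c].conjugate() for r in range(len(a))]
            for c in range(len(a[0]))]

def matvec(a, x):
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

h = 1.0 / math.sqrt(2.0)
U = [[h, h], [h, -h]]             # unitary (a real Hadamard-type matrix)
S = [[2.0, 0.0], [0.0, 0.5]]      # non-negative singular values on the diagonal
V = [[1 + 0j, 0j], [0j, 1j]]      # unitary
V_star = conj_transpose(V)

M = matmul(matmul(U, S), V_star)  # M = U S V*

x = [1 + 0j, 2 + 0j]
# Three stages in sequence: V* first, then S, then U.
staged = matvec(U, matvec(S, matvec(V_star, x)))
direct = matvec(M, x)
```

The staged form mirrors the idea of three cascaded structures that together realize M.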

In step 2540, a first plurality of digital optical outputs corresponding to the optical output vector of the optical matrix multiplication unit or the optical multiplication unit is obtained. The optical input vector produced by the modulator array 144, 706 or the modulator 912 is processed by the 2D optical matrix multiplication unit 502, the 3D optical matrix multiplication unit 708, or the 1D OM unit 916 and converted into an optical output vector. The optical output vector is detected by the detection unit 146, 710, or 920 and converted into electrical signals, which may be converted into digital values by the ADC unit 160 or 930. The controller 110 or 902 may, for example, transmit a conversion request to the ADC unit 160 or 930 to start converting the voltages output by the detection unit 146, 710, or 920 into digital optical outputs. Once the conversion is completed, the ADC unit 160 or 930 may transmit the conversion result to the controller 110 or 902. Alternatively, the controller 110 or 902 may retrieve the conversion result from the ADC unit 160 or 930. The controller 110 or 902 may form a digital output vector from the digital optical outputs, the digital output vector corresponding to the result of a matrix multiplication or vector multiplication of the digital input vector. For example, the digital optical outputs may be organized or concatenated to have a vector format.

In some embodiments, the ADC unit 160 or 930 may be set or controlled to perform ADC conversion based on the DAC control signals issued by the controller 110 or 902 to the DAC unit 506, 712, or 904. For example, the ADC conversion may be set to start at a preset time after the DAC unit 506, 712, or 904 generates the modulator control signals. Such control of the ADC conversion may simplify the operation of the controller 110 or 902 and reduce the number of necessary control operations.

In step 2550, a nonlinear transformation is performed on the first digital output vector to produce a first transformed digital output vector. The nodes, or artificial neurons, of an artificial neural network operate by first computing a weighted sum of the signals received from the nodes of the previous layer, and then performing a nonlinear transformation ("activation") of the weighted sum to produce an output. Various types of artificial neural networks may implement various types of differentiable nonlinear transformations. Examples of nonlinear transformation functions include rectified linear unit (ReLU) functions, sigmoid functions, hyperbolic tangent functions, X^2 functions, and |X| functions. The nonlinear transformation is performed on the first digital output vector by the controller 110 or 902 to produce the first transformed digital output vector. In some embodiments, the nonlinear transformation may be performed by an application-specific digital integrated circuit within the controller 110 or 902. For example, the controller 110 or 902 may include one or more modules or circuit blocks that are specifically adapted to accelerate the computation of one or more types of nonlinear transformations.
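For reference, the example nonlinear transformations listed above can be written element-wise as follows. The dictionary `ACTIVATIONS` and the helper `activate` are illustrative names only and do not describe the controller's actual circuitry.

```python
import math

# Element-wise versions of the example nonlinear transformation functions.
ACTIVATIONS = {
    "relu": lambda x: max(0.0, x),
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "tanh": math.tanh,
    "square": lambda x: x * x,   # the X^2 function
    "abs": abs,                  # the |X| function
}

def activate(vector, name):
    # Apply the chosen nonlinear transformation to every element of the vector.
    return [ACTIVATIONS[name](v) for v in vector]

digital_output_vector = [-2.0, 0.0, 3.0]
transformed = activate(digital_output_vector, "relu")
```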

In step 2560, the first transformed digital output vector is stored. The controller 110 or 902 may store the first transformed digital output vector in the storage unit 120. In the case where the input data set is divided into a plurality of digital input vectors, the first transformed digital output vector corresponds to the artificial neural network computation result for a portion of the input data set, for example, the first digital input vector. As such, storing the first transformed digital output vector allows the artificial neural network computing system 500, 700, or 900 to perform and store additional computations on other digital input vectors of the input data set, to be later aggregated into a single artificial neural network output.

In step 2570, an artificial neural network output generated based on the first transformed digital output vector is output. The controller 110 or 902 generates an artificial neural network output that is the result of processing the input data set through an artificial neural network defined by the first plurality of neural network weights. In the case where the input data set is divided into a plurality of digital input vectors, the artificial neural network output produced is an aggregate output that includes the first transformed digital output vector and may further include additional transformed digital output vectors corresponding to other portions of the input data set. Once the artificial neural network output is generated, it is transmitted to the computer (e.g., computer 102) that initiated the artificial neural network computation request.
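Steps 2510 through 2570 can be summarized in a short software model. This is a sketch under stated assumptions: the optical multiplication is replaced by an ordinary matrix-vector product, the nonlinear transformation is taken to be ReLU, the aggregation is a simple concatenation, and the function name `run_ann_request` is hypothetical.

```python
def run_ann_request(input_data, weights, vector_len):
    # Steps 2510-2520: receive the request and store the input data set.
    storage = {"input": list(input_data), "outputs": []}
    for start in range(0, len(storage["input"]), vector_len):
        # Step 2530: take one digital input vector from storage.
        x = storage["input"][start:start + vector_len]
        # Step 2540: stand-in for the fixed optical multiplication and ADC readout.
        y = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
        # Step 2550: nonlinear transformation (ReLU assumed).
        z = [max(0.0, v) for v in y]
        # Step 2560: store the transformed digital output vector.
        storage["outputs"].append(z)
    # Step 2570: aggregate all stored vectors into one output (concatenation assumed).
    return [v for vec in storage["outputs"] for v in vec]

weights = [[1.0, -1.0],
           [-1.0, 1.0]]
ann_output = run_ann_request([3.0, 1.0, 1.0, 4.0], weights, vector_len=2)
```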

The 2D optical matrix multiplication unit 502, the 3D optical matrix multiplication unit 708, or the 1D OM unit 916 may represent the weight coefficients of one hidden layer of the neural network. If the neural network has multiple hidden layers, additional 2D optical matrix multiplication units 502, 3D optical matrix multiplication units 708, or 1D OM units 916 may be coupled in series. Fig. 26 illustrates an example of an artificial neural network computing system 2600 for implementing a neural network with two hidden layers. The first 2D optical matrix multiplication unit 2604 represents the weight coefficients of the first hidden layer, and the second 2D optical matrix multiplication unit 2606 represents the weight coefficients of the second hidden layer. The artificial neural network computing system 2600 includes the controller 110, the storage unit 120, the DAC unit 506, and an optoelectronic processor 2602. The storage unit 120 and the DAC unit 506 are similar to the corresponding components of the system 500 in fig. 5. The optoelectronic processor 2602 is configured to perform matrix calculations using optical and electronic components.

The optoelectronic processor 2602 includes a first laser unit 142a, a first modulator array 144a, a first 2D optical matrix multiplication unit 2604, a first detection unit 146a, a first analog nonlinear unit 310a, an analog storage unit 320, a second laser unit 142b, a second modulator array 144b, a second 2D optical matrix multiplication unit 2606, a second detection unit 146b, a second analog nonlinear unit 310b, and an ADC unit 160. The operation of the first laser unit 142a, the first modulator array 144a, the first detection unit 146a, the first analog nonlinear unit 310a, and the analog storage unit 320 is similar to that of the corresponding components shown in fig. 3B. The first 2D optical matrix multiplication unit 2604 is similar to the 2D optical matrix multiplication unit 502 of fig. 5. The output of the analog storage unit 320 drives the second modulator array 144b, which modulates the laser light from the second laser unit 142b to generate an optical vector. The optical vector from the second modulator array 144b is processed by the second 2D optical matrix multiplication unit 2606, which performs a matrix multiplication and produces an optical output vector that is detected by the second detection unit 146b. The second detection unit 146b is configured to generate output voltages corresponding to the optical signals of the optical output vector from the second 2D optical matrix multiplication unit 2606. The ADC unit 160 is configured to convert the output voltages into digitized output voltages. The controller 110 may obtain, from the ADC unit 160, digital outputs corresponding to the optical output vector of the second 2D optical matrix multiplication unit 2606. The controller 110 may form a digital output vector from the digital outputs, the digital output vector corresponding to the result of the second matrix multiplication applied to the nonlinear transformation of the result of the first matrix multiplication of the input digital vector. The second laser unit 142b may be combined with the first laser unit 142a by using a beam splitter to steer some of the light from the first laser unit 142a to the second modulator array 144b.
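The data flow through two cascaded matrix multiplication units can be summarized as output = W2 · f(W1 · x). The sketch below is a purely numerical illustration under stated assumptions: the weight matrices `w1` and `w2`, the helper names, and the choice of ReLU for the analog nonlinear unit are hypothetical, and the optical and analog stages are modeled as ideal arithmetic.

```python
def matvec(matrix, vector):
    """Ideal model of one optical matrix multiplication unit."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def relu(vector):
    """Assumed form of the analog nonlinear unit (illustrative choice)."""
    return [max(0.0, v) for v in vector]

def two_layer_forward(w1, w2, x, nonlinearity=relu):
    # First unit, then the nonlinearity; the result drives the second modulator array.
    hidden = nonlinearity(matvec(w1, x))
    # Second unit; its detected output is what the ADC unit would digitize.
    return matvec(w2, hidden)

w1 = [[1.0, -1.0], [0.5, 0.5]]   # hypothetical first-hidden-layer weights
w2 = [[2.0, 0.0], [1.0, 1.0]]    # hypothetical second-hidden-layer weights
output = two_layer_forward(w1, w2, [1.0, 2.0])
```

Extending the sketch to three or more hidden layers amounts to chaining further `matvec` and nonlinearity calls in the same order.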

The above principle can be applied to implement a neural network with three or more hidden layers, where the weight coefficients of each hidden layer are represented by a corresponding 2D optical matrix multiplication unit.

Fig. 27 illustrates an example of an artificial neural network computing system 2700 for implementing a neural network with two hidden layers. The first 3D optical matrix multiplication unit 2704 represents the weight coefficients of the first hidden layer, and the second 3D optical matrix multiplication unit 2706 represents the weight coefficients of the second hidden layer. The artificial neural network computing system 2700 includes the controller 110, the storage unit 120, the DAC unit 712, and an optoelectronic processor 2702. The storage unit 120 and the DAC unit 712 are similar to the corresponding components of the system 700 in fig. 7. The optoelectronic processor 2702 is configured to perform matrix calculations using optical and electronic components.

The optoelectronic processor 2702 includes a first laser unit 704a, a first modulator array 706a, a first 3D optical matrix multiplication unit 2704, a first detection unit 710a, a first analog nonlinear unit 310a, an analog storage unit 320, a second laser unit 704b, a second modulator array 706b, a second 3D optical matrix multiplication unit 2706, a second detection unit 710b, a second analog nonlinear unit 310b, and an ADC unit 160. The operation of the first laser unit 704a, the first modulator array 706a, the first detection unit 710a, the first analog nonlinear unit 310a, and the analog storage unit 320 is similar to that of the corresponding components shown in fig. 3B. The first 3D optical matrix multiplication unit 2704 is similar to the 3D optical matrix multiplication unit 708 of fig. 7. The output of the analog storage unit 320 drives the second modulator array 706b, which modulates the laser light from the second laser unit 704b to produce an optical vector. The optical vector from the second modulator array 706b is processed by the second 3D optical matrix multiplication unit 2706, which performs a matrix multiplication and produces an optical output vector that is detected by the second detection unit 710b. The second detection unit 710b is configured to generate output voltages corresponding to the optical signals of the optical output vector from the second 3D optical matrix multiplication unit 2706. The ADC unit 160 is configured to convert the output voltages into digitized output voltages. The controller 110 may obtain, from the ADC unit 160, digital outputs corresponding to the optical output vector of the second 3D optical matrix multiplication unit 2706. The controller 110 may form a digital output vector from the digital outputs, the digital output vector corresponding to the result of the second matrix multiplication applied to the nonlinear transformation of the result of the first matrix multiplication of the input digital vector. The second laser unit 704b may be combined with the first laser unit 704a by using a beam splitter to steer some of the light from the first laser unit 704a to the second modulator array 706b.

The above principle can be applied to implement a neural network with three or more hidden layers, where the weight coefficients of each hidden layer are represented by a corresponding 3D optical matrix multiplication unit.

The 2D optical matrix multiplication unit 502 and the 3D optical matrix multiplication unit 708 with passive diffractive optical components are suitable for use in recurrent neural networks, in which the output of the network during the k-th pass through the neural network is recycled back to the input of the neural network and used as the input during the (k+1)-th pass, so that the weight coefficients of the neural network remain the same across the multiple passes.
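The recurrence described above, in which the output of the k-th pass becomes the input of the (k+1)-th pass while the weights stay fixed, can be sketched as follows. The weight values, the helper names, and the use of |X| as the nonlinearity are illustrative assumptions only.

```python
def matvec(matrix, vector):
    """Ideal model of a passive (fixed-weight) matrix multiplication unit."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def recurrent_passes(weights, x0, passes, nonlinearity):
    """Feed the output of pass k back as the input of pass k+1."""
    state = list(x0)
    history = []
    for _ in range(passes):
        state = nonlinearity(matvec(weights, state))  # same weights every pass
        history.append(state)
    return history

fixed_weights = [[0.5, 0.5], [0.5, -0.5]]             # hypothetical fixed weights
history = recurrent_passes(fixed_weights, [1.0, 0.0], 3,
                           lambda v: [abs(e) for e in v])
```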

Fig. 28 illustrates an example of a neural network computing system 2800 that can be employed to implement a recurrent neural network. The system 2800 includes an optical processor 2802 that operates in a manner similar to the optical processor 140 of fig. 3B, except that the optical matrix multiplication unit 150 is replaced by a 2D optical matrix multiplication unit 2804, which may be similar to the 2D optical matrix multiplication unit 502 of fig. 6. The neural network weights of the 2D optical matrix multiplication unit 2804 are fixed, so the system 2800 does not require the second DAC subunit 134 used in the system 302 of fig. 3B.

Fig. 29 illustrates an example of a neural network computing system 2900 that may be used to implement a recurrent neural network. The system 2900 includes an optical processor 2902 that operates in a similar manner to the optical processor 140 of fig. 3B, except that the laser unit 142, modulator array 144, optical matrix multiplication unit 150, and detection unit 146 are replaced by the laser unit 704, modulator array 706, 3D optical matrix multiplication unit 2904, and detection unit 710 of fig. 7, respectively. The neural network weights of the 3D optical matrix multiplication unit 2904 are fixed, so the system 2900 does not require the second DAC subunit 134 used in the system 302 of fig. 3B.

Fig. 30 shows a diagram of an example of an artificial neural network computing system 3000 with 1-bit internal resolution. The artificial neural network computing system 3000 is similar to the artificial neural network computing system 400 of fig. 4A, except that the optical matrix multiplication unit 150 is replaced by a 2D optical matrix multiplication unit 3004 (which is similar to the 2D optical matrix multiplication unit 502 of fig. 5), and the second driver sub-unit 434 is omitted. The artificial neural network computing system 3000 operates in a similar manner to the artificial neural network computing system 400, in that an input vector is decomposed into a plurality of 1-bit vectors, and then some artificial neural network computations may be performed by summing the individual matrix multiplication results after performing a series of matrix multiplications of the 1-bit vectors.

Fig. 31 shows a diagram of an example of an artificial neural network computing system 3100 with 1-bit internal resolution. The artificial neural network computing system 3100 is similar to the artificial neural network computing system 400 of fig. 4A except that the optical matrix multiplication unit 150 is replaced by a 3D optical matrix multiplication unit 3104 (which is similar to the 3D optical matrix multiplication unit 708 of fig. 7), and the second driver sub-unit 434 is omitted. In the example of fig. 31, the laser unit 142, modulator array 144, and detection unit 146 of fig. 4A are replaced by the laser unit 704, modulator array 706, and detection unit 710 of fig. 7, respectively. The artificial neural network computing system 3100 operates in a manner similar to the artificial neural network computing system 400, in that an input vector is decomposed into a plurality of 1-bit vectors, and then some artificial neural network computations may be performed by summing the individual matrix multiplication results after performing a series of matrix multiplications of the 1-bit vectors.
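The 1-bit internal resolution scheme used by systems 3000 and 3100 can be illustrated numerically: a vector of non-negative integers is split into 1-bit vectors, each 1-bit vector is multiplied by the matrix, and the partial results are summed with binary weights. The sketch below is a software analogy with hypothetical helper names, not a model of the hardware.

```python
def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def bit_planes(vector, bits):
    """Split non-negative integers into 1-bit vectors, least significant bit first."""
    return [[(v >> b) & 1 for v in vector] for b in range(bits)]

def matvec_one_bit(matrix, vector, bits):
    """Series of 1-bit matrix multiplications followed by a weighted sum."""
    total = [0] * len(matrix)
    for b, plane in enumerate(bit_planes(vector, bits)):
        partial = matvec(matrix, plane)                  # one pass with a 1-bit input vector
        total = [t + p * (1 << b) for t, p in zip(total, partial)]
    return total

matrix = [[1, 2, 3], [4, 5, 6]]
x = [5, 3, 6]                                            # each value fits in 3 bits
result = matvec_one_bit(matrix, x, bits=3)
```

Because matrix multiplication is linear, the weighted sum of the 1-bit results equals the direct product of the matrix and the full-resolution vector.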

The principle of the optical diffractive neural network is described below. The optical diffractive neural network may be implemented as several layers of diffractive or transmissive optical media. Based on the Huygens-Fresnel principle, each point in the diffractive medium can be considered a secondary light source. For each secondary light source, the far-field diffraction can be described by the following equation:

Here, the indices l and i denote the i-th neuron in the l-th layer of the neural network, λ is the wavelength of the light, and r is the distance, where:

The output from each secondary light source can be written as the input times the phase and intensity modulation of the light source:

Here, t is the transmission modulation, which is a complex term that includes both amplitude and phase modulation, and the summation term is the sum of the inputs from all of the light sources of the previous layer. In general, the output may be written as the far-field diffraction term w multiplied by an amplitude |A| and an additional phase term. Thus, each point in each layer may be considered a neuron that takes inputs from multiple neurons of the previous layer and adds additional phase and intensity modulation before outputting to the next layer.
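The equations referred to in this passage are not reproduced in the present text. As an assumption offered only for orientation, a commonly used Rayleigh-Sommerfeld form of the secondary-wave expression that is consistent with the symbols defined above (w, l, i, λ, r, t, |A|) is sketched below. The neuron coordinates (x_i, y_i, z_i), the output symbol n, and the phase symbol Δθ are introduced here for illustration and may differ from the notation of the original equations.

```latex
% Secondary wave emitted by the i-th neuron of layer l (assumed form)
w_i^{l}(x, y, z) = \frac{z - z_i}{r^{2}}
    \left( \frac{1}{2\pi r} + \frac{1}{j\lambda} \right)
    \exp\!\left( \frac{j 2\pi r}{\lambda} \right)

% Distance from the neuron at (x_i, y_i, z_i) to the observation point
r = \sqrt{(x - x_i)^{2} + (y - y_i)^{2} + (z - z_i)^{2}}

% Output of the neuron: secondary wave times transmission times summed input
n_i^{l}(x, y, z) = w_i^{l}(x, y, z) \, t_i^{l}(x_i, y_i, z_i)
    \sum_{k} n_k^{l-1}(x_i, y_i, z_i)
  = w_i^{l}(x, y, z) \, |A| \, e^{j \Delta\theta}
```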

The compact design of a photonic matrix multiplier unit that can implement general unitary matrix multiplication is described below. Referring to fig. 11, a photonic matrix multiplier unit 1100 includes a modulator 1102, a plurality of interconnected interferometers 1104, and an attenuator 1106. The interconnected interferometers 1104 include directional coupler layers (or groups or sets of directional couplers) 1108a, 1108b, 1108c, 1108d, and 1108e (collectively 1108) and phase shifter layers (or groups or sets of phase shifters) 1110a, 1110b, 1110c, and 1110d (collectively 1110). Each directional coupler layer may include one or more directional couplers. Each phase shifter layer may include one or more phase shifters. In this example, the interconnected interferometers 1104 include five layers of directional couplers 1108 and four layers of phase shifters 1110. In other embodiments, the photonic matrix multiplier unit 1100 may have different numbers of directional coupler layers and phase shifter layers. In contrast to conventional matrix multiplier units that use interconnected Mach-Zehnder interferometers, the photonic matrix multiplier unit 1100 has directional couplers 1108 positioned in such a way that the number of layers of directional couplers is reduced.

Here, the term "layer" in the phrases "directional coupler layer" and "phase shifter layer" refers to a set or collection of directional couplers or phase shifters grouped according to their positions in the photonic matrix multiplier unit 1100 relative to the input ports and output ports. In the example of fig. 11, the input optical signals are processed by a first layer of directional couplers 1108a, then by a second layer of phase shifters 1110a, then by a third layer of directional couplers 1108b, then by a fourth layer of phase shifters 1110b, and so on.

For example, a conventional matrix multiplier unit using interconnected Mach-Zehnder interferometers may require 2N layers of directional couplers, while the photonic matrix multiplier unit 1100 requires only N+2 layers of directional couplers, where N represents the number of input signals, or the number of digits in the input vector. The mesh architecture used in the photonic matrix multiplier unit 1100 may have the most compact geometry of any photonic interconnected-interferometer mesh that can perform general matrix calculations.

Fig. 12A shows a diagram comparing the interconnected interferometers 1104 of the photonic matrix multiplier unit 1100 with interconnected interferometers of the conventional design for various numbers of input signals. When there are 4 input signals, the interconnected Mach-Zehnder interferometers 1200 according to the conventional design require 8 layers of directional couplers, while the interconnected interferometers 1202 according to the new compact design require only 6 layers of directional couplers. When there are 3 input signals, the interconnected Mach-Zehnder interferometers 1204 according to the conventional design require 6 layers of directional couplers, while the interconnected interferometers 1206 according to the new compact design require only 5 layers of directional couplers. When there are 8 input signals, the interconnected Mach-Zehnder interferometers 1208 according to the conventional design require 16 layers of directional couplers, while the interconnected interferometers 1210 according to the new compact design require only 10 layers of directional couplers.

In general, when there are n input signals, interconnected Mach-Zehnder interferometers according to the conventional design require 2n layers of directional couplers, whereas interconnected interferometers according to the new compact design require only n+2 layers of directional couplers.

In the conventional design, there are n layers of Mach-Zehnder interferometers for n input signals, and each Mach-Zehnder interferometer includes a directional coupler followed by a pair of phase shifters followed by another directional coupler. Thus, n layers of Mach-Zehnder interferometers contain 2n layers of directional couplers. As a result, in the conventional design, n layers of phase shifters and 2n layers of directional couplers are required for n input signals.

In contrast, in the new compact design, one layer of directional couplers is followed by a first layer of phase shifters, then one layer of directional couplers, then a second layer of phase shifters, then one layer of directional couplers, then a third layer of phase shifters, and so on. After the last layer of phase shifters there are two layers of directional couplers. As a result, for n input signals, there are n layers of phase shifters and n+2 layers of directional couplers.

Because the directional couplers occupy a lot of space, reducing the number of directional coupler layers from 2n to n+2 can significantly reduce the size of the photonic matrix multiplier unit 1100 compared to the conventional design.
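The layer counts stated above can be checked with a few lines of arithmetic. The function names below are hypothetical; the sketch simply encodes the 2n and n+2 rules and the layer ordering described for the compact design.

```python
def conventional_coupler_layers(n):
    """n layers of Mach-Zehnder interferometers, two directional couplers each."""
    return 2 * n

def compact_coupler_layers(n):
    """One coupler layer before each phase shifter layer, plus two at the end."""
    return n + 2

def compact_layer_sequence(n):
    """'DC' = directional coupler layer, 'PS' = phase shifter layer."""
    sequence = []
    for _ in range(n):
        sequence += ["DC", "PS"]
    return sequence + ["DC", "DC"]

comparison = {n: (conventional_coupler_layers(n), compact_coupler_layers(n))
              for n in (3, 4, 8)}
```

For 3, 4, and 8 input signals this gives 6 versus 5, 8 versus 6, and 16 versus 10 layers, matching the counts stated for fig. 12A.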

Fig. 12B shows a diagram of a compact interconnection interferometer 1212 according to the new design, where the number of input signals is 5.

Decomposition for the compact design using gradient descent is described below. The compact design of the photonic matrix multiplier unit described above can take any unitary matrix U and use an analytic decomposition algorithm to determine which phases need to be implemented using the phase shifters in order to implement the matrix U. For example, the phases may be extracted from a given matrix U by using gradient descent. The gradient descent process is as follows. Start from a fixed matrix U and initialize random weights θ for the phase shifters of the compact design. Construct the matrix U' using the compact design, i.e., U' = CompactDesign(θ). Then form the loss function L = ||U − U'||², where ||·|| is the Frobenius norm of a matrix, and minimize this function using gradient descent (i.e., update θ by using gradient updates).
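The gradient descent procedure can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the disclosure: `compact_design` is a toy stand-in for CompactDesign(θ) built from ideal 50:50 couplers on alternating waveguide pairs, gradients are estimated by central finite differences, and a simple step-halving rule keeps the loss from increasing. The target is generated by the same toy mesh so that it is reachable.

```python
import cmath
import math
import random

def identity(n):
    return [[1 + 0j if r == c else 0j for c in range(n)] for r in range(n)]

def apply_couplers(u, offset):
    """Left-multiply u by one layer of 50:50 couplers on rows (offset, offset+1), ..."""
    s = 1 / math.sqrt(2)
    out = [row[:] for row in u]
    for k in range(offset, len(u) - 1, 2):
        for c in range(len(u)):
            a, b = u[k][c], u[k + 1][c]
            out[k][c] = s * (a + 1j * b)
            out[k + 1][c] = s * (1j * a + b)
    return out

def apply_phases(u, phases):
    """Left-multiply u by diag(exp(j * phase)), one phase shifter per waveguide."""
    return [[cmath.exp(1j * p) * x for x in row] for p, row in zip(phases, u)]

def compact_design(theta):
    """Toy stand-in for CompactDesign(theta): DC, PS, DC, PS, ..., PS, DC, DC."""
    n = len(theta)
    u = apply_couplers(identity(n), 0)
    for i, phases in enumerate(theta):
        u = apply_couplers(apply_phases(u, phases), (i + 1) % 2)
    return apply_couplers(u, (n + 1) % 2)

def loss(target, theta):
    """Squared Frobenius norm of U - U'."""
    u = compact_design(theta)
    return sum(abs(t - x) ** 2 for tr, ur in zip(target, u) for t, x in zip(tr, ur))

def gradient(target, theta, eps=1e-6):
    """Central finite-difference estimate of dL/dtheta."""
    grad = [[0.0] * len(row) for row in theta]
    for i, row in enumerate(theta):
        for j in range(len(row)):
            saved = row[j]
            row[j] = saved + eps
            up = loss(target, theta)
            row[j] = saved - eps
            down = loss(target, theta)
            row[j] = saved
            grad[i][j] = (up - down) / (2 * eps)
    return grad

def fit_phases(target, theta, steps=100, lr=0.5):
    """Gradient descent on theta; the step is halved until the loss decreases."""
    theta = [row[:] for row in theta]
    current = loss(target, theta)
    for _ in range(steps):
        grad = gradient(target, theta)
        step = lr
        while step > 1e-9:
            trial = [[t - step * g for t, g in zip(tr, gr)] for tr, gr in zip(theta, grad)]
            trial_loss = loss(target, trial)
            if trial_loss < current:
                theta, current = trial, trial_loss
                break
            step *= 0.5
    return theta, current

def random_phases(n):
    return [[random.uniform(0, 2 * math.pi) for _ in range(n)] for _ in range(n)]

random.seed(0)
target = compact_design(random_phases(4))      # a reachable target matrix U
theta0 = random_phases(4)                      # random initial weights
initial_loss = loss(target, theta0)
theta_fit, final_loss = fit_phases(target, theta0)
```

A practical implementation would more likely use automatic differentiation and would also learn the attenuator weights; those parts are omitted here to keep the sketch short.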

Referring to fig. 13, homodyne detection is used (e.g., taking the real part at the output), so an additional attenuator layer 1302 is provided before detection in order to simulate an orthogonal matrix. This means that, along with θ, the diagonal weights x of the attenuators need to be learned. In this way, the phases and diagonal weights required for U can be learned, and the decomposition can be obtained numerically.

An optical generative adversarial network (OGAN) is described below that includes a generator configured to efficiently generate faithful data. Fig. 14 shows an example of an optical generative adversarial network 1400 in which the generator 1404 includes a neural network configured or trained to generate a composite image 1410 that is similar to a real image, and the discriminator 1402 includes a neural network trained to determine whether an input image is real or composite. An initial training image set 1406 is provided to train the discriminator 1402 such that the discriminator 1402 learns the features of real images. Similarly, the generator 1404 is trained using a set of training images (not shown) such that the generator 1404 can generate a composite image 1410 having features similar to those of a real image.

In some embodiments, training of the discriminator 1402 is performed electronically, e.g., using a transistor-based data processor (e.g., a central processing unit or a general-purpose graphics processing unit) to calculate the weights of the neural layers of the discriminator 1402. Similarly, training of the generator 1404 is also performed electronically to calculate the weights of the neural layers of the generator 1404.

The composite image 1410 generated by the generator 1404 may be provided to the discriminator 1402 to further train the discriminator 1402 so that the discriminator 1402 can more accurately detect real images. The detection results of the discriminator 1402 may also be used to further train the generator 1404 so that the generator 1404 can produce more realistic composite images 1410.

The optical generative adversarial network 1400 has many applications. For example, in some applications, it may be difficult or expensive to obtain a large number of real images for training the discriminator 1402. To train the discriminator 1402 to detect, e.g., cancer cells, a large number of cancer cell images are required during the training phase. Obtaining a large number of cancer cell images from cancer patients can be difficult and expensive, and thus there may be insufficient samples to train the discriminator 1402 to sufficient accuracy. To improve the discriminator 1402, the generator 1404 is trained to generate realistic images of cancer cells, and the discriminator 1402 is further trained using the synthesized realistic images 1410 of cancer cells, thereby improving the ability of the discriminator 1402 to detect cancer cells.

In some embodiments, the generator 1404 may be an optical chip that includes active components, such as active phase shifters for modifying weights of the neural network. After training generator 1404, the active components are fixed to fix the weights. Random noise 1408 is fed to the generator 1404, and the generator 1404 then generates a composite image 1410 based on the random noise 1408, wherein the composite image 1410 resembles a real image of cancer cells.

In some embodiments, generator 1404 is implemented using an optical matrix multiplication unit as shown in fig. 5, 7, and/or 9. After determining the weights of the neural network, the optical matrix multiplication unit is configured to implement the neural network based on the determined weights. Because the input to the generator 1404 is random noise 1408, it is not necessary to have a modulator array, allowing the generator 1404 to have a small footprint.

Whether the generator 1404 is implemented using a passive optical chip or an optical chip with active components, the trained generator 1404 can generate realistic images (e.g., images that resemble real images of cancer cells) that can then be provided to the discriminator 1402 to further train and refine the discriminator 1402. The generator 1404 has a high throughput and can generate the composite images 1410 at a rate potentially several orders of magnitude faster than a conventional electronic data processor (e.g., a general-purpose graphics processing unit). The generator 1404 also has low power consumption, possibly several orders of magnitude lower than that of a conventional electronic data processor.

The generator 1404 has a variety of applications. For example, the composite images produced by the generator 1404 may have many applications in the medical field. The generator 1404 may be configured to synthesize images of tissue associated with certain diseases, and the synthesized images may be used to train the discriminator 1402 to identify tissue associated with those diseases. As another example, the composite images produced by the generator 1404 may have many applications in the field of autonomous driving or navigation. For example, the generator 1404 may be configured to generate composite images of various traffic conditions, and the composite images may be used to train the discriminator 1402 to identify the traffic conditions. As a further example, the composite images produced by the generator 1404 may have many applications in the field of manufacturing quality control. For example, the generator 1404 may be configured to generate composite images of products having defects, and the composite images may be used to train the discriminator 1402 to detect defective products.

In some embodiments, the optical generative adversarial network 1400 includes a coherent light source and a filter for random amplitude and phase inputs, where both the amplitude and the phase follow known distributions. The optical generative adversarial network 1400 includes a mesh of interferometers for fast processing of information. The optical generative adversarial network 1400 can be designed to have an architecture that does not require shuffling of weights, i.e., the interferometers are not reprogrammed. The optical generative adversarial network 1400 can also be designed to include fast phase shifters with operating rates greater than 1 GHz. The optical generative adversarial network 1400 can have a fast implementation of the nonlinearity. For example, it may have (i) nonlinearity in the analog electronics domain, (ii) simple optical nonlinearity, or (iii) nonlinearity in the digital electronics domain.

The following describes a novel photonic circuit that has interconnected Mach-Zehnder interferometers and is configured to implement logic gates. Referring to fig. 15, the Mach-Zehnder interferometer 1500 includes a phase shifter 1502 configured to cause the Mach-Zehnder interferometer 1500 to implement the following rotation:

Referring to fig. 16, a photonic circuit 1600 may implement an XOR gate and an OR gate. The photonic circuit 1600 includes the Mach-Zehnder interferometer 1500, a detector 1602, and a comparator 1604 having an analog electronic threshold. When input signals x1 and x2 are provided to the photonic circuit 1600, the Mach-Zehnder interferometer 1500 performs the following operation:

The detector 1602 generates an output representing the absolute value of the detection signal, so the output of the detector 1602 is:

The analog electronic threshold of the comparator 1604 is biased to 1/2 to remove the 1/√2 factor, so the output of the comparator 1604 is:

The photonic circuit 1600 produces the following results for various combinations of the input signals x1 and x2:

In the above, the first pair of digits is the input signal, the second pair is the output of the detector 1602, and the third pair is the output of the comparator 1604. When (x1, x2) = (0, 0) is input, the Mach-Zehnder interferometer 1500 performs the multiplication and produces the result (0, 0), the detector 1602 outputs (0, 0), and the comparator 1604 generates the result (0, 0). When (x1, x2) = (0, 1) is input, the Mach-Zehnder interferometer 1500 performs the multiplication, the detector 1602 outputs (1/√2, 1/√2), and the comparator 1604 generates the result (1, 1). When (x1, x2) = (1, 0) is input, the Mach-Zehnder interferometer 1500 performs the multiplication, the detector 1602 outputs (1/√2, 1/√2), and the comparator 1604 generates the result (1, 1). When (x1, x2) = (1, 1) is input, the Mach-Zehnder interferometer 1500 performs the multiplication, the detector 1602 outputs (0, √2), and the comparator 1604 generates the result (0, 1). The above results indicate that the detector 1602 produces XOR(x1, x2)/√2 at the first output 1606a and (x1 + x2)/√2 at the second output 1606b. The comparator 1604 removes the 1/√2 factor to produce XOR(x1, x2) at the first output 1608a and OR(x1, x2) at the second output 1608b.
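The behavior of the photonic circuit 1600 can be checked numerically. The sketch below assumes that the rotation implemented by the Mach-Zehnder interferometer 1500 is (1/√2)·[[1, −1], [1, 1]]; this matrix is an assumption chosen because it reproduces the results stated above, and the function names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

def mzi(x1, x2):
    """Assumed rotation of interferometer 1500: (1/sqrt(2)) * [[1, -1], [1, 1]]."""
    return (S * (x1 - x2), S * (x1 + x2))

def detect(signal):
    """Detector 1602: absolute value of each detected signal."""
    return tuple(abs(v) for v in signal)

def compare(signal, threshold=0.5):
    """Comparator 1604 with its analog threshold biased to 1/2."""
    return tuple(1 if v > threshold else 0 for v in signal)

def xor_or(x1, x2):
    return compare(detect(mzi(x1, x2)))

truth_table = {(a, b): xor_or(a, b) for a in (0, 1) for b in (0, 1)}
```

In each entry of `truth_table`, the first element is XOR(x1, x2) and the second is OR(x1, x2).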

Referring to fig. 17A, a photonic circuit 1700 may implement an AND gate and an OR gate. The photonic circuit 1700 includes the Mach-Zehnder interferometer 1500 and the detector 1602, wherein the output of the detector 1602 is recycled once. When input signals x1 and x2 are provided to the photonic circuit 1700, the Mach-Zehnder interferometer 1500 and the detector 1602 produce the output:

The output of the detector 1602 is recycled back to the input of the photonic circuit 1700, and after the signal passes through the Mach-Zehnder interferometer 1500 and the detector 1602 a second time, the detector 1602 produces the final output:

The photonic circuit 1700 produces the following results for various combinations of the input signals x1 and x2:

In the above, the first pair of digits is the input signal, the second pair is the output of the detector 1602 after the first pass, and the third pair is the output of the detector 1602 after the second pass. When (x1, x2) = (0, 0) is input, the detector 1602 outputs (0, 0) after the first pass through the Mach-Zehnder interferometer 1500 and (0, 0) after the second pass. When (x1, x2) = (0, 1) is input, the detector 1602 outputs (1/√2, 1/√2) after the first pass through the Mach-Zehnder interferometer 1500 and (0, 1) after the second pass. When (x1, x2) = (1, 0) is input, the detector 1602 outputs (1/√2, 1/√2) after the first pass and (0, 1) after the second pass. When (x1, x2) = (1, 1) is input, the detector 1602 outputs (0, √2) after the first pass and (1, 1) after the second pass. The above results indicate that, after two passes, the detector 1602 generates a signal representing AND(x1, x2) at the first output 1704 and a signal representing OR(x1, x2) at the second output 1706.
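The two-pass operation of the photonic circuit 1700 can be checked in the same way. As before, the rotation (1/√2)·[[1, −1], [1, 1]] is an assumption that reproduces the stated results, and the function names are hypothetical. The final values are rounded only to remove floating-point error.

```python
import math

S = 1 / math.sqrt(2)

def one_pass(x1, x2):
    """Assumed rotation of interferometer 1500 followed by detector 1602 (absolute value)."""
    return (abs(S * (x1 - x2)), abs(S * (x1 + x2)))

def and_or(x1, x2):
    """Recycle the detector output through the interferometer and detector once more."""
    first = one_pass(x1, x2)
    second = one_pass(*first)
    return tuple(round(v) for v in second)

truth_table = {(a, b): and_or(a, b) for a in (0, 1) for b in (0, 1)}
```

In each entry of `truth_table`, the first element is AND(x1, x2) and the second is OR(x1, x2).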

Fig. 17B shows another embodiment of a photonic circuit 1710 that includes a first Mach-Zehnder interferometer 1712, a first detector 1714, a second Mach-Zehnder interferometer 1716, and a second detector 1718. The second detector 1718 produces a first output 1720 representing AND(x1, x2) and a second output 1722 representing OR(x1, x2).

The implementation of logic gates (e.g., AND, OR, and XOR gates) using photonic circuits that include Mach-Zehnder interferometers, directional couplers, planar optical waveguides, and photodetectors is described above. The logic gates may be used to build comparators for sorting algorithms, e.g., algorithms similar to the bitonic sorter described at <https://en.wikipedia.org/wiki/Bitonic_sorter>. As another example, the logic gates may be used to construct a hashing algorithm similar to SHA-2, which is described at <https://en.wikipedia.org/wiki/SHA-2>, is a standard recommended by NIST, and has many applications. Because the logic circuits implemented using the photonic circuits described above are mostly passive, they can have lower latency and lower power consumption than CMOS logic gates. There is no optical nonlinearity in the design of the optical logic gates. The nonlinear response comes from the detection of the signals using the photodetectors.

Incoherent or low coherence optical computing system

The following describes an optoelectronic computing system that processes incoherent or low-coherence optical signals while performing matrix calculations. The optical processor 140 of the artificial neural network computing system 100 in fig. 1 includes a laser unit 142 that produces N optical outputs that have the same wavelength and are optically coherent. The optical matrix multiplication unit 150 performs N×N matrix multiplication in the optical domain, wherein the optical signals remain coherent from the input of the optical matrix multiplication unit 150 to the output of the optical matrix multiplication unit 150. The advantages of the optical matrix multiplication unit 150 performing matrix multiplication in the optical domain have been described above. The following describes an optoelectronic computing system that does not require the optical signals to be coherent throughout the matrix multiplication process, where a portion of the computation is performed in the optical domain and a portion of the computation is performed in the electrical domain. The advantages of the optoelectronic computing system have been described in the summary of the invention above.

Optoelectronic computing systems use different types of operations to produce a result of a computation, each operation being performed on a signal (e.g., an electrical or optical signal) that is best suited to the fundamental physical characteristics of the operation (e.g., in terms of energy consumption and/or speed). For example, replication may be performed using optical power splitting, summation may be performed using current-based summation, and multiplication may be performed using optical amplitude modulation. An example of a calculation that may be performed using these three types of operations is to multiply a vector by a matrix (e.g., as employed by artificial neural network calculations). These operations may be used to perform various other calculations representing a general set of linear operations that may perform various calculations including, but not limited to, vector-vector dot product, vector-vector element-by-element multiplication, vector-scalar element-by-element multiplication, or matrix-matrix element-by-element multiplication.
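The way the three operation types combine into a vector-matrix multiplication can be sketched as follows. Each function stands for one physical operation, modeled here as ideal arithmetic; the function names are hypothetical, and effects such as splitting loss and noise are ignored.

```python
def replicate(value, n_copies):
    """Copying by optical power splitting, modeled as ideal copies."""
    return [value] * n_copies

def multiply(value, weight):
    """Multiplication by optical amplitude modulation."""
    return value * weight

def summate(currents):
    """Summation by combining photocurrents."""
    return sum(currents)

def vector_matrix_multiply(matrix, vector):
    rows = len(matrix)
    # copies[j][i] is the copy of input element j that is routed to output row i.
    copies = [replicate(x, rows) for x in vector]
    return [summate(multiply(copies[j][i], matrix[i][j]) for j in range(len(vector)))
            for i in range(rows)]

product = vector_matrix_multiply([[1, 2], [3, 4]], [5, 6])
```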

Referring to fig. 18, an example of an optoelectronic computing system 1800 includes a set of optical ports/sources 1802A, 1802B, etc. that provide optical signals. For example, in some embodiments, optical port/light source 1802A may include an optical input coupler that provides an optical signal coupled to optical path 1803. In other embodiments, optical port/light source 1802A may include a modulated light source, such as a laser (e.g., for a coherence-sensitive embodiment) or a light emitting diode (LED) (e.g., for a coherence-insensitive embodiment), that generates an optical signal that is coupled to optical path 1803. Some embodiments may include a combination of a port coupling an optical signal into system 1800 and a source generating an optical signal within system 1800. An optical signal may include any light wave (e.g., an electromagnetic wave whose spectrum includes wavelengths in a range between about 100 nm and about 1 mm) that has been or is being modulated with information using any of a variety of forms of modulation. The optical path 1803 may be defined, for example, based on a guided mode of an optical waveguide (e.g., a waveguide embedded in a photonic integrated circuit (PIC) or an optical fiber) or based on a predetermined free-space path between the optical port/light source 1802A and another module of the system 1800.

In some embodiments, the optoelectronic computing system 1800 is configured to perform computations on arrays of input values encoded on respective optical signals provided through the optical ports/light sources 1802A, 1802B, etc. For example, for various machine learning applications based on neural networks, the computation may implement vector matrix multiplication (also called vector-by-matrix multiplication), where the input vector is multiplied by a matrix to produce the output vector as a result. The optical signals may represent elements of a vector, possibly including only a subset of selected elements of the vector. For example, for some neural network models, the size of the matrix used in the computation may be larger than the size of the matrix that can be loaded into a hardware system (e.g., an engine or coprocessor of a larger system) that performs the vector matrix multiplication portion of the computation. Thus, performing a portion of the computation may involve dividing the matrix and vector into smaller segments that may be provided separately to the hardware system.

The modules shown in fig. 18 may be part of a larger system that performs vector matrix multiplication on a relatively large matrix (or sub-matrix), such as a 64×64 element matrix. For purposes of illustration, however, the modules will be described in the context of an example computation that performs vector matrix multiplication using a 2×2 element matrix. The modules referenced in this example include two replication modules 1804A and 1804B, four multiplication modules 1806A, 1806B, 1806C, and 1806D, and two summation modules, only one of which (summation module 1808) is shown in fig. 18. These modules cause the input vector x = [x_A, x_B]ᵀ to be multiplied by the matrix M = [[M_A, M_B], [M_C, M_D]] to generate an output vector y = [y_A, y_B]ᵀ. For this vector matrix multiplication y = M x, each of the two elements of the output vector y may be represented by a different equation, as shown below.

y_A = M_A x_A + M_B x_B

y_B = M_C x_A + M_D x_B

These equations may be broken down into separate steps (copy operations, multiply operations, and sum operations) that may be performed in system 1800 using a set of basic operations. In these equations, each element of the input vector appears twice, so there are two copy operations. There are also four multiplication operations and two summation operations. For systems that implement vector matrix multiplication using larger matrices, the number of operations performed will be greater, and for matrices that are not square (i.e., the numbers of columns and rows differ), the relative number of instances of each operation will be different.
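The decomposition described above can be illustrated numerically. The following is a minimal sketch, not an implementation of the disclosed hardware: the function names `copy_value` and `vector_matrix_2x2` are hypothetical, and each optical or electrical operation is modeled as ideal arithmetic on plain numbers.

```python
def copy_value(value, n_copies):
    """Model a copy operation: every copy carries the same encoded value."""
    return [value] * n_copies


def vector_matrix_2x2(x, M):
    """Compute y = M x using two copies, four multiplications, two sums."""
    x_a, x_b = x
    (m_a, m_b), (m_c, m_d) = M

    # Two copy operations: each input element is needed in both equations.
    x_a_upper, x_a_lower = copy_value(x_a, 2)
    x_b_upper, x_b_lower = copy_value(x_b, 2)

    # Four multiplication operations, one per matrix element.
    p_a = m_a * x_a_upper
    p_b = m_b * x_b_upper
    p_c = m_c * x_a_lower
    p_d = m_d * x_b_lower

    # Two summation operations, one per output element.
    y_a = p_a + p_b
    y_b = p_c + p_d
    return (y_a, y_b)
```

For instance, `vector_matrix_2x2((1.0, 2.0), ((0.5, 0.25), (1.0, 0.0)))` evaluates y_A = 0.5·1 + 0.25·2 and y_B = 1·1 + 0·2.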

In this example, the copy operation is performed by the replication modules 1804A and 1804B. The elements x_A and x_B of the input vector are represented by values encoded on the optical signals from optical ports/sources 1802A and 1802B, respectively. Each of these values is used in two equations, so each value is replicated to provide the resulting two copies to different respective multiplication modules. For example, as described in more detail below, values may be encoded in a particular time slot using light waves that have been modulated to have a power from a set of multiple power levels, or light waves having a duty cycle from a set of multiple duty cycles. A value is copied by copying the optical signal on which the value is encoded. The optical signal encoded with the value representing element x_A is replicated by the replication module 1804A, and the optical signal encoded with the value representing element x_B is replicated by the replication module 1804B. Each replication module may be implemented, for example, using an optical power splitter, such as a waveguide splitter that couples a guided mode in an input waveguide to each of two output waveguides at a Y-splitter that splits power gradually (e.g., adiabatically), or a free-space splitter, such as a free-space beam splitter that uses a dielectric interface or a film with one or more layers to transmit and reflect, respectively, two output beams from an input beam.

In this disclosure, when it is said that the optical signal encoded with the value representing element x_A is replicated by the replication module 1804A, it is meant that multiple signal copies representing element x_A are generated based on the input signal; the output signals of the replication module 1804A do not necessarily have the same amplitude as the input signal. For example, if the replication module 1804A splits the input signal power evenly between the two output signals, each of the two output signals will have a power equal to or less than 50% of the input signal power. The two output signals are copies of each other, and the amplitude of each output signal of the replication module 1804A is different from the amplitude of the input signal. Moreover, in some embodiments having a set of multiple replication modules for replicating a given optical signal or subset of optical signals, each individual replication module does not necessarily split power evenly among the copies it generates, but the set of replication modules may be collectively configured to provide copies having substantially equal power as the inputs of downstream modules (e.g., downstream multiplication modules).
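The statement that each copy carries at most 50% of the input power can be sketched as follows. This is an illustrative model only: the function name `split_power` and the `excess_loss` parameter (a lumped fractional loss) are assumptions that do not appear in the disclosure.

```python
def split_power(p_in, fractions, excess_loss=0.0):
    """Return the output powers of an idealized optical power splitter.

    p_in        : input optical power (arbitrary units)
    fractions   : nominal split fractions, which must sum to 1
    excess_loss : hypothetical fractional loss applied to every output
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1")
    if not 0.0 <= excess_loss < 1.0:
        raise ValueError("excess_loss must be in [0, 1)")
    return [p_in * f * (1.0 - excess_loss) for f in fractions]


# An even two-way split with no loss gives exactly half the power per copy;
# with loss, each copy carries less than half.
lossless = split_power(1.0, [0.5, 0.5])
lossy = split_power(1.0, [0.5, 0.5], excess_loss=0.1)
```

Both copies encode the same value even though neither has the amplitude of the input signal, which is why downstream modules interpret power relative to the scale at their own inputs.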

In this embodiment, the multiplication operation is performed by four multiplication modules 1806A, 1806B, 1806C, and 1806D. For each copy of an optical signal, a multiplication module multiplies the copy of the optical signal by a matrix element value, which may be performed using optical amplitude modulation. For example, the multiplication module 1806A multiplies the input vector element x_A by the matrix element M_A. The value of vector element x_A may be encoded on the optical signal, and the value of matrix element M_A may be encoded as the amplitude modulation level of the optical amplitude modulator.

The optical signal encoded with vector element x_A may be encoded using different forms of amplitude modulation. The amplitude of the optical signal may correspond to a particular instantaneous power level P_A of the physical light wave over a particular time slot, or may correspond to a particular energy E_A of the physical light wave over a particular time slot (the power integrated over time yielding the total energy). For example, the power of the laser source may be modulated to have a particular power level from a predetermined set of multiple power levels. In some embodiments, it may be useful to operate the electronic circuit near an optimal operating point; thus, instead of varying power over many possible power levels, an optimal "on" power level is used, where the signal is modulated to be "on" for a particular portion of the time slot and "off" (at zero power) for the remainder. The portion of time during which the power is at the "on" level corresponds to a particular energy level. Any of these particular values of power or energy may be mapped to particular values of element x_A (using a linear or non-linear mapping). The actual integration over time that produces a particular total energy level may occur downstream of the system 1800, after the signal is in the electrical domain, as described in more detail below.
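The relationship between the "on" fraction of a time slot and the encoded energy can be sketched as below. This is a minimal illustration assuming a rectangular pulse and a linear mapping between energy and value; the names `pulse_energy`, `decode_value`, and `x_max` are hypothetical.

```python
def pulse_energy(p_on, duty_cycle, slot_duration):
    """Energy of a slot in which the light is at power p_on for a fraction
    duty_cycle of the slot and at zero power otherwise."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return p_on * duty_cycle * slot_duration


def decode_value(energy, p_on, slot_duration, x_max=1.0):
    """Map an integrated energy back to a value, assuming a linear mapping
    in which a fully 'on' slot represents x_max."""
    return x_max * energy / (p_on * slot_duration)


# Example: 2 mW "on" level, 1 ns slot, 25% duty cycle.
energy = pulse_energy(2e-3, 0.25, 1e-9)
value = decode_value(energy, 2e-3, 1e-9)
```

In this sketch the decoded value equals the duty cycle because the mapping is linear and `x_max` is 1; a non-linear mapping, which the text also allows, would replace `decode_value`.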

In addition, the term "amplitude" may refer to the amplitude of a signal represented by the instantaneous or integrated power in a light wave, or equivalently, to the "electromagnetic field amplitude" of the light wave. This is because the electromagnetic field amplitude has a well-defined relationship to the signal amplitude (e.g., integrating the electromagnetic field intensity, which is proportional to the square of the electromagnetic field amplitude, over the transverse dimension of the guided mode or free-space beam yields the instantaneous power). This results in a relationship between the modulation values: a modulator that modulates the electromagnetic field amplitude by a particular value √M may also be considered as modulating the power-based signal amplitude by a corresponding value M (since the optical power is proportional to the square of the electromagnetic field amplitude).
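The correspondence between field-amplitude modulation and power modulation can be checked with a few lines of arithmetic. This sketch assumes the proportionality constant between power and squared field amplitude is 1 (arbitrary units); the function names are hypothetical.

```python
import math


def power_from_field(field_amplitude):
    """Optical power in arbitrary units, taken as the squared field amplitude."""
    return field_amplitude ** 2


def modulate_field(field_amplitude, m):
    """Scale the field amplitude by sqrt(m) so that the power scales by m."""
    if m < 0:
        raise ValueError("power modulation value must be non-negative")
    return math.sqrt(m) * field_amplitude


e_in = 2.0
e_out = modulate_field(e_in, 0.25)          # field scaled by 0.5
ratio = power_from_field(e_out) / power_from_field(e_in)  # power scaled by 0.25
```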

The optical amplitude modulator used by the multiplication module to encode the matrix element M_A may operate by changing the amplitude of the optical signal (i.e., the power in the optical signal) using any of a variety of physical interactions. For example, the modulator may include a ring resonator, an electro-absorption modulator, a thermo-optic modulator, or a Mach-Zehnder interferometer (MZI) modulator. In some techniques, a portion of the power is absorbed as part of the physical interaction; in other techniques, the power is diverted using a physical interaction that modifies a characteristic of the light wave other than its power, such as its polarization or phase, or that modifies the coupling of optical power between different optical structures (e.g., using tunable resonators). For optical amplitude modulators that operate using interference (e.g., destructive and/or constructive interference) between light waves that have traveled on different paths, a coherent light source (e.g., a laser) may be used. For optical amplitude modulators that operate using absorption, either a coherent light source or an incoherent or low coherence light source, such as an LED, may be used.

In one example of a waveguide 1×2 optical amplitude modulator, a phase modulator is used to modulate the power in an optical wave by placing the phase modulator in one of the waveguides of the modulator. For example, a waveguide 1×2 optical amplitude modulator may split an optical wave guided by an input optical waveguide into a first arm and a second arm. The first arm includes a phase shifter that produces a relative phase shift with respect to the phase delay of the second arm. The modulator then combines the light waves from the first arm and the second arm. In some embodiments, different phase delay values multiply the power in the light wave guided by the input optical waveguide by a value between 0 and 1 through constructive or destructive interference. In some embodiments, the first arm and the second arm are combined into each of two output waveguides, and the difference between the photocurrents produced by respective photodetectors receiving the light waves from the two output waveguides provides a signed multiplication result (e.g., multiplication by a value between -1 and 1), as described in more detail below. By appropriate selection of the amplitude scaling of the encoded optical signal, the range of matrix element values can be mapped to any range of positive values (0 to M) or signed values (-M to M).
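The interference-based multiplication described above can be sketched with the textbook transfer function of an ideal Mach-Zehnder interferometer. The formulas below assume a lossless device with ideal 50:50 couplers, which the disclosure does not specify; the function names are hypothetical.

```python
import math


def mzi_outputs(p_in, delta_phi):
    """Output powers of an ideal, lossless MZI with 50:50 couplers.

    delta_phi is the relative phase shift between the two arms (radians).
    The two outputs always sum to p_in in this idealized model.
    """
    p_first = p_in * math.cos(delta_phi / 2.0) ** 2
    p_second = p_in * math.sin(delta_phi / 2.0) ** 2
    return p_first, p_second


def signed_weight(delta_phi):
    """Difference of the two outputs for unit input: a value in [-1, 1]."""
    p_first, p_second = mzi_outputs(1.0, delta_phi)
    return p_first - p_second  # equals cos(delta_phi)


def phase_for_weight(weight):
    """Phase shift that realizes a desired signed weight in [-1, 1]."""
    if not -1.0 <= weight <= 1.0:
        raise ValueError("weight must be between -1 and 1")
    return math.acos(weight)
```

Using a single output gives multiplication by a value between 0 and 1, while taking the difference of the two detected outputs gives multiplication by a value between -1 and 1, matching the two cases in the paragraph above.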

In this embodiment, the summation operation is performed by two summation modules, with summation module 1808 (shown in fig. 18) being used to perform the summation in the equation for output vector element y_B. A corresponding summation module (not shown) is used to perform the summation in the equation for output vector element y_A. The summation module 1808 produces an electrical signal that represents the sum of the results of the two multiplication modules 1806C and 1806D. In this example, the electrical signal is in the form of a current i_sum that is proportional to the sum of the powers in the output optical signals generated by multiplication modules 1806C and 1806D, respectively. In some embodiments, the summation operation that produces this current i_sum is performed in the optoelectronic domain, and in other embodiments it is performed in the electrical domain. Alternatively, some embodiments may use optoelectronic-domain summation for some summation modules and electrical-domain summation for other summation modules.

In embodiments where the summation is performed in the electrical domain, the summation module 1808 may be implemented using (1) two or more input conductors, each carrying an input current whose magnitude represents the result of one of the multiplication modules, and (2) at least one output conductor carrying a current that is the sum of the input currents. This may occur, for example, if the conductors are wires that meet at a junction. For example, and without being bound by theory, this relationship may be understood based on Kirchhoff's current law, which states that the current flowing into a junction is equal to the current flowing out of the junction. For these embodiments, the signals 1810A and 1810B provided to the summation module 1808 are input currents, each of which may be generated by a photodetector that is part of a multiplication module and that generates a corresponding photocurrent whose amplitude is proportional to the power in the received optical signal. The summation module 1808 then provides an output current i_sum. The instantaneous value of the output current or the integrated value of the output current can then be used to represent a quantitative value of the sum.
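Electrical-domain summation can be sketched as photodetection followed by adding currents at a junction. This is an idealized model: the `responsivity` parameter (amperes per watt) and the function names are assumptions introduced for illustration, and dark current and noise are ignored.

```python
def photocurrent(optical_power_w, responsivity_a_per_w):
    """Ideal photodetector: current proportional to received optical power."""
    return responsivity_a_per_w * optical_power_w


def junction_current(input_currents_a):
    """Kirchhoff's current law at a junction: the outgoing current equals
    the sum of the incoming currents."""
    return sum(input_currents_a)


# Two multiplication-module outputs of 1 mW and 2 mW, detected with an
# assumed responsivity of 0.8 A/W, then summed at a junction.
currents = [photocurrent(p, 0.8) for p in (1e-3, 2e-3)]
i_sum = junction_current(currents)
```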

In embodiments where the summation is performed in the optoelectronic domain, the summation module 1808 may be implemented using a photodetector (e.g., a photodiode) that receives the optical signals generated by the different respective multiplication modules. For these embodiments, the signals 1810A and 1810B provided to the summation module 1808 are input optical signals, each of which includes an optical wave whose power represents the result of one of the multiplication modules. The output current i_sum in this embodiment is the photocurrent generated by the photodetector. Since the wavelengths of the light waves are different (e.g., sufficiently different that no significant constructive or destructive interference occurs between them), the photocurrent will be proportional to the sum of the powers of the received optical signals. The photocurrent is also substantially equal to the sum of the currents that would result if the individual optical powers were detected by separate equivalent photodetectors. The wavelengths of the light waves are different but close enough that the photodetector has substantially the same response to each (e.g., the wavelengths lie within the substantially flat detection bandwidth of the photodetector). As described above, summing in the electrical domain using current summation can achieve a simpler system architecture by avoiding the need for multiple wavelengths.

Fig. 19A shows an example configuration of a system 1900 for performing vector matrix multiplication using a 2×2 element matrix, where the summation operation is performed in the electrical domain. In this example, the input vector is v = [v₁, v₂]ᵀ and the matrix is M = [[M₁₁, M₁₂], [M₂₁, M₂₂]]. Each element of the input vector is encoded on a different optical signal. Two different replication modules 1902a and 1902b (collectively 1902) perform optical replication operations to split the computation onto different paths (e.g., an "upper" path and a "lower" path). There are four multiplication modules 1904a, 1904aa, 1904b, and 1904bb (collectively 1904), each multiplying by a different matrix element using optical amplitude modulation. At the output of each multiplication module 1904, there is a photodetection module 1906 (e.g., 1906a, 1906aa, 1906b, and 1906bb) that converts the optical signal into an electrical signal in the form of a current. The two upper paths of the different input vector elements (denoted M₁₁v₁ and M₁₂v₂) are combined using a summation module 1908a, and the two lower paths of the different input vector elements (denoted M₂₁v₁ and M₂₂v₂) are combined using a summation module 1908b, where the summation modules 1908 perform the summation in the electrical domain. Thus, each element of the output vector is encoded on a different electrical signal. As shown in fig. 19A, each component of the output vector is generated incrementally as the calculation proceeds, yielding the following results for the upper and lower paths, respectively.

M₁₁v₁+M₁₂v₂

M₂₁v₁+M₂₂v₂

The same optical power may represent different values in different parts of the system. For example, the replication module 1902a receives an input signal on an input waveguide 1914 and provides output signals on output waveguides 1916a and 1916b. The optical signal representing the value v₁ on output waveguide 1916a or 1916b has an amplitude of about half the amplitude of the optical signal representing the value v₁ on input waveguide 1914.

In some embodiments, if the replication module performs an optical replication operation to split the computation into three paths, the optical signal on the output waveguide of the optical splitter representing a particular value has an amplitude that is approximately one third of the amplitude of the optical signal on the input waveguide of the optical splitter representing the particular value. Similarly, if the replication module performs an optical replication operation to split the computation into four paths, the optical signal on the output waveguide of the optical splitter representing a particular value has an amplitude that is approximately one-fourth the amplitude of the optical signal on the input waveguide of the optical splitter representing that particular value, and so on.

In some embodiments, the photonic integrated circuit includes different types of replication modules, for example, a first replication module that performs an optical replication operation to split a computation into two paths, a second replication module that performs an optical replication operation to split a computation into three paths, a third replication module that performs an optical replication operation to split a computation into four paths, and a fourth replication module that performs an optical replication operation to split a computation into eight paths. The signals derived from the outputs of the first, second, third, and fourth replication modules are scaled before being combined.

For example, let vout1 be the value of a vector resulting from vector matrix multiplication using a 2×2 element matrix, where two-way splitters are used in the optical replication operation, and let vout2 be the value of a vector resulting from vector matrix multiplication using a 4×4 element matrix, where four-way splitters are used in the optical replication operation. If the photonic integrated circuit is configured such that vout1 is combined with vout2, then vout2 is scaled to twice its value prior to being combined with vout1.
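The factor of two follows from the split ratios: a two-way split leaves about 1/2 of the input amplitude on each path and a four-way split leaves about 1/4, so the second result sits on a scale that is half that of the first. The sketch below illustrates this under the assumption of ideal, even splitting; the function name `combine_scaled` and its parameters are hypothetical.

```python
def combine_scaled(vout1, vout2, ways1=2, ways2=4):
    """Bring vout2 onto the same scale as vout1 before adding.

    vout1 was produced after an even split into ways1 paths and vout2 after
    an even split into ways2 paths, so vout2 is rescaled by ways2 / ways1.
    """
    scale = ways2 / ways1
    return vout1 + scale * vout2


# Suppose the unscaled results would be 0.6 and 0.8. After 2-way and 4-way
# splitting they are observed as 0.3 and 0.2, respectively.
observed1 = 0.6 / 2
observed2 = 0.8 / 4
combined = combine_scaled(observed1, observed2)  # (0.6 + 0.8) / 2
```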

The configuration of system 1900 may be implemented using any of a variety of optoelectronic techniques. In some embodiments, there is a common substrate (e.g., a semiconductor such as silicon) that can support integrated optical and electronic components. The optical paths may be implemented in waveguide structures in which a material with a higher optical index is surrounded by a material with a lower optical index, thereby defining a waveguide for propagating the light waves that carry the optical signals. The electrical paths may be implemented by an electrically conductive material for propagating the electrical currents that carry the electrical signals. (In figs. 19A to 20A and 21A to 24E, unless otherwise specified, the thickness of the lines representing the paths is used to distinguish between optical paths, represented by thicker lines, and electrical paths, represented by thinner lines or dashed lines.) Optical devices (e.g., splitters and optical amplitude modulators) and electrical devices (e.g., photodetectors and operational amplifiers (op-amps)) may be fabricated on the common substrate. Alternatively, different devices with different substrates may be used to implement different portions of the system, and those devices may communicate over communication channels. For example, optical fibers may be used to provide communication channels that transmit optical signals between multiple devices used to implement the overall system. Those optical signals may represent different subsets of the input vector provided when performing vector matrix multiplication and/or different subsets of intermediate results calculated when performing vector matrix multiplication, as described in more detail below.

In the present disclosure, the drawings may show the optical waveguide passing through the electrical signal line, with the understanding that the optical waveguide does not intersect the electrical signal line. The electrical signal lines and the optical waveguides may be arranged in different layers of the device.

Fig. 19B illustrates an example configuration of a system 1920 for performing vector matrix multiplication using a 2×2 element matrix, where the summation operation is performed in the optoelectronic domain. In this example, two different respective wavelengths λ₁ and λ₂ are used to encode the different input vector elements on the optical signals. Also, the optical output signals of the multiplication modules 1904 are combined in optical combiner modules 1910, such that an optical waveguide guides two optical signals at the two wavelengths to each optoelectronic summation module 1912, which can be implemented using a photodetector, as used for the photodetection modules 1906 in the example of fig. 19A. However, in this example, the sum is represented by a photocurrent representing the power in the two wavelengths, rather than by the current leaving a junction between different conductors.

In the present disclosure, when a drawing shows two optical waveguides intersecting each other, it will be clear from the description whether the two optical waveguides are actually optically coupled to each other. For example, two waveguides that appear to cross each other in a top view of the device may be implemented in different layers and thus not cross each other. For example, in some implementations, the optical path providing optical signal λ₂ as an input to replication module 1902 and the optical path providing optical signal M₁₁v₁ from multiplication module 1904 to optical combiner module 1910 are not optically coupled to each other, although they may appear to intersect each other in the figures. Similarly, the optical path providing optical signal λ₂ from replication module 1902 to multiplication module 1904 and the optical path providing optical signal M₂₁v₁ from multiplication module 1904 to optical combiner module 1910 are not optically coupled to each other, although they may appear to cross each other in the figures.

The system configurations shown in figs. 19A and 19B can be extended to realize a system configuration for performing vector matrix multiplication using an m×n element matrix. In this example, the input vector is v = [v₁, v₂, …, v_n]ᵀ and the matrix is the m×n matrix M whose element in row i and column j is M_ij. For example, input vector elements v₁ through v_n are provided by n waveguides, and each input vector element is processed by one or more replication modules to provide m copies of the input vector element to m respective paths. There are m×n multiplication modules, each multiplying by a different matrix element using optical amplitude modulation to produce an electrical or optical signal representing M_ij·v_j (i = 1..m, j = 1..n). The signals representing M_ij·v_j (j = 1..n) are combined using an i-th summation module (i = 1..m) to produce the following results for the m paths, respectively.

M₁₁v₁ + M₁₂v₂ + … + M_1n·v_n

M₂₁v₁ + M₂₂v₂ + … + M_2n·v_n

...

M_m1·v₁ + M_m2·v₂ + … + M_mn·v_n
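The generalization to an m×n matrix can be sketched by modeling the three operation types on plain Python lists. This is an illustrative arithmetic model, not the disclosed hardware; the function name `optoelectronic_matvec` is hypothetical.

```python
def optoelectronic_matvec(M, v):
    """Compute M v (M is m x n, v has n elements) as copies, products, sums."""
    m, n = len(M), len(v)
    if any(len(row) != n for row in M):
        raise ValueError("every row of M must have len(v) elements")

    # Replication: each of the n input elements is copied to m paths.
    copies = [[v[j]] * m for j in range(n)]          # copies[j][i]

    # Multiplication: m x n modules, one per matrix element M[i][j].
    products = [[M[i][j] * copies[j][i] for j in range(n)] for i in range(m)]

    # Summation: the i-th summation module adds the n products of row i.
    return [sum(row) for row in products]


result = optoelectronic_matvec([[1, 0, 2], [0, 1, 1]], [1, 2, 3])
```

Here a 2×3 matrix uses 3 replication steps (each producing 2 copies), 6 multiplications, and 2 summations, illustrating how the operation counts change for a non-square matrix.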

Since optical amplitude modulation is able to reduce the power in the optical signal from its full value to a lower value, down to zero (or near zero) power, multiplication by any value between 0 and 1 can be achieved. However, some calculations may require multiplication by a value greater than 1 and/or multiplication by a signed (positive or negative) value. First, to extend the range to 0 to M_max (where M_max > 1), the original modulation of the optical signal may include an explicit or implicit scaling of the original vector element amplitude by M_max (or, equivalently, scaling the value mapped to a particular vector element amplitude in a linear mapping by 1/M_max), such that the range 0 to 1 of matrix element amplitudes corresponds quantitatively in the calculation to the range 0 to M_max. Second, to extend the positive range 0 to M_max of matrix element values to the signed range -M_max to M_max, a symmetrical differential configuration may be used, as described in more detail below. Similarly, a symmetrical differential configuration may also be used to extend the positive range of values encoded on the various signals to a signed range of values.
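The range extension to 0..M_max can be sketched as follows. The sketch assumes that the scaling is applied explicitly when interpreting the detected signal; the function name `multiply_with_range` is hypothetical.

```python
def multiply_with_range(v, weight, m_max):
    """Multiply v by a weight in [0, m_max] using a modulator whose
    transmission is limited to [0, 1]."""
    if not 0.0 <= weight <= m_max:
        raise ValueError("weight must be between 0 and m_max")
    transmission = weight / m_max      # what the modulator physically applies
    detected = transmission * v        # modulated signal, always <= v
    return m_max * detected            # interpret the result on the scaled range


product = multiply_with_range(0.5, 3.0, 4.0)   # represents 3.0 * 0.5
```

The physical modulation never exceeds 1; only the interpretation of the signal levels changes, which is the sense in which the scaling can be implicit.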

Fig. 20A shows an example of a symmetrical differential configuration 2000 for providing values encoded on optical signals with a signed range. In this example, there are two related optical signals encoding unsigned values, designated as V₁⁺ and V₁⁻, where each value is assumed to vary between 0 (e.g., corresponding to near zero optical power) and V_max (e.g., corresponding to a maximum power level). The relationship between the two optical signals is that when one optical signal is encoded with a "main" value V₁⁺, the other optical signal is encoded with a corresponding "anti-symmetric" value V₁⁻, such that as the main value V₁⁺ encoded on one optical signal monotonically increases from 0 to V_max, the anti-symmetric value V₁⁻ encoded on the paired optical signal monotonically decreases from V_max to 0. Conversely, as the main value V₁⁺ encoded on one optical signal monotonically decreases from V_max to 0, the anti-symmetric value V₁⁻ encoded on the paired optical signal monotonically increases from 0 to V_max. After the optical signals in the upper and lower paths are converted to current signals by the respective photodetection modules 1906, the difference between the current signals may be generated by a current subtraction module 2002. The difference between the current signals encoding V₁⁺ and V₁⁻ results in a current encoded with a signed value V₁ given as:

V₁ = V₁⁺ - V₁⁻

where, as the unsigned main value V₁⁺ monotonically increases from 0 to V_max and the paired anti-symmetric value V₁⁻ monotonically decreases from V_max to 0, the signed value V₁ monotonically increases from -V_max to V_max. There are various techniques that can be used to implement the symmetrical differential configuration of fig. 20A, as shown in figs. 20B and 20C.
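One way to realize the paired unsigned values is a linear mapping in which the two values always sum to V_max. The disclosure only requires that one value increase monotonically while the other decreases, so the linear choice below, along with the names `encode_signed` and `decode_signed`, is an assumption made for illustration.

```python
def encode_signed(v, v_max):
    """Map a signed value v in [-v_max, v_max] to a pair of unsigned values
    (main, anti_symmetric), each in [0, v_max]."""
    if abs(v) > v_max:
        raise ValueError("v must be between -v_max and v_max")
    main = (v_max + v) / 2.0
    anti_symmetric = (v_max - v) / 2.0
    return main, anti_symmetric


def decode_signed(main, anti_symmetric):
    """Current subtraction: the signed value is the difference of the pair."""
    return main - anti_symmetric


pair_low = encode_signed(-1.0, 1.0)    # most negative value
pair_high = encode_signed(1.0, 1.0)    # most positive value
recovered = decode_signed(*encode_signed(0.3, 1.0))
```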

In fig. 20B, the optical signals are detected in a common-terminal configuration in which two photodiode detectors are connected to a common terminal 2032 (e.g., the inverting terminal) of an operational amplifier 2030. In this configuration, the current 2010 generated by the first photodiode detector 2012 and the current 2014 generated by the second photodiode detector 2016 are combined at a junction 2018 between three conductors to produce a difference current 2020 between the current 2010 and the current 2014. The current 2010 and the current 2014 are provided from opposite sides of the respective photodiodes, which are connected at their other ends to voltage sources (not shown) providing bias voltages of the same magnitude v_bias but of opposite sign, as shown in fig. 20B. In this configuration, the difference is generated due to the behavior of the currents meeting at the junction 2018. The difference current 2020 represents a signed value encoded on an electrical signal that corresponds to the difference between the unsigned values encoded on the detected optical signals. The operational amplifier 2030 may be configured as a transimpedance amplifier (TIA), with the other terminal 2024 grounded and the output terminal 2026 fed back to the common terminal 2032 through a resistive component 2028, so that the output provides a voltage proportional to the difference current 2020. Such a transimpedance amplifier configuration provides the resulting value as an electrical signal in the form of a voltage signal.
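The conversion of a difference current into a voltage can be sketched with the ideal transimpedance relation. The sketch assumes an ideal op-amp and the common inverting-TIA sign convention (output voltage equal to minus the feedback resistance times the current flowing into the inverting node); the actual sign depends on the current directions in fig. 20B, which are not reproduced here. The function names are hypothetical.

```python
def difference_current(i_first, i_second):
    """Net current at the junction when the two photocurrents are provided
    from opposite sides of their photodiodes."""
    return i_first - i_second


def tia_output_voltage(i_diff, r_feedback_ohm):
    """Ideal inverting transimpedance amplifier: V_out = -R_f * I_in."""
    return -r_feedback_ohm * i_diff


# 3 uA and 1 uA photocurrents with an assumed 10 kOhm feedback resistor.
i_diff = difference_current(3e-6, 1e-6)
v_out = tia_output_voltage(i_diff, 1e4)
```

The magnitude of the output voltage is proportional to the difference current, and its sign flips when the second photocurrent exceeds the first, which is how a signed value is carried on the electrical signal.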

In fig. 20C, the optical signals are detected in a differential-terminal configuration in which two photodiode detectors are connected to different terminals of an operational amplifier 2050. In this configuration, the current 2040 generated by the first photodiode detector 2042 is provided to the inverting terminal 2052, and the current 2044 generated by the second photodiode detector 2046 is provided to the non-inverting terminal 2054. The currents 2040 and 2044 are provided from the same side of the respective photodiodes, which are connected at their other ends to a voltage source (not shown) that provides bias voltages of the same magnitude v_bias and the same sign, as shown in fig. 20C. The output terminal 2056 of the operational amplifier 2050 in this configuration provides a current proportional to the difference between the current 2040 and the current 2044. In this configuration, the difference is generated due to the behavior of the circuitry of the operational amplifier 2050. The difference current flowing from the output terminal 2056 represents a signed value encoded on an electrical signal that corresponds to the difference between the unsigned values encoded on the detected optical signals.

Fig. 21A shows an example of a symmetrical differential configuration 2100 for providing a signed range of values encoded as modulation levels of optical amplitude modulators implementing the multiplication modules 1904. In this example, there are two related modulators configured to modulate by modulation levels designated as M₁₁⁺ and M₁₁⁻, wherein each modulation level is assumed to vary between 0 (e.g., corresponding to an optical power modulated down to near zero) and M_max (e.g., corresponding to an optical power maintained near a maximum power level). The relationship between the two modulation levels is such that when one modulation level is configured at a "main" value M₁₁⁺, the other modulation level is configured at a corresponding "anti-symmetric" value M₁₁⁻, so that as the main value M₁₁⁺ of one modulator monotonically increases from 0 to M_max, the anti-symmetric value M₁₁⁻ of the other modulator monotonically decreases from M_max to 0. Conversely, as the main value M₁₁⁺ of one modulator monotonically decreases from M_max to 0, the anti-symmetric value M₁₁⁻ of the other modulator monotonically increases from 0 to M_max. After the replication module 1902 replicates the input optical signal encoded with the value V, each modulator provides a modulated output optical signal to a corresponding photo detection module 1906. The multiplication module 1904 in the upper path includes a modulator that multiplies by M₁₁⁺ and provides an optical signal encoded with the value M₁₁⁺V. The multiplication module 1904 in the lower path includes a modulator that multiplies by M₁₁⁻ and provides an optical signal encoded with the value M₁₁⁻V. After the optical signals are converted into current signals by the respective photo detection modules 1906, the difference between them may be generated by the current subtraction module 2102. The difference between the current signals encoded with M₁₁⁺V and M₁₁⁻V results in a current encoded with V multiplied by the signed value M₁₁ given as:

M₁₁ = M₁₁⁺ − M₁₁⁻

wherein, as the unsigned main value M₁₁⁺ monotonically increases from 0 to M_max and the paired anti-symmetric value M₁₁⁻ monotonically decreases from M_max to 0, the signed value M₁₁ monotonically increases from −M_max to M_max.
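The mapping between a signed value and its pair of unsigned modulation levels can be illustrated with a minimal sketch. The sketch assumes a linear anti-symmetric relationship in which the two levels always sum to M_max; the disclosure only requires that one level increase monotonically while the other decreases, so this particular relationship, and the function names `split_signed` and `recombine`, are illustrative assumptions.

```python
def split_signed(m_signed, m_max=1.0):
    """Return (main, anti_symmetric) unsigned levels for a signed value.

    Assumption (not stated in the disclosure): main + anti_symmetric == m_max,
    so that main - anti_symmetric == m_signed.
    """
    if not -m_max <= m_signed <= m_max:
        raise ValueError("signed value must lie in [-m_max, m_max]")
    main = (m_max + m_signed) / 2.0
    anti = (m_max - m_signed) / 2.0
    return main, anti


def recombine(main, anti):
    """Model the current subtraction module: signed value = main - anti."""
    return main - anti


if __name__ == "__main__":
    for m in (-1.0, -0.5, 0.0, 0.5, 1.0):
        main, anti = split_signed(m)
        print(m, main, anti, recombine(main, anti))
```

Under this assumption, both unsigned levels remain within the range 0 to M_max for every signed value between −M_max and M_max.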

Fig. 21B shows an example configuration of a system 2110 that is an implementation of the system 1800 for performing vector matrix multiplication using a 2 × 2 element matrix, where the summation operation is performed in the electrical domain and both the elements of the input vector and the elements of the matrix are signed. In this example, for each signed element of the input vector, there are two associated optical signals encoding unsigned values. For the first signed input vector element value V₁, there are two unsigned values designated as V₁⁺ and V₁⁻, and for the second signed input vector element value V₂, there are two unsigned values designated as V₂⁺ and V₂⁻. Each unsigned value encoded on an optical signal is received by a replication module 2112, which performs one or more optical replication operations that produce four copies of the optical signal on four respective optical paths. In some embodiments of the replication module 2112, there are three different Y-waveguide splitters, each configured to split using a different power ratio (which may be implemented, for example, using any of a variety of photonic devices). For example, a first splitter may split using a power ratio of 1:4 to transfer 25% (1/4) of the power to the first path, a second splitter may split using a power ratio of 1:3 to transfer 25% (1/4 = 1/3 × 3/4) of the power to the second path, and a third splitter may split using a power ratio of 1:2 to transfer 25% (1/4 = 1/2 × 2/3 × 3/4) of the power to the third path and the remaining 25% of the power to the fourth path. The individual splitters that are part of the replication module 2112 can be arranged in different portions of the substrate to appropriately distribute the different copies to different paths within the system. In other embodiments of the replication module 2112, the signal may be split into a different number of paths using different splitting ratios as appropriate.
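The arithmetic of the cascaded splitters can be checked with a short sketch. It assumes ideal, lossless splitters and interprets a power ratio of 1:k as diverting a fraction 1/k of the incoming power to a path; the function names are hypothetical.

```python
from fractions import Fraction


def cascaded_tap_fractions(n_paths):
    """Fraction of incoming power each successive splitter diverts (1/n, 1/(n-1), ..., 1/2)."""
    return [Fraction(1, n_paths - k) for k in range(n_paths - 1)]


def path_powers(taps):
    """Power delivered to each path, as a fraction of the input power."""
    remaining = Fraction(1)
    powers = []
    for tap in taps:
        powers.append(remaining * tap)   # power diverted to this path
        remaining *= 1 - tap             # power passed on to the next splitter
    powers.append(remaining)             # the last path receives what is left
    return powers


if __name__ == "__main__":
    taps = cascaded_tap_fractions(4)
    print(taps)               # tap fractions 1/4, 1/3, 1/2
    print(path_powers(taps))  # each of the four paths receives 1/4
```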

In some implementations, the replication module 2112 can include an optical replication distribution network having a binary tree topology. The optical replication distribution network comprises a plurality of optical splitters having an input port for receiving an input optical signal and two or more output ports for providing output optical signals, wherein each output optical signal has a predetermined proportion of the power of the input optical signal. For example, the first splitter may split using a power ratio of 1:2 to provide two intermediate optical signals of substantially the same power (e.g., 50% of the power of the input optical wave to each of the two output ports). Next, one of the intermediate optical signals may be split using a second splitter having a power ratio of 1:2 to transfer 25% of the input optical wave power to each of the first and second paths, and the other of the intermediate optical signals may be split using a third splitter having a power ratio of 1:2 to transfer 25% of the input optical wave power to each of the third and fourth paths. In this example, the optical replication distribution network splits the input optical signal into four output optical signals, where the power of each output optical signal is scaled to 25% of the input optical signal power. In this example, the output optical signal is scaled to the same proportion of the input optical signal.
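As a minimal sketch of the binary tree arrangement, the following assumes ideal, lossless 1:2 splitters that divide power exactly in half; `binary_tree_split` is a hypothetical helper name.

```python
def binary_tree_split(input_power, levels):
    """Output powers of a binary tree of uniform 1:2 splitters with the given number of levels."""
    powers = [input_power]
    for _level in range(levels):
        next_powers = []
        for p in powers:
            # Each splitter sends half of its input power to each of its two output ports.
            next_powers.extend([p / 2, p / 2])
        powers = next_powers
    return powers


if __name__ == "__main__":
    # Two levels (three splitters in total) give four outputs of 25% each.
    print(binary_tree_split(1.0, 2))
```

Every splitter in this arrangement is the same uniform 1:2 device, in contrast to the cascaded arrangement, which needs a different splitting ratio at each stage.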

Optical replication distribution networks having this binary tree topology offer particular advantages. For example, because a binary tree optical replication distribution network can use a symmetrical design (e.g., a Y-shaped adiabatic waveguide taper) for a uniform 1:2 power splitter at all wavelengths, the network can be wavelength independent, facilitating its use for multiple wavelengths. Furthermore, a non-uniform power splitter may have coupling portions whose lengths must be precisely controlled to obtain the different power ratios (e.g., 1/n, 1/(n-1), ...). However, such precision may be difficult to achieve given existing manufacturing variations. The binary tree optical replication distribution network also facilitates a compact die layout in which some of the electrical paths are shortened, as described in more detail below with reference to figs. 45A-45G.

The system 2110 also includes other modules arranged as shown in fig. 21B to provide two different output electrical signals representing an output vector that is the result of the vector matrix multiplication performed by the system 100. There are 16 different multiplication modules 1904 that modulate different copies of the optical signals representing the input vector, and there are 16 different photo detection modules 1906 that provide electrical signals representing the computed intermediate results. There are also two different summing modules 2114A and 2114B that calculate the overall summation for each output electrical signal. In the drawing, the signal lines electrically coupling the photo detection modules 1906 to the summing module 2114B are shown in broken lines. Because each overall summation may include some anti-symmetric terms that are subtracted from their paired main terms, arising from any symmetrical differential configuration for vector elements and/or matrix elements, the summing modules 2114A and 2114B may include a mechanism for adding some of the terms in the summation after they have been inverted (equivalently, subtracting them from the non-inverted terms). For example, in some embodiments, the summing modules 2114A and 2114B include an inverting input port and a non-inverting input port, such that a term to be added in the overall summation may be connected to the non-inverting input port and a term to be subtracted in the overall summation may be connected to the inverting input port. An example embodiment of such a summing module is an operational amplifier in which the non-inverting terminal is connected to a conductor conducting a current representing the signals to be added and the inverting terminal is connected to a conductor conducting a current representing the signals to be subtracted.
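Alternatively, if inversion of the anti-symmetric terms is performed by other means, an inverting input port may not be required on the summing module. The summing modules 2114A and 2114B generate the following summation results, respectively, to complete the vector matrix multiplication:

M₁₁V₁ + M₁₂V₂ = (M₁₁⁺ − M₁₁⁻)(V₁⁺ − V₁⁻) + (M₁₂⁺ − M₁₂⁻)(V₂⁺ − V₂⁻)

M₂₁V₁ + M₂₂V₂ = (M₂₁⁺ − M₂₁⁻)(V₁⁺ − V₁⁻) + (M₂₂⁺ − M₂₂⁻)(V₂⁺ − V₂⁻)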

In the present disclosure, when a drawing shows two electrical signal lines crossing each other, whether the two electrical signal lines are electrically coupled to each other is clear from the description. For example, the signal line carrying the M₂₁⁺V₁⁺ signal is not electrically coupled to the signal line carrying the M₁₁⁺V₁⁻ signal or to the signal line carrying the M₁₁⁻V₁⁻ signal.

The system configuration shown in fig. 21B can be extended to realize a system configuration that performs vector matrix multiplication using an m × n element matrix in which the input vector and the matrix include signed elements.
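The signed vector matrix multiplication built from unsigned products can be sketched for a general m × n matrix as follows. The sketch assumes that each signed value is represented by a pair of unsigned values that sum to the maximum value, and that products of like-signed components are routed to the non-inverting port while products of unlike-signed components are routed to the inverting port. The helper names and this particular routing are illustrative assumptions, not a description of the wiring in fig. 21B.

```python
def to_pair(x, x_max=1.0):
    """Unsigned (main, anti-symmetric) pair for a signed value, assuming the pair sums to x_max."""
    return (x_max + x) / 2.0, (x_max - x) / 2.0


def signed_vmm(matrix, vector, m_max=1.0, v_max=1.0):
    """Compute matrix @ vector using only unsigned products plus one subtraction per output."""
    outputs = []
    for row in matrix:
        non_inverting = 0.0  # sum of terms to be added
        inverting = 0.0      # sum of terms to be subtracted
        for m, v in zip(row, vector):
            m_pos, m_neg = to_pair(m, m_max)
            v_pos, v_neg = to_pair(v, v_max)
            # (m_pos - m_neg)(v_pos - v_neg) expands into four unsigned products.
            non_inverting += m_pos * v_pos + m_neg * v_neg
            inverting += m_pos * v_neg + m_neg * v_pos
        outputs.append(non_inverting - inverting)
    return outputs


if __name__ == "__main__":
    M = [[0.5, -0.25], [-1.0, 0.75]]
    V = [0.5, -0.5]
    print(signed_vmm(M, V))  # compare with the direct products M @ V
```

For a 2 × 2 matrix, the inner loop forms four unsigned products per matrix element, or sixteen in total, matching the number of multiplication modules 1904 in the system 2110.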

There are various techniques that may be used to implement the symmetrical differential configuration of fig. 21B. Some of these techniques use a 1 × 2 optical amplitude modulator to implement the multiplication modules 1904 and/or to provide the optical signal pairs that form main and anti-symmetric pairs. Fig. 22A shows an example of a 1 × 2 optical amplitude modulator 2200. In this example, the 1 × 2 optical amplitude modulator 2200 includes an input optical splitter 2202 that splits an input optical signal to provide 50% of the power to a first path that includes a phase modulator 2204 (also referred to as a phase shifter) and 50% of the power to a second path that does not include a phase modulator. The paths may be defined in different ways depending on whether the optical amplitude modulator is implemented as a free space interferometer or as a waveguide interferometer. For example, in a free space interferometer, one path is defined by the transmission of a wave through a beam splitter, and the other path is defined by the reflection of the wave from the beam splitter. In a waveguide interferometer, each path is defined by a different optical waveguide that has been coupled to an incoming waveguide (e.g., in a Y-splitter). The phase modulator 2204 may be configured to produce a phase shift such that the total phase delay of the first path differs from the total phase delay of the second path by a configurable phase shift value (e.g., a value that may be set to a phase shift somewhere between 0 degrees and 180 degrees).

The 1 × 2 optical amplitude modulator 2200 includes a 2 × 2 coupler 2206 that uses optical interference or optical coupling to combine the light waves from the first and second input paths so as to transfer power into the first and second output paths at different ratios, depending on the phase shift. For example, in a free space interferometer, a phase shift of 0 degrees results in constructive interference that causes substantially all of the input power split between the two paths to exit from one output path of the beam splitter implementing the coupler 2206, and a phase shift of 180 degrees results in constructive interference that causes substantially all of the input power split between the two paths to exit from the other output path of the beam splitter implementing the coupler 2206. In a waveguide interferometer, a phase shift of 0 degrees results in coupling substantially all of the input power split between the two paths to one output waveguide 2208a of the coupler 2206, and a phase shift of 180 degrees results in coupling substantially all of the input power split between the two paths to the other output waveguide 2208b of the coupler 2206. A phase shift between 0 and 180 degrees may then multiply the power in the light wave (and the value encoded on the light wave) by a value between 0 and 1 through partial constructive or destructive interference or partial waveguide coupling. The multiplication by any value between 0 and 1 may then be mapped to multiplication by any value between 0 and M_max as described above.

In addition, the relationship between the powers in the two light waves emitted from the modulator 2200 follows the relationship between the main and anti-symmetric pairs described above. When the amplitude of the optical power of one signal increases, the amplitude of the optical power of the other signal decreases, so the difference between the detected photocurrents can yield a signed vector element, or multiplication by a signed matrix element, as described herein. For example, the pair of related optical signals may be provided from the two output ports of the modulator 2200 such that the difference between the amplitudes of the related optical signals corresponds to the result of multiplying the input value by the signed matrix element value. Fig. 22B shows a symmetrical differential configuration 2210 of the 1 × 2 optical amplitude modulator 2200, in which the optical signals at the outputs are arranged to be detected in a common-terminal version of the symmetrical differential configuration of fig. 20B. The current signals corresponding to the photocurrents generated by the pair of photodetectors 2212 and 2214 are combined at node 2216 to provide an output current signal having an amplitude corresponding to the difference between the amplitudes of the related optical signals. In other embodiments, such as in the symmetrical differential configuration of fig. 20C, different circuits may be used to combine the photocurrents detected from the two output optical signals.
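The dependence of the two output powers on the phase shift can be sketched with the textbook model of an ideal, lossless interferometer with a 50/50 input splitter and a 50/50 output coupler. The cos²/sin² transfer function below is that standard idealization and is an assumption here rather than a formula given in this disclosure; which physical port carries the "main" signal is likewise a convention.

```python
import math


def modulator_outputs(input_power, phase_deg):
    """Output powers (port_a, port_b) of an ideal 1 x 2 interferometric amplitude modulator."""
    phi = math.radians(phase_deg)
    port_a = input_power * math.cos(phi / 2.0) ** 2  # all power here at 0 degrees
    port_b = input_power * math.sin(phi / 2.0) ** 2  # all power here at 180 degrees
    return port_a, port_b


def differential_weight(phase_deg):
    """Signed multiplier seen after subtracting the two photocurrents (equals cos(phi))."""
    port_a, port_b = modulator_outputs(1.0, phase_deg)
    return port_a - port_b


if __name__ == "__main__":
    for phase in (0, 60, 90, 120, 180):
        print(phase, modulator_outputs(1.0, phase), differential_weight(phase))
```

In this idealized model, sweeping the phase shift from 0 to 180 degrees moves the differential weight monotonically from +1 to −1, while the sum of the two output powers stays equal to the input power.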

Other techniques may be used to construct a 1 × 2 optical amplitude modulator for implementing the multiplication modules 1904 and/or to provide the optical signal pairs that form main and anti-symmetric pairs. Fig. 22C shows another example of a symmetrical differential configuration 2220 of another type of 1 × 2 optical amplitude modulator. In this example, the 1 × 2 optical amplitude modulator includes a ring resonator 2222 configured to divide the optical power of the optical signal at the input port 2221 between two output ports. The ring resonator 2222 (also referred to as a "microring") may be manufactured, for example, by forming a circular waveguide on a substrate, wherein the circular waveguide is coupled to a straight waveguide corresponding to the input port 2221. When the wavelength of the optical signal approaches the resonant wavelength associated with the ring resonator 2222, the light wave coupled into the ring circulates around the ring on the clockwise path 2226 and interferes destructively at the coupling location, such that a reduced-power light wave exits through path 2224 to the first output port. The circulating light wave is also coupled out of the ring such that another light wave exits on path 2228 through a curved waveguide that directs the light wave out of the second output port.

Since the time scale on which the optical power circulates around the ring resonator 2222 is small compared to the time scale of the amplitude modulation of the optical signal, an anti-symmetric power relationship is rapidly established between the two output ports, such that the light wave detected by the photodetector 2212 and the light wave detected by the photodetector 2214 form a main and anti-symmetric pair. The resonant wavelength of the ring resonator 2222 may be tuned to monotonically decrease/increase the main/anti-symmetric signals to achieve a signed result, as described above. When the ring is completely off resonance, all of the power leaves the first output port via path 2224, and when it is completely on resonance, all of the power leaves the second output port via path 2228, provided that certain other parameters (e.g., quality factor and coupling coefficients) are properly adjusted. In particular, in order to achieve complete power transfer, the coupling coefficients characterizing the coupling efficiency between the waveguides and the ring resonator should be matched. In some embodiments, it may be useful to have a relatively shallow tuning curve, which may be achieved by decreasing the quality factor of the ring resonator 2222 (e.g., by increasing the loss) and increasing the coupling coefficients into and out of the ring accordingly. A shallow tuning curve makes the amplitude less sensitive to the resonant wavelength. Techniques such as temperature control may also be used for tuning and/or stabilizing the resonant wavelength.
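The effect of the quality factor on the steepness of the tuning curve can be illustrated with a simple Lorentzian line-shape model. This is a common near-resonance approximation for a ring with matched couplings and is offered only as a sketch: it assumes no excess loss (so that the two ports always share all of the power), and the function names and numerical values are hypothetical.

```python
def second_port_fraction(wavelength_nm, resonance_nm, q_factor):
    """Fraction of power leaving the second output port (Lorentzian approximation)."""
    linewidth_nm = resonance_nm / q_factor  # full width at half maximum
    detuning_nm = wavelength_nm - resonance_nm
    return 1.0 / (1.0 + (2.0 * detuning_nm / linewidth_nm) ** 2)


def first_port_fraction(wavelength_nm, resonance_nm, q_factor):
    """Fraction of power leaving the first output port, assuming a lossless ring."""
    return 1.0 - second_port_fraction(wavelength_nm, resonance_nm, q_factor)


if __name__ == "__main__":
    signal_nm = 1550.05   # hypothetical signal wavelength
    resonance_nm = 1550.0 # hypothetical resonant wavelength
    for q in (1_000, 10_000):
        second = second_port_fraction(signal_nm, resonance_nm, q)
        first = first_port_fraction(signal_nm, resonance_nm, q)
        print(q, second, first, second - first)
```

For the same small detuning, the lower quality factor leaves the power split much closer to its on-resonance value, which is the reduced sensitivity associated with a shallow tuning curve.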

Fig. 22D shows another example of a symmetrical differential configuration 2230 of another type of 1 × 2 optical amplitude modulator. In this example, the 1 × 2 optical amplitude modulator includes two ring resonators 2232 and 2234. The optical power of the optical signal at the input port 2231 is divided between two output ports. When the wavelength of the optical signal approaches the resonant wavelength associated with the two ring resonators 2232 and 2234, a reduced-power light wave exits the first output port via path 2236. A portion of the light wave is also coupled into the ring resonator 2232, where it circulates around the ring on a clockwise path 2238, and is then coupled into the ring resonator 2234, where it circulates around the ring on a counter-clockwise path 2240. The circulating light wave is then coupled out of the ring such that another light wave exits the second output port via path 2242. In this example, the light wave detected by the photodetector 2212 and the light wave detected by the photodetector 2214 also form a main and anti-symmetric pair.

Figs. 23A and 23B illustrate different examples of the use of optical amplitude modulators, such as the 1 × 2 optical amplitude modulator 2200, for implementing a system 1800 that performs vector matrix multiplication using a 2 × 2 element matrix. Fig. 23A shows an example configuration of an optoelectronic system 2300A that includes optical amplitude modulators 2302A and 2302B that provide values representing the signed vector elements of an input vector. The optical amplitude modulator 2302A provides a pair of optical signals that encode a pair of values V₁⁺ and V₁⁻ for a first signed vector element, and the optical amplitude modulator 2302B provides a pair of optical signals that encode a pair of values V₂⁺ and V₂⁻ for a second signed vector element. A vector matrix multiplier (VMM) subsystem 2310A receives the input optical signals, performs splitting operations, multiplication operations, and some of the summation operations as described above, and provides output current signals to be processed by additional circuitry. In some examples, the output current signals represent partial sums that are further processed to produce the final sums that yield the signed vector elements of the output vector. In this example, some of the final summation operations are performed as a subtraction between the different partial sums represented by the current signals at the inverting and non-inverting terminals of the transimpedance amplifiers 2306A and 2306B. The subtraction is used to provide signed values as described above (e.g., with reference to fig. 21B). This example also illustrates how certain elements can be part of multiple modules. Specifically, the optical replication performed by the waveguide splitter 2303 may be considered part of a replication module (e.g., one of the replication modules 2112 in fig. 21B) and also part of a multiplication module (e.g., one of the multiplication modules 1904 in fig. 21B). The optical amplitude modulators used within the VMM subsystem 2310A are configured for detection in the common-terminal configuration shown in fig. 20B.

Fig. 23B shows an example configuration of an optoelectronic system 2300B that is similar to the configuration of the optoelectronic system 2300A shown in fig. 23A. The VMM subsystem 2310B includes optical amplitude modulators configured for detection in the differential terminal configuration shown in fig. 20C. In this example, the output current signals of the VMM subsystem 2310B also represent partial sums that are further processed to produce the final sums that yield the signed vector elements of the output vector. Some of the final summation operations, which are performed as a subtraction between the different partial sums represented by the current signals at the inverting and non-inverting terminals of the transimpedance amplifiers 2306A and 2306B, differ from those in the example of fig. 23A. However, the final subtraction still provides a signed value as described above (e.g., with reference to fig. 21B).

Fig. 23C shows an example configuration of an optoelectronic system 2300C that uses an alternative arrangement of a VMM subsystem 2310C with detection in the common-terminal configuration (as in the VMM subsystem 2310A shown in fig. 23A), but in which the optical signals carrying the results of the multiplication modules are routed through the subsystem within waveguides (e.g., within a semiconductor substrate) to a portion of the substrate that includes the detectors arranged to convert the optical signals into electrical signals. In fig. 23C, optical waveguides 2304a, 2304b, 2304c, 2304d, 2304e, 2304f, 2304g, and 2304h (collectively 2304) are shown as bold dashed lines. In some embodiments, this grouping of the detectors allows the electrical paths to be shortened, potentially reducing electrical crosstalk or other impairments caused by the long electrical paths that would otherwise be used. In some embodiments, when waveguides along one dimension cross waveguides along the perpendicular dimension in the arrangement shown in fig. 23C, the optical waveguides may be routed within a single layer of the substrate with relatively low loss (e.g., on the order of about 0.03 dB per crossing). For example, the waveguide 2308a crosses the waveguides 2304a, 2304b, 2304c, 2304d, 2304e, 2304f, and 2304g, and the cumulative loss due to these seven crossings may be about 0.21 dB. However, as the number of rows and/or columns in a vector matrix multiplier system increases, the accumulated loss can become significant.
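The accumulation of crossing loss can be checked with a few lines of arithmetic. The sketch assumes that every crossing contributes the same loss in decibels, using the 0.03 dB figure quoted above, and that the losses simply add; the crossing counts other than seven are hypothetical.

```python
def crossing_loss_db(n_crossings, loss_per_crossing_db=0.03):
    """Cumulative loss in dB, assuming identical and independent crossings."""
    return n_crossings * loss_per_crossing_db


def transmitted_fraction(loss_db):
    """Fraction of optical power remaining after a given loss in dB."""
    return 10.0 ** (-loss_db / 10.0)


if __name__ == "__main__":
    for n in (7, 63, 255):
        loss = crossing_loss_db(n)
        print(n, round(loss, 2), round(transmitted_fraction(loss), 3))
```

Seven crossings cost about 0.21 dB, which leaves roughly 95% of the power, whereas a few hundred crossings at the same per-crossing loss would remove most of the power.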

In some implementations, to prevent waveguide crossover (and associated loss) that may be encountered within a single layer, waveguides may also be routed within multiple layers of the substrate to allow greater flexibility in routing paths that cross in two dimensions of the substrate but not in a third dimension (of depth in the substrate), i.e., the waveguides are separated in the depth direction. Such a multilayer optical network has much lower losses associated with waveguide crossings, which contributes to the greater scalability of certain arrangements. However, multilayer optical networks still suffer from some loss between waveguides in the intersecting second layer and waveguides in the first layer, depending on how close the layers are to each other in the depth dimension of the substrate.

In some implementations, the optical signals may be processed by optoelectronic components in different layers of the photonic integrated circuit, thus requiring the optical signals to be transferred between waveguides in different layers having different depths within the photonic integrated circuit. For example, the depth of a layer within a photonic integrated circuit may be the distance between the layer and the surface of the photonic integrated circuit. The different layers may also have different heights, wherein the height of a layer may be defined as the distance between the layer and the substrate of the photonic integrated circuit. For example, an optical signal may be processed (e.g., modulated) by a first optoelectronic component disposed in a first layer and subsequently processed (e.g., modulated or detected) by a second optoelectronic component disposed in a second layer, such that the optical signal needs to be transferred from the first layer to the second layer. There are locations in a multilayer optical network where light waves are transferred from one of the layers (e.g., a lower layer (LL)) to another layer (e.g., an upper layer (UL)). For example, such a transfer may occur by using a third layer (e.g., a middle layer (ML)) between the waveguide layers, which has a short waveguide segment that is parallel to both the LL waveguide and the UL waveguide and is long enough to allow the light wave to be transferred from the LL waveguide to the ML segment and then from the ML segment to the UL waveguide. Similar techniques may cause light waves to be transferred from the UL waveguide to the ML segment and then from the ML segment to the LL waveguide. For example, techniques that may be used are described in Sacher et al., "Tri-layer silicon nitride-on-silicon photonic platform for ultra-low-loss crossings and interlayer transitions," Optics Express, vol. 25, 2017, which is incorporated herein by reference.

In general, there is a need to trade off between bringing layers closer to each other in order to make the specific interlayer coupling between UL and LL layers (through the ML segments) stronger or bringing layers farther apart from each other in order to make the unwanted coupling (i.e., crosstalk) lower when UL and LL waveguides cross each other above/below in the depth dimension perpendicular to the two-dimensional plane on which the waveguides are routed. In some embodiments, a method of promoting strong interlayer coupling and low crosstalk at intersections is to include multiple intermediate layers with overlapping but staggered coupling segments between UL and LL waveguides.

Fig. 23D shows an example in which there are three intermediate layers that provide coupling segments S1, S2, and S3 between the upper layer waveguide 2321 and the lower layer waveguide 2322. The length (SL) of each segment and the distance (SD) between adjacent segments are selected such that the distance (LD) between the waveguide layers is large enough to limit crosstalk while still providing effective coupling over a relatively short coupling length (CL). While a single intermediate layer may be sufficient for a relatively small vector matrix multiplier system, multiple intermediate layers supporting a set of coupling segments arranged in the "staircase" configuration of fig. 23D are useful if the system scales to a large number of waveguide crossings. Moreover, the fabrication of the segments in separate planar layers is compatible with standard CMOS fabrication processes. For clarity of illustration, the example shown in fig. 23D is not to scale. In some embodiments, the dimensions used allow the distance LD between the waveguide layers to be about 2-3 microns or greater, with a waveguide thickness of a few hundred nanometers. When the distance SD between the segments is greater, the length SL of each segment and/or the overlap length (OL) between the segments may need to be longer to transfer the power fully. In general, various trade-offs can be made in the design to keep the coupling length CL relatively short while still providing a sufficient distance LD between the waveguide layers. Moreover, other embodiments may use more or fewer than three intermediate layers.

Fig. 23E shows a cross-sectional view of an example of a photonic integrated circuit 2338 that includes three intermediate layers to enable the transfer of optical signals between waveguides disposed in different layers. In some embodiments, the photonic integrated circuit 2338 includes a substrate 2324, a cladding layer 2326, a lower layer waveguide 2322, a buffer layer or cladding layer 2328, a first coupling segment S1, a buffer layer or cladding layer 2330, a second coupling segment S2, a buffer layer or cladding layer 2332, a third coupling segment S3, a buffer layer or cladding layer 2334, an upper layer waveguide 2321, and a cladding layer 2336. The figure shows a cross-section along a plane perpendicular to the substrate 2324 and parallel to the length direction of the waveguides 2321 and 2322. In some embodiments, the substrate 2324 may be made of silicon, the lower layer waveguide 2322 may be made of silicon, the upper layer waveguide 2321 may be made of silicon nitride (SiN), the coupling segments S1, S2, and S3 may be made of silicon nitride, and the buffer or cladding layers 2326, 2328, 2330, 2332, 2334, and 2336 may be made of silicon oxide (SiO₂). Other materials or other combinations of materials may be used for the substrate, waveguides, coupling segments, and buffer or cladding layers. For example, each waveguide and coupling segment may be made of silicon, amorphous silicon, or silicon nitride. Although fig. 23E shows boundaries (e.g., 2352) between different buffer or cladding layers, there may be no measurable boundary between them, i.e., the buffer or cladding layers may form a continuous layer.

The lower portion of fig. 23E shows a top view of the waveguides 2322 and 2321 and the coupling segments S1, S2, and S3 (shown in phantom). In this example, the ends of the waveguides 2321 and 2322 and of the coupling segments S1, S2, and S3 are tapered. The geometry and dimensions of the waveguides and coupling segments are configured to maximize the transfer of power in the light wave from one waveguide (e.g., 2322) to the other waveguide (e.g., 2321).

In some examples, photonic integrated circuit 2338 may be fabricated by the steps of:

forming a cladding layer 2326 on the substrate 2324;

forming a lower layer waveguide 2322 on the cladding layer 2326;

forming a buffer layer or cladding layer 2328 on the lower layer waveguide 2322;

forming a first coupling segment S1 on the buffer layer or cladding layer 2328;

forming a buffer layer or cladding layer 2330 on the first coupling segment S1;

forming a second coupling segment S2 on the buffer layer or cladding layer 2330;

forming a buffer layer or cladding layer 2332 on the second coupling segment S2;

forming a third coupling segment S3 on the buffer layer or cladding layer 2332;

forming a buffer layer or cladding layer 2334 on the third coupling segment S3;

forming the upper layer waveguide 2321 on the buffer layer or cladding layer 2334; and

forming a cladding layer 2336 on the upper layer waveguide 2321.

Fig. 23F is a cross-sectional view of the photonic integrated circuit 2338 of fig. 23E. The figure shows a cross-section along a plane perpendicular to the substrate 2324 and perpendicular to the length direction of the waveguides 2321 and 2322. For example, each of the waveguides 2321 and 2322 and the coupling segments S1, S2, and S3 includes a core material (e.g., silicon or silicon nitride (SiN)) within a cladding material (e.g., silicon oxide (SiO₂)).

There are many ways to configure the dimensions of the waveguides and coupling segments. In some embodiments, the upper layer waveguide 2321 and the lower layer waveguide 2322 are made of the same material and have the same thickness T and width W. Each of the coupling segments S1, S2, and S3 may be made of the same material and may have the same thickness T and width W as the waveguides. In some examples, the upper and lower layer waveguides may be made of different materials and have different thicknesses and widths. Each of the coupling segments S1, S2, and S3 may be made of the same material and have the same thickness and width as one of the waveguides. In some examples, the thicknesses and widths of the waveguides and the coupling segments may all differ from one another. The dimensions of the waveguides depend on the wavelength(s) of the optical signals and on the material (and refractive index) used for the waveguides. Simulation software may be used to determine the dimensions and relative positions of the waveguides and coupling segments that produce maximum power transfer from the lower layer waveguide to the upper layer waveguide and vice versa. Such software includes, for example, software offered by MathWorks, Inc. of Natick, Massachusetts, or COMSOL software provided by COMSOL Inc. of Burlington, Massachusetts.

Fig. 23G is a diagram of a perspective view of photonic integrated circuit 2338. In this figure, buffer or cladding layers 2328, 2330, 2332, 2334 and 2336 are omitted. The figure is not drawn to scale and the length of each coupling segment S1, S2 and S3 may be several times greater than the width of the coupling segment.

Fig. 23H is a diagram showing an example in which an optical wave propagates in the direction indicated by arrow 2340 in lower waveguide 2322, passes through coupling segments S1, S2, and S3 in the direction indicated by arrow 2342, and then propagates in the direction indicated by arrow 2344 in upper waveguide 2321.

Fig. 23I is a diagram showing an example in which an optical wave propagates in the direction indicated by arrow 2346 in the upper waveguide 2321, passes through the coupling segments S3, S2, and S1 in the direction indicated by arrow 2348, and then propagates in the direction indicated by arrow 2350 in the lower waveguide 2322.

Fig. 23J is a diagram illustrating an example in which the photonic integrated circuit 2338 includes waveguides 2321 and 2322 extending in a first direction (e.g., the x-direction) and other waveguides 2352a, 2352b, and 2352c extending in a second direction (e.g., the y-direction). In this example, the y-direction is perpendicular to the plane of the figure. The vertical distance between the waveguide 2352a (or 2352b, 2352c) and the waveguide 2322 (i.e., the distance in the z-direction perpendicular to the plane of the substrate 2324) is LD. By using three coupling segments S1, S2, and S3, the distance LD may be larger than the distance achievable using only one coupling segment. As a result, there may be less interference between the signals traveling in the waveguide 2322 and the signals traveling in the waveguides 2352a, 2352b, and 2352c.

Figs. 23D to 23J show a set of stepped coupling segments S1, S2, and S3 for coupling two optical waveguides in different layers. In some implementations, photonic integrated circuit 2338 can have multiple sets of stepped coupling segments for coupling multiple pairs of waveguides in different layers. For example, photonic integrated circuit 2338 may have a first layer at a first depth (i.e., in a direction perpendicular to the surface of substrate 2324) that includes cladding material in layers 2326 and 2328 and a waveguide (e.g., 2322) formed from core material within the cladding material. Photonic integrated circuit 2338 can have a second layer at a second depth that includes cladding material in layer 2336 and cladding material in an upper portion of layer 2334, and a waveguide (e.g., 2321) formed from core material within the cladding material. Photonic integrated circuit 2338 may have a third layer at a third depth (between the first depth and the second depth) that includes cladding material in layer 2332 and cladding material in an upper portion of layer 2330, and coupling structures (e.g., S2) formed within the cladding material. Photonic integrated circuit 2338 may have a fourth layer at a fourth depth (between the first depth and the third depth) that includes cladding material in layer 2332 and cladding material in an upper portion of layer 2330, and coupling structures (e.g., S1) formed within the cladding material. Photonic integrated circuit 2338 may have a fifth layer at a fifth depth (between the second depth and the third depth) that includes cladding material in layer 2334 and cladding material in an upper portion of layer 2332, and coupling structures (e.g., S3) formed within the cladding material.

In some embodiments, the photonic integrated circuit may have waveguides arranged in three or more layers. Fig. 23K is a diagram showing an example in which a photonic integrated circuit 2356 includes waveguides 2321, 2322, and 2354 extending along the x-direction. Within photonic integrated circuit 2356, waveguide 2321 is at a first depth, waveguide 2354 is at a second depth, and waveguide 2322 is at a third depth. In this example, photonic integrated circuit 2356 includes waveguides 2352a, 2352d, 2352e, and 2352f extending in the y-direction. Within photonic integrated circuit 2356, waveguides 2352a and 2352f are at a first depth, waveguide 2352d is at a second depth, and waveguide 2352e is at a third depth. In this example, seven coupling segments S1, S2, S3, S4, S5, S6, and S7 facilitate transfer of optical signals between waveguides 2321 and 2322. The three coupling segments S7, S8, and S9 facilitate transfer of optical signals between the waveguides 2321 and 2354.

The use of multiple coupling segments provides greater flexibility in the design of photonic integrated circuits. More layers may be stacked vertically together, allowing more photonic components to be packaged together within a photonic integrated circuit. As shown in the example of fig. 23K, the optical signal may travel in a first waveguide at a first depth, transition to a second waveguide at a second depth through a first set of coupling segments, and transition to a third waveguide at a third depth through a second set of coupling segments. The third depth may be between the first depth and the second depth (as shown in the example of fig. 23K). The third depth may also be outside a depth range between the first depth and the second depth.

Various other modifications may be made in the system configuration, including modifications to components included in the vector matrix multiplier subsystem. For example, the optical amplitude modulators 2302A and 2302B may be included as part of a vector matrix multiplier subsystem. Alternatively, the vector matrix multiplier subsystem may include an optical input port for receiving pairs of primary and anti-symmetric optical signals produced by modules other than the optical amplitude modulator, or for interfacing with other types of subsystems. In some embodiments, in addition to grouping detectors and using multiple layers for waveguides on a substrate, an alternative way to avoid waveguide crossover losses and still limit electrical path length involves rearranging the layout of the waveguides and other elements on the photonic integrated circuit die. For example, some manufacturing processes may introduce additional cost and/or complexity to provide multiple waveguide layers on a substrate. Instead, the optical routing may include an optical replication distribution network that facilitates shortening the electrical paths in some compact die layouts, as described below with reference to figs. 45A-45G. In some embodiments, an optical replication distribution network includes an optical splitter and a waveguide that transmits an optical signal from the splitter to an optoelectronic node that processes the optical signal.

In some embodiments, the systems described above (e.g., the systems shown in fig. 1A, 1F, 3A, 3B, 4A, 5, 7, 9, 18, 19A, 19B, 21B, 23A-23C, fig. 24A-24E, 26-32A, 35C, and 36-38) may be implemented using two or more semiconductor dies, wherein a first semiconductor die comprises a photonic integrated circuit and a second semiconductor die comprises an electronic integrated circuit. Photonic integrated circuits include, for example, light sources, optical waveguides, optical modulators, photodetectors, and electrically conductive wires or paths. An electronic integrated circuit includes, for example, a memory cell, a controller, a digital-to-analog converter, an analog-to-digital converter, and conductive lines or paths. The conductive path may be made of a metal such as copper or a metal alloy. Conductive paths on the photonic integrated circuit may receive electrical signals from and transmit electrical signals to the electronic integrated circuit. Similarly, conductive paths on an electronic integrated circuit may receive electrical signals from and transmit electrical signals to a photonic integrated circuit. Conductive paths on photonic and electronic integrated circuits may be formed by patterning one or more conductive layers in the integrated circuit using a photolithographic process.

In some embodiments, the external surface of the photonic integrated circuit includes electrical contact pads that are electrically coupled to corresponding contact pads on the external surface of the electronic integrated circuit. For example, the photonic and electronic integrated circuits may be coupled together in a controlled collapse chip connection or flip chip arrangement.

Referring to fig. 46, in some embodiments, the artificial neural network computing system 4600 includes a first semiconductor die having a photonic integrated circuit 4602 and a second semiconductor die having an electronic integrated circuit 4604. The photonic integrated circuit 4602 includes a substrate and one or more layers 4606 formed on the substrate, wherein the one or more layers 4606 include components for processing optical signals, such as light sources, optical waveguides, optical modulators, photodetectors, and conductive pathways. A first set of conductive contact pads 4610 is formed on a surface 4608 of photonic integrated circuit 4602. The electronic integrated circuit 4604 includes a substrate 4612 and one or more layers 4614 formed on the substrate 4612, wherein the one or more layers 4614 include components for processing electrical signals, such as memory cells, controllers, digital-to-analog converters, analog-to-digital converters, and conductive paths. A second set of conductive contact pads 4616 is formed on a surface 4618 of the electronic integrated circuit 4604. Solder balls 4620 are provided to electrically and mechanically couple the first set of contact pads 4610 and the second set of contact pads 4616. An insulating adhesive (not shown) may be applied to the remaining space between photonic integrated circuit 4602 and electronic integrated circuit 4604 to provide a strong bond between the first semiconductor die and the second semiconductor die.

Referring to fig. 47, in some embodiments, the artificial neural network computing system 4700 includes a first semiconductor die having a photonic integrated circuit 4602 and a second semiconductor die having an electronic integrated circuit 4702. The first semiconductor die and the second semiconductor die are combined in a "stacked chip" configuration. The photonic integrated circuit 4602 includes a substrate, one or more layers 4606 including components for processing optical signals, and a first set of conductive contact pads 4610. The electronic integrated circuit 4702 includes a substrate 4704 and one or more layers 4706 formed on the substrate 4704, wherein the one or more layers 4706 include components for processing electrical signals, such as memory cells, controllers, digital-to-analog converters, analog-to-digital converters, and conductive paths. A second set of conductive contact pads 4710 is formed on a surface 4708 of the electronic integrated circuit 4702. In this example, the surface 4708 is the back of the substrate 4704. Conductive vias 4712 pass through the substrate 4704 to electrically couple the contact pads 4710 to components in the one or more layers 4706. Each contact pad 4710 may be electrically coupled to one or more conductive vias 4712. Solder balls 4620 are provided to electrically and mechanically couple the first set of contact pads 4610 and the second set of contact pads 4710. An insulating adhesive (not shown) may be applied to the remaining space between photonic integrated circuit 4602 and electronic integrated circuit 4702 to provide a strong bond between the first semiconductor die and the second semiconductor die. Advantages of using a controlled collapse chip connection (as shown in fig. 46) or a stacked chip connection (as shown in fig. 47) include reduced length of conductive paths and a reduced amount of crossover between signal lines (optical or electrical).

A long wire between a given photodetector and a downstream port has an associated parasitic capacitance, which increases the power consumed in driving a signal along the wire. To limit power loss in the system, the layout of elements on a die containing a photonic integrated circuit (PIC) implementing an optical processor may be optimized to allow for compact electrical routing. For example, portions of a photonic integrated circuit implementing distributed optoelectronic processing (e.g., vector matrix multiplier subsystem 2310A (fig. 23A) or vector matrix multiplier subsystem 2310B (fig. 23B)) may be arranged in a relatively narrow "optical flat cable" that includes optical waveguides carrying the optical signals of the optical inputs (e.g., from optical modulators that provide elements of the input vector), optoelectronic nodes (e.g., including one or more MZI modulators and detectors), and wires carrying the electrical signals of the electrical outputs (e.g., feeding transimpedance amplifiers that provide elements of the output vector).

Fig. 45H shows an example of an optical flat cable 4590 that includes an optical waveguide 4592, an MZI modulator 4594, a detector 4596, and a lead 4598.

In some embodiments, the transimpedance amplifiers (e.g., transimpedance amplifiers 2306A and 2306B) are part of an Electronic Integrated Circuit (EIC) flip-chip connected to a Photonic Integrated Circuit (PIC). The optical flat cable contains a plurality of "strands" that contain portions of the optical replication distribution network, and optoelectronic "nodes" corresponding to specific columns of the matrix multiplication. The nodes of the strands form "tiles" that contain elements corresponding to particular rows of the matrix multiplication. The tiles in these photonic integrated circuits also overlap with corresponding tiles in the electronic integrated circuits, as described in more detail below.

Fig. 45A shows an example of one strand 4500 in such an optical flat cable. The strand 4500 includes a binary tree optical network that optically distributes a corresponding input vector element, with 1:2 splitters 4502 as the intermediate nodes of the binary tree arrangement and, as the leaf nodes, the output ports of waveguides that transmit optical waves from the outputs of the splitters 4502 to optoelectronic nodes 4504 for performing one or more optoelectronic operations. Alternatively, the strand may comprise two binary trees that distribute the respective principal and antisymmetric values of the element, but a single binary tree is sufficient for some systems, for example, systems in which the matrix is limited to containing only positive weights for a particular software algorithm. In addition, the photonic integrated circuit may include wires (not shown) that extend from the nodes 4504 and contact the wires of other strands. The splitters 4502 and the waveguides that transmit the optical signals from the splitters 4502 to the optoelectronic nodes 4504 form an optical replication distribution network. Each output port of the optical replication distribution network is at the end of a waveguide coupled to an optoelectronic node 4504. The output ports form the leaf nodes of the optical replication distribution network. An optoelectronic node 4504 is the part of the optoelectronic circuit that receives an optical wave from an output port of the optical replication distribution network. The root of each subnetwork of the optical replication distribution network may be fed by a root modulator (not shown) that modulates the optical wave according to an element of the input vector (e.g., an MZI modulator like 2302A or 2302B).
In some embodiments, an optoelectronic node 4504 is connected to each leaf of the optical replication distribution network, wherein the optoelectronic node includes a MZI modulator 4505 that multiplies by a matrix element, and a pair of photodetectors 4507 located at the output of MZI modulator 4505 for photoelectric conversion. The length of the wires used to electrically route these electrical signals depends in part on the width of the overall optical flat. For an N x N array of elements (e.g., N x N matrix multiplication), there are N sets of strands in the flat cable, each set having its own optical replication distribution network. Because the length of the longest wire may need to traverse distances up to N strands, each subnetwork (i.e., each binary tree) of the optical replication distribution network needs to occupy a narrow width. For simplicity and clarity of illustration, an example of elements of a 4 x 4 array is depicted, but in some implementations the value of N may be significantly increased (e.g., 32, 64, 128, or greater).

As described above, an optical replication distribution network that is tolerant of errors and independent of wavelength can be fabricated using a binary tree topology, in which a strand distributes a given value to the nodes connected to the output ports of the strand. As motivation for the asymmetric arrangement of the binary tree in strand 4500, consider the possible size of a symmetric binary tree for an N×N matrix multiplication. Because a tree of N elements has a width (N) greater than its depth (log2(N)), the tree can be arranged such that its narrowest dimension is along its depth. The last layer of the binary tree, at the leaves, needs to fit a symmetrical distribution of nodes across the width of the tree, so in some examples the waveguides in the tree may have 90-degree turns to spread out to a sufficiently large width. Based on the minimum radius of curvature that the waveguides can support (to limit bending losses), there is a limit to how narrow this depth dimension can be, resulting in a minimum width (e.g., about 40 microns) for each layer of the tree. Thus, in this example, the total width is proportional to log2(N) times 40 microns. Instead, consider the asymmetric arrangement of the binary tree in strand 4500. In such an asymmetric arrangement, the optical transmission lengths between the root of the binary tree arrangement and the different optoelectronic nodes are different. In other asymmetric arrangements, some (but not necessarily all) of the lengths are different. In some asymmetric arrangements with a binary tree topology, the root may not be at the end of a strand, but may be somewhere between the ends corresponding to the leaf nodes. The asymmetry helps to form a narrow strand. The width of a 1:2 Y-splitter can be limited to around 1 micron per arm (i.e., about 2 microns total) without requiring a change in orientation, that is, without requiring a 90-degree turn, which would take around 10 microns.
The widest portion of the strand is at the top node, where the width is the width of the rectangular node plus log2(N) times the width of an adjacent waveguide. The width of each node is large enough to accommodate the two arms of the MZI modulator (i.e., 20 microns or less). The width of each adjacent waveguide is about 2.5 microns (the waveguide itself plus its spacing from its neighbors). Thus, the total width of the strand is about 20 microns plus log2(N) times 2.5 microns, potentially much narrower than in the case of a symmetrical binary tree.

For example, using the example dimensions of the MZI modulator and waveguide described above, the total width of the symmetric binary tree for an 8×8 matrix multiplication is about log2(8) × 40 microns = 120 microns. In contrast, a single asymmetric binary tree for an 8×8 matrix multiplication has a width of 20 microns + log2(8) × 2.5 microns = 27.5 microns. As another example, the total width of a symmetric binary tree for a 16×16 matrix multiplication is about log2(16) × 40 microns = 160 microns. In contrast, a single asymmetric binary tree for a 16×16 matrix multiplication has a width of 20 microns + log2(16) × 2.5 microns = 30 microns.
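The width comparison above can be reproduced with a short calculation. The following Python sketch is illustrative only: the function names and default parameter values are assumptions taken from the example dimensions in the text (about 40 microns per tree layer, a node width of 20 microns, and about 2.5 microns per adjacent waveguide), and it is not a layout tool.

```python
import math


def symmetric_tree_width_um(n, layer_width_um=40.0):
    """Approximate width of a symmetric binary tree for an n x n multiplication.

    Assumes each of the log2(n) tree layers needs about layer_width_um microns.
    """
    return math.log2(n) * layer_width_um


def asymmetric_strand_width_um(n, node_width_um=20.0, waveguide_width_um=2.5):
    """Approximate width of one asymmetric strand at its widest point.

    Assumes the node width plus log2(n) adjacent waveguides, each taking
    about waveguide_width_um microns including spacing.
    """
    return node_width_um + math.log2(n) * waveguide_width_um


for n in (8, 16):
    print(n, symmetric_tree_width_um(n), asymmetric_strand_width_um(n))
```

For N = 8 and N = 16 the sketch gives the same values as the worked examples in the preceding paragraph.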

The strands may be arranged in a straight line or have one or more bends to reduce their overall length. As described below, when the number of bends is small, the width of the asymmetric binary tree arrangement will still be significantly smaller than the width of the symmetric binary tree arrangement.

Fig. 45B shows an example of how a flat cable 4510 may be arranged on a photonic integrated circuit die. The flat cable 4510 comprises a first line 4512A of tiles 4514 disposed on one side of the die and a second line 4512B of tiles 4514 disposed on the other side of the die. A connecting portion 4515 is provided by extending one or more waveguides in each strand. With the tiles distributed over two or more substantially straight lines, the different parts of the die area (in this case, the different ends of the die area) are connected within each strand by the waveguides of the optical replication distribution network, thereby achieving a more compact arrangement. Extending the waveguides in this way does increase the total optical insertion loss (e.g., by about 1 dB/cm of extra waveguide length), but such extra loss can generally be tolerated. The number of lines of tiles that the extended waveguides connect (e.g., 2 lines, 3 lines, or 4 or more lines) can be selected to jointly optimize the fit to the die area and the total power loss in the overall system. For a large number of tiles, the substantially straight lines of tiles may be arranged in uniformly spaced columns. Furthermore, the amount of waveguide extension may be limited by computational constraints, e.g., the propagation time over the length of the strand should be significantly less than the time of a clock cycle, resulting in a limit on the total length of the strand (e.g., less than 10 cm).
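The two constraints on waveguide extension mentioned above (added insertion loss and propagation time) can be estimated with a rough calculation. The Python sketch below is an illustration under stated assumptions: the 1 dB/cm loss figure comes from the example in the text, while the group index of 4.0 is a hypothetical value chosen for illustration and is not given in this description.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def extra_insertion_loss_db(extra_length_cm, loss_db_per_cm=1.0):
    """Added optical loss from extending a waveguide by extra_length_cm."""
    return extra_length_cm * loss_db_per_cm


def remaining_power_fraction(loss_db):
    """Fraction of optical power remaining after a loss given in dB."""
    return 10.0 ** (-loss_db / 10.0)


def propagation_time_ps(length_cm, group_index=4.0):
    """Time for light to traverse a strand, in picoseconds.

    group_index is an assumed, illustrative value.
    """
    length_m = length_cm * 1e-2
    return length_m * group_index / SPEED_OF_LIGHT_M_PER_S * 1e12


loss = extra_insertion_loss_db(2.0)    # 2 cm of extra waveguide
print(loss, remaining_power_fraction(loss))
print(propagation_time_ps(10.0))       # strand at the 10 cm example limit
```

Under these assumptions, a designer could compare the computed propagation time against the clock period of the system to check that it remains significantly shorter.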

Fig. 45C shows the arrangement of the flat cable 4510 (tile boundaries not shown) superimposed on an arrangement of bumps 4516 for electrically coupling pads (e.g., composed of a conductive material such as a metal or alloy) on the photonic integrated circuit that provide electrical input and output ports with corresponding pads on an electronic integrated circuit that provide electrical output and input ports. For example, the bumps 4516 may be the solder balls 4620 in figs. 46 and 47. For example, signals are provided through output ports of the electronic integrated circuit to control the MZI modulators (e.g., two bumps per Mach-Zehnder interferometer in a given optoelectronic node). In some embodiments, each optoelectronic node has one or more additional bumps (e.g., bumps for temperature control of a given Mach-Zehnder interferometer modulator), and there are additional bumps for various other electrical signal exchanges between the photonic integrated circuit and the electronic integrated circuit. To transmit electrical signals from the electronic integrated circuit to the photonic integrated circuit for control, and to receive electrical signals from the photonic integrated circuit at the electronic integrated circuit, pads in the photonic integrated circuit are aligned with corresponding pads in the electronic integrated circuit at the bump locations. One example of a bump connecting a photonic integrated circuit output port to an electronic integrated circuit input port is a bump connecting a pad in a tile, which provides the summed current from the wires of a plurality of optoelectronic nodes, to a pad at the input of a transimpedance amplifier (not shown) in the electronic integrated circuit. A typical bump diameter may be about 100 microns, although the bumps may be smaller (e.g., 50 microns).
Thus, in some embodiments, the pitch of the bumps (e.g., 100 microns) may be greater than the pitch required for the tiles in the strand, in which case the tiles may be spread apart to provide a substantially uniform tile spacing.

Fig. 45D shows another example of a flat cable 4520, depicting an example of a tile 4522 that includes a root modulator 4524 used to modulate a data value onto an optical wave fed to one of the strands of the optical replication distribution network. Across the strands (including the strand fed by the root modulator 4524) there is also an array of optoelectronic nodes 4526 (4 nodes in this example). In each node 4526 there is a set 4528 of bumps for conveying phase modulation values (e.g., modulation weights for the matrix multiplication) from the electronic integrated circuit to the arms of the MZI modulator in the photonic integrated circuit. For example, the set of bumps 4528 may comprise the solder balls 4620 shown in figs. 46 and 47. The tile 4522 also includes wires ending in pads, the pads being connected via bumps 4530 to pads at the input of a transimpedance amplifier 4532 in the electronic integrated circuit. It is the length of these wires in the dimension across the strands that should be kept relatively short, as this dimension scales with N, which may be relatively large in some implementations.

In fig. 45D, the bumps 4528 and 4530 and the transimpedance amplifier 4532 are shown superimposed on the tile 4522, but are not part of the tile 4522. Because the root modulator 4524 of tile 4522 is placed at a different location on the die than the nodes of the optical replication distribution network, the waveguide portion connected to the modulator 4524 contains an optical delay portion of the waveguide (or another form of optical delay), so that the total effective optical distance and corresponding time delay match those of the root modulators of the other tiles. Thus, in this example, waveguide portion 4534 is longer than waveguide portion 4536.

Fig. 45E shows another optical flat cable 4540, for a different optoelectronic computing system in which more of the computation is performed by the electronic integrated circuit rather than the photonic integrated circuit. In this example, for a 4 x 4 matrix multiplication, there is still a similar configuration of four tiles 4542, 4544, 4546, and 4548 in the photonic integrated circuit. However, the optical waves carrying the modulated data values are detected, and the resulting signals are coupled to the electronic integrated circuit via bumps connected to TIAs in the electronic integrated circuit. Multiplication and addition as part of the vector matrix multiplier operation are then performed electronically by digital circuitry in the electronic integrated circuit using the digital values. For this calculation, with synchronization performed in the digital domain, time differences caused by different waveguide lengths can be compensated for, and thus no optical delay is required. Alternatively, another optoelectronic computing system may include Mach-Zehnder interferometer modulators for performing the weight multiplication, and the results of the optoelectronic multiplication may be detected and coupled to the electronic integrated circuit so that the summation is performed electronically using digital values.

Fig. 45F shows another example of an optical flat cable 4550 and the types of optoelectronic processing that may occur within a tile 4552 that performs various types of data processing in a photonic integrated circuit. Typically, photodiodes are used to convert the optical signals encoded on the different strands distributed over the flat cable into electrical signals. These electrical signals are fed to data processing circuitry 4560 in the photonic integrated circuit. The photonic integrated circuit also includes a data upload circuit 4570 for uploading the results of the operations to a flip-chip connected electronic integrated circuit or any other form of integrated electronic circuit.

Fig. 45G shows a view of an optoelectronic computing system 4580, illustrating an exemplary arrangement of various functions within the system, including weight values (w#, #) for multiplication of matrix elements, photodiodes (PD) for optical or electrical summation, and an ADC module for converting analog electrical signals to digital electrical signals. Different parts of the functionality may be included in a photonic or electronic integrated circuit in the system 4580.

In some arrangements, the matrix multiplication may have different numbers of rows and columns. For example, for an M × N matrix multiplier, there are M electrical tiles in the electronic integrated circuit (one per row) and M tiles in the photonic integrated circuit, where each tile has N weight modulators, each corresponding to one of the N strands of the optical flat cable. As described above, to better fit onto a die, rather than one long row of M tiles, there may be multiple rows, for example two rows of M/2 tiles each, or four rows of M/4 tiles each, and so on. In some cases, four rows may be sufficient, as there may be diminishing returns from further spatial distribution, but in some cases the number of rows may be greater than four but less than M.
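As a rough illustration of splitting M tiles into several rows, the following Python sketch assigns tile indices to a chosen number of rows. The function name arrange_tiles and the contiguous assignment order are assumptions made for illustration; the description does not specify how tiles are ordered among the rows.

```python
def arrange_tiles(num_tiles, num_rows):
    """Split tile indices 0..num_tiles-1 into num_rows rows of equal length.

    The contiguous ordering is an illustrative assumption.
    """
    if num_rows < 1 or num_tiles % num_rows != 0:
        raise ValueError("num_tiles must be a multiple of a positive num_rows")
    per_row = num_tiles // num_rows
    return [list(range(r * per_row, (r + 1) * per_row)) for r in range(num_rows)]


# Example: M = 8 tiles arranged as two rows of M/2 tiles each.
print(arrange_tiles(8, 2))
```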

In some implementations, the electronic integrated circuit includes circuitry for components such as weight drivers, data drivers, memory (e.g., storing matrix weights for the modulators and accumulation results), DACs, ADCs, digital logic (e.g., for accumulation), and portions of a digital data bus for communicating with other tiles. In most cases, only limited communication is required between different tiles (e.g., different rows of the matrix) because there is limited correlation between the data calculated in the different tiles. Thus, the layout may allow the (short) wires whose currents are summed into a given transimpedance amplifier (and the corresponding element of the output vector) to be relatively independent in the layout. In most cases, there is no relationship between a given output vector and the input vector for the next iteration, but in some iterations of the computation (e.g., a neural network computation), there is a correlation between the elements of the output vector and the corresponding elements of the input vector used in the next iteration. In some examples, other elements may be further correlated, such as when all elements are accumulated as part of a normalization calculation that divides each element by the accumulated sum. Thus, in the layout, components that need to communicate with each other more frequently can be arranged together.

Fig. 24A illustrates an example configuration of a system 2400A for an implementation of the system 1800 in which there are multiple devices 2410A, 2410B, 2410C, and 2410D (collectively 2410) hosting respective different multiplication modules (e.g., multiplication modules 1806A, 1806B, 1806C, and 1806D), each configured as a vector matrix multiplier subsystem to perform vector matrix multiplication on different subsets of vector elements by different sub-matrices of a larger matrix. Each device may include a substrate, and different devices may have different substrates. The substrate may have a size proportional to the size of the semiconductor wafer. The substrate may also be as large as the entire wafer. For example, rather than implementing a vector matrix multiplier subsystem using a 2 × 2 matrix of elements, each multiplication module may be configured to implement a vector matrix multiplier subsystem using a matrix having dimensions as large as can be efficiently fabricated on a single device having a common substrate for the modules within the device, similar to system 2110 (fig. 21B). For example, each multiplication module may implement a vector matrix multiplier subsystem using a 64×64 element matrix.

The different vector matrix multiplier subsystems are arranged so that the results for each sub-matrix are appropriately combined to produce a result for the larger combined matrix (e.g., the elements of a 128-element vector multiplied by a 128 × 128 element matrix). Each set of optical ports or light sources 2402 provides a set of optical signals representing a different subset of vector elements of a larger input vector. The replication module 2404 is configured to replicate all optical signals within a set of received optical signals (encoded on the guided optical waves in the set 2403 of 64 optical waveguides) and provide the set of optical signals to each of two different sets of optical waveguides, in this example, the set 2405A of 64 optical waveguides and the set 2405B of 64 optical waveguides. For example, this replication operation may be performed using an array of waveguide splitters, in which each splitter replicates one element of a subset of input vector elements (e.g., a subset of 64 elements for each replication module 2404) by dividing an optical wave in the set 2403 of optical waveguides into a first corresponding optical wave in the set 2405A of optical waveguides and a second corresponding optical wave in the set 2405B of optical waveguides. If multiple wavelengths (e.g., W wavelengths) are used in some embodiments, the number of separate waveguides (and thus the number of separate ports or sources in 2402) may be reduced, for example, by a factor of W. Each vector matrix multiplier subsystem 2410 performs a vector matrix multiplication and provides its partial results as a set of electrical signals (for a subset of elements of the output vector), with the respective pairs of partial results from different vector matrix multiplier subsystems 2410 being added together by summing modules 2414A and 2414B, as shown in fig. 24A, using any of the techniques described herein (e.g., current summation at junctions between conductors).
In this example, the output of the device 2410A is transmitted to the summing module 2414A through an electrical lead 2416A and the output of the device 2410B is transmitted to the summing module 2414A through an electrical lead 2416B.
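The way partial results from several sub-matrix devices combine into the result for a larger matrix can be illustrated numerically. The following Python sketch is a simplified model under stated assumptions: the function names are hypothetical, the indexing convention blocks[r][c] (sub-matrix for output subset r and input subset c) is chosen for illustration, and the mapping of sub-matrices to particular devices 2410A-2410D is not taken from the figure. Replication is modeled by reusing the same input subset for each block row, and the summing modules are modeled by element-wise addition.

```python
def vmm(sub_matrix, sub_vector):
    """Model one device: multiply a sub-vector by a sub-matrix."""
    return [sum(w * x for w, x in zip(row, sub_vector)) for row in sub_matrix]


def combined_vmm(blocks, x_subsets):
    """Combine partial results from sub-matrix devices.

    blocks[r][c] is the sub-matrix for output subset r and input subset c
    (an illustrative convention). x_subsets[c] is input subset c.
    """
    output = []
    for block_row in blocks:
        # Each input subset is reused (replicated) for every block row.
        partials = [vmm(block, x) for block, x in zip(block_row, x_subsets)]
        # A summing module adds the partial results element by element.
        output.extend(sum(values) for values in zip(*partials))
    return output


# Example: a 4 x 4 matrix handled as four 2 x 2 sub-matrices.
blocks = [
    [[[1, 2], [5, 6]], [[3, 4], [7, 8]]],
    [[[9, 10], [13, 14]], [[11, 12], [15, 16]]],
]
print(combined_vmm(blocks, [[1, 0], [2, 1]]))
```

The same structure would apply, for instance, to four 64 × 64 sub-matrices combined into a 128 × 128 matrix; small sizes are used here only to keep the example readable.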

In some embodiments, for any number of recursion levels, vector matrix multiplication using the desired matrix may be performed recursively by combining the results from smaller sub-matrices, ending with the use of a single-element optical amplitude modulator at the root level of the recursion. At different levels of the recursion, the vector matrix multiplier subsystem arrangement may be more compact (e.g., different data centers connected by long-distance fiber optic networks at one level, different multi-chip arrangements connected by optical fibers at another level, different chips at another level, and different portions of modules on the same chip connected by on-chip waveguides at another level).
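The recursive decomposition can be sketched in a few lines of Python. This is a numerical illustration only, with assumptions labeled: the function name recursive_vmm is hypothetical, the matrix is assumed to be square with a power-of-two dimension, each level splits the matrix into four quadrants, and the single-element modulator at the root level is modeled as one scalar multiplication.

```python
def recursive_vmm(matrix, vector):
    """Multiply vector by a square matrix by recursing on four quadrants.

    Assumes the dimension is a power of two (an illustrative restriction).
    """
    n = len(vector)
    if n == 1:
        # Root level: a single-element multiplication.
        return [matrix[0][0] * vector[0]]
    if n % 2 != 0:
        raise ValueError("dimension must be a power of two")
    h = n // 2

    def quadrant(r0, c0):
        return [row[c0:c0 + h] for row in matrix[r0:r0 + h]]

    x_lo, x_hi = vector[:h], vector[h:]
    # Partial results from two sub-matrices are summed for each output half.
    top = [a + b for a, b in zip(recursive_vmm(quadrant(0, 0), x_lo),
                                 recursive_vmm(quadrant(0, h), x_hi))]
    bottom = [a + b for a, b in zip(recursive_vmm(quadrant(h, 0), x_lo),
                                    recursive_vmm(quadrant(h, h), x_hi))]
    return top + bottom


matrix = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(recursive_vmm(matrix, [1, 0, 2, 1]))
```

In a physical system each recursion level could correspond to a different scale of hardware, as described above; the sketch captures only the arithmetic.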

Fig. 24B illustrates another example configuration, system 2400B, in which additional devices are used for optical transmission and reception for each vector matrix multiplier subsystem 2410. In some implementations, the different vector matrix multiplier subsystems 2410 are carried by separate devices and/or distributed across separate remote locations. In this example, the electrical signals at the output 2418 of each vector matrix multiplier subsystem 2410 are converted to optical signals using the optical transmitter array 2420, and each optical signal is coupled to a channel within an optical transmission line, e.g., an optical fiber in the fiber optic bundle 2416 between the separate devices and/or remote locations. The optical transmitter array 2420 may include, for example, an array of laser diodes that convert the electrical signals at the output of the vector matrix multiplier subsystem 2410 into optical signals. In some implementations, the different vector matrix multiplier subsystems 2410 are located in different regions of an integrated device, such as a system on a chip, that carries the vector matrix multiplier subsystems 2410 on a common substrate. In that case, the electrical signals at the output 2418 of each vector matrix multiplier subsystem 2410 are converted to optical signals using the optical transmitter array 2420, and each optical signal is coupled to a channel in a waveguide in a set of waveguides between the different regions of the integrated device.

An optical receiver array 2422 is used for each subset of output vector elements to convert the optical signals into electrical signals before the corresponding pairs of partial results are summed by the summing module 2414.

Fig. 24C illustrates another example configuration, system 2400C, in which the vector matrix multiplier subsystems 2410 can be reconfigured so that the vector matrix multiplications for the different sub-matrices can be rearranged in different ways. For example, the shape of the larger matrix formed by combining the different sub-matrices may be configurable. A user can dynamically configure how the different sub-matrices are combined based on the computational requirements, which provides more flexibility in the operation of the optical processor. In this example, two different subsets of optical signals 2424A and 2424B are provided from each set of optical ports or light sources 2402 to the optical switch 2430. There is also an electrical switch 2440 that is capable of rearranging the subsets of electrical signals, representing partial results of one output vector or of separate output vectors, that are summed by the summing module 2414 to provide the desired calculation. For example, instead of vector matrix multiplication using a matrix of size 2m x 2n consisting of four sub-matrices of size m x n, the vector matrix multiplier subsystems 2410 may be rearranged to use a matrix of size 2m x n or a matrix of size m x 2n.

Fig. 24D illustrates another example configuration, system 2400D, in which the vector matrix multiplier subsystems 2410 can be reconfigured in additional ways. The optical switch 2430 may receive up to four separate sets of optical signals and may be configured to provide different sets of optical signals to different vector matrix multiplier subsystems 2410 or to copy any set of optical signals to multiple vector matrix multiplier subsystems 2410. Moreover, the electrical switch 2440 can be configured to provide any combination of the received sets of electrical signals to the summing module 2414. This greater reconfigurability enables a wider variety of vector matrix multiplication calculations, including multiplication using matrices of sizes m x 3n, 3m x n, m x 4n, and 4m x n.
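
The effect of the two switches can be sketched numerically as a grid of sub-matrices whose layout is chosen at run time. The sketch below is an illustration under assumptions: `grid_vmm` and its arguments are hypothetical names, each sub-matrix is taken to have m rows and n columns, the "optical switch" is modeled simply as the choice of which input subset feeds each block column, and the "electrical switch" as the choice of which partial results are summed together.

```python
def vmm(matrix, vector):
    """Direct product: one output element per matrix row."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def grid_vmm(block_grid, input_subsets):
    """block_grid[r][c] is an m x n sub-matrix; input_subsets[c] has n elements.

    Blocks in the same grid row share output elements, so their partial
    results are summed. Blocks in the same grid column share an input subset.
    """
    outputs = []
    for block_row in block_grid:
        partials = [vmm(block, subset)
                    for block, subset in zip(block_row, input_subsets)]
        outputs.extend(sum(column) for column in zip(*partials))
    return outputs


# Hypothetical 2 x 2 sub-matrices (m = n = 2).
I2 = [[1, 0], [0, 1]]
D2 = [[2, 0], [0, 2]]

square = grid_vmm([[I2, D2], [D2, I2]], [[1, 2], [3, 4]])        # 2m x 2n
wide = grid_vmm([[I2, D2, I2, D2]],
                [[1, 2], [3, 4], [5, 6], [7, 8]])                # m x 4n
tall = grid_vmm([[I2], [D2], [I2], [D2]], [[1, 2]])              # 4m x n
```

With these values, `square` is `[7, 10, 5, 8]`, `wide` is `[26, 32]`, and `tall` is `[1, 2, 2, 4, 1, 2, 2, 4]`: the same four sub-matrices produce outputs of different lengths depending only on how inputs are routed and partial results are summed.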

Fig. 24E shows another example configuration, system 2400E, that includes additional circuitry that may perform various operations (e.g., digital logic operations) so that system 2400E (e.g., a complete optoelectronic computing system, or an optoelectronic subsystem of a larger computing platform) can be used to implement computing techniques such as artificial neural networks or other forms of machine learning. Data storage subsystem 2450 can include volatile storage media (e.g., SRAM and/or DRAM) and/or nonvolatile storage media (e.g., solid state disk and/or hard disk). Data storage subsystem 2450 can also include hierarchical cache modules. The stored data may include, for example, training data, intermediate result data, or production data for feeding to an online computing system. Data storage subsystem 2450 may be configured to provide concurrent access to the input data to be modulated onto the different optical signals provided by the optical ports or light sources 2402. The conversion of the data stored in digital form to an analog form usable for modulation may be performed by circuitry (e.g., a digital-to-analog converter) included at the output of data storage subsystem 2450, or at the input of the optical ports or light sources 2402, or split between the two. The auxiliary processing subsystem 2460 may be configured to perform auxiliary operations (e.g., non-linear operations, data shuffling, etc.) on data that may be cycled through multiple iterations of vector matrix multiplication using the vector matrix multiplier subsystem 2410. The resulting data 2462 from those auxiliary operations may be transmitted in digital form to the data storage subsystem 2450. The data retrieved from data storage subsystem 2450 may be used to modulate the optical signals with an appropriate input vector, and to provide control signals (not shown) that set the modulation levels of the optical amplitude modulators in vector matrix multiplier subsystem 2410. The conversion of data encoded on electrical signals in analog form to digital form may be performed by circuitry (e.g., an analog-to-digital converter) within auxiliary processing subsystem 2460.

In some embodiments, a digital controller (not shown in the figures) is provided to control the operation of data storage subsystem 2450, the hierarchical cache modules, various circuits (e.g., digital-to-analog and analog-to-digital converters), vector matrix multiplier subsystem 2410, and light source 2402. For example, the digital controller is configured to execute program code to implement a neural network having a plurality of hidden layers. The digital controller iteratively performs the matrix processing associated with the various layers of the neural network. The digital controller performs a first iteration of matrix processing by retrieving first matrix data from data storage subsystem 2450 and setting the modulation levels of the optical amplitude modulators in vector matrix multiplier subsystem 2410 based on the retrieved data, wherein the first matrix data represents coefficients of a first layer of the neural network. The digital controller also retrieves a set of input data from the data storage subsystem and sets the modulation levels for the light source 2402 to produce a set of optical input signals representing the elements of a first input vector.

The vector matrix multiplier subsystem 2410 performs matrix processing based on the first input vector and the first matrix data, representing the processing of signals by the first layer of the neural network. After the auxiliary processing subsystem 2460 generates a first set of result data 2462, the digital controller performs a second iteration of matrix processing by retrieving second matrix data, which represents coefficients of a second layer of the neural network, from the data storage subsystem and setting the modulation levels of the optical amplitude modulators in the vector matrix multiplier subsystem 2410 based on the second matrix data. The first set of result data 2462 is used as a second input vector to set the modulation levels of the light source 2402. The vector matrix multiplier subsystem 2410 performs matrix processing based on the second input vector and the second matrix data, representing the processing of signals by the second layer of the neural network, and so on. In the last iteration, the output of the signals processed by the last layer of the neural network is generated.
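
The iteration over layers can be sketched as a simple loop. This is a numerical illustration only: `run_layers`, `vmm`, and `relu` are hypothetical names, the choice of a rectifier as the auxiliary nonlinear operation is an assumption, and the hardware steps (setting modulation levels, driving the light source) appear only as comments.

```python
def vmm(matrix, vector):
    """Direct product: one output element per matrix row."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def relu(values):
    """Assumed auxiliary nonlinear operation (one possible choice)."""
    return [max(0.0, x) for x in values]


def run_layers(layer_matrices, input_vector, activation=relu):
    """Feed the result of each iteration back in as the next input vector."""
    signal = list(input_vector)
    for weights in layer_matrices:
        # Controller step: matrix data sets the modulator levels,
        # and the current signal sets the light source levels.
        product = vmm(weights, signal)
        # Auxiliary processing step: produces the result data that
        # becomes the input vector of the next iteration.
        signal = activation(product)
    return signal


# Hypothetical two-layer example.
layers = [[[1, -1], [2, 0]], [[1, 1]]]
output = run_layers(layers, [1, 3])
```

Here the first layer yields `[-2, 6]`, the rectifier maps it to `[0.0, 6]`, and the second layer yields the final `output` of `[6.0]`.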

In some embodiments, when performing calculations associated with hidden layers of the neural network, the resulting data 2462 is not transmitted to the data storage subsystem 2450, but rather is used by the digital controller to directly control a digital-to-analog converter that generates control signals for setting the modulation levels of the optical amplitude modulators in the vector matrix multiplier subsystem 2410. This reduces the time required to store data to data storage subsystem 2450 and access data from data storage subsystem 2450.

Other processing techniques may be incorporated into other examples of system configurations. For example, various techniques used with other kinds of vector matrix multiplication subsystems (e.g., subsystems that do not have electrical summation or signed multiplication as described herein, but use optical interference) may be incorporated into some system configurations, such as some of the techniques described in U.S. patent publication No. US2017/0351293, which is incorporated herein by reference.

Referring to fig. 32A, an artificial neural network (ANN) computing system 3200 includes an opto-electronic matrix multiplication unit 3220 having, for example, a replication module, a multiplication module, and a summation module as shown in figs. 18 to 24D, so that incoherent or low-coherence optical signals can be processed when performing matrix computations. The artificial neural network computing system 3200 includes a controller 110, a storage unit 120, a DAC unit 130, and an ADC unit 160, similar to those of the system 100 of fig. 1A. The controller 110 receives requests from the computer 102 and transmits computation outputs to the computer 102, similar to what is shown in fig. 1A.

The optoelectronic processor 3210 includes a light source 3230, which may be similar to the laser unit 142 of fig. 1A, in which case the multiple output signals of the light source 3230 are coherent. The light source 3230 may also use light emitting diodes to produce multiple output signals that are incoherent or have low coherence. The opto-electronic matrix multiplication unit 3220 includes a modulator array 144 that receives modulator control signals generated by the first DAC subunit 132 based on the input vector, similar to the operation of the optical processor 140 of fig. 1A. The output of the modulator array 144 is comparable to the output of the optical port/light source 1802 in fig. 18. The opto-electronic matrix multiplication unit 3220 processes the optical signals from the modulator array 144 in a manner similar to the way the replication module 1804, multiplication module 1806, and summation module 1808 process the optical signals from the optical port/light source 1802 in fig. 18.

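Referring to fig. 32B, the opto-electronic matrix multiplication unit 3220 receives an input vector $v = (v_1, v_2, \ldots, v_n)^{T}$ and multiplies the input vector by a matrix

$$M = \begin{pmatrix} M_{11} & M_{12} & \cdots & M_{1n} \\ M_{21} & M_{22} & \cdots & M_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ M_{m1} & M_{m2} & \cdots & M_{mn} \end{pmatrix}$$

to generate an output vector $y = (y_1, y_2, \ldots, y_m)^{T}$, where $y_i = \sum_{j=1}^{n} M_{ij} v_j$.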

The opto-electronic matrix multiplication unit 3220 includes n optical paths 1803_1, 1803_2, ..., 1803_n (collectively 1803) that carry optical signals representing the input vector. The replication module 1804_1 provides copies of the input optical signal v₁ to the multiplication modules 1806_11, 1806_21, ..., 1806_m1. The replication module 1804_2 provides copies of the input optical signal v₂ to the multiplication modules 1806_12, 1806_22, ..., 1806_m2. The replication module 1804_n provides copies of the input optical signal v_n to the multiplication modules 1806_1n, 1806_2n, ..., 1806_mn.

As described above, the amplitudes of the copies of the optical signal v₁ provided by the replication module 1804_1 are the same (or substantially the same) as each other, but differ from the amplitude of the optical signal v₁ provided by the modulator array 144. For example, if the replication module 1804_1 splits the signal power of the optical signal v₁ provided by the modulator array 144 evenly among m signals, each of the m signals will have a power equal to or less than 1/m of the power of the optical signal v₁ provided by the modulator array 144.

The multiplication module 1806_11 multiplies the input signal v₁ by the matrix element M₁₁ to produce M₁₁·v₁. The multiplication module 1806_21 multiplies the input signal v₁ by the matrix element M₂₁ to produce M₂₁·v₁. The multiplication module 1806_m1 multiplies the input signal v₁ by the matrix element M_m1 to produce M_m1·v₁. The multiplication module 1806_12 multiplies the input signal v₂ by the matrix element M₁₂ to produce M₁₂·v₂. The multiplication module 1806_22 multiplies the input signal v₂ by the matrix element M₂₂ to produce M₂₂·v₂. The multiplication module 1806_m2 multiplies the input signal v₂ by the matrix element M_m2 to produce M_m2·v₂. The multiplication module 1806_1n multiplies the input signal v_n by the matrix element M_1n to produce M_1n·v_n. The multiplication module 1806_2n multiplies the input signal v_n by the matrix element M_2n to produce M_2n·v_n. The multiplication module 1806_mn multiplies the input signal v_n by the matrix element M_mn to produce M_mn·v_n, and so on.

The second DAC subunit 134 generates control signals based on the values of the matrix elements and transmits the control signals to the multiplication modules 1806 so that the multiplication modules 1806 can multiply the values of the input vector elements by the values of the matrix elements, for example, using optical amplitude modulation. For example, the multiplication module 1806_11 may include an optical amplitude modulator, and multiplying the input vector element v₁ by the matrix element M₁₁ may be implemented by encoding the value of the matrix element M₁₁ as the amplitude modulation level applied to the input optical signal representing the input vector element v₁.

The summing module 1808_1 receives the outputs of the multiplication modules 1806_11, 1806_12, ..., 1806_1n and produces a sum y₁ equal to M₁₁·v₁ + M₁₂·v₂ + ... + M_1n·v_n. The summing module 1808_2 receives the outputs of the multiplication modules 1806_21, 1806_22, ..., 1806_2n and produces a sum y₂ equal to M₂₁·v₁ + M₂₂·v₂ + ... + M_2n·v_n. The summing module 1808_m receives the outputs of the multiplication modules 1806_m1, 1806_m2, ..., 1806_mn and produces a sum y_m equal to M_m1·v₁ + M_m2·v₂ + ... + M_mn·v_n.
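
The copy, multiply, and sum stages can be traced numerically with a short sketch. It rests on several assumptions that are not taken from the description: signals are treated as non-negative powers, each replication module is an ideal lossless splitter giving exactly 1/m of the input power to each copy, and the final rescaling by m is an illustrative normalization. The function name `copy_multiply_sum` is hypothetical.

```python
def copy_multiply_sum(matrix, vector):
    """Trace the replication, multiplication, and summation stages."""
    m = len(matrix)      # number of output elements (matrix rows)
    n = len(vector)      # number of input elements (matrix columns)
    # Replication: input j is split evenly into m copies, each carrying
    # 1/m of the original power (ideal splitter, assumed lossless).
    copies = [[v / m for _ in range(m)] for v in vector]  # copies[j][i]
    # Multiplication: module (i, j) scales its copy of input j by M_ij.
    products = [[matrix[i][j] * copies[j][i] for j in range(n)]
                for i in range(m)]
    # Summation: module i adds the n products of row i. The factor m
    # undoes the splitting loss (assumed normalization).
    return [m * sum(row) for row in products]


y = copy_multiply_sum([[1, 2, 3], [4, 5, 6]], [2, 4, 6])
```

With m = 2 the split copies are 1, 2, and 3, the row sums are 14 and 32, and `y` is `[28.0, 64.0]`, matching M₁₁·v₁ + M₁₂·v₂ + M₁₃·v₃ = 28 and M₂₁·v₁ + M₂₂·v₂ + M₂₃·v₃ = 64.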

In the system 3200, the output of the opto-electronic matrix multiplication unit 3220 is provided to the ADC unit 160 without passing through a detection unit 146, unlike in the system 100 of fig. 1A. This is because the multiplication modules 1806 or the summation modules 1808 have already converted the optical signals into electrical signals, so a separate detection unit 146 is not required in the system 3200.

Fig. 33 shows a flowchart of an example of a method 3300 of performing artificial neural network calculations using the artificial neural network computing system 3200 of fig. 32A. The steps of method 3300 may be performed by controller 110 of system 3200. In some embodiments, the various steps of method 3300 may be run in parallel, in combination, in a loop, or in any order.

In step 3310, an artificial neural network (ANN) computation request is received that includes an input data set and a first plurality of neural network weights. The input data set includes a first digital input vector. The first digital input vector is a subset of the input data set; for example, it may be a sub-region of an image. The artificial neural network computation request may be generated by various entities (e.g., the computer 102 of fig. 32A). The computer may include one or more of various types of computing devices, such as personal computers, server computers, vehicle computers, and flight computers. An artificial neural network computation request generally refers to an electrical signal that notifies or informs the artificial neural network computing system 3200 that an artificial neural network computation is to be performed. In some embodiments, the artificial neural network computation request may be split into two or more signals. For example, a first signal may query the artificial neural network computing system 3200 to check whether the system 3200 is ready to receive the input data set and the first plurality of neural network weights. In response to an acknowledgement by the system 3200, the computer 102 may transmit a second signal comprising the input data set and the first plurality of neural network weights.

In step 3320, the input data set and the first plurality of neural network weights are stored. The controller 110 may store the input data set and the first plurality of neural network weights in the storage unit 120. Storing the input data set and the first plurality of neural network weights in the storage unit 120 may allow flexibility in the operation of the artificial neural network computing system 3200, which may, for example, improve the overall performance of the system. For example, the input data set may be divided into digital input vectors of a set size and format by retrieving the desired portions of the input data set from the storage unit 120. The different portions of the input data set may be processed in various orders, or shuffled, to allow various types of artificial neural network computations to be performed. For example, where the input and output matrices are of different sizes, shuffling may allow matrix multiplication to be performed by a block matrix multiplication technique. As another example, storing the input data set and the first plurality of neural network weights in the storage unit 120 may allow the artificial neural network computing system 3200 to queue multiple artificial neural network computation requests, which may allow the system 3200 to sustain operation at its full speed without periods of inactivity.

In step 3330, a first plurality of modulator control signals is generated based on the first digital input vector and a first plurality of weight control signals is generated based on the first plurality of neural network weights. The controller 110 may transmit the first DAC control signal to the DAC unit 130 to generate a first plurality of modulator control signals. DAC unit 130 generates a first plurality of modulator control signals based on the first DAC control signals and modulator array 144 generates an optical input vector representing a first digital input vector.

The controller 110 may transmit a second DAC control signal to the DAC unit 130 to generate the first plurality of weight control signals. The DAC unit 130 generates the first plurality of weight control signals based on the second DAC control signal, and the opto-electronic matrix multiplication unit 3220 is reconfigured according to the first plurality of weight control signals, thereby implementing a matrix corresponding to the first plurality of neural network weights.

The second DAC control signal may include a plurality of digital values to be converted into the first plurality of weight control signals by the DAC unit 130. The plurality of digital values generally correspond to the first plurality of neural network weights and may be related to them through various mathematical relationships or look-up tables. For example, the plurality of digital values may be linearly proportional to the first plurality of neural network weights. As another example, the plurality of digital values may be calculated by performing various mathematical operations on the first plurality of neural network weights to generate weight control signals that configure the opto-electronic matrix multiplication unit 3220 to perform the matrix multiplication corresponding to the first plurality of neural network weights.
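
The linearly proportional case can be sketched as follows. Everything specific here is an assumption made for illustration: the function name `weights_to_dac_codes`, the 8-bit resolution, the restriction to weights between 0 and `max_weight`, and the clipping of out-of-range weights.

```python
def weights_to_dac_codes(weights, bits=8, max_weight=1.0):
    """Map weights in [0, max_weight] linearly onto integer DAC codes."""
    full_scale = (1 << bits) - 1          # largest code, e.g. 255 for 8 bits
    codes = []
    for weight in weights:
        # Clip to the representable range (assumed behavior).
        clipped = min(max(weight, 0.0), max_weight)
        codes.append(round(clipped / max_weight * full_scale))
    return codes


codes = weights_to_dac_codes([0.0, 1.0, 2.0, -1.0])
```

In this sketch `codes` is `[0, 255, 255, 0]`: the two out-of-range weights are clipped to the ends of the code range. A look-up table, as mentioned above, could replace the linear expression without changing the interface.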

In step 3340, a first plurality of digital outputs corresponding to the electrical output vector of the opto-electronic matrix multiplication unit 3220 is obtained. The optical input vector produced by the modulator array 144 is processed by the opto-electronic matrix multiplication unit 3220 and converted into an electrical output vector. The electrical output vector is converted to digital values by the ADC unit 160. The controller 110 may, for example, transmit a conversion request to the ADC unit 160 to start converting the voltages output by the opto-electronic matrix multiplication unit 3220 into digital outputs. Once the conversion is completed, the ADC unit 160 may transmit the conversion result to the controller 110. Alternatively, the controller 110 may retrieve the conversion result from the ADC unit 160. The controller 110 may form a digital output vector from the digital outputs, the digital output vector corresponding to the result of the matrix multiplication of the input digital vector. For example, the digital outputs may be organized or concatenated to have a vector format.
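
A rough numerical picture of the conversion step is given below. It is a sketch under assumptions: an idealized ADC with a hypothetical 8-bit resolution and reference voltage `v_ref`, truncation toward zero, and saturation at the ends of the range. The function name `adc_convert` is hypothetical.

```python
def adc_convert(voltages, bits=8, v_ref=1.0):
    """Quantize analog voltages into integer codes and return them as a vector."""
    full_scale = (1 << bits) - 1
    digital_output_vector = []
    for voltage in voltages:
        code = int(voltage / v_ref * full_scale)   # truncate toward zero
        # Saturate instead of overflowing (assumed ADC behavior).
        digital_output_vector.append(min(full_scale, max(0, code)))
    return digital_output_vector


digital = adc_convert([0.0, 1.0, 1.5, -0.2])
```

Here `digital` is `[0, 255, 255, 0]`, with the over-range and negative voltages saturated; the returned list plays the role of the digital output vector formed by the controller.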

In step 3350, a nonlinear transformation is performed on the first digital output vector to produce a first transformed digital output vector. A node, or artificial neuron, of an artificial neural network operates by first performing a weighted sum of the signals received from the nodes of the previous layer, and then performing a nonlinear transformation ("activation") of the weighted sum to produce an output. Various types of artificial neural networks may implement various types of differentiable nonlinear transformations. Examples of nonlinear transformation functions include rectified linear unit (ReLU) functions, sigmoid functions, hyperbolic tangent functions, X² functions, and |X| functions. The nonlinear transformation is performed on the first digital output vector by the controller 110 to produce the first transformed digital output vector. In some embodiments, the nonlinear transformation may be performed by an application-specific digital integrated circuit within the controller 110. For example, the controller 110 may include one or more modules or circuit blocks that are specifically adapted to accelerate the computation of one or more types of nonlinear transformations.
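
The listed transformation functions can be written out as a small sketch that applies one of them element by element to a digital output vector. The names `ACTIVATIONS` and `transform` and the dictionary keys are hypothetical, chosen only for this illustration.

```python
import math


def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


ACTIVATIONS = {
    "relu": relu,
    "sigmoid": sigmoid,
    "tanh": math.tanh,
    "square": lambda x: x * x,
    "abs": abs,
}


def transform(digital_output_vector, name):
    """Apply the named nonlinear transformation to every element."""
    function = ACTIVATIONS[name]
    return [function(x) for x in digital_output_vector]
```

For instance, `transform([-2.0, 0.0, 3.0], "relu")` gives `[0.0, 0.0, 3.0]`, and the same vector under `"square"` gives `[4.0, 0.0, 9.0]`.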

In step 3360, the first transformed digital output vector is stored. The controller 110 may store the first transformed digital output vector in the storage unit 120. In the case where the input data set is divided into a plurality of digital input vectors, the first transformed digital output vector corresponds to the result of the artificial neural network computation for one portion of the input data set, such as the first digital input vector. As such, storing the first transformed digital output vector allows the artificial neural network computing system 3200 to perform computations on other digital input vectors of the input data set and store the results, to be later aggregated into a single artificial neural network output.

In step 3370, an artificial neural network output generated based on the first transformed digital output vector is output. The controller 110 generates the artificial neural network output, which is the result of processing the input data set through the artificial neural network defined by the first plurality of neural network weights. In the case where the input data set is divided into a plurality of digital input vectors, the generated artificial neural network output is an aggregate output that includes the first transformed digital output vector and may further include additional transformed digital output vectors corresponding to other portions of the input data set. Once the artificial neural network output is generated, it is transmitted to the computer (e.g., the computer 102) that originated the artificial neural network computation request.

Various performance metrics may be defined for the artificial neural network computing system 3200 implementing the method 3300. Defining performance metrics may allow the performance of the artificial neural network computing system 3200, which implements the optoelectronic processor 3210, to be compared with the performance of other artificial neural network computing systems that instead implement an electronic matrix multiplication unit. In one aspect, the rate at which an artificial neural network computation can be performed may be indicated in part by a first cycle period, defined as the time elapsed between the step 3320 of storing the input data set and the first plurality of neural network weights in the storage unit and the step 3360 of storing the first transformed digital output vector in the storage unit. Thus, the first cycle period includes the time it takes to convert the electrical signals into optical signals (e.g., step 3330) and to perform the matrix multiplication in the optical and electrical domains (e.g., step 3340). Steps 3320 and 3360 both involve storing data in the storage unit 120, a step shared between the artificial neural network computing system 3200 and a conventional artificial neural network computing system without the optoelectronic processor 3210. As such, the first cycle period, which measures the memory-to-memory transaction time, may allow a realistic or fair comparison of artificial neural network computation throughput between the artificial neural network computing system 3200 and an artificial neural network computing system without the optoelectronic processor 3210 (e.g., a system implementing an electronic matrix multiplication unit).

Because of the rate at which the modulator array 144 can generate the optical input vector (e.g., at 25 GHz) and the processing rate of the opto-electronic matrix multiplication unit 3220 (e.g., >25 GHz), the first cycle period of the artificial neural network computing system 3200 for performing a single artificial neural network computation of a single digital input vector may be close to the inverse of the speed of the modulator array 144 (e.g., 40 ps). The first cycle period may be, for example, less than or equal to 100ps, less than or equal to 200ps, less than or equal to 500ps, less than or equal to 1ns, less than or equal to 2ns, less than or equal to 5ns, or less than or equal to 10ns after considering the delays associated with the signal generation of DAC unit 130 and the ADC conversion of ADC unit 160.

By comparison, the run time for multiplying an M x 1 vector by an M x M matrix on an electronic matrix multiplication unit is generally proportional to M² - 1 processor clock cycles. For M = 32, this multiplication would take about 1024 cycles, resulting in a run time exceeding 300 ns at a 3 GHz clock speed, several orders of magnitude slower than the first cycle period of the artificial neural network computing system 3200.
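
The arithmetic behind this comparison can be checked with a few lines. The sketch assumes a proportionality constant of exactly one clock cycle per unit of M² - 1, which the description does not state, and it compares against the bare modulator period rather than the full first cycle period, which also includes DAC and ADC delays. The function names are hypothetical.

```python
def electronic_runtime_ns(size, clock_ghz):
    """Run time in ns, assuming exactly size**2 - 1 clock cycles."""
    cycles = size * size - 1
    return cycles / clock_ghz      # cycles / (cycles per ns)


def modulator_period_ps(rate_ghz):
    """Period in ps of a modulator running at rate_ghz."""
    return 1000.0 / rate_ghz


runtime_ns = electronic_runtime_ns(32, 3.0)      # 1023 cycles at 3 GHz
period_ps = modulator_period_ps(25.0)            # 25 GHz modulator array
ratio = runtime_ns * 1000.0 / period_ps
```

Under these assumptions `runtime_ns` is 341.0, `period_ps` is 40.0, and `ratio` is 8525.0. Against the quoted first cycle periods of 100 ps to 10 ns, the same 341 ns corresponds to a ratio between roughly 3400 and 34.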

In some embodiments, method 3300 further comprises the step of generating a second plurality of modulator control signals based on the first transformed digital output vector. In some types of artificial neural network computations, a single digital input vector may be repeatedly propagated through, or processed by, the same artificial neural network. As described above, an artificial neural network implementing multi-pass processing may be referred to as a recurrent neural network (RNN). A recurrent neural network is a neural network in which the output of the network during the k-th pass is recycled back to the input of the network and used as the input during the (k+1)-th pass. Recurrent neural networks may have various applications in pattern recognition tasks, such as speech or handwriting recognition. Once the second plurality of modulator control signals is generated, method 3300 may proceed from step 3340 through step 3360 to complete a second pass of the first digital input vector through the artificial neural network. In general, the recycling of the transformed digital output vector as the digital input vector may be repeated for a predetermined number of cycles, depending on the characteristics of the recurrent neural network received in the artificial neural network computation request.

In some embodiments, the method 3300 further comprises the step of generating a second plurality of weight control signals based on a second plurality of neural network weights. In some cases, the artificial neural network computation request further includes the second plurality of neural network weights. As described above, in general, artificial neural networks have one or more hidden layers in addition to an input layer and an output layer. For an artificial neural network having two hidden layers, the second plurality of neural network weights may correspond to the connectivity between the first layer of the artificial neural network and the second layer of the artificial neural network. To process the first digital input vector through the two hidden layers of the artificial neural network, the first digital input vector may first be processed according to method 3300 up to step 3360, in which the result of processing the first digital input vector through the first hidden layer of the artificial neural network is stored in the storage unit 120. The controller 110 then reconfigures the opto-electronic matrix multiplication unit 3220 to perform the matrix multiplication corresponding to the second plurality of neural network weights, which are associated with the second hidden layer of the artificial neural network. Once the opto-electronic matrix multiplication unit 3220 is reconfigured, the method 3300 may generate a plurality of modulator control signals based on the first transformed digital output vector, which produces an updated optical input vector corresponding to the output of the first hidden layer. The updated optical input vector is then processed by the reconfigured opto-electronic matrix multiplication unit 3220, which corresponds to the second hidden layer of the artificial neural network.
In general, the steps described may be repeated until the digital input vector has been processed through all hidden layers of the artificial neural network.

In some embodiments of the opto-electronic matrix multiplication unit 3220, the reconfiguration rate of the opto-electronic matrix multiplication unit 3220 may be significantly slower than the modulation rate of the modulator array 144. In this case, the throughput of the artificial neural network computing system 3200 may be adversely affected by the time spent reconfiguring the opto-electronic matrix multiplication unit 3220, during which artificial neural network computations cannot be performed. To mitigate the effect of the relatively slow reconfiguration time of the opto-electronic matrix multiplication unit 3220, batch processing techniques may be used, in which two or more digital input vectors are propagated through the opto-electronic matrix multiplication unit 3220 without a configuration change, so as to amortize the reconfiguration time over a larger number of digital input vectors.

Fig. 34 shows a diagram 3290 illustrating aspects of the method 3300 of fig. 33. For an artificial neural network with two hidden layers, instead of processing the first digital input vector through the first hidden layer, reconfiguring the opto-electronic matrix multiplication unit 3220 for the second hidden layer, processing the first digital input vector through the reconfigured opto-electronic matrix multiplication unit 3220, and repeating the same operations for the remaining digital input vectors, all digital input vectors of the input data set may first be processed through the opto-electronic matrix multiplication unit 3220 configured for the first hidden layer (configuration #1), as shown in the upper part of diagram 3290. Once all digital input vectors have been processed by the opto-electronic matrix multiplication unit 3220 in configuration #1, the opto-electronic matrix multiplication unit 3220 is reconfigured to configuration #2, which corresponds to the second hidden layer of the artificial neural network. This reconfiguration may be significantly slower than the rate at which the opto-electronic matrix multiplication unit 3220 can process input vectors. Once the opto-electronic matrix multiplication unit 3220 is reconfigured for the second hidden layer, the output vectors from the previous hidden layer may be processed in a batch by the opto-electronic matrix multiplication unit 3220. For large input data sets with tens or hundreds of thousands of digital input vectors, the impact of the reconfiguration time may be reduced by approximately the same factor, which may significantly reduce the fraction of time that the artificial neural network computing system 3200 spends on reconfiguration.
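
The amortization argument can be made concrete with a simple timing model. The model and all numbers in it are hypothetical: it assumes one reconfiguration per layer per vector without batching, one reconfiguration per layer with batching, and fixed per-vector and per-reconfiguration times. The function name `total_time_ns` is an assumption for this sketch.

```python
def total_time_ns(num_vectors, num_layers, vector_ns, reconfig_ns, batched):
    """Total processing time under a simple, assumed timing model."""
    compute = num_vectors * num_layers * vector_ns
    if batched:
        # Configure each layer once and push every vector through it.
        reconfigurations = num_layers
    else:
        # Reconfigure for every layer of every vector.
        reconfigurations = num_layers * num_vectors
    return compute + reconfigurations * reconfig_ns


# Hypothetical figures: 1000 vectors, 2 layers, 1 ns per vector,
# 1000 ns per reconfiguration.
unbatched = total_time_ns(1000, 2, 1, 1000, batched=False)
batched = total_time_ns(1000, 2, 1, 1000, batched=True)
```

With these figures `unbatched` is 2002000 ns and `batched` is 4000 ns. The reconfiguration overhead drops from 2000000 ns to 2000 ns, a factor equal to the number of vectors in the batch.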

To achieve batch processing, in some embodiments, the method 3300 further includes the steps of: generating, by the DAC unit, a second plurality of modulator control signals based on a second digital input vector; obtaining, from the ADC unit, a second plurality of digital outputs corresponding to the output vector of the opto-electronic matrix multiplication unit, the second plurality of digital outputs forming a second digital output vector; performing a nonlinear transformation on the second digital output vector to generate a second transformed digital output vector; and storing the second transformed digital output vector in the storage unit. For example, generating the second plurality of modulator control signals may follow step 3360. Further, the artificial neural network output of step 3370 is in this case based on both the first transformed digital output vector and the second transformed digital output vector. The obtaining, performing, and storing steps are similar to steps 3340 through 3360.

Batch processing is one of many techniques for improving the throughput of the artificial neural network computing system 3200. Another technique is to process multiple digital input vectors in parallel by utilizing wavelength division multiplexing (WDM). As described above, WDM is a technique of simultaneously propagating a plurality of optical signals of different wavelengths through a common propagation channel (e.g., a waveguide of the opto-electronic matrix multiplication unit 3220). Unlike electrical signals, optical signals of different wavelengths may propagate through a common channel without affecting the other optical signals of different wavelengths on the same channel. In addition, optical signals may be added to (multiplexed) or dropped from (demultiplexed) the common propagation channel using well-known structures such as optical multiplexers and demultiplexers.

In the context of the artificial neural network computing system 3200, multiple optical input vectors of different wavelengths may be independently generated, simultaneously propagated through the optical paths and optical processing components (e.g., optical amplitude modulators) of the opto-electronic matrix multiplication unit 3220, and independently processed by the electronic processing components (e.g., detectors and/or summation modules) to enhance the throughput of the artificial neural network computing system 3200.

Referring to fig. 35A, in some embodiments, a wavelength division multiplexing (WDM) artificial neural network (ANN) computing system 3500 includes an optoelectronic processor 3510. The optoelectronic processor 3510 includes an opto-electronic matrix multiplication unit 3520 having a replication module, a multiplication module, and a summation module as shown in figs. 18-24D, which enables processing of incoherent or low-coherence optical signals when performing matrix calculations, wherein the optical signals are encoded at a plurality of wavelengths. The WDM artificial neural network computing system 3500 is similar to the artificial neural network computing system 3200 except that WDM technology is used. In some embodiments of the artificial neural network computing system 3500, the light source 3230 is configured to generate a plurality of wavelengths, such as λ₁, λ₂, and λ₃, similar to the system 104 of fig. 1F.

The multiple wavelengths may preferably be separated by a sufficiently large wavelength spacing to allow for easy multiplexing and demultiplexing onto a common propagation channel. For example, wavelength spacings greater than 0.5nm, 1.0nm, 2.0nm, 3.0nm, or 5.0nm may allow for simple multiplexing and demultiplexing. On the other hand, the range between the shortest wavelength and the longest wavelength of the plurality of wavelengths (the "WDM bandwidth") may preferably be small enough that the characteristics or performance of the opto-electronic matrix multiplication unit 3520 remain substantially the same across the plurality of wavelengths. Optical components are typically dispersive, meaning that their optical properties vary with wavelength. For example, the power splitting ratio of a Mach-Zehnder interferometer may vary with wavelength. However, by designing the opto-electronic matrix multiplication unit 3520 to have a sufficiently large operating wavelength window, and by restricting the wavelengths to that window, the electrical output vector output by the opto-electronic matrix multiplication unit 3520 for each wavelength can be a sufficiently accurate result of the matrix multiplication implemented by the unit. The operating wavelength window may be, for example, 1nm, 2nm, 3nm, 4nm, 5nm, 10nm or 20nm.

The modulator array 144 of the WDM artificial neural network computing system 3500 includes banks of optical modulators configured to generate a plurality of optical input vectors, each bank corresponding to one of the plurality of wavelengths and generating a respective optical input vector having the respective wavelength. For example, for a system having optical input vectors of length 32 and 3 wavelengths (e.g., λ₁, λ₂, and λ₃), the modulator array 144 may have 3 banks of 32 modulators each. In addition, the modulator array 144 includes an optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector comprising the plurality of wavelengths. For example, the optical multiplexer may combine the outputs of the three modulator banks at the three different wavelengths into a single propagation channel (e.g., a waveguide) for each element of the optical input vector. As such, returning to the example above, the combined optical input vector will have 32 optical signals, each signal comprising 3 wavelengths.

The opto-electronic processing components of the WDM artificial neural network computing system 3500 are further configured to demultiplex the plurality of wavelengths and produce a plurality of demultiplexed output electrical signals. Referring to fig. 35B, the opto-electronic matrix multiplication unit 3520 includes optical paths 1803 configured to receive, from the modulator array 144, a combined optical input vector comprising the plurality of wavelengths. For example, optical path 1803_1 receives the combined optical input vector element v₁ at wavelengths λ₁, λ₂, and λ₃. Copies of the optical input vector element v₁ at wavelengths λ₁, λ₂, and λ₃ are provided to multiplication modules 3530_11, 3530_21, …, 3530_m1. In some embodiments where the multiplication modules 3530 output electrical signals, the multiplication module 3530_11 outputs three electrical signals representing M₁₁·v₁, which correspond to the input vector element v₁ at wavelengths λ₁, λ₂, and λ₃, respectively. These output electrical signals are shown as (λ1), (λ2), and (λ3), respectively. Similar notation applies to the outputs of the other multiplication modules. The multiplication module 3530_21 outputs three electrical signals representing M₂₁·v₁, which correspond to the input vector element v₁ at wavelengths λ₁, λ₂, and λ₃, respectively. The multiplication module 3530_m1 outputs three electrical signals representing M_m1·v₁, which correspond to the input vector element v₁ at wavelengths λ₁, λ₂, and λ₃, respectively.

Copies of the optical input vector element v₂ at wavelengths λ₁, λ₂, and λ₃ are provided to multiplication modules 3530_12, 3530_22, …, 3530_m2. The multiplication module 3530_12 outputs three electrical signals representing M₁₂·v₂, which correspond to the input vector element v₂ at wavelengths λ₁, λ₂, and λ₃, respectively. The multiplication module 3530_22 outputs three electrical signals representing M₂₂·v₂, which correspond to the input vector element v₂ at wavelengths λ₁, λ₂, and λ₃, respectively. The multiplication module 3530_m2 outputs three electrical signals representing M_m2·v₂, which correspond to the input vector element v₂ at wavelengths λ₁, λ₂, and λ₃, respectively.

Copies of the optical input vector element v_n at wavelengths λ₁, λ₂, and λ₃ are provided to multiplication modules 3530_1n, 3530_2n, …, 3530_mn. The multiplication module 3530_1n outputs three electrical signals representing M_1n·v_n, which correspond to the input vector element v_n at wavelengths λ₁, λ₂, and λ₃, respectively. The multiplication module 3530_2n outputs three electrical signals representing M_2n·v_n, which correspond to the input vector element v_n at wavelengths λ₁, λ₂, and λ₃, respectively. The multiplication module 3530_mn outputs three electrical signals representing M_mn·v_n, which correspond to the input vector element v_n at wavelengths λ₁, λ₂, and λ₃, respectively, and so on.

For example, each multiplication module 3530 may include a demultiplexer configured to demultiplex the three wavelengths of each of the 32 signals contained in the multi-wavelength optical vector and route the 3 single-wavelength optical output vectors to three sets of photodetectors (e.g., photodetectors 2012, 2016 (FIG. 20B) or 2042, 2046 (FIG. 20C)) coupled to three sets of operational amplifiers or transimpedance amplifiers (e.g., operational amplifiers 2030 (FIG. 20B) or 2050 (FIG. 20C)).

Three sets of summation modules 1808 receive the outputs of the multiplication modules 3530 and produce sums y corresponding to the input vectors at the respective wavelengths. For example, three summation modules 1808_1 receive the outputs of multiplication modules 3530_11, 3530_12, …, 3530_1n and produce sums y₁(λ1), y₁(λ2), and y₁(λ3) for the input vectors at wavelengths λ₁, λ₂, and λ₃, respectively, where the sum y₁ at each wavelength is equal to M₁₁·v₁+M₁₂·v₂+…+M_1n·v_n. Three summation modules 1808_2 receive the outputs of multiplication modules 3530_21, 3530_22, …, 3530_2n and produce sums y₂(λ1), y₂(λ2), and y₂(λ3) for the input vectors at wavelengths λ₁, λ₂, and λ₃, respectively, where the sum y₂ at each wavelength is equal to M₂₁·v₁+M₂₂·v₂+…+M_2n·v_n. Three summation modules 1808_m receive the outputs of multiplication modules 3530_m1, 3530_m2, …, 3530_mn and produce sums y_m(λ1), y_m(λ2), and y_m(λ3) for the input vectors at wavelengths λ₁, λ₂, and λ₃, respectively, where the sum y_m at each wavelength is equal to M_m1·v₁+M_m2·v₂+…+M_mn·v_n.
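The per-wavelength sums can be modeled numerically as one ordinary matrix-vector product per wavelength, all sharing the same matrix. The sketch below is a purely mathematical illustration; the function names, the wavelength labels, and the 2×2 example values are hypothetical and are not taken from the figures.

```python
def matvec(matrix, vector):
    """Ordinary matrix-vector product: y_i = sum_j M_ij * v_j."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def wdm_matvec(matrix, vectors_by_wavelength):
    """Apply the same matrix to each wavelength's input vector independently.

    vectors_by_wavelength maps a wavelength label to the input vector carried
    on that wavelength. The result maps each label to its own output vector,
    mirroring one set of summation modules per wavelength.
    """
    return {lam: matvec(matrix, vec) for lam, vec in vectors_by_wavelength.items()}


M = [[1, 2],
     [3, 4]]
inputs = {"lambda1": [1, 0], "lambda2": [0, 1], "lambda3": [1, 1]}
outputs = wdm_matvec(M, inputs)
for lam, y in outputs.items():
    print(lam, y)
```

Three input vectors are processed in a single pass through one matrix configuration, which is the throughput gain that the wavelength-parallel arrangement aims for.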

Referring again to fig. 35A, the ADC unit 160 of the WDM artificial neural network computing system 3500 includes banks of ADCs configured to convert the plurality of demultiplexed output voltages of the opto-electronic matrix multiplication unit 3520. Each bank of ADCs corresponds to one of the plurality of wavelengths and produces a corresponding digitized demultiplexed output. For example, the banks of ADCs 160 may be coupled to the sets of summation modules 1808.

Controller 110 may implement a method similar to method 3300 (fig. 33), but extended to support multi-wavelength operation. For example, the method may include the steps of obtaining a plurality of digital demultiplexed outputs from the ADC unit 160, the plurality of digital demultiplexed outputs forming a plurality of first digital output vectors, wherein each of the plurality of first digital output vectors corresponds to one of a plurality of wavelengths, performing a nonlinear transformation on each of the plurality of first digital output vectors to produce a plurality of transformed first digital output vectors, and storing the plurality of transformed first digital output vectors in a memory unit.

In some cases, the artificial neural network may be specifically designed, and the digital input vectors may be specifically formed, such that the multi-wavelength products of the multiplication modules 3530 can be added without demultiplexing. In this case, the multiplication modules 3530 may be wavelength-insensitive multiplication modules that do not demultiplex the multiple wavelengths of the multi-wavelength products. As such, each photodetector of a multiplication module 3530 effectively sums the multiple wavelengths of the optical signal into a single photocurrent, and each voltage output by the multiplication module 3530 corresponds to the sum of the products of the vector elements and the matrix element over the multiple wavelengths. The summation modules 1808 (only one set is required) output an element-by-element sum of the matrix multiplication results of the plurality of digital input vectors.
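The wavelength-insensitive case relies on the linearity of the matrix product: summing the per-wavelength products at each detector gives the same result as summing the separate per-wavelength output vectors. The following is a minimal numerical sketch, assuming that detected powers at different wavelengths simply add; the function names and values are hypothetical.

```python
def matvec(matrix, vector):
    """Ordinary matrix-vector product: y_i = sum_j M_ij * v_j."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def wavelength_insensitive_output(matrix, vectors):
    """Model detectors that add all wavelengths into a single photocurrent.

    Each detector (i, j) sees M_ij * v_j summed over every wavelength's
    input vector; a single set of summation modules then adds each row.
    """
    rows, cols = len(matrix), len(matrix[0])
    currents = [
        [sum(matrix[i][j] * vec[j] for vec in vectors) for j in range(cols)]
        for i in range(rows)
    ]
    return [sum(row) for row in currents]


M = [[1, 2],
     [3, 4]]
vectors = [[1, 0], [0, 1], [1, 1]]  # one input vector per wavelength

combined = wavelength_insensitive_output(M, vectors)

# Reference: demultiplex, multiply each vector separately, then add the results.
separate = [matvec(M, v) for v in vectors]
reference = [sum(column) for column in zip(*separate)]
print(combined, reference)
```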

Fig. 35C shows an example configuration of the system 3500 for an implementation of the WDM opto-electronic matrix multiplication unit 3520 that performs vector-matrix multiplication using a 2×2 element matrix, where the summation is performed in the electrical domain. In this embodiment, the input vector is v = [v₁, v₂]ᵀ and the matrix is M = [[M₁₁, M₁₂], [M₂₁, M₂₂]], with the rows listed in order. The input vector has a plurality of wavelengths λ₁, λ₂, and λ₃, and each element of the input vector is encoded on a different optical signal. Two different replication modules 1902 perform optical replication operations to split the computation over different paths (e.g., an "upper" path and a "lower" path). There are four multiplication modules 1904, each of which multiplies by a different matrix element using optical amplitude modulation. The output of each multiplication module 1904 is provided to a demultiplexer and a set of photodetection modules 3310, which convert the wavelength division multiplexed optical signals into electrical signals in the form of currents associated with wavelengths λ₁, λ₂, and λ₃. The two upper paths of the different input vector elements are combined using a set of summation modules 3320 associated with wavelengths λ₁, λ₂, and λ₃, and the two lower paths of the different input vector elements are combined using another set of summation modules 3320 associated with wavelengths λ₁, λ₂, and λ₃, wherein the summation modules 3320 perform the summation in the electrical domain. Thus, each element of the output vector for each wavelength is encoded on a different electrical signal. As shown in fig. 35A, each component of the output vector is incrementally generated as the computation proceeds, yielding the following results for the upper and lower paths, respectively, for each wavelength.

M₁₁v₁+M₁₂v₂

M₂₁v₁+M₂₂v₂

System 3500 may be implemented using any of a variety of opto-electronic technologies. In some embodiments, there is a common substrate (e.g., a semiconductor such as silicon) that can support the integrated optical and electronic components. The optical paths may be implemented in waveguide structures having a material with a higher optical index surrounded by a material with a lower optical index, the materials defining a waveguide for propagating light waves carrying optical signals. The electrical paths may be implemented by an electrically conductive material for propagating an electrical current carrying an electrical signal. (In fig. 35C, the thickness of the lines representing the paths is used to distinguish optical paths, represented by thicker lines, from electrical paths, represented by thinner or dashed lines.) Optical devices (e.g., splitters and optical amplitude modulators) and electronic devices (e.g., photodetectors and operational amplifiers (op-amps)) may be fabricated on the common substrate. Alternatively, different devices with different substrates may be used to implement different portions of the system, and those devices may communicate over communication channels. For example, optical fibers may be used to provide communication channels that carry optical signals between the multiple devices used to implement the overall system. Those optical signals may represent different subsets of the input vector provided when performing vector-matrix multiplication and/or different subsets of the intermediate results calculated when performing vector-matrix multiplication, as described in more detail below.

Up to this point, the nonlinear transformation of the weighted sums, performed as part of the artificial neural network computation, has been carried out in the digital domain by the controller 110. In some cases, the nonlinear transformation may be computationally intensive or power hungry, may significantly increase the complexity of the controller 110, or may otherwise limit the performance of the artificial neural network computing system 3200 (fig. 32A) in terms of throughput or power efficiency. As such, in some embodiments of the artificial neural network computing system, the nonlinear transformation may be performed in the analog domain by analog electronics.

Fig. 36 illustrates a diagram of an example of an artificial neural network computing system 3600. The artificial neural network computing system 3600 is similar to the artificial neural network computing system 3200 except that an analog nonlinear unit 310 is added. The analog nonlinear unit 310 is arranged between the opto-electronic matrix multiplication unit 3220 and the ADC unit 160. The analog nonlinear unit 310 is configured to receive the output voltages from the opto-electronic matrix multiplication unit 3220, apply a nonlinear transfer function, and output the transformed output voltages to the ADC unit 160.

The analog nonlinear unit 310 may be implemented in various ways, as discussed above with respect to the analog nonlinear unit 310 of fig. 3A. The use of the analog nonlinear unit 310 may improve the performance, such as throughput or power efficiency, of the artificial neural network computing system 3600 by reducing the number of steps performed in the digital domain. Moving the nonlinear transformation step out of the digital domain may allow for additional flexibility and improvement in the operation of the artificial neural network computing system. For example, in a recurrent neural network, the output of the opto-electronic matrix multiplication unit 3220 is activated and recycled back to the input of the opto-electronic matrix multiplication unit 3220. In the artificial neural network computing system 3200, the activation step is performed by the controller 110, which requires digitizing the output voltages of the opto-electronic matrix multiplication unit 3220 on every pass through the unit. In the system 3600, however, the activation step is performed before digitization by the ADC unit 160, so the number of ADC conversions required in performing recurrent neural network computations can be reduced.

Fig. 37 shows a diagram of an example of an artificial neural network computing system 3700. The artificial neural network computing system 3700 is similar to the system 3600 of fig. 36, except that it further includes an analog storage unit 320. The analog storage unit 320 is coupled to the DAC unit 130 (e.g., via the first DAC subunit 132), the modulator array 144, and the analog nonlinear unit 310. The analog storage unit 320 includes a multiplexer having a first input coupled to the first DAC subunit 132 and a second input coupled to the analog nonlinear unit 310. This allows the analog storage unit 320 to receive signals from either the first DAC subunit 132 or the analog nonlinear unit 310. The analog storage unit 320 is configured to store analog voltages and output the stored analog voltages. The analog storage unit 320 may be implemented in various ways, as discussed above with respect to the analog storage unit 320 of fig. 3B.

The operation of the artificial neural network computing system 3700 will now be described. The first plurality of modulator control signals output by the DAC unit 130 (e.g., by the first DAC subunit 132) are first input to the modulator array 144 through the analog storage unit 320. In this step, the analog storage unit 320 may simply pass through or buffer the first plurality of modulator control signals. The modulator array 144 generates an optical input vector based on the first plurality of modulator control signals, and the optical input vector propagates through the opto-electronic matrix multiplication unit 3220. The output voltages of the opto-electronic matrix multiplication unit 3220 are nonlinearly transformed by the analog nonlinear unit 310. At this point, instead of being digitized by the ADC unit 160, the output voltages of the analog nonlinear unit 310 are stored by the analog storage unit 320 and then output to the modulator array 144 to be converted into the next optical input vector to be propagated through the opto-electronic matrix multiplication unit 3220. This recurrent processing may be performed under the control of the controller 110 for a preset amount of time or for a preset number of cycles. Once the recurrent processing is completed for a given digital input vector, the transformed output voltages of the analog nonlinear unit 310 are converted by the ADC unit 160.
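The recurrent loop described above can be sketched as a purely numerical model in which the state is fed back for a preset number of cycles and digitized only once at the end. The function names are hypothetical, the use of tanh as the nonlinear transfer function is an assumption chosen only for illustration, and the matrix is assumed to be square so that outputs can be fed back as inputs.

```python
import math


def matvec(matrix, vector):
    """Ordinary matrix-vector product: y_i = sum_j M_ij * v_j."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def recurrent_pass(matrix, first_input, num_cycles, activation=math.tanh):
    """Feed activated outputs back to the input for num_cycles passes.

    Returns the final state and the number of ADC conversions the loop needs.
    The state variable plays the role of the stored analog voltages.
    """
    state = list(first_input)
    for _ in range(num_cycles):
        weighted = matvec(matrix, state)            # matrix multiplication pass
        state = [activation(x) for x in weighted]   # analog nonlinearity, then stored
    adc_conversions = 1  # only the final result is digitized
    return state, adc_conversions


M = [[0.5, -0.25],
     [0.75, 0.1]]
final_state, conversions = recurrent_pass(M, [1.0, 0.5], num_cycles=5)
print(final_state, conversions)
```

By contrast, a model of a system that applies the nonlinearity digitally would count one conversion step per cycle rather than one per input vector.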

The advantages of using the analog storage unit 320 in the system 3700 are similar to those of using the analog storage unit 320 in the system 302 of fig. 3B. Similarly, the performance of recurrent neural network computations using the system 3700 may be similar to that of the system 302 of fig. 3B.

As discussed above with respect to the system 400 of fig. 4A, it can be advantageous for an artificial neural network computing system to operate internally at a bit resolution lower than the resolution of the input data set while maintaining the resolution of the artificial neural network computation output. Referring to fig. 38, a diagram of an example of an artificial neural network computing system 3800 having a 1-bit internal resolution is shown. The artificial neural network computing system 3800 is similar to the artificial neural network computing system 3200 (fig. 32A), except that the DAC unit 130 is replaced by a driver unit 430 and the ADC unit 160 is replaced by a comparator unit 460. The driver unit 430 includes a first driver subunit 432 and a second driver subunit 434. The driver unit 430 of fig. 38 operates in a similar manner to the driver unit 430 of fig. 4A.

The driver unit 430 and the comparator unit 460 in the system 3800 of fig. 38 operate in a similar manner to the driver unit 430 and the comparator unit 460 in the system 400 of fig. 4A. The mathematical representation of the operation of the artificial neural network computing system 3800 of fig. 38 is similar to that of the artificial neural network computing system 400 shown in fig. 4A.

The artificial neural network computing system 3800 performs artificial neural network computations by performing a series of matrix multiplications of 1-bit vectors and then summing the individual matrix multiplication results. Using the example shown in fig. 4A, each of the decomposed input vectors V_bit0 to V_bit3 may be multiplied by a matrix U by having the driver unit 430 generate a sequence of 4 1-bit modulator control signals corresponding to the 4 1-bit input vectors. This in turn produces a sequence of 4 1-bit optical input vectors, which are processed by the opto-electronic matrix multiplication unit 3220 configured by the driver unit 430 to implement matrix multiplication by the matrix U. Next, the controller 110 may obtain, from the comparator unit 460, a sequence of 4 digitized 1-bit optical outputs corresponding to the sequence of 4 1-bit modulator control signals.
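The decompose, multiply, and sum sequence can be illustrated with integer arithmetic. The sketch below is idealized: it assumes unsigned 4-bit input elements, assumes the partial results are recombined with binary weights 2^b, and does not model any thresholding by the comparator unit; all function names and values are hypothetical.

```python
def matvec(matrix, vector):
    """Ordinary matrix-vector product: y_i = sum_j M_ij * v_j."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def bit_planes(vector, num_bits):
    """Split unsigned integers into 1-bit vectors, least significant bit first."""
    return [[(v >> b) & 1 for v in vector] for b in range(num_bits)]


def bit_serial_matvec(matrix, vector, num_bits=4):
    """Multiply each 1-bit vector by the matrix, then add the weighted results."""
    result = [0] * len(matrix)
    for b, plane in enumerate(bit_planes(vector, num_bits)):
        partial = matvec(matrix, plane)  # one 1-bit pass through the matrix
        result = [r + p * 2 ** b for r, p in zip(result, partial)]
    return result


U = [[1, 2],
     [3, 4]]
V = [5, 10]  # 0b0101 and 0b1010

print(bit_planes(V, 4))
print(bit_serial_matvec(U, V), matvec(U, V))
```

Four 1-bit passes reproduce the result of a single 4-bit pass, which is why the 1-bit system has to run four times faster to keep the same effective throughput.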

When a 4-bit vector is decomposed into 4 1-bit vectors, each 1-bit vector should be processed by the artificial neural network computing system 3800 at four times the speed at which other artificial neural network computing systems (e.g., the system 3200 (fig. 32A)) process a single 4-bit vector in order to maintain the same effective artificial neural network computing throughput. This increased internal processing speed can be viewed as time-division multiplexing of the 4 1-bit vectors into the single time slot used for processing a 4-bit vector. The required increase in processing speed may be achieved at least in part by the increased operating speeds of the driver unit 430 and the comparator unit 460 relative to the DAC unit 130 and the ADC unit 160, since a reduction in the resolution of a signal conversion process generally leads to an increase in the achievable signal conversion rate.

In this example, although the signal conversion rate in 1-bit operation is increased by a factor of four, the resulting power consumption may be significantly reduced relative to 4-bit operation. As described above, the power consumption of a signal conversion process typically scales exponentially with bit resolution, while scaling linearly with conversion rate. As such, a 16-fold reduction in per-conversion power may be achieved by the 4-fold reduction in bit resolution, followed by a 4-fold increase in power due to the increased conversion rate. Overall, a 4-fold reduction in operating power may be achieved by the artificial neural network computing system 3800 relative to, for example, the artificial neural network computing system 3200, while maintaining the same effective artificial neural network computing throughput.
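The net figure follows from combining the two factors given in the text. The short sketch below only restates that arithmetic; the function name is hypothetical, and the 16-fold and 4-fold factors are taken from the passage rather than derived from a device model.

```python
def relative_operating_power(per_conversion_reduction, rate_increase):
    """Operating power of the lower-resolution system relative to the original.

    Power is taken to scale linearly with conversion rate, so the relative
    power is the rate increase divided by the per-conversion power reduction.
    """
    return rate_increase / per_conversion_reduction


# 16-fold lower power per conversion, 4-fold higher conversion rate.
ratio = relative_operating_power(per_conversion_reduction=16, rate_increase=4)
print(ratio, 1 / ratio)
```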

While a separate artificial neural network computing system 3800 has been shown and described, in general, the artificial neural network computing system 3200 of fig. 32A may be designed to implement functions similar to the artificial neural network computing system 3800. For example, DAC unit 130 may include a 1-bit DAC subunit configured to generate a 1-bit modulator control signal, and ADC unit 160 may be designed to have a resolution of 1 bit. Such a 1-bit ADC may be similar to or effectively identical to a comparator.

Various alternative system configurations or signal processing techniques may be used with the various embodiments of the different systems, subsystems, and modules described herein.

In some embodiments, it may be useful for some or all of the vector-matrix multiplier subsystems to be replaced with alternative subsystems, including subsystems that use different embodiments of the various replication, multiplication, and/or summation modules. For example, a vector-matrix multiplier subsystem may include the optical replication modules described herein and the electrical summation modules described herein, but the multiplication modules may be replaced with a subsystem that performs the multiplication operations in the electrical domain instead of the optical domain. In this example, the array of optical amplitude modulators may be replaced by an array of detectors that convert the optical signals to electrical signals, followed by an electronic subsystem (e.g., an ASIC, processor, or SoC). Alternatively, if optical signal routing is to be used to reach summation modules configured to detect optical signals, the electronic subsystem may include electrical-to-optical conversion using, for example, an array of electrically modulated optical sources.

In some embodiments, it may be useful to be able to use a single wavelength for some or all of the optical signals used for some or all of the vector matrix multiplier calculations. Alternatively, in some embodiments, to help reduce the number of optical input ports that may be required, the input ports may receive multiplexed optical signals having different values encoded on different optical waves of different wavelengths. Those light waves may then be separated at appropriate locations in the system, depending on whether any of the replication module, multiplication module, and/or summation module are configured to operate over multiple wavelengths. Even in multi-wavelength embodiments, however, it may be useful to use the same wavelength, for example, for different subsets of optical signals used in the same vector matrix multiplier subsystem.

In some embodiments, an accumulator may be used to enable time-domain encoding of the optical and electrical signals received by the various modules, thereby alleviating the need for the electronic circuitry to operate effectively over a large number of different power levels. For example, a signal encoded using binary (on-off) amplitude modulation with a particular duty cycle over N time slots per symbol may be converted into a signal having N amplitude levels per symbol after the signal passes through an accumulator (an analog electronic accumulator that integrates the current or voltage of the electrical signal). Thus, if the optical devices (e.g., the phase modulators in the optical amplitude modulators) are capable of operating at a symbol bandwidth B, they may instead be operated at a symbol bandwidth of B/100, where each symbol value uses N = 100 time slots. An integrated amplitude of 50% corresponds to a 50% duty cycle (e.g., the first 50 time slots are at a non-zero "on" level, followed by 50 time slots at a zero or near-zero "off" level), while an integrated amplitude of 10% corresponds to a 10% duty cycle (e.g., the first 10 time slots are at a non-zero "on" level, followed by 90 time slots at a zero "off" level). In the examples described herein, such an accumulator may be positioned in the path of each electrical signal at any location within the vector-matrix multiplier subsystem that is consistent across the electrical signals, e.g., before the summation module for all electrical signals in the vector-matrix multiplier subsystem, or after the summation module for all of them. The vector-matrix multiplier subsystem may also be configured such that there is no significant relative time shift between the different electrical signals, so that the different symbols remain aligned.
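The duty-cycle encoding can be sketched as follows, assuming an ideal accumulator that simply sums the slot amplitudes over one symbol. The function names are hypothetical, and the "on" level is normalized to 1.

```python
def duty_cycle_symbol(level, num_slots):
    """On/off slot sequence whose number of 'on' slots encodes an integer level."""
    if not 0 <= level <= num_slots:
        raise ValueError("level must be between 0 and num_slots")
    return [1] * level + [0] * (num_slots - level)


def accumulate(slots):
    """Ideal accumulator: integrate the slot amplitudes over one symbol."""
    return sum(slots)


N = 100
half = duty_cycle_symbol(50, N)    # 50% duty cycle
tenth = duty_cycle_symbol(10, N)   # 10% duty cycle

print(accumulate(half) / N, accumulate(tenth) / N)
```

The modulator only ever switches between two levels, while the accumulated value takes one of the levels set by the number of "on" slots, at the cost of a symbol rate reduced by the number of slots per symbol.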

Referring to fig. 40, in some embodiments, homodyne detection may be used to obtain the phase and amplitude of the modulated signal. The homodyne detector 4000 includes a beam splitter 4002 comprising a 2×2 multimode interference (MMI) coupler, two photodetectors 4004a and 4004b, and a subtractor 4006. The beam splitter 4002 receives input signals E₁ and E₂, and the outputs of the beam splitter 4002 are detected by the photodetectors 4004a and 4004b. For example, the input signal E₁ may be the signal to be detected, and the input signal E₂ may be generated by a local oscillator having a constant laser power. The local oscillator signal E₂ is mixed with the input signal E₁ by the beam splitter 4002 before the signals are detected by the photodetectors 4004a and 4004b. The subtractor 4006 outputs the difference between the outputs of the photodetectors 4004a and 4004b. The output 4008 of the subtractor 4006 is proportional to |E₁||E₂| sin(θ), where |E₁| and |E₂| are the amplitudes of the two input optical fields and θ is their relative phase. Since the output depends on the product of the two optical fields, extremely weak optical signals can be detected, even at the single-photon level.
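The sin(θ) dependence can be checked numerically with complex field amplitudes. The sketch below assumes an ideal, lossless 50/50 coupler whose cross port adds a 90-degree phase shift, and photodetectors whose outputs equal the optical power |E|²; these modeling choices and the function name are assumptions for illustration rather than details taken from fig. 40. Under this convention θ is the phase of E₁ minus the phase of E₂, and the constant of proportionality is 2.

```python
import cmath
import math


def homodyne_output(e1, e2):
    """Difference of the detected powers at the two coupler outputs."""
    # Ideal 50/50 2x2 coupler; the cross port adds a 90-degree phase (assumption).
    out_a = (e1 + 1j * e2) / math.sqrt(2)
    out_b = (1j * e1 + e2) / math.sqrt(2)
    # Photodetectors respond to power; the subtractor takes the difference.
    return abs(out_a) ** 2 - abs(out_b) ** 2


signal = cmath.rect(0.01, math.pi / 6)     # weak signal, phase of 30 degrees
local_oscillator = cmath.rect(10.0, 0.0)   # strong local oscillator

measured = homodyne_output(signal, local_oscillator)
theta = cmath.phase(signal) - cmath.phase(local_oscillator)
predicted = 2 * abs(signal) * abs(local_oscillator) * math.sin(theta)
print(measured, predicted)
```

Because the difference scales with the local oscillator amplitude, raising the local oscillator power increases the detected signal. The sign of the output follows the sign of sin(θ).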

For example, the homodyne detector 4000 may be used in the systems shown in figs. 1A, 1F, 3A-4A, 5, 7, 9, 18-24E, 26-32B, and 35A-38. The homodyne detector 4000 provides gain on the signal and thus a better signal-to-noise ratio. For coherent systems, the homodyne detector 4000 provides the added benefit of revealing the phase information of the signal through the polarity of the detection result.

In the embodiment of fig. 19B, the configuration of the system 1920 uses a 2×2 element matrix, where two input vector elements are encoded on two optical signals using two different respective wavelengths λ₁ and λ₂. The two optical signals may be provided to the system 1920 using, for example, two optical fibers. Similarly, a system that performs matrix processing on a 4×4 matrix may receive four input optical signals carried on four optical fibers. While more optical fibers can be used to carry more input optical signals for systems that process larger matrices, it is difficult to couple a large number of optical fibers to an optoelectronic chip because the coupling between the fibers and the optoelectronic chip takes up a considerable amount of space.

One way to reduce the number of optical fibers required to carry the optical signals to the optoelectronic chip is to use wavelength division multiplexing. A single optical fiber may be used to multiplex and transmit multiple optical signals having different wavelengths. For example, referring to fig. 41, in computing system 4100, a first optical signal 4102 having a wavelength λ₁ is modulated by a first modulator 4104 to produce a first modulated optical signal 4120 representing a first input vector element V1. A second optical signal 4106 having a wavelength λ₂ is modulated by a second modulator 4108 to produce a second modulated optical signal 4122 representing a second input vector element V2. The first and second modulated optical signals are combined by a multiplexer 4110 to produce a wavelength division multiplexed signal that is transmitted over optical fiber 4112 to an optoelectronic chip 4114, the optoelectronic chip 4114 comprising a plurality of matrix multiplication modules 4116a, 4116b, 4116c, and 4116d (collectively 4116) and 4118a, 4118b, 4118c, and 4118d (collectively 4118).

Inside the optoelectronic chip 4114, the wavelength division multiplexed signal is demultiplexed by a demultiplexer 4118 to separate the optical signal 4120 and the optical signal 4122. In this example, optical signal 4120 is replicated by replication module 4124 to produce replicas of the optical signal that are transmitted to matrix multiplication modules 4116a and 4118a. Optical signal 4122 is replicated by replication module 4126 to produce replicas of the optical signal that are transmitted to matrix multiplication modules 4116b and 4118b. The outputs of the matrix multiplication modules 4116a and 4116b are combined using an optical coupler 4120a, and the combined signal is detected by a photodetector 4122a.

A third optical signal 4124' having the wavelength λ₁ is modulated by a third modulator 4128 to produce a third modulated optical signal 4132 representing a third input vector element V3. A fourth optical signal 4126' having the wavelength λ₂ is modulated by a fourth modulator 4130 to produce a fourth modulated optical signal 4134 representing a fourth input vector element V4. The third and fourth modulated optical signals are combined by multiplexer 4136 to produce a wavelength division multiplexed signal that is transmitted through optical fiber 4138 to the optoelectronic chip 4114.

Inside the optoelectronic chip 4114, the wavelength division multiplexed signal provided by the optical fiber 4138 is demultiplexed by a demultiplexer 4140 to separate the optical signals 4132 and 4134. In this example, optical signal 4132 is replicated by replication module 4142 to produce replicas of the optical signal that are transmitted to matrix multiplication modules 4116c and 4118c. Optical signal 4134 is replicated by replication module 4144 to produce replicas of the optical signal that are transmitted to matrix multiplication modules 4116d and 4118d. The outputs of the matrix multiplication modules 4116c and 4116d are combined using an optical coupler 4120b, and the combined signal is detected by a photodetector 4122b. The outputs of the matrix multiplication modules 4118a and 4118b are combined using an optical coupler, and the combined signal is detected by a photodetector. The outputs of the matrix multiplication modules 4118c and 4118d are combined using an optical coupler, and the combined signal is detected by a photodetector.

In some embodiments, the multiplexer may multiplex optical signals having three or more (e.g., 10 or 100) wavelengths to produce wavelength division multiplexed signals that are transmitted by a single optical fiber, and the demultiplexer inside the optoelectronic chip may demultiplex the wavelength division multiplexed signals to separate signals having different wavelengths. This allows more optical signals to be transmitted in parallel through the optical fiber to the optoelectronic chip, increasing the data processing throughput of the optoelectronic chip.
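The multiplexing and demultiplexing steps above can be pictured with a toy data model. The sketch below is only a bookkeeping illustration, not an optical simulation: a fiber is represented as a mapping from wavelength to complex field amplitude, and the names `multiplex` and `demultiplex`, the 3nm channel spacing, and the amplitude values are all assumptions made for the example.

```python
from typing import Dict, List

Fiber = Dict[float, complex]  # wavelength in nm -> complex field amplitude


def multiplex(channels: List[Fiber]) -> Fiber:
    """Combine single-wavelength signals onto one fiber."""
    fiber: Fiber = {}
    for channel in channels:
        for wavelength_nm, amplitude in channel.items():
            if wavelength_nm in fiber:
                raise ValueError(f"wavelength {wavelength_nm} nm already in use")
            fiber[wavelength_nm] = amplitude
    return fiber


def demultiplex(fiber: Fiber) -> List[Fiber]:
    """Separate the fiber signal into one signal per wavelength, sorted by wavelength."""
    return [{wl: amp} for wl, amp in sorted(fiber.items())]


# Ten input vector elements carried on ten hypothetical wavelengths, 3 nm apart.
inputs = [{1530.0 + 3.0 * k: complex(0.1 * k, 0.0)} for k in range(10)]
fiber = multiplex(inputs)        # one fiber now carries all ten signals
recovered = demultiplex(fiber)   # the chip-side demultiplexer separates them again
```

In this model a single fiber carries ten input vector elements that would otherwise need ten fibers, which is the space saving the paragraph above describes.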

In some examples, the laser unit 142 of fig. 1A includes a single laser that provides a light wave that can be modulated with different optical signals. In that case, the light waves in the individual waveguides of the system have a common wavelength, i.e., wavelengths that are the same as each other to within the linewidth of the laser. For example, the light waves may have wavelengths within 1nm of each other. However, the laser unit 142 may also include multiple lasers that enable wavelength division multiplexing operations, with different optical signals modulated onto different respective light waves (e.g., each having a linewidth of 1nm or less). The different light waves may have peak wavelengths that are separated from each other by a wavelength distance (e.g., greater than 1nm) that is greater than the linewidth of the individual lasers. In some examples, wavelength division multiplexing systems may use optical signals modulated onto light waves whose wavelengths differ by a few nanometers (e.g., 3nm or more). However, if the demultiplexer has a finer resolution, the difference between the different wavelengths in the WDM system can also be less than 3nm.
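The spacing condition in the preceding paragraph can be expressed as a simple check. The sketch below is a simplification that treats the laser linewidth and the demultiplexer resolution as hard thresholds; the function name `channels_resolvable` and the example wavelengths are hypothetical.

```python
from typing import Sequence


def channels_resolvable(peaks_nm: Sequence[float],
                        linewidth_nm: float,
                        demux_resolution_nm: float) -> bool:
    """Return True if every pair of adjacent peak wavelengths is separated by
    more than both the laser linewidth and the demultiplexer resolution."""
    ordered = sorted(peaks_nm)
    min_spacing = max(linewidth_nm, demux_resolution_nm)
    return all(b - a > min_spacing for a, b in zip(ordered, ordered[1:]))


# 3 nm spacing, 1 nm linewidth, 2 nm demultiplexer resolution: channels separate.
coarse_ok = channels_resolvable([1550.0, 1553.0, 1556.0], 1.0, 2.0)
# 1.5 nm spacing is too tight for a 2 nm demultiplexer ...
tight_fails = channels_resolvable([1550.0, 1551.5], 1.0, 2.0)
# ... but works if the demultiplexer resolves 1 nm.
tight_ok = channels_resolvable([1550.0, 1551.5], 1.0, 1.0)
```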

In some embodiments, instead of using solder balls to electrically couple the photonic integrated circuit and the electronic integrated circuit as shown in fig. 46 and 47, contact pins may also be used. For example, a land grid array or a pin grid array may be used to electrically couple the photonic integrated circuit and the electronic integrated circuit.

Referring to fig. 48, in some embodiments, the artificial neural network computing system 4900 includes a first semiconductor die having a first photonic integrated circuit 4602, a second semiconductor die having a first electronic integrated circuit 4702, and a third semiconductor die having a second electronic integrated circuit 4802. The second electronic integrated circuit 4802 is coupled to the first electronic integrated circuit 4702 using a controlled collapse chip connection.

Referring to fig. 49, in some embodiments, the artificial neural network computing system 4900 includes a first semiconductor die having a first photonic integrated circuit 4602, a second semiconductor die having a first electronic integrated circuit 4702, a third semiconductor die having a second electronic integrated circuit 4802, and a fourth semiconductor die having a second photonic integrated circuit 4902. The second photonic integrated circuit 4902 is coupled to the second electronic integrated circuit 4802 using stacked chip connections. In addition, optical fiber 4904 may provide an optical path for optical signals to be transmitted between first photonic integrated circuit 4602 and second photonic integrated circuit 4902.

The digital controllers (e.g., for controlling the components shown in FIG. 24E) and functional operations described in this disclosure may be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures in this disclosure and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this disclosure can be implemented using one or more modules of computer program instructions encoded on a computer-readable medium to be executed by, or to control the operation of, a data processing apparatus. The computer-readable medium may be an article of manufacture (e.g., a hard disk drive in a computer system or an optical disc sold through retail channels) or an embedded system. The computer-readable medium may also be acquired separately and later encoded with the one or more modules of computer program instructions, for example, by delivering the one or more modules over a wired or wireless network. The computer-readable medium may be a machine-readable storage device, a machine-readable storage substrate, a memory device, or a combination of one or more of them.

The processes and logic flows described in this disclosure can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).

While the disclosure has been described in connection with certain embodiments, it is to be understood that the disclosure is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims, which scope is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures as is permitted under the law.

For example, fig. 42 shows probability distribution functions for data sets where small coefficients occur more frequently. In another example, assume that the data set has the property that the probability distribution function (PDF) of the coefficients yields a higher probability (and thus more frequent instances) for large coefficients (i.e., coefficients having relatively large absolute values). For such data sets ("high coefficient weighted data sets"), reduced power consumption may be achieved by designing the modulator such that it operates in a lower power state for computations using larger coefficients (which occur more often in the data set) and in a higher power state for computations using smaller coefficients (which occur less frequently in the data set).
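The power argument above amounts to comparing expected values. The following sketch assumes a hypothetical discrete PDF over coefficient magnitudes and two hypothetical modulator designs with linear power curves in arbitrary units; none of these numbers come from the disclosure.

```python
from typing import Callable, Dict


def expected_power(pdf: Dict[float, float],
                   power_of: Callable[[float], float]) -> float:
    """Average modulator power: each coefficient's power weighted by its probability."""
    total_probability = sum(pdf.values())
    return sum(p * power_of(c) for c, p in pdf.items()) / total_probability


# Hypothetical PDF over coefficient magnitudes in which large values dominate.
high_weighted_pdf = {0.0: 0.05, 0.25: 0.10, 0.5: 0.15, 0.75: 0.30, 1.0: 0.40}


def power_rises_with_magnitude(c: float) -> float:
    """Design A (assumed): larger coefficients cost more power."""
    return c


def power_falls_with_magnitude(c: float) -> float:
    """Design B (assumed): larger coefficients cost less power."""
    return 1.0 - c


rising = expected_power(high_weighted_pdf, power_rises_with_magnitude)
falling = expected_power(high_weighted_pdf, power_falls_with_magnitude)
```

For this distribution, design B has the lower average power because the frequently occurring large coefficients fall in its low power state; for a data set dominated by small coefficients the comparison would reverse.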

Some background information for the various systems described in this specification is disclosed in U.S. provisional application 62/680,944 filed on 5 th month 2018, U.S. provisional application 62/744,706 filed on October 12, 2018, and U.S. application 16/431,167 filed on June 4, 2019. The entire disclosures of the above applications are incorporated herein by reference.

For example, the optical replication distribution network may include a plurality of optical splitters, a plurality of directional couplers, or both. For example, an optical replication distribution network may include a cascaded directional coupler having N output ports, where each output port outputs 1/N of the input power to the optical replication distribution network.
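One way to obtain equal 1/N outputs from a cascade is to have each coupler tap a progressively larger fraction of the power that reaches it. The sketch below assumes a lossless linear chain in which the k-th coupler taps one output and passes the rest onward; the disclosure does not prescribe this particular arrangement, and the function names are illustrative.

```python
from typing import List


def cascade_tap_ratios(n_outputs: int) -> List[float]:
    """Power tap fraction of each coupler in a chain so every output gets 1/N.

    Coupler k (0-based) sees the power left by the earlier couplers and taps
    1 / (N - k) of it; the last entry is 1.0, meaning the final output simply
    takes whatever power remains.
    """
    if n_outputs < 1:
        raise ValueError("n_outputs must be at least 1")
    return [1.0 / (n_outputs - k) for k in range(n_outputs)]


def output_powers(input_power: float, tap_ratios: List[float]) -> List[float]:
    """Propagate power down the chain, ignoring loss."""
    remaining = input_power
    outputs = []
    for ratio in tap_ratios:
        tapped = remaining * ratio
        outputs.append(tapped)
        remaining -= tapped
    return outputs


ratios = cascade_tap_ratios(4)        # taps of 1/4, 1/3, 1/2, then all that remains
powers = output_powers(1.0, ratios)   # each of the four outputs carries about 0.25
```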

Although the present disclosure is defined in the appended claims, it should be understood that the present disclosure may also be defined in terms of the following embodiments:

Embodiment 1 a system comprising:

A storage unit configured to store a data set and a plurality of neural network weights;

A digital-to-analog converter (DAC) unit configured to generate a plurality of modulator control signals and to generate a plurality of weight control signals;

An optical processor, comprising:

a laser unit configured to generate a plurality of light outputs;

A plurality of light modulators coupled to the laser unit and the DAC unit, the plurality of light modulators configured to generate light input vectors by modulating a plurality of light outputs generated by the laser unit based on a plurality of modulator control signals;

A light matrix multiplication unit coupled to the plurality of light modulators and the DAC unit, the light matrix multiplication unit configured to convert the light input vector into a light output vector based on the plurality of weight control signals, and

A photodetector unit coupled to the light matrix multiplication unit and configured to generate a plurality of output voltages corresponding to the light output vector;

An analog-to-digital converter (ADC) unit coupled to the photodetector unit and configured to convert the plurality of output voltages into a plurality of digital light outputs;

A controller comprising an integrated circuit configured to:

Receiving an artificial neural network calculation request from a computer comprising an input dataset and a first plurality of neural network weights, wherein the input dataset comprises a first digital input vector;

storing the input data set and the first plurality of neural network weights in a memory unit, and

Generating, by the DAC unit, a first plurality of modulator control signals based on the first digital input vector, and a first plurality of weight control signals based on the first plurality of neural network weights.

Embodiment 2 the system of embodiment 1, wherein the operations further comprise:

obtaining a first plurality of digital light outputs from the ADC unit corresponding to the light output vectors of the light matrix multiplication unit, the first plurality of digital light outputs forming a first digital output vector;

Performing a nonlinear transformation on the first digital output vector to produce a first transformed digital output vector, and

The first transformed digital output vector is stored in a memory unit.
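The controller operations of embodiments 1 and 2 can be summarized as a purely digital sketch. The code below is not the disclosed hardware: the light matrix multiplication and photodetection are replaced by an ordinary matrix-vector product, the DAC and ADC are modeled as ideal uniform rounders, the nonlinear transformation is taken to be a ReLU, and the names `quantize`, `run_layer`, and the dictionary keys are hypothetical.

```python
from typing import Dict, List


def quantize(value: float, bits: int, full_scale: float = 1.0) -> float:
    """Clip to [-full_scale, full_scale] and round to a step of full_scale / 2**(bits - 1)."""
    step = full_scale / 2 ** (bits - 1)
    clipped = max(-full_scale, min(full_scale, value))
    return round(clipped / step) * step


def relu(vector: List[float]) -> List[float]:
    """Assumed nonlinear transformation."""
    return [max(0.0, v) for v in vector]


def run_layer(memory: Dict[str, list], input_key: str, weight_key: str,
              output_key: str, dac_bits: int = 8, adc_bits: int = 8) -> List[float]:
    x = memory[input_key]
    w = memory[weight_key]
    # DAC unit: modulator control signals and weight control signals.
    modulator_controls = [quantize(x_j, dac_bits) for x_j in x]
    weight_controls = [[quantize(w_ij, dac_bits) for w_ij in row] for row in w]
    # Stand-in for the light matrix multiplication unit and photodetector unit.
    analog = [sum(w_ij * x_j for w_ij, x_j in zip(row, modulator_controls))
              for row in weight_controls]
    # ADC unit; full scale chosen as len(x) because each product is at most 1 in magnitude.
    digital = [quantize(v, adc_bits, full_scale=float(len(x))) for v in analog]
    transformed = relu(digital)
    memory[output_key] = transformed   # store the transformed digital output vector
    return transformed


memory = {"x": [0.5, -0.25], "w": [[1.0, 0.5], [-0.5, 0.25]]}
result = run_layer(memory, "x", "w", "y")
```

With these example values every intermediate quantity is exactly representable, so the stored vector is [0.375, 0.0]; in general the rounding steps introduce quantization error.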

Embodiment 3 the system of embodiment 2 wherein the system has a first cycle period defined as the time elapsed between the step of storing the input data set and the first plurality of neural network weights in the memory unit and the step of storing the first transformed digital output vector in the memory unit, and

Wherein the first cycle period is less than or equal to 1ns.

Embodiment 4 the system of embodiment 2, wherein the operations further comprise:

outputting an artificial neural network output generated based on the first transformed digital output vector.

Embodiment 5 the system of embodiment 2, wherein the operations further comprise:

a second plurality of modulator control signals is generated by the DAC unit based on the first transformed digital output vector.

Embodiment 6 the system of embodiment 2 wherein the artificial neural network computation request further includes a second plurality of neural network weights, and

Wherein the operations further comprise:

Based on the obtaining of the first plurality of digital light outputs, a second plurality of weight control signals is generated by the DAC unit based on the second plurality of neural network weights.

Embodiment 7 the system of embodiment 6 wherein the first plurality of neural network weights and the second plurality of neural network weights correspond to different layers of an artificial neural network.
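Embodiments 5 to 7 describe feeding the transformed output of one pass back in as the next input while loading the weights of the next layer. A minimal sketch of that loop follows, assuming ideal arithmetic in place of the DAC, optical, and ADC stages and a ReLU as the nonlinear transformation; the function names and the example weights are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(weights: Matrix, vector: Vector) -> Vector:
    """Stand-in for one pass through the light matrix multiplication unit."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def relu(vector: Vector) -> Vector:
    """Assumed nonlinear transformation."""
    return [max(0.0, v) for v in vector]


def run_network(layers: List[Matrix], first_input: Vector) -> Vector:
    """Process the layers in order, reusing each transformed output as the next input."""
    activation = list(first_input)
    for weights in layers:            # first, second, ... plurality of weights
        activation = relu(matvec(weights, activation))
    return activation


layers = [
    [[1.0, -1.0], [0.5, 0.5]],   # first layer: 2 inputs -> 2 outputs
    [[2.0, 1.0]],                # second layer: 2 inputs -> 1 output
]
network_output = run_network(layers, [1.0, 0.5])   # [1.75]
```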

Embodiment 8 the system of embodiment 2 wherein the input data set further comprises a second digital input vector, and

Wherein the operations further comprise:

Generating, by the DAC unit, a second plurality of modulator control signals based on the second digital input vector;

obtaining a second plurality of digital light outputs from the ADC unit corresponding to the light output vectors of the light matrix multiplication unit, the second plurality of digital light outputs forming a second digital output vector;

performing a nonlinear transformation on the second digital output vector to produce a second transformed digital output vector;

Storing the second transformed digital output vector in a memory unit, and

Outputting an artificial neural network output generated based on the first transformed digital output vector and the second transformed digital output vector,

Wherein the light output vector of the light matrix multiplication unit is generated by a second light input vector generated based on a second plurality of modulator control signals, the second light input vector being transformed by the light matrix multiplication unit based on the first mentioned plurality of weight control signals.

Embodiment 9 the system of embodiment 1, further comprising:

An analog nonlinear unit disposed between the photodetecting unit and the ADC unit, the analog nonlinear unit configured to receive a plurality of output voltages from the photodetecting unit, apply a nonlinear transfer function, and output a plurality of converted output voltages to the ADC unit,

Wherein the operations further comprise:

Obtaining a first plurality of converted digital output voltages from the ADC unit, the first plurality of converted digital output voltages forming a first converted digital output vector, and

The first transformed digital output vector is stored in a memory unit.

Embodiment 10 the system of embodiment 1, wherein the integrated circuit of the controller is configured to generate the first plurality of modulator control signals at a rate greater than or equal to 8 GHz.

Embodiment 11 the system of embodiment 1, further comprising:

an analog storage unit disposed between the DAC unit and the plurality of optical modulators, the analog storage unit configured to store an analog voltage and output the stored analog voltage, and

An analog nonlinear unit disposed between the photodetecting unit and the ADC unit, the analog nonlinear unit configured to receive a plurality of output voltages from the photodetecting unit, apply a nonlinear transfer function, and output a plurality of converted output voltages.

Embodiment 12 the system of embodiment 11 wherein the analog memory cell comprises a plurality of capacitors.

Embodiment 13 the system of embodiment 11 wherein the analog storage unit is configured to receive and store a plurality of converted output voltages of the analog nonlinear unit and output the stored plurality of converted output voltages to the plurality of optical modulators, and

Wherein the operations further comprise:

storing a plurality of converted output voltages of the analog nonlinear unit in an analog storage unit based on generating the first plurality of modulator control signals and the first plurality of weight control signals;

outputting the stored converted output voltage through the analog storage unit;

Obtaining a second plurality of converted digital output voltages from the ADC unit, the second plurality of converted digital output voltages forming a second converted digital output vector, and

The second transformed digital output vector is stored in the memory unit.

Embodiment 14 the system of embodiment 1, wherein the input data set of the artificial neural network computation request includes a plurality of digital input vectors,

Wherein the laser unit is configured to generate a plurality of wavelengths,

Wherein the plurality of light modulators comprises:

a plurality of light modulator banks (bank) configured to generate a plurality of light input vectors, each light modulator bank corresponding to one of a plurality of wavelengths and generating a respective light input vector having a respective wavelength, and

An optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector comprising a plurality of wavelengths,

Wherein the photodetector unit is further configured to demultiplex a plurality of wavelengths and produce a plurality of demultiplexed output voltages, and

Wherein the operations include:

Obtaining a plurality of digital demultiplexed optical outputs from the ADC unit, the plurality of digital demultiplexed optical outputs forming a plurality of first digital output vectors, wherein each of the plurality of first digital output vectors corresponds to one of a plurality of wavelengths;

Performing a nonlinear transformation on each of the plurality of first digital output vectors to produce a plurality of transformed first digital output vectors, and

A plurality of transformed first digital output vectors are stored in a memory unit,

Wherein each of the plurality of digital input vectors corresponds to one of the plurality of optical input vectors.

Embodiment 15 the system of embodiment 1, wherein the artificial neural network computation request includes a plurality of digital input vectors,

Wherein the laser unit is configured to generate a plurality of wavelengths,

Wherein the plurality of light modulators comprises:

a plurality of light modulator groups configured to generate a plurality of light input vectors, each light modulator group corresponding to one of the plurality of wavelengths and generating a respective light input vector having a respective wavelength, and

An optical multiplexer configured to combine the plurality of optical input vectors into a combined optical input vector comprising a plurality of wavelengths, and

Wherein the operations include:

obtaining a first plurality of digital light outputs corresponding to the light output vector from the ADC unit, the light output vector comprising a plurality of wavelengths, the first plurality of digital light outputs forming a first digital output vector;

The first transformed digital output vector is stored in a memory unit.

Embodiment 16 the system of embodiment 1, wherein the DAC unit comprises:

a 1-bit DAC subunit configured to generate a plurality of 1-bit modulator control signals,

Wherein the resolution of the ADC unit is 1 bit,

Wherein the first digital input vector has a resolution of N bits, and

Wherein the operations include:

Decomposing the first digital input vector into N 1-bit input vectors, each of the N 1-bit input vectors corresponding to one of the N bits of the first digital input vector;

Generating a sequence of N 1-bit modulator control signals corresponding to the N 1-bit input vectors by a 1-bit DAC subunit;

Obtaining a sequence of N digital 1-bit optical outputs from the ADC unit corresponding to the sequence of N 1-bit modulator control signals;

Constructing an N-bit digital output vector from a sequence of N digital 1-bit optical outputs;

Performing a nonlinear transformation on the constructed N-bit digital output vector to produce a transformed N-bit digital output vector, and

The transformed N-bit digital output vector is stored in a memory unit.
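The decomposition step of embodiment 16 relies on the linearity of matrix multiplication: an N-bit input vector is the weighted sum of its N bit planes, so the full product can be rebuilt from N passes with 1-bit inputs. The sketch below illustrates only that decomposition and a shift-and-add recombination for unsigned integers. It assumes each partial product is read out at full resolution, which differs from the 1-bit ADC of the embodiment, and the function names are hypothetical.

```python
from typing import List


def bit_planes(vector: List[int], n_bits: int) -> List[List[int]]:
    """Split unsigned N-bit integers into N vectors of single bits, least significant first."""
    return [[(v >> k) & 1 for v in vector] for k in range(n_bits)]


def matvec(weights: List[List[int]], vector: List[int]) -> List[int]:
    """Stand-in for one pass through the matrix multiplication unit."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def bit_serial_matvec(weights: List[List[int]], vector: List[int],
                      n_bits: int) -> List[int]:
    """Run one pass per bit plane and recombine the partial results with weights 2**k."""
    result = [0] * len(weights)
    for k, plane in enumerate(bit_planes(vector, n_bits)):
        partial = matvec(weights, plane)          # pass driven by 1-bit control signals
        result = [r + p * 2 ** k for r, p in zip(result, partial)]
    return result


weights = [[1, 2], [3, 4]]
vector = [5, 3]                                   # 3-bit values: 0b101 and 0b011
serial = bit_serial_matvec(weights, vector, n_bits=3)
direct = matvec(weights, vector)                  # [11, 27]
```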

Embodiment 17 the system of embodiment 1, wherein the storage unit comprises:

A digital input vector memory configured to store a first digital input vector and comprising at least one SRAM, and

A neural network weight memory configured to store a plurality of neural network weights and comprising at least one DRAM.

Embodiment 18 the system of embodiment 1, wherein the DAC unit comprises:

a first DAC subunit configured to generate a plurality of modulator control signals, and

A second DAC subunit configured to generate a plurality of weight control signals,

Wherein the first DAC subunit and the second DAC subunit are different.

Embodiment 19 the system of embodiment 1, wherein the laser unit comprises:

A laser source configured to generate light, and

An optical power splitter configured to split light generated by the laser source into a plurality of light outputs, wherein each of the plurality of light outputs has substantially the same power.

Embodiment 20 the system of embodiment 1 wherein the plurality of optical modulators comprises one of a MZI modulator, a ring resonator modulator, or an electroabsorption modulator.

Embodiment 21 the system of embodiment 1, wherein the photodetection unit comprises:

a plurality of photodetectors, and

A plurality of amplifiers configured to convert photocurrents generated by the photodetectors into a plurality of output voltages.
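As a back-of-the-envelope illustration of the photocurrent-to-voltage conversion in embodiment 21, the sketch below assumes a linear photodetector with a fixed responsivity and an ideal transimpedance amplifier. The embodiment says only "amplifiers", so the amplifier type, the function name, and all numeric values are assumptions.

```python
def output_voltage(optical_power_w: float,
                   responsivity_a_per_w: float,
                   transimpedance_ohm: float) -> float:
    """Voltage produced by an ideal photodetector followed by an ideal transimpedance stage."""
    photocurrent_a = responsivity_a_per_w * optical_power_w
    return photocurrent_a * transimpedance_ohm


# Hypothetical values: 10 microwatts of light, 1 A/W responsivity, 10 kilohm gain.
voltage = output_voltage(10e-6, 1.0, 10e3)   # about 0.1 V
```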

Embodiment 22 the system of embodiment 1 wherein the integrated circuit is an application specific integrated circuit.

Embodiment 23 the system of embodiment 1, wherein the optical matrix multiplication unit comprises:

An input waveguide array for receiving an optical input vector;

An optical interference unit in optical communication with the input waveguide array for performing a linear transformation of the optical input vector into a second array of optical signals, and

An array of output waveguides in optical communication with the optical interference unit for guiding a second array of optical signals, wherein at least one input waveguide in the array of input waveguides is in optical communication with each output waveguide in the array of output waveguides through the optical interference unit.

Embodiment 24 the system of embodiment 23, wherein the optical interference unit comprises:

a plurality of interconnected mach-zehnder interferometers (MZI), each of the plurality of interconnected mach-zehnder interferometers comprising:

a first phase shifter configured to change a splitting ratio of the Mach-Zehnder interferometer, and

A second phase shifter configured to shift the phase of one output of the Mach-Zehnder interferometer,

Wherein the first phase shifter and the second phase shifter are coupled to a plurality of weight control signals.
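The roles of the two phase shifters in embodiment 24 can be checked with a small transfer-matrix model. The sketch below assumes ideal lossless 50/50 couplers with a 90° cross-coupling phase, places the first phase shifter (theta) on one arm between the couplers, and places the second phase shifter (phi) on one output. This layout and the function names are illustrative assumptions, not the disclosed device geometry.

```python
import cmath
import math
from typing import List

Matrix2 = List[List[complex]]


def matmul2(a: Matrix2, b: Matrix2) -> Matrix2:
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mzi_transfer(theta: float, phi: float) -> Matrix2:
    """Transfer matrix: coupler, internal phase theta, coupler, external phase phi."""
    s = 1.0 / math.sqrt(2.0)
    coupler = [[s, 1j * s], [1j * s, s]]
    internal = [[cmath.exp(1j * theta), 0], [0, 1]]   # first phase shifter
    external = [[cmath.exp(1j * phi), 0], [0, 1]]     # second phase shifter
    return matmul2(external, matmul2(coupler, matmul2(internal, coupler)))


def output_powers(theta: float, phi: float) -> List[float]:
    """Output powers when unit power enters the first input port only."""
    t = mzi_transfer(theta, phi)
    return [abs(t[0][0]) ** 2, abs(t[1][0]) ** 2]


balanced = output_powers(math.pi / 2, 0.0)   # roughly equal split
all_top = output_powers(math.pi, 1.3)        # nearly all power in the first output
```

In this model the output powers are sin²(θ/2) and cos²(θ/2), so theta alone sets the splitting ratio, while phi changes only the phase of the first output and leaves both powers unchanged.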

Embodiment 25 a system comprising:

A driver unit configured to generate a plurality of modulator control signals and to generate a plurality of weight control signals;

An optical processor, comprising:

a laser unit configured to generate a plurality of light outputs;

A plurality of light modulators coupled to the laser unit and the driver unit, the plurality of light modulators configured to generate light input vectors by modulating a plurality of light outputs generated by the laser unit based on a plurality of modulator control signals;

an optical matrix multiplication unit coupled to the plurality of light modulators and the driver unit, the optical matrix multiplication unit configured to convert the optical input vector into an optical output vector based on the weight control signal, and

A photo detection unit coupled to the optical matrix multiplication unit and configured to generate a plurality of output voltages corresponding to the optical output vectors;

a comparator unit coupled to the photo detection unit and configured to convert the plurality of output voltages into a plurality of digital 1-bit optical outputs, and

A controller comprising an integrated circuit configured to:

Receiving an artificial neural network calculation request from a computer comprising an input data set and a first plurality of neural network weights, wherein the input data set comprises a first digital input vector having an N-bit resolution;

storing the input data set and the first plurality of neural network weights in a storage unit;

Generating, by the driver unit, a sequence of N 1-bit modulator control signals corresponding to the N 1-bit input vectors;

obtaining a sequence of N digital 1-bit optical outputs corresponding to the sequence of N 1-bit modulator control signals from the comparator unit;

The transformed N-bit digital output vector is stored in a memory unit.

Embodiment 26 a method for performing artificial neural network computations in a system having an optical matrix multiplication unit configured to convert an optical input vector into an optical output vector based on a plurality of weight control signals, the method comprising:

Generating, by a digital-to-analog conversion (DAC) unit, a first plurality of modulator control signals based on a first digital input vector, and a first plurality of weight control signals based on a first plurality of neural network weights;

Obtaining a first plurality of digital light outputs from an analog-to-digital conversion (ADC) unit corresponding to the light output vectors of the light matrix multiplication unit, the first plurality of digital light outputs forming a first digital output vector;

Performing, by the controller, a nonlinear transformation on the first digital output vector to produce a first transformed digital output vector;

storing the first transformed digital output vector in a memory unit, and

An artificial neural network output generated based on the first transformed digital output vector is output by the controller.

Embodiment 27 a method comprising:

Providing input information in an electronic format;

Converting at least a portion of the electronic input information into an optical input vector;

optically converting the light input vector into a light output vector based on the light matrix multiplication;

Converting the light output vector into an electronic format, and

A nonlinear transformation is electronically applied to the electronically converted light output vector to provide output information in an electronic format.

Embodiment 28 the method of embodiment 27, further comprising:

For new electronic input information corresponding to the output information provided in electronic format, repeating the electronic-to-optical converting, the optical transforming, the optical-to-electronic converting, and the electronically applied nonlinear transforming.

Embodiment 29 the method of embodiment 28 wherein the optical matrix multiplication for the initial optical transformation and the optical matrix multiplication for the repeated optical transformation are the same and correspond to the same layer of the artificial neural network.

Embodiment 30 the method of embodiment 28 wherein the optical matrix multiplication for the initial optical transformation and the optical matrix multiplication for the repeated optical transformation are different and correspond to different layers of the artificial neural network.

Embodiment 31 the method of embodiment 27, further comprising:

For a different part of the electronic input information, repeating the electronic-to-optical converting, the optical transforming, the optical-to-electronic converting, and the electronically applied nonlinear transforming,

Wherein the optical matrix multiplication for the initial optical transformation and the optical matrix multiplication for the repeated optical transformation are identical and correspond to the first layer of the artificial neural network.

Embodiment 32 the method of embodiment 31, further comprising:

providing intermediate information in an electronic format based on electronic output information for a plurality of portions of the electronic input information generated by a first layer of the artificial neural network, and

For each different portion of the electronic intermediate information, repeating the electronic-to-optical converting, the optical transforming, the optical-to-electronic converting, and the electronically applied nonlinear transforming,

Wherein the optical matrix multiplication for the initial optical transformation and the optical matrix multiplication for the repeated optical transformation associated with different parts of the electronic intermediate information are identical and correspond to the second layer of the artificial neural network.

Embodiment 33 a system comprising:

An optical processor comprising a passive diffractive optical element, wherein the passive diffractive optical element is configured to transform an optical input vector or matrix into an optical output vector or matrix, which represents the result of a matrix process applied to the optical input vector or matrix and a predetermined vector defined by the arrangement of the diffractive optical element.

Embodiment 34 the system of embodiment 33, wherein the matrix processing comprises matrix multiplication between the light input vector or matrix and a predetermined vector defined by the arrangement of diffractive optical elements.

Embodiment 35 the system of embodiment 33, wherein the light processor comprises a light matrix processing unit comprising:

an input waveguide array for receiving the light input vector,

An optical interference unit comprising a passive diffractive optical component, wherein the optical interference unit is in optical communication with the input waveguide array and is configured to perform a linear transformation of the optical input vector into a second array of optical signals, and

An array of output waveguides in optical communication with the optical interference unit for guiding a second array of optical signals, wherein at least one input waveguide of the array of input waveguides is in optical communication with each output waveguide of the array of output waveguides through the optical interference unit.

Embodiment 36 the system of embodiment 35, wherein the optical interference unit comprises a substrate having at least one of holes or strips, the holes having a size in the range of 100 nm to 10 μm and the strips having a width in the range of 100 nm to 10 μm.

Embodiment 37 the system of embodiment 35, wherein the optical interference unit comprises a substrate having passive diffractive optical elements arranged in a two-dimensional configuration, and the substrate comprises at least one of a planar substrate or a curved substrate.

Embodiment 38 the system of embodiment 37, wherein the substrate comprises a planar substrate that is parallel to a direction of light propagation from the input waveguide array to the output waveguide array.

Embodiment 39 the system of embodiment 33, wherein the light processor comprises a light matrix processing unit comprising:

An input waveguide matrix for receiving the optical input matrix,

An optical interference unit comprising a passive diffractive optical component, wherein the optical interference unit is in optical communication with the input waveguide matrix and is configured to perform a linear transformation of the optical input matrix into a second optical signal matrix, and

An output waveguide matrix in optical communication with the optical interference unit for guiding a second optical signal matrix, wherein at least one input waveguide of the input waveguide matrix is in optical communication with each of the output waveguides of the output waveguide matrix through the optical interference unit.

Embodiment 40 the system of embodiment 39, wherein the optical interference unit comprises a substrate having at least one of holes or strips, the holes having a size in the range of 100 nm to 10 μm and the strips having a width in the range of 100 nm to 10 μm.

Embodiment 41 the system of embodiment 39, wherein the optical interference unit comprises a substrate having passive diffractive optical elements arranged in a three-dimensional configuration.

Embodiment 42 the system of embodiment 41, wherein the substrate has a shape of at least one of a cube, a column, a prism, or an irregular volume.

Embodiment 43 the system of embodiment 39, wherein the optical processor comprises an optical interference unit comprising a hologram having a passive diffractive optical element, the optical processor configured to receive modulated light representing the optical input matrix and to continuously transform the light as it passes through the hologram until the light emerges from the hologram as the optical output matrix.

Embodiment 44 the system of embodiment 35 or 39, wherein the optical interference unit comprises a substrate having a passive diffractive optical element, and the substrate comprises at least one of silicon, silicon oxide, silicon nitride, quartz, lithium niobate, a phase change material, or a polymer.

Embodiment 45 the system of embodiment 35 or 39, wherein the optical interference unit comprises a substrate having a passive diffractive optical element, and the substrate comprises at least one of a glass substrate or an acrylic substrate.

Embodiment 46 the system of embodiment 33, wherein the passive diffractive optical element is formed in part from a dopant.

Embodiment 47 the system of embodiment 33 wherein the matrix processing represents processing of input data by a neural network, the input data represented by an optical input vector.

Embodiment 48 the system of embodiment 33, wherein the light processor comprises:

a laser unit configured to generate a plurality of light outputs;

A plurality of light modulators coupled to the laser unit and configured to generate light input vectors by modulating a plurality of light outputs generated by the laser unit based on a plurality of modulator control signals;

An optical matrix processing unit coupled to the plurality of optical modulators, the optical matrix processing unit comprising a passive diffractive optical element configured to convert an optical input vector into an optical output vector based on a plurality of weights defined by the passive diffractive optical element, and

A photo detection unit coupled to the optical matrix processing unit and configured to generate a plurality of output electrical signals corresponding to the optical output vectors.

Embodiment 49 the system of embodiment 48, wherein the passive diffractive optical element is arranged in a three-dimensional configuration, the plurality of light modulators comprises a two-dimensional array of light modulators, and the photodetecting unit comprises a two-dimensional array of photodetectors.

Embodiment 50 the system of embodiment 48, wherein the optical matrix processing unit includes a housing module to support and protect the input waveguide array, the optical interference unit, and the output waveguide array, and

The optical processor includes a receiving module configured to receive the optical matrix processing unit, the receiving module including a first interface that enables the optical matrix processing unit to receive the optical input vectors from the plurality of optical modulators, and a second interface that enables the optical matrix processing unit to transmit the optical output vectors to the photodetecting unit.

Embodiment 51 the system of embodiment 48, wherein the output electrical signal comprises at least one of a plurality of voltage signals or a plurality of current signals.

Embodiment 52 the system of embodiment 48, further comprising:

A storage unit;

A digital-to-analog conversion (DAC) unit configured to generate a plurality of modulator control signals;

an analog-to-digital conversion (ADC) unit coupled to the photodetector unit and configured to convert the plurality of output electrical signals into a plurality of digital outputs, and

A controller comprising an integrated circuit configured to:

Receiving an artificial neural network calculation request from a computer comprising an input dataset, wherein the input dataset comprises a first digital input vector;

storing the input data set in the storage unit, and

A first plurality of modulator control signals is generated by a DAC unit based on a first digital input vector.
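
The signal path of embodiments 48 and 52 can be sketched end to end. This is an illustrative model only: the 8-bit resolution, the unit full-scale range, the dictionary used as the storage unit, and the `WEIGHTS` values are assumptions, and the final read-out step is taken from the ADC unit of embodiment 52 rather than from the listed controller operations.

```python
def dac(codes, bits=8, full_scale=1.0):
    """Map integer codes to analog modulator control levels (assumed range)."""
    top = (1 << bits) - 1
    return [full_scale * min(max(int(c), 0), top) / top for c in codes]


def optical_matrix_unit(amplitudes, weights):
    """Idealized stand-in for the passive optical matrix processing unit."""
    return [sum(w * a for w, a in zip(row, amplitudes)) for row in weights]


def adc(levels, bits=8, full_scale=1.0):
    """Clamp detected levels to the assumed range and quantize them."""
    top = (1 << bits) - 1
    return [round(min(max(v, 0.0), full_scale) / full_scale * top) for v in levels]


def handle_request(digital_input_vector, weights, storage):
    # Store the input data set, as in the controller operations.
    storage["input_dataset"] = list(digital_input_vector)
    # Generate the modulator control signals through the DAC unit.
    control_signals = dac(storage["input_dataset"])
    # Optical stage followed by photodetection (modeled as ideal).
    detected = optical_matrix_unit(control_signals, weights)
    # Convert the output electrical signals into digital outputs.
    storage["digital_output_vector"] = adc(detected)
    return storage["digital_output_vector"]


WEIGHTS = [[0.5, 0.5], [1.0, 0.0]]  # hypothetical fixed weights
storage = {}
result = handle_request([0, 255], WEIGHTS, storage)
```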

Embodiment 53 a method comprising:

3D printing an optical matrix processing unit comprising a passive diffractive optical element, wherein the passive diffractive optical element is configured to transform an optical input vector or matrix into an optical output vector or matrix, which represents the result of a matrix processing applied to the optical input vector or matrix and a predetermined vector defined by the arrangement of the diffractive optical element.

Embodiment 54 a method comprising:

Generating, using one or more laser beams, a hologram comprising a passive diffractive optical element, wherein the passive diffractive optical element is configured to transform a light input vector or matrix into an output vector or matrix representing the result of a matrix process applied to the light input vector or matrix and a predetermined vector defined by the arrangement of the diffractive optical element.

Embodiment 55 a system comprising:

An optical processor comprising passive diffractive optical elements arranged in a one-dimensional manner, wherein the passive diffractive optical elements are configured to convert an optical input into an optical output representing the result of a matrix process applied to the optical input and a predetermined vector defined by the arrangement of the diffractive optical elements.

Embodiment 56 the system of embodiment 55, wherein the matrix processing comprises matrix multiplication between the light input and a predetermined vector defined by the arrangement of diffractive optical elements.

Embodiment 57 the system of embodiment 55, wherein the light processor comprises a light matrix processing unit comprising:

an input waveguide for receiving an optical input,

An optical interference unit comprising a passive diffractive optical component, wherein the optical interference unit is in optical communication with the input waveguide and is configured to perform a linear transformation of the optical input, and

An output waveguide in optical communication with the optical interference unit for guiding the light output.

Embodiment 58 the system of embodiment 57, wherein the optical interference unit comprises a substrate having at least one of holes or gratings having a size in a range of 100 nm to 10 μm.

Embodiment 59 a system comprising:

A storage unit;

An optical processor, comprising:

a laser unit configured to generate a plurality of light outputs;

A photo detection unit coupled to the optical matrix processing unit and configured to generate a plurality of output electrical signals corresponding to the optical output vectors;

an analog-to-digital conversion (ADC) unit coupled to the photodetecting unit and configured to convert the plurality of output electrical signals into a plurality of digital light outputs;

A controller comprising an integrated circuit configured to:

storing the input data set in the storage unit, and

Embodiment 60 the system of embodiment 59, wherein the matrix processing unit includes a passive diffractive optical element configured to convert the light input vector into a light output vector, the light output vector representing a product of a matrix multiplication between the light input vector and a predetermined vector defined by the passive diffractive optical element.

Embodiment 61 the system of embodiment 59, wherein the operations further comprise:

Obtaining a first plurality of digital light outputs from the ADC unit corresponding to the light output vectors of the light matrix processing unit, the first plurality of digital light outputs forming a first digital output vector;

The first transformed digital output vector is stored in a memory unit.

Embodiment 62 the system of embodiment 61, wherein the system has a first cycle period defined as the time elapsed between the step of storing the input data set in the memory unit and the step of storing the first transformed digital output vector in the memory unit, and

Wherein the first cycle period is less than or equal to 1ns.
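
The cycle-period bound of embodiment 62 can be converted into a rate by simple arithmetic. The sketch below assumes that one complete matrix-vector product finishes per cycle and that cycles run back to back; the 64 by 64 matrix size is hypothetical, and only the 1 ns bound comes from the text.

```python
def vectors_per_second(cycle_period_s):
    """Vectors processed per second if one vector completes per cycle."""
    if cycle_period_s <= 0:
        raise ValueError("cycle period must be positive")
    return 1.0 / cycle_period_s


def macs_per_second(rows, cols, cycle_period_s):
    """Multiply-accumulate operations per second for a rows x cols matrix."""
    return rows * cols * vectors_per_second(cycle_period_s)


CYCLE_PERIOD_S = 1e-9  # upper bound stated in embodiment 62
ROWS, COLS = 64, 64    # hypothetical matrix size

rate = vectors_per_second(CYCLE_PERIOD_S)             # about 1e9 vectors/s
throughput = macs_per_second(ROWS, COLS, CYCLE_PERIOD_S)  # about 4.1e12 MAC/s
```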

Embodiment 63 the system of embodiment 61, wherein the operations further comprise:

Embodiment 64 the system of embodiment 61, wherein the operations further comprise:

Embodiment 65 the system of embodiment 61 wherein the input data set further comprises a second digital input vector, and

Wherein the operations further comprise:

obtaining a second plurality of digital light outputs from the ADC unit corresponding to the light output vectors of the light matrix processing unit, the second plurality of digital light outputs forming a second digital output vector;

Storing the second transformed digital output vector in a memory unit, and

Wherein the light output vector of the light matrix processing unit is generated by a second light input vector generated based on a second plurality of modulator control signals, the second light input vector being converted by the light matrix multiplying unit based on a plurality of weights defined by the passive diffractive optical element.

Embodiment 66 the system of embodiment 59, further comprising:

An analog nonlinear unit disposed between the photodetecting unit and the ADC unit, the analog nonlinear unit configured to receive a plurality of output electrical signals from the photodetecting unit, apply a nonlinear transfer function, and output a plurality of converted output electrical signals to the ADC unit,

Wherein the operations further comprise:

obtaining a first plurality of converted digital output electrical signals corresponding to the plurality of converted output electrical signals from the ADC unit, the first plurality of converted digital output electrical signals forming a first converted digital output vector, and

The first transformed digital output vector is stored in a memory unit.
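
The ordering in embodiment 66, in which the nonlinear transfer function is applied in the analog domain before digitization, can be sketched as follows. The rectifier used as the transfer function, the 8-bit ADC, and the dictionary used as the storage unit are assumptions chosen for illustration; the disclosure does not fix any of them here.

```python
def analog_nonlinear_unit(signals, threshold=0.0):
    """Hypothetical transfer function: a rectifier shifted by a threshold."""
    return [max(0.0, s - threshold) for s in signals]


def adc(levels, bits=8, full_scale=1.0):
    """Clamp levels to the assumed range and quantize them to integer codes."""
    top = (1 << bits) - 1
    return [round(min(max(v, 0.0), full_scale) / full_scale * top) for v in levels]


def read_out(detected_signals, storage):
    # The nonlinearity acts on the photodetector outputs first ...
    converted = analog_nonlinear_unit(detected_signals)
    # ... and only the converted signals are digitized and stored.
    storage["first_transformed_output"] = adc(converted)
    return storage["first_transformed_output"]


storage = {}
codes = read_out([-0.2, 0.0, 1.0], storage)
```

Placing the nonlinearity ahead of the ADC means the digital side never handles the untransformed signal, so no digital activation step is needed in this model.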

Embodiment 67 the system of embodiment 59, wherein the integrated circuit of the controller is configured to generate the first plurality of modulator control signals at a rate greater than or equal to 8 GHz.

Embodiment 68 the system of embodiment 59, further comprising:

An analog nonlinear unit disposed between the photodetecting unit and the ADC unit, the analog nonlinear unit configured to receive a plurality of output electrical signals from the photodetecting unit, apply a nonlinear transfer function, and output a plurality of converted output electrical signals.

Embodiment 69 the system of embodiment 68, wherein the analog storage unit comprises a plurality of capacitors.

Embodiment 70 the system of embodiment 68, wherein the analog storage unit is configured to receive and store the plurality of converted output electrical signals of the analog nonlinear unit and output the stored plurality of converted output electrical signals to the plurality of optical modulators, and

Wherein the operations further comprise:

storing a plurality of converted output electrical signals of the analog nonlinear unit in an analog storage unit based on generating a first plurality of modulator control signals;

outputting the stored converted output electrical signal through the analog storage unit;

Obtaining a second plurality of converted digital output electrical signals from the ADC unit, the second plurality of converted digital output electrical signals forming a second converted digital output vector, and

The second transformed digital output vector is stored in the memory unit.

Embodiment 71 the system of embodiment 59, wherein the artificial neural network computing request input data set includes a plurality of digital input vectors,

Wherein the laser unit is configured to generate a plurality of wavelengths,

Wherein the plurality of light modulators comprises:

Wherein the photodetector unit is further configured to demultiplex a plurality of wavelengths and produce a plurality of demultiplexed output electrical signals, and

Wherein the operations include:

Embodiment 72 the system of embodiment 59, wherein the artificial neural network computation request includes a plurality of digital input vectors,

Wherein the laser unit is configured to generate a plurality of wavelengths,

Wherein the plurality of light modulators comprises:

Wherein the operations include:

The first transformed digital output vector is stored in a memory unit.
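
The multi-wavelength operation of embodiments 71 and 72 can be pictured as several input vectors sharing one optical matrix at the same time. The sketch below is an assumption-laden model: it treats the weights as identical at every wavelength, which a real device would only approximate, and the wavelength keys and weight values are hypothetical.

```python
def wdm_process(inputs_by_wavelength_nm, weights):
    """Apply the same weights to each wavelength channel independently.

    The dictionary keys stand in for multiplexed wavelengths; returning one
    result per key models demultiplexing at the detection stage.
    """
    return {
        wavelength: [sum(w * x for w, x in zip(row, vector)) for row in weights]
        for wavelength, vector in inputs_by_wavelength_nm.items()
    }


WEIGHTS = [[1, 1], [1, -1]]  # hypothetical
outputs = wdm_process({1550: [1, 2], 1551: [3, 4]}, WEIGHTS)
```

In this model the number of digital input vectors handled per pass equals the number of wavelengths.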

Embodiment 73 the system of embodiment 59, wherein the DAC unit comprises:

a 1-bit DAC unit configured to generate a plurality of 1-bit modulator control signals,

Wherein the resolution of the ADC unit is 1 bit,

Wherein the first digital input vector has a resolution of N bits, and

Wherein the operations include:

generating, by a 1-bit DAC unit, a sequence of N 1-bit modulator control signals corresponding to the N 1-bit input vectors;

The transformed N-bit digital output vector is stored in a memory unit.
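
The bit-serial scheme of embodiment 73 relies on the linearity of the matrix operation: an N-bit input can be split into N binary vectors, each processed separately, and the partial results recombined with powers of two. The sketch below illustrates only that identity. It keeps each partial result at full precision, which is a simplification relative to the 1-bit ADC read-out named in the embodiment, and the helper names are hypothetical.

```python
def bit_planes(vector, n_bits):
    """Split non-negative integers into n_bits binary vectors, LSB first."""
    return [[(x >> k) & 1 for x in vector] for k in range(n_bits)]


def matvec(weights, vector):
    """Reference matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def bit_serial_matvec(weights, vector, n_bits):
    """Process one 1-bit input vector per pass and recombine the results."""
    total = [0] * len(weights)
    for k, plane in enumerate(bit_planes(vector, n_bits)):
        partial = matvec(weights, plane)  # one pass with 1-bit drive signals
        total = [t + p * (2 ** k) for t, p in zip(total, partial)]
    return total


WEIGHTS = [[1, 2], [3, 4]]  # hypothetical integer weights
serial = bit_serial_matvec(WEIGHTS, [5, 3], n_bits=3)
direct = matvec(WEIGHTS, [5, 3])
```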

Embodiment 74 the system of embodiment 59, wherein the memory unit comprises a digital input vector memory configured to store the first digital input vector and comprising at least one SRAM.

Embodiment 75 the system of embodiment 59, wherein the laser unit comprises:

A laser source configured to generate light, and

Embodiment 76 the system of embodiment 59, wherein the plurality of optical modulators comprises one of a MZI modulator, a ring resonator modulator, or an electroabsorption modulator.

Embodiment 77 the system of embodiment 59, wherein the photodetection unit comprises:

a plurality of photodetectors, and

And a plurality of amplifiers configured to convert photocurrents generated by the photodetectors into a plurality of output electrical signals.

Embodiment 78 the system of embodiment 59, wherein the integrated circuit comprises an application specific integrated circuit.

Embodiment 79 the system of embodiment 59, wherein the light matrix processing unit comprises:

An input waveguide array for receiving an optical input vector;

An optical interference unit in optical communication with the input waveguide array for performing a linear transformation of the optical input vector into a second array of optical signals, wherein the optical interference unit comprises a passive diffractive optical component, and

Embodiment 80 a system comprising:

A storage unit;

A driver unit configured to generate a plurality of modulator control signals;

An optical processor, comprising:

a laser unit configured to generate a plurality of light outputs;

An optical matrix processing unit coupled to the plurality of optical modulators and the driver unit, the optical matrix processing unit comprising a passive diffractive optical element configured to convert an optical input vector into an optical output vector based on a plurality of weight control signals defined by the passive diffractive optical element, and

a comparator unit coupled to the photo detection unit and configured to convert the plurality of output electrical signals into a plurality of digital 1-bit optical outputs, and

A controller comprising an integrated circuit configured to:

Receiving an artificial neural network computation request from a computer comprising an input dataset, wherein the input dataset comprises a first digital input vector having an N-bit resolution;

storing the input data set in a storage unit;

The transformed N-bit digital output vector is stored in a memory unit.

Embodiment 81 the system of embodiment 80 wherein the light matrix processing unit comprises a light matrix multiplication unit configured to convert the light input vector into a light output vector, the light output vector representing a product of a matrix multiplication between the input vector represented by the light input vector and a predetermined vector defined by the diffractive optical element.

Embodiment 82 a method for performing artificial neural network calculations in a system having an optical matrix processing unit, the method comprising:

Receiving an artificial neural network calculation request from a computer comprising an input dataset comprising a first digital input vector;

storing the input data set in a storage unit;

Generating, by a digital-to-analog conversion (DAC) unit, a first plurality of modulator control signals based on a first digital input vector;

Converting the light input vector into a light output vector by using a light matrix processing unit comprising an arrangement of diffractive optical elements, wherein the light output vector represents a result of a matrix process applied to the light input vector and a predetermined vector defined by the arrangement of diffractive optical elements;

obtaining a first plurality of digital light outputs corresponding to the light output vectors of the light matrix processing unit from an analog-to-digital conversion (ADC) unit, the first plurality of digital light outputs forming a first digital output vector;

storing the first transformed digital output vector in a memory unit, and

Embodiment 83 the method of embodiment 82, wherein converting the light input vector to the light output vector comprises converting the light input vector to a light output vector representing a product of a matrix multiplication between the digital input vector and a predetermined vector defined by the arrangement of diffractive optical elements.

Embodiment 84 a method comprising:

Providing input information in an electronic format;

optically converting, by an optical processor comprising a passive diffractive optical component, an optical input vector into an optical output vector based on optical matrix processing;

Converting the light output vector into an electronic format, and

Embodiment 85 the method of embodiment 84, wherein optically converting the light input vector into the light output vector comprises optically converting the light input vector into the light output vector based on an optical matrix multiplication between an input vector represented by the light input vector and a predetermined vector defined by the passive diffractive optical element.

Embodiment 86 the method of embodiment 84, further comprising:

for new electronic input information corresponding to the output information provided in electronic format, the electro-optical conversion, the photoelectric conversion, and the electrically applied nonlinear transformation are repeated.

Embodiment 87 the method of embodiment 86, wherein the light matrix process for the initial light transformation and the light matrix process for the repeated light transformation are the same and correspond to the same layer of the artificial neural network.
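
Embodiments 86 and 87 describe feeding the output back through the same optical matrix process, so that every pass corresponds to the same network layer. A minimal sketch of that loop follows, assuming a square weight matrix and a rectifier as the electrically applied nonlinearity; both are illustrative choices rather than details of the disclosure.

```python
def layer(weights, vector):
    """One pass: linear transform followed by an assumed rectifier."""
    return [max(0.0, sum(w * x for w, x in zip(row, vector))) for row in weights]


def repeat_layer(weights, vector, passes):
    """Reuse the same weights for every pass, feeding outputs back as inputs."""
    if any(len(row) != len(weights) for row in weights):
        raise ValueError("feedback requires a square weight matrix")
    for _ in range(passes):
        vector = layer(weights, vector)
    return vector


SWAP = [[0.0, 1.0], [1.0, 0.0]]  # hypothetical weights that exchange two values
once = repeat_layer(SWAP, [1.0, 2.0], passes=1)
twice = repeat_layer(SWAP, [1.0, 2.0], passes=2)
```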

Embodiment 88 the method of embodiment 84, further comprising:

Wherein the light matrix process for the initial light transformation and the light matrix process for the repeated light transformation are identical and correspond to one layer of the artificial neural network.

Embodiment 89 a system comprising:

An optical matrix processing unit configured to process an input vector of length N, wherein the optical matrix processing unit includes N+2 layers of directional couplers and N layers of phase shifters, and N is a positive integer.

Embodiment 90 the system of embodiment 89, wherein the optical matrix processing unit includes no more than N+2 layers of directional couplers.

Embodiment 91 the system of embodiment 89, wherein the optical matrix processing unit comprises an optical matrix multiplication unit.

Embodiment 92 the system of embodiment 89, wherein the light matrix processing unit comprises:

A substrate, and

Interconnected interferometers disposed on the substrate, wherein each interferometer comprises an optical waveguide disposed on the substrate, and the directional couplers and the phase shifters are part of the interconnected interferometers.

Embodiment 93 the system of embodiment 89, wherein the optical matrix processing unit includes a layer of attenuators that follows the last layer of directional couplers.

Embodiment 94 the system of embodiment 93, wherein the one layer of attenuators comprises N attenuators.

Embodiment 95 the system of embodiment 93, comprising one or more homodyne detectors for detecting an output from the attenuators.

Embodiment 96 the system of embodiment 89, wherein N = 3, and the optical matrix processing unit includes:

an input configured to receive an input vector;

A first layer directional coupler coupled to the input;

A first layer phase shifter coupled to the first layer directional coupler;

a second layer directional coupler coupled to the first layer phase shifter;

a second layer phase shifter coupled to the second layer directional coupler;

a third layer directional coupler coupled to the second layer phase shifter;

A third layer phase shifter coupled to the third layer directional coupler;

a fourth layer directional coupler coupled to the third layer phase shifter, and

And a fifth layer directional coupler coupled to the fourth layer directional coupler.

Embodiment 97 the system of embodiment 89, wherein N = 4, and the optical matrix processing unit includes:

an input configured to receive an input vector;

a first layer, a second layer, a third layer, and a fourth layer of directional couplers, each layer of directional coupler followed by a layer of phase shifter, wherein the first layer of directional coupler is coupled to the input;

A penultimate layer directional coupler coupled to the fourth layer phase shifter, and

And a final layer directional coupler coupled to the penultimate layer directional coupler.

Embodiment 98 the system of embodiment 89, wherein N = 8, and the optical matrix processing unit includes:

an input configured to receive an input vector;

eight layers of directional couplers, each layer of directional coupler being followed by a layer of phase shifter, wherein a first layer of directional coupler is coupled to an input;

a penultimate layer directional coupler coupled to the eighth layer phase shifter, and

Embodiment 99 the system of embodiment 89, wherein the light matrix processing unit comprises:

an input configured to receive an input vector;

N layers of directional couplers, each layer of directional coupler being followed by a layer of phase shifter, wherein a first layer of directional coupler is coupled to the input;

a penultimate layer directional coupler coupled to the N-th layer phase shifter, and

Embodiment 100 the system of embodiment 99 wherein N is an even number.

Embodiment 101 the system of embodiment 100 wherein each ith layer directional coupler includes N/2 directional couplers, wherein i is an odd number, and

Each j-th layer directional coupler includes N/2-1 directional couplers, where j is an even number.

Embodiment 102 the system of embodiment 100, wherein for each i-th layer directional coupler where i is an odd number, the k-th directional coupler is coupled to the (2k-1)-th and 2k-th outputs of the previous layer, k being an integer from 1 to N/2.

Embodiment 103 the system of embodiment 100, wherein for each j-th layer directional coupler where j is an even number, the m-th directional coupler is coupled to the (2m)-th and (2m+1)-th outputs of the previous layer, m being an integer from 1 to N/2-1.
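
The port assignments of embodiments 102 and 103 for even N can be written out directly. The function name below is hypothetical; the index ranges follow the two embodiments, with ports numbered from 1.

```python
def coupler_ports(layer_index, n):
    """Return the (upper, lower) port pairs joined by each coupler in a layer.

    Odd layers pair ports (2k-1, 2k) for k = 1..N/2.
    Even layers pair ports (2m, 2m+1) for m = 1..N/2-1.
    """
    if n < 2 or n % 2 != 0:
        raise ValueError("this wiring rule is stated for even N only")
    if layer_index < 1:
        raise ValueError("layers are numbered from 1")
    if layer_index % 2 == 1:
        return [(2 * k - 1, 2 * k) for k in range(1, n // 2 + 1)]
    return [(2 * m, 2 * m + 1) for m in range(1, n // 2)]


odd_layer = coupler_ports(1, 4)
even_layer = coupler_ports(2, 4)
```

Alternating the two patterns lets signals on neighboring waveguides mix in one layer and then mix with the neighbor on the other side in the next, while the outermost ports pass through even layers uncoupled.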

Embodiment 104 the system of embodiment 100 wherein each ith layer of phase shifters comprises N phase shifters, where i is an odd number, and

Each j-th layer of phase shifters includes N-2 phase shifters, where j is an even number.

Embodiment 105 the system of embodiment 99, wherein N is an odd number.

Embodiment 106 the system of embodiment 105, wherein each layer of directional couplers comprises (N-1)/2 directional couplers.

Embodiment 107 the system of embodiment 105, wherein each layer of phase shifters comprises N-1 phase shifters.
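
The per-layer component counts in embodiments 101, 104, 106, and 107 can be tabulated with a short helper. This sketch covers only the N layers that pair a directional-coupler layer with a phase-shifter layer; it does not count the penultimate and final coupler layers, for which no counts are stated here. The function names are hypothetical.

```python
def layer_counts(layer_index, n):
    """Couplers and phase shifters in one of the N paired layers."""
    if n < 2 or not 1 <= layer_index <= n:
        raise ValueError("need N >= 2 and 1 <= layer_index <= N")
    if n % 2 == 0:
        if layer_index % 2 == 1:
            return {"couplers": n // 2, "phase_shifters": n}
        return {"couplers": n // 2 - 1, "phase_shifters": n - 2}
    # Odd N: every layer has the same counts.
    return {"couplers": (n - 1) // 2, "phase_shifters": n - 1}


def totals(n):
    """Sum the counts over the N paired layers."""
    layers = [layer_counts(i, n) for i in range(1, n + 1)]
    return {
        "couplers": sum(layer["couplers"] for layer in layers),
        "phase_shifters": sum(layer["phase_shifters"] for layer in layers),
    }


even_case = totals(4)
odd_case = totals(3)
```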

Embodiment 108 a system comprising:

A generator configured to generate a first data set, wherein the generator comprises an optical matrix processing unit, and

A discriminator configured to receive a second dataset comprising data from the first dataset and data from a third dataset, the data in the first dataset having similar characteristics to the data in the third dataset, and to classify the data in the second dataset as either data from the first dataset or data from the third dataset.

Embodiment 109 the system of embodiment 108 wherein the light matrix processing unit comprises at least one of (i) the light matrix multiplication unit of any one of embodiments 1-25, (ii) the passive diffractive optical component of any one of embodiments 32-52, 55-81, or (iii) the light matrix processing unit of any one of embodiments 89-107.

Embodiment 110 the system of embodiment 108 wherein the third data set includes real data, the generator is configured to generate synthetic data similar to the real data, and the discriminator is configured to classify the data as real data or synthetic data.

Embodiment 111 the system of embodiment 108, wherein the generator is configured to generate a data set for training at least one of an autonomous vehicle, a medical diagnostic system, a fraud detection system, a weather forecast system, a financial prediction system, a facial recognition system, a speech recognition system, or a product defect detection system.

Embodiment 112 the system of embodiment 108 wherein the generator is configured to generate an image that is similar to an image of at least one of a real object or a real scene, and the discriminator is configured to classify the received image as either (i) an image of the real object or the real scene, or (ii) a synthetic image generated by the generator.

Embodiment 113 the system of embodiment 112 wherein the real object comprises at least one of a person, an animal, a cell, a tissue, or a product, and the real scene comprises a scene encountered by a vehicle.

Embodiment 114 the system of embodiment 113 wherein the discriminator is configured to classify the received image as being (i) an image of a real person, a real animal, a real cell, a real tissue, a real product, or a real scene encountered by the vehicle, or (ii) a synthetic image produced by the generator.

Embodiment 115 the system of embodiment 113, wherein the vehicle comprises at least one of a motorcycle, an automobile, a truck, a train, a helicopter, an airplane, a submarine, a ship, or an unmanned aerial vehicle.

Embodiment 116 the system of embodiment 113, wherein the generator is configured to generate an image of tissue or cells associated with at least one of a human disease, an animal disease, or a plant disease.

Embodiment 117 the system of embodiment 116, wherein the generator is configured to generate an image of tissue or cells associated with a human disease, and the disease comprises at least one of cancer, Parkinson's disease, sickle cell anemia, heart disease, cardiovascular disease, diabetes, chest disease, or skin disease.

Embodiment 118 the system of embodiment 116, wherein the generator is configured to generate an image of tissue or cells associated with a cancer, and the cancer comprises at least one of skin cancer, breast cancer, lung cancer, liver cancer, prostate cancer, or brain cancer.

Embodiment 119 the system of embodiment 108, further comprising a random noise generator configured to generate random noise input to the generator, and the generator is configured to generate the first data set based on the random noise.

Embodiment 120 a system comprising:

a random noise generator configured to generate random noise, and

And a generator configured to generate data based on the random noise, wherein the generator includes an optical matrix processing unit.
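
The arrangement of embodiment 120, in which random noise drives a generator built around a matrix operation, can be sketched as below. The Gaussian noise, the tanh output stage, and the `GENERATOR_WEIGHTS` values are assumptions for illustration; in the embodiment the matrix stage would be performed by the optical matrix processing unit.

```python
import math
import random


def random_noise(length, seed=None):
    """Draw a noise vector; the Gaussian distribution is an assumed choice."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


def generator(noise, weights):
    """Matrix stage followed by an assumed tanh squashing function."""
    return [math.tanh(sum(w * z for w, z in zip(row, noise))) for row in weights]


# Hypothetical weights mapping a 3-element noise vector to 4 output values.
GENERATOR_WEIGHTS = [
    [0.8, -0.3, 0.1],
    [0.2, 0.5, -0.7],
    [0.0, 0.4, 0.9],
    [-0.6, 0.1, 0.3],
]

sample = generator(random_noise(3, seed=0), GENERATOR_WEIGHTS)
```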

Embodiment 121 the system of embodiment 120, wherein the light matrix processing unit comprises at least one of (i) the light matrix multiplication unit of any one of embodiments 1-25, (ii) the passive diffractive optical element of any one of embodiments 33-52, 55-81, or (iii) the light matrix processing unit of any one of embodiments 89-107.

Embodiment 122 a system comprising:

An optical circuit configured to perform a logic function on two input signals, the optical circuit comprising:

a first directional coupler having two inputs and two outputs, the two inputs configured to receive two input signals;

a first pair of phase shifters configured to modify phases of signals at the two outputs of the first directional coupler;

A second directional coupler having two inputs configured to receive signals from the first pair of phase shifters and two outputs, and

A second pair of phase shifters configured to modify the phase of signals at the two outputs of the second directional coupler.
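
The cascade in embodiment 122 can be modeled with 2 by 2 transfer matrices. The sketch below assumes ideal, lossless components and uses a common textbook convention for a 50:50 directional coupler; neither the coupler matrix nor the phase values are taken from the disclosure.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def directional_coupler():
    """Assumed ideal 50:50 coupler: (1/sqrt(2)) * [[1, i], [i, 1]]."""
    s = 1.0 / math.sqrt(2.0)
    return [[s, 1j * s], [1j * s, s]]


def phase_shifter_pair(phi_upper, phi_lower):
    """Independent phase shifts on the two waveguides."""
    return [[cmath.exp(1j * phi_upper), 0.0], [0.0, cmath.exp(1j * phi_lower)]]


def optical_circuit(first_pair, second_pair):
    """Coupler, phase pair, coupler, phase pair, in the order light travels."""
    coupler = directional_coupler()
    transfer = coupler
    for stage in (phase_shifter_pair(*first_pair), coupler,
                  phase_shifter_pair(*second_pair)):
        transfer = matmul2(stage, transfer)  # later stages multiply on the left
    return transfer


def propagate(transfer, inputs):
    """Apply a transfer matrix to the two input field amplitudes."""
    return [sum(transfer[i][j] * inputs[j] for j in range(2)) for i in range(2)]


# With all phases at zero, two 50:50 couplers in series route light to the
# opposite waveguide under this convention.
cross = propagate(optical_circuit((0.0, 0.0), (0.0, 0.0)), [1.0, 0.0])
```

Changing the phase values changes how the two inputs are mixed, while total optical power is conserved in this lossless model.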

Embodiment 123 the system of embodiment 122, wherein the phase shifters are configured to cause the optical circuit to perform a rotation:

Embodiment 124 the system of embodiment 122, wherein when the input signals x1 and x2 are provided to the two inputs of the first directional coupler, the phase shifter is configured to cause the optical circuit to perform the operations of:

Embodiment 125 the system of embodiment 124, wherein the optical circuit comprises a first photodetector configured to generate an absolute value of the signal from the second pair of phase shifters to cause the optical circuit to perform the operation of:

Embodiment 126 the system of embodiment 125, wherein the optical circuit comprises a comparator configured to compare the output signal of the first photodetector to a threshold value to generate a binary value, to cause the optical circuit to generate an output:

Embodiment 127 the system of embodiment 125, wherein the optical circuit includes a feedback mechanism (feedback mechanism) configured to cause the output signal of the photodetector to be fed back to the input of the first directional coupler and pass through the first directional coupler, the first pair of phase shifters, the second directional coupler, and the second pair of phase shifters, and to be detected by the photodetector to cause the optical circuit to perform operations of:

which produces outputs AND(x1, x2) and OR(x1, x2).
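The expressions for the rotation and for the resulting operations are not reproduced in this text. As an illustration only, the following Python sketch assumes that the directional couplers and phase shifters realize a 45-degree rotation, that the photodetector supplies the absolute value, and that the detected signal is fed back through the same circuit once. Under these assumptions, binary inputs x1 and x2 yield their maximum (OR) and minimum (AND). The names `ROTATION` and `optical_and_or` are hypothetical and do not come from the disclosure.

```python
import numpy as np

# Assumed 45-degree rotation realized by the couplers and phase shifters.
ROTATION = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)


def optical_and_or(x1, x2):
    """Return (AND, OR) for two binary amplitudes x1, x2 in {0, 1}."""
    first_pass = ROTATION @ np.array([x1, x2], dtype=float)
    detected = np.abs(first_pass)      # photodetector: absolute value
    second_pass = ROTATION @ detected  # fed back through the same circuit
    # second_pass[0] = (|x1+x2| + |x1-x2|) / 2 = max(x1, x2) -> OR
    # second_pass[1] = (|x1+x2| - |x1-x2|) / 2 = min(x1, x2) -> AND
    or_value, and_value = second_pass
    return float(and_value), float(or_value)


# Truth table over all binary input pairs.
truth_table = {(a, b): optical_and_or(a, b) for a in (0, 1) for b in (0, 1)}
```

For binary inputs the minimum and maximum coincide with the logical AND and OR, which is why a single rotate, detect, rotate sequence is sufficient in this sketch.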

Embodiment 128 the system of embodiment 125, wherein the optical circuit comprises:

a third directional coupler having two inputs configured to receive signals from the second pair of phase shifters and two outputs;

a third pair of phase shifters configured to modify the phase of signals at the two outputs of the third directional coupler;

a fourth directional coupler having two inputs and two outputs, the two inputs configured to receive signals from the third pair of phase shifters;

a fourth pair of phase shifters configured to modify phases of signals at two outputs of the fourth directional coupler, and

A second photodetector configured to generate an absolute value of a signal from the fourth pair of phase shifters to cause the optical circuit to perform operations of:

which produces outputs AND(x1, x2) and OR(x1, x2).

Embodiment 129 the system of embodiment 122, comprising a bitonic sorter configured to perform a sorting function using the optical circuit.

Embodiment 130 the system of embodiment 122, comprising means configured to perform a hash function using the optical circuit.

Embodiment 131 the system of embodiment 130, wherein the hash function comprises Secure Hash Algorithm 2 (SHA-2).
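A bitonic sorter is built from compare-exchange elements that output the minimum and maximum of two values. The following Python sketch shows a standard bitonic sorting network for a power-of-two number of inputs. It assumes, as an illustration only, that each compare-exchange element is obtained from a sum and an absolute difference of the two inputs, analogous to a rotation followed by detection. The function names are hypothetical.

```python
def compare_exchange(a, b):
    """Return (min, max) using only a sum and an absolute difference."""
    total = a + b
    difference = abs(a - b)
    return (total - difference) / 2, (total + difference) / 2


def bitonic_sort(values):
    """Sort a list whose length is a power of two into ascending order."""
    n = len(values)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    v = list(values)
    k = 2
    while k <= n:          # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:      # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    low, high = compare_exchange(v[i], v[partner])
                    ascending = (i & k) == 0
                    v[i], v[partner] = (low, high) if ascending else (high, low)
            j //= 2
        k *= 2
    return v
```

The comparison pattern is fixed in advance and does not depend on the data, which is the property that allows such a network to be laid out as a static circuit.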

Embodiment 132 an apparatus comprising:

a plurality of optical waveguides, wherein a set of a plurality of input values is encoded on respective optical signals carried by the optical waveguides;

a plurality of replication modules, and for each of at least two subsets of the one or more optical signals, a respective set of the one or more replication modules is configured to divide the subset of the one or more optical signals into two or more copies of the optical signal;

A plurality of multiplication modules, and for each of at least two copies of a first subset of the one or more optical signals, the respective multiplication modules are configured to multiply the one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation, wherein at least one of the multiplication modules comprises an optical amplitude modulator comprising one input port and two output ports, and a pair of related optical signals are provided from the two output ports such that a difference between the amplitudes of the related optical signals corresponds to a result of multiplying the input value by the signed matrix element value, and

One or more summation modules, and for the results of the two or more multiplication modules, a respective one of the summation modules is configured to produce an electrical signal representative of a sum of the results of the two or more multiplication modules.

Embodiment 133 the apparatus of embodiment 132 wherein the input values in the set of the plurality of input values encoded on the respective optical signals represent elements of an input vector multiplied by a matrix comprising one or more matrix element values.

Embodiment 134 the apparatus of embodiment 132 or 133 wherein a set of the plurality of output values are encoded on respective electrical signals produced by one or more summing modules, and the output values in the set of the plurality of output values represent elements of an output vector, the output vector being produced by multiplying the input vector by a matrix.
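The copy, multiply, and sum structure of embodiments 132-134 can be sketched numerically as follows. This is an idealized model under stated assumptions: input values are non-negative, matrix element values lie in [-1, 1], each replication module divides its signal equally, and a known electronic gain undoes the splitting factor. The functions `dual_port_modulator` and `copy_multiply_sum` are hypothetical names used only for this illustration.

```python
import numpy as np


def dual_port_modulator(x, w):
    """Split input x into two outputs whose difference equals w * x.

    Assumes |w| <= 1 so that both outputs are non-negative for x >= 0.
    """
    return 0.5 * (1.0 + w) * x, 0.5 * (1.0 - w) * x


def copy_multiply_sum(matrix, x):
    """Multiply matrix by x using copies, dual-port modulators, and sums."""
    matrix = np.asarray(matrix, dtype=float)
    x = np.asarray(x, dtype=float)
    n_rows = matrix.shape[0]
    # Replication: one copy of the input vector per matrix row (equal split).
    copies = np.tile(x / n_rows, (n_rows, 1))
    outputs = np.zeros(n_rows)
    for i in range(n_rows):
        for j in range(x.size):
            plus, minus = dual_port_modulator(copies[i, j], matrix[i, j])
            # Summation: the difference of the two outputs adds to the row total.
            outputs[i] += plus - minus
    # Assumed electronic gain that undoes the 1 / n_rows splitting factor.
    return outputs * n_rows
```

Taking the difference of two non-negative outputs is what allows a signed matrix element value to be represented even though each individual output cannot be negative.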

Embodiment 135 the apparatus of any of embodiments 132-134 wherein each optical signal carried by the optical waveguide comprises an optical wave having a common wavelength that is substantially the same for all optical signals.

Embodiment 136 the apparatus of any of embodiments 132-135, wherein the replication modules comprise at least one replication module having an optical splitter that sends a predetermined fraction of the power of an optical wave at an input port to a first output port and sends the remaining fraction of the power of the optical wave at the input port to a second output port.

Embodiment 137 the apparatus of embodiment 136, wherein the optical splitter comprises a waveguide optical splitter that sends a predetermined fraction of the power of an optical wave guided by an input optical waveguide to a first output optical waveguide and sends the remaining fraction of the power of the optical wave guided by the input optical waveguide to a second output optical waveguide.

Embodiment 138 the apparatus of embodiment 137, wherein the guided mode of the input optical waveguide is adiabatically coupled to the guided mode of each of the first output optical waveguide and the second output optical waveguide.

Embodiment 139 the apparatus of any of embodiments 136-138, wherein the optical splitter comprises a beam splitter comprising at least one surface that transmits a predetermined fraction of the power of the optical wave at the input port and reflects the remaining fraction of the power of the optical wave at the input port.

Embodiment 140 the apparatus of embodiment 139, wherein at least one of the plurality of optical waveguides comprises an optical fiber coupled to an optical coupler that couples a guided mode of the optical fiber to a free-space propagation mode.

Embodiment 141 the apparatus of any of embodiments 132-140, wherein the multiplication modules comprise at least one coherence-sensitive multiplication module configured to multiply one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation based on interference between optical waves, the optical waves having a coherence length at least as long as a propagation distance through the coherence-sensitive multiplication module.

Embodiment 142 the apparatus of embodiment 141, wherein the coherence-sensitive multiplication module comprises a Mach-Zehnder interferometer (MZI) that splits an optical wave guided by an input optical waveguide into a first optical waveguide arm of the Mach-Zehnder interferometer and a second optical waveguide arm of the Mach-Zehnder interferometer, the first optical waveguide arm comprising a phase shifter that imparts a relative phase shift with respect to a phase delay of the second optical waveguide arm, and the Mach-Zehnder interferometer combines the optical waves from the first optical waveguide arm and the second optical waveguide arm into at least one output optical waveguide.

Embodiment 143 the apparatus of embodiment 142, wherein the Mach-Zehnder interferometer combines optical waves from the first optical waveguide arm and the second optical waveguide arm into each of a first output optical waveguide and a second output optical waveguide, a first photodetector receives an optical wave from the first output optical waveguide to produce a first photocurrent, a second photodetector receives an optical wave from the second output optical waveguide to produce a second photocurrent, and the result of the coherence-sensitive multiplication module comprises a difference between the first photocurrent and the second photocurrent.

Embodiment 144 the apparatus of any of embodiments 141-143, wherein the coherence-sensitive multiplication module comprises one or more ring resonators comprising at least one ring resonator coupled to a first optical waveguide and at least one ring resonator coupled to a second optical waveguide.

Embodiment 145 the apparatus of embodiment 144, wherein a first photodetector receives an optical wave from the first optical waveguide to produce a first photocurrent, a second photodetector receives an optical wave from the second optical waveguide to produce a second photocurrent, and the result of the coherence-sensitive multiplication module comprises a difference between the first photocurrent and the second photocurrent.

Embodiment 146 the apparatus of any of embodiments 132-145, wherein the multiplication modules comprise at least one coherence-insensitive multiplication module configured to multiply one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation based on absorption of energy within an optical wave.

Embodiment 147 the apparatus of embodiment 146, wherein the coherence-insensitive multiplication module comprises an electro-absorption modulator.
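The balanced detection of embodiment 143 can be illustrated with a textbook model of a lossless Mach-Zehnder interferometer. The sketch below assumes ideal 50:50 couplers, a phase shifter in one arm only, and photodetectors of equal responsivity. Which output is treated as the first and which as the second is also an assumption. Under these assumptions the photocurrent difference equals the input power multiplied by the cosine of the phase shift, so a signed weight in [-1, 1] maps to a phase of arccos(weight). All names are hypothetical.

```python
import numpy as np

# Ideal lossless 50:50 coupler acting on the two waveguide field amplitudes.
COUPLER = np.array([[1.0, 1.0j], [1.0j, 1.0]]) / np.sqrt(2.0)


def mzi_balanced_output(input_power, phase, responsivity=1.0):
    """Return the difference of the two photocurrents at the MZI outputs."""
    field = np.array([np.sqrt(input_power), 0.0], dtype=complex)
    field = COUPLER @ field          # split into the two arms
    field[0] *= np.exp(1j * phase)   # phase shifter in the first arm
    field = COUPLER @ field          # recombine into the two outputs
    photocurrents = responsivity * np.abs(field) ** 2
    # Equals responsivity * input_power * cos(phase) for this ideal model.
    return float(photocurrents[1] - photocurrents[0])


def phase_for_weight(weight):
    """Phase shift that encodes a signed weight in [-1, 1]."""
    return float(np.arccos(weight))
```

For example, `mzi_balanced_output(2.0, phase_for_weight(0.5))` evaluates the product of an input power of 2.0 and a weight of 0.5 in this model.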

Embodiment 148 the apparatus of any of embodiments 132-147 wherein the one or more summing modules comprises at least one summing module having (1) two or more input conductors each carrying an electrical signal in the form of an input current, the magnitude of the input current representing a respective result of the respective one of the multiplication modules, and (2) at least one output conductor carrying an electrical signal representing a sum of the respective results in the form of an output current, the output current being proportional to the sum of the input currents.

Embodiment 149 the apparatus of embodiment 148, wherein the two or more input conductors and the output conductor comprise wires that meet at one or more nodes among the wires, and the output current is substantially equal to the sum of the input currents.

Embodiment 150 the apparatus of embodiment 148 or 149, wherein at least a first input current of the input currents is provided in the form of at least one photocurrent generated by at least one photodetector that receives an optical signal generated by a first multiplication module of the multiplication modules.

Embodiment 151 the apparatus of embodiment 150, wherein the first input current is provided in the form of a difference between two photocurrents, the two photocurrents being generated by different respective photodetectors that receive different respective optical signals generated by the first multiplication module.

Embodiment 152 the apparatus of any one of embodiments 132-151, wherein one of the copies of the first subset of one or more optical signals consists of a single optical signal, wherein one of the input values is encoded on the single optical signal.

Embodiment 153 the apparatus of embodiment 152, wherein the multiplication module corresponding to the copy of the first subset multiplies the encoded input value by a single matrix element value.

Embodiment 154 the apparatus of any one of embodiments 132-153, wherein one of the copies of the first subset of one or more optical signals comprises more than one optical signal and less than all optical signals on which the plurality of input values are encoded.

Embodiment 155 the apparatus of embodiment 154 wherein the multiplication module corresponding to the copy of the first subset multiplies the encoded input values by different respective matrix element values.

Embodiment 156 the apparatus of embodiment 155 wherein different multiplication modules corresponding to different respective copies of the first subset of one or more optical signals are included in different apparatuses that optically communicate to transmit one of the copies of the first subset of one or more optical signals between the different apparatuses.

Embodiment 157 the apparatus of any of embodiments 132-156, wherein the two or more of the plurality of optical waveguides, the two or more of the plurality of replication modules, the two or more of the plurality of multiplication modules, and at least one of the one or more summation modules are disposed on a substrate of a common apparatus.

Embodiment 158 the apparatus of embodiment 157, wherein the apparatus performs vector matrix multiplication, wherein the input vector is provided as a set of optical signals and the output vector is provided as a set of electrical signals.

Embodiment 159 the apparatus of any of embodiments 132-158, further comprising an accumulator that combines input electrical signals corresponding to outputs of the multiplication modules or the summation modules, wherein the input electrical signals are encoded using time domain encoding that uses on-off amplitude modulation within each of a plurality of time slots, and the accumulator generates an output electrical signal encoded at more than two amplitude levels, the amplitude levels corresponding to different duty cycles of the time domain encoding over the plurality of time slots.
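The relationship between on-off time domain encoding and a multi-level accumulator output can be sketched as follows. The example assumes that a value is represented by the number of "on" slots out of a fixed number of time slots and that the accumulator averages the slots. The slot ordering and the function names are hypothetical choices for illustration.

```python
def encode_on_off(level, num_slots):
    """Encode an integer level in [0, num_slots] as an on-off pulse train."""
    if not 0 <= level <= num_slots:
        raise ValueError("level must lie between 0 and num_slots")
    # "On" slots first, then "off" slots; only the duty cycle matters here.
    return [1] * level + [0] * (num_slots - level)


def accumulate(pulse_train):
    """Average the time slots; the result equals the duty cycle."""
    return sum(pulse_train) / len(pulse_train)


# With 4 time slots the accumulator output takes one of 5 amplitude levels.
levels = [accumulate(encode_on_off(k, 4)) for k in range(5)]
```

Each slot carries only two amplitudes, yet the accumulated output distinguishes more than two levels, one per duty cycle.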

Embodiment 160 the apparatus of any of embodiments 132-159 wherein each of the two or more of the multiplication modules corresponds to a different subset of the one or more optical signals.

Embodiment 161 the apparatus of any of embodiments 132-160, the apparatus further comprising, for each copy of a second subset of one or more optical signals that is different from the optical signals in the first subset of one or more optical signals, a multiplication module configured to multiply the one or more optical signals of the second subset by one or more matrix element values using optical amplitude modulation.

Embodiment 162 a method comprising:

Encoding a set of a plurality of input values on respective optical signals;

For each of at least two subsets of the one or more optical signals, using a respective set of one or more replication modules to divide the subset of the one or more optical signals into two or more copies of the optical signal;

For each of at least two copies of a first subset of one or more optical signals, using a respective multiplication module to multiply the one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation, wherein at least one multiplication module comprises an optical amplitude modulator comprising one input port and two output ports, and providing a pair of correlated optical signals from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying the input value by the signed matrix element value, and

For the results of two or more multiplication modules, a summation module configured to generate an electrical signal is used, the electrical signal representing the sum of the results of the two or more multiplication modules.

Embodiment 163 a method comprising:

encoding a set of input values representing elements of an input vector on a respective optical signal;

Encoding a set of coefficients representing matrix elements as amplitude modulation levels of a set of optical amplitude modulators coupled to the optical signals, wherein at least one optical amplitude modulator comprising one input port and two output ports provides a pair of correlated optical signals from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying the input values by the signed matrix element values, and

Encoding a set of output values representing elements of an output vector on respective electrical signals, wherein at least one of the electrical signals is in the form of a current whose amplitude corresponds to a sum of respective elements of the input vector multiplied by respective elements of a row of the matrix.

Embodiment 164 the method of embodiment 163, wherein at least one of the optical signals is provided by a first optical waveguide, and the first optical waveguide is coupled to an optical splitter that sends a predetermined fraction of the power of an optical wave guided by the first optical waveguide to a second optical waveguide and sends the remaining fraction of the power of the optical wave guided by the first optical waveguide to a third optical waveguide.

Embodiment 165 an apparatus comprising:

a plurality of optical waveguides encoding a set of input values representing elements of an input vector on a respective optical signal carried by the optical waveguides;

A set of optical amplitude modulators coupled to the optical signals, encoding a set of coefficients representing matrix elements as amplitude modulation levels, wherein at least one optical amplitude modulator comprising one input port and two output ports provides a pair of correlated optical signals from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying the input values by the values of the signed matrix elements, and

A plurality of summing modules encoding a set of output values representing elements of an output vector on respective electrical signals, wherein at least one electrical signal is in the form of a current whose magnitude corresponds to a sum of respective elements of the input vector multiplied by respective elements of a row of the matrix.

Embodiment 166 a method for multiplying an input vector by a given matrix comprises:

Encoding a set of input values representing elements of an input vector on a respective optical signal of the set of optical signals;

coupling a first set of one or more devices to a first set of one or more waveguides providing a first subset of the set of optical signals and producing a result of multiplying a first sub-matrix of a given matrix by values encoded on the first subset of the set of optical signals;

coupling a second set of one or more devices to a second set of one or more waveguides providing a second subset of the set of optical signals and producing a result of multiplying a second sub-matrix of the given matrix by values encoded on the second subset of the set of optical signals;

coupling a third set of one or more devices to a third set of one or more waveguides providing a copy of the first subset of the set of optical signals produced by the first optical splitter and producing a result of multiplying the third sub-matrix of the given matrix by values encoded on the first subset of the set of optical signals;

coupling a fourth set of one or more devices to a fourth set of one or more waveguides providing a copy of the second subset of the set of optical signals produced by the second optical splitter and producing a result of multiplying a fourth sub-matrix of the given matrix by values encoded on the second subset of the set of optical signals;

Wherein the first, second, third, and fourth sub-matrices, concatenated together, form the given matrix, and

Wherein at least one output value representing an element of an output vector is encoded on an electrical signal, the output vector corresponding to the input vector multiplied by a given matrix, the electrical signal being generated by a device in communication with the first set of one or more devices and the second set of one or more devices.
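The sub-matrix decomposition of embodiment 166 corresponds to ordinary block matrix-vector multiplication. The sketch below assumes the given matrix is arranged as [[M1, M2], [M3, M4]], so that the first and second sub-matrices act on the original first and second subsets of the input, while the third and fourth sub-matrices act on copies of those subsets. The block arrangement and the function name are assumptions made for illustration.

```python
import numpy as np


def block_matvec(m1, m2, m3, m4, x_first, x_second):
    """Multiply [[m1, m2], [m3, m4]] by the concatenation of the two subsets."""
    # Optical splitters provide copies of each input subset.
    x_first_copy = np.copy(x_first)
    x_second_copy = np.copy(x_second)
    # Outputs that combine results from the first and second sets of devices.
    top = m1 @ x_first + m2 @ x_second
    # Outputs that combine results from the third and fourth sets of devices.
    bottom = m3 @ x_first_copy + m4 @ x_second_copy
    return np.concatenate([top, bottom])
```

Because each set of devices only handles one sub-matrix, a large matrix can be served by several smaller multiplication units whose partial results are summed electrically.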

Embodiment 167 the method of embodiment 166, wherein each pair of the first set of one or more devices, the second set of one or more devices, the third set of one or more devices, and the fourth set of one or more devices is mutually exclusive.

Embodiment 168 an apparatus comprising:

a first set of one or more devices configured to receive the first set of optical signals and to produce a result of multiplying the first matrix by values encoded on the first set of optical signals;

a second set of one or more devices configured to receive the second set of optical signals and to produce a result of multiplying the second matrix by values encoded on the second set of optical signals;

a third set of one or more devices configured to receive the third set of optical signals and to produce a result of multiplying the third matrix by values encoded on the third set of optical signals;

a fourth set of one or more devices configured to receive the fourth set of optical signals and to produce a result of multiplying the fourth matrix by values encoded on the fourth set of optical signals, and

A configurable connection path between two or more of the first set of one or more devices, the second set of one or more devices, the third set of one or more devices, or the fourth set of one or more devices,

Wherein the first configuration of the configurable connection path is configured to (1) provide a copy of the first set of optical signals as at least one of the second set of optical signals, the third set of optical signals, or the fourth set of optical signals, and (2) provide one or more signals from the first set of one or more devices and one or more signals from the second set of one or more devices to a summation module configured to generate an electrical signal representative of a sum of values encoded on the signals received by the summation module.

Embodiment 169 an apparatus comprising:

a first set of one or more devices configured to receive the first set of optical signals and produce a result based on an optical amplitude modulation of one or more optical signals of the first set of optical signals;

a second set of one or more devices configured to receive the second set of optical signals and produce a result based on optical amplitude modulation of one or more optical signals of the second set of optical signals;

a third set of one or more devices configured to receive the third set of optical signals and produce a result based on an optical amplitude modulation of one or more optical signals of the third set of optical signals;

A fourth set of one or more devices configured to receive the fourth set of optical signals and to produce a result based on an optical amplitude modulation of one or more optical signals of the fourth set of optical signals, and

Wherein the first configuration of the configurable connection path is configured to (1) provide a copy of the first set of optical signals as the third set of optical signals, or (2) provide one or more signals from the first set of one or more devices and one or more signals from the second set of one or more devices to a summing module configured to generate an electrical signal representative of a sum of values encoded on the signals received by the summing module.

Embodiment 170 the apparatus of embodiment 169, wherein each pair of the first set of one or more devices, the second set of one or more devices, the third set of one or more devices, and the fourth set of one or more devices is mutually exclusive.

Embodiment 171 the apparatus of embodiment 169 or 170, wherein the first configuration of the configurable connection path is configured to (1) provide a copy of the first set of optical signals as the third set of optical signals, and (2) provide one or more signals from the first set of one or more devices and one or more signals from the second set of one or more devices to a summation module configured to generate an electrical signal representative of a sum of values encoded on at least two different signals received by the summation module.

Embodiment 172 the apparatus of any of embodiments 169-171, wherein the first configuration of the configurable connection path is configured to provide a copy of the first set of optical signals as the third set of optical signals, and a second configuration of the configurable connection path is configured to provide one or more signals from the first set of one or more devices and one or more signals from the second set of one or more devices to a summation module configured to generate an electrical signal representative of a sum of values encoded on the signals received by the summation module.

Embodiment 173 an apparatus comprising:

A plurality of replication modules comprising, for each of at least two subsets of the one or more optical signals, a respective set of one or more replication modules configured to divide the subset of the one or more optical signals into two or more copies of the optical signal;

A plurality of multiplication modules including, for each of at least two copies of a first subset of one or more optical signals, a respective multiplication module configured to multiply the one or more optical signals of the first subset by one or more values using optical amplitude modulation, and

One or more summation modules, for the results of the two or more multiplication modules, the one or more summation modules comprising a summation module configured to produce an electrical signal representing a sum of the results of the two or more multiplication modules, wherein the results comprise at least one result encoded on the electrical signal, and the result is derived from one copy of the optical signal that propagates through no more than a single optical amplitude modulator before being converted to the electrical signal.

Embodiment 174 a system comprising:

a first unit configured to generate a plurality of modulator control signals;

A processor, comprising:

a light source configured to provide a plurality of light outputs;

A plurality of optical modulators coupled to the light source and the first unit, the plurality of optical modulators configured to generate an optical input vector by modulating the plurality of light outputs provided by the light source based on the plurality of modulator control signals, the optical input vector comprising a plurality of optical signals, and

A matrix multiplication unit coupled to the plurality of optical modulators and the first unit, the matrix multiplication unit configured to convert the optical input vector into an analog output vector based on a plurality of weight control signals;

a second unit coupled to the matrix multiplication unit and configured to convert the analog output vector into a digital output vector, and

A controller comprising an integrated circuit configured to:

Receiving an artificial neural network calculation request, wherein the artificial neural network calculation request comprises an input data set, and the input data set comprises a first digital input vector;

receiving a first plurality of neural network weights, and

Generating, via the first unit, a first plurality of modulator control signals based on the first digital input vector and a first plurality of weight control signals based on the first plurality of neural network weights.

Embodiment 175 the system of embodiment 174, wherein the first unit comprises a digital-to-analog converter (DAC).

Embodiment 176 the system of embodiment 174 or 175, wherein the second unit comprises an analog-to-digital converter (ADC).
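The signal flow of embodiments 174-176 (digital input, DAC, optical modulation and matrix multiplication, ADC) can be approximated numerically. The sketch below is a simplified model under stated assumptions: the DAC and ADC are ideal uniform quantizers, the bit depths are arbitrary, the optical stage is represented by an exact matrix-vector product, and all values are normalized to the ranges noted in the comments. The functions `quantize` and `run_layer` are hypothetical.

```python
import numpy as np


def quantize(values, bits, full_scale=1.0):
    """Uniformly quantize values to the range [-full_scale, full_scale]."""
    levels = 2 ** bits - 1
    clipped = np.clip(np.asarray(values, dtype=float), -full_scale, full_scale)
    codes = np.round((clipped + full_scale) / (2.0 * full_scale) * levels)
    return codes / levels * 2.0 * full_scale - full_scale


def run_layer(digital_input, weights, dac_bits=8, adc_bits=8):
    """Model one pass: DAC -> modulation and multiplication -> ADC."""
    weights = np.asarray(weights, dtype=float)
    # First unit (DAC): modulator control and weight control signals.
    modulator_control = quantize(digital_input, dac_bits)
    weight_control = quantize(weights, dac_bits)
    # Optical modulation followed by the matrix multiplication unit (idealized).
    analog_output = weight_control @ modulator_control
    # Second unit (ADC): full scale covers the largest possible row sum
    # when inputs and weights both lie in [-1, 1].
    return quantize(analog_output, adc_bits, full_scale=float(weights.shape[1]))
```

In this model the digital output differs from the exact product only by quantization error, which decreases as the assumed bit depths increase.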

Embodiment 177 the system of any of embodiments 174-176, comprising a storage unit configured to store a data set and a plurality of neural network weights.

Embodiment 178 the system of embodiment 177, wherein the integrated circuit of the controller is further configured to perform operations comprising storing the input data set and the first plurality of neural network weights in the storage unit.

Embodiment 179 the system of any one of embodiments 174-178, wherein the first unit is configured to generate a plurality of weight control signals.

Embodiment 180 the system of any of embodiments 174-178, wherein the controller comprises an Application Specific Integrated Circuit (ASIC), and

Receiving the artificial neural network computation request includes receiving the artificial neural network computation request from a general purpose data processor.

Embodiment 181 the system of any of embodiments 174-178 wherein the first unit, the processing unit, the second unit, and the controller are disposed on at least one of a multi-chip module or an integrated circuit, and

Receiving the artificial neural network computation request includes receiving the artificial neural network computation request from a second data processor, wherein the second data processor is external to the multi-chip module or integrated circuit, the second data processor is coupled to the multi-chip module or integrated circuit through a communication channel, and the processing unit is capable of processing data at a data rate that is at least an order of magnitude greater than a data rate of the communication channel.

Embodiment 182 the system of embodiment 174, wherein the first unit, the processing unit, the second unit, and the controller are used for an optoelectronic processing loop that is repeated in a plurality of iterations, and the optoelectronic processing loop comprises:

(1) At least a first optical modulation operation based on at least one of the plurality of modulator control signals, and at least a second optical modulation operation based on at least one of the weight control signals, and

(2) At least one of (a) an electrical summing operation or (b) an electrical storage operation.

Embodiment 183 the system of embodiment 182, wherein the optoelectronic processing loop comprises an electrical storage operation, and the electrical storage operation is performed using a memory unit coupled to the controller,

Wherein the operations performed by the controller further comprise storing the input data set and the first plurality of neural network weights in a memory unit.

Embodiment 184 the system of embodiment 182, wherein the optoelectronic processing loop comprises an electrical summing operation, and the electrical summing operation is performed using an electrical summing module within the matrix multiplication unit,

Wherein the electrical summing module is configured to generate currents corresponding to elements of an analog output vector representing a sum of respective elements of the optical input vector multiplied by respective neural network weights.

Embodiment 185 the system of embodiment 182, wherein the optoelectronic processing loop comprises at least one signal path on which no more than one first optical modulation operation is performed in a single loop iteration based on at least one of the plurality of modulator control signals, and no more than one second optical modulation operation is performed in a single loop iteration based on at least one of the weight control signals.

Embodiment 186 the system of embodiment 185, wherein the first optical modulation operation is performed by one of the plurality of optical modulators coupled to the light source and to the matrix multiplication unit, and the second optical modulation operation is performed by an optical modulator included in the matrix multiplication unit.

Embodiment 187 the system of embodiment 182, wherein the optoelectronic processing loop includes at least one signal path on which no more than one electrical storage operation is performed in a single loop iteration.

Embodiment 188 the system of embodiment 174, wherein the light source comprises a laser unit configured to generate the plurality of light outputs.

Embodiment 189 the system of embodiment 174, wherein the matrix multiplication unit comprises:

An input waveguide array for receiving an optical input vector, and the optical input vector comprises a first array of optical signals;

Embodiment 190 the system of embodiment 189, wherein the optical interference unit comprises:

Embodiment 191 the system of embodiment 174, wherein the matrix multiplication unit comprises:

a plurality of replication modules, wherein each replication module corresponds to a subset of one or more optical signals of the optical input vector and is configured to divide the subset of one or more optical signals into two or more copies of the optical signals;

a plurality of multiplication modules, wherein each multiplication module corresponds to a subset of the one or more optical signals and is configured to multiply the one or more optical signals of the subset by one or more matrix element values using optical amplitude modulation, and

One or more summation modules, each of which is configured to produce an electrical signal representative of a sum of results of two or more of the multiplication modules.
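
The copy, multiply, and sum structure of embodiment 191 can be illustrated with a minimal numerical sketch. It is not the patented implementation: the function names `copy_signal` and `optical_matvec` are hypothetical, the splitters are assumed ideal and lossless, and the matrix element values are assumed to lie in [0, 1] so that each multiplication is a pure attenuation.

```python
from typing import List


def copy_signal(value: float, n_copies: int) -> List[float]:
    """Replication module (assumed ideal): split one signal into equal copies."""
    return [value / n_copies] * n_copies


def optical_matvec(matrix: List[List[float]], x: List[float]) -> List[float]:
    """Compute y = M x by copying, multiplying, and summing.

    Matrix entries are assumed to lie in [0, 1] (attenuation only).
    """
    n_rows = len(matrix)
    # Step 1: each input element is split into one copy per matrix row.
    copies = [copy_signal(xj, n_rows) for xj in x]
    y = []
    for i in range(n_rows):
        # Step 2: each copy is scaled by its matrix element value.
        products = [matrix[i][j] * copies[j][i] for j in range(len(x))]
        # Step 3: the summation module adds the products; multiplying by
        # n_rows undoes the known 1/n_rows splitting factor.
        y.append(n_rows * sum(products))
    return y
```

For example, `optical_matvec([[0.5, 1.0], [0.25, 0.0]], [2.0, 4.0])` yields the same values as the ordinary matrix-vector product of those inputs.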

Embodiment 192 the system of embodiment 191 wherein the at least one multiplication module comprises an optical amplitude modulator comprising one input port and two output ports, and a pair of correlated optical signals is provided from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying an input value by a signed matrix element value.

Embodiment 193 the system of embodiment 191 or 192 wherein the matrix multiplication unit is configured to multiply the optical input vector by a matrix comprising the one or more matrix element values.

Embodiment 194 the system of embodiment 193 wherein a set of the plurality of output values are encoded on respective electrical signals produced by the one or more summing modules and the output values of the set of the plurality of output values represent elements of an output vector, the output vector being produced by multiplying the optical input vector by a matrix.

Embodiment 195 the system of any of embodiments 174-194, wherein the system comprises a memory unit configured to store the input data set and the neural network weights, the second unit comprises an analog-to-digital conversion (ADC) unit, and the operations further comprise:

obtaining a first plurality of digital outputs corresponding to the analog output vectors of the matrix multiplication unit from the ADC unit, the first plurality of digital outputs forming a first digital output vector;

The first transformed digital output vector is stored in a memory unit.

Embodiment 196 the system of embodiment 195, wherein the system has a first loop period defined as the time elapsed between the step of storing the input data set and the first plurality of neural network weights in the memory unit and the step of storing the first transformed digital output vector in the memory unit, and

Wherein the first loop period is less than or equal to 1 ns.

Embodiment 197 the system of embodiment 195 or 196, wherein the operations further comprise:

Embodiment 198 the system of any one of embodiments 195-197, wherein the first unit comprises a digital-to-analog conversion (DAC) unit, and the operations further comprise:

Embodiment 199 the system of any one of embodiments 195-198, wherein the first unit comprises a digital-to-analog conversion (DAC) unit, the artificial neural network computation request further comprises a second plurality of neural network weights, and wherein the operations further comprise:

based on the obtaining of the first plurality of digital outputs, a second plurality of weight control signals is generated by the DAC unit based on the second plurality of neural network weights.

Embodiment 200 the system of embodiment 199, wherein the first plurality of neural network weights and the second plurality of neural network weights correspond to different layers of an artificial neural network.

Embodiment 201 the system of any one of embodiments 195 through 200 wherein the first unit comprises a digital-to-analog conversion (DAC) unit and the input data set further comprises a second digital input vector, and

Wherein the operations further comprise:

Obtaining a second plurality of digital outputs corresponding to the analog output vectors of the matrix multiplication unit from the ADC unit, the second plurality of digital outputs forming a second digital output vector;

Storing the second transformed digital output vector in a memory unit, and

Wherein the analog output vector of the matrix multiplication unit is generated by a second optical input vector generated based on a second plurality of modulator control signals, the second optical input vector being converted by the matrix multiplication unit based on the first mentioned plurality of weight control signals.

Embodiment 202 the system of any of embodiments 174-201, wherein the system comprises a storage unit configured to store the input data set and the neural network weights, and the second unit comprises an analog-to-digital conversion (ADC) unit, and the system further comprises:

an analog nonlinear unit disposed between the matrix multiplication unit and the ADC unit, the analog nonlinear unit configured to receive a plurality of output voltages from the matrix multiplication unit, apply a nonlinear transfer function, and output a plurality of converted output voltages to the ADC unit,

Wherein the operations performed by the integrated circuit of the controller further comprise:

Obtaining a first plurality of converted digital output voltages corresponding to the plurality of converted output voltages from the ADC unit, the first plurality of converted digital output voltages forming a first transformed digital output vector, and

The first transformed digital output vector is stored in a memory unit.

Embodiment 203 the system of any of embodiments 174-202, wherein the integrated circuit of the controller is configured to generate the first plurality of modulator control signals at a rate greater than or equal to 8 GHz.

Embodiment 204 the system of any of embodiments 174-190, wherein the first unit comprises a digital-to-analog conversion (DAC) unit, the second unit comprises an analog-to-digital conversion (ADC) unit, and the matrix multiplication unit comprises:

An optical matrix multiplication unit coupled to the plurality of optical modulators and the DAC unit, the optical matrix multiplication unit configured to convert the optical input vector into an optical output vector based on the plurality of weight control signals, and

A photo detection unit coupled to the optical matrix multiplication unit and configured to generate a plurality of output voltages corresponding to the optical output vectors.

Embodiment 205 the system of embodiment 204, further comprising:

Embodiment 206 the system of embodiment 205 wherein the analog storage unit comprises a plurality of capacitors.

Embodiment 207 the system of embodiment 205 or 206 wherein the analog storage unit is configured to receive and store a plurality of converted output voltages of the analog nonlinear unit and output the stored plurality of converted output voltages to the plurality of light modulators, and

The operations further comprise:

outputting the stored converted output voltage through the analog storage unit;

The second transformed digital output vector is stored in the memory unit.

Embodiment 208 the system of embodiment 204, wherein the system comprises a storage unit configured to store the input data set and the neural network weights, and the input data set of the artificial neural network computation request comprises a plurality of digital input vectors,

Wherein the light source is configured to generate a plurality of wavelengths,

Wherein the plurality of light modulators comprises:

Wherein the operations include:

Embodiment 209 the system of embodiment 174, wherein the system comprises a storage unit configured to store the input data set and the neural network weights, the second unit comprises an analog-to-digital conversion (ADC) unit, and the artificial neural network computation request comprises a plurality of digital input vectors,

Wherein the light source is configured to generate a plurality of wavelengths,

Wherein the plurality of light modulators comprises:

Wherein the operations include:

storing the first transformed digital output vector in a memory unit.

Embodiment 210 the system of any one of embodiments 174-209, wherein the first unit comprises a digital-to-analog conversion (DAC) unit, the second unit comprises an analog-to-digital conversion (ADC) unit, and the DAC unit comprises:

Wherein the resolution of the ADC unit is 1 bit,

Wherein the first digital input vector has a resolution of N bits, and

Wherein the operations include:

The transformed N-bit digital output vector is stored in a memory unit.

Embodiment 211 the system of any of embodiments 174-210, wherein the system comprises a storage unit configured to store the input dataset and the neural network weights, and the storage unit comprises:

Embodiment 212 the system of any one of embodiments 174-211, wherein the first unit comprises a digital-to-analog conversion (DAC) unit comprising:

Wherein the first DAC subunit and the second DAC subunit are different.

Embodiment 213 the system of any one of embodiments 174-212, wherein the light source comprises:

A laser source configured to generate light, and

Embodiment 214 the system of any one of embodiments 174-213, wherein the plurality of optical modulators comprises one of MZI modulators, ring resonator modulators, or electroabsorption modulators.

Embodiment 215 the system of embodiment 204, wherein the photodetection unit comprises:

a plurality of photodetectors, and

Embodiment 216 the system of any one of embodiments 174-215 wherein the integrated circuit is an application specific integrated circuit.

Embodiment 217 the system of any of embodiments 174 and 191 through 194 comprising a plurality of optical waveguides coupled between the optical modulator and the matrix multiplication unit, wherein the optical input vector comprises a set of a plurality of input values encoded on respective optical signals carried by the optical waveguides, and each of the optical signals carried by one of the optical waveguides comprises an optical wave having a common wavelength that is substantially the same for all of the optical signals.

Embodiment 218 the system of any one of embodiments 191-194 and 217, wherein the replication module comprises at least one replication module having an optical splitter that transmits power of a predetermined proportion of the optical waves at the input port to the first output port and transmits power of the remaining proportion of the optical waves at the input port to the second output port.
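
One way a replication module could be built from splitters of this kind is a chain in which each splitter taps a predetermined proportion of the remaining power. The sketch below is only an illustration under that assumption (the chain layout, the lossless splitters, and the names `tap_fractions` and `cascade` are hypothetical and do not come from the embodiments). It shows that tapping 1/(N-k) at the k-th splitter gives N equal copies.

```python
from typing import List


def tap_fractions(n_copies: int) -> List[float]:
    """Proportion of power each splitter in the chain sends to its first output port."""
    return [1.0 / (n_copies - k) for k in range(n_copies - 1)]


def cascade(input_power: float, fractions: List[float]) -> List[float]:
    """Propagate power through a chain of ideal (lossless) two-port splitters."""
    outputs = []
    remaining = input_power
    for fraction in fractions:
        tapped = remaining * fraction   # power sent to the first output port
        outputs.append(tapped)
        remaining -= tapped             # remaining proportion goes to the second port
    outputs.append(remaining)           # the last copy is whatever is left
    return outputs
```

With four copies, the chain uses proportions of 1/4, 1/3, and 1/2, and each copy carries one quarter of the input power.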

Embodiment 219 the system of embodiment 218, wherein the optical splitter comprises a waveguide optical splitter that transmits power of a predetermined proportion of the optical waves guided by an input optical waveguide to a first output optical waveguide and transmits power of the remaining proportion of the optical waves guided by the input optical waveguide to a second output optical waveguide.

Embodiment 220 the system of embodiment 219, wherein the guided mode of the input optical waveguide is adiabatically coupled to the guided mode of each of the first and second output optical waveguides.

Embodiment 221 the system of any of embodiments 218-220, wherein the optical splitter comprises a beam splitter comprising at least one surface that transmits a predetermined proportion of the power of the optical wave at the input port and reflects a remaining proportion of the power of the optical wave at the input port.

Embodiment 222 the system of any one of embodiments 217-221, wherein at least one of the plurality of optical waveguides comprises an optical fiber coupled to an optical coupler that couples a guided mode of the optical fiber to a free space propagation mode.

Embodiment 223 the system of any of embodiments 174, 191-194, and 217-222, wherein the multiplication module comprises at least one coherence sensitive multiplication module configured to multiply one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation based on interference between the optical waves, the optical waves having a coherence length at least as long as a propagation distance through the coherence sensitive multiplication module.

Embodiment 224 the system of embodiment 223 wherein the coherence-sensitive multiplication module comprises a Mach-Zehnder interferometer (MZI) that splits the optical waves guided by an input optical waveguide into a first optical waveguide arm of the Mach-Zehnder interferometer and a second optical waveguide arm of the Mach-Zehnder interferometer, the first optical waveguide arm comprising a phase shifter that produces a relative phase shift with respect to a phase delay of the second optical waveguide arm, and the Mach-Zehnder interferometer combines the optical waves from the first optical waveguide arm and the second optical waveguide arm into at least one output optical waveguide.

Embodiment 225 the system of embodiment 224, wherein the Mach-Zehnder interferometer combines optical waves from the first optical waveguide arm and the second optical waveguide arm into each of a first output optical waveguide and a second output optical waveguide, a first photodetector receives optical waves from the first output optical waveguide to produce a first photocurrent, a second photodetector receives optical waves from the second output optical waveguide to produce a second photocurrent, and the result of the coherence-sensitive multiplication module comprises a difference between the first photocurrent and the second photocurrent.
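
The differential readout described above can be illustrated numerically. The sketch below assumes an ideal, lossless Mach-Zehnder interferometer whose two output powers are proportional to cos²(φ/2) and sin²(φ/2) for a relative phase shift φ, an input value encoded on optical power, and identical photodetector responsivities. The function names and the `responsivity` parameter are hypothetical and are not taken from the embodiments.

```python
import math
from typing import Tuple


def mzi_output_powers(input_power: float, phase: float) -> Tuple[float, float]:
    """Ideal MZI: power at the first and second output waveguides."""
    p_first = input_power * math.cos(phase / 2.0) ** 2
    p_second = input_power * math.sin(phase / 2.0) ** 2
    return p_first, p_second


def phase_for_weight(weight: float) -> float:
    """Relative phase shift that makes the normalized power difference equal `weight`."""
    if not -1.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [-1, 1]")
    # cos^2(phi/2) - sin^2(phi/2) = cos(phi), so phi = acos(weight).
    return math.acos(weight)


def signed_product(input_power: float, weight: float, responsivity: float = 1.0) -> float:
    """Difference of the two photocurrents, which equals responsivity * input_power * weight."""
    p_first, p_second = mzi_output_powers(input_power, phase_for_weight(weight))
    return responsivity * p_first - responsivity * p_second
```

Because both photocurrents are non-negative, the subtraction is what allows a negative matrix element value to be represented in this sketch.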

Embodiment 226 the system of any of embodiments 223-225 wherein the coherence sensitive multiplication module includes one or more ring resonators including at least one ring resonator coupled to the first optical waveguide and at least one ring resonator coupled to the second optical waveguide.

Embodiment 227 the system of embodiment 226 wherein a first photodetector receives optical waves from the first optical waveguide to produce a first photocurrent, a second photodetector receives optical waves from the second optical waveguide to produce a second photocurrent, and the result of the coherence-sensitive multiplication module comprises a difference between the first photocurrent and the second photocurrent.

Embodiment 228 the system of any one of embodiments 174, 191-194, and 217-227 wherein the multiplication module comprises at least one coherence-insensitive multiplication module configured to multiply the one or more optical signals of the first subset by the one or more matrix element values using optical amplitude modulation based on absorption of energy within the optical waves.

Embodiment 229 the system of embodiment 228 wherein the coherence-insensitive multiplication module comprises an electro-absorption modulator.

Embodiment 230 the system of any one of embodiments 174, 191-194, and 217-229, wherein the one or more summation modules include at least one summation module having (1) two or more input conductors, each carrying an electrical signal in the form of an input current, the magnitude of the input current representing a respective result of a respective one of the multiplication modules, and (2) at least one output conductor carrying an electrical signal representing a sum of the respective results in the form of an output current, the output current being proportional to the sum of the input currents.

Embodiment 231 the system of embodiment 230 wherein the two or more input conductors and the output conductor comprise wires that meet at one or more nodes among the wires, and the output current is substantially equal to the sum of the input currents.

Embodiment 232 the system of embodiment 230 or 231, wherein at least a first input current of the input currents is provided in the form of at least one photocurrent generated by at least one photodetector that receives an optical signal generated by a first multiplication module of the multiplication modules.

Embodiment 233 the system of embodiment 232 wherein the first input current is provided in the form of a difference between two photocurrents generated by different respective photodetectors that receive different respective optical signals generated by the first multiplication module.

Embodiment 234 the system of any of embodiments 174-233, wherein one of the copies of the first subset of one or more optical signals consists of a single optical signal, wherein one of the input values is encoded on the single optical signal.

Embodiment 235 the system of embodiment 234 wherein the multiplication module corresponding to the copy of the first subset multiplies the encoded input value by a single matrix element value.

Embodiment 236 the system of any one of embodiments 174, 191-194 and 217-235 wherein one of the copies of the first subset of one or more optical signals comprises more than one optical signal and less than all of the optical signals on which the plurality of input values are encoded.

Embodiment 237 the system of embodiment 236 wherein the multiplication module corresponding to the copy of the first subset multiplies the encoded input values by different respective matrix element values.

Embodiment 238 the system of embodiment 237 wherein different multiplication modules corresponding to different respective copies of the first subset of one or more optical signals are included in different devices that are in optical communication to transmit one of the copies of the first subset of one or more optical signals between the different devices.

Embodiment 239 the system of any of embodiments 174, 191-194, and 217-238, wherein at least one of the two or more of the plurality of optical waveguides, the two or more of the plurality of replication modules, the two or more of the plurality of multiplication modules, and the one or more summation modules are disposed on a substrate of a common device.

Embodiment 240 the system of embodiment 239 wherein the device performs vector-matrix multiplication in which an input vector is provided as a set of optical signals and an output vector is provided as a set of electrical signals.

Embodiment 241 the system of any of embodiments 174, 191-194, and 217-240 further comprising an accumulator that integrates an input electrical signal corresponding to an output of a multiplication module or a summation module, wherein the input electrical signal is encoded using time domain encoding that uses on-off amplitude modulation within each of a plurality of time slots, and the accumulator generates an output electrical signal encoded at more than two amplitude levels corresponding to different duty cycles of the time domain encoding over the plurality of time slots.
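
The relationship between duty cycle and output amplitude level can be sketched as follows. This is an illustrative model only: the integer `level` encoding, the unit `slot_charge`, and the function names are assumptions introduced here, and an ideal accumulator that simply sums the charge received in each time slot is assumed.

```python
from typing import List


def encode_on_off(level: int, n_slots: int) -> List[int]:
    """Encode an integer level (0..n_slots) as an on-off pattern with `level` slots on."""
    if not 0 <= level <= n_slots:
        raise ValueError("level must be between 0 and n_slots")
    return [1] * level + [0] * (n_slots - level)


def accumulate(slots: List[int], slot_charge: float = 1.0) -> float:
    """Ideal accumulator: total charge collected over all time slots."""
    return slot_charge * sum(slots)
```

With four time slots, the on-off input takes only two amplitude levels in any slot, while the accumulated output takes one of five levels (0 through 4), one per duty cycle.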

Embodiment 242 the system as in any one of embodiments 174, 191-194 and 217-241 wherein each of the two or more multiplication modules corresponds to a different subset of the one or more optical signals.

Embodiment 243 the system of any of embodiments 174, 191-194, and 217-242, further comprising, for each copy of a second subset of one or more optical signals different from the optical signals in the first subset of one or more optical signals, a multiplication module configured to multiply the one or more optical signals of the second subset by one or more matrix element values using optical amplitude modulation.

Embodiment 244 a system comprising:

A driver unit configured to generate a plurality of modulator control signals;

an optoelectronic processor comprising:

a light source configured to provide a plurality of light outputs;

A plurality of light modulators coupled to the light source and the driver unit, the plurality of light modulators configured to generate an optical input vector by modulating the plurality of light outputs provided by the light source based on the plurality of modulator control signals;

A matrix multiplication unit coupled to the plurality of light modulators and the driver unit, the matrix multiplication unit configured to convert the optical input vector into an analog output vector based on a plurality of weight control signals, and

A comparator unit coupled to the matrix multiplication unit and configured to convert the analog output vector into a plurality of digital 1-bit outputs, and

A controller comprising an integrated circuit configured to:

receiving an artificial neural network calculation request comprising an input data set and a first plurality of neural network weights, wherein the input data set comprises a first digital input vector having an N-bit resolution;

obtaining from the comparator unit a sequence of N digital 1-bit outputs corresponding to the sequence of N 1-bit modulator control signals;

constructing an N-bit digital output vector from the sequence of N digital 1-bit outputs;

The transformed N-bit digital output vector is stored in a memory unit.
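
One possible reading of the 1-bit sequencing in embodiment 244 is sketched below. The embodiment does not spell out the bit order, the comparator threshold, or how the N-bit output is assembled, so all three are assumptions here: bit plane k of the input vector drives the modulators, the comparator thresholds each analog output element, and the resulting bit is placed at position k of the output element. The function names are hypothetical, and an ordinary matrix-vector product stands in for the optoelectronic matrix multiplication unit.

```python
from typing import List


def bit_plane(vector: List[int], k: int) -> List[int]:
    """Bit k of every element: one 1-bit modulator control vector."""
    return [(v >> k) & 1 for v in vector]


def matvec(matrix: List[List[float]], x: List[int]) -> List[float]:
    """Stand-in for the matrix multiplication unit."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in matrix]


def comparator(analog: List[float], threshold: float) -> List[int]:
    """Convert each analog output element into a digital 1-bit output."""
    return [1 if a > threshold else 0 for a in analog]


def bit_serial_pass(matrix: List[List[float]], vector: List[int],
                    n_bits: int, threshold: float) -> List[int]:
    """Run N 1-bit passes and assemble an N-bit digital output vector."""
    out = [0] * len(matrix)
    for k in range(n_bits):
        bits = comparator(matvec(matrix, bit_plane(vector, k)), threshold)
        # Assumed assembly rule: bit from pass k lands at bit position k.
        out = [o | (b << k) for o, b in zip(out, bits)]
    return out
```

Under these assumptions the assembled vector is a per-bit-plane thresholded result and is not, in general, equal to the full-precision matrix-vector product.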

Embodiment 245 the system of embodiment 244 wherein receiving an artificial neural network calculation request comprises receiving an artificial neural network calculation request from a general purpose computer.

Embodiment 246 the system of embodiment 244, wherein the driver unit is configured to generate the plurality of weight control signals.

Embodiment 247 the system of embodiment 244, wherein the matrix multiplication unit comprises:

An optical matrix multiplication unit coupled to the plurality of light modulators and the driver unit, the optical matrix multiplication unit configured to convert the optical input vector into an optical output vector based on the plurality of weight control signals, and

Embodiment 248 the system of embodiment 244, wherein the matrix multiplication unit comprises:

An input waveguide array for receiving an optical input vector;

Embodiment 249 the system of embodiment 248, wherein the optical interference unit comprises:

Embodiment 250 the system of embodiment 244, wherein the matrix multiplication unit comprises:

A plurality of replication modules comprising, for each of at least two subsets of one or more optical signals of the optical input vector, a respective set of one or more replication modules configured to divide the subset of one or more optical signals into two or more copies of the optical signals;

a plurality of multiplication modules, for each of at least two copies of a first subset of one or more optical signals, a respective multiplication module of the plurality of multiplication modules configured to multiply the one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation, and

One or more summation modules, for the results of the two or more multiplication modules, including a summation module configured to generate an electrical signal representative of a sum of the results of the two or more multiplication modules.

Embodiment 251 the system of embodiment 250 wherein the at least one multiplication module comprises an optical amplitude modulator comprising one input port and two output ports, and a pair of correlated optical signals is provided from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying an input value by a signed matrix element value.

Embodiment 252 the system of embodiment 250 or 251 wherein the matrix multiplication unit is configured to multiply the optical input vector by a matrix comprising the one or more matrix element values.

Embodiment 253 the system of embodiment 252 wherein a set of the plurality of output values are encoded on respective electrical signals produced by the one or more summing modules and the output values of the set of the plurality of output values represent elements of an output vector produced by multiplying the optical input vector by a matrix.

Embodiment 254 a method for performing artificial neural network computations in a system having a matrix multiplication unit configured to convert an optical input vector into an analog output vector based on a plurality of weight control signals, the method comprising:

receiving an artificial neural network calculation request comprising an input dataset and a first plurality of neural network weights, wherein the input dataset comprises a first digital input vector;

Generating a first plurality of modulator control signals based on the first digital input vector and generating a first plurality of weight control signals based on the first plurality of neural network weights;

Obtaining a first plurality of digital outputs corresponding to the output vectors of the matrix multiplication unit, the first plurality of digital outputs forming a first digital output vector;

storing the first transformed digital output vector in a memory unit, and

Embodiment 255 the method of embodiment 254 wherein receiving an artificial neural network calculation request comprises receiving an artificial neural network calculation request from a computer over a communication channel.

Embodiment 256 the method of embodiment 254 or 255, wherein generating the first plurality of modulator control signals comprises generating the first plurality of modulator control signals by a digital-to-analog conversion (DAC) unit.

Embodiment 257 the method of any one of embodiments 254-256, wherein obtaining the first plurality of digital outputs comprises obtaining the first plurality of digital outputs from an analog-to-digital conversion (ADC) unit.

Embodiment 258 the method of embodiment 257, comprising:

applying a first plurality of modulator control signals to a plurality of light modulators coupled to the light source and the DAC unit, and

generating, using the plurality of light modulators, the optical input vector by modulating a plurality of light outputs generated by the light source based on the first plurality of modulator control signals.

Embodiment 259 the method of embodiment 258 wherein the matrix multiplication unit is coupled to the plurality of light modulators and the DAC unit, and the method comprises:

converting, using the matrix multiplication unit, the optical input vector into an analog output vector based on the plurality of weight control signals.

Embodiment 260 the method of embodiment 259, wherein the ADC unit is coupled to a matrix multiplication unit, and the method comprises:

converting, using the ADC unit, the analog output vector into the first plurality of digital outputs.

Embodiment 261 the method of embodiment 259 or 260, wherein the matrix multiplication unit comprises an optical matrix multiplication unit coupled to the plurality of light modulators and the DAC unit,

Converting the optical input vector into an analog output vector includes converting the optical input vector into an optical output vector based on the plurality of weight control signals using the optical matrix multiplication unit, and

The method includes generating a plurality of output voltages corresponding to the optical output vector using a photodetection unit coupled to the optical matrix multiplication unit.

Embodiment 262 the method of embodiment 254, comprising:

Receiving an optical input vector at an input waveguide array;

Performing a linear transformation of the optical input vector into a second array of optical signals using an optical interference unit in optical communication with the input waveguide array, and

The second array of optical signals is guided using an array of output waveguides in optical communication with the optical interference unit, wherein at least one input waveguide in the array of input waveguides is in optical communication with each output waveguide in the array of output waveguides through the optical interference unit.

Embodiment 263 the method of embodiment 262 wherein the optical interference unit comprises a plurality of interconnected Mach-Zehnder interferometers (MZIs), each Mach-Zehnder interferometer of the plurality of interconnected Mach-Zehnder interferometers comprising a first phase shifter and a second phase shifter, and the first phase shifter and the second phase shifter are coupled to a plurality of weight control signals, wherein

The method comprises the following steps:

changing the splitting ratio of the Mach-Zehnder interferometer using a first phase shifter, and

The phase of one output of the Mach-Zehnder interferometer is shifted using a second phase shifter.
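
The roles of the two phase shifters can be illustrated with a textbook transfer-matrix model of a single Mach-Zehnder interferometer. The model below is an assumption and not the patented design: it uses two ideal 50:50 couplers, an internal phase shifter `theta` between them, and an external phase shifter `phi` on one output. The function names are hypothetical.

```python
import cmath
import math
from typing import List, Tuple

Matrix = List[List[complex]]


def matmul2(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mzi_transfer(theta: float, phi: float) -> Matrix:
    """Transfer matrix: coupler, internal phase theta, coupler, external phase phi."""
    s = 1.0 / math.sqrt(2.0)
    coupler = [[s, 1j * s], [1j * s, s]]              # ideal 50:50 coupler
    inner = [[cmath.exp(1j * theta), 0], [0, 1]]      # first phase shifter (one arm)
    outer = [[cmath.exp(1j * phi), 0], [0, 1]]        # second phase shifter (one output)
    return matmul2(outer, matmul2(coupler, matmul2(inner, coupler)))


def output_powers(theta: float, phi: float) -> Tuple[float, float]:
    """Output powers for unit power entering the first input port."""
    t = mzi_transfer(theta, phi)
    return abs(t[0][0]) ** 2, abs(t[1][0]) ** 2
```

In this model the output powers are sin²(θ/2) and cos²(θ/2), so `theta` alone sets the splitting ratio, while `phi` changes only the phase of the first output and leaves both powers unchanged.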

Embodiment 264 the method of embodiment 258, comprising:

for each of at least two subsets of one or more optical signals of the optical input vector, splitting the subset of one or more optical signals into copies of two or more optical signals using a respective set of one or more copy modules;

For each of at least two copies of a first subset of one or more optical signals, using a respective multiplication module to multiply the one or more optical signals of the first subset by one or more matrix element values using optical amplitude modulation, and

For results of two or more of the multiplication modules, using a summation module to generate an electrical signal representing a sum of the results of the two or more multiplication modules.

Embodiment 265 the method of embodiment 264 wherein the at least one multiplication module comprises an optical amplitude modulator comprising one input port and two output ports, and a pair of correlated optical signals is provided from the two output ports such that a difference between the amplitudes of the correlated optical signals corresponds to a result of multiplying an input value by a signed matrix element value.

Embodiment 266 the method of embodiment 264 or 265, comprising multiplying the optical input vector by a matrix comprising the one or more matrix element values using the matrix multiplication unit.

Embodiment 267 the method of embodiment 266 comprising encoding a set of multiple output values on respective electrical signals produced by the one or more summation modules, and

Wherein the output values in the set of multiple output values represent elements of an output vector generated by multiplying the optical input vector by the matrix.

Embodiment 268 a method comprising:

Providing input information in an electronic format;

photoelectrically converting an optical input vector into an analog output vector based on matrix multiplication, and

A nonlinear transformation is electronically applied to the analog output vector to provide output information in an electronic format.

Embodiment 269 the method of embodiment 268, further comprising:

repeating the electro-optical conversion, the photoelectric conversion, and the electronically applied nonlinear transformation for new electronic input information corresponding to the output information provided in electronic format.

Embodiment 270 the method of embodiment 269 wherein the matrix multiplication for initial photoelectric conversion and the matrix multiplication for repeated photoelectric conversion are the same and correspond to the same layer of the artificial neural network.

Embodiment 271 the method of embodiment 269, wherein the matrix multiplication for initial photoelectric conversion and the matrix multiplication for repeated photoelectric conversion are different and correspond to different layers of an artificial neural network.
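
The repeated conversion and transformation of embodiments 268 through 271 can be pictured as a simple loop in which the electronic output of one pass becomes the electronic input of the next. The sketch below is illustrative only: an ordinary matrix-vector product stands in for the optoelectronic conversion, a ReLU is assumed as the nonlinear transformation (the embodiments do not name one), and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matvec(matrix: Matrix, x: List[float]) -> List[float]:
    """Stand-in for converting the optical input vector to an analog output vector."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in matrix]


def relu(v: List[float]) -> List[float]:
    """Assumed nonlinear transformation applied electronically."""
    return [max(0.0, x) for x in v]


def run_passes(x: List[float], matrices: List[Matrix]) -> List[float]:
    """One pass per matrix; each output is fed back as the next input."""
    for matrix in matrices:
        x = relu(matvec(matrix, x))
    return x
```

Passing the same matrix for every pass corresponds to the case of embodiment 270, and passing a different matrix per pass corresponds to the different-layer case of embodiment 271.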

Embodiment 272 the method of embodiment 268, further comprising:

Wherein the matrix multiplication for the initial photoelectric conversion and the matrix multiplication for the repeated photoelectric conversion are the same and correspond to the first layer of the artificial neural network.

Embodiment 273 the method of embodiment 272, further comprising:

For each of the different portions of the electronic intermediate information, repeating the electro-optical conversion, the photoelectric conversion, and the electronically applied nonlinear transformation,

Wherein the matrix multiplication for the initial photoelectric conversion and the matrix multiplication for the repeated photoelectric conversion associated with different portions of the electronic intermediate information are the same and correspond to the second layer of the artificial neural network.

Embodiment 274 a system for performing artificial neural network calculations, the system comprising:

A first unit configured to generate a plurality of vector control signals and to generate a plurality of weight control signals;

a second unit configured to provide an optical input vector based on a plurality of vector control signals;

A matrix multiplication unit coupled to the second unit and the first unit, the matrix multiplication unit configured to convert the optical input vector into an output vector based on a plurality of weight control signals, and

A controller comprising an integrated circuit configured to:

receiving an artificial neural network calculation request comprising an input data set and a first plurality of neural network weights, wherein the input data set comprises a first digital input vector, and

Generating, by a first unit, a first plurality of vector control signals based on a first digital input vector, and a first plurality of weight control signals based on a first plurality of neural network weights;

Wherein the first unit, the second unit, the matrix multiplication unit, and the controller are used for an electro-optical processing loop that is repeated in a plurality of iterations, and the electro-optical processing loop includes (1) at least two optical modulation operations, and (2) at least one of (a) an electrical summation operation or (b) an electrical storage operation.

Embodiment 275 a method for performing artificial neural network calculations, the method comprising:

Providing input information in an electronic format;

converting at least a portion of the electronic input information into an optical input vector, and

Converting the optical input vector into an output vector based on matrix multiplication using a set of neural network weights;

wherein the providing and converting are performed in a photoelectric processing loop, the photoelectric processing loop being repeated in a plurality of iterations using different respective sets of neural network weights and different respective input information, and the photoelectric processing loop comprising (1) at least two optical modulation operations, and (2) at least one of (a) an electrical summation operation or (b) an electrical storage operation.
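One possible reading of the processing loop of embodiments 274 and 275 is a matrix-vector product that is larger than the matrix multiplication unit and is therefore evaluated in several passes, with partial results stored and summed electrically. The sketch below illustrates that reading only. The tiling scheme, the function names (`tile_matvec`, `looped_matvec`), and the parameter `tile_cols` are assumptions and do not appear in the disclosure.

```python
def tile_matvec(tile, sub_vector):
    """One pass through the (hypothetical) matrix multiplication unit."""
    return [sum(w * x for w, x in zip(row, sub_vector)) for row in tile]


def looped_matvec(matrix, vector, tile_cols):
    """Evaluate matrix @ vector in column tiles, accumulating partial sums."""
    stored = [0.0] * len(matrix)  # stands in for the electrical storage operation
    for start in range(0, len(vector), tile_cols):
        # Each iteration uses a different set of weights and different input values.
        tile = [row[start:start + tile_cols] for row in matrix]
        sub_vector = vector[start:start + tile_cols]
        partial = tile_matvec(tile, sub_vector)
        # Stands in for the electrical summation operation.
        stored = [s + p for s, p in zip(stored, partial)]
    return stored


matrix = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
vector = [1.0, 1.0, 2.0, 2.0]
looped = looped_matvec(matrix, vector, tile_cols=2)
direct = tile_matvec(matrix, vector)
print(looped, direct)
```

The looped and direct results agree, which is the property such a loop would need in order to reproduce a full-size multiplication from smaller passes.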

Embodiment 276 a computing system comprising:

a first unit configured to generate a plurality of modulator control signals;

A processing unit comprising:

A light source or port configured to provide a plurality of light outputs;

A first set of optical modulators coupled to the light source or port and the first unit, the optical modulators in the first set of optical modulators configured to generate an optical input vector by modulating a plurality of optical outputs provided by the light source or port based on a plurality of digital input values corresponding to a first set of modulator control signals in the modulator control signals, the optical input vector comprising a plurality of optical signals, and

A matrix multiplication unit comprising a second set of light modulators, wherein the matrix multiplication unit is coupled to the first unit, and the matrix multiplication unit is configured to transform the optical input vector into an analog output vector based on a plurality of digital weight values corresponding to a second set of modulator control signals of the plurality of modulator control signals applied to the second set of light modulators,

Wherein at least one optical modulator of at least one of the first or second sets of optical modulators is configured to modulate the optical signal based on a first modulator control signal of the plurality of modulator control signals, and the first unit is configured to shape the first modulator control signal to include a bandwidth enhancement associated with an amplitude change that corresponds to a change between consecutive digital values represented by the first modulator control signal.

Embodiment 277 the system of embodiment 276, further comprising:

A controller comprising an integrated circuit configured to:

receiving a first plurality of neural network weights, and

Embodiment 278 the system of embodiment 276 or 277, wherein the first unit comprises a digital-to-analog converter (DAC).

Embodiment 279 the system of embodiment 277, further comprising a storage unit configured to store the data set and the plurality of neural network weights.

Embodiment 280 the system of embodiment 279, wherein the integrated circuit of the controller is further configured to perform operations comprising storing the input data set and the first plurality of neural network weights in the storage unit.

Embodiment 281 the system of any of embodiments 277 through 280 wherein the controller includes an Application Specific Integrated Circuit (ASIC), and

Embodiment 282 the system of any of embodiments 277-281, wherein the first unit, the processing unit, the second unit, and the controller are disposed on at least one of a multi-chip module or an integrated circuit, and

Receiving the artificial neural network computation request includes receiving the artificial neural network computation request from a second data processor, wherein the second data processor is external to the multi-chip module or the integrated circuit, the second data processor is coupled to the multi-chip module or the integrated circuit through the communication channel, and the processing unit can process the data at a data rate that is at least an order of magnitude greater than a data rate of the communication channel.

Embodiment 283 the system of any of embodiments 277-282, wherein the first unit, the processing unit, the second unit, and the controller are used for an optoelectronic processing loop that is repeated in a plurality of iterations, the optoelectronic processing loop comprising:

Embodiment 284 the system of embodiment 283, wherein the optoelectronic processing loop includes an electrical storage operation, and the electrical storage operation is performed using a storage unit coupled to the controller,

Embodiment 285 the system of embodiment 283 or 284, wherein the optoelectronic processing loop comprises an electrical summing operation, and the electrical summing operation is performed using an electrical summing module within the matrix multiplication unit,

Wherein the electrical summing module is configured to generate currents corresponding to elements of the analog output vector, the currents representing a sum of respective elements of the optical input vector multiplied by respective neural network weights.

Embodiment 286 the system as in any one of embodiments 276-285, wherein the first modulator control signal comprises an analog signal associated with a plurality of predetermined amplitude levels, and each of the amplitude levels is associated with a different corresponding digital value.

Embodiment 287 the system of embodiment 286, wherein the first modulator control signal comprises an analog signal associated with two predetermined amplitude levels, and each of the amplitude levels is associated with a different corresponding binary value.

Embodiment 288 the system of embodiment 287, wherein the consecutive digital values comprise a plurality of consecutive binary values in a series of binary values.

Embodiment 289 the system of embodiment 288, wherein the controller is configured to shape the first modulator control signal to include bandwidth enhancements for an initial portion of the second time interval by increasing a magnitude of an amplitude variation between a first predetermined amplitude level associated with the first time interval and a second predetermined amplitude level associated with the second time interval.

Embodiment 290 the system of embodiment 288 or 289, wherein a series of binary values are used to determine an amplitude level of a first modulator control signal for modulating the optical signal according to a non-return to zero (NRZ) modulation mode.

Embodiment 291 the system of any of embodiments 288-290 wherein the first unit is configured to shape the first modulator control signal to include bandwidth enhancement by pumping a current between a diode structure of a first modulator in the second set of light modulators and a capacitance connected in series between the diode structure and a circuit providing the first modulator control signal, and an amount of charge transferred by the pumped current is determined based at least in part on a voltage held constant over a period of time during which the consecutive digital values are provided.
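As a rough illustration of the signal shaping in embodiments 286 to 290, the sketch below builds a sampled two-level NRZ drive waveform and enlarges the amplitude step for the initial portion of each interval that follows a change in binary value. The sampled representation, the function name `shape_nrz`, and the parameters `samples_per_bit`, `boost_samples`, and `boost` are assumptions made for the sketch. The charge-pumping circuit of embodiment 291 is not modeled.

```python
def shape_nrz(bits, low=0.0, high=1.0, samples_per_bit=4, boost_samples=1, boost=0.25):
    """Return a sampled NRZ waveform with an enlarged step after each transition."""
    waveform = []
    previous_level = None
    for bit in bits:
        level = high if bit else low
        changed = previous_level is not None and level != previous_level
        for i in range(samples_per_bit):
            sample = level
            if changed and i < boost_samples:
                # Overshoot past the nominal level during the initial portion
                # of the interval, in the direction of the transition.
                sample = level + boost * (level - previous_level)
            waveform.append(sample)
        previous_level = level
    return waveform


wave = shape_nrz([0, 1, 1, 0], samples_per_bit=2, boost_samples=1, boost=0.25)
print(wave)
```

Intervals that repeat the previous binary value are left at the nominal level, so the extra amplitude appears only where consecutive values differ.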

Embodiment 292 an apparatus comprising:

a plurality of optical waveguides coupled to a first set of optical amplitude modulators, wherein a set of a plurality of input values are encoded on respective optical signals carried by the optical waveguides using the first set of optical amplitude modulators;

A plurality of replication modules, and for each of at least two subsets of the one or more optical signals, a corresponding set of the one or more replication modules is configured to divide the subset of the one or more optical signals into two or more copies of the optical signal;

A plurality of multiplication modules, each of the multiplication modules including an optical amplitude modulator of a second set of optical amplitude modulators and, for each of at least two copies of a first subset of one or more optical signals, the corresponding multiplication module configured to multiply the one or more optical signals of the first subset by one or more matrix element values using the optical amplitude modulators of the second set of optical amplitude modulators, and

One or more summation modules, and for the results of the two or more multiplication modules, a corresponding one of the summation modules is configured to produce an electrical signal representing a sum of the results of the two or more multiplication modules;

Wherein at least one optical amplitude modulator of at least one of the first set of optical amplitude modulators or the second set of optical amplitude modulators is configured to modulate the optical signal with a modulation value using a monotonically increasing power relative to an absolute value of the modulation value.

Embodiment 293 the apparatus of embodiment 292 wherein at least one of the first or second sets of optical amplitude modulators comprises a coherence sensitive optical amplitude modulator configured to modulate the optical signal by a modulation value based on interference between a plurality of optical waves, the optical waves having a coherence length at least as long as a propagation distance through the coherence sensitive optical amplitude modulator.

Embodiment 294 the apparatus of embodiment 293, wherein the coherence sensitive optical amplitude modulator comprises a Mach-Zehnder interferometer (MZI) that splits an optical wave guided by an input optical waveguide into a first optical waveguide arm of the Mach-Zehnder interferometer and a second optical waveguide arm of the Mach-Zehnder interferometer, the first optical waveguide arm comprising an active phase shifter that imparts a relative phase shift with respect to a phase delay of the second optical waveguide arm, and the Mach-Zehnder interferometer combines the optical waves from the first optical waveguide arm and the second optical waveguide arm into at least one output optical waveguide.

Embodiment 295 the apparatus of embodiment 294 wherein the power used to modulate the optical signal by the modulation value comprises power applied to an active phase shifter.

Embodiment 296 the apparatus of embodiment 292 wherein the input values in the set of the plurality of input values encoded on the respective optical signals represent a plurality of elements of an input vector multiplied by a matrix comprising one or more matrix element values.

Embodiment 297 the apparatus of embodiment 296 wherein a set of the plurality of output values is encoded on respective electrical signals produced by the one or more summing modules and the output values of the set of the plurality of output values represent a plurality of elements of an output vector, the output vector being produced by multiplying the input vector by a matrix.
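The copy, multiply, and sum structure of embodiments 292, 296, and 297 can be sketched numerically as follows. The sketch assumes ideal lossless splitters that divide power equally among copies, matrix element values between 0 and 1 (an amplitude modulator that only attenuates), and a known rescaling of the summed result. None of these assumptions, nor the function names, are stated in the disclosure.

```python
def copy_signal(value, n_copies):
    """Ideal lossless replication: each copy carries an equal share of the input."""
    return [value / n_copies] * n_copies


def copy_multiply_sum(matrix, input_vector):
    """Compute matrix @ input_vector as copy -> multiply -> sum."""
    n_rows = len(matrix)
    # Replication modules: one copy of every input element for every output row.
    copies = [copy_signal(x, n_rows) for x in input_vector]
    outputs = []
    for r, row in enumerate(matrix):
        # Multiplication modules: each copy is scaled by one matrix element.
        products = [row[c] * copies[c][r] for c in range(len(input_vector))]
        # Summation module: add the results, then undo the known 1/n_rows split factor.
        outputs.append(sum(products) * n_rows)
    return outputs


matrix = [[0.5, 0.25], [1.0, 0.0]]
input_vector = [4.0, 8.0]
output_vector = copy_multiply_sum(matrix, input_vector)
print(output_vector)
```

Each output element depends on every input element, which is why each input signal has to be replicated once per output row before multiplication.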

Embodiment 298 the apparatus of any of embodiments 292-297, wherein the optical signals carried by the optical waveguides each comprise an optical wave having a common wavelength that is approximately the same for all optical signals.

Embodiment 299 the apparatus of any of embodiments 292 through 297, wherein the replication module comprises at least one replication module having an optical splitter that transmits power of a predetermined proportion of the optical waves at an input port of the replication module to a first output port of the replication module and transmits power of the remaining proportion of the optical waves at the input port of the replication module to a second output port of the replication module.

Embodiment 300 the apparatus of embodiment 299, wherein the optical splitter comprises a waveguide splitter that transmits power of a predetermined proportion of the optical waves directed by the input optical waveguide of the replication module to the first output optical waveguide of the replication module and transmits power of the remaining proportion of the optical waves directed by the input optical waveguide of the replication module to the second output optical waveguide of the replication module.

Embodiment 301 the apparatus of embodiment 300 wherein the guided mode of the input optical waveguide is adiabatically coupled to the guided mode of each of the first and second output optical waveguides.

Embodiment 302 the apparatus of embodiment 299 or 230, wherein the optical splitter comprises a beam splitter comprising at least one surface that transmits a predetermined proportion of the power of the optical waves at the input port and reflects the remaining proportion of the power of the optical waves at the input port.

Embodiment 303 the apparatus of embodiment 302, wherein at least one of the plurality of optical waveguides comprises an optical fiber coupled to an optical coupler that couples a guided mode of the optical fiber to a free space propagation mode.

Embodiment 304 the apparatus of any of embodiments 292-303, wherein the multiplication module includes at least one coherence sensitive optical amplitude modulator configured to multiply one or more optical signals of the first subset by one or more matrix element values based on interference between optical waves, the optical waves having a coherence length that is at least as long as a propagation distance through the coherence sensitive optical amplitude modulator.

Embodiment 305 the apparatus of embodiment 304 wherein the coherence sensitive optical amplitude modulator comprises a Mach-Zehnder interferometer that splits an optical wave guided by an input optical waveguide into a first optical waveguide arm of the Mach-Zehnder interferometer and a second optical waveguide arm of the Mach-Zehnder interferometer, the first optical waveguide arm comprising a phase shifter that produces a relative phase shift with respect to a phase delay of the second optical waveguide arm, and the Mach-Zehnder interferometer combines the optical waves from the first optical waveguide arm and the second optical waveguide arm into at least one output optical waveguide.

Embodiment 306 the apparatus of embodiment 305 wherein the Mach-Zehnder interferometer combines the optical waves from the first optical waveguide arm and the second optical waveguide arm into each of a first output optical waveguide and a second output optical waveguide, a first photodetector receives an optical wave from the first output optical waveguide to produce a first photocurrent, a second photodetector receives an optical wave from the second output optical waveguide to produce a second photocurrent, and the result of the coherence sensitive optical amplitude modulator comprises a difference between the first photocurrent and the second photocurrent.
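The difference of photocurrents in embodiment 306 can be illustrated with the textbook model of an ideal, lossless Mach-Zehnder interferometer with 50:50 couplers, in which the two output powers vary as the squared cosine and squared sine of half the relative phase shift. That model, the assignment of the cosine term to the first output, the unit photodetector responsivity, and the function names are assumptions of this sketch rather than statements from the disclosure.

```python
import math


def mzi_outputs(input_power, phase_shift):
    """Ideal lossless MZI: optical power at the first and second outputs."""
    p1 = input_power * math.cos(phase_shift / 2.0) ** 2
    p2 = input_power * math.sin(phase_shift / 2.0) ** 2
    return p1, p2


def balanced_result(input_power, phase_shift, responsivity=1.0):
    """Difference between the two photocurrents (assumed equal responsivity)."""
    p1, p2 = mzi_outputs(input_power, phase_shift)
    # Under the ideal model this equals responsivity * input_power * cos(phase_shift).
    return responsivity * (p1 - p2)


def phase_for_weight(weight):
    """Relative phase shift that yields a signed weight in [-1, 1] under this model."""
    return math.acos(weight)


signed = balanced_result(2.0, phase_for_weight(-0.5))
print(signed)
```

Under these assumptions the difference signal spans both signs, so a single modulator and photodetector pair can represent negative as well as positive matrix element values even though optical power itself is never negative.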

Embodiment 307 the apparatus of any one of embodiments 304 to 306 wherein the coherence sensitive optical amplitude modulator comprises one or more ring resonators including at least one ring resonator coupled to the first optical waveguide and at least one ring resonator coupled to the second optical waveguide.

Embodiment 308 the apparatus of embodiment 307 wherein a first photodetector receives an optical wave from the first optical waveguide to produce a first photocurrent, a second photodetector receives an optical wave from the second optical waveguide to produce a second photocurrent, and the result of the coherence sensitive optical amplitude modulator comprises a difference between the first photocurrent and the second photocurrent.

Embodiment 309 the apparatus of any of embodiments 292-308, wherein the multiplication module includes at least one coherence insensitive optical amplitude modulator configured to multiply one or more optical signals of the first subset by one or more matrix element values based on absorption of energy within an optical wave.

Embodiment 310 the apparatus of embodiment 309 wherein the coherence insensitive optical amplitude modulator comprises an electro-absorption modulator.

Embodiment 311 an apparatus as in any one of embodiments 292-310 wherein the one or more summing modules comprises at least one summing module having (1) two or more input conductors each carrying an electrical signal in the form of an input current, the magnitude of the input current representing a respective result of a respective one of the multiplication modules, and (2) at least one output conductor carrying an electrical signal representing a sum of respective results in the form of an output current, the output current being proportional to the sum of the input currents.

Embodiment 312 the apparatus of embodiment 311, wherein the two or more input conductors and the output conductor comprise wires that meet at one or more junctions among the wires, and the output current is approximately equal to the sum of the input currents.

Embodiment 313 the apparatus of embodiment 311 or 312, wherein at least a first input current of the input currents is provided in the form of at least one photocurrent generated by at least one photodetector that receives the optical signal generated by the first multiplication module of the multiplication modules.

Embodiment 314 the apparatus of embodiment 313 wherein the first input current is provided in the form of a difference between two photocurrents generated by different respective photodetectors that receive different respective optical signals generated by the first multiplication module.

Embodiment 315 the apparatus of any one of embodiments 292 through 314 wherein one of the copies of the first subset of one or more optical signals is comprised of a single optical signal on which one of the input values is encoded.

Embodiment 316 the apparatus of embodiment 315 wherein the multiplication module corresponding to the copy of the first subset multiplies the encoded input values by the single matrix element values.

Embodiment 317 the apparatus of any of embodiments 292-316, wherein one of the copies of the first subset of one or more optical signals comprises more than one optical signal and less than a number of all optical signals, wherein the plurality of input values are encoded on the optical signals.

Embodiment 318 the apparatus of embodiment 317 wherein the multiplication module corresponding to the copy of the first subset multiplies the encoded input values by different respective matrix element values.

Embodiment 319 the apparatus of embodiment 318 wherein the different multiplication modules corresponding to different respective copies of the first subset of one or more optical signals are contained in different apparatuses that optically communicate to transmit one of the copies of the first subset of one or more optical signals between the different apparatuses.

Embodiment 320 the apparatus of embodiment 319 wherein at least one of the two or more of the plurality of optical waveguides, the two or more of the plurality of replication modules, the two or more of the plurality of multiplication modules, and the one or more summation modules are disposed on a substrate of a common apparatus.

Embodiment 321 the apparatus of embodiment 320 wherein the apparatus performs vector matrix multiplication in which an input vector is provided as a set of optical signals and an output vector is provided as a set of electrical signals.

Embodiment 322 the apparatus of any one of embodiments 292 through 321, further comprising an accumulator that integrates an input electrical signal corresponding to the output of a multiplication module or a summation module, wherein the input electrical signal is encoded using a time domain encoding that uses on-off amplitude modulation within each of a plurality of time slots, and the accumulator generates an output electrical signal encoded at more than two amplitude levels, the amplitude levels corresponding to different duty cycles of the time domain encoding over the plurality of time slots.
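The accumulator of embodiment 322 can be pictured as an integrator that turns an on-off sequence into a multi-level value. The sketch below assumes an ideal integrator, integer levels, and "on" slots placed at the start of the frame. These choices and the function names are hypothetical.

```python
def encode_duty_cycle(level, n_slots):
    """Encode an integer level in 0..n_slots as an on-off sequence of n_slots slots."""
    if not 0 <= level <= n_slots:
        raise ValueError("level must be between 0 and n_slots")
    return [1] * level + [0] * (n_slots - level)


def accumulate(slots, on_amplitude=1.0):
    """Ideal accumulator: integrate the on-off signal over all time slots."""
    return sum(on_amplitude * s for s in slots)


# Four time slots give five distinguishable output amplitude levels.
levels = [accumulate(encode_duty_cycle(k, 4)) for k in range(5)]
print(levels)
```

Although the signal in every slot takes only two amplitudes, the integrated output takes more than two, one for each duty cycle.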

Embodiment 323 the apparatus as in any one of embodiments 292-322, wherein each of the two or more multiplication modules corresponds to a different subset of the one or more optical signals.

Embodiment 324 the apparatus of any of embodiments 292-323, further comprising a multiplication module for each copy of a second subset of the one or more optical signals different from the first subset of the one or more optical signals configured to multiply the one or more optical signals of the second subset by the one or more matrix element values using optical amplitude modulation.

Embodiment 325 a method comprising:

Encoding a set of a plurality of input values on respective optical signals using a first set of optical amplitude modulators;

For each of at least two subsets of the one or more optical signals, using a corresponding set of one or more replication modules to divide the subset of the one or more optical signals into two or more copies of the optical signal;

For each of at least two copies of a first subset of one or more optical signals, multiplying the one or more optical signals of the first subset by one or more matrix element values using an optical amplitude modulator of a second set of optical amplitude modulators, and

For the results of two or more multiplication modules, using a summation module configured to generate an electrical signal representing the sum of the results of the two or more multiplication modules,

Claims

1. An optoelectronic computing system, comprising:

a first semiconductor die comprising a Photonic Integrated Circuit (PIC), the photonic integrated circuit comprising:

A plurality of optical waveguides configured to carry optical signals, wherein a set of a plurality of input values is encoded on a respective optical signal carried by the optical waveguides;

An optical replication distribution network comprising a plurality of optical splitters, wherein each optical splitter is configured to receive the optical signal from an input port and transmit half of the power of the optical signal to each of two output ports, and

An array of opto-electronic circuit sections, wherein each opto-electronic circuit section is configured to receive an optical wave from one of the output ports of the optical replication distribution network, and each opto-electronic circuit section comprises:

at least one photodetector configured to detect at least one light wave from an optoelectronic operation, and

At least one conductive path integrated in the photonic integrated circuit, the conductive path electrically coupled to the photodetector and electrically coupled to an electrical output port, and

A second semiconductor die comprising an Electronic Integrated Circuit (EIC), the electronic integrated circuit comprising:

a plurality of electrical input ports that receive respective electrical values;

wherein the first semiconductor die and the second semiconductor die are electrically coupled by a controlled collapse chip connection, wherein the electrical output port of the photonic integrated circuit is connected to one of the electrical input ports of the electronic integrated circuit.

2. An optoelectronic computing system according to claim 1, wherein each optoelectronic circuit portion comprises:

An optoelectronic operation module configured to perform an operation between (1) a second value based on one of the input values scaled by the optical replication distribution network and (2) an electrical value provided by the electrical input port.

3. The optoelectronic computing system of claim 2, wherein the electronic integrated circuit further comprises a plurality of digital-to-analog converters (DACs) configured to provide electrical values to respective electrical output ports, and the electrical input port of the photonic integrated circuit is connected to the electrical output port of the electronic integrated circuit.

4. An optoelectronic computing system according to claim 2, wherein the optical splitters are arranged as nodes in a binary tree arrangement, the binary tree arrangement being connected by optical waveguides as links in the binary tree arrangement.

5. An optoelectronic computing system according to claim 4, wherein the optical replication distribution network comprises a plurality of binary tree arrangements, each binary tree arrangement configured to distribute a different one of the plurality of input values encoded on the respective optical signal.
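For illustration, the power reaching each leaf of a binary tree of half-power splitters, as recited in claims 1, 4, and 5, can be computed as below. The sketch assumes ideal lossless splitters and a full tree of uniform depth. The function names and the example values are hypothetical.

```python
def binary_tree_leaf_powers(input_power, depth):
    """Power at each leaf of a full binary tree of ideal 50:50 splitters."""
    powers = [input_power]
    for _level in range(depth):
        next_powers = []
        for power in powers:
            # Each splitter sends half of its input power to each output port.
            next_powers.extend([power / 2.0, power / 2.0])
        powers = next_powers
    return powers


def distribute_inputs(input_powers, depth):
    """One binary tree per encoded input value, as in claim 5."""
    return [binary_tree_leaf_powers(p, depth) for p in input_powers]


leaves = binary_tree_leaf_powers(8.0, 3)
print(leaves)
```

Under these assumptions a tree of depth d delivers 2^d copies, each scaled by the same factor 1/2^d, so the scaling applied to an input value is known in advance.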

6. An optoelectronic computing system according to claim 4, wherein the light propagation lengths between the root and the different optoelectronic circuit portions of the binary tree arrangement are all different from each other.

7. An optoelectronic computing system according to any one of claims 1 to 6, wherein the optical waveguides in the optical replication distribution network are arranged in the first semiconductor die so as to avoid crossing any optical waveguides in the optical replication distribution network.

8. The optoelectronic computing system of any one of claims 1 to 6, wherein the optoelectronic circuit portion is arranged in a plurality of substantially straight lines on the first semiconductor die.

9. An optoelectronic computing system according to claim 8, wherein the plurality of substantially straight lines are optically coupled to each other by one or more optical waveguides in the optical replication distribution network.

10. The optoelectronic computing system of any one of claims 1 to 6, wherein a portion of the conductive path integrated in the photonic integrated circuit connects the photodetector to a junction between conductive paths from different optoelectronic circuit portions.

11. The optoelectronic computing system of any one of claims 2 to 6, wherein the optoelectronic operation module comprises a Mach-Zehnder interferometer configured to perform a multiplication operation between (1) the second value based on one of the input values scaled by the optical replication distribution network and (2) the electrical value provided by the electrical input port.

12. The optoelectronic computing system of any one of claims 1 to 6, wherein the electronic integrated circuit further comprises a transimpedance amplifier having an input electrically coupled to an electrical output port of the photonic integrated circuit.
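As a simple illustration of claims 10 and 12, photocurrents that meet at a junction add, and a transimpedance amplifier converts the summed current to a voltage. The sketch assumes an ideal inverting transimpedance amplifier with a single feedback resistance; the sign convention, the function names, and the component values are assumptions and are not specified in the claims.

```python
def junction_current(photocurrents):
    """Currents meeting at a junction add (Kirchhoff's current law)."""
    return sum(photocurrents)


def tia_output_voltage(input_current, feedback_resistance):
    """Ideal inverting transimpedance amplifier: V_out = -I_in * R_f."""
    return -input_current * feedback_resistance


# Three photocurrents in amperes and a 10 kilohm feedback resistance (example values).
total_current = junction_current([1e-6, 2e-6, 3e-6])
output_voltage = tia_output_voltage(total_current, 10_000.0)
print(total_current, output_voltage)
```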

13. An optoelectronic computing system, comprising:

An optical network comprising a plurality of optical splitters or directional couplers, each configured to receive the optical signal at an input port and transmit a portion of the optical power of the optical signal to each of two or more output ports, and

An array of opto-electronic circuit sections, wherein each opto-electronic circuit section is configured to receive an optical wave from one of the output ports of the optical network, and each opto-electronic circuit section comprises:

at least one photodetector configured to detect at least one light wave from an operation, and

At least one conductive line integrated in the photonic integrated circuit, the conductive line electrically coupled to the photodetector and electrically coupled to an electrical output port, and

14. The optoelectronic computing system of claim 13, wherein each optical splitter transmits half of the power of the optical signal at the input port to each of the two output ports.

15. The optoelectronic computing system of claim 13, wherein the optical network comprises cascaded directional couplers.
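For a chain of cascaded directional couplers such as that of claim 15, one common design goal is equal power at every output. The claim does not require equal power; the sketch below assumes it, together with lossless couplers, to show how the coupling ratio would then have to grow along the chain. The function names are hypothetical.

```python
def equal_tap_ratios(n_outputs):
    """Coupling ratio of each coupler so that every output receives 1/n of the input."""
    # Coupler k taps 1/(n - k) of whatever power is still on the bus.
    return [1.0 / (n_outputs - k) for k in range(n_outputs - 1)]


def cascade_outputs(input_power, ratios):
    """Power delivered to each output of a chain of lossless couplers."""
    outputs = []
    remaining = input_power
    for ratio in ratios:
        tapped = remaining * ratio
        outputs.append(tapped)
        remaining -= tapped
    outputs.append(remaining)  # the end of the bus feeds the last output
    return outputs


ratios = equal_tap_ratios(4)
outputs = cascade_outputs(1.0, ratios)
print(ratios, outputs)
```

With four outputs the ratios are 1/4, 1/3, and 1/2, in contrast to a binary tree, where every splitter uses the same one-half ratio.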

16. An optoelectronic computing system according to claim 13, wherein the optical network comprises:

a first layer located at a first depth in the first semiconductor die comprising a cladding layer material and a plurality of optical waveguides formed of a core material within the cladding layer material;

a second layer located at a second depth in the first semiconductor die, comprising the cladding layer material and a plurality of optical waveguides formed within the cladding layer material from the core material;

A third layer located at a third depth between the first depth and the second depth in the first semiconductor die, including the cladding layer material and a plurality of coupling structures.

17. The optoelectronic computing system of claim 16, wherein the optical network further comprises a fourth layer located at a fourth depth between the first depth and the third depth in the first semiconductor die, comprising the cladding layer material and a plurality of coupling structures.

18. An optoelectronic computing system according to claim 17, wherein the coupling structures in the third layer and the fourth layer are arranged close to each other to provide coupling between the waveguides in the first layer and the waveguides in the second layer.

19. The optoelectronic computing system of claim 17, wherein the optical network further comprises a fifth layer located at a fifth depth between the second depth and the third depth in the first semiconductor die, comprising the cladding layer material and a plurality of coupling structures.

20. An optoelectronic computing system according to claim 13, wherein each optoelectronic circuit portion comprises:

an optoelectronic operation module that performs the operation between (1) a second value based on one of the input values scaled by the optical network and (2) an electrical value provided by the electrical input port.

21. The optoelectronic computing system of claim 13, wherein the electronic integrated circuit further comprises a plurality of digital-to-analog converters (DACs) configured to provide electrical values to respective electrical output ports, and the electrical input port of the photonic integrated circuit is connected to the electrical output port of the electronic integrated circuit.

22. The optoelectronic computing system of claim 13, wherein the optical splitter is arranged as a node in a binary tree arrangement, the binary tree arrangement being connected by an optical waveguide that is a link in the binary tree arrangement.

23. An optoelectronic computing system according to claim 22, wherein the optical network comprises a plurality of binary tree arrangements, each binary tree arrangement assigning a different one of the plurality of input values encoded on the respective optical signal.

24. An optoelectronic computing system according to claim 22, wherein the light propagation lengths between the root and the different optoelectronic circuit portions of the binary tree arrangement are all different from each other.

25. An optoelectronic computing system according to any one of claims 13 to 24, wherein the optical waveguides in the optical network are arranged in the first semiconductor die so as to avoid crossing any optical waveguides in the optical network.

26. The optoelectronic computing system of any one of claims 13 to 24, wherein the optoelectronic circuit portions are arranged in a plurality of substantially straight lines on the first semiconductor die.

27. The optoelectronic computing system of claim 26, wherein the plurality of substantially straight lines are optically coupled to each other by one or more optical waveguides in the optical network.

28. The optoelectronic computing system of any one of claims 13 to 24, wherein a portion of the conductive path integrated in the photonic integrated circuit connects the photodetector to a junction between conductive paths from different optoelectronic circuit portions.

29. The optoelectronic computing system of claim 20, wherein the optoelectronic operation module comprises a Mach-Zehnder interferometer configured to perform a multiplication operation between (1) an optical value based on one of the input values scaled by the optical network and (2) an electrical value provided by the electrical input port.

30. The optoelectronic computing system of any one of claims 13 to 24, wherein the electronic integrated circuit further comprises a transimpedance amplifier having an input electrically coupled to an electrical output port of the photonic integrated circuit.

31. An optoelectronic computing system, comprising:

an optical replication distribution network comprising a plurality of optical splitters, wherein each optical splitter is configured to receive the optical signal from an input port and to transmit a portion of the power of the optical signal to each of two output ports of the optical splitter, and each of the output ports of the optical replication distribution network provides a portion of the optical signal encoding one of the input values, scaled by the same proportion, and

32. The optoelectronic computing system of claim 31, wherein each optoelectronic circuit portion comprises:

an optoelectronic operation module configured to perform the operation between (1) a second value based on one of the input values scaled by the optical replication distribution network and (2) an electrical value provided by the electrical input port.

33. An optoelectronic computing system according to claim 31, wherein the electronic integrated circuit further comprises a plurality of digital-to-analog converters (DACs) providing electrical values to respective electrical output ports, and the electrical input ports of the photonic integrated circuit are connected to the electrical output ports of the electronic integrated circuit.

34. The optoelectronic computing system of claim 31, wherein the plurality of optical splitters are configured to transmit half of the power of the optical signal at the input port to each of two output ports, and the optical splitters are arranged as nodes in a binary tree arrangement connected by optical waveguides that are links in the binary tree arrangement.

35. An optoelectronic computing system according to claim 31, wherein the optical replication distribution network comprises a plurality of binary tree arrangements, each binary tree arrangement distributing a different one of the plurality of input values encoded on the respective optical signal, and the proportions in which the respective input values are scaled in the different binary tree arrangements are the same.

36. The optoelectronic computing system of claim 34, wherein the light propagation lengths between the root of the binary tree arrangement and the different optoelectronic circuit portions are all different from each other.

37. An optoelectronic computing system according to any one of claims 31 to 36, wherein the optical waveguides in the optical replication distribution network are arranged in the first semiconductor die so as to avoid crossing any optical waveguides in the optical replication distribution network.

38. The optoelectronic computing system of any one of claims 31 to 36, wherein the optoelectronic circuit portions are arranged in a plurality of substantially straight lines on the first semiconductor die.

39. The optoelectronic computing system of claim 38, wherein the plurality of substantially straight lines are optically coupled to each other by one or more optical waveguides in the optical replication distribution network.

40. The optoelectronic computing system of any one of claims 31 to 36, wherein a portion of the conductive path integrated in the photonic integrated circuit connects the photodetector to a junction between conductive paths from different optoelectronic circuit portions.

41. The optoelectronic computing system of any one of claims 32 to 36, wherein the optoelectronic operation module comprises a Mach-Zehnder interferometer configured to perform a multiplication operation between (1) a second value based on one of the input values scaled by the optical replication distribution network and (2) an electrical value provided by the electrical input port.

42. The optoelectronic computing system of any one of claims 31 to 36, wherein the electronic integrated circuit further comprises a transimpedance amplifier having an input electrically coupled to an electrical output port of the photonic integrated circuit.

43. A method, comprising:

at a first semiconductor die comprising a photonic integrated circuit (PIC), for each of at least two subsets of one or more optical signals, splitting the subset of one or more optical signals into two or more copies of the one or more optical signals;

transmitting the copies of the at least two subsets of one or more optical signals to an array of optoelectronic circuit portions;

in each optoelectronic circuit portion of the array of optoelectronic circuit portions, detecting at least one light wave from an optoelectronic operation to generate at least one electrical signal, and transmitting the at least one electrical signal through at least one conductive path integrated in the photonic integrated circuit to a second semiconductor die comprising an electronic integrated circuit (EIC) electrically coupled to the first semiconductor die with a controlled collapse chip connection; and

processing the at least one electrical signal using the electronic integrated circuit.

44. The method of claim 43, comprising:

transmitting, at the first semiconductor die, a light wave in a first waveguide disposed at a first depth of the first semiconductor die; and

transferring the light wave from the first waveguide to a second waveguide through a plurality of coupling structures, wherein the second waveguide is disposed at a second depth of the first semiconductor die;

wherein the plurality of coupling structures includes at least a first coupling structure and a second coupling structure, the first coupling structure being disposed at a third depth of the first semiconductor die and the second coupling structure being disposed at a fourth depth of the first semiconductor die.

45. The method of claim 44, wherein the plurality of coupling structures includes at least the first coupling structure, the second coupling structure, and a third coupling structure disposed at a fifth depth of the first semiconductor die.

46. The method of claim 44 or 45, wherein transferring the light wave from the first waveguide to the second waveguide comprises simultaneously transmitting no more than 10% of the power of the light wave in each of the first and second coupling structures.

47. The method of claim 44 or 45, wherein transferring the light wave from the first waveguide to the second waveguide comprises simultaneously transmitting at least 10% of the power of the light wave in each of the first and second coupling structures at some point during transfer.

48. The method of claim 44 or 45, wherein transferring the light wave from the first waveguide to the second waveguide comprises simultaneously transmitting at least 20% of the power of the light wave in each of the first and second coupling structures at some point during transfer.

49. The method of claim 44 or 45, wherein transferring the light wave from the first waveguide to the second waveguide comprises simultaneously transmitting at least 30% of the power of the light wave in each of the first and second coupling structures at some point during transfer.

50. The method of claim 44 or 45, wherein transferring the light wave from the first waveguide to the second waveguide comprises simultaneously transmitting at least 40% of the power of the light wave in each of the first and second coupling structures at some point during transfer.

51. The method of claim 45, wherein transferring the light wave from the first waveguide to the second waveguide comprises simultaneously transmitting at least 10% of the power of the light wave in each of the first, second, and third coupling structures at some point during transfer.

52. The method of claim 45, wherein transferring the light wave from the first waveguide to the second waveguide comprises simultaneously transmitting at least 20% of the power of the light wave in each of the first, second, and third coupling structures at some point during transfer.

53. The method of claim 45, wherein transferring the light wave from the first waveguide to the second waveguide comprises simultaneously transmitting at least 30% of the power of the light wave in each of the first, second, and third coupling structures at some point during transfer.