# Introduction

Storing and retrieving data units on the Von Neumann architecture is far more time-consuming and power-hungry than on optical devices [1][2][3][4]. Unlike modern computers, such devices integrate data computation, storage and fetching, which makes them more efficient and less power-hungry, with larger storage capacity and a higher integration level [5][6][7]. In addition, artificial neural networks [8], which resemble the way humans and animals store and process data, are successful in a wide range of tasks such as image analysis [9], speech recognition [10] and language translation [11]. With increasing data volume, problem complexity and network depth, artificial neural networks can match or even surpass human performance. However, most of these tasks cannot be migrated well to smart portable devices because of their complexity and power consumption. Lower power, higher efficiency and faster speed are therefore becoming more and more critical for deep learning on embedded devices.

Neuromorphic computing seeks brain-like processing that overcomes the limitations of conventional computers. IBM [12] built TrueNorth, a fully functional digital chip with 5.4 billion transistors and 4096 neurosynaptic cores. To approach the extreme complexity of the human cerebral cortex, M. Prezioso et al. [13] combined complementary metal-oxide-semiconductor (CMOS) circuits with two-terminal resistive devices. Spin-transfer torque magnetic memory (STT-MRAM) [14], with non-volatility, high speed and high endurance, is suitable as a stochastic memristive device, considering the functional implications of synaptic plasticity. Alexander N. Tait et al. [15], inspired by spiking neural networks, integrated laser devices to process highly interactive information at high speed with optical-electronic systems. This approach promises to incorporate photonic spike processing in the training system. Besides, Carlos Ríos et al. dramatically improved storage capacity by implementing an all-photonic nonvolatile multi-level memory [16]; electric-photonic interconnect technologies bring not only opportunities but also challenges to unconventional circuits and systems. To avoid the losses of optical-electrical conversion and coupling, all-photonic devices can operate with fast computational speed and lower power. On-chip nonvolatile photonic devices [17] would dramatically improve the performance of existing brain-like neural networks [18] by eliminating electronic latency and reducing electrical power consumption. The on-chip optical architecture is designed with computational elements for the network protocol and a waveguide medium for communication among high-performance spiking neurons.

The fully optical network architecture based on the Mach-Zehnder interferometer [19] promises to reduce the energy cost of data movement. The all-optical diffractive deep neural network (D2NN) architecture [20] utilizes passive components and optical diffraction. D2NN can be easily scaled up and fabricated by 3D lithography [21] in a power-efficient manner.

In general, optical networks have more trainable parameters than electronic networks, because complex-valued modulation provides both the phase and the amplitude of each neuron rather than the amplitude only. Unfortunately, forming a neural network from optical devices raises some problems. First, an all-optical neural network is designed for a single task, whereas multi-task learning [22] is important. Second, the learning rates for different tasks affect the accuracy, and it is non-trivial to balance the tasks and their learning rates.

In this paper, to address the above two issues, we exploit the characteristics of light to express different tasks with different wavelengths. One wavelength is used as the baseband and the other as the carrier frequency. Therefore, the baseband wave can be given a large learning rate and vice versa. Extensive experiments on MNIST and MNIST-Fashion [23] are conducted to investigate the efficacy and properties of the proposed multi-wavelength diffractive network (MWDN). In both tasks, the MWDN outperforms the baselines within the same network.


# II. Multi-Wavelength Diffractive Network

In the spatial domain, diffraction is described wave by wave as in-plane propagation with a particular phase and frequency, which makes it possible to analyze and integrate waves travelling in different directions. The analysis itself operates in the frequency domain. The wave distribution on the observation and aperture planes can be viewed as a linear combination of a great many monochromatic plane waves propagating in different directions. The amplitude and phase of each plane wave are given by the angular spectrum, which can be obtained by FFT analysis [24]. Plane-wave propagation is a complex process that must take into account many factors, such as direction, phase and amplitude.

As shown in Fig. 1, we adjust two optical grating parameters, the height and the complex index of refraction: the height is set by 3D printing, and the complex index of refraction is altered by laser light of different power, since different power levels change the refraction of phase-change materials. We input images from MNIST and MNIST-Fashion simultaneously; the input optical wavelengths of the MNIST and MNIST-Fashion tasks are $\lambda_1$ and $\lambda_2$, respectively. The diffractive network uses the same optical parameters for both tasks. The bottom of the figure shows the optical carrier. Fig. 2 shows the framework of the diffractive network, where different colors denote different refractive indices. First, we convert the image information to the phase and amplitude of the optical field as the input of the system. Then, the optical grating is manufactured by a 3D-printing device with different heights. In the following sections, we discuss how MWDN tackles the tasks predominantly using the angular spectrum. The 3D-printed MWDN modulates the amplitude and phase of the wave in the ranges 0~1 and 0~2π, respectively, for the two tasks in the same network. For each layer of MWDN, we set the neuron size in the range 200 μm to 700 μm, which is a tunable parameter.

Following the Fresnel diffraction equation, we can transform the optical signal from the spatial domain to the frequency domain. The angular spectrum method of plane waves explains how waves propagate and is the primary method of analyzing diffraction in the frequency domain. Based on the angular spectrum, the free-space transfer function governs free propagation. The wave plane is converted to its angular spectrum by the FFT, and the diffractive data processing is as follows:

The output wave plane distribution propagates through 3D material and the field distribution is changed by the refractive index.


# Global Journal of Computer Science and Technology

Volume XX Issue I Version I 
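$$A_0(f_x, f_y) = \mathrm{FFT}\big(U_0(x, y)\big)$$

$$A(f_x, f_y) = A_0(f_x, f_y)\, H(f_x, f_y)$$

$$U(x, y) = \mathrm{IFFT}\big(A(f_x, f_y)\big)$$

$$H(f_x, f_y) = \exp(jkz)\, \exp\!\big(-j\pi\lambda z\,(f_x^2 + f_y^2)\big)$$

$$h(x - x_0, y - y_0) = F^{-1}\big(H(f_x, f_y)\big) = \frac{1}{j\lambda z}\exp(jkz)\, \exp\!\Big(\frac{jk}{2z}\big((x - x_0)^2 + (y - y_0)^2\big)\Big)$$

$$O(x, y) = U(x, y)\,(1 - \alpha\,\Delta d)\exp(j\phi), \qquad \phi = \frac{2\pi\,\Delta n\,\Delta d}{\lambda}, \qquad \alpha = \frac{4\pi k}{\lambda}, \qquad \Delta n = n - n_0$$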
Here, $f_x$ and $f_y$ are the spatial frequencies corresponding to the $x$ and $y$ locations ($f_x = 1/(x - x_0)$), $x - x_0$ is the gap in the optical map, $N$ and $M$ are the numbers of grooves on the optical grating in the height and width directions, $U_0(x, y)$ is the original field distribution, $U(x, y)$ is the field distribution after free-space transfer, $H(f_x, f_y)$ is the transfer function, $A_0(f_x, f_y)$ is the original angular spectrum, and $A(f_x, f_y)$ is the angular spectrum after free-space transfer. The inverse Fourier transform of the transfer function is the impulse response function. The equation can be viewed as a Fourier transform.
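The free-space propagation step can be sketched numerically. The following is a minimal NumPy sketch, assuming the standard Fresnel transfer function $H = \exp(jkz)\exp(-j\pi\lambda z(f_x^2 + f_y^2))$ with $k = 2\pi/\lambda$. The function name `propagate`, the 64×64 grid, the 400 μm pixel pitch and the 3 cm propagation distance are illustrative assumptions, not values taken from the experiments.

```python
import numpy as np

def propagate(u0, wavelength, z, pitch):
    """Propagate a complex field u0 over a distance z (angular spectrum, Fresnel form).

    u0         : 2D complex array, field on the input plane
    wavelength : wavelength in metres
    z          : propagation distance in metres
    pitch      : sampling interval (neuron size) in metres
    """
    n_rows, n_cols = u0.shape
    # Spatial frequencies matching the FFT sample ordering
    fx = np.fft.fftfreq(n_cols, d=pitch)
    fy = np.fft.fftfreq(n_rows, d=pitch)
    FX, FY = np.meshgrid(fx, fy)

    k = 2.0 * np.pi / wavelength  # wavenumber
    # Free-space transfer function H(fx, fy)
    H = np.exp(1j * k * z) * np.exp(-1j * np.pi * wavelength * z * (FX**2 + FY**2))

    A0 = np.fft.fft2(u0)    # angular spectrum of the input field
    A = A0 * H              # angular spectrum after free-space transfer
    return np.fft.ifft2(A)  # field on the output plane

# Hypothetical example: a small square aperture at roughly 0.4 THz (0.75 mm)
u0 = np.zeros((64, 64), dtype=complex)
u0[28:36, 28:36] = 1.0
u = propagate(u0, wavelength=0.75e-3, z=0.03, pitch=400e-6)
```

Because the transfer function has unit modulus, this sketch conserves the total intensity of the field, which gives a quick sanity check on any implementation.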

Here, $\alpha$ is the extinction efficiency, $n$ is the real part of the refractive index of the 3D material, $n_0$ is the real part of the refractive index of the vacuum, $k$ is the imaginary part of the refractive index (not to be confused with the wavenumber $k = 2\pi/\lambda$ in the transfer function), $\lambda$ is the wavelength, $\Delta d$ is the height of the material map, and $\phi$ is the phase difference. If we choose a transparent material ($k \approx 0$) and ignore the optical losses, the transmission coefficient of a neuron consists of a phase term only; if we select a non-transparent material, the transmission coefficient of a neuron consists of both amplitude and phase in the MWDN architecture.
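The two material regimes can be illustrated with a short sketch. It assumes that the transmission coefficient takes the form $(1 - \alpha\,\Delta d)\exp(j\phi)$ with $\phi = 2\pi(n - n_0)\Delta d/\lambda$ and $\alpha = 4\pi k/\lambda$, which is a reading of the garbled source equation rather than a confirmed formula. The function name `transmission`, the parameter name `k_ext` and the index value 1.7 are hypothetical.

```python
import numpy as np

def transmission(delta_d, wavelength, n, n0=1.0, k_ext=0.0):
    """Complex transmission coefficient of one neuron (assumed form).

    delta_d    : material height in metres (scalar or array)
    wavelength : wavelength in metres
    n, n0      : real refractive index of the material and of the vacuum
    k_ext      : imaginary part of the refractive index (0 for transparent material)
    """
    phi = 2.0 * np.pi * (n - n0) * delta_d / wavelength  # phase difference
    alpha = 4.0 * np.pi * k_ext / wavelength              # extinction term
    return (1.0 - alpha * delta_d) * np.exp(1j * phi)

wavelength = 0.75e-3                   # roughly 0.4 THz, illustrative
heights = np.array([0.0, 0.5e-3])      # hypothetical heights in metres

# Transparent material: phase-only modulation, unit amplitude
t_phase_only = transmission(heights, wavelength, n=1.7)

# Lossy material: both amplitude and phase are modulated
t_complex = transmission(heights, wavelength, n=1.7, k_ext=0.01)
```

With `k_ext=0` every coefficient has unit modulus, so only the height controls the neuron; with a non-zero `k_ext` the same height also attenuates the wave.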

Depending on the size of the input data, an effective and flexible linear interpolation algorithm is used to fit the input to the diffractive input layer. The interconnection rate between adjacent layers depends on the layer distance and the diffraction angle, and approaches the critical value (1.0).
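The input-fitting step can be sketched as follows. This is a minimal illustration, assuming bilinear interpolation of a 28×28 image onto a 200×200 input layer and a phase encoding in which pixel values in [0, 1] map to phases in [0, 2π]. The function names `resize_bilinear` and `encode_phase` and the exact mapping are assumptions; the source does not specify them.

```python
import numpy as np

def resize_bilinear(img, out_size):
    """Resize a 2D array to (out_size, out_size) by bilinear interpolation."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, out_size)   # sample rows in input coordinates
    xs = np.linspace(0, w - 1, out_size)   # sample columns in input coordinates
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]                # vertical interpolation weights
    wx = (xs - x0)[None, :]                # horizontal interpolation weights
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bottom = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bottom * wy

def encode_phase(img):
    """Encode pixel values in [0, 1] as a unit-amplitude field with phase in [0, 2*pi]."""
    return np.exp(1j * 2.0 * np.pi * img)

# Hypothetical 28x28 test image (a linear ramp) fitted to a 200x200 input layer
image = np.arange(28 * 28, dtype=float).reshape(28, 28) / (28 * 28 - 1)
layer_input = encode_phase(resize_bilinear(image, 200))
```

The resulting complex array can be used directly as the input field of the first diffractive layer.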

Furthermore, the number of network layers and the axial distance are also tunable. The output layer is divided into ten regions corresponding to the ten classes, and the summed light intensity is detected in each region of the wave plane. The mean square error (MSE) against the target is used to train the MWDN parameters. We aim to minimize a loss function that increases the wave intensity in the target region and decreases it in the other regions. The training batch size is set to 10 for the classifier.
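The read-out described above can be sketched in a few lines. The example assumes a 200×200 output plane with ten 20×20 detector regions arranged in two rows of five; this layout, the normalisation of the summed intensities and the names `make_regions`, `region_intensities` and `mse_loss` are illustrative assumptions.

```python
import numpy as np

def make_regions(size=200, box=20):
    """Return ten (row0, row1, col0, col1) detector regions in a 2 x 5 layout."""
    rows = [size // 3 - box // 2, 2 * size // 3 - box // 2]
    cols = [int((i + 1) * size / 6) - box // 2 for i in range(5)]
    return [(r, r + box, c, c + box) for r in rows for c in cols]

def region_intensities(field, regions):
    """Sum the light intensity |field|^2 inside each detector region."""
    intensity = np.abs(field) ** 2
    return np.array([intensity[r0:r1, c0:c1].sum() for r0, r1, c0, c1 in regions])

def mse_loss(outputs, targets):
    """Mean square error between normalised region intensities and the target."""
    return np.mean((outputs - targets) ** 2)

regions = make_regions()

# Hypothetical output field with all light falling into the region of class 3
field = np.zeros((200, 200), dtype=complex)
r0, r1, c0, c1 = regions[3]
field[r0:r1, c0:c1] = 1.0

scores = region_intensities(field, regions)
scores = scores / scores.sum()          # normalise so the scores sum to 1
target = np.zeros(10)
target[3] = 1.0                         # one-hot label for class 3
predicted_class = int(np.argmax(scores))
loss = mse_loss(scores, target)
```

Minimizing this loss pushes intensity into the region of the correct class and away from the other nine regions.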


# III. The Backward of Multi-Wavelength Diffractive Network

To train MWDN, we use the back-propagation algorithm with the Adam optimization method.

We focus on the intensity of the wave and define the loss function as the MSE between the output and the target:

$$E(d) = \frac{1}{K}\sum_{k=1}^{K}\left(o_k - t_k\right)^2$$

where $K$ is the number of training data, $o_k$ is the output of the MWDN, and $t_k$ is the label of the corresponding input. The optimization problem can be written as follows:

$$\min_{d_i^l} E(d_i^l)$$

where $l$ is the layer index and $i$ is the location within the $l$th layer. The gradient of the loss with respect to all parameters can be calculated and is used to update the MWDN architecture parameters during training. Each batch of training data is fed into the MWDN, and the gradient of each layer is calculated to update that layer.
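The parameter update can be illustrated with a minimal Adam sketch. The toy model below treats the heights `d` themselves as the network output, so the loss is simply the MSE between `d` and a target vector; this stand-in, the function names `adam_step` and `loss_and_grad`, and all numerical values are assumptions made for illustration and do not reproduce the diffractive forward model.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1.0 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)              # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

def loss_and_grad(d, target):
    """Toy MSE loss E(d) = mean((d - target)^2) and its gradient with respect to d."""
    err = d - target
    return np.mean(err ** 2), 2.0 * err / err.size

target = np.array([0.2, 0.5, 0.8])   # hypothetical target values
d = np.zeros_like(target)            # trainable parameters (stand-in for heights)
m = np.zeros_like(d)
v = np.zeros_like(d)

initial_loss, _ = loss_and_grad(d, target)
for t in range(1, 2001):
    _, grad = loss_and_grad(d, target)
    d, m, v = adam_step(d, grad, m, v, t)
final_loss, _ = loss_and_grad(d, target)
```

In the full network the gradient would come from back-propagating the MSE through every diffractive layer instead of from this closed-form expression.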


# IV. The Backward of Multi-Wavelength Diffractive Network

The optical diffractive network and the deep neural network are markedly different. The function of the optical diffractive network is determined by the wavelength and by the parameters of the optical grating (height and complex refractive index). The multi-wavelength diffractive network therefore has a broad range of requirements that differ from those of a conventional network.

Different wavelengths have different effects, so we assign a different wavelength to each task. Meanwhile, the network needs to ensure that the wavelengths do not affect each other. By setting one wavelength as the baseband and the other as the carrier, the diffractive network adjusts each optical plane wave independently. The algorithm can be regarded as an efficient carrier algorithm. The ratio of the baseband and carrier wavelengths is 1:30, so the short wavelength has little influence on the long wavelength and vice versa. If the phase difference of the long wavelength is $\phi_1$ and the phase difference of the short wavelength is $\phi_2$, the corresponding relationship is as follows:

$$\frac{\phi_2}{\phi_1} = 1:30, \qquad \phi_2 \ll \phi_1$$

So, the equations can be written as follows:

$$\phi_1 = 2\pi n + \phi_1', \quad n = 0, 1, 2, \ldots$$

$$\phi_2 = \frac{\pi n}{15} + \frac{\phi_1'}{30}, \qquad \phi_1' = 0 \sim 2\pi$$

The second term of $\phi_2$ can be ignored relative to the first term, so the equation becomes:

$$\phi_2 \approx \frac{\pi n}{15}$$

The multi-wavelength diffractive network can be effective, and more powerful than a deep neural network. The phase difference $\phi_i$ ($i = 1, 2$) can be obtained easily by adjusting the height on the diffractive network. Because $\phi_2 \ll \phi_1$, we adjust $\Delta d$ with a small learning rate for $\phi_2$ and with a large learning rate for $\phi_1$, without one impacting the other.
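The decoupling of the two phases can be checked numerically. The sketch below assumes the phase relation $\phi = 2\pi\,\Delta n\,\Delta d/\lambda$ and two wavelengths in the ratio 1:30. Descriptive names (`lam_short`, `lam_long`) replace the indices 1 and 2 on purpose, and the index contrast 0.7 and the residual phase 0.3 rad are hypothetical values.

```python
import numpy as np

def phase(delta_d, wavelength, delta_n):
    """Phase difference acquired over a height delta_d (assumed relation)."""
    return 2.0 * np.pi * delta_n * delta_d / wavelength

lam_short = 1.0                 # arbitrary length unit
lam_long = 30.0 * lam_short     # wavelength ratio 1:30
delta_n = 0.7                   # hypothetical refractive-index contrast
residual = 0.3                  # hypothetical residual phase in [0, 2*pi)

# Heights that give the short wavelength a phase of 2*pi*n + residual
wraps = np.arange(5)
delta_d = (2.0 * np.pi * wraps + residual) * lam_short / (2.0 * np.pi * delta_n)

phi_short = phase(delta_d, lam_short, delta_n)  # 2*pi*n + residual
phi_long = phase(delta_d, lam_long, delta_n)    # pi*n/15 + residual/30
```

Each additional 2π wrap leaves the short-wavelength phase unchanged modulo 2π while shifting the long-wavelength phase by π/15, which is what allows one height map to serve both wavelengths.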

# V. Experiment

In this work, we apply the proposed MWDN to two different datasets, MNIST and MNIST-Fashion.


# a) Model setup

Compared with state-of-the-art methods in terms of accuracy and speed, the proposed method achieves better performance on MNIST and MNIST-Fashion. The size of the network is set to 200×200, 500×500 and 1500×1500, each with a trainable height map. There are two types of optical network: one for phase modulation and the other for complex modulation. The MNIST and MNIST-Fashion tasks use different optical wavelengths, and the input is modulated by an optical grating mask.

Using backward propagation, the model is trained on the two task datasets alternately, which validates its effectiveness. We train the network with a different learning rate for each task, which helps to overcome the problem of local optima. All the parameters of the network are adjusted by the gradient descent algorithm to minimize the error.


# b) Dataset
We evaluate the approach on two datasets, whose information is fed into the network as the phase input of the neurons. The two datasets have different data distributions, which makes them difficult to classify in the same network. Conventional networks require the input data to be independent and identically distributed. The task here is to handle data from two different distributions in the same network.


# c) Experimental analysis

For better performance, we set a different learning rate and a different signal frequency for each of the two datasets. The maximum half-cone diffraction angle is formulated as follows:

The light wavelengths correspond to 0.4 THz and 14.4 THz for MNIST and MNIST-Fashion, respectively. The neuron size is set to 200 μm. The height of the map and the axial distance between two successive layers are trainable. Table 1 compares the performance of the MWDN and DN methods on a single task. MWDN clearly improves the accuracy by 1.15% and 3.2% on the two datasets, respectively. To evaluate the multi-wavelength scheme for multi-task learning, we compare the multi-task network with single-task networks in Table 2. The multi-task diffractive network achieves performance consistent with a single task using the same parameters. Setting 1 uses the same wavelength for both tasks for comparison, and Setting 2 uses different wavelengths. DN-Fashion and DN-MNIST are evaluated with independent diffractive networks. We find that both classification tasks can be implemented in the same network with the MWDN algorithm. Compared with other approaches that use only a single dataset as input, our approach even yields a boost.


# VI. Conclusion

In this paper, we propose a novel multi-task optical network named the multi-wavelength diffractive network (MWDN). Based on plane-wave propagation, our method achieves accuracy comparable to that of single-task networks. We successfully apply MWDN to multiple tasks with different data distributions and provide a multi-wavelength method for different model sizes. In the future, we aim to develop a more effective network to handle complex tasks and reach better performance.
![Fig. 1: The architecture of Multi-wavelength diffractive network](image-2.png "Fig. 1")
![Equation: A_0(f_x, f_y) = FFT(U_0(x, y))](image-3.png "")


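Table 1

| Method | MNIST | λ (THz) | MNIST-Fashion | λ (THz) |
|---|---|---|---|---|
| MWDN (PCM) | 92.85% | 0.4 | 84.33% | 14.4 |
| DN-MNIST | 91.75% | 0.4 | 81.13% | 0.4 |

Table 2

| Method | MNIST | MNIST-Fashion | λ1 (THz) | λ2 (THz) |
|---|---|---|---|---|
| Setting 1 | 23.45% | 12.12% | 0.4 | 0.4 |
| Setting 2 | 90.45% | 76.67% | 14.4 | 0.144 |
| DN-Fashion | / | 78.70% | 14.4 | / |
| DN-MNIST | 91.75% | / | / | 14.4 |

# d) Convergence analysis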
© 2020 Global Journals. Multi-Task Learning by Multi-Wave Optical Diffractive Network
		
		
## Acknowledgements

We acknowledge the financial support of the National Key R&D Program of China (Grant No. 2017YFE0112000). The authors wish to thank Professor Du. Furthermore, the authors would like to express sincere thanks for the server support provided by Light-ca Technology Corporation.

			
* [1] J. L. Jewell. Regenerative pulsations from an intrinsic bistable optical device. Applied Physics Letters, 40(4), 291.
* [2] J. Kraitl, H. Ewald, H. Gehring. An optical device to measure blood components by a photoplethysmographic method. Journal of Optics A: Pure & Applied Optics, 7(6).
* [3] Y. Liu, X. Lin, C. Wei, C. Zhang, Y. S. Zhao. 3D-printed optical-electronic integrated devices. Science China Chemistry, 2019.
* [4] S. Kim, S. G. Menabde, V. W. Brar, M. S. Jang. Functional Mid-infrared Polaritonics in van der Waals Crystals. Advanced Optical Materials, 1901194, 2019.
* [5] M. Spencer, J. Eickholt, J. Cheng. A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction. IEEE/ACM Transactions on Computational Biology & Bioinformatics, 99(1), 2014.
* [6] X. Zhou, R. Takayama, S. Wang, T. Hara, H. Fujita. Deep learning of the sectional appearances of 3D CT images for anatomical structure segmentation based on an FCN voting method. Medical Physics, 44(10), 5221, 2017.
* [7] C. Che, R. Lin, X. Zeng, E. Karim, G. John, M. Xu. Improved deep learning-based macromolecules structure classification from electron cryo-tomograms. Machine Vision & Applications.
* [8] S. Xie, R. Girshick, P. Dollár, Z. Tu, K. He. Aggregated Residual Transformations for Deep Neural Networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
* [9] D. Chao, C. L. Chen, K. He, X. Tang. Learning a Deep Convolutional Network for Image Super-Resolution.
* [10] D. Li. Deep learning: from speech recognition to language and multimodal processing. APSIPA Transactions on Signal & Information Processing, 5.
* [11] S. P. Singh, A. Kumar, H. Darbari, L. Singh, S. Jain. Machine translation using deep learning: An overview.
* [12] T. O. Kwok, Y. K. Kwok. On the design, control, and use of a reconfigurable heterogeneous multicore system-on-a-chip.
* [13] T. F. Cianciulli, J. A. Lax, F. E. Cerruti, G. E. Gigena, H. J. Redruello, M. A. Orsi, J. A. Gagliardi, A. N. Dorelle, M. A. Riccitelli, H. A. Prezioso. Complementary value of transthoracic echocardiography & cinefluoroscopic evaluation of mechanical heart prosthetic valves. Echocardiography, 21(2), 2010.
* [14] A. F. Vincent, J. Larroque, W. Zhao, N. B. Romdhane, D. Querlioz. Spin-Transfer Torque Magnetic Memory As a Stochastic Memristive Synapse.
* [15] A. N. Tait, M. A. Nahmias, T. Yue, B. J. Shastri, P. R. Photonic Neuromorphic Signal Processing and Computing. Springer, Berlin Heidelberg, 2013.
* [16] C. Ríos, M. Stegmaier, P. Hosseini, D. Wang, T. Scherer, C. D. Wright, H. Bhaskaran, W. H. P. Pernice. Integrated all-photonic non-volatile multilevel memory. Nature Photonics.
* [17] R. A. Hut, N. Mrosovsky, S. Daan. Nonphotic Entrainment in a Diurnal Mammal, the European Ground Squirrel (Spermophilus citellus). Journal of Biological Rhythms, 14(5).
* [18] K. Voutsas, G. Langner, J. Adamy, M. Ochse. A brain-like neural network for periodicity analysis. IEEE Transactions on Systems, Man & Cybernetics, Part B: Cybernetics, 35(1).
* [19] M. Dülk, S. Fischer, M. Bitter, M. Caraccia, W. Vogt, E. Gini, H. Melchior, W. Hunziker, A. Buxens, H. N. Poulsen. Ultrafast all-optical demultiplexer based on monolithic Mach-Zehnder interferometer with integrated semiconductor optical amplifiers. Optical & Quantum Electronics, 33(7-10).
* [20] X. Lin, R. Yair, Y. N. T., V. Muhammed, L. Yi, J. Mona, O. Aydogan. All-optical machine learning using diffractive deep neural networks. Science, 8084.
* [21] A. D. Campo, C. Greiner. SU-8: a photoresist for high-aspect-ratio and 3D submicron lithography. Journal of Micromechanics & Microengineering, 17(6).
* [22] Y. Tian, P. Luo, X. Wang, X. Tang. Pedestrian Detection aided by Deep Learning Semantic Tasks.
* [23] L. Deng. The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web]. IEEE Signal Processing Magazine, 29(6).
* [24] C. F. V. Loan. Computational Frameworks for the Fast Fourier Transform. 2011.