# Introduction

Storing and retrieving data units on the von Neumann architecture is far more time-consuming and power-hungry than on optical devices [1][2][3][4]. Unlike modern computers, such devices integrate data computation, storage, and fetching, which makes them more efficient, lower in power, larger in storage capacity, and more highly integrated [5][6][7]. Artificial neural networks [8], which resemble the way humans and animals store and process data, are successful in a wide range of tasks such as image analysis [9], speech recognition [10], and language translation [11]. With increasing data volume, problem complexity, and structural depth, artificial neural networks can reach performance comparable or even superior to that of humans. However, most of these tasks cannot be migrated well to smart portable devices because of their complexity and power consumption. Lower power, higher efficiency, and faster speed are becoming increasingly critical for deep learning on embedded devices.

Neuromorphic computing seeks brain-like processing that overcomes the limitations of conventional computers. IBM [12] built TrueNorth, a fully functional digital chip with 5.4 billion transistors and 4096 neurosynaptic cores. To approach the extreme complexity of the human cerebral cortex, M. Prezioso et al. [13] combined complementary metal-oxide-semiconductor (CMOS) circuits with two-terminal resistive devices. Spin-transfer torque magnetic memory (STT-MRAM) [14], with its non-volatility, high speed, and high endurance, is suitable as a stochastic memristive device, considering the functional implications of synaptic plasticity. Inspired by spiking neural networks, Alexander N. Tait [15] integrated laser devices to explore highly interactive information processing at the speeds offered by optical-electronic systems. This approach promises to incorporate photonic spike processing in the training system.
In addition, Carlos Ríos et al. [16] dramatically improved storage capacity by implementing an all-photonic non-volatile multilevel memory. Electronic-photonic interconnect technologies bring not only opportunities but also challenges to unconventional circuits and systems. To avoid the losses of optical-electrical conversion and coupling, all-photonic devices can operate with high computational speed and low power. On-chip non-volatile photonic devices [17] would dramatically improve the performance of existing brain-like neural networks [18] by eliminating electronic latency and reducing electrical power consumption. The on-chip optical architecture provides computational elements for the network protocol and a waveguide medium for communication among high-performance spiking neurons. A fully optical network architecture built on Mach-Zehnder interferometers [19] promises to reduce the energy cost of data movement. The all-optical diffractive deep neural network (D2NN) architecture [20] uses passive components and optical diffraction. D2NN can be scaled up easily and fabricated by 3D lithography [21] in a power-efficient manner. In general, optical networks have more trainable parameters, because complex-valued modulation provides both the phase and the amplitude of each neuron rather than only the amplitude available in electronic networks.

Unfortunately, building neural networks from optical devices raises some problems. First, an all-optical neural network is designed for a single task, whereas multi-task learning [22] is important in practice. Second, the learning rates of the different tasks matter for accuracy, and balancing the tasks and their learning rates is non-trivial. In this paper, to address these two issues, we exploit the characteristics of light and express different tasks with different wavelengths. One wavelength serves as the baseband and the other as the carrier. The baseband wave can therefore be given a large learning rate and the carrier wave a small one.
Extensive experiments on MNIST and MNIST-Fashion [23] are conducted to investigate the efficacy and properties of the proposed multi-wavelength diffractive network (MWDN). On both tasks, MWDN outperforms the baselines within the same network.

# II. Multi-Wavelength Diffractive Network

In the spatial domain, each wave propagates in-plane, and diffraction is reasoned about at a particular phase and frequency, so that waves travelling in different directions can be analyzed and integrated. The analysis itself operates in the frequency domain. The wave distributions on the observation plane and the aperture plane can be viewed as linear combinations of many monochromatic plane waves propagating in different directions. The amplitude and phase of each plane wave are given by the angular spectrum, which can be obtained by FFT analysis [24]. Plane-wave propagation is a complex task that must take many factors into account, such as direction, phase, and amplitude.

As shown in Fig. 1, we adjust the parameters of the optical grating (its height and complex refractive index). The height is set by 3D printing, and the complex refractive index is changed by laser light of different power, since different power levels induce different refractive indices in phase-change materials. We input images from MNIST and MNIST-Fashion simultaneously; the input optical wavelengths of the MNIST and MNIST-Fashion tasks are $\lambda_1$ and $\lambda_2$, respectively. The diffractive network uses the same optical parameters for both tasks. The bottom of the figure shows the optical carrier. Fig. 2 shows the framework of the diffractive network, where different colors denote different refractive indices. First, we convert the image information into the phase and amplitude of the optical field, which serves as the input of the system. Then, the optical grating is manufactured with different heights by a 3D-printing device. In the following sections, we discuss how MWDN tackles the tasks predominantly using the angular spectrum.
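The plane-wave decomposition described above can be sketched numerically. The following is a minimal NumPy sketch, not the authors' implementation: it encodes an image as a unit-amplitude phase field, obtains its angular spectrum with a 2-D FFT, and propagates each plane-wave component through free space. The Fresnel-form transfer function, the function names `encode_phase` and `fresnel_propagate`, the random 28×28 test image, and the 3 cm propagation distance are assumptions made for illustration; the 14.4 THz frequency and 200 μm neuron size are the values quoted in the experiments.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s


def encode_phase(image):
    """Map pixel values in [0, 1] to a unit-amplitude field with phase in [0, 2*pi]."""
    image = np.asarray(image, dtype=float)
    return np.exp(2j * np.pi * image)


def fresnel_propagate(field, wavelength, pitch, z):
    """Propagate a sampled field over distance z with the angular spectrum method.

    Assumed (Fresnel) transfer function:
        H(fx, fy) = exp(j k z) * exp(-j pi lambda z (fx^2 + fy^2))
    """
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pitch)   # spatial frequencies along x, cycles/m
    fy = np.fft.fftfreq(ny, d=pitch)   # spatial frequencies along y, cycles/m
    FX, FY = np.meshgrid(fx, fy)       # both arrays have shape (ny, nx)
    k = 2.0 * np.pi / wavelength       # wavenumber
    H = np.exp(1j * k * z) * np.exp(-1j * np.pi * wavelength * z * (FX**2 + FY**2))
    A0 = np.fft.fft2(field)            # angular spectrum of the input plane
    return np.fft.ifft2(A0 * H)        # field on the output plane


# Hypothetical input: a random 28x28 "image" standing in for a dataset sample.
rng = np.random.default_rng(0)
image = rng.random((28, 28))

u0 = encode_phase(image)
wavelength = C / 14.4e12               # wavelength corresponding to 14.4 THz
u1 = fresnel_propagate(u0, wavelength, pitch=200e-6, z=0.03)
```

Because the assumed transfer function has unit magnitude, this sketch changes only the phase of each plane-wave component, so the total intensity on the output plane equals that on the input plane.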
The 3D-printed MWDN modulates the amplitude and phase of the wave over the ranges 0~1 and 0~2π, respectively, for both tasks in the same network. For each layer of MWDN, the neuron size ranges from 200 μm to 700 μm and is tunable. Following the Fresnel diffraction equation, we move the optical signal from the spatial domain to the frequency domain. The angular spectrum method for plane waves explains how a wave propagates, and it is the primary method for analyzing diffraction in the frequency domain. Based on the angular spectrum, the free-space transfer function governs free propagation. The wave plane is converted to its angular spectrum by FFT, and the diffractive processing proceeds as follows:

$$A_0(f_x, f_y) = \mathrm{FFT}(U_0(x, y)) = \iint U_0(x, y)\, e^{-j 2\pi (f_x x + f_y y)}\, dx\, dy$$

$$A(f_x, f_y) = A_0(f_x, f_y)\, H(f_x, f_y)$$

$$U(x, y) = \mathrm{IFFT}(A(f_x, f_y))$$

$$H(f_x, f_y) = \exp(jkz)\, \exp\left(-j \pi \lambda z \left(f_x^2 + f_y^2\right)\right)$$

where $f_x$ and $f_y$ are the spatial frequencies corresponding to the $x$ and $y$ locations ($f_x = 1/(x - x_0)$), $x - x_0$ is the gap in the optical map, $N$ and $M$ are the numbers of grooves on the optical grating in the height and width directions, $U_0(x, y)$ is the original field distribution, $U(x, y)$ is the field distribution after free-space transfer, $H(f_x, f_y)$ is the transfer function with wavenumber $k = 2\pi/\lambda$ and propagation distance $z$, $A_0(f_x, f_y)$ is the original angular spectrum, and $A(f_x, f_y)$ is the angular spectrum after free-space transfer. The inverse Fourier transform of the transfer function gives the impulse response function, and the propagation equation can be viewed as a Fourier transform:

$$h(x - x_0, y - y_0) = F^{-1}(H(f_x, f_y)) = \frac{1}{j \lambda z}\, \exp(jkz)\, \exp\left(j \frac{k}{2z} \left((x - x_0)^2 + (y - y_0)^2\right)\right)$$

The output wave plane distribution then propagates through the 3D material, where the field distribution is changed by the refractive index:

$$O(x, y) = U(x, y)\, (1 - \alpha\, \Delta d)\, \exp(j\, \Delta\varphi), \qquad \alpha = \frac{4 \pi \kappa}{\lambda}, \qquad \Delta n = n - n_0, \qquad \Delta\varphi = \frac{2\pi}{\lambda}\, \Delta n\, \Delta d$$

where $\alpha$ is the extinction coefficient, $n$ is the real part of the refractive index of the 3D material, $n_0$ is the real part of the refractive index of the vacuum, $\kappa$ is the imaginary part of the refractive index, $\lambda$ is the wavelength, $\Delta d$ is the height of the material map, and $\Delta\varphi$ is the phase difference. If we choose a transparent material ($\kappa \approx 0$) and ignore the optical losses, the transmission coefficient of a neuron consists of a phase term only; if we select a non-transparent material, the transmission coefficient of a neuron in the MWDN architecture consists of both amplitude and phase.

Depending on the size of the input data, a flexible linear interpolation algorithm is used to fit the input to the diffractive input layer. The interconnection rate between adjacent layers depends on the layer distance and the diffraction angle, and approaches the critical value (1.0). The number of network layers and the axial distance are also tunable. The output layer is divided into ten regions corresponding to the ten classes, and the summed light intensity is detected in each region of the wave plane. The mean square error (MSE) against the target is used to train the MWDN parameters. We aim to minimize a loss function that increases the wave intensity in the target region and decreases it in the other regions. The training batch size is set to 10 for the classifier.

# Global Journal of Computer Science and Technology Volume XX Issue I Version I

# III. The Backward of Multi-Wavelength Diffractive Network

To train MWDN, we use the back-propagation algorithm with the Adam optimization method. We focus on the intensity of the wave and define the loss function as the MSE between the output and the target:

$$E(d) = \frac{1}{K} \sum_{k=1}^{K} (o_k - t_k)^2$$

where $K$ is the number of training data, $o_k$ is the output of the MWDN, and $t_k$ is the label of the corresponding input. The optimization problem can be written as follows:

$$\min_{d_i^l} E(d_i^l)$$

where $l$ indexes the layer and $i$ indexes the location within layer $l$. The gradient of the loss with respect to all parameters can be calculated and is used to update the MWDN parameters during training. Each batch of training data is fed into MWDN, and the gradient of each layer is calculated to perform the update.

# IV. The Backward of Multi-Wavelength Diffractive Network

The optical diffractive network and the deep neural network are markedly different. The function of the optical diffractive network is determined by the wavelength and the parameters of the optical grating (height and complex refractive index). The multi-wavelength diffractive network has a broad range of requirements that differ from those of a conventional network. Different wavelengths have different effects, so we assign a different wavelength to each task. At the same time, the network needs to ensure that the different wavelengths do not affect each other. By setting one wavelength as the baseband and the other as the carrier, the diffractive network can adjust the two optical plane waves independently. The algorithm can be regarded as an efficient carrier algorithm. The ratio of the baseband and carrier wavelengths is 1:30. The short wavelength has little influence on the long wavelength, and vice versa. If the phase differences at the two wavelengths $\lambda_1$ and $\lambda_2$ are $\varphi_1$ and $\varphi_2$, respectively, the corresponding relationship is as follows:

$$\varphi_1 \lambda_1 = \varphi_2 \lambda_2, \qquad \frac{\lambda_1}{\lambda_2} = 1:30$$

So, the equations can be written as follows, where $n$ here denotes the integer number of full $2\pi$ cycles:

$$\frac{\varphi_2}{\varphi_1} = 1:30, \qquad \varphi_2 \ll \varphi_1$$

$$\varphi_1 = 2 \pi n + \varphi_1', \qquad n = 0, 1, 2, \cdots$$

$$\varphi_2 = \frac{\pi n}{15} + \frac{\varphi_1'}{30}, \qquad \varphi_1' = 0 \sim 2\pi$$

The second term of $\varphi_2$ is small relative to the first and can be ignored, so the equation becomes:

$$\varphi_2 \approx \frac{\pi n}{15}$$

The multi-wavelength diffractive network can therefore be effective and more powerful than a conventional deep neural network. The phase difference $\varphi_i$ ($i = 1, 2$) can be obtained easily by adjusting the height on the diffractive network. Because $\varphi_2 \ll \varphi_1$, we adjust $\Delta d$ with a small learning rate for $\varphi_2$ and a large learning rate for $\varphi_1$, without one affecting the other.

# V. Experiment

In this work, we apply the proposed MWDN to two different datasets, MNIST and MNIST-Fashion.

# a) Model setup

Compared with state-of-the-art methods in terms of accuracy and speed, the proposed method achieves better performance on MNIST and MNIST-Fashion. The size of the network is set to 200×200, 500×500, and 1500×1500, each with a trainable height map. The optical network comes in two types, one with phase-only modulation and the other with complex modulation. The MNIST and MNIST-Fashion tasks use different optical wavelengths, and the input is modulated by an optical grating mask. Using back-propagation, the model is trained on the two task datasets alternately, which validates its effectiveness. We train the network with a different learning rate for each task, which helps to avoid poor local optima. All parameters of the network are adjusted by the gradient descent algorithm to minimize the error.

# b) Dataset

We evaluate the approach on two datasets, and the input information for the neurons is fed into the network in the form of phase. The two datasets have different data distributions, which makes them difficult to classify in the same network. Conventional networks require the input data to be independent and identically distributed. The task here is to handle data from two different distributions in the same network.

# c) Experimental analysis

For better performance, we set a different learning rate and a different signal frequency for each dataset. The maximum half-cone diffraction angle is set by the ratio of the wavelength to the neuron size. The optical frequencies are 0.4 THz and 14.4 THz for MNIST and MNIST-Fashion, respectively. The neuron size is set to 200 μm. The height of the map and the axial distance between two successive layers are trainable. The performance of MWDN and of the single-task DN methods is compared in Table 1. MWDN clearly improves the accuracy, by 1.15% and 3.2% on the two tasks, respectively. To evaluate the multi-wavelength approach for multi-task learning, we compare the multi-task network with the single-task networks in Table 2.
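As an illustration of how the reported frequencies and neuron size translate into diffraction angles, the sketch below assumes the half-cone angle takes the form $\varphi_{max} = \sin^{-1}(\lambda / 2 d_f)$ used for diffractive networks such as [20], where $d_f$ denotes the neuron size. This form, the function names, and the 90° cap for $\lambda \geq 2 d_f$ are assumptions for this example, not the authors' stated formula.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_from_frequency(freq_hz):
    """Convert an optical frequency in Hz to a vacuum wavelength in metres."""
    return C / freq_hz


def max_half_cone_angle_deg(wavelength, neuron_size):
    """Assumed form: phi_max = asin(lambda / (2 * d_f)), returned in degrees.

    When lambda >= 2 * d_f the ratio reaches 1, so the cone is capped at
    90 degrees (light from a neuron spreads over the whole half space).
    """
    ratio = wavelength / (2.0 * neuron_size)
    if ratio >= 1.0:
        return 90.0
    return math.degrees(math.asin(ratio))


NEURON_SIZE = 200e-6  # 200 um, as quoted in the experimental setup

for name, freq in (("MNIST", 0.4e12), ("MNIST-Fashion", 14.4e12)):
    lam = wavelength_from_frequency(freq)
    angle = max_half_cone_angle_deg(lam, NEURON_SIZE)
    print(f"{name}: wavelength = {lam * 1e6:.1f} um, half-cone angle = {angle:.2f} deg")
```

Under these assumptions, the 0.4 THz wave (wavelength of roughly 750 μm) exceeds twice the neuron size and reaches the 90° cap, while the 14.4 THz wave (roughly 21 μm) diffracts into a cone of about 3°, so the two wavelengths couple to the next layer over very different ranges.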
The multi-task diffractive network achieves performance consistent with that of the single-task networks while using the same parameters. Setting 1 uses the same wavelength for both tasks as a comparison, and Setting 2 uses different wavelengths. DN-Fashion and DN-MNIST are each evaluated on an independent diffractive network. We find that both classification tasks can be implemented in the same network with the MWDN algorithm. Compared with other approaches that use only a single dataset as input, our approach even yields a boost.

# VI. Conclusion

In this paper, we propose a novel multi-task optical network named the multi-wavelength diffractive network (MWDN). Based on plane-wave propagation, our method achieves accuracy comparable to that of the single-task network. We successfully apply MWDN to multiple tasks with different data distributions and provide a multi-wavelength method with different model sizes. In the future, we aim to develop a more effective network to handle complex tasks and reach better performance.

![Fig. 1: The architecture of the multi-wavelength diffractive network](image-2.png "Fig. 1; Fig. 2")

![A_0(f_x, f_y) = FFT(U_0(x, y))](image-3.png "")

Table 1

| Method | MNIST | λ (THz) | MNIST-Fashion | λ (THz) |
|---|---|---|---|---|
| MWDN (PCM) | 92.85% | 0.4 | 84.33% | 14.4 |
| DN-MNIST | 91.75% | 0.4 | 81.13% | 0.4 |

Table 2

| Method | MNIST | MNIST-Fashion | λ1 (THz) | λ2 (THz) |
|---|---|---|---|---|
| Setting 1 | 23.45% | 12.12% | 0.4 | 0.4 |
| Setting 2 | 90.45% | 76.67% | 14.4 | 0.144 |
| DN-Fashion | / | 78.70% | 14.4 | / |
| DN-MNIST | 91.75% | / | / | 14.4 |

# d) Convergence analysis

© 2020 Global Journals. Multi-Task Learning by Multi-Wave Optical Diffractive Network

## Acknowledgements

We acknowledge the financial support of the National Key R&D Program of China (Grant No. 2017YFE0112000). The authors wish to thank Professor Du.
Furthermore, the authors would like to express sincere thanks for the server support provided by Light-ca Technology Corporation.

## References

* J. L. Jewell. Regenerative pulsations from an intrinsic bistable optical device. Applied Physics Letters 40(4), 291.
* J. Kraitl, H. Ewald, H. Gehring. An optical device to measure blood components by a photoplethysmographic method. Journal of Optics A: Pure & Applied Optics 7(6).
* Y. Liu, X. Lin, C. Wei, C. Zhang, Y. S. Zhao. 3D-printed optical-electronic integrated devices. Science China Chemistry, 2019.
* S. Kim, S. G. Menabde, V. W. Brar, M. S. Jang. Functional Mid-infrared Polaritonics in van der Waals Crystals. Advanced Optical Materials, 1901194, 2019.
* M. Spencer, J. Eickholt, J. Cheng. A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction. IEEE/ACM Transactions on Computational Biology & Bioinformatics 99(1), 2014.
* X. Zhou, R. Takayama, S. Wang, T. Hara, H. Fujita. Deep learning of the sectional appearances of 3D CT images for anatomical structure segmentation based on an FCN voting method. Medical Physics 44(10), 5221, 2017.
* C. Che, R. Lin, X. Zeng, E. Karim, G. John, M. Xu. Improved deep learning-based macromolecules structure classification from electron cryo-tomograms. Machine Vision & Applications.
* X. S., R. Girshick, P. Dollár, Z. Tu, K. He. Aggregated Residual Transformations for Deep Neural Networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
* D. Chao, C. L. Chen, K. He, X. Tang. Learning a Deep Convolutional Network for Image Super-Resolution.
* D. Li. Deep learning: from speech recognition to language and multimodal processing. APSIPA Transactions on Signal & Information Processing 5.
* S. P. Singh, A. Kumar, H. Darbari, L. Singh, S. Jain. Machine translation using deep learning: An overview.
* T. O. Kwok, Y. K. Kwok. On the design, control, and use of a reconfigurable heterogeneous multicore system-on-a-chip.
* T. F. Cianciulli, J. A. Lax, F. E. Cerruti, G. E. Gigena, H. J. Redruello, M. A. Orsi, J. A. Gagliardi, A. N. Dorelle, M. A. Riccitelli, H. A. Prezioso. Complementary Value of Transthoracic Echocardiography & Cinefluoroscopic Evaluation of Mechanical Heart Prosthetic Valves. Echocardiography 21(2), 2010.
* A. F. Vincent, J. Larroque, W. Zhao, N. B. Romdhane, D. Querlioz. Spin-Transfer Torque Magnetic Memory As a Stochastic Memristive Synapse.
* A. N. Tait, M. A. Nahmias, T. Yue, B. J. Shastri, P. R. Photonic Neuromorphic Signal Processing and Computing. Berlin Heidelberg: Springer, 2013.
* C. Ríos, M. Stegmaier, P. Hosseini, D. Wang, T. Scherer, C. D. Wright, H. Bhaskaran, W. H. P. Pernice. Integrated all-photonic non-volatile multilevel memory. Nature Photonics.
* R. A. Hut, N. Mrosovsky, S. Daan. Nonphotic Entrainment in a Diurnal Mammal, the European Ground Squirrel (Spermophilus citellus). Journal of Biological Rhythms 14(5).
* K. Voutsas, G. Langner, J. Adamy, M. Ochse. A brain-like neural network for periodicity analysis. IEEE Transactions on Systems, Man & Cybernetics, Part B: Cybernetics 35(1).
* M. Dülk, S. Fischer, M. Bitter, M. Caraccia, W. Vogt, E. Gini, H. Melchior, W. Hunziker, A. Buxens, H. N. Poulsen. Ultrafast all-optical demultiplexer based on monolithic Mach-Zehnder interferometer with integrated semiconductor optical amplifiers. Optical & Quantum Electronics 33(7-10).
* X. Lin, R. Yair, Y. N. T., V. Muhammed, L. Yi, J. Mona, O. Aydogan. All-optical machine learning using diffractive deep neural networks. Science, 8084.
* A. D. Campo, C. Greiner. SU-8: a photoresist for high-aspect-ratio and 3D submicron lithography. Journal of Micromechanics & Microengineering 17(6).
* Y. Tian, P. Luo, X. Wang, X. Tang. Pedestrian Detection aided by Deep Learning Semantic Tasks.
* L. Deng. The MNIST Database of Handwritten Digit Images for Machine Learning Research. IEEE Signal Processing Magazine 29(6), Best of the Web.
* C. F. V. Loan. Computational Frameworks for the Fast Fourier Transform. 2011.