# Introduction

Alternative splicing (AS) is a regulated process during gene expression that enables the same gene to give rise to splicing isoforms containing different combinations of exons, which in turn leads to different protein products. Furthermore, AS is often tissue dependent. This paper focuses on applying a Deep Neural Network (DNN) to predict splicing outcomes, and compares its performance with two previously trained models: a Bayesian Neural Network(ref) (BNN) and Multinomial Logistic Regression(ref) (MLR).

A key difference the authors introduce in the DNN is that tissue type is treated as an input, whereas in the previous BNN each tissue type was a separate output of the neural network. Moreover, in previous work the splicing code inferred only the direction of change of the percentage of transcripts with an exon spliced in (PSI). This paper instead predicts absolute PSI for each tissue individually, without averaging across tissues, and also predicts the difference in PSI ($\Delta$PSI) between pairs of tissues. Unlike a regular deep neural network, this model is trained on these two prediction tasks simultaneously.

# Model

The dataset consists of 11019 mouse alternative exons profiled from RNA-Seq(ref) data. Five tissue types are available: brain, heart, kidney, liver and testis.

The DNN is fully connected, with multiple layers of non-linear hidden units. The model is defined as follows:

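 $a_v^l = f\left(\sum_{m}^{M^{l-1}}{\theta_{v,m}^{l}a_m^{l-1}}\right)$, where $a_v^l$ is the activation of hidden unit $v$ in layer $l$, obtained by applying $f$ to the weighted sum of the outputs of the previous layer. $\theta_{v,m}^{l}$ are the weights between layers $l-1$ and $l$, and $M^{l-1}$ is the number of units in layer $l-1$.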

 $f_{RELU}(z)=\max(0,z)$. The RELU unit was used for all hidden units except those in the first hidden layer, which uses TANH units.

 $h_k=\frac{\exp(\sum_m{\theta_{k,m}^{last}a_m^{last}})}{\sum_{k'}{\exp(\sum_{m}{\theta_{k',m}^{last}a_m^{last}})}}$ is the softmax function applied at the last layer.
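The three expressions above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the layer sizes, the weight values and the function names (`dense`, `forward`, and so on) are hypothetical, and bias terms are omitted because the formulas above do not show them.

```python
import math

def dense(theta, a_prev):
    """Weighted sum for each unit v: sum_m theta[v][m] * a_prev[m]."""
    return [sum(w * a for w, a in zip(row, a_prev)) for row in theta]

def tanh_layer(theta, a_prev):
    """Hidden layer with TANH units (used for the first hidden layer)."""
    return [math.tanh(z) for z in dense(theta, a_prev)]

def relu_layer(theta, a_prev):
    """Hidden layer with RELU units: f(z) = max(0, z)."""
    return [max(0.0, z) for z in dense(theta, a_prev)]

def softmax(z):
    """Softmax over the last layer's weighted sums."""
    m = max(z)  # subtracting the max leaves the result unchanged but avoids overflow
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, thetas):
    """First layer TANH, intermediate layers RELU, last layer softmax."""
    a = tanh_layer(thetas[0], x)
    for theta in thetas[1:-1]:
        a = relu_layer(theta, a)
    return softmax(dense(thetas[-1], a))

# Toy example with made-up weights (hypothetical values, for illustration only).
x = [0.5, -1.0]
thetas = [
    [[0.1, 0.2], [-0.3, 0.4], [0.5, -0.6]],  # 2 inputs -> 3 TANH units
    [[0.2, -0.1, 0.3], [0.0, 0.4, -0.2]],    # 3 -> 2 RELU units
    [[0.3, -0.3], [0.1, 0.2], [-0.2, 0.5]],  # 2 -> 3 softmax classes
]
h = forward(x, thetas)
print(h)
```

The output `h` is a vector of class probabilities, so its entries are positive and sum to one.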


The cost function to be minimized during training is the cross-entropy $E=-\sum_n\sum_{k=1}^{C}{y_{n,k}\log(h_{n,k})}$, where $n$ denotes the training example, and $k$ indexes $C$ classes.
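As a small worked example of this cost, the sketch below evaluates $E$ for two training examples and $C=3$ classes. The target and prediction values are made up for illustration, and the function name `cross_entropy` is hypothetical.

```python
import math

def cross_entropy(y, h):
    """E = -sum_n sum_k y[n][k] * log(h[n][k])."""
    return -sum(
        y_nk * math.log(h_nk)
        for y_n, h_n in zip(y, h)
        for y_nk, h_nk in zip(y_n, h_n)
        if y_nk > 0.0  # terms with a zero target contribute nothing; skipping them avoids log(0)
    )

# Two examples, three classes (illustrative values only).
y = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # targets
h = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]  # softmax outputs
E = cross_entropy(y, h)
print(E)
```

Here only the predicted probabilities of the target classes enter the sum, so $E = -(\log 0.7 + \log 0.6)$. The cost shrinks toward zero as those probabilities approach one.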

The identities of the two tissues are then appended to the vector of outputs of the first hidden layer, together forming the input to the second hidden layer. Each identity is a 1-of-5 binary variable in this case (demonstrated in Fig. 1). The targets for the first task consist of three classes, labeled low, medium and high (LMH code). The second task describes the $\Delta PSI$ between two tissues for a particular exon; its three classes are decreased inclusion, no change and increased inclusion (DNI code). The LMH and DNI codes are trained jointly, reusing the same hidden representations learned by the model. The DNN was trained with backpropagation, using different learning rates for gradient descent.
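The way the tissue identities enter the network, and the way the two tasks share hidden units, can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: a single shared second hidden layer, one softmax head for each code, an unweighted sum of the two cross-entropy costs, and hypothetical layer sizes, names and random weights.

```python
import math
import random

TISSUES = ["brain", "heart", "kidney", "liver", "testis"]

def one_hot(tissue):
    """1-of-5 binary encoding of a tissue identity."""
    return [1.0 if t == tissue else 0.0 for t in TISSUES]

def dense(theta, a_prev):
    return [sum(w * a for w, a in zip(row, a_prev)) for row in theta]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def init(rows, cols, rng):
    """Small random weights (hypothetical initialization)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def forward(features, tissue_i, tissue_j, params):
    # First hidden layer (TANH) sees only the exon features.
    h1 = [math.tanh(z) for z in dense(params["hidden1"], features)]
    # Both tissue identities are appended to the first hidden layer's outputs.
    joined = h1 + one_hot(tissue_i) + one_hot(tissue_j)
    # Second hidden layer (RELU) is shared by both tasks.
    h2 = [max(0.0, z) for z in dense(params["hidden2"], joined)]
    lmh = softmax(dense(params["lmh"], h2))  # low / medium / high
    dni = softmax(dense(params["dni"], h2))  # decreased / no change / increased
    return lmh, dni

def ce(y, h):
    return -sum(yk * math.log(hk) for yk, hk in zip(y, h) if yk > 0.0)

def joint_cost(lmh, dni, y_lmh, y_dni):
    """Assumed joint objective: unweighted sum of the two task costs."""
    return ce(y_lmh, lmh) + ce(y_dni, dni)

rng = random.Random(0)
n_features, n_h1, n_h2 = 4, 3, 6  # toy sizes, not the paper's
params = {
    "hidden1": init(n_h1, n_features, rng),
    "hidden2": init(n_h2, n_h1 + 2 * len(TISSUES), rng),
    "lmh": init(3, n_h2, rng),
    "dni": init(3, n_h2, rng),
}
lmh, dni = forward([0.2, -0.5, 1.0, 0.0], "brain", "liver", params)
cost = joint_cost(lmh, dni, [0.0, 0.0, 1.0], [1.0, 0.0, 0.0])
print(lmh, dni, cost)
```

Because both heads read the same `h2`, gradients from either task would update the shared weights during backpropagation, which is what allows the hidden representation to be reused across the LMH and DNI codes.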