Grace Tompkins, Tatiana Krikella, Swaleh Hussain
Currently, dealing with incomplete inputs in machine learning usually requires filling in the absent attributes based on the complete, observed data. Two commonly used methods are mean imputation and [math]k[/math]-nearest neighbours ([math]k[/math]-NN) imputation. Other approaches train a separate model, such as a neural network or an extreme learning machine, to predict the missing values. Probabilistic models of incomplete data can also be built depending on the mechanism of missingness (i.e. whether the data is Missing Completely At Random (MCAR), Missing At Random (MAR), or Missing Not At Random (MNAR)), and the resulting model can then be fed into a particular learning model. Previous work using neural networks for missing data includes a paper by Bengio and Gingras, where the authors used recurrent neural networks with feedback into the input units to fill absent attributes solely to minimize the learning criterion. Goodfellow et al. also used neural networks, introducing a multi-prediction deep Boltzmann machine which could perform classification on data with missingness in the inputs.
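To make the two baseline methods concrete, the following is a minimal sketch of mean imputation and [math]k[/math]-NN imputation using NumPy. It is not taken from the paper: the function names `mean_impute` and `knn_impute`, the toy matrix `X`, and the choice to search for neighbours only among fully observed rows (using Euclidean distance on the attributes that are present) are assumptions made for illustration.

```python
import numpy as np


def mean_impute(X):
    """Replace each NaN with the mean of the observed values in its column."""
    X = np.asarray(X, dtype=float)
    col_means = np.nanmean(X, axis=0)  # column means, ignoring NaNs
    return np.where(np.isnan(X), col_means, X)


def knn_impute(X, k=2):
    """Replace each NaN with the average of that attribute over the k nearest
    complete rows. Distances use only the attributes observed in the
    incomplete row (an illustrative assumption). Requires at least one
    fully observed row."""
    X = np.asarray(X, dtype=float)
    complete = X[~np.isnan(X).any(axis=1)]  # rows with no missing values
    out = X.copy()
    for i, row in enumerate(X):
        missing = np.isnan(row)
        if not missing.any():
            continue
        # Euclidean distance on the observed attributes only
        diffs = complete[:, ~missing] - row[~missing]
        dists = np.sqrt((diffs ** 2).sum(axis=1))
        nearest = np.argsort(dists)[:k]
        out[i, missing] = complete[nearest][:, missing].mean(axis=0)
    return out


# Toy data: the last row is missing its second attribute.
X = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
    [10.0, 20.0],
    [2.5, np.nan],
])

print(mean_impute(X)[4])      # [2.5 8. ]
print(knn_impute(X, k=2)[4])  # [2.5 5. ]
```

On this toy data, mean imputation is pulled towards the outlying row and fills in 8.0, whereas [math]k[/math]-NN imputation uses only the two closest complete rows and fills in 5.0. Both methods commit to a single filled-in value before the learning model sees the data.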
Layer for Processing Missing Data
Yoshua Bengio and Francois Gingras. Recurrent neural networks for missing or asynchronous data. In Advances in Neural Information Processing Systems, pages 395–401, 1996.
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.