Revision as of 08:22, 15 March 2018
XGBoost: A Scalable Tree Boosting System
Presented by
Jiang, Cong
Song, Ziwei
Ye, Zhaoshan
Zhang, Wenling
Introduction
In recent years, XGBoost has been one of the most popular tools in Kaggle competitions, as well as in other machine learning and data mining challenges. The main reason for its success is that this tree boosting system is highly scalable, so it can be applied to a wide variety of problems, including those with very large datasets.
This paper was written by XGBoost's creator, Tianqi Chen, together with Carlos Guestrin. It gives an overview of how algorithmic optimizations and several important systems techniques were combined to develop XGBoost. The paper presents this in the following manner:
1. Building on the gradient tree boosting algorithm, and using shrinkage and feature subsampling to prevent overfitting, Chen introduces an end-to-end tree boosting system.
2.
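To make the ideas in item 1 concrete, the following is a minimal sketch of gradient boosting with shrinkage and feature (column) subsampling. It is not XGBoost's implementation: it assumes squared-error loss, uses depth-1 trees (stumps), and the names `boost`, `eta`, and `colsample` are chosen here for illustration only.

```python
import random


def fit_stump(X, residuals, features):
    """Return the best (sse, feature, threshold, left_mean, right_mean) split,
    searching only the given feature indices, or None if no split is valid."""
    best = None
    for j in features:
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # a split must put at least one point on each side
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best


def predict_stump(stump, row):
    _, j, t, lv, rv = stump
    return lv if row[j] <= t else rv


def boost(X, y, n_rounds=30, eta=0.1, colsample=0.5, seed=0):
    """Fit an additive model of stumps; returns (base, eta, stumps)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    k = max(1, int(colsample * n_features))  # number of features tried per round
    base = sum(y) / len(y)                   # constant initial prediction
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss, the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        # Feature subsampling: each round sees a random subset of columns.
        features = rng.sample(range(n_features), k)
        stump = fit_stump(X, residuals, features)
        if stump is None:
            continue
        stumps.append(stump)
        # Shrinkage: add only a fraction eta of the new tree's prediction.
        preds = [p + eta * predict_stump(stump, row) for p, row in zip(preds, X)]
    return base, eta, stumps


def predict(model, row):
    base, eta, stumps = model
    return base + eta * sum(predict_stump(s, row) for s in stumps)


def mse(y, preds):
    return sum((a - b) ** 2 for a, b in zip(y, preds)) / len(y)


if __name__ == "__main__":
    X = [[float(i), float((i * 7) % 5)] for i in range(20)]
    y = [2.0 * i for i in range(20)]
    model = boost(X, y)
    print("training MSE:", mse(y, [predict(model, row) for row in X]))
```

A small `eta` means each tree corrects only part of the current error, leaving room for later trees, while `colsample` limits which features a given tree may split on. Both act as regularizers in this sketch.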
Tree Boosting
Gradient Tree Boosting
Split Finding Algorithms
System Design
Related Works
End to End Evaluations
Results
Conclusion
Criticisms
Source
**Sample format
Recurrent neural networks are a variation of deep neural networks that are capable of storing information about previous hidden states in special memory layers.<ref name=lstm> Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780. </ref> Unlike feedforward neural networks, which take a single fixed-length vector as input and produce a fixed-length vector as output, recurrent neural networks can take a sequence of fixed-length vectors as input, because they store information and maintain a connection between inputs through this memory layer. By comparison, previous inputs have no impact on the current output of a feedforward neural network, whereas they can affect the current output of a recurrent neural network. (This paper used the LSTM formulation from Graves<ref name=grave> Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013). </ref>)
Where [math]\displaystyle{ \,S }[/math] is the base/source sentence, [math]\displaystyle{ \,T }[/math] is the paired translated sentence and [math]\displaystyle{ \,T_r }[/math] is the total training set. The objective is to maximize the log probability of a correct translation [math]\displaystyle{ \,T }[/math] given the base/source sentence [math]\displaystyle{ \,S }[/math] over the entire training set. Once training is complete, translations are produced by finding the most likely translation according to the LSTM:
- [math]\displaystyle{ \hat{T} = \underset{T}{\operatorname{arg\ max}}\ p(T|S) }[/math]
It has been shown that Long Short-Term Memory recurrent neural networks have the ability to generate both discrete and real-valued sequences with complex, long-range structure using next-step prediction <ref name=grave>
Reference
</ref>.