statwiki - User contributions: Z62zhu
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-21T03:14:41Z
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Problem & Motivation ==<br />
<br />
Approaches to solving optical flow problems, albeit widely successful, have mostly been the result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised approaches lies in acquiring the groundtruth flow: provable measurements of the target variable for the training and testing datasets. Directly obtaining the motion-field groundtruth from real-life videos is not possible, so synthetic data is often used instead.<br />
<br />
The paper "Back to Basics: Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness" by Yu et al. presents an unsupervised approach to address the groundtruth acquisition challenges of optical flow, combining the standard Flownet architecture with a spatial transformer component to devise a "self-supervising" loss function.<br />
<br />
== Optical Flow ==<br />
<br />
[https://en.wikipedia.org/wiki/Optical_flow Optical flow] is the apparent motion of image brightness patterns on objects, surfaces and edges in videos. In layman's terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera. Most optical flow methods rest on two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world considerations. Firstly, the time between two consecutive frames of a video is so minuscule that it is extremely improbable for the intensity of a pixel to change completely, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies spatial coherence: the image motion of objects changes gradually over space and time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a video frame (a 2D image) with a pixel at position <math> (x,y) </math> at some time <math> t </math>; in a later frame, at time <math> t + \Delta t </math>, the pixel is at position <math>(x + \Delta x, y + \Delta y) </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time <math> t </math> is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t</math>, ignoring the higher order terms.<br />
From the two equations, it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
Dividing through by <math>\Delta t</math> then gives<br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity (displacement over time) or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math>, and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions.<br />
<br />
This can be rewritten as:<br />
:<math>I_xV_x+I_yV_y=-I_t</math><br />
or <br />
:<math>\nabla I^T\cdot\vec{V} = -I_t</math><br />
<br />
where <math> \nabla I </math> is the spatial gradient of the image and <math>\vec{V} = (V_x, V_y)^T</math> is the flow vector.<br />
<br />
Since this is one equation in two unknowns <math>V_x,V_y</math>, the flow cannot be determined from brightness constancy alone; this is known as the aperture problem of optical flow algorithms. To solve the optical flow problem, another set of constraints is required, which is where assumption 2 comes in.<br />
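The derivation above can be checked numerically. The sketch below (an illustration with assumed values, not code from the paper) builds a Gaussian blob that translates by a known flow between two frames, and verifies that finite-difference image gradients approximately satisfy <math>I_xV_x + I_yV_y + I_t = 0</math>; note that a single pixel's equation still cannot determine <math>(V_x, V_y)</math> on its own.

```python
import numpy as np

# Synthetic translating Gaussian blob: frame 2 is frame 1 shifted by (vx, vy).
def gaussian(cx, cy, n=64, sigma=5.0):
    y, x = np.mgrid[0:n, 0:n]
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))

vx, vy = 1.0, 0.5                      # true flow (pixels per frame)
I0 = gaussian(30.0, 30.0)
I1 = gaussian(30.0 + vx, 30.0 + vy)

# Central-difference spatial gradients and forward temporal difference.
Ix = (np.roll(I0, -1, axis=1) - np.roll(I0, 1, axis=1)) / 2.0
Iy = (np.roll(I0, -1, axis=0) - np.roll(I0, 1, axis=0)) / 2.0
It = I1 - I0

# Brightness-constancy residual: near zero everywhere, but it is still only
# one equation per pixel in the two unknowns (vx, vy) -- the aperture problem.
residual = Ix * vx + Iy * vy + It
```

The residual is small (it is not exactly zero because the Taylor expansion drops higher-order terms), yet no single pixel's constraint suffices to recover both flow components.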
<br />
== Traditional Approaches ==<br />
<br />
Traditional approaches to the optical flow problem consisted largely of differential (gradient-based) methods. Horn and Schunck (1981) devised one of the first, and now classical, approaches to optical flow estimation. Without diving deep into the math, Horn and Schunck created constraints based on spatio-temporal derivatives of image brightness. Their estimation addresses the aperture problem by adding a smoothness condition requiring that the optical flow field vary smoothly across the entire image (global motion smoothness). They assume that object motion in a sequence is rigid and approximately constant, and that the pixels in a neighborhood have similar velocities, so the flow changes smoothly over space and time. The challenge with this approach is that on frames with rougher movements, the accuracy of the estimates decreases dramatically.<br />
<br />
Another classical method, Lucas-Kanade, addresses the sensitivity to rough movements in the Horn and Schunck approach by making a local, rather than global, motion smoothness assumption. While the Lucas-Kanade estimation reduced sensitivity to rougher movements, it remains inaccurate on rough frames, as do differential methods of optical flow estimation in general.<br />
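A minimal Lucas-Kanade sketch is shown below (an illustrative toy, not the paper's method): the local-smoothness assumption says <math>(u, v)</math> is constant over a small window, so the per-pixel constraints <math>I_xu + I_yv = -I_t</math> can be stacked and solved by least squares.

```python
import numpy as np

def lucas_kanade(I0, I1, x, y, win=7):
    # Central-difference spatial gradients, forward temporal difference.
    Ix = (np.roll(I0, -1, axis=1) - np.roll(I0, 1, axis=1)) / 2.0
    Iy = (np.roll(I0, -1, axis=0) - np.roll(I0, 1, axis=0)) / 2.0
    It = I1 - I0
    r = win // 2
    sl = (slice(y - r, y + r + 1), slice(x - r, x + r + 1))
    # Stack one brightness-constancy equation per pixel in the window.
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)  # (win*win, 2)
    b = -It[sl].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # least-squares estimate of (u, v) at (x, y)

# Synthetic test: a Gaussian blob translated by (1.0, 0.5) pixels.
yy, xx = np.mgrid[0:64, 0:64]
I0 = np.exp(-((xx - 30.0) ** 2 + (yy - 30.0) ** 2) / 50.0)
I1 = np.exp(-((xx - 31.0) ** 2 + (yy - 30.5) ** 2) / 50.0)
u, v = lucas_kanade(I0, I1, 34, 26)   # point off-center so both gradients act
```

The recovered <math>(u, v)</math> approximates the true shift; accuracy degrades as the displacement grows, which is exactly the sensitivity to rough motion described above.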
<br />
In 2015, FlowNet [Dosovitskiy et al., 2015] was proposed as the first approach to use a deep neural network for end-to-end optical flow estimation.<br />
<br />
== Related Works ==<br />
<br />
=== Spatial Transformer Networks ===<br />
<br />
As Convolutional Neural Networks have become established as the preferred solution for image recognition and computer vision problems, increasing attention has been dedicated to evolving the network architecture to further improve predictive power. One such adaptation is the Spatial Transformer Network, developed by Google DeepMind in 2015. <br />
<br />
Spatial invariance is a desired property of any system that deals with visual tasks; however, the basic CNN is not very robust in the presence of input deformations such as scale/translation/rotation variations, viewpoint variations, shape deformations, etc. The introduction of local pooling layers into CNNs has helped address this issue to some degree, by pooling groups of input cells into simpler cells and thereby removing some of the adverse impact of noise on the input. However, pooling layers are destructive - a standard 2x2 pooling layer discards 75% of the input data, resulting in the loss of exact positional information, which can be very helpful in visual recognition tasks. Also, since pooling layers are predefined and non-adaptive, their inclusion may only be helpful in the presence of small deformations; with large transformations, pooling may provide little to no spatial invariance to the network.<br />
<br />
The Spatial Transformer Network (STN) addresses the spatial invariance issues described above by producing an explicit spatial transformation to carve out the target object. Advantageous properties of the STN are as follows:<br />
<br />
1. Modular - they can easily be implemented anywhere into an existing CNN<br />
<br />
2. Differentiable - they can be trained using backpropagation without modifying the original model<br />
<br />
3. Dynamic - they perform a unique spatial transformation on the feature map for each input sample<br />
<br />
STNs are composed of three primary components:<br />
<br />
1. Localization network: a CNN that outputs the parameters of a spatial transformation<br />
<br />
2. Grid Generator: Generates a sampling grid, where transformations from the localization network are applied to this grid<br />
<br />
3. Sampler: Samples the input feature map according to the transformed grid and a differentiable interpolation function<br />
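The grid-generator and sampler stages can be sketched as follows (an illustration: a real STN learns the transformation parameters <math>\theta</math> with the localization network, whereas here <math>\theta</math> is fixed to the identity).

```python
import numpy as np

def affine_grid(theta, H, W):
    # Normalized target coordinates in [-1, 1], pushed through a 2x3 affine.
    ys, xs = np.mgrid[0:H, 0:W]
    xs = 2 * xs / (W - 1) - 1
    ys = 2 * ys / (H - 1) - 1
    ones = np.ones_like(xs, dtype=float)
    src = theta @ np.stack([xs.ravel(), ys.ravel(), ones.ravel()])  # (2, H*W)
    return src[0].reshape(H, W), src[1].reshape(H, W)

def sample_bilinear(img, sx, sy):
    H, W = img.shape
    # Back to pixel coordinates; clamp so all four neighbours stay in bounds.
    px = np.clip((sx + 1) * (W - 1) / 2, 0, W - 1.001)
    py = np.clip((sy + 1) * (H - 1) / 2, 0, H - 1.001)
    x0, y0 = px.astype(int), py.astype(int)
    fx, fy = px - x0, py - y0
    return ((1 - fx) * (1 - fy) * img[y0, x0] + fx * (1 - fy) * img[y0, x0 + 1]
            + (1 - fx) * fy * img[y0 + 1, x0] + fx * fy * img[y0 + 1, x0 + 1])

img = np.arange(16.0).reshape(4, 4)
theta = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # identity transform
sx, sy = affine_grid(theta, 4, 4)
out = sample_bilinear(img, sx, sy)    # identity theta reproduces the input
```

Because bilinear interpolation is piecewise differentiable in the sampling coordinates, gradients can flow back through the sampler to whatever produced <math>\theta</math> - the property the paper exploits for its warping unit.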
<br />
== The Paper's Approach: UnsupFlownet ==<br />
<br />
=== Architecture ===<br />
UnsupFlownet's architecture uses Flownet Simple as the core component for optical flow estimation. Two consecutive frames are fed into the Flownet component. The estimated flow is then passed to the spatial transformer unit to warp the second frame back towards the first frame. The spatial transformer performs the following pointwise transformation:<br />
<br />
<math><br />
\begin{bmatrix}<br />
x_2 \\<br />
y_2<br />
\end{bmatrix} = \begin{bmatrix}<br />
x_1 + u \\<br />
y_1 + v<br />
\end{bmatrix}<br />
</math>,<br />
<br />
where <math>(x_1, y_1)</math> are the coordinates in the first frame, <math>(x_2, y_2)</math> are the sampling coordinates in the second frame, and <math>(u, v)</math> are the horizontal and vertical components of the estimated flow, respectively.<br />
[[File:arch_unsup.png|thumb|upright=2|center|alt=UnsupFlownet architecture diagram|Figure 1: UnsupFlownet Architecture]]<br />
<br />
Due to the spatial transformer's differentiable nature, the gradients of the losses can be successfully back-propagated to the convnet weights, thereby training the network.<br />
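The warping step can be sketched as backward sampling with bilinear interpolation (an illustration under the conventions above, not the paper's implementation): the warped frame reads the second frame at <math>(x + u, y + v)</math> so it can be compared pixelwise with the first frame.

```python
import numpy as np

def warp(I2, u, v):
    H, W = I2.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    sx = np.clip(xs + u, 0, W - 1.001)  # sampling coordinates in frame 2
    sy = np.clip(ys + v, 0, H - 1.001)
    x0, y0 = sx.astype(int), sy.astype(int)
    fx, fy = sx - x0, sy - y0
    # Differentiable bilinear interpolation of the four neighbouring pixels.
    return ((1 - fx) * (1 - fy) * I2[y0, x0] + fx * (1 - fy) * I2[y0, x0 + 1]
            + (1 - fx) * fy * I2[y0 + 1, x0] + fx * fy * I2[y0 + 1, x0 + 1])

# A blob that moves 2 pixels right between frames; the true flow (u, v) =
# (2, 0) warps frame 2 back into alignment with frame 1.
yy, xx = np.mgrid[0:32, 0:32]
I1 = np.exp(-((xx - 14.0) ** 2 + (yy - 16.0) ** 2) / 18.0)
I2 = np.exp(-((xx - 16.0) ** 2 + (yy - 16.0) ** 2) / 18.0)
warped = warp(I2, np.full((32, 32), 2.0), np.zeros((32, 32)))
```

With the correct flow, `warped` matches the first frame almost exactly, which is precisely what the photometric loss below measures.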
<br />
=== Unsupervised Loss ===<br />
The paper devises the following loss function:<br />
<br />
<center><br />
<math><br />
\mathcal{L}(\mathbf{u}, \mathbf{v}; I_t, I_{t+1}) = l_{photometric}(\mathbf{u}, \mathbf{v}; I_t, I_{t+1}) + \lambda l_{smoothness}(\mathbf{u}, \mathbf{v})<br />
</math><br />
</center><br />
<br />
where <math>(\mathbf{u}, \mathbf{v})</math> is the estimated flow, and <math>I_t(\cdot, \cdot), I_{t+1}(\cdot, \cdot)</math> are the photo-intensity (RGB) functions at frame <math>t</math> and frame <math>t + 1</math>, respectively. The photo-intensity functions accept an <math>(x, y)</math> coordinate and return the RGB values at that pixel.<br />
<br />
==== Photo Constancy ====<br />
<br />
The photometric term is defined as:<br />
<br />
<center><br />
<math><br />
l_{photometric}(\mathbf{u}, \mathbf{v}; I_t, I_{t+1}) = \sum\limits_{i,j}\rho_D\Big(I_t(i, j) - I_{t+1}(i + u_{i,j}, j + v_{i,j})\Big)<br />
</math><br />
</center><br />
<br />
where <math>\rho_D</math> is a robust penalty function of the form <math>(x^2 + \epsilon^2)^{\alpha}</math> with <math>0 < \alpha < 1</math> (a generalized Charbonnier penalty).<br />
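The sub-quadratic growth of this penalty is easy to see numerically; the sketch below uses illustrative values of <math>\alpha</math> and <math>\epsilon</math>, not necessarily the paper's exact settings.

```python
import numpy as np

# Robust penalty rho(x) = (x^2 + eps^2)^alpha, 0 < alpha < 1.
# alpha and eps here are placeholder choices for illustration.
def rho(x, alpha=0.45, eps=1e-3):
    return (np.asarray(x, dtype=float) ** 2 + eps ** 2) ** alpha

# Doubling the error far less than quadruples the penalty, so outliers
# (e.g. occluded or mismatched pixels) do not dominate the loss.
ratio = float(rho(4.0) / rho(2.0))   # sub-quadratic: ratio < 4
```

By contrast, a squared loss would give a ratio of exactly 4, letting a few large errors dominate the sum.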
<br />
This term addresses the first assumption in Section 3. If the estimated flow <math>(\mathbf{u}, \mathbf{v})</math> is correct, then the reconstructed (warped) second frame should very closely resemble the first frame, and the photometric loss will be low. The opposite holds for poor flow estimates.<br />
<br />
<br />
Below is an example of warping. Panels (a) and (b) are the first and second frame respectively, (i) is the estimated flow, and (j) is the second frame warped by (i).<br />
[[File:warped_and_flow.png|thumb|upright=2|center|alt=Warping example|Figure 2: Example of warping]]<br />
<br />
==== Local Smoothness ====<br />
<br />
The smoothness term is defined as:<br />
<br />
<center><br />
<math><br />
l_{smoothness}(\mathbf{u}, \mathbf{v}) = \sum\limits_j^H\sum\limits_i^W \Big(\rho_S(u_{i,j} - u_{i+1, j}) + \rho_S(u_{i,j} - u_{i, j+1}) + \rho_S(v_{i,j} - v_{i+1, j}) + \rho_S(v_{i,j} - v_{i,j+1})\Big)<br />
</math><br />
</center><br />
<br />
This term addresses the second assumption in Section 3. It computes a robust penalty, <math>\rho_S(\cdot)</math>, of the difference between the estimated flow at each pixel and that of its nearest rightward and upward neighbours, penalizing local non-smoothness. The robustness of the penalty function is important: at object boundaries the optical flow is expected to change drastically, so legitimately large differences occur at certain spots in the frame. The penalty still charges for these differences, but avoids generating a large gradient with respect to the flow by levelling off for large inputs.<br />
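Putting the two terms together, the full unsupervised loss can be sketched as below (an illustration: <math>\lambda</math>, <math>\alpha</math>, and <math>\epsilon</math> are placeholder values, and the photometric input is assumed to be an already-warped frame).

```python
import numpy as np

# Robust penalty (generalized Charbonnier form), as in the photometric term.
def rho(x, alpha=0.45, eps=1e-3):
    return (x ** 2 + eps ** 2) ** alpha

def unsup_loss(I1, I2_warped, u, v, lam=1.0):
    # Photometric term: robust penalty on the warped-frame difference.
    photometric = rho(I1 - I2_warped).sum()
    # Smoothness term: robust penalties on neighbouring flow differences.
    smoothness = (rho(np.diff(u, axis=1)).sum() + rho(np.diff(u, axis=0)).sum()
                  + rho(np.diff(v, axis=1)).sum() + rho(np.diff(v, axis=0)).sum())
    return photometric + lam * smoothness

rng = np.random.default_rng(0)
I1 = rng.random((8, 8))
u_smooth, v = np.ones((8, 8)), np.zeros((8, 8))
u_noisy = u_smooth + rng.normal(0.0, 0.5, (8, 8))
loss_smooth = unsup_loss(I1, I1, u_smooth, v)  # perfect warp, smooth flow
loss_noisy = unsup_loss(I1, I1, u_noisy, v)    # same warp, noisy flow
```

With an identical photometric term, the noisy flow field incurs a strictly higher loss through the smoothness term alone, which is the signal that trains the network without any groundtruth flow.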
<br />
[[File:robust.png|thumb|upright=2|center|alt=Robust penalty curve|Figure 3: Example of Robust Loss Penalty]]</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness&diff=40529
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-21T00:26:55Z
<p>Z62zhu: /* Traditional Approaches */</p>
<hr />
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Problem & Motivation ==<br />
<br />
The approaches to solving optimal flow problems, albeit widely successful, has mostly been a result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised learning approaches lies in the groundtruth flow, the process of gathering provable data for the measure of the target variable for the training and testing datasets. Directly obtaining the motion field groundtruth from real life videos is not possible, but instead, synthetic data is often used.<br />
<br />
The paper "Back to Basics: Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness" by Yu et. al. presents an unsupervised approach to address the groundtruth acquisition challenges of optical flow, by making use of the standard Flownet architecture with a spatial transformer component to devise a "self-supervising" loss function.<br />
<br />
== Optical Flow ==<br />
<br />
[https://en.wikipedia.org/wiki/Optical_flow Optical flow] is the apparent motion of image brightness patterns in objects, surfaces and edges in videos. In more laymen terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera. Most optical flows are estimated on the basis of two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world implications. Firstly, the time between two consecutive frames of a video are so minuscule, such that it becomes extremely improbable for the intensity of a pixel to completely change, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies that there is spatial coherence and that the image motion of objects changes gradually over time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a video frame (which is 2D image) with a pixel at position <math> (x,y) </math> at some time t, and in later frame, the pixel is now in position <math>(x + \Delta x, y + \Delta y) </math> at some time <math> t + \Delta t </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time t is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t</math>, ignoring the higher order terms.<br />
From the two equations, it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity (displacement over time) or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math>, and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions.<br />
<br />
This can be rewritten as:<br />
:<math>I_xV_x+I_yV_y=-I_t</math><br />
or <br />
:<math>\nabla I^T\cdot\vec{V} = -I_t</math><br />
<br />
Where <math> \nabla I^T </math> is known as the spatial gradient<br />
<br />
Since this results in one equation with two unknowns <math>V_x,V_y</math>, it results into what is known as the aperture problem of the optical flow algorithms. In order to solve the optical flow problem, another set of constraints are required, which is where assumption 2 can be applied.<br />
<br />
== Traditional Approaches ==<br />
<br />
Traditional approaches to the optical flow problem consisted of many differential (gradient-based) methods. Horn and Schunck, 1981, being one of the first to create an approach for for optical flow estimation, is one of the classical examples for optical flow estimation. Without diving deep into the math, Horn and Schunk created constraints based on spatio-temporal derivatives of image brightness. Their estimation tries to solve the aperture problem by adding a smoothness condition where that the optical flow field varies smoothly through the entire image (a global motion smoothness). They assume that object motion in a sequence will be rigid and approximately constant, that the objects in a pixel’s neighborhood will have similar velocities, and therefore the object changes smoothly over space and time. The challenges with this approach is that on frames with rougher movements, the accuracy of the estimates dramatically decrease.<br />
<br />
Another classical method, Lucas-Kanade, approaches the problem by taking a local motion smoothness assumption. Lucas-Kanade addressed the sensitivity to rough movements in the Horn and Schunk approach by making a local motion smoothness assumption instead of a global motion smoothness. While the Lucas-Kanade estimation reduced sensitivity to rougher movements, it still has inaccuracy in rough frames as a differential method to optical flow estimation.<br />
<br />
In 2015, FlowNet [Dosovitskiy et al., 2015] was proposed as the first approach to use a deep neural network for end-to-end optical flow estimation.<br />
<br />
== Related Works ==<br />
<br />
=== Spatial Transformer Networks ===<br />
<br />
As Convolutional Neural Networks have been established as the preferred solution in image recognition and computer vision problems, increasing attention has been dedicated to evolving the network architecture to further improve predictive power. One such adaptation is the Spatial Transformer Network, developed by Google DeepMind in 2015. <br />
<br />
Spacial invariance is a desired property of any system that deals with visual task, however the basic CNN is not very robust in the presence of input deformations such as scale/translation/rotation variations, viewpoint variations, shape deformations, etc. The introduction of local pooling layers into CNNs have helped address this issue to some degree, by pooling groups of input cells into simpler cells, helping to remove the adverse impact of noise on the input. However, pooling layers are destructive - a standard 2x2 pooling layer discards 75% of the input data, resulting in the loss of exact positional data, which can be very helpful visual recognition tasks. Also, since pooling layers are predefined and non-adaptive, their inclusion may only be helpful in the presence of small deformations; with large transformations, pooling may help provide little to no spatial invariance to the network.<br />
<br />
The Spatial Transformer Network (STN) addresses the spatial invariance issues described above by producing an explicit spatial transformation to carve out the target object. Advantageous properties of the STN are as follows:<br />
<br />
1. Modular - they can easily be implemented anywhere into an existing CNN<br />
<br />
2. Differentiable - they can be trained using backpropagation without modifying the original model<br />
<br />
3. Dynamic - they perform a unique spatial transformation on the feature map for each input sample<br />
<br />
STNs are composed of three primary components:<br />
<br />
1. Localization network: a CNN that outputs the parameters of a spatial transformations<br />
<br />
2. Grid Generator: Generates a sampling grid, where transformations from the localization network are applied to this grid<br />
<br />
3. Sampler: Samples the input feature map according to the transformed grid and a differentiable interpolation function<br />
<br />
== The Paper's Approach: UnsupFlownet ==<br />
<br />
=== Architecture ===<br />
[[File:arch_unsup.png|thumb|upright=2|center|alt=text|alt=text|Figure 1: UnsupFlownet Architecture]]</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness&diff=40406
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-20T20:38:55Z
<p>Z62zhu: /* Traditional Approaches */</p>
<hr />
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Optical Flow ==<br />
<br />
[https://en.wikipedia.org/wiki/Optical_flow Optical flow] is the apparent motion of image brightness patterns in objects, surfaces and edges in videos. In more laymen terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera. Most optical flows are estimated on the basis of two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world implications. Firstly, the time between two consecutive frames of a video are so minuscule, such that it becomes extremely improbable for the intensity of a pixel to completely change, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies that there is spacial coherence and that the image motion of objects changes gradually over time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a video frame (which is 2D image) with a pixel at position <math> (x,y) </math> at some time t, and in later frame, the pixel is now in position <math>(x + \Delta x, y + \Delta) </math> at some time <math> t + \Delta t </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time t is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t</math> ignoring the higher order terms.<br />
From the two equations, it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity (displacement over time) or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math>, and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions.<br />
<br />
This can be rewritten as:<br />
:<math>I_xV_x+I_yV_y=-I_t</math><br />
or <br />
:<math>\nabla I^T\cdot\vec{V} = -I_t</math><br />
<br />
Where <math> \nabla I^T </math> is known as the spatial gradient<br />
<br />
Since this results in one equation with two unknowns <math>V_x,V_y</math>, it results into what is known as the aperture problem of the optical flow algorithms. In order to solve the optical flow problem, another set of constraints are required, which is where assumption 2 can be applied.<br />
<br />
== Traditional Approaches ==<br />
<br />
Traditional approaches to the optical flow problem consisted of many differential (gradient-based) methods. Horn and Schunck, 1981, being one of the first to create an approach for for optical flow estimation, is one of the classical examples for optical flow estimation. Without diving deep into the math, Horn and Schunk created constraints based on spatio-temporal derivatives of image brightness. Their estimation tries to solve the aperture problem by adding a smoothness condition where that the optical flow field varies smoothly through the entire image (a global motion smoothness). They assume that object motion in a sequence will be rigid and approximately constant, that the objects in a pixel’s neighborhood will have similar velocities, and therefore the object changes smoothly over space and time. The challenges with this approach is that on frames with rougher movements, the accuracy of the estimates dramatically decrease.<br />
<br />
Another classical method, Lucas-Kanade, approaches the problem by taking a local motion smoothness assumption. Lucas-Kanade addressed the sensitivity to rough movements in the Horn and Schunk approach by making a local motion smoothness assumption instead of a global motion smoothness. While the Lucas-Kanade estimation reduced sensitivity to rougher movements, it still has inaccuracy in rough frames as a differential method to optical flow estimation.<br />
<br />
<br />
It wasn't until 2015 that FlowNet [Dosovitskiy et al., 2015] was proposed as the first approach to use a deep neural network for end-to-end optical flow estimation.<br />
<br />
== Problem & Motivation ==<br />
<br />
The approaches to solving optimal flow problems, albeit widely successful, has mostly been a result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised learning approaches lies in the groundtruth flow, the process of gathering provable data for the measure of the target variable for the training and testing datasets. Directly obtaining the motion field ground-truth is not possible but instead, segmentation ground-truthing is generally used. Segmentation ground-truth is the classification of all pixels in an image. Since the segmentation ground-truthing isn't always automated, it requires laborious labeling of items in the video, sometimes even requiring manually using a ground-truth labeling software. Then as the training and test datasets become larger in size, the more laborious the segmentation ground-truthing becomes. <br />
<br />
In the case of the KITTI dataset, a collection of images captured from driving cars around a mid-sized city in Germany, accurate segmentation ground truth for the training and testing data is obtained using high-tech laser scanners, as well as a GPS localization device installed onto the top of the cars.<br />
<br />
The paper "Back to Basics: Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness" by Jason J. Yu, Adam W. Harley and Konstantinos G. Derpanis presents an unsupervised approach to address the supervised challenges of optical flow.</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness&diff=40404
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-20T20:09:34Z
<p>Z62zhu: /* Optical Flow */</p>
<hr />
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Optical Flow ==<br />
<br />
[https://en.wikipedia.org/wiki/Optical_flow Optical flow] is the apparent motion of image brightness patterns in objects, surfaces and edges in videos. In more laymen terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera. Most optical flows are estimated on the basis of two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world implications. Firstly, the time between two consecutive frames of a video are so minuscule, such that it becomes extremely improbable for the intensity of a pixel to completely change, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies that there is spacial coherence and that the image motion of objects changes gradually over time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a video frame (which is 2D image) with a pixel at position <math> (x,y) </math> at some time t, and in later frame, the pixel is now in position <math>(x + \Delta x, y + \Delta) </math> at some time <math> t + \Delta t </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time t is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t</math> ignoring the higher order terms.<br />
From the two equations, it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity (displacement over time) or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math>, and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions.<br />
<br />
This can be rewritten as:<br />
:<math>I_xV_x+I_yV_y=-I_t</math><br />
or <br />
:<math>\nabla I^T\cdot\vec{V} = -I_t</math><br />
<br />
Where <math> \nabla I^T </math> is known as the spatial gradient<br />
<br />
Since this results in one equation with two unknowns <math>V_x,V_y</math>, it results into what is known as the aperture problem of the optical flow algorithms. In order to solve the optical flow problem, another set of constraints are required, which is where assumption 2 can be applied.<br />
<br />
== Traditional Approaches ==<br />
<br />
Traditional approaches to the optical flow problem consisted of many differential (gradient-based) methods. Horn and Schunck, 1981, being one of the first to create an approach for for optical flow estimation, is one of the most famous examples. Without going into the math, Horn and Schunk created constraints based on spatio-temporal derivatives of image brightness. Their estimation tries to solve the aperture problem by adding a smoothness condition where that the optical flow field varies smoothly through the entire image. <br />
<br />
Their method is based on first-order derivatives and adds a smoothness condition on the flow vectors to the general brightness-constancy condition. It assumes that object motion in a sequence is rigid and approximately constant, so that a pixel's neighbourhood within an object has similar velocity and changes smoothly over space and time. However, this condition is not very realistic in many cases and yields poor results where the flow field lacks continuity, especially at the boundaries between different objects; poor results are also obtained in sequences containing multiple objects, each with a different motion.<br />
<br />
[http://www.ufjf.br/getcomp/files/2013/03/implementation-and-eval%C3%A7uation-of-differential-optical-flow-methods.pdf Implementation and Evaluation of Differential Optical Flow Methods]<br />
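As an illustration of this classical scheme, here is a minimal NumPy sketch of the Horn–Schunck iteration (a standard Jacobi-style update; the toy example and parameter values are our own, not taken from the linked report). Each step averages the current flow over a pixel's neighbours (the smoothness term) and then pulls it back toward the brightness-constancy constraint:<br />

```python
import numpy as np

def horn_schunck(frame0, frame1, alpha=0.5, n_iters=200):
    """Minimal Horn-Schunck flow estimator: brightness constancy + smoothness."""
    Ix = np.gradient(frame0, axis=1)  # dI/dx
    Iy = np.gradient(frame0, axis=0)  # dI/dy
    It = frame1 - frame0              # dI/dt
    u = np.zeros_like(frame0)         # horizontal flow Vx
    v = np.zeros_like(frame0)         # vertical flow Vy
    for _ in range(n_iters):
        # Neighbourhood average of the flow field (the smoothness term).
        u_bar = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                 + np.roll(u, 1, 1) + np.roll(u, -1, 1)) / 4.0
        v_bar = (np.roll(v, 1, 0) + np.roll(v, -1, 0)
                 + np.roll(v, 1, 1) + np.roll(v, -1, 1)) / 4.0
        # Jacobi update: project the averaged flow toward the
        # brightness-constancy constraint, weighted by alpha.
        common = (Ix * u_bar + Iy * v_bar + It) / (alpha**2 + Ix**2 + Iy**2)
        u = u_bar - Ix * common
        v = v_bar - Iy * common
    return u, v

# Toy example: a sinusoidal pattern shifted 1 pixel to the right.
xx, yy = np.meshgrid(np.arange(64, dtype=float), np.arange(64, dtype=float))
f0 = np.sin(0.3 * xx)
f1 = np.sin(0.3 * (xx - 1))
u, v = horn_schunck(f0, f1)
print(u.mean(), v.mean())  # u.mean() close to 1 (the true shift), v.mean() near 0
```

The single regularization weight <math>\alpha</math> controls the trade-off between fitting the data term and enforcing global smoothness — and, as the excerpt above notes, this global smoothness is exactly what fails at motion boundaries.<br />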
<br />
<br />
It wasn't until 2015 that FlowNet [Dosovitskiy et al., 2015] was proposed as the first approach to use a deep neural network for end-to-end optical flow estimation.<br />
<br />
<br />
</div>
Z62zhu
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity (displacement over time) or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math>, and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions.<br />
<br />
Since this results in one equation with two unknowns <math>V_x,V_y</math>, it results into what is known as the aperture problem of the optical flow algorithms. To find the optical flow another set of equations is needed, given by some additional constraint.<br />
<br />
== Traditional Methods ==<br />
<br />
http://www.ufjf.br/getcomp/files/2013/03/implementation-and-evalçuation-of-differential-optical-flow-methods.<br />
<br />
-under construction-<br />
<br />
== Problem & Motivation ==<br />
<br />
The approaches to solving optimal flow problems, albeit widely successful, has mostly been a result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised learning approaches lies in the groundtruth flow, the process of gathering provable data for the measure of the target variable for the training and testing datasets. Directly obtaining the motion field ground-truth is not possible but instead, segmentation ground-truthing is generally used. Segmentation ground-truth is the classification of all pixels in an image. Since the segmentation ground-truthing isn't always automated, it requires laborious labeling of items in the video, sometimes even requiring manually using a ground-truth labeling software. Then as the training and test datasets become larger in size, the more laborious the segmentation ground-truthing becomes. <br />
<br />
In the case of the KITTI dataset, a collection of images captured from driving cars around a mid-sized city in Germany, accurate segmentation ground truth for the training and testing data is obtained using high-tech laser scanners, as well as a GPS localization device installed onto the top of the cars.<br />
<br />
The paper "Back to Basics: Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness" by Jason J. Yu, Adam W. Harley and Konstantinos G. Derpanis presents an unsupervised approach to address the supervised challenges of optical flow.</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness&diff=40320
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-20T06:58:54Z
<p>Z62zhu: </p>
<hr />
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Optical Flow ==<br />
<br />
[https://en.wikipedia.org/wiki/Optical_flow/ Optical flow] is the apparent motion of image brightness patterns in objects, surfaces and edges in videos. In more laymen terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera, and it does this on the basis of two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world implications. Firstly, the time between two consecutive frames of a video are so minuscule, such that it becomes extremely improbable for the intensity of a pixel to completely change, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies that there is spacial coherence and that the image motion of objects changes gradually over time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a video frame (which is 2D image) with a pixel at position <math> (x,y) </math> at some time t, and in later frame, the pixel is now in position <math>(x + \Delta x, y + \Delta) </math> at some time <math> t + \Delta t </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time t is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t</math> ignoring the higher order terms.<br />
From the two equations, it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity (displacement over time) or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math>, and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions.<br />
<br />
Since this results in one equation with two unknowns <math>V_x,V_y</math>, it results into what is known as the aperture problem of the optical flow algorithms. To find the optical flow another set of equations is needed, given by some additional constraint.<br />
<br />
== Traditional Methods ==<br />
<br />
http://www.ufjf.br/getcomp/files/2013/03/implementation-and-evalçuation-of-differential-optical-flow-methods.<br />
<br />
-under construction-<br />
<br />
== Problem & Motivation ==<br />
<br />
The approaches to solving optimal flow problems, albeit widely successful, has mostly been a result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised learning approaches lies in the groundtruth flow, the process of gathering provable data for the measure of the target variable for the training and testing datasets. Directly obtaining the motion field ground-truth is not possible but instead, segmentation ground-truthing is generally used. Segmentation ground-truth is the classification of all pixels in an image. Since the segmentation ground-truthing isn't always automated, it requires laborious labeling of items in the video, sometimes even requiring manually using a ground-truth labeling software. Then as the training and test datasets become larger in size, the more laborious the segmentation ground-truthing becomes. <br />
<br />
In the case of the KITTI dataset, a collection of images captured from driving cars around a mid-sized city in Germany, accurate segmentation ground truth for the training and testing data is obtained using high-tech laser scanners, as well as a GPS localization device installed onto the top of the cars.</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness&diff=40317
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-20T06:53:45Z
<p>Z62zhu: /* Optical Flow */</p>
<hr />
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Optical Flow ==<br />
<br />
[https://en.wikipedia.org/wiki/Optical_flow/ Optical flow] is the apparent motion of image brightness patterns in objects, surfaces and edges in videos. In more laymen terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera, and it does this on the basis of two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world implications. Firstly, the time between two consecutive frames of a video are so minuscule, such that it becomes extremely improbable for the intensity of a pixel to completely change, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies that there is spacial coherence and that the image motion of objects changes gradually over time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a video frame (which is 2D image) with a pixel at position <math> (x,y) </math> at some time t, and in later frame, the pixel is now in position <math>(x + \Delta x, y + \Delta) </math> at some time <math> t + \Delta t </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time t is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t</math> ignoring the higher order terms.<br />
From the two equations, it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity (displacement over time) or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math>, and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions.<br />
<br />
Since this results in one equation with two unknowns <math>V_x,V_y</math>, it results into what is known as the aperture problem of the optical flow algorithms. To find the optical flow another set of equations is needed, given by some additional constraint.<br />
<br />
== Problem & Motivation ==<br />
<br />
The approaches to solving optimal flow problems, albeit widely successful, has mostly been a result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised learning approaches lies in the groundtruth flow, the process of gathering provable data for the measure of the target variable for the training and testing datasets. Directly obtaining the motion field ground-truth is not possible but instead, segmentation ground-truthing is generally used. Since the segmentation ground-truthing isn't always automated, it requires laborious labeling of items in the video, sometimes even requiring manually using a ground-truth labeling software. Then as the training and test datasets become larger in size, the more laborious the segmentation ground-truthing becomes. <br />
<br />
In the case of the KITTI dataset, a collection of images captured from driving cars around a mid-sized city in Germany, accurate segmentation ground truth for the training and testing data is obtained using high-tech laser scanners, as well as a GPS localization device installed onto the top of the cars.<br />
<br />
== Traditional Methods (Horn and Schunk) ==<br />
<br />
-Under Construction-</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness&diff=40314
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-20T06:50:10Z
<p>Z62zhu: /* Problem & Motivation */</p>
<hr />
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Optical Flow ==<br />
<br />
[https://en.wikipedia.org/wiki/Optical_flow/ Optical flow] is the apparent motion of image brightness patterns in objects, surfaces and edges in videos. In more laymen terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera, and it does this on the basis of two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world implications. Firstly, the time between two consecutive frames of a video are so minuscule, such that it becomes extremely improbable for the intensity of a pixel to completely change, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies that there is spacial coherence and that the image motion of objects changes gradually over time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a frame (which is 2D image) with a pixel at position <math> (x,y) </math> at some time t, and in later frame, the pixel is now in position <math>(x + \Delta x, y + \Delta) </math> at some time <math> t + \Delta t </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time t is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t</math> ignoring the higher order terms.<br />
From the two equations, it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity (displacement over time) or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math>, and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions.<br />
<br />
Since this results in one equation with two unknowns <math>V_x,V_y</math>, it results into what is known as the aperture problem of the optical flow algorithms. To find the optical flow another set of equations is needed, given by some additional constraint.<br />
<br />
== Problem & Motivation ==<br />
<br />
The approaches to solving optimal flow problems, albeit widely successful, has mostly been a result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised learning approaches lies in the groundtruth flow, the process of gathering provable data for the measure of the target variable for the training and testing datasets. Directly obtaining the motion field ground-truth is not possible but instead, segmentation ground-truthing is generally used. Since the segmentation ground-truthing isn't always automated, it requires laborious labeling of items in the video, sometimes even requiring manually using a ground-truth labeling software. Then as the training and test datasets become larger in size, the more laborious the segmentation ground-truthing becomes. <br />
<br />
In the case of the KITTI dataset, a collection of images captured from driving cars around a mid-sized city in Germany, accurate segmentation ground truth for the training and testing data is obtained using high-tech laser scanners, as well as a GPS localization device installed onto the top of the cars.<br />
<br />
== Traditional Methods (Horn and Schunk) ==<br />
<br />
-Under Construction-</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness&diff=40313
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-20T06:42:33Z
<p>Z62zhu: /* Optical Flow */</p>
<hr />
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Optical Flow ==<br />
<br />
[https://en.wikipedia.org/wiki/Optical_flow/ Optical flow] is the apparent motion of image brightness patterns in objects, surfaces and edges in videos. In more laymen terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera, and it does this on the basis of two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world implications. Firstly, the time between two consecutive frames of a video are so minuscule, such that it becomes extremely improbable for the intensity of a pixel to completely change, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies that there is spacial coherence and that the image motion of objects changes gradually over time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a frame (which is 2D image) with a pixel at position <math> (x,y) </math> at some time t, and in later frame, the pixel is now in position <math>(x + \Delta x, y + \Delta) </math> at some time <math> t + \Delta t </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time t is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t</math> ignoring the higher order terms.<br />
From the two equations, it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity (displacement over time) or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math>, and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions.<br />
<br />
Since this results in one equation with two unknowns <math>V_x,V_y</math>, it results into what is known as the aperture problem of the optical flow algorithms. To find the optical flow another set of equations is needed, given by some additional constraint.<br />
<br />
== Problem & Motivation ==<br />
<br />
The current mainstream approach to solving optimal flow problems, albeit widely successful, has been a result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised learning approaches lies in the groundtruth flow, the process of gathering provable data for the measure of the target variable for the training and testing datasets. However, directly obtaining the motion field ground-truth is not possible and instead, segmentation ground-truthing is generally used. Since the segmentation ground-truthing isn't always automated, it requires laborious labeling of items in the video, sometimes manually using a ground-truth labeling software. Then as the training and test datasets become larger in size, the more laborious the ground-truthing becomes. <br />
<br />
In the case of the KITTI dataset, a collection of images captured from driving cars around a mid-sized city in Germany, accurate segmentation ground truth for the training and testing data is obtained using high-tech laser scanners, as well as a GPS localization device installed onto the top of the cars.<br />
<br />
== Traditional Methods (Horn and Schunk) ==<br />
<br />
-Under Construction-</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness&diff=40307
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-20T06:25:27Z
<p>Z62zhu: /* Optical Flow */</p>
<hr />
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Optical Flow ==<br />
<br />
[https://en.wikipedia.org/wiki/Optical_flow/ Optical flow] is the apparent motion of image brightness patterns in objects, surfaces and edges in videos. In more laymen terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera, and it does this on the basis of two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world implications. Firstly, the time between two consecutive frames of a video are so minuscule, such that it becomes extremely improbable for the intensity of a pixel to completely change, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies that there is spacial coherence and that the image motion of objects changes gradually over time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a frame (which is 2D image) with a pixel at position <math> (x,y) </math> at some time t, and in later frame, the pixel is now in position <math>(x + \Delta x, y + \Delta) </math> at some time <math> t + \Delta t </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time t is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t+</math><br />
From these equations it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math> and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions. <math>I_x</math>,<math> I_y</math> and <math> I_t</math> can be written for the derivatives in the following.<br />
<br />
This is an equation in two unknowns and cannot be solved as such. This is known as the ''[[Motion perception#The aperture problem|aperture problem]]'' of the optical flow algorithms. To find the optical flow another set of equations is needed, given by some additional constraint. All optical flow methods introduce additional conditions for estimating the actual flow.<br />
<br />
== Problem & Motivation ==<br />
<br />
The current mainstream approach to solving optimal flow problems, albeit widely successful, has been a result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised learning approaches lies in the groundtruth flow, the process of gathering provable data for the measure of the target variable for the training and testing datasets. However, directly obtaining the motion field ground-truth is not possible and instead, segmentation ground-truthing is generally used. Since the segmentation ground-truthing isn't always automated, it requires laborious labeling of items in the video, sometimes manually using a ground-truth labeling software. Then as the training and test datasets become larger in size, the more laborious the ground-truthing becomes. <br />
<br />
In the case of the KITTI dataset, a collection of images captured from driving cars around a mid-sized city in Germany, accurate segmentation ground truth for the training and testing data is obtained using high-tech laser scanners, as well as a GPS localization device installed onto the top of the cars.<br />
<br />
== Traditional Methods (Horn and Schunk) ==<br />
<br />
-Under Construction-</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness&diff=40304
Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness
2018-11-20T06:17:13Z
<p>Z62zhu: /* Optical Flow */</p>
<hr />
<div>== Presented by == <br />
*Hudson Ash <br />
*Stephen Kingston<br />
*Richard Zhang<br />
*Alexandre Xiao<br />
*Ziqiu Zhu<br />
<br />
== Optical Flow ==<br />
<br />
Optical flow is the apparent motion of image brightness patterns in objects, surfaces and edges in videos. In more laymen terms, it tracks the change in position of pixels between two frames caused by the movement of the object or the camera, and it does this on the basis of two assumptions:<br />
<br />
1. Pixel intensities do not change rapidly between frames (brightness constancy).<br />
<br />
2. Groups of pixels move together (motion smoothness).<br />
<br />
Both of these assumptions are derived from real-world implications. Firstly, the time between two consecutive frames of a video are so minuscule, such that it becomes extremely improbable for the intensity of a pixel to completely change, even if its location has changed. Secondly, pixels do not teleport. The assumption that groups of pixels move together implies that there is spacial coherence and that the image motion of objects changes gradually over time, creating motion smoothness.<br />
<br />
Given these assumptions, imagine a frame (which is 2D image) with a pixel at position <math> (x,y) </math> at some time t, and in later frame, the pixel is now in position <math>(x + \Delta x, y + \Delta) </math> at some time <math> t + \Delta t </math>. <br />
<br />
Then by the first assumption, the intensity of the pixel at time t is the same as the intensity of the pixel at time <math> t + \Delta t </math>:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t)</math><br />
<br />
Using Taylor series, we get:<br />
:<math>I(x+\Delta x,y+\Delta y,t+\Delta t) = I(x,y,t) + \frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t+</math><br />
From these equations it follows that:<br />
:<math>\frac{\partial I}{\partial x}\Delta x+\frac{\partial I}{\partial y}\Delta y+\frac{\partial I}{\partial t}\Delta t = 0</math><br />
which results in <br />
:<math>\frac{\partial I}{\partial x}V_x+\frac{\partial I}{\partial y}V_y+\frac{\partial I}{\partial t} = 0</math><br />
where <math>V_x,V_y</math> are the <math>x</math> and <math>y</math> components of the velocity or optical flow of <math>I(x,y,t)</math> and <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math> and <math>\tfrac{\partial I}{\partial t}</math> are the derivatives of the image at <math>(x,y,t)</math> in the corresponding directions. <math>I_x</math>,<math> I_y</math> and <math> I_t</math> can be written for the derivatives in the following.<br />
<br />
This is an equation in two unknowns and cannot be solved as such. This is known as the ''[[Motion perception#The aperture problem|aperture problem]]'' of the optical flow algorithms. To find the optical flow another set of equations is needed, given by some additional constraint. All optical flow methods introduce additional conditions for estimating the actual flow.<br />
<br />
== Problem & Motivation ==<br />
<br />
The current mainstream approach to solving optimal flow problems, albeit widely successful, has been a result of supervised learning methods using convolutional neural networks (convnets). The inherent challenge with these supervised learning approaches lies in the groundtruth flow, the process of gathering provable data for the measure of the target variable for the training and testing datasets. However, directly obtaining the motion field ground-truth is not possible and instead, segmentation ground-truthing is generally used. Since the segmentation ground-truthing isn't always automated, it requires laborious labeling of items in the video, sometimes manually using a ground-truth labeling software. Then as the training and test datasets become larger in size, the more laborious the ground-truthing becomes. <br />
<br />
In the case of the KITTI dataset, a collection of images captured from driving cars around a mid-sized city in Germany, accurate segmentation ground truth for the training and testing data is obtained using high-tech laser scanners, as well as a GPS localization device installed onto the top of the cars.<br />
<br />
== Traditional Methods (Horn and Schunck) ==<br />
<br />
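As a preview of the classical method, Horn and Schunck resolve the aperture problem globally: they minimize the brightness-constancy residual <math>(I_x u + I_y v + I_t)^2</math> plus a smoothness penalty on the flow field weighted by <math>\alpha^2</math>, solved with Jacobi-style iterations. The sketch below is an illustrative NumPy implementation, not the presenters' code; the central-difference derivatives, periodic boundaries, and fixed iteration count are simplifying assumptions.

```python
import numpy as np

def horn_schunck(I1, I2, alpha=0.5, n_iter=100):
    """Dense flow (u, v) minimizing brightness-constancy residual plus
    alpha^2-weighted smoothness, via Horn & Schunck's iterative update."""
    # Image derivatives (central differences, periodic boundaries).
    Ix = (np.roll(I1, -1, axis=1) - np.roll(I1, 1, axis=1)) / 2.0
    Iy = (np.roll(I1, -1, axis=0) - np.roll(I1, 1, axis=0)) / 2.0
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)

    def local_avg(f):
        # Weighted 8-neighbour average (Horn & Schunck's kernel:
        # 1/6 for the 4-neighbours, 1/12 for the diagonals).
        cross = (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
                 np.roll(f, 1, 1) + np.roll(f, -1, 1)) / 6.0
        diag = (np.roll(np.roll(f, 1, 0), 1, 1) +
                np.roll(np.roll(f, 1, 0), -1, 1) +
                np.roll(np.roll(f, -1, 0), 1, 1) +
                np.roll(np.roll(f, -1, 0), -1, 1)) / 12.0
        return cross + diag

    for _ in range(n_iter):
        ua, va = local_avg(u), local_avg(v)
        # Closed-form update from the Euler-Lagrange equations.
        t = (Ix * ua + Iy * va + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = ua - Ix * t
        v = va - Iy * t
    return u, v
```

Larger <math>\alpha</math> favours smoother (more uniform) flow; smaller <math>\alpha</math> trusts the per-pixel brightness-constancy data more.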
-Under Construction-</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F18&diff=39918
stat441F18
2018-11-19T04:06:04Z
<p>Z62zhu: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F18-STAT841-Proposal| Project Proposal ]] ==<br />
<br />
[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]<br />
<br />
<br />
= Record your contributions here [https://docs.google.com/spreadsheets/d/10CHiJpAylR6kB9QLqN7lZHN79D9YEEW6CDTH27eAhbQ/edit?usp=sharing]=<br />
<br />
Use the following notations:<br />
<br />
P: You have written a summary/critique on the paper.<br />
<br />
T: You had a technical contribution on a paper (excluding the paper that you present).<br />
<br />
E: You had an editorial contribution on a paper (excluding the paper that you present).<br />
<br />
<br />
<br />
<br />
=Paper presentation=<br />
{| class="wikitable"<br />
<br />
{| border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="100pt"|Name <br />
|width="30pt"|Paper number <br />
|width="700pt"|Title<br />
|width="30pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|-<br />
|Feb 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [http://wikicoursenote.com/wiki/Stat946f15/Sequence_to_sequence_learning_with_neural_networks#Long_Short-Term_Memory_Recurrent_Neural_Network Summary]<br />
|-<br />
|Nov 13 || Jason Schneider, Jordyn Walton, Zahraa Abbas, Andrew Na || 1|| Memory-Based Parameter Adaptation || [https://arxiv.org/pdf/1802.10542.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Memory-Based_Parameter_Adaptation#Incremental_Learning Summary]<br />
|-<br />
|Nov 13 ||Sai Praneeth M, Xudong Peng, Alice Li, Shahrzad Hosseini Vajargah|| 2|| Going Deeper with Convolutions ||[https://arxiv.org/pdf/1409.4842.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary]<br />
|-<br />
|Nov 15 || Yan Yu Chen, Qisi Deng, Hengxin Li, Bochao Zhang|| 3|| Topic Compositional Neural Language Model|| [https://arxiv.org/pdf/1712.09783.pdf Paper] || <br />
[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F18/TCNLM Summary]<br />
|-<br />
|Nov 15 || Zhaoran Hou, Pei Wei Wang, Chi Zhang, Yiming Li, Daoyi Chen, Ying Chi|| 4|| Extreme Learning Machine for regression and Multi-class Classification|| [https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6035797 Paper] || <br />
[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat841F18/ Summary]<br />
|-<br />
|Nov 20 || Kristi Brewster, Isaac McLellan, Ahmad Nayar Hassan, Marina Medhat Rassmi Melek, Brendan Ross, Jon Barenboim, Junqiao Lin, James Bootsma || 5|| A Neural Representation of Sketch Drawings || [https://arxiv.org/pdf/1704.03477.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_-_A_Neural_Representation_of_Sketch_Drawings Summary] <br />
|-<br />
|Nov 20 || Maya (Mahdiyeh) Bayati, Saber Malekmohammadi, Vincent Loung || 6|| Convolutional Neural Networks for Sentence Classification || [https://arxiv.org/pdf/1408.5882.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Convolutional_Neural_Networks_for_Sentence_Classi%EF%AC%81cation Summary] <br />
|-<br />
|Nov 22 || Qingxi Huo, Yanmin Yang, Jiaqi Wang, Yuanjing Cai, Colin Stranc, Philomène Bobichon, Aditya Maheshwari, Zepeng An || 7|| Robust Probabilistic Modeling with Bayesian Data Reweighting || [http://proceedings.mlr.press/v70/wang17g/wang17g.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Robust_Probabilistic_Modeling_with_Bayesian_Data_Reweighting Summary]<br />
|-<br />
|Nov 22 || Hanzhen Yang, Jing Pu Sun, Ganyuan Xuan, Yu Su, Jiacheng Weng, Keqi Li, Yi Qian, Bomeng Liu || 8|| Deep Residual Learning for Image Recognition || [http://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Residual_Learning_for_Image_Recognition Summary]<br />
|-<br />
|Nov 27 || Mitchell Snaith || 9|| You Only Look Once: Unified, Real-Time Object Detection, V1 -> V3 || [https://arxiv.org/pdf/1506.02640.pdf Paper] || <br />
|-<br />
|Nov 27 || Qi Chu, Gloria Huang, Di Sang, Amanda Lam, Yan Jiao, Shuyue Wang, Yutong Wu || 10|| A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques || [https://arxiv.org/pdf/1707.02919.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=A_Brief_Survey_of_Text_Mining:_Classification,_Clustering_and_Extraction_Techniques Summary]<br />
|-<br />
|Nov 29 || Jameson Ngo, Amy Xu, Aden Grant, Yu Hao Wang, Andrew McMurry, Baizhi Song, Yongqi Dong || 11|| Towards Deep Learning Models Resistant to Adversarial Attacks || [https://arxiv.org/pdf/1706.06083.pdf Paper] || <br />
|-<br />
|Nov 29 || Qianying Zhao, Hui Huang, Lingyun Yi, Jiayue Zhang, Siao Chen, Rongrong Su, Gezhou Zhang, Meiyu Zhou || 12|| XGBoost: A Scalable Tree Boosting System || [http://delivery.acm.org/10.1145/2940000/2939785/p785-chen.pdf?ip=129.97.124.2&id=2939785&acc=CHORUS&key=FD0067F557510FFB%2E9219CF56F73DCF78%2E4D4702B0C3E38B35%2E6D218144511F3437&__acm__=1542321481_ffea42f38a2b3325af4990280553c10f Paper] ||<br />
|-<br />
|Nov 28 || Hudson Ash, Stephen Kingston, Richard Zhang, Alexandre Xiao, Ziqiu Zhu || 13 || Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness || [https://arxiv.org/pdf/1608.05842.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Learning_of_Optical_Flow_via_Brightness_Constancy_and_Motion_Smoothness Summary]<br />
|-<br />
|Nov 21 || Frank Jiang, Yuan Zhang, Jerry Hu || 14 || Distributed Representations of Words and Phrases and their Compositionality || [https://arxiv.org/pdf/1310.4546.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Representations_of_Words_and_Phrases_and_their_Compositionality Summary]<br />
|-<br />
|Nov 21 || Yu Xuan Lee, Tsen Yee Heng || 15 || Gradient Episodic Memory for Continual Learning || [http://papers.nips.cc/paper/7225-gradient-episodic-memory-for-continual-learning.pdf Paper] || <br />
|-<br />
|Nov 28 || Ben Zhang, Rees Simmons, Sunil Mall || 16 || Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift || [https://arxiv.org/pdf/1502.03167.pdf Paper] ||</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=F18-STAT841-Proposal&diff=36687
F18-STAT841-Proposal
2018-10-08T17:02:26Z
<p>Z62zhu: </p>
<hr />
<div><br />
'''Use this format (Don’t remove Project 0)'''<br />
<br />
'''Project # 0'''<br />
Group members:<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
'''Title:''' Making a String Telephone<br />
<br />
'''Description:''' We use paper cups to make a string phone and talk with friends while learning about sound waves with this science project. (Explain your project in one or two paragraphs).<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 1'''<br />
Group members:<br />
<br />
Weng, Jiacheng<br />
<br />
Li, Keqi<br />
<br />
Qian, Yi<br />
<br />
Liu, Bomeng<br />
<br />
'''Title:''' RSNA Pneumonia Detection Challenge<br />
<br />
'''Description:''' <br />
<br />
Our team’s project is the RSNA Pneumonia Detection Challenge from Kaggle competition. The primary goal of this project is to develop a machine learning tool to detect patients with pneumonia based on their chest radiographs (CXR). <br />
<br />
Pneumonia is an infection that inflames the air sacs in human lungs, with symptoms such as chest pain, cough, and fever [1]. Pneumonia can be very dangerous, especially to infants and the elderly: in 2015, 920,000 children under the age of 5 died from this disease [2]. Due to its fatality in children, diagnosing pneumonia is a high priority. A common method of diagnosis is to obtain a patient's chest radiograph (CXR), a gray-scale x-ray scan of the patient's chest. A region infected by pneumonia usually shows as an area or areas of increased opacity [3] on the CXR. However, many other factors can also increase opacity on a CXR, which makes the diagnosis very challenging; it also requires highly skilled clinicians and a lot of CXR screening time. The Radiological Society of North America (RSNA®) sees an opportunity to use machine learning to accelerate the initial CXR screening process. <br />
<br />
For the scope of this project, our team plans to contribute to solving this problem by applying our machine learning knowledge in image processing and classification. Team members are going to apply techniques that include, but are not limited to: logistic regression, random forest, SVM, kNN, CNN, etc., in order to successfully detect CXRs with pneumonia.<br />
<br />
<br />
[1] (Accessed 2018, Oct. 4). Pneumonia [Online]. MAYO CLINIC. Available from: https://www.mayoclinic.org/diseases-conditions/pneumonia/symptoms-causes/syc-20354204<br />
[2] (Accessed 2018, Oct. 4). RSNA Pneumonia Detection Challenge [Online]. Kaggle. Available from: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge<br />
[3] Franquet T. Imaging of community-acquired pneumonia. J Thorac Imaging 2018 (epub ahead of print). PMID 30036297<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 2'''<br />
Group members:<br />
<br />
Hou, Zhaoran<br />
<br />
Zhang, Chi<br />
<br />
'''Title:''' <br />
<br />
'''Description:'''<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 3'''<br />
Group members:<br />
<br />
Hanzhen Yang<br />
<br />
Jing Pu Sun<br />
<br />
Ganyuan Xuan<br />
<br />
Yu Su<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:'''<br />
<br />
Our team chose the [https://www.kaggle.com/c/quickdraw-doodle-recognition Quick, Draw! Doodle Recognition Challenge] from the Kaggle Competition. The goal of the competition is to build an image recognition tool that can classify hand-drawn doodles into one of the 340 categories.<br />
<br />
The main challenge of the project lies in the training set being very noisy. Hand-drawn artwork may deviate substantially from the actual object, and almost certainly differs from person to person. Mislabeled images also present a problem, since they create outlier points when we train our models. <br />
<br />
We plan on learning more about some of the currently mature image recognition algorithms to inspire and develop our own model.<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 4'''<br />
Group members:<br />
<br />
Snaith, Mitchell<br />
<br />
'''Title:''' Reproducibility report: *Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks*<br />
<br />
'''Description:''' <br />
<br />
The paper *Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks* [1] has been submitted to ICLR 2019. It aims to "fix" variational Bayes and turn it into a robust inference tool through two innovations. <br />
<br />
Goals are to: <br />
<br />
- reproduce the deterministic variational inference scheme as described in the paper without referencing the original author's code, providing a 3rd party implementation<br />
<br />
- reproduce experiment results with own implementation, using the same NN framework for reference implementations of compared methods described in the paper<br />
<br />
- reproduce experiment results with the author's own implementation<br />
<br />
- explore other possible applications of variational Bayes besides heteroscedastic regression<br />
<br />
[1] OpenReview location: https://openreview.net/forum?id=B1l08oAct7<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 5'''<br />
Group members:<br />
<br />
Rebecca, Chen<br />
<br />
Susan,<br />
<br />
Mike, Li<br />
<br />
Ted, Wang<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:''' <br />
<br />
Classification has become an increasingly eye-catching topic, especially with the rise of machine learning in recent years. Our team is particularly interested in machine learning algorithms optimized for specific types of image classification. <br />
<br />
In this project, we will dig into the base classifiers we learned in class and combine them to find a strong solution for a particular kind of image dataset. Currently, we are looking into a dataset from Kaggle: the Quick, Draw! Doodle Recognition Challenge. The dataset in this competition contains 50M drawings spanning 340 categories and is a subset of the world’s largest doodling dataset, which is continuously updated by real players of the drawing game. Anyone can contribute by playing (quickdraw.withgoogle.com).<br />
<br />
As machine learning students, we are eager to develop a better classification method. By “better”, we mean finding a balance between simplicity and accuracy. We will start with neural networks using different activation functions in each layer, and we will also combine base classifiers with bagging, random forests, and boosting for ensemble learning. We will also regularize our parameters to avoid overfitting the training dataset. Finally, we will summarize the features of this type of image dataset, formulate our solutions, and standardize our steps for solving this kind of problem. <br />
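The ensemble idea mentioned above, fitting base classifiers on bootstrap resamples and combining their predictions by majority vote (bagging), can be sketched in plain Python. The 1-nearest-neighbour "stump" learner and the tiny one-dimensional dataset below are invented purely for illustration:<br />

```python
import random
from collections import Counter

def bagged_predict(base_fit, X_train, y_train, x, n_models=25, seed=0):
    """Bootstrap-aggregate: fit base learners on resampled copies of the
    training data and combine their predictions by majority vote."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        # Sample len(X_train) indices with replacement (a bootstrap sample).
        idx = [rng.randrange(len(X_train)) for _ in range(len(X_train))]
        model = base_fit([X_train[i] for i in idx], [y_train[i] for i in idx])
        votes.append(model(x))
    return Counter(votes).most_common(1)[0][0]

# A deliberately weak base learner: 1-nearest-neighbour on one feature.
def fit_stump(X, y):
    pairs = list(zip(X, y))
    def predict(x):
        return min(pairs, key=lambda p: abs(p[0] - x))[1]
    return predict

X_train = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
y_train = ["a", "a", "a", "b", "b", "b"]
print(bagged_predict(fit_stump, X_train, y_train, 0.15))  # almost surely "a"
```

Random forests follow the same pattern with decision trees as the base learner plus random feature subsetting; boosting instead reweights the training points between rounds.<br />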
<br />
Hopefully, we can not only finish our project successfully, but also make a small contribution to the machine learning research field.<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 6'''<br />
Group members:<br />
<br />
Ngo, Jameson<br />
<br />
Xu, Amy<br />
<br />
'''Title:''' Kaggle Challenge: [https://www.kaggle.com/c/human-protein-atlas-image-classification Human Protein Atlas Image Classification]<br />
<br />
'''Description:''' <br />
<br />
We will participate in the Human Protein Atlas Image Classification competition featured on Kaggle. We will classify proteins based on patterns seen in microscopic images of human cells.<br />
<br />
Historically, protein classification methods could handle only single patterns from very few cell types at a time. The goal of this challenge is to develop methods that classify proteins based on multiple or mixed patterns across a larger range of cell types.<br />
<br />
--------------------------------------------------------------------<br />
'''Project # 7'''<br />
Group members:<br />
<br />
Qianying Zhao<br />
<br />
Hui Huang<br />
<br />
Meiyu Zhou<br />
<br />
Gezhou Zhang<br />
<br />
'''Title:''' Google Analytics Customer Revenue Prediction<br />
<br />
'''Description:''' <br />
Our group will participate in the featured Kaggle competition Google Analytics Customer Revenue Prediction. In this competition, we will analyze the customer dataset from the Google Merchandise Store, which sells Google swag, to predict revenue per customer using RStudio. Our presentation report will cover not only the conclusions we reach by classifying and analyzing the provided data with appropriate models, but also how we performed in the contest.<br />
<br />
--------------------------------------------------------------------<br />
'''Project # 8'''<br />
Group members:<br />
<br />
Jiayue Zhang<br />
<br />
Lingyun Yi<br />
<br />
Rongrong Su<br />
<br />
Siao Chen<br />
<br />
<br />
'''Title:''' Kaggle--Two Sigma: Using News to Predict Stock Movements<br />
<br />
<br />
'''Description:''' <br />
Stock prices are affected by the news to some extent. What is the influence of news on stock prices, and what is the predictive power of the news? <br />
We plan to use the content of news articles to predict the direction of stock prices. We will mine the data to find the useful information hidden in this large dataset. As a result, we will predict stock price performance when the market reacts to news.<br />
<br />
<br />
--------------------------------------------------------------------<br />
'''Project # 9'''<br />
Group members:<br />
<br />
Hassan, Ahmad Nayar<br />
<br />
McLellan, Isaac<br />
<br />
Brewster, Kristi<br />
<br />
Melek, Marina Medhat Rassmi <br />
<br />
<br />
'''Title:''' Quick, Draw! Doodle Recognition<br />
<br />
'''Description:''' <br />
<br />
'''Background'''<br />
<br />
Google’s Quick, Draw! is an online game where a user is prompted to draw an image depicting a certain category in under 20 seconds. As the drawing is being completed, the game uses a model that attempts to correctly identify the image being drawn. With the aim of improving the underlying pattern recognition model this game uses, Google is hosting a Kaggle competition asking the public to build a model to correctly identify a given drawing. The model should classify the drawing into one of the 340 label categories within the Quick, Draw! game in 3 guesses or fewer.<br />
<br />
'''Proposed Approach'''<br />
<br />
Each image/doodle (input) is treated as a matrix of pixel values. To classify images, we apply convolution to this matrix, which reduces the dimensionality of the input significantly and in turn reduces the number of parameters of any proposed recognition model. Using filters, pooling layers, and further convolutions, a final layer called the fully connected layer is used to correlate images with categories, assigning probabilities (weights) and hence classifying images. <br />
<br />
This approach to image classification is called a convolutional neural network (CNN), and we propose using it to classify the doodles within the Quick, Draw! dataset.<br />
<br />
To control overfitting and underfitting of our proposed model and to minimize the error, we will try different architectures consisting of different types and dimensions of pooling layers and input filters.<br />
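The convolution and pooling steps described above can be sketched with a single pass in NumPy. The 28x28 input, the single random 3x3 filter, and the 2x2 max pool are illustrative assumptions, not the final architecture:<br />

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (no padding, stride 1)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; truncates ragged edges."""
    h = fmap.shape[0] // size
    w = fmap.shape[1] // size
    return fmap[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((28, 28))   # one 28x28 doodle as pixel intensities
kernel = rng.random((3, 3))    # one 3x3 filter (random here; learned in practice)
features = max_pool2d(conv2d(image, kernel))
print(features.shape)          # (13, 13): far fewer values than the 28x28 input
```

A real CNN stacks many such filter-and-pool layers, learns the filter weights by backpropagation, and ends with a fully connected layer that maps the pooled features to the 340 class probabilities.<br />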
<br />
'''Challenges'''<br />
<br />
This project presents a number of interesting challenges:<br />
* The data given for training is noisy in that it contains drawings that are incomplete or simply poorly drawn. Dealing with this noise will be a significant part of our work. <br />
* There are 340 label categories within the Quick, Draw! dataset, which means the model must be able to classify drawings based on a large pool of information while making effective use of powerful computational resources.<br />
<br />
'''Tools & Resources'''<br />
<br />
* We will use Python & MATLAB.<br />
* We will use the Quick, Draw! Dataset available on the Kaggle competition website. <https://www.kaggle.com/c/quickdraw-doodle-recognition/data><br />
<br />
--------------------------------------------------------------------<br />
'''Project # 10'''<br />
Group members:<br />
<br />
Lam, Amanda<br />
<br />
Huang, Xiaoran<br />
<br />
Chu, Qi<br />
<br />
Sang, Di<br />
<br />
'''Title:''' Kaggle Competition: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:'''<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 11'''<br />
Group members:<br />
<br />
Bobichon, Philomene<br />
<br />
Maheshwari, Aditya<br />
<br />
An, Zepeng<br />
<br />
Stranc, Colin<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:''' <br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 12'''<br />
Group members:<br />
<br />
Huo, Qingxi<br />
<br />
Yang, Yanmin<br />
<br />
Cai, Yuanjing<br />
<br />
Wang, Jiaqi<br />
<br />
'''Title:''' <br />
<br />
'''Description:''' <br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 13'''<br />
Group members:<br />
<br />
Ross, Brendan<br />
<br />
Barenboim, Jon<br />
<br />
Lin, Junqiao<br />
<br />
Bootsma, James<br />
<br />
'''Title:''' Expanding Neural Network<br />
<br />
'''Description:''' The goal of our project is to create an expanding neural network algorithm, which starts by training a small neural network and then expands it into a larger one. We hypothesize that with the proper expansion method we can decrease training time and prevent overfitting. The method we wish to explore is to link together input dimensions based on covariance. Then, when the neural network reaches convergence, we create a larger neural network without the links between dimensions, using starting values from the smaller neural network. <br />
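One hypothetical expansion step can be sketched as follows. The grouping of inputs 0 and 1, and the even split of their shared weight among group members, are assumptions chosen for illustration rather than the full algorithm:<br />

```python
import numpy as np

def expand_weights(w_small, groups):
    """Expand a weight matrix trained on grouped (linked) inputs to the full
    input dimension, splitting each group's weight evenly among its members
    so pre-activations are roughly preserved when members move together."""
    n_features = sum(len(g) for g in groups)
    w_full = np.zeros((w_small.shape[0], n_features))
    for col, members in enumerate(groups):
        for idx in members:
            w_full[:, idx] = w_small[:, col] / len(members)
    return w_full

# Inputs 0 and 1 were linked (high covariance); input 2 stood alone.
groups = [[0, 1], [2]]
w_small = np.array([[2.0, 3.0],
                    [4.0, 5.0]])   # 2 hidden units x 2 grouped inputs
w_full = expand_weights(w_small, groups)
print(w_full.shape)                # (2, 3): columns 0 and 1 share the group weight
```

The expanded matrix would then serve as the starting values for training the larger, unlinked network.<br />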
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 14'''<br />
Group members:<br />
<br />
Schneider, Jason <br />
<br />
Walton, Jordyn <br />
<br />
Abbas, Zahraa<br />
<br />
Na, Andrew<br />
<br />
'''Title:''' Application of ML Classification to Cancer Identification<br />
<br />
'''Description:''' The application of machine learning to cancer classification based on gene expression is a topic of great interest to physicians and biostatisticians alike. We would like to work on this for our final project to encourage the application of proven ML techniques to improve accuracy of cancer classification and diagnosis. In this project, we will use the dataset from Golub et al. [1] which contains data on gene expression on tumour biopsies to train a model and classify healthy individuals and individuals who have cancer.<br />
<br />
One challenge we may face pertains to the way that the data was collected. Some parts of the dataset have thousands of features (which each represent a quantitative measure of the expression of a certain gene) but as few as twenty samples. We propose some ways to mitigate the impact of this; including the use of PCA, leave-one-out cross validation, or regularization. <br />
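The PCA mitigation mentioned above can be sketched with the thin SVD in NumPy. The 20x5000 random matrix stands in for the real gene-expression data, and keeping 10 components is an arbitrary illustrative choice:<br />

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 5000))  # 20 biopsies, 5000 gene-expression features

# Centre the data, then take the thin SVD. With only 20 samples there are at
# most 19 informative principal components, however many genes are measured.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 10
scores = Xc @ Vt[:k].T               # each biopsy reduced to k components
print(scores.shape)                  # (20, 10)
```

A classifier trained on `scores` instead of `X` has 10 inputs rather than 5000, which makes leave-one-out cross-validation and regularization far more tractable.<br />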
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 15'''<br />
Group members:<br />
<br />
Praneeth, Sai<br />
<br />
Peng, Xudong <br />
<br />
Li, Alice<br />
<br />
Vajargah, Shahrzad<br />
<br />
'''Title:''' Google Analytics Customer Revenue Prediction [1] - A Kaggle Competition<br />
<br />
'''Description:''' Guess which cabin class in airlines is the most profitable? One might guess economy, but in reality it's the premium classes that show higher returns. According to research presented by Wendover Productions [2], despite having fewer than 50 seats and taking up more space than the economy class, premium classes end up driving more revenue than the other classes.<br />
<br />
In fact, just like airlines, many companies adopt the business model where the vast majority of revenue is derived from a minority group of customers. As a result, data-intensive promotional strategies are getting more and more attention nowadays from marketing teams to further improve company returns.<br />
<br />
In this Kaggle competition, we are challenged to analyze a Google Merchandise Store customer dataset to predict revenue per customer. We will implement a series of data analytics methods including pre-processing, data augmentation, and parameter tuning. Different classification algorithms will be compared and optimized in order to achieve the best results.<br />
<br />
'''Reference:'''<br />
<br />
[1] Kaggle. (2018, Sep 18). Google Analytics Customer Revenue Prediction. Retrieved from https://www.kaggle.com/c/ga-customer-revenue-prediction<br />
<br />
[2] Kottke, J (2017, Mar 17). The economics of airline classes. Retrieved from https://kottke.org/17/03/the-economics-of-airline-classes<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 16'''<br />
Group members:<br />
<br />
Wang, Yu Hao<br />
<br />
Grant, Aden <br />
<br />
McMurray, Andrew<br />
<br />
Song, Baizhi<br />
<br />
'''Title:''' Two Sigma: Using News to Predict Stock Movements - A Kaggle Competition<br />
<br />
'''Description:''' By analyzing news data to predict stock prices, Kagglers have a unique opportunity to advance the state of research in understanding the predictive power of the news. This power, if harnessed, could help predict financial outcomes and generate significant economic impact all over the world.<br />
<br />
Data for this competition comes from the following sources:<br />
<br />
Market data provided by Intrinio.<br />
News data provided by Thomson Reuters. Copyright ©, Thomson Reuters, 2017. All Rights Reserved. Use, duplication, or sale of this service, or data contained herein, except as described in the Competition Rules, is strictly prohibited.<br />
<br />
We will test a variety of classification algorithms to determine an appropriate model.<br />
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 17'''<br />
Group Members:<br />
<br />
Jiang, Ya Fan<br />
<br />
Zhang, Yuan<br />
<br />
Hu, Jerry Jie<br />
<br />
'''Title:''' Kaggle Competition: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:''' Construction of a classifier that can learn from noisy training data and generalize to a clean test set. The training data comes from the Google game "Quick, Draw!".<br />
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 18'''<br />
Group Members:<br />
<br />
Zhang, Ben<br />
<br />
'''Title:''' Two Sigma: Using News to Predict Stock Movements<br />
<br />
'''Description:''' Use news analytics to predict stock price performance. This is subject to change.<br />
<br />
----------------------------------------------------------------------<br />
'''Project # 19'''<br />
Group Members:<br />
<br />
Yan Yu Chen<br />
<br />
Qisi Deng<br />
<br />
Hengxin Li<br />
<br />
Bochao Zhang<br />
<br />
Our team currently has two topics of interest at hand, and we have summarized the objective of each topic below. Please note that we will narrow down our choice after further discussion with the instructor.<br />
<br />
'''Description 1:''' With 14 percent of Americans claiming that social media is their most dominant news source, fake news shared on Facebook and Twitter is invading people’s information-learning experience. Concomitantly, the quality and nature of online news have been gradually diluted by fake news that is sometimes imperceptible. With the aim of creating an unalloyed Internet surfing experience, we seek to develop a tool that performs fake news detection and classification. <br />
<br />
'''Description 2:''' Statistics Canada has recently reported an increasing trend in Toronto’s violent crime score. Though the Royal Canadian Mounted Police has put effort into tracking crimes, the ambiguous snapshots captured by outdated cameras often hamper investigations. Motivated by this circumstance, our second interest focuses on accurate numeral and letter identification within variable-resolution images.<br />
<br />
----------------------------------------------------------------------<br />
'''Project # 20'''<br />
Group Members:<br />
<br />
Dong, Yongqi (Michael)<br />
<br />
Kingston, Stephen<br />
<br />
'''Title:''' Kaggle--Two Sigma: Using News to Predict Stock Movements <br />
<br />
'''Description:''' The movement in price of a tradeable security, or stock, on any given day is an aggregation of each individual market participant’s appraisal of the intrinsic value of the underlying company or assets. These values are primarily driven by investors’ expectations of the company’s ability to generate future free cash flow. A steady stream of information on the state of the macro- and micro-economic variables that affect a company’s operations informs these market actors, primarily through news articles and alerts. We would like to take a universe of news headlines and parse the information into features, which allow us to classify the direction and ‘intensity’ of a stock’s price move on any given day. Strategies may include various classification methods to determine the most effective solution.<br />
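Parsing headlines into features could start with a simple bag-of-words representation, sketched below. The two sample headlines and the four-word vocabulary are invented for illustration; a real pipeline would learn the vocabulary from the training corpus:<br />

```python
from collections import Counter

def bag_of_words(headlines, vocab):
    """Turn each headline into a count vector over a fixed vocabulary."""
    vectors = []
    for text in headlines:
        counts = Counter(text.lower().split())
        vectors.append([counts[w] for w in vocab])
    return vectors

headlines = ["Profits surge at tech giant", "Tech giant warns of weak demand"]
vocab = ["surge", "warns", "weak", "tech"]
X = bag_of_words(headlines, vocab)
print(X)  # [[1, 0, 0, 1], [0, 1, 1, 1]]
```

Each count vector could then be paired with the next day's price move (up/down) as the label for a standard classifier.<br />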
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 21'''<br />
Group members:<br />
<br />
Xiao, Alex<br />
<br />
Zhang, Richard<br />
<br />
Ash, Hudson<br />
<br />
Zhu, Ziqiu<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge [Subject to Change]<br />
<br />
'''Description:''' <br />
<br />
"Quick, Draw! was released as an experimental game to educate the public in a playful way about how AI works. The game prompts users to draw an image depicting a certain category, such as “banana,” “table,” etc. The game generated more than 1B drawings, of which a subset was publicly released as the basis for this competition’s training set. That subset contains 50M drawings encompassing 340 label categories."<br />
<br />
Our goal as students is to build a classification tool that classifies hand-drawn doodles into one of the 340 label categories.</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=F18-STAT841-Proposal&diff=36686
F18-STAT841-Proposal
2018-10-08T16:58:22Z
<p>Z62zhu: </p>
<hr />
<div><br />
'''Use this format (Don’t remove Project 0)'''<br />
<br />
'''Project # 0'''<br />
Group members:<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
'''Title:''' Making a String Telephone<br />
<br />
'''Description:''' We use paper cups to make a string phone and talk with friends while learning about sound waves with this science project. (Explain your project in one or two paragraphs).<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 1'''<br />
Group members:<br />
<br />
Weng, Jiacheng<br />
<br />
Li, Keqi<br />
<br />
Qian, Yi<br />
<br />
Liu, Bomeng<br />
<br />
'''Title:''' RSNA Pneumonia Detection Challenge<br />
<br />
'''Description:''' <br />
<br />
Our team’s project is the RSNA Pneumonia Detection Challenge from Kaggle competition. The primary goal of this project is to develop a machine learning tool to detect patients with pneumonia based on their chest radiographs (CXR). <br />
<br />
Pneumonia is an infection that inflames the air sacs in human lungs which has symptoms such as chest pain, cough, and fever [1]. Pneumonia can be very dangerous especially to infants and elders. In 2015, 920,000 children under the age of 5 died from this disease [2]. Due to its fatality to children, diagnosing pneumonia has a high order. A common method of diagnosing pneumonia is to obtain patients’ chest radiograph (CXR) which is a gray-scale scan image of patients’ chests using x-ray. The infected region due to pneumonia usually shows as an area or areas of increased opacity [3] on CXR. However, many other factors can also contribute to increase in opacity on CXR which makes the diagnose very challenging. The diagnose also requires highly-skilled clinicians and a lot of time of CXR screening. The Radiological Society of North America (RSNA®) sees the opportunity of using machine learning to potentially accelerate the initial CXR screening process. <br />
<br />
For the scope of this project, our team plans to contribute to solving this problem by applying our machine learning knowledge in image processing and classification. Team members are going to apply techniques that include, but are not limited to: logistic regression, random forest, SVM, kNN, CNN, etc., in order to successfully detect CXRs with pneumonia.<br />
<br />
<br />
[1] (Accessed 2018, Oct. 4). Pneumonia [Online]. MAYO CLINIC. Available from: https://www.mayoclinic.org/diseases-conditions/pneumonia/symptoms-causes/syc-20354204<br />
[2] (Accessed 2018, Oct. 4). RSNA Pneumonia Detection Challenge [Online]. Kaggle. Available from: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge<br />
[3] Franquet T. Imaging of community-acquired pneumonia. J Thorac Imaging 2018 (epub ahead of print). PMID 30036297<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 2'''<br />
Group members:<br />
<br />
Hou, Zhaoran<br />
<br />
Zhang, Chi<br />
<br />
'''Title:''' <br />
<br />
'''Description:'''<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 3'''<br />
Group members:<br />
<br />
Hanzhen Yang<br />
<br />
Jing Pu Sun<br />
<br />
Ganyuan Xuan<br />
<br />
Yu Su<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:'''<br />
<br />
Our team chose the [https://www.kaggle.com/c/quickdraw-doodle-recognition Quick, Draw! Doodle Recognition Challenge] from the Kaggle Competition. The goal of the competition is to build an image recognition tool that can classify hand-drawn doodles into one of the 340 categories.<br />
<br />
The main challenge of the project remains in the training set being very noisy. Hand-drawn artwork may deviate substantially from the actual object, and is almost definitively different from person to person. Mislabeled images also present a problem since they will create outlier points when we train our models. <br />
<br />
We plan on learning more about some of the currently mature image recognition algorithms to inspire and develop our own model.<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 4'''<br />
Group members:<br />
<br />
Snaith, Mitchell<br />
<br />
'''Title:''' Reproducibility report: *Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks*<br />
<br />
'''Description:''' <br />
<br />
The paper *Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks* [1] has been submitted to ICLR 2019. It aims to "fix" variational Bayes and turn it into a robust inference tool through two innovations. <br />
<br />
Goals are to: <br />
<br />
- reproduce the deterministic variational inference scheme as described in the paper without referencing the original author's code, providing a 3rd party implementation<br />
<br />
- reproduce experiment results with own implementation, using the same NN framework for reference implementations of compared methods described in the paper<br />
<br />
- reproduce experiment results with the author's own implementation<br />
<br />
- explore other possible applications of variational Bayes besides heteroscedastic regression<br />
<br />
[1] OpenReview location: https://openreview.net/forum?id=B1l08oAct7<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 5'''<br />
Group members:<br />
<br />
Rebecca, Chen<br />
<br />
Susan,<br />
<br />
Mike, Li<br />
<br />
Ted, Wang<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:''' <br />
<br />
Classification has become a more and more eye-catching, especially with the rise of machine learning in these years. Our team is particularly interested in machine learning algorithms that optimize some specific type image classification. <br />
<br />
In this project, we will dig into base classifiers we learnt from the class and try to cook them together to find an optimal solution for a certain type images dataset. Currently, we are looking into a dataset from Kaggle: Quick, Draw! Doodle Recognition Challenge. The dataset in this competition contains 50M drawings among 340 categories and is the subset of the world’s largest doodling dataset and the doodling dataset is updating by real drawing game players. Anyone can contribution by joining it! (quickdraw.withgoogle.com).<br />
<br />
For us, as machine learning students, we are more eager to help getting a better classification method. By “better”, we mean find a balance between simplify and accuracy. We will start with neural network via different activation functions in each layer and we will also combine base classifiers with bagging, random forest, boosting for ensemble learning. Also, we will try to regulate our parameters to avoid overfitting in training dataset. Last, we will summary features of this type image dataset, formulate our solutions and standardize our steps to solve this kind problems <br />
<br />
Hopefully, we can not only finish our project successfully, but also make a little contribution to machine learning research field.<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 6'''<br />
Group members:<br />
<br />
Ngo, Jameson<br />
<br />
Xu, Amy<br />
<br />
'''Title:''' Kaggle Challenge: [https://www.kaggle.com/c/human-protein-atlas-image-classification Human Protein Atlas Image Classification]<br />
<br />
'''Description:''' <br />
<br />
We will participate in the Human Protein Atlas Image Classification competition featured on Kaggle. We will classify proteins based on patterns seen in microscopic images of human cells.<br />
<br />
Historically, the work done to classify proteins had only developed methods to classify proteins using single patterns of very few cell types at a time. The goal of this challenge is to develop methods to classify proteins based on multiple/mixed patterns and with a larger range of cell types.<br />
<br />
--------------------------------------------------------------------<br />
'''Project # 7'''<br />
Group members:<br />
<br />
Qianying Zhao<br />
<br />
Hui Huang<br />
<br />
Meiyu Zhou<br />
<br />
Gezhou Zhang<br />
<br />
'''Title:''' Google Analytics Customer Revenue Prediction<br />
<br />
'''Description:''' <br />
Our group will participate in the featured Kaggle competition of Google Analytics Customer Revenue Prediction. In this competition, we will analyze customer dataset from a Google Merchandise Store selling swags to predict revenue per customer using Rstudio. Our presentation report will include not only how we've concluded by classifying and analyzing provided data with appropriate models, but also how we performed in the contest.<br />
<br />
--------------------------------------------------------------------<br />
'''Project # 8'''<br />
Group members:<br />
<br />
Jiayue Zhang<br />
<br />
Lingyun Yi<br />
<br />
Rongrong Su<br />
<br />
Siao Chen<br />
<br />
<br />
'''Title:''' Kaggle--Two Sigma: Using News to Predict Stock Movements<br />
<br />
<br />
'''Description:''' <br />
Stock price is affected by the news to some extent. What is the news influence on stock price and what is the predicted power of the news? <br />
What we are going to do is to use the content of news to predict the tendency of stock price. We will mine the data, finding the useful information behind the big data. As the result we will predict the stock price performance when market faces news.<br />
<br />
<br />
--------------------------------------------------------------------<br />
'''Project # 9'''<br />
Group members:<br />
<br />
Hassan, Ahmad Nayar<br />
<br />
McLellan, Isaac<br />
<br />
Brewster, Kristi<br />
<br />
Melek, Marina Medhat Rassmi <br />
<br />
<br />
'''Title:''' Quick, Draw! Doodle Recognition<br />
<br />
'''Description:''' <br />
<br />
'''Background'''<br />
<br />
Google’s Quick, Draw! is an online game where a user is prompted to draw an image depicting a certain category in under 20 seconds. As the drawing is being completed, the game uses a model which attempts to correctly identify the image being drawn. With the aim to improve the underlying pattern recognition model this game uses, Google is hosting a Kaggle competition asking the public to build a model to correctly identify a given drawing. The model should classify the drawing into one of the 340 label categories within the Quick, Draw! Game in 3 guesses or less.<br />
<br />
'''Proposed Approach'''<br />
<br />
Each image/doodle (input) is considered as a matrix of pixel values. In order to classify images, we need to essentially reshape an images’ respective matrix of pixel values - convolution. This would reduce the dimensionality of the input significantly which in turn reduces the number of parameters of any proposed recognition model. Using filters, pooling layers and further convolution, a final layer called the fully connected layer is used to correlate images with categories, assigning probabilities (weights) and hence classifying images. <br />
<br />
This approach to image classification is called a convolutional neural network (CNN), and we propose using it to classify the doodles within the Quick, Draw! dataset.<br />
<br />
To control overfitting and underfitting of our proposed model and to minimize the error, we will try different architectures consisting of different types and dimensions of pooling layers and input filters.<br />
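The convolution-and-pooling mechanics described above can be sketched in plain numpy. This is a toy illustration only, not our competition model; the image size, filter values, and pooling size are made up:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution (really cross-correlation, as in most CNN libraries)."""
    h, w = kernel.shape
    out_h = image.shape[0] - h + 1
    out_w = image.shape[1] - w + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; crops any remainder rows/columns."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    return feature_map[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# A 28x28 doodle bitmap passed through one 3x3 filter and a 2x2 pool
image = np.random.rand(28, 28)
kernel = np.random.rand(3, 3)
pooled = max_pool(conv2d(image, kernel))
print(pooled.shape)  # (13, 13)
```

Stacking several such filter/pool stages and then flattening into a fully connected layer gives the CNN structure described above.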
<br />
'''Challenges'''<br />
<br />
This project presents a number of interesting challenges:<br />
* The data given for training is noisy in that it contains drawings that are incomplete or simply poorly drawn. Dealing with this noise will be a significant part of our work. <br />
* There are 340 label categories within the Quick, Draw! dataset; this means that the model must be able to classify drawings based on a large pool of information while making effective use of powerful computational resources.<br />
<br />
'''Tools & Resources'''<br />
<br />
* We will use Python & MATLAB.<br />
* We will use the Quick, Draw! Dataset available on the Kaggle competition website. <https://www.kaggle.com/c/quickdraw-doodle-recognition/data><br />
<br />
--------------------------------------------------------------------<br />
'''Project # 10'''<br />
Group members:<br />
<br />
Lam, Amanda<br />
<br />
Huang, Xiaoran<br />
<br />
Chu, Qi<br />
<br />
Sang, Di<br />
<br />
'''Title:''' Kaggle Competition: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:'''<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 11'''<br />
Group members:<br />
<br />
Bobichon, Philomene<br />
<br />
Maheshwari, Aditya<br />
<br />
An, Zepeng<br />
<br />
Stranc, Colin<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:''' <br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 12'''<br />
Group members:<br />
<br />
Huo, Qingxi<br />
<br />
Yang, Yanmin<br />
<br />
Cai, Yuanjing<br />
<br />
Wang, Jiaqi<br />
<br />
'''Title:''' <br />
<br />
'''Description:''' <br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 13'''<br />
Group members:<br />
<br />
Ross, Brendan<br />
<br />
Barenboim, Jon<br />
<br />
Lin, Junqiao<br />
<br />
Bootsma, James<br />
<br />
'''Title:''' Expanding Neural Network<br />
<br />
'''Description:''' The goal of our project is to create an expanding neural network algorithm which starts off by training a small neural network and then expands it to a larger one. We hypothesize that with the proper expansion method we could decrease training time and prevent overfitting. The method we wish to explore is to link together input dimensions based on covariance. Then, when the small neural network reaches convergence, we create a larger neural network without the links between dimensions, using starting values from the smaller network. <br />
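One way the "starting values from the smaller network" step could look is sketched below, under the assumption that a one-hidden-layer network is grown by copying its weights into a larger layer (the covariance-based linking of input dimensions is not shown, and all names and sizes are illustrative):

```python
import numpy as np

def expand_layer(W_small, b_small, V_small, new_hidden):
    """Grow a one-hidden-layer net from k to new_hidden units while preserving
    the function it computes: old units keep their weights, and new units get
    zero outgoing weights so they contribute nothing at first."""
    k, d = W_small.shape             # k hidden units, d inputs
    m = V_small.shape[0]             # m outputs
    W_big = np.zeros((new_hidden, d))
    b_big = np.zeros(new_hidden)
    V_big = np.zeros((m, new_hidden))
    W_big[:k] = W_small
    b_big[:k] = b_small
    V_big[:, :k] = V_small
    # new units get small random incoming weights to break symmetry
    rng = np.random.default_rng(0)
    W_big[k:] = 0.01 * rng.standard_normal((new_hidden - k, d))
    return W_big, b_big, V_big

def forward(x, W, b, V):
    return V @ np.tanh(W @ x + b)

# Check that expansion preserves the small network's outputs
rng = np.random.default_rng(1)
W, b, V = rng.standard_normal((4, 6)), rng.standard_normal(4), rng.standard_normal((2, 4))
Wb, bb, Vb = expand_layer(W, b, V, 10)
x = rng.standard_normal(6)
print(np.allclose(forward(x, W, b, V), forward(x, Wb, bb, Vb)))  # True
```

Because the new units' outgoing weights start at zero, the expanded network computes exactly the same function as the small one at the moment of expansion, so training can resume from the converged solution.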
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 14'''<br />
Group members:<br />
<br />
Schneider, Jason <br />
<br />
Walton, Jordyn <br />
<br />
Abbas, Zahraa<br />
<br />
Na, Andrew<br />
<br />
'''Title:''' Application of ML Classification to Cancer Identification<br />
<br />
'''Description:''' The application of machine learning to cancer classification based on gene expression is a topic of great interest to physicians and biostatisticians alike. We would like to work on this for our final project to encourage the application of proven ML techniques to improve accuracy of cancer classification and diagnosis. In this project, we will use the dataset from Golub et al. [1] which contains data on gene expression on tumour biopsies to train a model and classify healthy individuals and individuals who have cancer.<br />
<br />
One challenge we may face pertains to the way the data was collected. Some parts of the dataset have thousands of features (each representing a quantitative measure of the expression of a certain gene) but as few as twenty samples. We propose some ways to mitigate the impact of this, including the use of PCA, leave-one-out cross-validation, or regularization. <br />
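Two of those mitigations, PCA followed by leave-one-out cross-validation, can be sketched on simulated "many genes, few samples" data. The nearest-centroid classifier and all numbers below are illustrative placeholders, not our final method:

```python
import numpy as np

def pca_project(X, n_components):
    """Project centered data onto its top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def loocv_nearest_centroid(X, y):
    """Leave-one-out CV accuracy of a nearest-centroid classifier."""
    correct = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i          # hold out sample i
        Xtr, ytr = X[mask], y[mask]
        centroids = {c: Xtr[ytr == c].mean(axis=0) for c in np.unique(ytr)}
        pred = min(centroids, key=lambda c: np.linalg.norm(X[i] - centroids[c]))
        correct += (pred == y[i])
    return correct / len(y)

# Simulated data: 20 samples, 1000 features, two classes
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 1000))
y = np.array([0] * 10 + [1] * 10)
X[y == 1, :100] += 2.0                         # a block of informative "genes"
acc = loocv_nearest_centroid(pca_project(X, 3), y)
print(acc)
```

Reducing 1000 features to 3 principal components before classification is one concrete way to keep the model from overfitting twenty samples.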
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 15'''<br />
Group members:<br />
<br />
Praneeth, Sai<br />
<br />
Peng, Xudong <br />
<br />
Li, Alice<br />
<br />
Vajargah, Shahrzad<br />
<br />
'''Title:''' Google Analytics Customer Revenue Prediction [1] - A Kaggle Competition<br />
<br />
'''Description:''' Which airline cabin class is the most profitable? One might guess economy - but in reality, it's the premium classes that show higher returns. According to research by Wendover Productions [2], despite having fewer than 50 seats and taking up more space than the economy class, premium classes end up driving more revenue than the other classes.<br />
<br />
In fact, just like airlines, many companies adopt the business model where the vast majority of revenue is derived from a minority group of customers. As a result, data-intensive promotional strategies are getting more and more attention nowadays from marketing teams to further improve company returns.<br />
<br />
In this Kaggle competition, we are challenged to analyze a Google Merchandise Store customer dataset to predict revenue per customer. We will apply a series of data analytics methods, including pre-processing, data augmentation, and parameter tuning. Different classification algorithms will be compared and optimized in order to achieve the best results.<br />
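Since a small fraction of customers drives most revenue, a natural first preprocessing step is to aggregate transactions per customer and log-transform the total. A toy sketch follows; the records and field layout are made up, not the competition's actual schema:

```python
import numpy as np

# Hypothetical transaction records: (customer_id, revenue). Field names and
# values are illustrative only.
transactions = [
    ("A", 0.0), ("A", 12.5), ("B", 0.0),
    ("C", 3.0), ("C", 0.0), ("C", 40.0),
]

# Aggregate to one total per customer, then take log1p as a prediction
# target, which tames the heavy right tail where a few customers
# account for most of the revenue.
totals = {}
for cid, rev in transactions:
    totals[cid] = totals.get(cid, 0.0) + rev

targets = {cid: np.log1p(total) for cid, total in totals.items()}
print(targets["B"])  # 0.0
```

Customers with no revenue map cleanly to a target of zero, so the many zero-spend visitors do not distort the scale of the regression target.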
<br />
'''Reference:'''<br />
<br />
[1] Kaggle. (2018, Sep 18). Google Analytics Customer Revenue Prediction. Retrieved from https://www.kaggle.com/c/ga-customer-revenue-prediction<br />
<br />
[2] Kottke, J (2017, Mar 17). The economics of airline classes. Retrieved from https://kottke.org/17/03/the-economics-of-airline-classes<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 16'''<br />
Group members:<br />
<br />
Wang, Yu Hao<br />
<br />
Grant, Aden <br />
<br />
McMurray, Andrew<br />
<br />
Song, Baizhi<br />
<br />
'''Title:''' Two Sigma: Using News to Predict Stock Movements - A Kaggle Competition<br />
<br />
By analyzing news data to predict stock prices, Kagglers have a unique opportunity to advance the state of research in understanding the predictive power of the news. This power, if harnessed, could help predict financial outcomes and generate significant economic impact all over the world.<br />
<br />
Data for this competition comes from the following sources:<br />
<br />
Market data provided by Intrinio.<br />
News data provided by Thomson Reuters. Copyright ©, Thomson Reuters, 2017. All Rights Reserved. Use, duplication, or sale of this service, or data contained herein, except as described in the Competition Rules, is strictly prohibited.<br />
<br />
We will test a variety of classification algorithms to determine an appropriate model.<br />
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 17'''<br />
Group Members:<br />
<br />
Jiang, Ya Fan<br />
<br />
Zhang, Yuan<br />
<br />
Hu, Jerry Jie<br />
<br />
'''Title:''' Kaggle Competition: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:''' Construction of a classifier that can learn from noisy training data and generalize to a clean test set. The training data comes from the Google game "Quick, Draw!".<br />
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 18'''<br />
Group Members:<br />
<br />
Zhang, Ben<br />
<br />
'''Title:''' Two Sigma: Using News to Predict Stock Movements<br />
<br />
'''Description:''' Use news analytics to predict stock price performance. This is subject to change.<br />
<br />
----------------------------------------------------------------------<br />
'''Project # 19'''<br />
Group Members:<br />
<br />
Yan Yu Chen<br />
<br />
Qisi Deng<br />
<br />
Hengxin Li<br />
<br />
Bochao Zhang<br />
<br />
Our team currently has two topics of interest at hand, and we have summarized the objective of each below. Please note that we will narrow down our choice after further discussion with the instructor.<br />
<br />
'''Description 1:''' With 14 percent of Americans claiming that social media is their most dominant news source, fake news shared on Facebook and Twitter is invading people’s information-consumption experience. Concomitantly, the quality and nature of online news have been gradually diluted by fake news that is sometimes imperceptible. With the aim of creating an unalloyed Internet surfing experience, we seek to develop a tool that performs fake news detection and classification. <br />
<br />
'''Description 2:''' Statistics Canada has recently reported an increasing trend in Toronto’s violent crime score. Though the Royal Canadian Mounted Police has put considerable effort into tracking crimes, the ambiguous snapshots captured by outdated cameras often hamper investigations. Motivated by this circumstance, our second interest focuses on accurate numeral and letter identification within variable-resolution images.<br />
<br />
----------------------------------------------------------------------<br />
'''Project # 20'''<br />
Group Members:<br />
<br />
Dong, Yongqi (Michael)<br />
<br />
Kingston, Stephen<br />
<br />
'''Title:''' Kaggle--Two Sigma: Using News to Predict Stock Movements <br />
<br />
'''Description:''' The movement in price of a tradeable security, or stock, on any given day is an aggregation of each individual market participant’s appraisal of the intrinsic value of the underlying company or assets. These values are primarily driven by investors’ expectations of the company’s ability to generate future free cash flow. A steady stream of information on the state of the macro- and micro-economic variables that affect a company’s operations informs these market actors, primarily through news articles and alerts. We would like to take a universe of news headlines and parse the information into features that allow us to classify the direction and ‘intensity’ of a stock’s price move on any given day. Strategies may include various classification methods to determine the most effective solution.<br />
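The headline-to-features step might start as simply as a bag-of-words count over a hand-picked vocabulary. This is a toy featurizer with an invented vocabulary and headline; a real pipeline would add n-grams, entity tags, sentiment scores, and so on:

```python
import re
from collections import Counter

def headline_features(headline, vocabulary):
    """Bag-of-words count vector for one headline over a fixed vocabulary."""
    tokens = re.findall(r"[a-z']+", headline.lower())
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

# Invented vocabulary of words that might signal price direction
vocab = ["beats", "misses", "cuts", "raises", "guidance"]
print(headline_features("Acme beats estimates, raises guidance", vocab))  # [1, 0, 0, 1, 1]
```

Each headline becomes a fixed-length numeric vector, which any of the classification methods mentioned above can consume directly.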
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 21'''<br />
Group members:<br />
<br />
Xiao, Alex<br />
<br />
Zhang, Richard<br />
<br />
Ash, Hudson<br />
<br />
Zhu, Ziqiu<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge [Subject to Change]<br />
<br />
'''Description:''' <br />
<br />
"Quick, Draw!" was released as an experimental game to educate the public in a playful way about how AI works. The game prompts users to draw an image depicting a certain category, such as ”banana,” “table,” etc. The game generated more than 1B drawings, of which a subset was publicly released as the basis for this competition’s training set. That subset contains 50M drawings encompassing 340 label categories.</div>
Z62zhu
http://wiki.math.uwaterloo.ca/statwiki/index.php?title=F18-STAT841-Proposal&diff=36685
F18-STAT841-Proposal
2018-10-08T16:23:44Z
<p>Z62zhu: </p>
<hr />
<div><br />
'''Use this format (Don’t remove Project 0)'''<br />
<br />
'''Project # 0'''<br />
Group members:<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
'''Title:''' Making a String Telephone<br />
<br />
'''Description:''' We use paper cups to make a string phone and talk with friends while learning about sound waves with this science project. (Explain your project in one or two paragraphs).<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 1'''<br />
Group members:<br />
<br />
Weng, Jiacheng<br />
<br />
Li, Keqi<br />
<br />
Qian, Yi<br />
<br />
Liu, Bomeng<br />
<br />
'''Title:''' RSNA Pneumonia Detection Challenge<br />
<br />
'''Description:''' <br />
<br />
Our team’s project is the RSNA Pneumonia Detection Challenge from Kaggle competition. The primary goal of this project is to develop a machine learning tool to detect patients with pneumonia based on their chest radiographs (CXR). <br />
<br />
Pneumonia is an infection that inflames the air sacs in human lungs, with symptoms such as chest pain, cough, and fever [1]. Pneumonia can be very dangerous, especially to infants and the elderly. In 2015, 920,000 children under the age of 5 died from this disease [2]. Because of its fatality to children, diagnosing pneumonia is a high priority. A common method of diagnosis is to obtain the patient's chest radiograph (CXR), a gray-scale x-ray image of the patient's chest. A region infected by pneumonia usually shows as an area or areas of increased opacity [3] on the CXR. However, many other factors can also contribute to increased opacity on a CXR, which makes the diagnosis very challenging. Diagnosis also requires highly skilled clinicians and a great deal of time spent screening CXRs. The Radiological Society of North America (RSNA®) sees the opportunity of using machine learning to potentially accelerate the initial CXR screening process. <br />
<br />
For the scope of this project, our team plans to contribute to solving this problem by applying our machine learning knowledge in image processing and classification. Team members are going to apply techniques that include, but are not limited to: logistic regression, random forest, SVM, kNN, CNN, etc., in order to successfully detect CXRs with pneumonia.<br />
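As a baseline among those techniques, k-nearest neighbours on flattened pixel intensities is straightforward to sketch. The data below is synthetic, standing in for CXRs; this is purely illustrative and not a clinical tool:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify a flattened image by majority vote of its k nearest
    training images under Euclidean distance."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(dists)[:k]]
    values, counts = np.unique(nearest, return_counts=True)
    return values[np.argmax(counts)]

# Tiny synthetic stand-in: brighter "opaque" images (label 1) vs darker ones (label 0)
rng = np.random.default_rng(0)
X0 = rng.uniform(0.0, 0.4, size=(10, 64))   # ten 8x8 images, flattened
X1 = rng.uniform(0.6, 1.0, size=(10, 64))
X_train = np.vstack([X0, X1])
y_train = np.array([0] * 10 + [1] * 10)
print(knn_predict(X_train, y_train, np.full(64, 0.9)))  # 1
```

Stronger methods on the list (SVM, CNN) would replace raw pixel distance with learned features, but this gives a floor to compare against.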
<br />
<br />
[1] (Accessed 2018, Oct. 4). Pneumonia [Online]. MAYO CLINIC. Available from: https://www.mayoclinic.org/diseases-conditions/pneumonia/symptoms-causes/syc-20354204<br />
[2] (Accessed 2018, Oct. 4). RSNA Pneumonia Detection Challenge [Online]. Kaggle. Available from: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge<br />
[3] Franquet T. Imaging of community-acquired pneumonia. J Thorac Imaging 2018 (epub ahead of print). PMID 30036297<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 2'''<br />
Group members:<br />
<br />
Hou, Zhaoran<br />
<br />
Zhang, Chi<br />
<br />
'''Title:''' <br />
<br />
'''Description:'''<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 3'''<br />
Group members:<br />
<br />
Hanzhen Yang<br />
<br />
Jing Pu Sun<br />
<br />
Ganyuan Xuan<br />
<br />
Yu Su<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:'''<br />
<br />
Our team chose the [https://www.kaggle.com/c/quickdraw-doodle-recognition Quick, Draw! Doodle Recognition Challenge] from the Kaggle Competition. The goal of the competition is to build an image recognition tool that can classify hand-drawn doodles into one of the 340 categories.<br />
<br />
The main challenge of the project lies in the training set being very noisy. Hand-drawn artwork may deviate substantially from the actual object, and differs from person to person. Mislabeled images also present a problem, since they create outlier points when we train our models. <br />
<br />
We plan on learning more about some of the currently mature image recognition algorithms to inspire and develop our own model.<br />
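One preprocessing detail worth noting: the released drawings are stored as stroke sequences (lists of x, y points) rather than bitmaps, so an image-recognition model first needs a rasterization step. A minimal sketch, assuming the simplified dataset's 0–255 coordinate range (the interpolation-free point plotting is a simplification):

```python
import numpy as np

def strokes_to_bitmap(strokes, size=32):
    """Rasterize a drawing given as strokes, each a list of (x, y) points in
    [0, 255], onto a size x size binary grid. Plots points only; a fuller
    version would interpolate along each stroke segment."""
    img = np.zeros((size, size), dtype=np.uint8)
    for stroke in strokes:
        for x, y in stroke:
            col = min(int(x * size / 256), size - 1)
            row = min(int(y * size / 256), size - 1)
            img[row, col] = 1
    return img

# One diagonal stroke with three recorded points
bitmap = strokes_to_bitmap([[(0, 0), (128, 128), (255, 255)]])
print(bitmap.sum())  # 3
```

The resulting grids can then be fed to any standard image classifier; alternatively, the raw stroke sequences can be modeled directly with a sequence model.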
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 4'''<br />
Group members:<br />
<br />
Snaith, Mitchell<br />
<br />
'''Title:''' Reproducibility report: *Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks*<br />
<br />
'''Description:''' <br />
<br />
The paper *Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks* [1] has been submitted to ICLR 2019. It aims to "fix" variational Bayes and turn it into a robust inference tool through two innovations. <br />
<br />
Goals are to: <br />
<br />
- reproduce the deterministic variational inference scheme as described in the paper without referencing the original author's code, providing a 3rd party implementation<br />
<br />
- reproduce experiment results with own implementation, using the same NN framework for reference implementations of compared methods described in the paper<br />
<br />
- reproduce experiment results with the author's own implementation<br />
<br />
- explore other possible applications of variational Bayes besides heteroscedastic regression<br />
<br />
[1] OpenReview location: https://openreview.net/forum?id=B1l08oAct7<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 5'''<br />
Group members:<br />
<br />
Rebecca, Chen<br />
<br />
Susan,<br />
<br />
Mike, Li<br />
<br />
Ted, Wang<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:''' <br />
<br />
Classification has become an increasingly eye-catching topic, especially with the rise of machine learning in recent years. Our team is particularly interested in machine learning algorithms that optimize classification for specific types of images. <br />
<br />
In this project, we will dig into the base classifiers we learned in class and try to combine them to find an optimal solution for a certain type of image dataset. Currently, we are looking into a dataset from Kaggle: the Quick, Draw! Doodle Recognition Challenge. The dataset in this competition contains 50M drawings among 340 categories and is a subset of the world’s largest doodling dataset, which is continually updated by real players of the drawing game. Anyone can contribute by joining in (quickdraw.withgoogle.com).<br />
<br />
As machine learning students, we are eager to help develop a better classification method. By “better”, we mean finding a balance between simplicity and accuracy. We will start with neural networks using different activation functions in each layer, and we will also combine base classifiers with bagging, random forests, and boosting for ensemble learning. We will also regularize our parameters to avoid overfitting the training dataset. Finally, we will summarize the features of this type of image dataset, formulate our solutions, and standardize our steps for solving this kind of problem. <br />
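As an example of the ensemble step, bagging a weak base classifier looks like the sketch below. A decision stump is used as the base learner for brevity, and the two-cluster data and all parameters are made up:

```python
import numpy as np

def fit_stump(X, y):
    """Best single-feature threshold classifier (decision stump) by training accuracy."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(sign * (X[:, j] - t) > 0, 1, 0)
                acc = (pred == y).mean()
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    return best[1:]

def stump_predict(stump, X):
    j, t, sign = stump
    return np.where(sign * (X[:, j] - t) > 0, 1, 0)

def bagging(X, y, n_estimators=15, rng=None):
    """Fit stumps on bootstrap resamples; predict by majority vote."""
    if rng is None:
        rng = np.random.default_rng(0)
    stumps = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(y), size=len(y))   # bootstrap sample
        stumps.append(fit_stump(X[idx], y[idx]))
    def predict(Xq):
        votes = np.mean([stump_predict(s, Xq) for s in stumps], axis=0)
        return (votes > 0.5).astype(int)
    return predict

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
predict = bagging(X, y)
print(predict(np.array([[3.0, 3.0], [0.0, 0.0]])))
```

Swapping the stump for any stronger base classifier, or replacing the uniform vote with boosting's weighted vote, follows the same structure.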
<br />
Hopefully, we can not only finish our project successfully, but also make a little contribution to machine learning research field.<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 6'''<br />
Group members:<br />
<br />
Ngo, Jameson<br />
<br />
Xu, Amy<br />
<br />
'''Title:''' Kaggle Challenge: [https://www.kaggle.com/c/human-protein-atlas-image-classification Human Protein Atlas Image Classification]<br />
<br />
'''Description:''' <br />
<br />
We will participate in the Human Protein Atlas Image Classification competition featured on Kaggle. We will classify proteins based on patterns seen in microscopic images of human cells.<br />
<br />
Historically, the work done to classify proteins had only developed methods to classify proteins using single patterns of very few cell types at a time. The goal of this challenge is to develop methods to classify proteins based on multiple/mixed patterns and with a larger range of cell types.<br />
<br />
--------------------------------------------------------------------<br />
'''Project # 7'''<br />
Group members:<br />
<br />
Qianying Zhao<br />
<br />
Hui Huang<br />
<br />
Meiyu Zhou<br />
<br />
Gezhou Zhang<br />
<br />
'''Title:''' Google Analytics Customer Revenue Prediction<br />
<br />
'''Description:''' <br />
Our group will participate in the featured Kaggle competition Google Analytics Customer Revenue Prediction. In this competition, we will analyze a customer dataset from a Google Merchandise Store selling swag to predict revenue per customer using RStudio. Our presentation report will cover not only the conclusions we reach by classifying and analyzing the provided data with appropriate models, but also how we performed in the contest.<br />
<br />
--------------------------------------------------------------------<br />
'''Project # 8'''<br />
Group members:<br />
<br />
Jiayue Zhang<br />
<br />
Lingyun Yi<br />
<br />
Rongrong Su<br />
<br />
Siao Chen<br />
<br />
<br />
'''Title:''' Kaggle--Two Sigma: Using News to Predict Stock Movements<br />
<br />
<br />
'''Description:''' <br />
Stock prices are affected by the news to some extent. What is the influence of news on stock prices, and what is the predictive power of the news? <br />
We are going to use the content of news articles to predict the trend of stock prices. We will mine the data, finding the useful information behind the big data. As a result, we will predict stock price performance when the market reacts to news.<br />
<br />
<br />
--------------------------------------------------------------------<br />
'''Project # 9'''<br />
Group members:<br />
<br />
Hassan, Ahmad Nayar<br />
<br />
McLellan, Isaac<br />
<br />
Brewster, Kristi<br />
<br />
Melek, Marina Medhat Rassmi <br />
<br />
<br />
'''Title:''' Quick, Draw! Doodle Recognition<br />
<br />
'''Description:''' <br />
<br />
'''Background'''<br />
<br />
Google’s Quick, Draw! is an online game where a user is prompted to draw an image depicting a certain category in under 20 seconds. As the drawing is being completed, the game uses a model that attempts to correctly identify the image being drawn. With the aim of improving the underlying pattern recognition model this game uses, Google is hosting a Kaggle competition asking the public to build a model to correctly identify a given drawing. The model should classify the drawing into one of the 340 label categories within the Quick, Draw! game in 3 guesses or fewer.<br />
<br />
'''Proposed Approach'''<br />
<br />
Each image/doodle (input) is treated as a matrix of pixel values. To classify images, we essentially reshape each image's matrix of pixel values through convolution. This reduces the dimensionality of the input significantly, which in turn reduces the number of parameters of any proposed recognition model. After filters, pooling layers, and further convolution, a final layer called the fully connected layer is used to correlate images with categories, assigning probabilities (weights) and hence classifying images. <br />
<br />
This approach to image classification is called a convolutional neural network (CNN), and we propose using it to classify the doodles within the Quick, Draw! dataset.<br />
<br />
To control overfitting and underfitting of our proposed model and to minimize the error, we will try different architectures consisting of different types and dimensions of pooling layers and input filters.<br />
<br />
'''Challenges'''<br />
<br />
This project presents a number of interesting challenges:<br />
* The data given for training is noisy in that it contains drawings that are incomplete or simply poorly drawn. Dealing with this noise will be a significant part of our work. <br />
* There are 340 label categories within the Quick, Draw! dataset; this means that the model must be able to classify drawings based on a large pool of information while making effective use of powerful computational resources.<br />
<br />
'''Tools & Resources'''<br />
<br />
* We will use Python & MATLAB.<br />
* We will use the Quick, Draw! Dataset available on the Kaggle competition website. <https://www.kaggle.com/c/quickdraw-doodle-recognition/data><br />
<br />
--------------------------------------------------------------------<br />
'''Project # 10'''<br />
Group members:<br />
<br />
Lam, Amanda<br />
<br />
Huang, Xiaoran<br />
<br />
Chu, Qi<br />
<br />
Sang, Di<br />
<br />
'''Title:''' Kaggle Competition: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:'''<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 11'''<br />
Group members:<br />
<br />
Bobichon, Philomene<br />
<br />
Maheshwari, Aditya<br />
<br />
An, Zepeng<br />
<br />
Stranc, Colin<br />
<br />
'''Title:''' Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:''' <br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 12'''<br />
Group members:<br />
<br />
Huo, Qingxi<br />
<br />
Yang, Yanmin<br />
<br />
Cai, Yuanjing<br />
<br />
Wang, Jiaqi<br />
<br />
'''Title:''' <br />
<br />
'''Description:''' <br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 13'''<br />
Group members:<br />
<br />
Ross, Brendan<br />
<br />
Barenboim, Jon<br />
<br />
Lin, Junqiao<br />
<br />
Bootsma, James<br />
<br />
'''Title:''' Expanding Neural Network<br />
<br />
'''Description:''' The goal of our project is to create an expanding neural network algorithm which starts off by training a small neural network and then expands it to a larger one. We hypothesize that with the proper expansion method we could decrease training time and prevent overfitting. The method we wish to explore is to link together input dimensions based on covariance. Then, when the small neural network reaches convergence, we create a larger neural network without the links between dimensions, using starting values from the smaller network. <br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 14'''<br />
Group members:<br />
<br />
Schneider, Jason <br />
<br />
Walton, Jordyn <br />
<br />
Abbas, Zahraa<br />
<br />
Na, Andrew<br />
<br />
'''Title:''' Application of ML Classification to Cancer Identification<br />
<br />
'''Description:''' The application of machine learning to cancer classification based on gene expression is a topic of great interest to physicians and biostatisticians alike. We would like to work on this for our final project to encourage the application of proven ML techniques to improve accuracy of cancer classification and diagnosis. In this project, we will use the dataset from Golub et al. [1] which contains data on gene expression on tumour biopsies to train a model and classify healthy individuals and individuals who have cancer.<br />
<br />
One challenge we may face pertains to the way the data was collected. Some parts of the dataset have thousands of features (each representing a quantitative measure of the expression of a certain gene) but as few as twenty samples. We propose some ways to mitigate the impact of this, including the use of PCA, leave-one-out cross-validation, or regularization. <br />
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 15'''<br />
Group members:<br />
<br />
Praneeth, Sai<br />
<br />
Peng, Xudong <br />
<br />
Li, Alice<br />
<br />
Vajargah, Shahrzad<br />
<br />
'''Title:''' Google Analytics Customer Revenue Prediction [1] - A Kaggle Competition<br />
<br />
'''Description:''' Which airline cabin class is the most profitable? One might guess economy - but in reality, it's the premium classes that show higher returns. According to research by Wendover Productions [2], despite having fewer than 50 seats and taking up more space than the economy class, premium classes end up driving more revenue than the other classes.<br />
<br />
In fact, just like airlines, many companies adopt the business model where the vast majority of revenue is derived from a minority group of customers. As a result, data-intensive promotional strategies are getting more and more attention nowadays from marketing teams to further improve company returns.<br />
<br />
In this Kaggle competition, we are challenged to analyze a Google Merchandise Store customer dataset to predict revenue per customer. We will apply a series of data analytics methods, including pre-processing, data augmentation, and parameter tuning. Different classification algorithms will be compared and optimized in order to achieve the best results.<br />
<br />
'''Reference:'''<br />
<br />
[1] Kaggle. (2018, Sep 18). Google Analytics Customer Revenue Prediction. Retrieved from https://www.kaggle.com/c/ga-customer-revenue-prediction<br />
<br />
[2] Kottke, J (2017, Mar 17). The economics of airline classes. Retrieved from https://kottke.org/17/03/the-economics-of-airline-classes<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 16'''<br />
Group members:<br />
<br />
Wang, Yu Hao<br />
<br />
Grant, Aden <br />
<br />
McMurray, Andrew<br />
<br />
Song, Baizhi<br />
<br />
'''Title:''' Two Sigma: Using News to Predict Stock Movements - A Kaggle Competition<br />
<br />
By analyzing news data to predict stock prices, Kagglers have a unique opportunity to advance the state of research in understanding the predictive power of the news. This power, if harnessed, could help predict financial outcomes and generate significant economic impact all over the world.<br />
<br />
Data for this competition comes from the following sources:<br />
<br />
Market data provided by Intrinio.<br />
News data provided by Thomson Reuters. Copyright ©, Thomson Reuters, 2017. All Rights Reserved. Use, duplication, or sale of this service, or data contained herein, except as described in the Competition Rules, is strictly prohibited.<br />
<br />
We will test a variety of classification algorithms to determine an appropriate model.<br />
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 17'''<br />
Group Members:<br />
<br />
Jiang, Ya Fan<br />
<br />
Zhang, Yuan<br />
<br />
Hu, Jerry Jie<br />
<br />
'''Title:''' Kaggle Competition: Quick, Draw! Doodle Recognition Challenge<br />
<br />
'''Description:''' Construction of a classifier that can learn from noisy training data and generalize to a clean test set. The training data comes from the Google game "Quick, Draw!".<br />
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 18'''<br />
Group Members:<br />
<br />
Zhang, Ben<br />
<br />
'''Title:''' Two Sigma: Using News to Predict Stock Movements<br />
<br />
'''Description:''' Use news analytics to predict stock price performance. This is subject to change.<br />
<br />
----------------------------------------------------------------------<br />
'''Project # 19'''<br />
Group Members:<br />
<br />
Yan Yu Chen<br />
<br />
Qisi Deng<br />
<br />
Hengxin Li<br />
<br />
Bochao Zhang<br />
<br />
Our team currently has two topics of interest at hand, and we have summarized the objective of each below. Please note that we will narrow down our choice after further discussion with the instructor.<br />
<br />
'''Description 1:''' With 14 percent of Americans claiming that social media is their most dominant news source, fake news shared on Facebook and Twitter is invading people’s information-consumption experience. Concomitantly, the quality and nature of online news have been gradually diluted by fake news that is sometimes imperceptible. With the aim of creating an unalloyed Internet surfing experience, we seek to develop a tool that performs fake news detection and classification. <br />
<br />
'''Description 2:''' Statistics Canada has recently reported an increasing trend in Toronto’s violent crime score. Though the Royal Canadian Mounted Police has put considerable effort into tracking crimes, the ambiguous snapshots captured by outdated cameras often hamper investigations. Motivated by this circumstance, our second interest focuses on accurate numeral and letter identification within variable-resolution images.<br />
<br />
----------------------------------------------------------------------<br />
'''Project # 20'''<br />
Group Members:<br />
<br />
Dong, Yongqi (Michael)<br />
<br />
Kingston, Stephen<br />
<br />
'''Title:''' Kaggle--Two Sigma: Using News to Predict Stock Movements <br />
<br />
'''Description:''' The movement in price of a tradeable security, or stock, on any given day is an aggregation of each individual market participant’s appraisal of the intrinsic value of the underlying company or assets. These values are primarily driven by investors’ expectations of the company’s ability to generate future free cash flow. A steady stream of information on the state of the macro- and micro-economic variables that affect a company’s operations informs these market actors, primarily through news articles and alerts. We would like to take a universe of news headlines and parse the information into features that allow us to classify the direction and ‘intensity’ of a stock’s price move on any given day. Strategies may include various classification methods to determine the most effective solution.<br />
<br />
----------------------------------------------------------------------<br />
<br />
'''Project # 21'''<br />
Group members:<br />
<br />
Xiao, Alex<br />
<br />
Zhang, Richard<br />
<br />
Ash, Hudson<br />
<br />
Zhu, Ziqiu<br />
<br />
'''Title:''' TBD<br />
<br />
'''Description:'''</div>
Z62zhu