http://wiki.math.uwaterloo.ca/statwiki/index.php?title=is_Multinomial_PCA_Multi-faceted_Clustering_or_Dimensionality_Reduction&feed=atom&action=historyis Multinomial PCA Multi-faceted Clustering or Dimensionality Reduction - Revision history2024-03-29T08:39:16ZRevision history for this page on the wikiMediaWiki 1.41.0http://wiki.math.uwaterloo.ca/statwiki/index.php?title=is_Multinomial_PCA_Multi-faceted_Clustering_or_Dimensionality_Reduction&diff=27510&oldid=prevConversion script: Conversion script moved page Is Multinomial PCA Multi-faceted Clustering or Dimensionality Reduction to is Multinomial PCA Multi-faceted Clustering or Dimensionality Reduction: Converting page titles to lowercase2017-08-30T13:45:36Z<p>Conversion script moved page <a href="/statwiki/index.php?title=Is_Multinomial_PCA_Multi-faceted_Clustering_or_Dimensionality_Reduction" class="mw-redirect" title="Is Multinomial PCA Multi-faceted Clustering or Dimensionality Reduction">Is Multinomial PCA Multi-faceted Clustering or Dimensionality Reduction</a> to <a href="/statwiki/index.php?title=is_Multinomial_PCA_Multi-faceted_Clustering_or_Dimensionality_Reduction" title="is Multinomial PCA Multi-faceted Clustering or Dimensionality Reduction">is Multinomial PCA Multi-faceted Clustering or Dimensionality Reduction</a>: Converting page titles to lowercase</p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<tr class="diff-title" lang="us">
<td colspan="1" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="1" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 09:45, 30 August 2017</td>
</tr><tr><td colspan="2" class="diff-notice" lang="us"><div class="mw-diff-empty">(No difference)</div>
</td></tr></table>Conversion scripthttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=is_Multinomial_PCA_Multi-faceted_Clustering_or_Dimensionality_Reduction&diff=8803&oldid=prevLishayu: /* Introduction */2010-11-15T04:57:17Z<p><span dir="auto"><span class="autocomment">Introduction</span></span></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="us">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 00:57, 15 November 2010</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l2">Line 2:</td>
<td colspan="2" class="diff-lineno">Line 2:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br></td></tr>
<tr><td class="diff-marker" data-marker="−"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>A body of techniques with completely different goals is known as [http://en.wikipedia.org/wiki/Dimensionality_reduction dimensionality reduction]: they seek to reduce the dimensions of an item/document. The state of the art here is [http://en.wikipedia.org/wiki/Principle_components_analysis Principal Components Analysis (PCA)]. In text applications, a PCA variant called latent semantic indexing (LSI) is used. A rich body of practical experience indicates that LSI is not ideal for the task, and its theoretical justifications use unrealistic assumptions.</div></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div> </div></td></tr>
<tr><td colspan="2" class="diff-side-deleted"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>A body of techniques with completely different goals is known as [http://en.wikipedia.org/wiki/Dimensionality_reduction dimensionality reduction]: they seek to reduce the dimensions of an item/document. The state of the art here is [http://en.wikipedia.org/wiki/Principle_components_analysis Principal Components Analysis (PCA)]. In text applications, a PCA variant called latent semantic indexing (LSI) is used. A rich body of practical experience indicates that LSI is not ideal for the task, and its theoretical justifications use unrealistic assumptions. <ins style="font-weight: bold; text-decoration: none;">As a substitute for PCA on discrete data, authors have recently proposed discrete analogues of PCA. We refer to the method as multinomial PCA (mPCA) because it is a precise multinomial analogue of the formulation of PCA as a Gaussian mixture of Gaussians.</ins></div></td></tr>
<tr><td colspan="2" class="diff-side-deleted"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div> </div></td></tr>
<tr><td colspan="2" class="diff-side-deleted"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">This paper describes our experiments, which are intended to understand mPCA and to determine whether it should be called a multi-faceted clustering algorithm or a dimensionality reduction algorithm. </ins></div></td></tr>
<tr><td colspan="2" class="diff-side-deleted"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div> </div></td></tr>
<tr><td colspan="2" class="diff-side-deleted"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">==Multinomial PCA==</ins></div></td></tr>
<tr><td colspan="2" class="diff-side-deleted"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">===The Model===</ins></div></td></tr>
<tr><td colspan="2" class="diff-side-deleted"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">====A Gaussian model====</ins></div></td></tr>
</table>Lishayuhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=is_Multinomial_PCA_Multi-faceted_Clustering_or_Dimensionality_Reduction&diff=8801&oldid=prevLishayu: /* Introduction */2010-11-15T04:47:34Z<p><span dir="auto"><span class="autocomment">Introduction</span></span></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="us">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 00:47, 15 November 2010</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l2">Line 2:</td>
<td colspan="2" class="diff-lineno">Line 2:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br></td></tr>
<tr><td colspan="2" class="diff-side-deleted"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">A body of techniques with completely different goals is known as [http://en.wikipedia.org/wiki/Dimensionality_reduction dimensionality reduction]: they seek to reduce the dimensions of an item/document. The state of the art here is [http://en.wikipedia.org/wiki/Principle_components_analysis Principal Components Analysis (PCA)]. In text applications, a PCA variant called latent semantic indexing (LSI) is used. A rich body of practical experience indicates that LSI is not ideal for the task, and its theoretical justifications use unrealistic assumptions.</ins></div></td></tr>
</table>Lishayuhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=is_Multinomial_PCA_Multi-faceted_Clustering_or_Dimensionality_Reduction&diff=8800&oldid=prevLishayu: Created page with "==Introduction== A now standard method for analyzing discrete data such as documents is [http://en.wikipedia.org/wiki/Cluster_analysis clustering] or [http://en.wikipedia.org/wi..."2010-11-15T04:29:47Z<p>Created page with "==Introduction== A now standard method for analyzing discrete data such as documents is [http://en.wikipedia.org/wiki/Cluster_analysis clustering] or [http://en.wikipedia.org/wi..."</p>
<p><b>New page</b></p><div>==Introduction==<br />
<br />
A now-standard method for analyzing discrete data such as documents is [http://en.wikipedia.org/wiki/Cluster_analysis clustering] or [http://en.wikipedia.org/wiki/Unsupervised_learning unsupervised learning]. A rich variety of methods exists, borrowing theory and algorithms from a broad spectrum of computer science: [http://en.wikipedia.org/wiki/Spectral_method spectral methods], [http://en.wikipedia.org/wiki/Kd-tree kd-trees], data merging algorithms, and so on. All these methods, however, have one significant drawback for typical applications in areas such as document or image analysis: each item/document is assigned exclusively to one class. In practice, documents invariably mix a few topics, as is readily seen by inspecting the human-classified Reuters newswire, so the automated construction of topic hierarchies needs to reflect this. One alternative is to make clusters multifaceted, whereby a document can be assigned, using a convex combination, to a number of clusters rather than uniquely to one cluster. This is an unsupervised version of the so-called multi-class classification task.</div>Lishayu
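The contrast between exclusive assignment and assignment by a convex combination can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the page: the four-word vocabulary, the two topic profiles, and the function names <code>hard_assignment</code> and <code>convex_assignment</code> are hypothetical, and the sketch only shows what a convex combination of clusters means, not how mPCA estimates one.

```python
# Hypothetical toy data: each cluster (topic) is a probability
# distribution over a four-word vocabulary.
VOCAB = ["oil", "price", "wheat", "export"]
TOPICS = {
    "energy": [0.6, 0.3, 0.0, 0.1],
    "grain": [0.0, 0.2, 0.6, 0.2],
}


def hard_assignment(label):
    """Classical clustering: the document takes the profile of exactly one cluster."""
    return list(TOPICS[label])


def convex_assignment(weights):
    """Multi-faceted clustering: mix cluster profiles with convex weights.

    `weights` maps cluster labels to non-negative numbers that sum to 1.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    mixed = [0.0] * len(VOCAB)
    for label, w in weights.items():
        for i, p in enumerate(TOPICS[label]):
            mixed[i] += w * p
    return mixed


# A document that is 70% about energy and 30% about grain.
doc_profile = convex_assignment({"energy": 0.7, "grain": 0.3})
for word, prob in zip(VOCAB, doc_profile):
    print(f"{word}: {prob:.2f}")
```

Because the weights are non-negative and sum to one, the mixed profile is itself a probability distribution over the vocabulary, which is what allows a single document to sit between clusters rather than in exactly one.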