From statwiki
Revision as of 18:09, 29 October 2009 by Y2yao (talk | contribs) (By: Yao Yao, Min Chen, Jiaxi Liang, Zhenghui Wu)

Use the following format for your proposal (maximum one page)

Project 1 : How to Make a Birdhouse

By: Maseeh Ghodsi, Soroush Ghodsi and Ali Ghodsi

Write your proposal here

Project 1 : Recognizing Cheaters in Multi-Player Online Game Environment

By: Mark Stuart, Mathieu Zerter, Iulia Pargaru

Multiplayer online games constitute a very large market in the entertainment industry, generating billions in revenue.<ref> S. F. Yeung, John C. S. Lui, Jianchuan Liu, Jeff Yan, Detecting Cheaters for Multiplayer Games: Theory, Design, and Implementation </ref> In these games, players use characters to perform specific actions and interact with other characters, and the number of users is rapidly increasing. Computer play-programs are often used to perform actions automatically on behalf of a human player. This type of cheating gives the player an unfair advantage, abuses resources, disrupts other players’ gaming experience and can even harm servers.<ref>Hyungil Kim, Sungwoo Hong, Juntae Kim, Detection of Auto Programs for MMORPGs</ref> Play-programs usually have a specific goal or a task that is repeated often. We therefore suspect that the sequences of events and actions created by play-programs are statistically different from those generated by a human player. We will use Tibia, an online game created by CIPSoft, as a case study.

We have recruited volunteers who agreed to provide us with their gaming information. We are gathering and parsing the packets sent by the user to the game server, which contain detailed information about the actions the user performs. The original data consist of: user ID, length of event, time of event, action type, action details, and a cheating label (0 or 1). The sequences of events produced by humans and by play-programs will be transformed into a set of features that reveal additional information, such as the periodicity of events, common action sequences, rare events or actions, and a measure of the complexity of an action. Various algorithms will then be applied to classify the data represented by this set of attributes. Similar studies suggest that the following methods classify human vs. machine players effectively in an online game environment:

  • Dynamic Bayesian Network
  • Isomap
  • Decision Tree
  • Artificial Neural Network
  • Support Vector Machines
  • K nearest neighbours
  • Naive Bayesian

We intend to find a classification algorithm that detects in-game cheating in the online game Tibia with reasonable accuracy.
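As a rough illustration of the feature transformation described above, the following Python sketch turns one session's event log into a few numeric features. The record layout (time in seconds, action type), the function names and the two sample sessions are all made up for illustration; they are not the actual Tibia packet format or the features the project will finally use.

```python
from collections import Counter
from statistics import mean, pstdev


def timing_regularity(times):
    """Coefficient of variation of the gaps between consecutive events.

    Values near 0 indicate almost perfectly periodic input.
    """
    gaps = [b - a for a, b in zip(times, times[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None  # not enough data to say anything
    return pstdev(gaps) / mean(gaps)


def top_bigram_share(actions):
    """Share of consecutive action pairs taken up by the most common pair."""
    pairs = list(zip(actions, actions[1:]))
    if not pairs:
        return None
    return Counter(pairs).most_common(1)[0][1] / len(pairs)


def extract_features(events):
    """Map a list of (time, action_type) records to a feature dictionary."""
    events = sorted(events)  # order by time of event
    times = [t for t, _ in events]
    actions = [a for _, a in events]
    return {
        "gap_cv": timing_regularity(times),
        "top_bigram_share": top_bigram_share(actions),
        "distinct_actions": len(set(actions)),
    }


# Hypothetical sessions, invented for this example only.
bot_like = [(0.0, "attack"), (1.0, "loot"), (2.0, "attack"),
            (3.0, "loot"), (4.0, "attack")]
human_like = [(0.0, "attack"), (0.7, "move"), (3.1, "loot"),
              (3.4, "chat"), (8.0, "attack")]

print(extract_features(bot_like))
print(extract_features(human_like))
```

Each session then becomes one row of attributes that, together with the cheating label, could be passed to any of the classifiers listed above. Whether these particular features separate humans from play-programs is an open question that the collected data would have to answer.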

Project 2 :

By: Jiheng Wang

Project 3 : Identifying faults of waste water treatment plant

By: Yao Yao, Min Chen, Jiaxi Liang, Zhenghui Wu

Objective: To classify the operational state of the plant, and thereby predict faults, from the state variables measured at each stage of the treatment process.

Background Information: Liquid waste treatment plant and system operators, also known as waste water treatment plant and system operators, remove harmful pollutants from domestic and industrial liquid waste so that it is safe to return to the environment. There are four stages in the water treatment process: plant input, primary settler, secondary settler and plant output. Operators read, interpret, and adjust meters and gauges to make sure that plant equipment and processes are working properly. Operators control chemical-feeding devices, take samples of the water or waste water, perform chemical and biological laboratory analyses, and adjust the amounts of chemicals in the water. We use sensors to sample and measure water quality.

Data Description: This dataset comes from daily sensor measurements at an urban waste water treatment plant. It contains 527 data points and 38 variables recording the water quality at each stage of the treatment process.


  • Principal Component Analysis (PCA)
  • Locally Linear Embedding (LLE)
  • Cluster Analysis and Conceptual Clustering
  • Linear Discriminant Analysis (LDA/FLDA)
  • Linear and Logistic Regression
  • Neural Network (NN)
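
To make the first technique in the list concrete, here is a minimal Python sketch of PCA on standardized variables, finding the leading principal component by power iteration. The three-column toy table is invented for illustration and merely stands in for the 527 × 38 plant measurements; the function names are hypothetical, and in practice an established numerical library would normally be used instead of hand-written loops.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Scale each column to zero mean and unit variance.

    Constant columns are left at zero instead of dividing by zero.
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def covariance(rows):
    """Covariance matrix of rows that are already centred."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]


def first_component(cov, iters=200):
    """Leading eigenvector of cov, approximated by power iteration.

    Assumes the start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [float(i + 1) for i in range(d)]  # arbitrary non-uniform start
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


# Made-up readings: columns 0 and 1 move together, column 2 does not.
toy = [
    [1.0, 2.0, 5.0],
    [2.0, 4.1, 3.0],
    [3.0, 6.0, 6.0],
    [4.0, 8.2, 2.0],
    [5.0, 9.9, 4.0],
]

z = standardize(toy)
loadings = first_component(covariance(z))
scores = [sum(x * l for x, l in zip(row, loadings)) for row in z]
print(loadings)
print(scores)
```

The loadings show which variables dominate the main direction of variation, and the scores give each day's position along it. Days with unusual scores are candidates for closer inspection, although whether they correspond to actual plant faults would need to be checked against the real data.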