Note: There will be minor changes to the assignment. Please check back for the latest updates!


Many pervasive computing systems attempt to behave more intelligently and less intrusively by interpreting what a human agent is doing at a given moment, e.g. by not letting the cellular phone ring when you are in a cinema. In some cases, context-aware systems are first “trained” by being fed example sensor data captured while the human agent is performing the activities the system should know about. The capturing of the training sensor data is often performed in one time period and the actual training in another (following) time period. Once the system has been trained and has formed a model of the activities (more precisely, once it has determined the features of the data that distinguish the activities from each other), that model is embedded into a software application/mobile device, making that application/device able to behave differently depending on the activity its user is engaged in at the moment.


The goal of this assignment is to create a model in the Weka analysis tool (there are versions of it for both Windows and Mac) that is able to automatically distinguish among three everyday body locomotion activities (“walking”, “climbing the stairs”, and “sitting on a chair”) on the basis of accelerometer data. You are not asked to embed the model into an interactive system as described in the introduction above; instead, you stop at the stage where your Weka model has been trained. The data is generated by an Android phone worn in a pocket, while the actual activity identification is done “offline” (that is, not in real time) using the Weka data analysis toolkit. All data files generated by the Android device's accelerometer are to be uploaded to a Google App Engine DataStore, where an App Engine servlet should convert the data into a CSV-formatted file that Weka can take as input.

Once both system components are developed (the Android app + the App Engine servlet), you are to place the Android phone in your pocket and perform repeated recordings of the three locomotion activities mentioned above, that is, 10 recordings of walking, 10 recordings of climbing the stairs, and 10 recordings of sitting on a chair. After that, you will take the CSV file that your App Engine servlet generated from the data and load it into Weka for preprocessing and training of your activity model.

Assignment details

The assignment consists of three parts, which all build on previous lab classes in this course, specifically lab classes #10, #11, and #12.

Part 1: Develop the activity recorder Android app

Write an Android application that works as an “activity recorder”, allowing the user to a) start and stop the recording of accelerometer data and b) label it, e.g., give the recorded data the name “13:45:16_climbing_the_stairs”. The application should allow the user to upload the recorded accelerometer data to a Google App Engine DataStore by simply pushing a button, either right after a recording has taken place or later, after more recordings have been made. The recommended approach for developing this app is the one described in lab class #10 (see part III: Cloud Computing and part IV: Activity Detection). We recommend using an accelerometer sampling frequency of 20 Hz and making each activity recording around 10 seconds long.
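At 20 Hz, a 10-second recording amounts to about 200 samples per axis. The buffering logic can be sketched as below; this is an illustrative, plain-Java model (class and method names are not from the assignment), since on a real device the samples would arrive through Android's `SensorEventListener`:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of an in-memory buffer for one labelled activity recording.
// On Android, addSample() would be called from SensorEventListener's
// onSensorChanged(); here we only model the buffering arithmetic itself.
class ActivityRecording {
    static final int SAMPLE_RATE_HZ = 20;   // recommended sampling frequency
    static final int DURATION_SECONDS = 10; // recommended recording length

    private final List<float[]> samples = new ArrayList<>(); // each entry: {x, y, z}
    private final String label; // e.g. "13:45:16_climbing_the_stairs"

    ActivityRecording(String label) { this.label = label; }

    void addSample(float x, float y, float z) {
        samples.add(new float[] { x, y, z });
    }

    // A full recording holds 20 Hz * 10 s = 200 samples.
    boolean isComplete() {
        return samples.size() >= SAMPLE_RATE_HZ * DURATION_SECONDS;
    }

    String getLabel() { return label; }
    int size() { return samples.size(); }
}
```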

Hint for getting clean data: don’t record everything

We have learned in the past couple of lectures that clean data leads to better analysis results, i.e. the identification of activities will be more accurate if the noise in the data is kept to a minimum. While some noise can be filtered away at the analysis stage, it is good practice to reduce the amount of noise as early in the data processing chain as possible. To make your Android activity recorder app produce cleaner accelerometer data, you can design the app to avoid recording data you don't want to know about. Let your app start the recording _after_ you have put the phone in your pocket, not before, for instance by delaying the recording and producing a beep once the actual recording starts (so that the user knows when to get going up those stairs or whatever). If you start recording immediately when the user pushes the record button, you will get a lot of useless, noisy accelerometer data describing how you put the phone into your pocket. You can also design the app so that it cuts away data at the end, e.g. the last 5 seconds, before storing it. Those 5 seconds might be enough time for the user to pick the phone up from her/his pocket and press “stop recording”.
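The end-trimming idea amounts to dropping a fixed number of samples: at 20 Hz, cutting the last 5 seconds means discarding the last 100 samples. A minimal sketch (the helper name and signature are illustrative, not prescribed by the assignment):

```java
import java.util.List;

// Sketch of start/end trimming for a recording sampled at 20 Hz. Cutting the
// last few seconds removes the samples captured while the user pulls the
// phone out of the pocket and presses "stop recording".
class RecordingTrimmer {
    static final int SAMPLE_RATE_HZ = 20;

    // Returns a view of the samples with the first startSec and the last
    // endSec seconds removed.
    static <T> List<T> trim(List<T> samples, int startSec, int endSec) {
        int from = startSec * SAMPLE_RATE_HZ;
        int to = samples.size() - endSec * SAMPLE_RATE_HZ;
        if (from >= to) return samples.subList(0, 0); // nothing left after trimming
        return samples.subList(from, to);
    }
}
```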

Another good strategy for acquiring “clean” sensor data is to plan the recordings carefully. It is also a good idea to make sure the user has the option not to store/upload recordings that for one reason or another went wrong, or to delete recordings at a later stage.

New: Collect data in the same way as explained above, but with another person carrying the phone. This data should be used as a test set in the analysis phase.

Part 2: Develop the CSV generating servlet

Weka can only read sensor data files if they are appropriately formatted. One Weka-compatible format is CSV. Take a look at the CSV files used in lab class #12 or read the Weka documentation to see how the data should be organized in the file. Then write a Google App Engine servlet that takes the accelerometer data stored in your Google App Engine DataStore and generates a CSV file. By pointing your browser to the right URL, the CSV file should simply appear. The lab class #5 specification provides basic information on how to build this kind of servlet.
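Weka's CSV loader expects a first row naming each attribute, followed by one comma-separated row per instance. The formatting step can be sketched as below; the column names are illustrative (use whatever your DataStore entities actually contain), and in the real servlet the resulting string would be written to the `HttpServletResponse` with content type `text/csv`:

```java
// Sketch of Weka-compatible CSV output: a header row naming the attributes,
// then one row per accelerometer sample. The last column carries the
// activity label, which later becomes the class attribute in Weka.
class CsvFormatter {
    static String toCsv(long[] timestamps, float[][] xyz, String activity) {
        StringBuilder sb = new StringBuilder("timestamp,x,y,z,activity\n");
        for (int i = 0; i < timestamps.length; i++) {
            sb.append(timestamps[i]).append(',')
              .append(xyz[i][0]).append(',')
              .append(xyz[i][1]).append(',')
              .append(xyz[i][2]).append(',')
              .append(activity).append('\n');
        }
        return sb.toString();
    }
}
```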

Part 3: Analyse and classify the recorded activities in Weka

Once you have downloaded the CSV data file containing in total 30 recordings of activities (that is, 10 recordings of walking, 10 of climbing the stairs, and 10 of sitting on a chair), you are to import it into Weka and use it as the training dataset. For testing, you can record new instances of each activity in a separate file.

You can do the activity recognition task in different ways. One suggestion is to first preprocess your data (both training and test sets), which includes:

  • Removing parts of the data at the beginning and end of recordings that can be considered noise.
  • Converting the data types to a form that is compatible with the classifier you are using (hint: BayesNet, for example, will work with both nominal and numeric data, but the results might be better with numeric types).
  • Generating time-series delta features, which contain the difference between the current value and a number of previous values (hint: you can use the TimeSeriesDelta filter in Weka; look up its description to find out how to generate delta features).
The preprocessing can be done either in Weka or by coding a few lines.
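If you choose to code the delta features yourself, the computation is indeed only a few lines: each delta is the difference between a value and the value a fixed number of steps (the lag) earlier. A sketch, assuming a lag of 1 (note that, unlike this helper which simply drops them, Weka handles the first lag instances according to the filter's options):

```java
// Sketch of time-series delta features: for each value, the difference to
// the value `lag` steps earlier. The first `lag` values have no predecessor
// and are dropped here, so the output is shorter than the input by `lag`.
class DeltaFeatures {
    static double[] deltas(double[] values, int lag) {
        double[] out = new double[Math.max(0, values.length - lag)];
        for (int i = lag; i < values.length; i++) {
            out[i - lag] = values[i] - values[i - lag];
        }
        return out;
    }
}
```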
For the classification, load your training set in Weka and then choose the option “Supplied test set” in order to load your test data. Then try different algorithms to find those that optimize the accuracy of your results. Also try repeating the classification with a reduced set of attributes, e.g., by removing the timestamp or one/two axes, and compare the results.

New: Try testing on two test datasets: one recorded while the same person who collected the training data is carrying the phone, and the other (as described above) collected while another person is carrying the phone. Perform classification on both test sets and report the results and any differences in performance.

Details about using Weka

  1. Load the training file (which includes the training data for all three activities) and preprocess the data as mentioned above.
  2. Save the preprocessed training data.
  3. Load the test file(s) (which include the test data for one or more activities) and preprocess the data in exactly the same way as you did with the training data.
  4. Save the preprocessed test data.
  5. From here on you are only to work with preprocessed data. For classification, now load your training dataset (still under the “Preprocess” tab), then choose the option “Supplied test set” under the “Classify” tab.
  6. Now load the test dataset and choose the attribute that should be used for classification (this depends on the naming of the attributes you used in your training and test data files to refer to the activity). Choose the same attribute for both the training and the test dataset.
  7. Choose the classifier you want from the list of available classifiers and press “Start”. If the Start button is not activated, then 1) check that you have selected the same classification attribute for both the training and test datasets, and 2) check that the data type of your selected attribute is compatible with the type the classifier needs.

If everything worked out the way it should, you will get a nice “Classifier output” which, among other things, contains a confusion matrix and the accuracy of your classification.
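To read those two numbers together: the accuracy Weka reports is simply the diagonal of the confusion matrix (correctly classified instances) divided by the total number of instances. A small sketch of that relationship for the three-activity case (the matrix values below are made up for illustration):

```java
// Sketch: overall accuracy from a confusion matrix. Rows are actual
// activities, columns are predicted ones; the diagonal holds the correctly
// classified instances.
class ConfusionMatrix {
    static double accuracy(int[][] m) {
        int correct = 0, total = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                total += m[i][j];
                if (i == j) correct += m[i][j];
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }
}
```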

What to hand in, how to demo, and when

You can do this assignment in groups of 2-3 students. The deliverables of this assignment consist of a report and a demo, both to be delivered/performed for Mads during the week of May 14th. E-mail Mads (madsf[at] to schedule a time for the demo. Remember that May 18th at 12:00 is the absolute final deadline for having ALL three mandatory assignments approved!


The report should be ~5 pages long and contain the following:

  • Title section containing the name of the assignment, name of the group members including ITU email addresses.
  • System description. Briefly describe the classes you have used for developing the Android application and the App Engine servlet. Length: approximately 1/2 page.
  • Training data capturing. Section describing how you chose to perform the activities you recorded (e.g. for how long you walked when recording the 10 walking data samples, etc.).
  • Sensor data example. Example snippet of the CSV file generated by your Google App Engine servlet showing 1 second of recorded accelerometer data for one of the activities you recorded for your training set.
  • Preprocessing. Brief description of the approach you have used to preprocess your sensor data (e.g. normalizing, cleaning, etc.) using the Weka toolkit.
  • Cross validation. Brief description of how you did your cross validation.
  • Confusion matrix. Presentation of your model’s confusion matrix after training, that is, evidence for how well your trained model manages to discriminate the three activities from each other.
  • Reflections. Here you briefly write your own reflections on how it was to perform this assignment.


The demo of your system will be performed as follows:

  1. The TA (Mads) is given the Android device with your activity recording app on it and is introduced to how it works. He is also instructed in exactly how you performed the activities when recording the test data, e.g., time intervals, etc.
  2. The TA then chooses to perform one of the three given activities and record it using your app.
  3. The TA then uploads the recorded activity.
  4. You point a web browser to your App Engine Servlet and let the TA confirm that it is indeed the activity he performed that is represented in the CSV file by checking the timestamps of the data.
  5. You load the CSV file representing the TA’s recently recorded activity into Weka and let your previously trained model identify which of the three activities the TA performed. Hold your breath and hope that your trained model makes the correct choice. :-)


Questions regarding parts 1 and 2 of this assignment (Android and Google App Engine) can be directed to Mads (madsf[at] Questions regarding part 3 (analysis and training your model using Weka) can be directed to Afsaneh (adoryab[at] and questions regarding this assignment as a whole can be directed to Jakob and Afsaneh.

4 Responses to “Mandatory Assignment #3 - Sensing and identifying everyday body locomotion”

  1. 1 Egil Hansen April 27, 2012 at 10:59

    Mads, are you sure you want us to hand in the report and demo during the week of November 28th? :)

  2. 2 jsha April 27, 2012 at 17:46

    Does any group looking for a new member?

  3. 3 jsha April 27, 2012 at 17:47

    I mean is any group looking for a new member? :)
