The ExtraSensory Dataset

A dataset for behavioral context recognition in-the-wild from mobile sensors


This dataset was collected in 2015-2016 by Yonatan Vaizman and Katherine Ellis, under the supervision of Professor Gert Lanckriet.
Department of Electrical and Computer Engineering, University of California, San Diego.
Original publication: "Recognizing Detailed Human Context In-the-Wild from Smartphones and Smartwatches".
Go to the ExtraSensory App


Dataset description:

The ExtraSensory dataset contains data from 60 users (also referred to as subjects or participants), each identified by a universally unique identifier (UUID). Every user contributed thousands of examples, typically recorded at intervals of 1 minute (though not necessarily in one long sequence; there are time gaps). Every example contains measurements from sensors (from the user's personal smartphone and from a smartwatch that we provided). Most examples also have context labels self-reported by the user.

Users:
The users were mostly students (both undergraduate and graduate) and research assistants from the UCSD campus.
34 iPhone users, 26 Android users.
34 female, 26 male.
56 right handed, 2 left handed, 2 defined themselves as using both.
Diverse ethnic backgrounds (each user defined their "ethnicity" as they liked), including Indian, Chinese, Mexican, Caucasian, Filipino, African American and more.
Here are some more statistics over the 60 users:
|                                    | Range     | Average (standard deviation) |
| Age (years)                        | 18-42     | 24.7 (5.6)  |
| Height (cm)                        | 145-188   | 171 (9)     |
| Weight (kg)                        | 50-93     | 66 (11)     |
| Body mass index (kg/m^2)           | 18-32     | 23 (3)      |
| Labeled examples                   | 685-9,706 | 5,139 (2,332) |
| Additional unlabeled examples      | 2-6,218   | 1,150 (1,246) |
| Average applied labels per example | 1.1-9.7   | 3.8 (1.4)   |
| Days of participation              | 2.9-28.1  | 7.6 (3.2)   |

Devices:
The users in ExtraSensory had a variety of phone devices.
iPhone generations: 4, 4S, 5, 5S, 5C, 6 and 6S.
iPhone operating system versions ranging from iOS-7 to iOS-9.
Android devices: Samsung, Nexus, HTC, Moto G, LG, Motorola, OnePlus One, Sony.

Sensors:
The sensors used were diverse and include high-frequency motion-reactive sensors (accelerometer, gyroscope, magnetometer, watch accelerometer), location services, audio, watch compass, phone state indicators and additional sensors that were sampled in low frequency (once a minute).
Not all sensors were available all the time. Some phones lacked certain sensors (e.g. iPhones do not have an air pressure sensor). In other cases, sensors were temporarily unavailable (e.g. location services were sometimes turned off by the user's choice, and audio was not available while the user was on a phone call).
The following table specifies the different sensors, the format of their measurements for a single example and the total number of labeled examples (#ex) and users (#us) that have measurements from each sensor.
| sensor              | details                                                                                       | dimension   | #us | #ex     |
| accelerometer       | Tri-axial direction and magnitude of acceleration. 40Hz for ~20sec.                           | (~800) x 3  | 60  | 308,306 |
| gyroscope           | Rate of rotation around phone's 3 axes. 40Hz for ~20sec.                                      | (~800) x 3  | 57  | 291,883 |
| magnetometer        | Tri-axial direction and magnitude of magnetic field. 40Hz for ~20sec.                         | (~800) x 3  | 58  | 282,527 |
| watch accelerometer | Tri-axial acceleration from the watch. 25Hz for ~20sec.                                       | (~500) x 3  | 56  | 210,716 |
| watch compass       | Watch heading (degrees). nC samples (whenever changed by 1deg).                               | nC x 1      | 53  | 126,781 |
| location            | Latitude, longitude, altitude, speed, accuracies. nL samples (whenever changed enough).       | nL x 6      | 58  | 273,737 |
| location (quick)    | Quick location-variability features (no absolute coordinates) calculated on the phone.        | 1 x 6       | 58  | 263,899 |
| audio               | 22kHz for ~20sec. Then 13 MFCC features from half-overlapping 96msec frames.                  | (~430) x 13 | 60  | 302,177 |
| audio magnitude     | Max absolute value of recorded audio, before it was normalized.                               | 1           | 60  | 308,877 |
| phone state         | App status, battery state, WiFi availability, on the phone, time-of-day.                      | 5 discrete  | 60  | 308,320 |
| additional          | Light, air pressure, humidity, temperature, proximity. If available, sampled once per session. | 5          | --- | ---     |
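As a rough sanity check, the per-example dimensions above follow directly from the sampling rates and the ~20-second recording window. The short sketch below (illustrative only; actual window lengths vary slightly between examples) reproduces the approximate shapes:

```python
# Approximate per-example shapes implied by the sampling rates and the ~20-second window.
window_sec = 20.0

phone_motion_shape = (int(40 * window_sec), 3)   # accelerometer/gyroscope/magnetometer: ~800 x 3
watch_acc_shape = (int(25 * window_sec), 3)      # watch accelerometer: ~500 x 3

# Audio: 96 msec frames, half-overlapping (48 msec hop), 13 MFCCs per frame.
hop_sec = 0.096 / 2
audio_shape = (int(window_sec / hop_sec), 13)    # roughly 420-430 frames, depending on the exact window

print(phone_motion_shape, watch_acc_shape, audio_shape)
# -> (800, 3) (500, 3) (416, 13)
```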

Here are some examples of raw measurements recorded from various sensors during the 20-second window. These are taken from different examples in the dataset (the relevant context is presented in parentheses):
[Figures not included here: Phone-accelerometer (recorded while running with phone in pocket); Watch-accelerometer (recorded during shower); Audio (recorded while watching TV and eating at home); Location (recorded during a drive in a car).]

Additional measurements were recorded from pseudo-sensors: processed versions of the sensor signals that are provided by the OS.

Labels:
Cleaned labels:
We have processed and cleaned the labels that were self-reported by users.
Labels with prefix 'OR_', 'LOC_', or 'FIX_' are processed versions of original labels.
The primary data provided here uses these cleaned labels, including the following (sorted in descending order of number of examples):
| #  | Label | #users | #examples |
| 1  | OR_indoors | 59 | 184692 |
| 2  | LOC_home | 57 | 152892 |
| 3  | SITTING | 60 | 136356 |
| 4  | PHONE_ON_TABLE | 53 | 115037 |
| 5  | LYING_DOWN | 58 | 104210 |
| 6  | SLEEPING | 53 | 83055 |
| 7  | AT_SCHOOL | 49 | 42331 |
| 8  | COMPUTER_WORK | 45 | 38081 |
| 9  | OR_standing | 60 | 37782 |
| 10 | TALKING | 54 | 36293 |
| 11 | LOC_main_workplace | 32 | 33944 |
| 12 | WITH_FRIENDS | 32 | 24737 |
| 13 | PHONE_IN_POCKET | 40 | 23401 |
| 14 | FIX_walking | 60 | 22136 |
| 15 | SURFING_THE_INTERNET | 35 | 19416 |
| 16 | EATING | 57 | 16594 |
| 17 | PHONE_IN_HAND | 43 | 14573 |
| 18 | WATCHING_TV | 40 | 13311 |
| 19 | OR_outside | 45 | 12114 |
| 20 | PHONE_IN_BAG | 26 | 10201 |
| 21 | OR_exercise | 44 | 8081 |
| 22 | DRIVE_-_I_M_THE_DRIVER | 31 | 7975 |
| 23 | WITH_CO-WORKERS | 21 | 6224 |
| 24 | IN_CLASS | 20 | 6110 |
| 25 | IN_A_CAR | 33 | 6083 |
| 26 | IN_A_MEETING | 45 | 5153 |
| 27 | BICYCLING | 25 | 5020 |
| 28 | COOKING | 40 | 4029 |
| 29 | LAB_WORK | 9 | 3848 |
| 30 | CLEANING | 30 | 3806 |
| 31 | GROOMING | 36 | 3064 |
| 32 | TOILET | 43 | 2655 |
| 33 | DRIVE_-_I_M_A_PASSENGER | 23 | 2526 |
| 34 | DRESSING | 35 | 2233 |
| 35 | FIX_restaurant | 28 | 2098 |
| 36 | BATHING_-_SHOWER | 37 | 2087 |
| 37 | SHOPPING | 27 | 1841 |
| 38 | ON_A_BUS | 31 | 1794 |
| 39 | AT_A_PARTY | 9 | 1470 |
| 40 | DRINKING__ALCOHOL_ | 12 | 1456 |
| 41 | WASHING_DISHES | 25 | 1228 |
| 42 | AT_THE_GYM | 8 | 1151 |
| 43 | FIX_running | 26 | 1090 |
| 44 | STROLLING | 11 | 806 |
| 45 | STAIRS_-_GOING_UP | 18 | 798 |
| 46 | STAIRS_-_GOING_DOWN | 19 | 774 |
| 47 | SINGING | 8 | 651 |
| 48 | LOC_beach | 8 | 585 |
| 49 | DOING_LAUNDRY | 15 | 556 |
| 50 | AT_A_BAR | 5 | 551 |
| 51 | ELEVATOR | 12 | 200 |

To better understand how the data is structured, and how to use it, check out our introduction tutorial.


Original labels:
In the mobile app self-reporting interface, the users could report labels of two types:
  1. Main activity. Labels describing the movement or posture of the user. This category is mutually exclusive, and the 7 possible values are: lying down, sitting, standing in place, standing and moving, walking, running, bicycling.
  2. Secondary activity. An additional 109 labels describing more specific context in different aspects: sports (e.g. playing basketball, at the gym), transportation (e.g. drive - I'm the driver, on the bus), basic needs (e.g. sleeping, eating, toilet), company (e.g. with family, with co-workers), location (e.g. at home, at work, outside), etc.
    Multiple secondary labels can be applied to an example.
Some examples may have no main activity selected but still have secondary labels (e.g. when the user didn't remember whether they were sitting or walking, but did remember they were indoors).

On average (over the sixty users), an example has more than 3 labels assigned to it.
On average, a user's label-usage distribution has an entropy of 3.9 bits, which roughly means that a typical user mainly used ~15 labels (2^3.9 ≈ 15) during the participation period.
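To make the entropy figure concrete: a distribution with entropy H bits has perplexity 2^H, i.e. it behaves like roughly 2^H equally likely labels, which is where the ~15 comes from. A one-line check:

```python
# Perplexity of a label-usage distribution with entropy H bits: 2**H equally likely labels.
H = 3.9
print(round(2 ** H, 1))   # ~14.9, i.e. roughly 15 labels for a typical user
```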
The following table displays the labels (main and secondary) and specifies for each label the number of examples that have the label applied and the number of users that used the label. Labels are numbered according to descending order of number of examples.
| #   | Label | #users | #examples |
| 1   | SITTING | 60 | 136356 |
| 2   | PHONE_ON_TABLE | 53 | 116425 |
| 3   | LYING_DOWN | 58 | 104210 |
| 4   | AT_HOME | 55 | 103889 |
| 5   | SLEEPING | 53 | 83055 |
| 6   | INDOORS | 31 | 57021 |
| 7   | AT_SCHOOL | 49 | 42331 |
| 8   | COMPUTER_WORK | 45 | 38081 |
| 9   | TALKING | 54 | 36293 |
| 10  | STANDING_AND_MOVING | 58 | 29754 |
| 11  | AT_WORK | 32 | 29574 |
| 12  | STUDYING | 33 | 26277 |
| 13  | WITH_FRIENDS | 32 | 24737 |
| 14  | PHONE_IN_POCKET | 40 | 24226 |
| 15  | WALKING | 60 | 22517 |
| 16  | RELAXING | 32 | 21223 |
| 17  | SURFING_THE_INTERNET | 35 | 19416 |
| 18  | PHONE_AWAY_FROM_ME | 27 | 17937 |
| 19  | EATING | 57 | 16594 |
| 20  | PHONE_IN_HAND | 43 | 16308 |
| 21  | WATCHING_TV | 40 | 13311 |
| 22  | OUTSIDE | 40 | 11967 |
| 23  | PHONE_IN_BAG | 26 | 10760 |
| 24  | LISTENING_TO_MUSIC__WITH_EARPHONES_ | 31 | 10228 |
| 25  | WRITTEN_WORK | 15 | 9083 |
| 26  | STANDING_IN_PLACE | 59 | 8028 |
| 27  | DRIVE_-_I_M_THE_DRIVER | 31 | 7975 |
| 28  | WITH_FAMILY | 14 | 7749 |
| 29  | WITH_CO-WORKERS | 21 | 6224 |
| 30  | IN_CLASS | 20 | 6110 |
| 31  | IN_A_CAR | 33 | 6083 |
| 32  | TEXTING | 24 | 5936 |
| 33  | LISTENING_TO_MUSIC__NO_EARPHONES_ | 24 | 5589 |
| 34  | DRINKING__NON-ALCOHOL_ | 30 | 5544 |
| 35  | IN_A_MEETING | 45 | 5153 |
| 36  | WITH_A_PET | 1 | 5125 |
| 37  | BICYCLING | 25 | 5020 |
| 38  | LISTENING_TO_AUDIO__NO_EARPHONES_ | 11 | 4359 |
| 39  | READING_A_BOOK | 22 | 4223 |
| 40  | COOKING | 40 | 4029 |
| 41  | LISTENING_TO_AUDIO__WITH_EARPHONES_ | 7 | 4029 |
| 42  | LAB_WORK | 9 | 3848 |
| 43  | CLEANING | 30 | 3806 |
| 44  | GROOMING | 36 | 3064 |
| 45  | EXERCISING | 14 | 2679 |
| 46  | TOILET | 43 | 2655 |
| 47  | DRIVE_-_I_M_A_PASSENGER | 23 | 2526 |
| 48  | AT_A_RESTAURANT | 29 | 2519 |
| 49  | PLAYING_VIDEOGAMES | 9 | 2441 |
| 50  | LAUGHING | 8 | 2428 |
| 51  | DRESSING | 35 | 2233 |
| 52  | BATHING_-_SHOWER | 37 | 2087 |
| 53  | SHOPPING | 27 | 1841 |
| 54  | ON_A_BUS | 31 | 1794 |
| 55  | STRETCHING | 12 | 1667 |
| 56  | AT_A_PARTY | 9 | 1470 |
| 57  | DRINKING__ALCOHOL_ | 12 | 1456 |
| 58  | RUNNING | 28 | 1335 |
| 59  | WASHING_DISHES | 25 | 1228 |
| 60  | SMOKING | 2 | 1183 |
| 61  | AT_THE_GYM | 8 | 1151 |
| 62  | ON_A_DATE | 6 | 1086 |
| 63  | STROLLING | 11 | 806 |
| 64  | STAIRS_-_GOING_UP | 18 | 798 |
| 65  | STAIRS_-_GOING_DOWN | 19 | 774 |
| 66  | SINGING | 8 | 651 |
| 67  | ON_A_PLANE | 4 | 630 |
| 68  | DOING_LAUNDRY | 15 | 556 |
| 69  | AT_A_BAR | 5 | 551 |
| 70  | AT_A_CONCERT | 5 | 538 |
| 71  | MANUAL_LABOR | 8 | 494 |
| 72  | PLAYING_PHONE-GAMES | 4 | 403 |
| 73  | ON_A_TRAIN | 5 | 344 |
| 74  | DRAWING | 3 | 273 |
| 75  | ELLIPTICAL_MACHINE | 2 | 233 |
| 76  | AT_THE_BEACH | 6 | 230 |
| 77  | AT_THE_POOL | 5 | 216 |
| 78  | ELEVATOR | 12 | 200 |
| 79  | TREADMILL | 2 | 164 |
| 80  | PLAYING_BASEBALL | 2 | 163 |
| 81  | LIFTING_WEIGHTS | 1 | 162 |
| 82  | SKATEBOARDING | 3 | 131 |
| 83  | YOGA | 3 | 128 |
| 84  | BATHING_-_BATH | 6 | 121 |
| 85  | DANCING | 3 | 115 |
| 86  | PLAYING_MUSICAL_INSTRUMENT | 2 | 114 |
| 87  | STATIONARY_BIKE | 2 | 86 |
| 88  | MOTORBIKE | 1 | 86 |
| 89  | TRANSFER_-_BED_TO_STAND | 4 | 73 |
| 90  | VACUUMING | 1 | 68 |
| 91  | TRANSFER_-_STAND_TO_BED | 4 | 63 |
| 92  | LIMPING | 1 | 62 |
| 93  | PLAYING_FRISBEE | 2 | 54 |
| 94  | AT_A_SPORTS_EVENT | 2 | 52 |
| 95  | PHONE_-_SOMEONE_ELSE_USING_IT | 3 | 41 |
| 96  | JUMPING | 1 | 29 |
| 97  | PHONE_STRAPPED | 1 | 27 |
| 98  | GARDENING | 1 | 21 |
| 99  | RAKING_LEAVES | 1 | 21 |
| 100 | AT_SEA | 1 | 18 |
| 101 | ON_A_BOAT | 1 | 18 |
| 102 | WHEELCHAIR | 1 | 9 |
| 103 | WHISTLING | 1 | 5 |
| 104 | PLAYING_BASKETBALL | 0 | 0 |
| 105 | PLAYING_LACROSSE | 0 | 0 |
| 106 | PLAYING_SOCCER | 0 | 0 |
| 107 | MOWING_THE_LAWN | 0 | 0 |
| 108 | WASHING_CAR | 0 | 0 |
| 109 | HIKING | 0 | 0 |
| 110 | CRYING | 0 | 0 |
| 111 | USING_CRUTCHES | 0 | 0 |
| 112 | RIDING_AN_ANIMAL | 0 | 0 |
| 113 | TRANSFER_-_BED_TO_WHEELCHAIR | 0 | 0 |
| 114 | TRANSFER_-_WHEELCHAIR_TO_BED | 0 | 0 |
| 115 | WITH_KIDS | 0 | 0 |
| 116 | TAKING_CARE_OF_KIDS | 0 | 0 |
Out of the main and secondary labels, 103 were applied by users. Although some labels were applied very rarely, they may still be useful, for instance by joining labels with a logical-or operation (e.g. "running or playing Frisbee or playing baseball"). In addition, there are cases where a user wrongly applied an irrelevant label and, more commonly, cases where a relevant label was not reported by the user (e.g. at home). For these reasons we conducted the cleaning and provide the cleaned version of the labels.


How was the data collected?

Data was collected using the ExtraSensory mobile application (see the ExtraSensory App). We developed a version for iPhone and a version for Android, with a Pebble watch component that interfaces with both. The app performs a 20-second "recording session" automatically every minute. In every recording session the app collects measurements from the phone's sensors and from the watch (if it is available), including: the phone's accelerometer, gyroscope and magnetometer (sampled at 40Hz), audio (sampled at 22kHz, then processed to an MFCC feature representation), location, the watch's accelerometer (sampled at 25Hz) and compass, and additional sensors if available (light, humidity, air pressure, temperature). The measurements from a recording session are bundled into a zip file and sent to the lab's web server (when WiFi is available; otherwise they are stored on the phone until WiFi becomes available).
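For illustration, the sketch below mimics the kind of audio processing described above (22kHz audio, 96msec half-overlapping frames, 13 MFCCs per frame). It uses librosa as a stand-in; the app computed MFCCs with its own on-phone code, so treat this only as an approximation of the published processing, not the app's implementation.

```python
import numpy as np
import librosa  # stand-in library; the app used its own on-phone MFCC implementation

SR = 22050                       # ~22kHz audio, as recorded in each session
FRAME_SEC = 0.096                # 96 msec analysis frames
N_FFT = int(FRAME_SEC * SR)      # samples per frame
HOP = N_FFT // 2                 # half-overlapping frames

# Stand-in for one ~20-second recording session (random noise instead of real audio).
y = np.random.randn(SR * 20).astype(np.float32)

# 13 MFCCs per frame; transpose to (frames x 13) to match the dataset's audio dimensions.
mfcc = librosa.feature.mfcc(y=y, sr=SR, n_mfcc=13, n_fft=N_FFT, hop_length=HOP).T
print(mfcc.shape)                # roughly (417, 13) for this exact 20-second clip
```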

In addition, the app's interface is flexible and has many mechanisms to allow the user to report labels describing their activity and context:
History view. Designed as a daily journal where every item is a period of time during which the context was the same. Real-time predictions from the server are sent back to the phone and appear in the history as a basic "guess" of the main activity (with a question mark). By clicking on an item in the history, the user can provide their actual context labels by selecting from menus (including multiple relevant labels together). The user can also easily merge consecutive history items into a longer period when the context was constant, or split an item in case the context changed during its time period. The user can view the daily journal of previous days, but can only edit labels for today and yesterday.
Label selection view. The interface to select the so-called 'secondary activity' has a large menu of over 100 context labels. The user has the option to select multiple labels simultaneously. To help the user easily and quickly find the relevant labels, the menu is equipped with a side-bar index. The user can find a relevant label in the appropriate index-topic(s), e.g. 'Skateboarding' can be found under 'Sports' and under 'Transportation'. The 'frequent' index-topic is a convenient link that shows the user their own personalized list of frequently-used labels.
Active feedback view. The user can report the relevant context labels for the immediate future, specifying that the same labels will stay relevant for a selected amount of time (up to 30 minutes).
Notifications. Periodically (every 20 minutes by default, but the user can set the interval) the app raises a notification. If no labels were reported in a while, the notification asks the user to provide labels. If the user reported labels recently, the notification asks whether the context has remained the same until now. The notification also appears on the face of the smartwatch, and if the context labels remain the same, a simple click of a watch button is sufficient to apply the same labels to all the recent minutes.
The labels are also sent to the lab's server and saved with the sensor-measurements.

We conducted a meeting with every participant, in which we installed the app on their personal phone and explained how to use it. We provided the Pebble watch to the participant for the week of the study, as well as an external battery that allowed an extra charge of the phone during the day (because the app consumes much of the battery). We asked the participants to engage in their regular natural behavior while the app was recording, and to try to report as many labels as they could without it disturbing their natural behavior too much.

For full details on how we collected the dataset, please refer to our original paper, "Recognizing Detailed Human Context In-the-Wild from Smartphones and Smartwatches".


Download the dataset:

If you use the dataset for your published work, you are required to cite the original ExtraSensory publication, referenced here as Vaizman2017a.
  1. Primary data - features and labels.
    The zip file contains a separate 'csv.gz' file for each user in the dataset.
    Each user's csv file (after uncompressing the gzip format) holds all the examples for that user.
    Each example has features computed from the different sensors, together with the cleaned context labels.
    Download the features and labels zip file (215MB).
    README for the features and labels data.
    Read the tutorial to better understand how to use the data in these files; a minimal loading sketch also appears right after this list.
  2. Cross validation partition.
    Download this if you want to perform classification experiments and evaluate them.
    This contains a pre-generated partition of the 60 users into 5 folds, with prepared text files listing the users (UUIDs) of the train set and test set of each fold.
    This is the same partition that was used for the experiments in the original paper, Vaizman2017a.
    Use this also to see all the UUIDs and see which users used iPhone and which used Android.
    Download the cross-validation partition zip file.

  3. Additional parts of the data:

  4. Original context labels.
    The original labels, as were self-reported by the users.
    This version has the full label list from the mobile app interface. The labels here have only two values: either "reported" or "not reported" (they don't have the notion of "missing labels" that the cleaned labels have).
    This version of the labels is less reliable (it's before cleaning), but you can still use it. It also includes additional interesting labels that were not included in the cleaned labels, like "LISTENING_TO_MUSIC__NO_EARPHONES_" or "PLAYING_VIDEOGAMES".
    The zip file contains a separate 'csv.gz' file for each user in the dataset.
    Download the original labels zip file (970KB).
  5. Mood labels.
    Although the data collection app (ExtraSensory App) had an option to select mood labels when annotating, we did not focus data collection on mood.
    We told the users that they did not have to report mood, and that they should do so only if it was really clear to them how they felt.
    Only a few users reported mood labels, so some of the mood label data files are filled with missing labels.
    Download the mood labels zip file (795KB).
  6. Absolute location coordinates.
    If you are interested in the absolute geographic location (the actual latitude-longitude coordinates), you can use this part of the data.
    (The primary data has location features that are based only on relative location, meaning they only capture the variability in space during the recorded 20-second window.)
    The zip file contains a separate 'csv.gz' file for each user in the dataset, holding the latitude and longitude coordinates for the user's examples (indexed by timestamp).
    The original location measurements that were recorded by the mobile app during every example's 20-second window were in the form of a sequence of location-updates (a new update every time there was a significant change in location).
    The coordinates we give here are 'representative' coordinates for an example, calculated as follows:
    Download the absolute location zip file (2.2MB).
  7. Raw sensor measurements.
    If you are interested in the signal processing stages to extract features, or feature learning, you may want to work with the raw sensor measurements.
    The following files are separated by sensor, and inside each zip file the measurement files are arranged by user (UUID) and timestamp.
    Notice that not all sensors were available for recording at all times, so for some examples (timestamps) the measurement file may be missing or may be a dummy file containing just 'nan'.
    Be gentle on the server! The raw-measurement files are very large, so please only download one at a time.
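As a complement to items 1-2 above and to the tutorial, here is a minimal loading sketch in Python. The file-name pattern and column prefixes are assumptions based on the README that accompanies the download; adjust them to the actual names in your copy.

```python
import numpy as np
import pandas as pd

# Hypothetical UUID; replace with one of the 60 UUIDs listed in the cross-validation files.
uuid = "00000000-0000-0000-0000-000000000000"
path = f"{uuid}.features_labels.csv.gz"   # assumed per-user file name (check the README)

df = pd.read_csv(path, compression="gzip")

# Assumed layout: a timestamp column, sensor-feature columns, and label columns
# (cleaned labels, where NaN marks "missing label information"). Drop any other
# non-numeric bookkeeping columns before converting to arrays.
label_cols = [c for c in df.columns if c.startswith("label")]
feature_cols = [c for c in df.columns if c not in label_cols and c != "timestamp"]

X = df[feature_cols].to_numpy(dtype=float)
Y = df[label_cols].to_numpy(dtype=float)

print(X.shape, Y.shape, float(np.isnan(Y).mean()))
```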

Tutorial:

To help you get familiar and comfortable with the dataset, we provide a tutorial, with Python code, in the form of an IPython notebook:
Download the tutorial as an IPython notebook.
Alternatively, you can view the output of the notebook as a document, in the following page:
Introduction to ExtraSensory (html version).

View this lecture by Yonatan Vaizman for an introduction to Behavioral Context Recognition and specifically to the ExtraSensory App and the ExtraSensory Dataset.


Open problems:

The ExtraSensory Dataset enables research and development of algorithms, and comparison of solutions, for many problems related to behavioral context recognition. Here are some of these problems; some were addressed in our papers, while others remain open for you to solve:

Sensor fusion: The dataset has features (and raw measurements) from sensors of diverse modalities, from the phone and from the watch.
In Vaizman2017a (referenced below), we compared different approaches to fusing information from the different sensors, namely early fusion (concatenation of features) and late fusion (averaging or weighted averaging of the probability outputs of 6 single-sensor classifiers).
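As a toy illustration of the two fusion schemes (not the exact models or features from Vaizman2017a), assume one feature matrix per sensor and a binary vector for a single context label:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: one feature matrix per sensor and a binary target for one context label.
rng = np.random.default_rng(0)
n = 500
X_by_sensor = {"acc": rng.normal(size=(n, 26)),
               "gyro": rng.normal(size=(n, 26)),
               "audio": rng.normal(size=(n, 26))}
y = rng.integers(0, 2, size=n)

# Early fusion: concatenate all sensor features and train a single classifier.
X_early = np.hstack(list(X_by_sensor.values()))
early_clf = LogisticRegression(max_iter=1000).fit(X_early, y)

# Late fusion: one classifier per sensor, then average their predicted probabilities.
per_sensor_clf = {s: LogisticRegression(max_iter=1000).fit(X, y)
                  for s, X in X_by_sensor.items()}
late_prob = np.mean([per_sensor_clf[s].predict_proba(X)[:, 1]
                     for s, X in X_by_sensor.items()], axis=0)
late_pred = (late_prob > 0.5).astype(int)
```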
Multi-task modeling: The general context-recognition task in the ExtraSensory Dataset is a multi-label task, where at any minute the behavioral context can be described by a combination of relevant context labels.
In Vaizman2017b (referenced below), we compared the baseline system of a separate model per label with a multi-task MLP that outputs probabilities for 51 labels. We showed the advantage of sharing parameters in a unified model. Specifically, an MLP with narrow hidden layers can be richer than a linear model while having fewer parameters, thus reducing over-fitting.
Perhaps other methods can also successfully model many diverse context labels in a unified model.
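For instance, a minimal multi-label MLP in that spirit (a sketch, not the architecture or training setup from Vaizman2017b) can be written with scikit-learn, whose MLPClassifier accepts a binary indicator matrix and shares its hidden layers across all labels:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy data: X is a feature matrix, Y a binary indicator matrix with one column per label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 175))
Y = (rng.random(size=(1000, 51)) < 0.1).astype(int)   # 51 labels, mostly negative

# Narrow hidden layers are shared by all labels; the output layer gives one
# probability per label (multi-label, sigmoid-style outputs).
mlp = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=200)
mlp.fit(X, Y)
probs = mlp.predict_proba(X)   # shape: (n_examples, 51)
print(probs.shape)
```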
Absolute location: The ExtraSensory Dataset includes location coordinates for many examples. So far, in our papers, we only extracted relative location features, capturing how much a person moves around in space within each minute.
We did not address utilizing the absolute location data. There may be useful information in modeling the movement from minute to minute, and in incorporating GIS data and geographic landmarks.
Time series modeling: The models we suggested so far treat each example (minute) as independent of the others.
There's a lot of work to be done on modeling minute-by-minute time series, smoothing the recognition over minutes, and ways to segment time into meaningful "behavioral events".
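For example, a very simple baseline (not a method from the papers) is to smooth a label's per-minute predicted probabilities with a moving average before thresholding, ignoring for simplicity the time gaps between examples:

```python
import numpy as np

# Per-minute predicted probabilities for one label, in time order (toy values).
rng = np.random.default_rng(0)
probs = np.clip(rng.normal(0.3, 0.2, size=120), 0.0, 1.0)

window = 5                                   # smooth over a 5-minute window
kernel = np.ones(window) / window
smoothed = np.convolve(probs, kernel, mode="same")
decisions = (smoothed > 0.5).astype(int)     # smoothed minute-by-minute recognition
```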
More sensing modalities: The dataset includes occasional measurements from sensors that we did not yet utilize in our experiments, including magnetometer, ambient light, air pressure, humidity, temperature, and watch-compass.
Semi-supervised learning: The ability to improve a model with plenty of unlabeled examples would enable collecting huge amounts of data with little effort (less self-reporting).
Active learning: Active learning will make future data collections easier on participants - instead of asking for labels for many examples, the system can sparsely prompt the user for labels on the most critical examples.
User adaptation: In Vaizman2017a (referenced below), we demonstrated the potential improvement of context recognition with a few days of labeled data from a new user.
Can you achieve successful user adaptation without labels from the new user?
Feature learning: All our experiments were done with designed features, using traditional DSP methods.
Feature learning can potentially extract meaningful information from the sensor measurements that the designed features miss.
The dataset includes the full raw measurements from the sensors and enables experimenting with feature learning.
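As a toy example of what feature learning on the raw measurements could look like (again, not a method from the papers), one could learn a linear basis over raw tri-axial accelerometer windows with PCA and use the projections as learned features:

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy stand-in for raw accelerometer windows: n examples of ~800 samples x 3 axes,
# flattened to one row per example (real data would come from the raw-measurement files).
rng = np.random.default_rng(0)
raw_windows = rng.normal(size=(200, 800, 3))
X_raw = raw_windows.reshape(len(raw_windows), -1)

# Learn a compact linear basis and use the projections as learned features.
pca = PCA(n_components=20).fit(X_raw)
learned_features = pca.transform(X_raw)    # shape: (200, 20)
```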
Privacy: The ExtraSensory Dataset can be a testbed for comparing privacy-preserving methods.

Relevant papers:

Vaizman2017a
Vaizman, Y., Ellis, K., and Lanckriet, G. "Recognizing Detailed Human Context In-the-Wild from Smartphones and Smartwatches". IEEE Pervasive Computing, vol. 16, no. 4, October-December 2017, pp. 62-74. doi:10.1109/MPRV.2017.3971131
*Cite this paper if you use the ExtraSensory Dataset for any publication!
link (official)
pdf (accepted)
Supplementary
Vaizman2017b
Vaizman, Y., Weibel, N., and Lanckriet, G. "Context Recognition In-the-Wild: Unified Model for Multi-Modal Sensors and Multi-Label Classification". Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), vol. 1, no. 4. December 2017. doi:10.1145/3161192
*The supplementary material for this paper explains how we processed the labels to introduce the "missing label information".
link (official)
pdf (accepted)
Supplementary
Vaizman2018a
Vaizman, Y., Ellis, K., Lanckriet, G., and Weibel, N. "ExtraSensory App: Data Collection In-the-Wild with Rich User Interface to Self-Report Behavior". Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI 2018), ACM, April 2018. doi:10.1145/3173574.3174128
*Cite this paper if you use the ExtraSensory App for any publication!
link (official) - To be published April 2018
pdf