This dataset was collected in 2015-2016 by Yonatan Vaizman and Katherine Ellis, under the supervision of Professor Gert Lanckriet.
Department of Electrical and Computer Engineering, University of California, San Diego.
Original publication: "Recognizing Detailed Human Context In-the-Wild from Smartphones and Smartwatches".
The ExtraSensory dataset contains data from 60 users (also referred to as subjects or participants), each identified with a universally unique identifier (UUID). Each user contributed thousands of examples, typically recorded at 1-minute intervals (though not in one continuous sequence; there are time gaps). Every example contains measurements from sensors (from the user's personal smartphone and from a smartwatch that we provided). Most examples also have context labels self-reported by the user.
The users were mostly students (both undergraduate and graduate) and research assistants from the UCSD campus.
34 iPhone users, 26 Android users.
34 female, 26 male.
56 right handed, 2 left handed, 2 defined themselves as using both.
Diverse ethnic backgrounds (each user defined their "ethnicity" how they liked), including Indian, Chinese, Mexican, Caucasian, Filipino, African American and more.
Here are some more statistics over the 60 users:
| ||Range||Average (standard deviation)|
|Age (years)||18-42||24.7 (5.6)|
|Height (cm)||145-188||171 (9)|
|Weight (kg)||50-93||66 (11)|
|Body mass index (kg/m^2)||18-32||23 (3)|
|Labeled examples||685-9,706||5,139 (2,332)|
|Additional unlabeled examples||2-6,218||1,150 (1,246)|
|Average applied labels per example||1.1-9.7||3.8 (1.4)|
|Days of participation||2.9-28.1||7.6 (3.2)|
The users in ExtraSensory had a variety of phone devices.
iPhone generations: 4, 4S, 5, 5S, 5C, 6 and 6S.
iPhone operating system versions ranging from iOS-7 to iOS-9.
Android devices: Samsung, Nexus, HTC, moto G, LG, Motorola, One Plus One, Sony.
The sensors used were diverse and include high-frequency motion-reactive sensors (accelerometer, gyroscope, magnetometer, watch accelerometer), location services, audio, watch compass, phone state indicators, and additional sensors that were sampled at low frequency (once a minute).
Not all sensors were available all the time. Some phones lacked certain sensors (e.g. iPhones have no air pressure sensor). In other cases sensors were sometimes unavailable (e.g. location services were sometimes turned off by the user's choice, and audio was not available while the user was on a phone call).
The following table specifies each sensor, the format of its measurements for a single example, and the total number of users (#us) and labeled examples (#ex) that have measurements from that sensor.
|Sensor||Measurements||Dimensions||#us||#ex|
|accelerometer||Tri-axial direction and magnitude of acceleration. 40Hz for ~20sec.||(~800) x 3||60||308,306|
|gyroscope||Rate of rotation around phone's 3 axes. 40Hz for ~20sec.||(~800) x 3||57||291,883|
|magnetometer||Tri-axial direction and magnitude of magnetic field. 40Hz for ~20sec.||(~800) x 3||58||282,527|
|watch accelerometer||Tri-axial acceleration from the watch. 25Hz for ~20sec.||(~500) x 3||56||210,716|
|watch compass||Watch heading (degrees). nC samples (whenever changes in 1deg).||nC x 1||53||126,781|
|location||Latitude, longitude, altitude, speed, accuracies. nL samples (whenever changed enough).||nL x 6||58||273,737|
|location (quick)||Quick location-variability features (no absolute coordinates) calculated on the phone.||1 x 6||58||263,899|
|audio||22kHz for ~20sec. Then 13 MFCC features from half overlapping 96msec frames.||(~430) x 13||60||302,177|
|audio magnitude||Max absolute value of recorded audio, before it was normalized.||1||60||308,877|
|phone state||App status, battery state, WiFi availability, on the phone, time-of-day.||5 discrete||60||308,320|
|additional||Light, air pressure, humidity, temperature, proximity. If available, sampled once per recording session.||5||---||---|
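As a quick sanity check, the approximate per-example dimensions in the table follow directly from the sampling rates and the ~20-second recording window (a short Python illustration; the rates and window length are taken from the table above):

```python
# Sanity-check the per-example measurement dimensions listed above:
# dimensions = sampling rate x recording-window length.

def n_samples(rate_hz, window_sec=20):
    """Expected number of samples in one ~20-second recording session."""
    return int(rate_hz * window_sec)

# Phone motion sensors: 40 Hz for ~20 s -> ~800 samples per axis.
assert n_samples(40) == 800

# Watch accelerometer: 25 Hz for ~20 s -> ~500 samples per axis.
assert n_samples(25) == 500

# Audio: 96 ms frames with half overlap (48 ms hop) over ~20 s
# -> roughly 20 / 0.048 = ~416 frames, matching the "(~430) x 13" entry.
n_frames = int(20 / 0.048)
```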
Here are some examples of raw measurements recorded from various sensors during the 20-second window. These come from different examples in the dataset (the relevant context is given in parentheses):
[Figures omitted: phone accelerometer recorded while running with the phone in a pocket; watch accelerometer recorded during a shower; audio recorded while watching TV and eating at home; location recorded during a drive in a car.]
Additional measurements were recorded from pseudo-sensors (processed versions of sensor signals provided by the OS).
We have processed and cleaned the labels that were self-reported by users.
Labels with prefix 'OR_', 'LOC_', or 'FIX_' are processed versions of original labels.
The primary data provided here uses these cleaned labels, including the following (sorted in descending order by number of examples):
On average (over the 60 users), an example has more than 3 labels assigned to it.
The average entropy of a user's label-usage distribution is 3.9 bits, which roughly means that a typical user mainly used about 15 labels (2^3.9 ≈ 15) during the participation period.
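The "~15 labels" figure is simply 2 raised to the entropy (2^3.9 ≈ 15), sometimes called the perplexity, or effective number of labels. A minimal sketch, using hypothetical label-usage counts (not taken from the dataset):

```python
import math

def entropy_bits(counts):
    """Shannon entropy (in bits) of a label-usage distribution."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical label-usage counts for one user (illustrative only):
counts = [500, 400, 300, 250, 200, 150, 120, 100, 90, 80,
          70, 60, 50, 40, 30]

H = entropy_bits(counts)
effective_labels = 2 ** H  # "effective number of labels" the user relied on
```

A perfectly uniform use of 15 labels would give exactly log2(15) ≈ 3.9 bits; skewed usage over more labels can give the same entropy, which is why the 15-label reading is only a rough interpretation.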
The following table displays the labels (main and secondary) and specifies, for each label, the number of examples with that label applied and the number of users who used it. Labels are numbered in descending order of example count.
Data was collected using the ExtraSensory mobile application (see the ExtraSensory App). We developed a version for iPhone and a version for Android, with a Pebble watch component that interfaces with both the iPhone and the Android versions. The app performs a 20-second "recording session" automatically every minute. In every recording session the app collects measurements from the phone's sensors and from the watch (if it is available), including: the phone's accelerometer, gyroscope, and magnetometer (sampled at 40Hz), audio (sampled at 22kHz, then processed to an MFCC feature representation), location, the watch's accelerometer (sampled at 25Hz) and compass, and additional sensors if available (light, humidity, air pressure, temperature). The measurements from a recording session are bundled into a zip file and sent to the lab's web server (when WiFi is available; otherwise they are stored on the phone until WiFi becomes available).
In addition, the app's interface is flexible and has many mechanisms to allow the user to report labels describing their activity and context:
|History view. Designed as a daily journal where every item is a chapter of time where the context was the same. Real-time predictions from the server are sent back to the phone and appear on the history as a basic "guess" of the main activity (with question mark). By clicking on an item in the history the user can provide their actual context labels by selecting from menus (including multiple relevant labels together). The user can also easily merge consecutive history items to a longer time-period when the context was constant, or split an item in case the context changed during its period of time. The user can view the daily journal of previous days, but can only edit labels for today and yesterday.|
|Label selection view. The interface to select the so-called 'secondary activity' has a large menu of over 100 context labels. The user has the option to select multiple labels simultaneously. In order to easily and quickly find the relevant labels, the menu is equipped with a side-bar index. The user can find a relevant label in the appropriate index-topic(s), e.g. 'Skateboarding' can be found under 'Sports' and under 'Transportation'. The 'frequent' index-topic is a convenient link to show the user their own personalized list of frequently-used labels.|
|Active feedback view. The user can report the relevant context labels for the immediate future and report that the same labels will stay relevant for a selected amount of time (up to 30 minutes).|
|Notifications. Periodically (every 20 minutes by default, but the user can set the interval), the app raises a notification to the user. If no labels have been reported in a while, the notification asks the user to provide labels. If the user reported labels recently, the notification asks whether the context has remained the same until now. The notification also appears on the face of the smartwatch, and if the context labels remained the same, a single click of a watch button is sufficient to apply the same labels to all the recent minutes.|
We conducted a meeting with every participant, in which we installed the app on their personal phone and explained how to use it. We provided the Pebble watch to the participant for the week of study, as well as an external battery to allow an extra charge of the phone during the day (since the app drains the battery quickly). We asked the participant to engage in their regular natural behavior while the app was recording, and to report as many labels as they could without it disturbing their natural behavior too much.
For full details on how we collected the dataset, please refer to our original paper, "Recognizing Detailed Human Context In-the-Wild from Smartphones and Smartwatches".
Additional parts of the data:
View this lecture by Yonatan Vaizman for an introduction to Behavioral Context Recognition and specifically to the ExtraSensory App and the ExtraSensory Dataset.
The ExtraSensory Dataset enables research and development of algorithms, and comparison of solutions, for many problems related to behavioral context recognition. Here are some of these problems; some were addressed in our papers, others remain open for you to solve:
|Sensor fusion||The dataset has features (and raw measurements) from sensors of diverse modalities, from the phone and from the watch. In Vaizman2017a (referenced below), we compared approaches to fusing information from the different sensors, namely early fusion (concatenation of features) and late fusion (averaging or weighted averaging of the probability outputs of 6 single-sensor classifiers).|
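A minimal sketch of the late-fusion idea, assuming each single-sensor classifier already outputs a per-label probability vector (all numbers here are illustrative placeholders, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n_labels = 51

# Stand-in probability outputs of 6 single-sensor classifiers for one example:
sensor_probs = rng.random((6, n_labels))

# Late fusion, simple averaging of the per-sensor probabilities:
avg_fusion = sensor_probs.mean(axis=0)

# Late fusion, weighted averaging (weights would be tuned on validation
# data; these are placeholders that sum to 1):
weights = np.array([0.25, 0.2, 0.15, 0.15, 0.15, 0.1])
weighted_fusion = weights @ sensor_probs  # shape: (n_labels,)

# Early fusion would instead concatenate the sensors' *feature* vectors
# before a single classifier, e.g. np.concatenate([feat_acc, feat_audio, ...]).
```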
|Multi-label classification||The general context-recognition task in the ExtraSensory Dataset is a multi-label task: at any minute, the behavioral context can be described by a combination of relevant context-labels. In Vaizman2017b (referenced below), we compared the baseline system of a separate model per label with a multi-task MLP that outputs probabilities for 51 labels, and showed the advantage of sharing parameters in a unified model. Specifically, an MLP with narrow hidden layers can be richer than a linear model while having fewer parameters, thus reducing over-fitting. Perhaps other methods can also successfully model many diverse context-labels in a unified model.|
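To illustrate the parameter-sharing point, here is a toy forward pass of such a multi-task MLP in NumPy. The layer sizes and weights are made up for the sketch; this is not the exact architecture from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n_features, n_hidden, n_labels = 170, 16, 51  # illustrative sizes

# Randomly initialized weights, just to exercise the forward pass:
W1 = rng.standard_normal((n_features, n_hidden)) * 0.1
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_hidden, n_labels)) * 0.1
b2 = np.zeros(n_labels)

def predict(x):
    # One shared hidden layer feeding independent sigmoid outputs,
    # one per label (labels are not mutually exclusive).
    h = np.maximum(0.0, x @ W1 + b1)              # shared ReLU layer
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # per-label probabilities

x = rng.standard_normal(n_features)
probs = predict(x)
```

With these toy sizes, 51 separate linear models over 170 features would need 51 × 170 = 8,670 weights, while the shared narrow network needs only 170 × 16 + 16 × 51 = 3,536.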
|Absolute location||The ExtraSensory Dataset includes location coordinates for many examples. So far, in our papers, we only extracted relative location features, capturing how much a person moves around in space within each minute. We have not yet utilized the absolute location data. There may be useful information in modeling movement from minute to minute, and in incorporating GIS data and geographic landmarks.|
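As an illustration of relative (rather than absolute) location features, here is a hypothetical within-minute movement summary. The feature choices are ours for the sketch, not the exact quick-location features computed on the phone:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def movement_features(points):
    """points: list of (lat, lon) samples within one minute.
    Returns variability features with no absolute coordinates retained."""
    dists = [haversine_m(*points[i], *points[i + 1]) for i in range(len(points) - 1)]
    return {"total_dist_m": sum(dists), "max_step_m": max(dists, default=0.0)}

# Hypothetical samples from one minute (a short walk):
feats = movement_features([(32.8801, -117.2340),
                           (32.8803, -117.2340),
                           (32.8805, -117.2341)])
```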
|Time series modeling||The models we suggested so far treat each example (minute) as independent of the others. There is a lot of work to be done on modeling minute-by-minute time series, smoothing the recognition over minutes, and finding ways to segment time into meaningful "behavioral events".|
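One simple starting point is to smooth each label's per-minute probability sequence, e.g. with a centered moving average (the window size here is an illustrative choice):

```python
import numpy as np

def smooth(probs, window=5):
    """probs: (n_minutes, n_labels) array of per-minute label probabilities.
    Returns a centered moving-average smoothing of each label's sequence."""
    kernel = np.ones(window) / window
    return np.column_stack(
        [np.convolve(probs[:, j], kernel, mode="same")
         for j in range(probs.shape[1])]
    )

# Toy sequence for one label: the prediction flickers off for a single
# minute in the middle of an otherwise constant activity.
seq = np.array([[1.0], [1.0], [0.0], [1.0], [1.0]])
smoothed = smooth(seq, window=3)
# The one-minute dip is pulled back up by its confident neighbors.
```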
|More sensing modalities||The dataset includes occasional measurements from sensors that we did not yet utilize in our experiments, including magnetometer, ambient light, air pressure, humidity, temperature, and watch-compass.|
|Semi-supervised learning||The ability to improve a model with plenty unlabeled examples will enable collecting huge amounts of data with little effort (less self-reporting).|
|Active learning||Active learning will make future data collections easier on participants - instead of asking for labels for many examples, the system can sparsely prompt the user for labels in the most critical examples.|
|User adaptation||In Vaizman2017a (referenced below), we demonstrated the potential improvement of context recognition with a few days of labeled data from a new user. Can you achieve successful user-adaptation without labels from the new user?|
|Feature learning||All our experiments were done with designed features, using traditional DSP methods. Feature learning can potentially extract meaningful information from the sensor measurements that the designed features miss. The dataset includes the full raw measurements from the sensors and enables experimenting with feature learning.|
|Privacy||The ExtraSensory Dataset can be a testbed for comparing privacy-preserving methods.|
link (official) - To be published April 2018