# WiSee: Whole-Home Gesture Recognition Using Wireless Signals
**Authors:** Qifan Pu, Sidhant Gupta, Shyamnath Gollakota, Shwetak Patel
**Venue:** ACM MobiCom 2013 **(Best Paper Award)**
**DOI:** [10.1145/2500423.2500436](https://doi.org/10.1145/2500423.2500436)
**PDF:** https://wisee.cs.washington.edu/wisee_paper.pdf
**Institution:** University of Washington
---
## Citation
```
@inproceedings{pu2013wisee,
title = {Whole-Home Gesture Recognition Using Wireless Signals},
author = {Pu, Qifan and Gupta, Sidhant and Gollakota, Shyamnath and Patel, Shwetak},
booktitle = {Proceedings of the 19th Annual International Conference on Mobile Computing and Networking},
series = {MobiCom '13},
year = {2013},
pages = {27--38},
doi = {10.1145/2500423.2500436},
publisher = {ACM}
}
```
---
## Abstract
> "This paper presents WiSee, a novel gesture recognition system that leverages wireless signals (e.g., Wi-Fi) to enable whole-home sensing and recognition of human gestures. Since wireless signals do not require line-of-sight and can traverse through walls, WiSee can enable whole-home gesture recognition using few wireless sources. Further, it achieves this goal without requiring instrumentation of the human body with sensing devices. We implement a proof-of-concept prototype of WiSee using USRP-N210s and evaluate it in both an office environment and a two-bedroom apartment. Our results show that WiSee can identify and classify a set of nine gestures with an average accuracy of 94%."
---
## Problem Statement
Computing is increasingly ambient — users need to interact with devices throughout a home without carrying devices or maintaining line-of-sight. Camera-based (Kinect) systems need LoS. On-body sensors (wristbands) are inconvenient. WiSee exploits existing WiFi infrastructure: signals permeate homes, traverse walls, and carry Doppler information from human motion.
Key challenge: hand gestures at 0.5 m/s produce only a **17 Hz Doppler shift** on a 5 GHz WiFi signal, while the WiFi channel bandwidth is 20 MHz, so the Doppler shift is about **6 orders of magnitude smaller** than the signal bandwidth. Standard Wi-Fi processing discards this information entirely.
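The 17 Hz figure follows from the two-way Doppler relation f_d = 2·v·f_c / c for a reflector moving at speed v. A minimal sketch of that arithmetic; the function name `doppler_shift_hz` is illustrative and not from the paper:

```python
C = 299_792_458.0  # speed of light in m/s


def doppler_shift_hz(speed_mps: float, carrier_hz: float) -> float:
    """Doppler shift of a reflection off a target moving at speed_mps.

    The factor of 2 accounts for the round trip: the reflected path
    length changes at twice the target's radial speed.
    """
    return 2.0 * speed_mps * carrier_hz / C


shift = doppler_shift_hz(0.5, 5e9)   # hand gesture on a 5 GHz carrier
ratio = 20e6 / shift                 # WiFi bandwidth relative to the shift
print(f"{shift:.1f} Hz, bandwidth/shift = {ratio:.2e}")
```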
---
## Core Technique: Narrowband Pulse Creation from OFDM
WiSee transforms the wideband 802.11 OFDM signal (20 MHz) into a narrowband signal of a few Hz bandwidth, making gesture Doppler detectable.
### Case 1 — Identical OFDM Symbols
For two identical OFDM symbols of N subcarriers each, a 2N-point FFT over both symbols gives:
```
X_{2l} = 2 · Σ_{k=1}^{N} x_k · e^{i2πkl/N}
X_{2l+1} = 0
```
The odd sub-channels cancel and the even sub-channels reproduce the N-point FFT, so each sub-channel's bandwidth is **halved**. In general, an MN-point FFT over M identical symbols narrows each sub-channel by a factor of M. An FFT spanning 1 second of repeated symbols therefore yields sub-channels 1 Hz wide, enabling Hertz-resolution Doppler measurement.
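A small numerical check of the repeated-symbol property, assuming NumPy and illustrative sizes (N = 64, M = 8) that are not taken from the paper. NumPy's FFT uses a negative exponent, which may differ in sign from the expression above, but the pattern of zero and non-zero bins is the same:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 64, 8  # N samples per symbol, M identical repetitions (illustrative)

# One time-domain OFDM symbol; random complex samples stand in for real data.
symbol = rng.standard_normal(N) + 1j * rng.standard_normal(N)
repeated = np.tile(symbol, M)      # M back-to-back copies

X_single = np.fft.fft(symbol)      # N-point FFT of one symbol
X_long = np.fft.fft(repeated)      # MN-point FFT over all copies

# Every M-th bin equals M times the N-point FFT; all other bins are zero.
on_grid = X_long[::M]
off_grid = np.delete(X_long, np.arange(0, M * N, M))

print(np.allclose(on_grid, M * X_single))
print(np.allclose(off_grid, 0, atol=1e-9))
```

Only one bin in every M carries energy, which is why a longer run of identical symbols yields proportionally narrower sub-channels.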
### Case 2 — Arbitrary (Real) OFDM Symbols
WiSee decodes each symbol with the standard 802.11 decoder, then **re-encodes** every symbol to match the first by multiplying each sub-channel n in symbol i by `X¹_n / Xⁱ_n`. After IFFT, all symbols become identical → reduces to Case 1.
Critical: the re-encoder re-introduces phase/amplitude changes that the decoder removed, preserving gesture Doppler information while normalising the data content.
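The re-encoding step can be illustrated in the frequency domain with a toy model. The sketch below assumes QPSK data, perfect decoding, and a channel that applies a slowly rotating phase per symbol; none of these specifics come from the paper, and the final IFFT back to the time domain is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)
n_symbols, n_sub = 6, 48   # illustrative sizes, not taken from the paper

# Hypothetical QPSK data on every sub-channel of every symbol.
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
data = rng.choice(qpsk, size=(n_symbols, n_sub))

# Toy channel: a per-symbol phase rotation standing in for gesture Doppler.
channel = np.exp(1j * 0.05 * np.arange(n_symbols))[:, None] * np.ones(n_sub)

received = channel * data  # frequency-domain symbols seen by the receiver

# Assume the decoder recovered the data correctly, then re-encode symbol i
# to carry the first symbol's data: multiply sub-channel n by X1_n / Xi_n.
reencoded = received * (data[0] / data)

# The data content is now identical across symbols, while the
# symbol-to-symbol channel change (the Doppler information) survives.
print(np.allclose(reencoded, channel * data[0]))
```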
---
## Gesture Classification
### Doppler Extraction
1. Receiver computes half-second FFTs with 5 ms sliding intervals → frequency-time Doppler profile
2. Human gestures at 0.25 to 4 m/s → Doppler shifts of 8 to 134 Hz at 5 GHz
3. Segmentation: detect gesture start/end when energy ratio crosses 3 dB threshold
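The steps above can be sketched with a sliding FFT over a synthetic signal. The 200 Hz sample rate, the Hann window, the noise-floor calibration from the first half second, and the reading of the 3 dB test as Doppler-band energy relative to that noise floor are all assumptions made for illustration:

```python
import numpy as np

fs = 200.0                       # assumed sample rate of the narrowband signal (Hz)
win = round(0.5 * fs)            # half-second FFT window -> 2 Hz bins
hop = max(1, round(0.005 * fs))  # 5 ms slide between consecutive FFTs

# Synthetic input: strong static path (DC) plus a +20 Hz gesture from 1 s to 2 s.
rng = np.random.default_rng(0)
t = np.arange(int(3 * fs)) / fs
in_gesture = (t >= 1.0) & (t < 2.0)
x = 1.0 + 0.3 * in_gesture * np.exp(2j * np.pi * 20.0 * t)
x = x + 0.01 * (rng.standard_normal(t.size) + 1j * rng.standard_normal(t.size))

# Frequency-time Doppler profile: one windowed FFT per 5 ms step.
window = np.hanning(win)
starts = np.arange(0, x.size - win + 1, hop)
frames = np.stack([x[s:s + win] * window for s in starts])
profile = np.abs(np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)) ** 2
freqs = np.fft.fftshift(np.fft.fftfreq(win, d=1.0 / fs))

# Doppler band: everything except DC and the two bins on either side of it.
band = np.abs(freqs) > 2 * (fs / win)
band_energy = profile[:, band].sum(axis=1)

# Assumption: the first 0.5 s of frames are gesture-free and give the noise floor.
noise_floor = band_energy[: round(0.5 * fs / hop)].mean()
ratio_db = 10 * np.log10(band_energy / noise_floor)
active = ratio_db > 3.0          # gesture present while the ratio exceeds 3 dB

# Dominant Doppler frequency in a frame that lies fully inside the gesture.
peak_hz = freqs[band][np.argmax(profile[250, band])]
print(peak_hz, active[50], active[250])
```

Each row of `profile` is one time step and each column one Doppler bin; the `active` mask marks the frames that a segmenter would group into gesture segments.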
### Gesture Encoding
Each gesture = unique sequence of positive (+) and negative (−) Doppler shift segments:
- Positive-only (+1): motion toward receiver
- Negative-only (−1): motion away from receiver
- Mixed (+2): simultaneous toward and away (e.g., arm sweep)
**9 gestures:** push, dodge, strike, pull, drag, kick, circle, punch (×2), bowling. Each encodes as a distinct +/− pattern independent of user speed.
Pattern matching against templates. Speed changes duration but not the +/− sequence.
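A minimal sketch of speed-independent template matching. It assumes each analysis frame has already been labelled +1, −1, or 2, and the template names and patterns are hypothetical placeholders, not the paper's actual gesture encodings:

```python
from itertools import groupby

# Hypothetical templates for illustration only; the real per-gesture
# patterns come from the paper and are not reproduced here.
TEMPLATES = {
    "toward": (1,),
    "away": (-1,),
    "toward_then_away": (1, -1),
    "away_then_toward": (-1, 1),
}


def encode(frame_labels):
    """Collapse consecutive repeats so only the order of segment types remains."""
    return tuple(label for label, _ in groupby(frame_labels))


def classify(frame_labels):
    """Return the name of the template whose pattern matches exactly, else None."""
    pattern = encode(frame_labels)
    for name, template in TEMPLATES.items():
        if template == pattern:
            return name
    return None


slow = [1, 1, 1, 1, -1, -1, -1]   # slow gesture: many frames per segment
fast = [1, -1]                    # same gesture performed quickly
print(classify(slow), classify(fast))
```

Collapsing repeated labels discards duration, so a slow and a fast performance of the same motion map to the same pattern.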
### Frequency Offset Handling
Track the maximum-energy peak (DC component from non-human static paths) and correct Doppler peaks relative to it — residual carrier frequency offset shifts all peaks equally.
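A sketch of this correction, assuming the strongest spectral peak is always the static component. The function name `correct_frequency_offset` and the toy spectrum are illustrative:

```python
import numpy as np


def correct_frequency_offset(freqs_hz, spectrum_mag, doppler_peaks_hz):
    """Re-reference Doppler peaks to the strongest spectral peak.

    Assumes the strongest peak is the static (zero-Doppler) component, so
    its apparent frequency equals the residual carrier frequency offset.
    """
    offset_hz = freqs_hz[np.argmax(spectrum_mag)]
    return np.asarray(doppler_peaks_hz) - offset_hz, offset_hz


# Toy spectrum: the static path appears at +3 Hz instead of 0 Hz,
# and a gesture peak appears at +23 Hz.
freqs = np.arange(-50.0, 51.0, 1.0)
mag = np.full(freqs.shape, 0.01)
mag[freqs == 3.0] = 1.0
mag[freqs == 23.0] = 0.2

corrected, offset = correct_frequency_offset(freqs, mag, [23.0])
print(offset, corrected)
```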
---
## Multi-User Isolation via MIMO Nulling
- Target user performs a repetitive "preamble" gesture sequence
- Receiver estimates the MIMO channel that maximises energy from that user's reflections
- Subsequent gestures classified using that channel estimate
- With 4-antenna receiver: handles up to 3 simultaneous other users
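One standard way to find antenna weights that maximise received energy from a single source is to take the dominant eigenvector of the sample covariance matrix. The sketch below uses that technique as a stand-in; it is not the paper's algorithm, it assumes the static (DC) component has already been removed, and the channel vector, sample rate, and noise level are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
n_ant, n_samp, fs = 4, 2000, 200.0   # illustrative sizes and sample rate

# Hypothetical channel from the target user's reflections to each antenna.
h_target = rng.standard_normal(n_ant) + 1j * rng.standard_normal(n_ant)

# Samples received during the preamble: a 20 Hz Doppler tone plus noise.
t = np.arange(n_samp) / fs
tone = np.exp(2j * np.pi * 20.0 * t)
noise = 0.05 * (rng.standard_normal((n_ant, n_samp))
                + 1j * rng.standard_normal((n_ant, n_samp)))
received = np.outer(h_target, tone) + noise

# Unit-norm weights that maximise output energy: the dominant eigenvector
# of the sample covariance (eigh returns eigenvalues in ascending order).
cov = received @ received.conj().T / n_samp
_, vecs = np.linalg.eigh(cov)
weights = vecs[:, -1]

combined = weights.conj() @ received   # single stream focused on the target
alignment = abs(weights.conj() @ h_target) / np.linalg.norm(h_target)
print(f"alignment with true channel: {alignment:.3f}")
```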
**False positive rate** (24-hour continuous test):
- 2-repetition preamble: 2.63 events/hour
- 4-repetition preamble: **0.07 events/hour**
---
## Results
| Metric | Value |
|---|---|
| Average gesture accuracy | **94%** |
| Gestures | 9 whole-body gestures |
| Users evaluated | 5 |
| Test instances | 900 |
| Coverage (1 Tx in living room, 4-antenna Rx) | 94% accuracy in 60% of home |
| Coverage (2 Tx, 4-antenna Rx) | 94% accuracy in **all rooms** |
| False positive rate (4-rep preamble) | 0.07 events/hour |
---
## Limitations
- Prototype uses USRP-N210 software-defined radio, not commodity WiFi — requires custom narrowband pulse creation not available in standard chips
- Classification accuracy reduces with more simultaneous users (MIMO nulling has finite capacity)
- Only 9 pre-defined gesture classes — arbitrary gestures require extensions (HMM, DTW suggested)
- Requires minimum 3% channel occupancy (transmitter must be sending packets)
- Does not identify which room the gesture occurred in
- Requires preamble gesture to lock onto target user
---
## Relevance to Spaxel
WiSee demonstrates that Doppler-based sensing through walls is achievable with a single WiFi source. The narrowband pulse creation technique reveals the sensitivity floor: about 17 Hz at 5 GHz for a 0.5 m/s hand gesture. At 2.4 GHz (Spaxel's band), the same gesture produces ~8 Hz. These shifts set Spaxel's requirements: the sample rate must exceed twice the largest Doppler shift of interest, and the processing window must be long enough to resolve shifts of a few Hz. WiSee's +/− Doppler pattern encoding is a simple but effective approach to motion classification that Spaxel could adopt for gross motion event detection (entry/exit, active/passive).
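A rough sizing sketch for the 2.4 GHz case. The 4 m/s upper speed bound and the variable names are assumptions, and the results are lower bounds from the Nyquist and 1/T resolution relations, not measured Spaxel requirements:

```python
C = 299_792_458.0  # speed of light in m/s


def doppler_hz(speed_mps: float, carrier_hz: float) -> float:
    """Two-way Doppler shift for a reflector moving at speed_mps."""
    return 2.0 * speed_mps * carrier_hz / C


carrier = 2.4e9
slow = doppler_hz(0.5, carrier)   # hand-gesture speed discussed above
fast = doppler_hz(4.0, carrier)   # assumed upper bound on gesture speed

# Complex (I/Q) samples represent signed Doppler up to +/- fs/2, so the
# sample rate must be at least twice the largest shift of interest.
min_sample_rate_hz = 2.0 * fast

# An FFT window of T seconds resolves 1/T Hz, so separating the slow
# gesture from the static (0 Hz) path needs at least 1/slow seconds.
min_window_s = 1.0 / slow

print(f"{slow:.1f} Hz, {fast:.1f} Hz, "
      f"fs >= {min_sample_rate_hz:.0f} Hz, window >= {min_window_s:.3f} s")
```

In practice the window would be chosen several times longer than this bound, as WiSee's half-second FFTs are, to leave margin between the gesture peak and the static component.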