Lex Fridman

Podcast - Research - Lectures
Research interests: Human-AI interaction, robotics, and machine learning. Podcast interests: History, philosophy, physics, biology, chemistry, engineering, AI, robotics, programming, music, film, art, sports, psychology, neuroscience, geopolitics, business, economics, religion, and astronomy.

Beyond the above, I also enjoy:
- Playing guitar & piano (link is a video of me playing Comfortably Numb by Pink Floyd)
- Training & competing in jiu jitsu & judo (link is a video of me receiving my jiu jitsu black belt)

Contact me: Please use the Contact Page.
Connect: X, YouTube, LinkedIn, Instagram, TikTok, Facebook, Reddit, Telegram.

Research & Publications (Google Scholar)

Arguing Machines: Human Supervision of Black Box AI Systems That Make Life-Critical Decisions

Paper (Cite: BibTeX, Scholar)

Summary: Framework for providing human supervision of a black box AI system that makes life-critical decisions. We demonstrate this approach on two applications: (1) image classification and (2) real-world data of AI-assisted steering in Tesla vehicles.

MIT Advanced Vehicle Technology Study: Large-Scale Naturalistic Study of Human Interaction with Automation

Website - Paper (Cite: BibTeX, Scholar)

Summary: Large-scale real-world AI-assisted driving data collection study to understand how human-AI interaction in driving can be safe and enjoyable. The emphasis is on computer vision based analysis of driver behavior in the context of automation use.

Cognitive Load Estimation in the Wild

Paper (Cite: BibTeX, Scholar)

Summary: Winner of the CHI 2018 Honorable Mention Award. We propose two novel vision-based methods for cognitive load estimation and evaluate them on a large-scale dataset collected under real-world driving conditions.

DeepTraffic: Reinforcement Learning System for Multi-Agent Dense Traffic Navigation

Paper (Cite: BibTeX, Scholar)

Summary: Traffic simulation and optimization with deep reinforcement learning. Primary goal is to make the hands-on study of deep RL accessible to thousands of students, educators, and researchers.

Active Authentication on Mobile Devices

Paper (Cite: BibTeX, Scholar)

Summary: An approach for verifying the identity of a smartphone user with four biometric modalities. We evaluate the approach by collecting real-world behavioral biometrics data from smartphones of 200 subjects over a period of at least 30 days.

CLERA: A Unified Model for Joint Cognitive Load and Eye Region Analysis in the Wild

Paper (Cite: BibTeX, Scholar)

Summary: Unification of cognitive load estimation and eye region analysis (landmark/pupil/blink detection) in a single deep learning framework using shared feature extraction with task-specific heads. Introduces Localized Feature Tracking to model cognitive load from tracked eye region features over time and Mask-Localized Regressor for sub-pixel precise keypoint detection, achieving 66.58% accuracy on real-world driving cognitive load classification while running at 38+ FPS with joint eye analysis.

What Can Be Predicted from 6 Seconds of Driver Glances?

Paper (Cite: BibTeX, Scholar)

Summary: Winner of the CHI 2017 Best Paper Award. We consider a dataset of real-world, on-road driving to explore the predictive power of driver glances.

Learning Human Identity From Motion Patterns

Paper (Cite: BibTeX, Scholar)

Summary: Dense Clockwork RNNs learn shift-invariant representations from smartphone IMU data for passive biometric authentication, fixing temporal aliasing in CWRNNs while modeling multi-scale kinematics. Achieves 20% EER on 1500-user dataset of natural prehensile movements captured in the wild.

A Fast Foveated Fully Convolutional Network Model for Human Peripheral Vision

Paper (Cite: BibTeX, Scholar)

Summary: Generative neural network is trained to simulate human peripheral vision degradation 21,000x faster than existing behaviorally-validated texture synthesis models (4.2 hours → 0.7 seconds per image), enabling real-time visualization of what observers see when fixating different locations. The network learns to replicate crowding and acuity loss effects from the Texture Tiling Model while preserving statistical accuracy for HCI design applications.

Driver Gaze Region Estimation without Use of Eye Movement

Paper (Cite: BibTeX, Scholar)

Summary: We propose a simplification of the general gaze estimation task by framing it as a gaze region estimation task in the driving context, thereby making it amenable to machine learning approaches. We go on to describe and evaluate one such learning-based approach.

Semi-Automated Annotation of Discrete States in Large Video Datasets

Paper (Cite: BibTeX, Scholar)

Summary: Semi-automated video annotation framework reduces per-frame labeling to detecting state transitions, modeled with a Hidden Markov Model. On 16M driver-gaze frames, it cuts manual work by up to 84× while maintaining 91–99% accuracy.
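
To illustrate the general idea of treating annotation as finding state transitions with a Hidden Markov Model, here is a minimal sketch of Viterbi decoding over per-frame scores. It is not the paper's implementation: the function names, the two-state setup, and the 0.9/0.1 "sticky" transition probabilities are assumptions chosen for the example.

```python
import math

def viterbi(log_emissions, log_trans, log_prior):
    """Most likely HMM state sequence, computed in log space.

    log_emissions[t][s]: log-score of frame t under state s
    log_trans[i][j]:     log-probability of moving from state i to state j
    log_prior[s]:        log-probability of starting in state s
    """
    n_states = len(log_prior)
    score = [log_prior[s] + log_emissions[0][s] for s in range(n_states)]
    backptr = []
    for frame in log_emissions[1:]:
        prev = score
        pointers, score = [], []
        for j in range(n_states):
            # Best predecessor state for landing in state j at this frame.
            best_i = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best_i)
            score.append(prev[best_i] + log_trans[best_i][j] + frame[j])
        backptr.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def transition_frames(path):
    """Indices of frames where the decoded state changes."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]

# Hypothetical per-frame classifier probabilities for two gaze states.
frame_probs = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1],
               [0.2, 0.8], [0.1, 0.9], [0.2, 0.8]]
log_emissions = [[math.log(p) for p in row] for row in frame_probs]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
log_prior = [math.log(0.5), math.log(0.5)]

path = viterbi(log_emissions, log_trans, log_prior)
changes = transition_frames(path)
```

In this toy input, the third frame weakly favors the second state, but the sticky transitions smooth it away, so only one transition is left for a human to check.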

Automated Synchronization of Driving Data Using Vibration and Steering Events

Paper (Cite: BibTeX, Scholar)

Summary: A method for automated synchronization of vehicle sensors using accelerometer, telemetry, audio, and dense optical flow from three video sensors.

Crowdsourced Assessment of External Vehicle-to-Pedestrian Displays

Paper (Cite: BibTeX, Scholar)

Summary: 30 external vehicle-to-pedestrian display concepts for autonomous vehicles were evaluated. Simple, minimalist displays performed best.

Driver Frustration Detection from Audio and Video in the Wild

Paper (Cite: BibTeX, Scholar)

Summary: A method for detecting driver frustration from both video and audio streams captured during the driver's interaction with an in-vehicle voice-based navigation system. An interesting observation: smiles are more common in unsatisfied vs satisfied interactions.

Owl and Lizard: Patterns of Head Pose and Eye Pose in Driver Gaze Classification

Paper (Cite: BibTeX, Scholar)

Summary: Monocular driver gaze classification achieves 94.6% accuracy using head+eye pose vs 89.2% with head pose alone on 6-region classification. "Owlness" metric (dh/(dh+dp)) reveals users with eye-dominant gaze strategies benefit most from pupil detection, while head-movers show minimal improvement.
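
The "Owlness" ratio above is simple enough to write out. This is a small sketch only: the summary does not define dh and dp, so the code assumes they are non-negative magnitudes of head movement and pupil (eye) movement, respectively, and the function name is hypothetical.

```python
def owlness(dh, dp):
    """Ratio dh / (dh + dp), as written in the summary.

    Assumption: dh and dp are non-negative magnitudes of head and pupil
    movement. Under that reading, values near 1 mean head-dominant gaze
    and values near 0 mean eye-dominant gaze.
    """
    total = dh + dp
    if total == 0:
        raise ValueError("owlness is undefined when dh + dp is zero")
    return dh / total

head_mover = owlness(3.0, 1.0)
eye_mover = owlness(0.0, 2.0)
```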

Multi-modal Decision Fusion for Continuous Authentication

Paper (Cite: BibTeX)

Summary: Decision fusion of 12 behavioral biometric sensors (keystroke dynamics, mouse movement, stylometry) for continuous authentication. The system is robust to partial spoofing.

Observations on Sum User Rate for Cellular Downlink

Paper (Cite: BibTeX)

Summary: Cellular downlink performance is analyzed using an expected spatial capacity metric based on SINR-driven user association. Results show counter-intuitive effects: clustered transmitter placement and shared channel use can yield higher sum rates than evenly spaced deployments or slot scheduling.

Decision Fusion for Multimodal Active Authentication

Paper (Cite: BibTeX)

Summary: Continuous authentication via behavioral biometrics fuses 10 SVM-based sensors (mouse dynamics, keystroke timing, stylometry with varying window sizes, domain visit patterns) using Chair-Varshney optimal decision rule on 19-user office dataset. Achieves FAR=0.00122/FRR=0.00218, with stylometry contributing more error reduction than web browsing patterns in the multimodal fusion.
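
For readers unfamiliar with the Chair-Varshney rule, here is a minimal sketch of how it combines binary accept/reject decisions, weighting each sensor by its error rates. It assumes the sensors are conditionally independent and that each sensor's FAR and FRR are known. The function name and the example error rates are made up for illustration and are not taken from the paper.

```python
import math

def chair_varshney_llr(decisions, far, frr, prior_genuine=0.5):
    """Fused log-likelihood ratio for "genuine user" vs "impostor".

    decisions[i]: True if sensor i accepted the user, False if it rejected
    far[i]:       false accept rate of sensor i, P(accept | impostor)
    frr[i]:       false reject rate of sensor i, P(reject | genuine)
    A positive return value favors "genuine"; a negative one favors "impostor".
    """
    llr = math.log(prior_genuine / (1.0 - prior_genuine))
    for accepted, fa, fr in zip(decisions, far, frr):
        if accepted:
            # P(accept | genuine) / P(accept | impostor)
            llr += math.log((1.0 - fr) / fa)
        else:
            # P(reject | genuine) / P(reject | impostor)
            llr += math.log(fr / (1.0 - fa))
    return llr

# Three equally reliable (hypothetical) sensors: two accept, one rejects.
equal_llr = chair_varshney_llr([True, True, False], [0.1] * 3, [0.1] * 3)

# One accurate sensor rejects while two weak sensors accept.
mixed_llr = chair_varshney_llr([False, True, True],
                               [0.01, 0.3, 0.3], [0.01, 0.3, 0.3])
accept_equal = equal_llr > 0
accept_mixed = mixed_llr > 0
```

With equal error rates the rule reduces to a majority vote. In the second case the single accurate sensor outweighs the two weak ones, which is the point of weighting by reliability.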

Communication-Based Motion Planning

Paper (Cite: BibTeX)

Summary: Mobile agents navigating obstacle-laden terrain optimize movement timing along predetermined paths to minimize network disconnections, formulated as minimizing average strongly connected components. Cooperative uniform-cost search achieves optimality at O(βn²·2^(nT_max)) while distributed noncooperative planning scales to O(cβn³·2^(T_max)), achieving near-optimal 80-128% connectivity improvements.

Robust Optimal Power Control for Ad Hoc Networks

Paper (Cite: BibTeX)

Summary: Robust power control for ad hoc networks minimizes total transmit power while penalizing expected SINR violations under uncertain channels (fading, shadowing, noise). Achieves better feasibility-optimality tradeoff than deterministic methods using outdated channel state information.

Distributed Path Planning for Connectivity Under Uncertainty by Ant Colony Optimization

Paper (Cite: BibTeX)

Summary: Distributed path planning via ant colony optimization minimizes time-averaged connected components under incomplete knowledge of jamming zones, using pheromone tables updated by utility functions combining distance-to-goal, nearest-neighbor, and learned no-comm zone probability estimates.

Path Planning for Network Performance

Paper (Cite: BibTeX)

Summary: Decentralized A* search computes Pareto-optimal paths for MANET nodes balancing minimum-time navigation against six network performance metrics (connected components, link density, multicommodity flow), achieving up to 5x performance improvement with zero travel time penalty. Nodes incorporate real-time network feedback to dynamically adjust trajectories, enabling significant connectivity gains in sparse networks where traditional formation control fails.

Cross-Layer Multicommodity Capacity Expansion on Ad Hoc Wireless Networks of Cognitive Radios

Paper (Cite: BibTeX)

Summary: Joint optimization of power, constellation size, scheduling, and flow across PHY/MAC/NET layers in cognitive radio networks achieves higher throughput than modular layer-by-layer design. Cross-layer resource allocation yields 20-140% performance gains over conventional layered approaches.

OMAN: A Mobile Ad Hoc Network Design System

Paper (Cite: BibTeX)

Summary: OMAN integrates cross-layer resource allocation for mobile ad hoc networks into a unified optimization framework, solving power control under channel uncertainty, scheduling with directional antennas, and relay node movement planning simultaneously. The system provides both API and GUI interfaces for jointly optimizing network resources across PHY, MAC, and mobility layers rather than treating each layer's optimization problems in isolation.

On the Joint Impact of Bias and Power Control on Downlink Spectral Efficiency in Cellular Networks

Paper (Cite: BibTeX)

Summary: Cell biasing and downlink power control are jointly optimized to improve cellular network spectral efficiency. Joint control shows significant improvements in mean-variance and throughput-fairness tradeoffs over using either control alone.