Saurav Sharma

I am a PhD student in the CAMMA group, supervised by Prof. Nicolas Padoy. I primarily work on surgical scene understanding for computer-assisted surgery systems.

Previously, I worked as a Senior Data Scientist at Inference Labs. I also did a research internship at INRIA Sophia-Antipolis, where I worked on video-based action understanding. Prior to the internship, I was a Senior Data Analyst at Kantar. I received my master's from NIT Rourkela and my bachelor's from Tezpur University, with a specialization in Computer Science.

Email  |  CV  |  Google Scholar  |  ResearchGate  |  LinkedIn  |  GitHub



I'm interested in holistic scene understanding that can be useful for multiple downstream tasks. My research goal is to understand and analyze how spatio-temporal scene context can help capture action dynamics. I am also curious about the complex interplay of inputs coming from different modalities.


  • MICCAI'2023 paper on surgical triplet detection. NEW
  • IPCAI/IJCARS'2023 paper on a temporal approach for surgical triplet recognition.
  • Contributed to self-supervised learning experiments (SSSL) on surgical triplet recognition.
  • Co-organized CholecTriplet Challenge'2022 held at MICCAI'2022, Singapore.
  • Started my PhD in the CAMMA group on surgical video understanding in 2022.
  • TPAMI'2022 paper on the Toyota Smarthome Untrimmed dataset for activity detection. (Work done as part of a research internship at INRIA)
  • ECCV'2020 paper on Learning Video-Pose Embedding for Activities of Daily Living. (Work done as part of a research internship at INRIA)
  • Graduated in Computer Science with a specialization in Computer Vision from NIT, Rourkela, India in 2017.

Publications (representative papers are highlighted)
Surgical Action Triplet Detection by Mixed Supervised Learning of Instrument-Tissue Interactions
Saurav Sharma, Chinedu Innocent Nwoye, Didier Mutter, Nicolas Padoy.
26th International Conference on Medical Image Computing and Computer Assisted Intervention, 2023

webpage | paper

Surgical triplet detection with instrument aware target features learned with weak labels and pseudo triplet instance labels.

Rendezvous in time: an attention-based temporal fusion approach for surgical triplet recognition
Saurav Sharma, Chinedu Innocent Nwoye, Didier Mutter, Nicolas Padoy.
International Journal of Computer Assisted Radiology and Surgery, 2023

webpage | paper

Surgical triplet recognition that models the evolution of surgical actions across video frames using a temporal attention method.

Dissecting Self-Supervised Learning Methods for Surgical Computer Vision
Sanat Ramesh, Vinkle Srivastav, Deepak Alapatt, Tong Yu, Aditya Murali, Luca Sestini, Chinedu Innocent Nwoye, Idris Hamoud, Saurav Sharma, Antoine Fleurentin, Georgios Exarchakis, Alexandros Karargyris, Nicolas Padoy.
webpage | paper

Investigation of state-of-the-art self-supervised methods in the context of surgical computer vision.

Toyota Smarthome Untrimmed: Real-World Untrimmed Videos for Activity Detection
Rui Dai, Srijan Das, Saurav Sharma, Luca Minciullo, Lorenzo Garattoni, Francois Bremond, Gianpiero Francesca.
(arXiv preprint), October 2020

project page

An untrimmed daily-living action detection dataset with dense annotations that features several real-world challenges and offers three modalities: RGB, depth, and 3D skeleton.

VPN: Learning Video-Pose Embedding for Activities of Daily Living
Srijan Das, Saurav Sharma, Rui Dai, Francois Bremond, Monique Thonnat.
16th European Conference on Computer Vision (ECCV'20 ONLINE), 2020

project page

Temporal activity recognition from RGB videos, guided by a human-pose-based attention network.

DenseNet with pre-activated deconvolution for estimating depth map from single image
Saurav Sharma, Ram Prasad Padhy, Suman Choudhury, Nabarun Goswami, Pankaj Sa.
5th Activity monitoring by multiple distributed sensing (AMMDS) Workshop under BMVC, London, 2017

Master's Thesis | paper

Monocular depth estimation with a state-of-the-art DenseNet CNN and custom deconvolution modules.

A Framework for Pixel Intensity Modulation Based Image Steganography
Srijan Das, Saurav Sharma, Sambit Bakshi, Imon Mukherjee.
1st International Conference on Advanced Computing and Intelligent Engineering (ICACIE), India, 2016

A novel steganographic algorithm in the spatial domain based on pixel intensity modulation.

Conferences, Workshops & Summer Schools


  • Jan 2017 - Apr 2017    Teaching Assistant for CS 172 Computing Laboratory-II at NIT Rourkela.
  • Aug 2016 - Nov 2016   Teaching Assistant for CS 171 Computing Laboratory-I at NIT Rourkela.
