
My Infographic Resume

January 20th, 2015 Irfan Essa Posted in Interesting | No Comments »

Check out my infographic resume created via Vizualize.me.



Four Papers at IEEE Winter Conference on Applications of Computer Vision (WACV 2015)

January 5th, 2015 Irfan Essa Posted in Computational Photography and Video, Computer Vision, PAMI/ICCV/CVPR/ECCV, Papers, S. Hussain Raza, Steven Hickson, Vinay Bettadapura | No Comments »

Four papers accepted at the IEEE Winter Conference on Applications of Computer Vision (WACV) 2015. See you at Waikoloa Beach, Hawaii!

  • V. Bettadapura, E. Thomaz, A. Parnami, G. Abowd, and I. Essa (2015), “Leveraging Context to Support Automated Food Recognition in Restaurants,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [WEBSITE] [BIBTEX]
    @inproceedings{2015-Bettadapura-LCSAFRR,
      Author = {Vinay Bettadapura and Edison Thomaz and Aman Parnami and Gregory Abowd and Irfan Essa},
      Booktitle = {Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV)},
      Date-Added = {2015-01-06 00:49:58 +0000},
      Date-Modified = {2015-01-06 00:49:58 +0000},
      Month = {January},
      Pdf = {http://www.cc.gatech.edu/~irfan/p/2015-Bettadapura-LCSAFRR.pdf},
      Publisher = {IEEE Computer Society},
      Title = {Leveraging Context to Support Automated Food Recognition in Restaurants},
      Url = {http://www.vbettadapura.com/egocentric/food/},
      Year = {2015},
      Bdsk-Url-1 = {http://www.vbettadapura.com/egocentric/food/}}
  • S. Hickson, I. Essa, and H. Christensen (2015), “Semantic Instance Labeling Leveraging Hierarchical Segmentation,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [BIBTEX]
    @inproceedings{2015-Hickson-SILLHS,
      Author = {Steven Hickson and Irfan Essa and Henrik Christensen},
      Booktitle = {Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV)},
      Date-Added = {2015-01-06 00:45:58 +0000},
      Date-Modified = {2015-01-06 00:51:08 +0000},
      Month = {January},
      Publisher = {IEEE Computer Society},
      Title = {Semantic Instance Labeling Leveraging Hierarchical Segmentation},
      Year = {2015}}
  • S. H. Raza, A. Humayun, M. Grundmann, D. Anderson, and I. Essa (2015), “Finding Temporally Consistent Occlusion Boundaries using Scene Layout,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [BIBTEX]
    @inproceedings{2015-Raza-FTCOBUSL,
      Author = {Syed Hussain Raza and Ahmad Humayun and Matthias Grundmann and David Anderson and Irfan Essa},
      Booktitle = {Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV)},
      Date-Added = {2015-01-06 00:43:02 +0000},
      Date-Modified = {2015-01-06 00:47:13 +0000},
      Month = {January},
      Publisher = {IEEE Computer Society},
      Title = {Finding Temporally Consistent Occlusion Boundaries using Scene Layout},
      Year = {2015}}
  • V. Bettadapura, I. Essa, and C. Pantofaru (2015), “Egocentric Field-of-View Localization Using First-Person Point-of-View Devices,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [WEBSITE] [BIBTEX]
    @inproceedings{2015-Bettadapura-EFLUFPD,
      Author = {Vinay Bettadapura and Irfan Essa and Caroline Pantofaru},
      Booktitle = {Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV)},
      Date-Added = {2015-01-06 00:38:59 +0000},
      Date-Modified = {2015-01-06 00:42:16 +0000},
      Month = {January},
      Pdf = {http://www.cc.gatech.edu/~irfan/p/2015-Bettadapura-EFLUFPD.pdf},
      Publisher = {IEEE Computer Society},
      Title = {Egocentric Field-of-View Localization Using First-Person Point-of-View Devices},
      Url = {http://www.vbettadapura.com/egocentric/localization/},
      Year = {2015}}

The last paper was also the winner of the Best Paper Award (see http://wacv2015.org/). More details coming soon.


Computational Photography (CS 6475) for Georgia Tech’s Online MSCS Program (via Udacity)

January 5th, 2015 Irfan Essa Posted in Computational Photography, Computational Photography and Video | No Comments »

Today, the inaugural offering of Computational Photography (CS 6475) launched for Georgia Tech’s Online MSCS Program, using the Udacity platform.

Course Description

CS 6475* (3-0-3): Computational Photography – (Instructor: Irfan Essa) – This class explores how computation impacts the entire workflow of photography, which is traditionally aimed at capturing light from a (3D) scene to form a (2D) image. A detailed study of the perceptual, technical, and computational aspects of forming pictures, and more precisely the capture and depiction of reality on a (mostly 2D) medium of images, is undertaken over the entire term. The scientific, perceptual, and artistic principles behind image-making are emphasized, especially as impacted and changed by computation. Topics include the relationship between pictorial techniques and the human visual system; intrinsic limitations of 2D representations and their possible compensations; and technical issues involved in capturing light to form images. Technical aspects of image capture and rendering, and exploration of how such a medium can be used to its maximum potential, are examined. New forms of cameras and imaging paradigms are introduced. Students undertake a hands-on approach over the entire term, using computational techniques merged with digital imaging processes to produce photographic artifacts.

Do note that there are programming assignments in this class, and working knowledge of linear algebra, calculus, probability, and programming in C++/Python/MATLAB/Java is required. OpenCV or MATLAB is used in this class as appropriate. More information on this class is at the Computational Photography Class Website.

Video Preview


William Mong Distinguished Lecture at the University of Hong Kong on “Video Cameras are Everywhere: Data-Driven Methods for Video Analysis and Enhancement”

December 11th, 2014 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Presentations | No Comments »

Video Cameras are Everywhere: Data-Driven Methods for Video Analysis and Enhancement

Irfan Essa (prof.irfanessa.com)
Georgia Institute of Technology
School of Interactive Computing
GVU and RIM @ GT Centers 

Abstract 

In this talk, I will begin by describing the pervasiveness of image and video content, and how such content is growing with the ubiquity of cameras. I will use this to motivate the need for better tools for the analysis and enhancement of video content. I will start with some of our earlier work on temporal modeling of video, then lead up to some of our current work and describe two main projects: (1) our approach for a video stabilizer, currently implemented and running on YouTube, and its extensions, and (2) a robust and scalable method for video segmentation.

I will describe, in some detail, our video stabilization method, which generates stabilized videos and is in wide use. Our method allows for video stabilization beyond the conventional filtering that only suppresses high-frequency jitter. It also supports removal of the rolling shutter distortions common in modern CMOS cameras, which capture each frame one scan-line at a time, resulting in non-rigid image distortions such as shear and wobble. Our method does not rely on a priori knowledge and works on video from any camera or on legacy footage. I will showcase examples of this approach and also discuss how this method is launched and running on YouTube, with millions of users.
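The intuition behind path-based stabilization can be sketched in a few lines: estimate per-frame camera motion, integrate it into a camera path, smooth that path, and warp each frame by the difference. The sketch below uses a hypothetical moving-average filter; the actual YouTube stabilizer computes L1-optimal camera paths and handles rolling shutter, which a simple low-pass filter does not.

```python
import numpy as np

def smooth_camera_path(transforms, window=15):
    """Smooth per-frame camera motions (e.g. rows of (dx, dy, dtheta)).

    Returns corrected per-frame motions that, when accumulated, follow a
    low-pass-filtered version of the original camera path. A toy sketch,
    not the L1-optimal method used in the deployed stabilizer.
    """
    path = np.cumsum(transforms, axis=0)        # integrate motions into a path
    kernel = np.ones(window) / window           # moving-average kernel
    pad = window // 2
    padded = np.pad(path, ((pad, pad), (0, 0)), mode="edge")
    smoothed = np.stack(
        [np.convolve(padded[:, i], kernel, mode="valid")
         for i in range(path.shape[1])],
        axis=1,
    )
    # Correction per frame = where the smooth path is, minus where we are.
    return transforms + (smoothed - path)
```

Each corrected motion would then drive a per-frame warp (crop and affine/similarity transform) to render the stabilized output.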

Then I will describe an efficient and scalable technique for spatio-temporal segmentation of long video sequences using a hierarchical graph-based algorithm. This hierarchical approach generates high-quality segmentations, and we demonstrate the use of this segmentation as users interact with the video, enabling efficient annotation of objects within the video. I will also show some recent work on how this segmentation and annotation can be used to do dynamic scene understanding.

Bio: http://prof.irfanessa.com/bio 


Computation + Journalism Symposium 2014

October 25th, 2014 Irfan Essa Posted in Computational Journalism, Events, Nick Diakopoulos | No Comments »

Hosted the 3rd Computation + Journalism Symposium 2014 at The Brown Institute for Media Innovation in Pulitzer Hall, Columbia University, New York, NY, USA, on October 24-25. It was a huge success, with about 250 attendees and a mixture of invited panels and contributed papers. More details below:

Jon Kleinberg kicked off the meeting with a very exciting keynote. Videos of all sessions should be available from the above website. The next C+J event will be in a year; stay tuned for more details. I was the co-organizer of this event with Nick Diakopoulos and Mark Hansen.


Paper in BMVC (2014): “Depth Extraction from Videos Using Geometric Context and Occlusion Boundaries”

September 5th, 2014 Irfan Essa Posted in Computational Photography and Video, PAMI/ICCV/CVPR/ECCV, S. Hussain Raza | No Comments »

  • S. H. Raza, O. Javed, A. Das, H. Sawhney, H. Cheng, and I. Essa (2014), “Depth Extraction from Videos Using Geometric Context and Occlusion Boundaries,” in Proceedings of British Machine Vision Conference (BMVC), Nottingham, UK, 2014. [PDF] [WEBSITE] [BIBTEX]
    @inproceedings{2014-Raza-DEFVUGCOBDEFVUGCOB,
      Address = {Nottingham, UK},
      Author = {Syed Hussain Raza and Omar Javed and Aveek Das and Harpreet Sawhney and Hui Cheng and Irfan Essa},
      Booktitle = {{Proceedings of British Machine Vision Conference (BMVC)}},
      Date-Added = {2014-08-30 12:56:03 +0000},
      Date-Modified = {2014-11-10 16:10:07 +0000},
      Month = {September},
      Pdf = {http://www.cc.gatech.edu/~irfan/p/2014-Raza-DEFVUGCOBDEFVUGCOB.pdf},
      Title = {Depth Extraction from Videos Using Geometric Context and Occlusion Boundaries},
      Url = {http://www.cc.gatech.edu/cpl/projects/videodepth/},
      Year = {2014},
      Bdsk-Url-1 = {http://www.cc.gatech.edu/cpl/projects/videodepth/}}

We present an algorithm to estimate depth in dynamic video scenes.

We propose to learn and infer depth in videos from appearance, motion, occlusion boundaries, and geometric context of the scene. Using our method, depth can be estimated from unconstrained videos with no requirement of camera pose estimation, and with significant background/foreground motions. We start by decomposing a video into spatio-temporal regions. For each spatio-temporal region, we learn the relationship of depth to visual appearance, motion, and geometric classes. Then we infer the depth information of new scenes using piecewise planar parametrization estimated within a Markov random field (MRF) framework by combining appearance to depth learned mappings and occlusion boundary guided smoothness constraints. Subsequently, we perform temporal smoothing to obtain temporally consistent depth maps.
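As a rough illustration of the final temporal-smoothing step described above, the sketch below applies a simple exponential filter across per-frame depth maps. This is an illustrative stand-in with hypothetical function and parameter names: the paper's smoothing operates on spatio-temporal regions within the MRF framework, not on raw per-pixel maps.

```python
import numpy as np

def temporally_smooth_depth(depth_frames, alpha=0.7):
    """Exponentially smooth a sequence of per-frame depth maps along time.

    alpha controls how strongly each frame inherits the previous estimate;
    higher alpha yields more temporally consistent (but laggier) depth.
    """
    smoothed = [depth_frames[0].astype(float)]
    for depth in depth_frames[1:]:
        smoothed.append(alpha * smoothed[-1] + (1 - alpha) * depth)
    return smoothed
```

Even this simple filter illustrates the goal of the step: suppressing frame-to-frame flicker in the estimated depth so that labels do not jump between adjacent frames.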

To evaluate our depth estimation algorithm, we provide a novel dataset with ground truth depth for outdoor video scenes. We present a thorough evaluation of our algorithm on our new dataset and the publicly available Make3d static image dataset.


Paper in CVPR 2014 “Efficient Hierarchical Graph-Based Segmentation of RGBD Videos”

June 22nd, 2014 Irfan Essa Posted in Computer Vision, Henrik Christensen, Papers, Steven Hickson | No Comments »

  • S. Hickson, S. Birchfield, I. Essa, and H. Christensen (2014), “Efficient Hierarchical Graph-Based Segmentation of RGBD Videos,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014. [PDF] [WEBSITE] [BIBTEX]
    @inproceedings{2014-Hickson-EHGSRV,
      Author = {Steven Hickson and Stan Birchfield and Irfan Essa and Henrik Christensen},
      Booktitle = {{Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}},
      Date-Added = {2014-06-22 14:44:17 +0000},
      Date-Modified = {2014-06-22 14:53:26 +0000},
      Month = {June},
      Organization = {IEEE Computer Society},
      Pdf = {http://www.cc.gatech.edu/~irfan/p/2014-Hickson-EHGSRV.pdf},
      Title = {Efficient Hierarchical Graph-Based Segmentation of RGBD Videos},
      Url = {http://www.cc.gatech.edu/cpl/projects/4dseg},
      Year = {2014},
      Bdsk-Url-1 = {http://www.cc.gatech.edu/cpl/projects/4dseg}}

Abstract

We present an efficient and scalable algorithm for segmenting 3D RGBD point clouds by combining depth, color, and temporal information using a multistage, hierarchical graph-based approach. Our algorithm processes a moving window over several point clouds to group similar regions over a graph, resulting in an initial over-segmentation. These regions are then merged to yield a dendrogram using agglomerative clustering via a minimum spanning tree algorithm. Bipartite graph matching at a given level of the hierarchical tree yields the final segmentation of the point clouds by maintaining region identities over arbitrarily long periods of time. We show that a multistage segmentation with depth then color yields better results than a linear combination of depth and color. Due to its incremental processing, our algorithm can process videos of any length and in a streaming pipeline. The algorithm’s ability to produce robust, efficient segmentation is demonstrated with numerous experimental results on challenging sequences from our own as well as public RGBD data sets.
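The agglomerative-merge step via a minimum spanning tree can be sketched with a Kruskal-style union-find, as below. This is a bare-bones sketch under assumed inputs (region adjacency edges weighted by, say, color or depth difference; all names are illustrative); the paper's algorithm additionally builds the full dendrogram and matches regions across windows, which this sketch omits.

```python
def mst_agglomerative(num_regions, edges, num_clusters):
    """Merge regions along the cheapest edges first (Kruskal-style),
    stopping once `num_clusters` groups remain.

    edges: list of (weight, region_a, region_b) tuples, where weight is a
    dissimilarity between adjacent regions. Returns a cluster label per region.
    """
    parent = list(range(num_regions))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    clusters = num_regions
    for weight, a, b in sorted(edges):  # cheapest (most similar) edges first
        root_a, root_b = find(a), find(b)
        if root_a != root_b and clusters > num_clusters:
            parent[root_b] = root_a
            clusters -= 1
    return [find(i) for i in range(num_regions)]
```

Stopping the merge at different cluster counts corresponds to cutting the segmentation hierarchy at different levels of the dendrogram.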


PhD Thesis (2014) by Yachna Sharma “Surgical Skill Assessment Using Motion Texture Analysis”

May 2nd, 2014 Irfan Essa Posted in Medical, PhD, Yachna Sharma | No Comments »

Thesis title: Surgical Skill Assessment Using Motion Texture Analysis

Yachna Sharma, Ph. D. Candidate, ECE
http://users.ece.gatech.edu/~ysharma3/

Committee:

Prof. Irfan Essa (advisor), College of Computing
Prof. Mark A. Clements (co-advisor), School of Electrical and Computer Engineering
Prof. David Anderson, School of Electrical and Computer Engineering
Prof. Anthony Yezzi, School of Electrical and Computer Engineering
Prof. Christopher F. Barnes, School of Electrical and Computer Engineering
Dr. Thomas Ploetz, Culture lab, School of Computing Science, Newcastle University, United Kingdom
Dr. Eric L. Sarin, Division of Cardiothoracic Surgery, Department of Surgery, Emory University School of Medicine

Abstract:

The objective of this Ph.D. research is to design and develop a framework for automated assessment of surgical skills. Automated assessment can help expedite the manual assessment process and provide unbiased evaluations with possible dexterity feedback.

Evaluation of surgical skills is an important aspect of training medical students. Current practices rely on manual evaluations from faculty and residents and are time consuming. Proposed solutions in the literature involve retrospective evaluations, such as watching recorded videos offline. This requires the precious time and attention of expert surgeons, and assessments may vary from one surgeon to another. With recent advancements in computer vision and machine learning techniques, retrospective video evaluation can best be delegated to computer algorithms.

Skill assessment is a challenging task requiring expert domain knowledge that may be difficult to translate into algorithms. To emulate this human observation process, an appropriate data collection mechanism is required to track motion of the surgeon’s hand in an unrestricted manner. In addition, it is essential to identify skill defining motion dynamics and skill relevant hand locations.

This Ph.D. research aims to address the limitations of manual skill assessment by developing an automated motion analysis framework. Specifically, we propose (1) to design and implement quantitative features to capture fine motion details from surgical video data, (2) to identify and test the efficacy of a core subset of features in classifying the surgical students into different expertise levels, (3) to derive absolute skill scores using regression methods and (4) to perform dexterity analysis using motion data from different hand locations.


PhD Thesis (2014) by S. Hussain Raza “Temporally Consistent Semantic Segmentation in Videos”

May 2nd, 2014 Irfan Essa Posted in Computational Photography and Video, PhD, S. Hussain Raza | No Comments »

Title : Temporally Consistent Semantic Segmentation in Videos

S. Hussain Raza, Ph. D. Candidate in ECE (https://sites.google.com/site/shussainraza5/)

Committee:

Prof. Irfan Essa (advisor), School of Interactive Computing
Prof. David Anderson (co-advisor), School of Electrical and Computer Engineering
Prof. Frank Dellaert, School of Interactive Computing
Prof. Anthony Yezzi, School of Electrical and Computer Engineering
Prof. Chris Barnes, School of Electrical and Computer Engineering
Prof. Rahul Sukthankar, Department of Computer Science and Robotics, Carnegie Mellon University.

Abstract :

The objective of this Thesis research is to develop algorithms for temporally consistent semantic segmentation in videos. Though many different forms of semantic segmentations exist, this research is focused on the problem of temporally-consistent holistic scene understanding in outdoor videos. Holistic scene understanding requires an understanding of many individual aspects of the scene including 3D layout, objects present, occlusion boundaries, and depth. Such a description of a dynamic scene would be useful for many robotic applications including object reasoning, 3D perception, video analysis, video coding, segmentation, navigation and activity recognition.

Scene understanding has been studied with great success for still images. However, scene understanding in videos requires additional approaches to account for the temporal variation, dynamic information, and exploiting causality. As a first step, image-based scene understanding methods can be directly applied to individual video frames to generate a description of the scene. However, these methods do not exploit temporal information across neighboring frames. Further, lacking temporal consistency, image-based methods can result in temporally-inconsistent labels across frames. This inconsistency can impact performance, as scene labels suddenly change between frames.

The objective of this study is to develop temporally consistent scene-descriptive algorithms by processing videos efficiently, exploiting causality and data redundancy, and catering for scene dynamics. Specifically, we achieve our research objectives by (1) extracting geometric context from videos to give the broad 3D structure of the scene with all objects present, (2) detecting occlusion boundaries in videos due to depth discontinuity, and (3) estimating depth in videos by combining monocular and motion features with semantic features and occlusion boundaries.


PhD Thesis by Zahoor Zafrulla “Automatic recognition of American Sign Language Classifiers”

May 2nd, 2014 Irfan Essa Posted in Affective Computing, Behavioral Imaging, Face and Gesture, PhD, Thad Starner, Zahoor Zafrulla | No Comments »

Title: Automatic recognition of American Sign Language Classifiers

Zahoor Zafrulla
School of Interactive Computing
College of Computing
Georgia Institute of Technology
http://www.cc.gatech.edu/grads/z/zahoor/

Committee:

Dr. Thad Starner (Advisor, School of Interactive Computing, Georgia Tech)
Dr. Irfan Essa (Co-Advisor, School of Interactive Computing, Georgia Tech)
Dr. Jim Rehg (School of Interactive Computing, Georgia Tech)
Dr. Harley Hamilton (School of Interactive Computing, Georgia Tech)
Dr. Vassilis Athitsos (Computer Science and Engineering Department, University of Texas at Arlington)

Summary:

Automatically recognizing classifier-based grammatical structures of American Sign Language (ASL) is a challenging problem. Classifiers in ASL utilize surrogate hand shapes for people or “classes” of objects and provide information about their location, movement, and appearance. In the past, researchers have focused on recognition of fingerspelling, isolated signs, facial expressions, and interrogative words like WH-questions (e.g., Who, What, Where, and When). Challenging problems such as recognition of ASL sentences and classifier-based grammatical structures remain relatively unexplored in the field of ASL recognition.

One application of recognition of classifiers is toward creating educational games to help young deaf children acquire language skills. Previous work developed CopyCat, an educational ASL game that requires children to engage in a progressively more difficult expressive signing task as they advance through the game.

We have shown that by leveraging context we can use verification, in place of recognition, to boost machine performance for determining if the signed responses in an expressive signing task, like in the CopyCat game, are correct or incorrect. We have demonstrated that the quality of a machine verifier’s ability to identify the boundary of the signs can be improved by using a novel two-pass technique that combines signed input in both forward and reverse directions. Additionally, we have shown that we can reduce CopyCat’s dependency on custom manufactured hardware by using an off-the-shelf Microsoft Kinect depth camera to achieve similar verification performance. Finally, we show how we can extend our ability to recognize sign language by leveraging depth maps to develop a method using improved hand detection and hand shape classification to recognize selected classifier-based grammatical structures of ASL.
