Aman Raj

I am a Senior Machine Learning Research Engineer at Apple Inc., where I develop core vision technologies that power new user experiences across the ecosystem of Apple devices and services.

I went to graduate school in the beautiful city of San Diego, where I completed an MS degree in Machine Learning and Data Science at the University of California, San Diego. In Spring 2020, I defended my MS thesis, titled Learning Augmentation Policy Schedules for Unsupervised Depth Estimation, which addresses depth estimation in bad weather conditions for autonomous driving.

After completing my B.Tech at Delhi Technological University, I worked as a Software Engineer at Facebook and as a Research Intern at Samsung Labs. My research has received notable awards such as The Jack Dangermond Award – Best Paper.

Email / CV / Google Scholar / GitHub / LinkedIn

Research

My research interests are in computer vision, machine learning and optimization, and image processing. The majority of my research is about image and video understanding.

	Learning Augmentation Policy Schedules for Unsupervised Depth Estimation. Aman Raj UC San Diego Electronic Theses and Dissertations, 2020 thesis / code My MS thesis, which proposes a novel approach to augmenting data for unsupervised depth estimation. Our method learns data augmentation strategies from the data itself.
	SUW-Learn: Joint Supervised, Unsupervised, Weakly Supervised Deep Learning for Monocular Depth Estimation. Haoyu Ren, Aman Raj, Mostafa El-Khamy and Jungwon Lee CVPR, 2020 video / supplement A framework for deep learning with joint supervised learning (S), unsupervised learning (U), and weakly supervised learning (W). We deploy SUW-Learn for deep learning of monocular depth from images and video sequences.
	SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception. Yue Meng, Yongxi Lu, Aman Raj, Samuel Sunarjo, Rui Guo, Tara Javidi, Gaurav Bansal, Dinesh Bharadia CVPR, 2019 project / supplement SIGNet integrates semantic information to make depth and flow predictions consistent with objects and robust to low lighting conditions. SIGNet is shown to improve upon the state-of-the-art unsupervised learning for depth prediction.
	A Holistic Framework for Addressing the World using Machine Learning. Ilke Demir, Forest Hughes, Aman Raj, Kaunil Dhruv, Suryanarayana Murthy Muddla, Sanyam Garg, Barrett Doo, Ramesh Raskar CVPR, 2018 project We propose an automatic generative algorithm to create street addresses from satellite imagery. Our addressing scheme is coherent with the street topology, linear and hierarchical to follow human perception, and universal to be used as a unified geocoding system.
	Generative Street Addresses from Satellite Imagery. Ilke Demir, Forest Hughes, Aman Raj, Kaunil Dhruv, Suryanarayana Murthy Muddla, Sanyam Garg, Barrett Doo, Ramesh Raskar ISPRS, 2018 (The Jack Dangermond Award – Best Paper) project / code / talk Our algorithm starts with extracting roads from satellite imagery by utilizing deep learning. Then, it uniquely labels the regions, roads, and structures using graph and proximity-based algorithms.
	Robocodes: Towards Generative Street Addresses from Satellite Imagery. Ilke Demir, Forest Hughes, Aman Raj, Kleovoulos Tsourides, Divyaa Ravichandran, Suryanarayana Murthy, Kaunil Dhruv, Sanyam Garg, Jatin Malhotra, Barrett Doo, Grace Kermani, Ramesh Raskar CVPR, 2017 (Best Paper Award) project / code / blog / news We describe our automatic generative algorithm to create street addresses (Robocodes) from satellite images by learning and labeling regions, roads, and blocks.
	FPGA Accelerated Abandoned Object Detection. Rajesh Rohilla, Aman Raj, Saransh Kejriwal, Rajiv Kapoor ICCTICT, IEEE 2016 slides We propose a hardware implementation of an abandoned object detection algorithm on an FPGA, aimed at building a custom chip that can run real-time inference on a live video feed.
	Multi-Scale Convolutional Architecture for Semantic Segmentation. Aman Raj, Daniel Maturana, Sebastian Scherer RI Technical Reports, CMU 2015 project / slides This work exploits the geocentric encoding of a depth image and uses a multi-scale deep convolutional neural network architecture that captures high- and low-level features of a scene to generate rich semantic labels.
	Digitization of Historic Inscription Images using Cumulants based Simultaneous Blind Source Extraction. N. Jayanthi, Ayush Tomar, Aman Raj, S. Indu, Santanu Chaudhury ICVGIP 2014 The proposed technique separates the text layer from historic inscription images by treating the problem as blind source separation, which computes the independent components of a linear mixture of source signals by maximizing a contrast function based on higher-order cumulants.
	Enhancement and Retrieval of Historic Inscription Images. S. Indu, Ayush Tomar, Aman Raj, Santanu Chaudhury ACCV 2014 We binarize inscription images using the proposed cumulants-based Blind Source Extraction (BSE) method and store them in a digital library with their corresponding historic information, which can be retrieved later using image retrieval algorithms such as BoW.

Patents

System and Method for Deep Machine Learning for Computer Vision Applications
US20210124985A1, US20220391632A1

Miscellaneous

	Teaching Assistant, CSE 12 - Spring 2019
	Comic Polyglot CMU Winter School, 2014 (Best Project Award) poster Comic Polyglot: a system that identifies the text regions in comic strips such as manga and translates their Japanese text into English using an OCR engine while maintaining the syntax. It aims to help English-speaking manga readers.
	Lunabot NASA's Lunabotics Mining Competition, 2013 paper / outreach / gallery Aaravya Lunabot - DTU’s official entry into the NASA Lunabotics Mining Competition 2013. The challenge required student teams to design and build a mining robot that can traverse chaotic simulated lunar terrain, excavate lunar regolith, and deposit it into a collector bin within ten minutes.

Website design from here