Aman Raj

I am a Senior Machine Learning Research Engineer at Apple Inc., where I develop core vision technologies that power new user experiences across the ecosystem of Apple devices and services.

I went to graduate school in the beautiful city of San Diego, where I completed an MS degree in Machine Learning and Data Science at the University of California, San Diego. In Spring 2020, I defended my MS thesis, titled Learning Augmentation Policy Schedules for Unsupervised Depth Estimation, which addresses depth estimation in bad weather conditions for autonomous driving.

After completing my B.Tech at Delhi Technological University, I worked as a Software Engineer at Facebook and as a Research Intern at Samsung Labs. My research has received awards such as The Jack Dangermond Award – Best Paper.

Email  /  CV  /  Google Scholar  /  Github  /  LinkedIn

Research

My research interests are in computer vision, machine learning and optimization, and image processing. The majority of my research is about image and video understanding.

Learning Augmentation Policy Schedules for Unsupervised Depth Estimation.
Aman Raj
UC San Diego Electronic Theses and Dissertations, 2020
thesis / code

My MS thesis proposes a novel approach to data augmentation for unsupervised depth estimation. The method learns data augmentation strategies from the data itself.

SUW-Learn: Joint Supervised, Unsupervised, Weakly Supervised Deep Learning for Monocular Depth Estimation.
Haoyu Ren, Aman Raj, Mostafa El-Khamy and Jungwon Lee
CVPR, 2020
video / supplement

A framework for deep learning with joint supervised learning (S), unsupervised learning (U), and weakly supervised learning (W). We deploy SUW-Learn for deep learning of monocular depth from images and video sequences.

SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception.
Yue Meng, Yongxi Lu, Aman Raj, Samuel Sunarjo, Rui Guo, Tara Javidi, Gaurav Bansal, Dinesh Bharadia
CVPR, 2019
project / supplement

SIGNet integrates semantic information to make depth and flow predictions consistent with objects and robust to low-light conditions. SIGNet is shown to improve upon state-of-the-art unsupervised learning for depth prediction.

A Holistic Framework for Addressing the World using Machine Learning.
Ilke Demir, Forest Hughes, Aman Raj, Kaunil Dhruv, Suryanarayana Murthy Muddla, Sanyam Garg, Barrett Doo, Ramesh Raskar
CVPR, 2018
project

We propose an automatic generative algorithm to create street addresses from satellite imagery. Our addressing scheme is coherent with the street topology, linear and hierarchical to follow human perception, and universal to be used as a unified geocoding system.

Generative Street Addresses from Satellite Imagery.
Ilke Demir, Forest Hughes, Aman Raj, Kaunil Dhruv, Suryanarayana Murthy Muddla, Sanyam Garg, Barrett Doo, Ramesh Raskar
ISPRS, 2018   (The Jack Dangermond Award – Best Paper)
project / code / talk

Our algorithm first extracts roads from satellite imagery using deep learning. It then uniquely labels the regions, roads, and structures using graph- and proximity-based algorithms.

Robocodes: Towards Generative Street Addresses from Satellite Imagery.
Ilke Demir, Forest Hughes, Aman Raj, Kleovoulos Tsourides, Divyaa Ravichandran, Suryanarayana Murthy, Kaunil Dhruv, Sanyam Garg, Jatin Malhotra, Barrett Doo, Grace Kermani, Ramesh Raskar
CVPR, 2017   (Best Paper Award)
project / code / blog / news

We describe our automatic generative algorithm to create street addresses (Robocodes) from satellite images by learning and labeling regions, roads, and blocks.

FPGA Accelerated Abandoned Object Detection.
Rajesh Rohilla, Aman Raj, Saransh Kejriwal, Rajiv Kapoor
ICCTICT, IEEE 2016
slides

We propose an FPGA hardware implementation of an abandoned object detection algorithm, aimed at building a custom chip that can perform real-time inference on a live video feed.

Multi-Scale Convolutional Architecture for Semantic Segmentation.
Aman Raj, Daniel Maturana, Sebastian Scherer
RI Technical Reports, CMU 2015
project / slides

This work exploits the geocentric encoding of a depth image and uses a multi-scale deep convolutional neural network architecture that captures high- and low-level features of a scene to generate rich semantic labels.

Digitization of Historic Inscription Images using Cumulants based Simultaneous Blind Source Extraction.
N. Jayanthi, Ayush Tomar, Aman Raj, S. Indu, Santanu Chaudhury
ICVGIP 2014

The proposed technique separates the text layer from historic inscription images by treating the problem as blind source separation, which aims to compute the independent components from a linear mixture of source signals by maximizing a contrast function based on higher-order cumulants.

Enhancement and Retrieval of Historic Inscription Images.
S. Indu, Ayush Tomar, Aman Raj, Santanu Chaudhury
ACCV 2014

We binarize inscription images using the proposed cumulants-based Blind Source Extraction (BSE) method and store them in a digital library with their corresponding historic information, which can be retrieved later using image retrieval algorithms such as BoW.

Patents
System and Method for Deep Machine Learning for Computer Vision Applications
US20210124985A1, US20220391632A1
Miscellaneous
Teaching Assistant, CSE 12 - Spring 2019
Comic Polyglot
CMU Winter School, 2014   (Best Project Award)
poster

Comic Polyglot: a system that identifies the text regions in comic strips such as manga and then translates their Japanese text into English using an OCR engine while maintaining the syntax. It aims to help English-speaking manga readers.

Lunabot
NASA's Lunabotics Mining Competition, 2013
paper / outreach / gallery

Aaravya Lunabot: DTU’s official entry into NASA's Lunabotics Mining Competition 2013. The challenge required student teams to design and build a mining robot that can traverse simulated chaotic lunar terrain, excavate lunar regolith, and deposit the regolith into a collector bin within ten minutes.

