Technical Skills – Parallel Programming (OpenMP, CILK), Functional Programming (Haskell), Assembly Programming (DSP, RISC, x86, MIPS), Embedded Systems and DSP
Graduate Student Researcher, University of California San Diego
Advisor: Prof. Michael B. Taylor Summer ’08 – Present
Attacking Memory and On-chip Network Latencies in Multi-core Architectures 
The performance of multi-core systems is severely limited by memory and on-chip communication latencies. In this project, we reduce both by finding efficient mappings of memory accesses and by redistributing data so that it is placed closer to the execution nodes, thus reducing off-chip memory accesses.
Parallel Programming on Many-core Systems
The goal of this project was to parallelize computer vision applications using the CILK and OpenMP programming models such that speedup scales with the number of cores. The main challenges were the negative interference of shared resources and the overheads introduced by the parallel programming models. The project involved applying complex code and data transformation techniques and evaluating them on 32-core systems, which required understanding the system-level design of 32-core machines and optimizing the software to map efficiently onto the hardware.
Automating Parallelism Discovery and Planning [2, 3]
Developed compiler and runtime techniques, such as dynamic critical path analysis, to automatically expose parallelism and predict the speedup of target programs. We applied these techniques to SD-VBS (the San Diego Vision Benchmark Suite), among others, to expose the underlying parallelism that could be extracted using multi-core architectures.
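As a rough illustration of the idea behind critical path analysis (not the project's actual tooling), the following minimal Python sketch assumes a program has already been reduced to a dependence graph of tasks with known costs. The function names, the task names, and the costs are all hypothetical. The upper bound on speedup is the total work divided by the length of the longest dependence chain.

```python
from functools import lru_cache


def critical_path_length(costs, deps):
    """Return the cost of the longest dependence chain in a task DAG.

    costs: {task: cost}; deps: {task: [tasks it depends on]}.
    Both structures are hypothetical inputs for this sketch.
    """
    @lru_cache(maxsize=None)
    def finish(task):
        # Earliest finish time: own cost plus the latest-finishing dependency.
        return costs[task] + max((finish(d) for d in deps.get(task, ())), default=0)

    return max(finish(t) for t in costs)


def speedup_bound(costs, deps):
    """Upper bound on parallel speedup: total work / critical path length."""
    return sum(costs.values()) / critical_path_length(costs, deps)


# Hypothetical four-stage vision pipeline: two filters can run in parallel.
costs = {"load": 2, "blur": 4, "edges": 4, "merge": 2}
deps = {"blur": ["load"], "edges": ["load"], "merge": ["blur", "edges"]}
```

In this made-up example the total work is 12 units and the longest chain (load, blur, merge) costs 8, so no number of cores can give more than a 1.5x speedup.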
SD-VBS: The San Diego Vision Benchmark Suite 
Developed SD-VBS, a suite of diverse applications drawn from the vision domain. Vision applications have a fair amount of parallelism, which makes them good candidates for driving the design of future multi-core and parallel architectures. SD-VBS is intended to help architects, compiler writers, and system designers study the construction of future systems that excel at vision-oriented applications.
University of California, San Diego Fall 2010
CS Teaching Assistant – Graduate Computer Architecture. Facilitated a graduate class on the principles and design of computer architecture and advanced memory systems. Took the initiative to design new course material and led discussions on emerging research in core and interconnect design.
Sarnoff Innovative Technologies, Bangalore July 2006 – July 2008
Research Engineer – Applied Vision. Developed DSP-portable image processing and vision algorithms for real-time applications such as Adaptive Cruise Control, Pedestrian Tracking, and Lane Departure Warning systems. This involved porting vision applications from Matlab to C and assembly language. The applications had to be converted from floating-point to fixed-point arithmetic and optimized for the underlying DSP pipeline architecture and memory model. Finally, these DSP-compatible vision applications were packaged with real-time performance of >25 fps.
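For readers unfamiliar with floating-point to fixed-point conversion, here is a minimal Python sketch of the general technique, assuming the common signed 16-bit Q15 format. The format choice and all function names are assumptions made for illustration, not details of the original DSP code.

```python
Q = 15            # number of fractional bits (Q15 format, assumed here)
SCALE = 1 << Q    # 32768


def saturate(v):
    """Clamp an integer to the signed 16-bit range used by Q15."""
    return max(-SCALE, min(SCALE - 1, v))


def to_q15(x):
    """Quantize a float in roughly [-1, 1) to a Q15 integer, saturating on overflow."""
    return saturate(int(round(x * SCALE)))


def from_q15(v):
    """Convert a Q15 integer back to a float."""
    return v / SCALE


def q15_mul(a, b):
    """Multiply two Q15 values, rounding the product back into Q15."""
    return saturate((a * b + (1 << (Q - 1))) >> Q)
```

For example, `q15_mul(to_q15(0.5), to_q15(0.5))` yields the integer 8192, which `from_q15` maps back to 0.25. Values at or above 1.0 saturate to the largest representable Q15 value instead of wrapping around.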
Analog Devices, Bangalore July 2005 – Dec 2005
Intern – DSP Software. Worked on porting audio/video codecs, such as the H.263 and H.323 standards, onto the Blackfin DSP processor. Implemented DSP-compatible system-C and assembly code for the audio/video codecs that runs in real time at >25 fps.
Birla Institute of Technology and Science, Pilani Jan 2005 – July 2005
EE/CS Teaching Assistant – Microprocessor Design and Integration.
SD-VBS: The San Diego Vision Benchmark Suite
Sravanthi Kota V, I. Ahn, D. Jeon, A. Gupta, C. Louie, S. Garcia, S. Belongie, M. B. Taylor.