I am an Applied Scientist at Amazon and a graduate of the Department of Computer Science and Engineering at The Ohio State University. I received my Ph.D. in Computer Science, advised by Professor Kannan Srinivasan, while working in the CoSyNe group.

I’m interested in applying machine learning to problems in wireless communication and time series data. I also enjoy working on problems in network/graph mining and text mining. Before joining OSU, I received my MS in Computer Science from the University of Cincinnati.

Projects and publications

Machine Learning Aided Channel Prediction:

Devices in cellular networks need to exchange band-specific channel state information (CSI) frequently. Some techniques can eliminate this overhead, but they are computationally expensive and have long runtimes. This project leverages deep neural networks to significantly reduce the computation and time that process requires. The system described in this paper allows a device to predict its downlink CSI from its uplink CSI while being orders of magnitude faster than other approaches.

Abstract:

Channel information plays an important role in modern wireless communication systems. Systems that use different frequency bands for uplink and downlink communication often need feedback between devices to exchange band specific channel information. The current state-of-the-art approach proposes a way to predict the channel in the downlink based on that of the observed uplink by identifying variables underlying the uplink channel. In this paper we present a solution that greatly reduces the complexity of this task, and is even applicable for single antenna devices. Our approach uses a neural network trained on a standard channel model to generate coarse estimates for the variables underlying the channel. We then use a simple and efficient single antenna optimization framework to get more accurate variable estimates, which can be used for downlink channel prediction. We implement our approach on software defined radios and compare it to the state-of-the-art through experiments and simulations. Results show that our approach reduces the time complexity by at least an order of magnitude (10x), while maintaining similar prediction quality.

An overview of OptML: A neural network model is used to generate coarse estimates of the lengths of the multipath components in the observed channel. These estimates are then refined by an optimization process. The refined components are then used to calculate the channel in another band.
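
The last step of that overview, calculating the channel in another band from the path estimates, can be illustrated with a minimal sketch. It assumes the textbook multipath model, in which each path adds a phase rotation set by its length in wavelengths, and it assumes path gains that do not change with frequency. The function name, the path lengths, the gains, and the two carrier frequencies are hypothetical values chosen for illustration; this is not the OptML implementation, and the neural network and refinement stages are not shown.

```python
import numpy as np

C = 3e8  # speed of light in m/s


def channel(freq_hz, path_lengths_m, path_gains):
    """Narrowband channel at one frequency as a sum of multipath components.

    Assumption: gains are frequency independent, which is a simplification.
    """
    d = np.asarray(path_lengths_m, dtype=float)
    a = np.asarray(path_gains, dtype=complex)
    # Each path contributes a phase rotation proportional to its length
    # measured in wavelengths at this frequency.
    return np.sum(a * np.exp(-2j * np.pi * freq_hz * d / C))


# Hypothetical refined path estimates (illustrative, not from the paper).
paths = [30.0, 42.5, 57.0]
gains = [1.0, 0.5, 0.25]

# The same path estimates give the channel in both bands.
uplink = channel(1.8e9, paths, gains)
downlink = channel(1.9e9, paths, gains)
```

Because the path lengths are properties of the environment rather than of the carrier, estimates obtained from the uplink can be reused at the downlink frequency, which is what makes the prediction possible without feedback.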

Pattern Based Community Detection in Graph Datasets:

The goal of this project is to add supervision to the process of community detection in large graphs. We extract telltale patterns from a small seed set of communities and use them to greatly improve the quality of detected communities.

Abstract:

In recent years there have been a few semi-supervised community detection approaches that use community membership information, or node metadata to improve their performance. However, communities have always been thought of as clique-like structures, while the idea of finding and leveraging other patterns in communities is relatively unexplored. Online social networks provide a corpus of real communities in large graphs which can be used to understand dataset specific community patterns. In this paper, we design a way to represent communities concisely in an easy to compute feature space. We design an efficient community detection algorithm that uses size and structural information of communities from a training set to find communities in the rest of the graph. We show that our approach achieves 10% higher F1 scores on average compared to several other methods on large real-world graph datasets, even when the training set is small.

Unique patterns discovered by Bespoke

Full Duplex System for Vehicular Networks:

Communication systems in vehicular networks face issues such as a dynamic environment, rapidly changing network size and membership, and high coordination costs. This project addresses those issues and enables full-duplex communication between cars that is resilient to interference from other broadcast messages. The system is well suited for short road safety messages.

Abstract:

Reliable and timely delivery of periodic V2V (vehicle-to-vehicle) broadcast messages is essential for realizing the benefits of connected vehicles. Existing MAC protocols for ad hoc networks fall short of meeting these requirements. In this paper, we present CoReCast, the first collision-embracing protocol for vehicular networks. CoReCast provides high reliability and low delay by leveraging two unique opportunities: no strict constraint on energy consumption, and availability of GPS clocks to achieve near-perfect time and frequency synchronization. Due to low coherence time, the channel changes rapidly in vehicular networks. CoReCast embraces packet collisions and takes advantage of the channel dynamics to decode collided packets. The design of CoReCast is based on a preamble detection scheme that estimates channels from multiple transmitters without any prior information about them. The proposed scheme reduces the space and time requirements exponentially compared to existing schemes.

Low Latency MAC for IoT Devices:

A MAC protocol designed specifically for IoT traffic in a dense IoT deployment. It eliminates the large channel access delay experienced by such devices by making interference power predictable across space and time, and adapting to an appropriate rate and modulation.

Abstract:

Future Internet of Things (IoT) networks are expected to be composed of a large population of low-cost devices communicating dynamically with access points or neighboring devices to exchange small bundles of delay-sensitive data. To support the high-intensity and short-lived demands of these emerging networks, we propose an Efficient MAC paradigm for IoT (EMIT). Our paradigm bypasses the high overhead and coordination costs of existing MAC solutions by employing an interference-averaging strategy that allows users to share their resources simultaneously. In contrast to the predominant interference-suppressing approaches, EMIT exploits the dense and dynamic nature of IoT networks to reduce the spatio-temporal variability of interference and achieve low delay and high reliability in service. This paper introduces the foundational ideas of EMIT by characterizing the global interference statistics in terms of single-device operation, and develops power-rate allocation strategies to guarantee low-delay, high-reliability performance.

Compared to CSMA, the variation in interference across space and time is reduced under the proposed scheme (EMIT)

Quantitative Image Analysis of Brain Tumor Histopathology:

High-throughput medical imaging generates images that are good candidates for automated, data-driven analysis. This project presents a pipeline for efficiently identifying patterns that can be used to detect gene expression from H&E-stained biopsy images. The pipeline first identifies and segments cells present in the images. It then extracts local and global features based on patterns in cell count, density and layout that can be used for correlation analysis with gene expression data.

BiClustering on ChIP-Seq Data

Abstract:

In this paper we present a novel framework capable of finding statistically significant biclusters on ENCODE ChIP sequencing datasets. Our goal is to discover biclusters with low variance in terms of gene expression, which in theory should point to coherent functional modules.
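
As a small illustration of what "low variance" means for a bicluster, the sketch below scores a candidate as the variance of the submatrix selected by a set of rows and a set of columns. The function name and the toy matrix are hypothetical, and this scoring rule is only one simple choice made for illustration; it does not reproduce the framework's search procedure or its significance testing.

```python
import numpy as np


def bicluster_variance(matrix, rows, cols):
    """Variance of the submatrix picked out by the given rows and columns."""
    # np.ix_ builds an index that selects the row/column cross product.
    sub = np.asarray(matrix, dtype=float)[np.ix_(rows, cols)]
    return float(sub.var())


# Toy matrix (made-up values): rows and columns stand in for the two
# dimensions being clustered together.
data = np.array([
    [1.0, 1.0, 9.0],
    [1.0, 1.0, 2.0],
    [7.0, 3.0, 5.0],
])

tight = bicluster_variance(data, [0, 1], [0, 1])  # coherent block
loose = bicluster_variance(data, [0, 2], [0, 2])  # incoherent block
```

Under this score the first candidate, whose entries are all equal, is preferred over the second.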

Learning Cost-Sensitive Rules for Classification Tasks

Abstract:

Building accurate classifiers is very desirable for many KDD processes. Rule-based classifiers are appealing because of their simplicity and their self-explanatory nature in describing reasons for their decisions. The objective of classifiers generally has been to maximize the accuracy of predictions. When data points of different classes have different misclassification costs, it becomes desirable to minimize the expected cost of the classification decisions. In this paper we present an algorithm for inducing a rule-based classifier that (i) shifts the class boundaries so as to minimize the cost of misclassifications and (ii) refuses to announce a class decision for those regions of the data space that are likely to contribute significantly to the expected cost of decisions.
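
The two ideas in that abstract, choosing the class with the lowest expected cost and refusing to decide when that cost is still too high, can be sketched with the generic expected-cost decision rule. This is not the rule-induction algorithm from the paper: the function name, the cost matrix, the probabilities, and the rejection threshold below are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def decide(probs, cost, reject_threshold):
    """Return the class index with the lowest expected cost, or None to abstain.

    probs[i]   : estimated probability that the true class is i
    cost[i][j] : cost of predicting class j when the true class is i
    """
    n_classes = len(probs)
    # Expected cost of announcing class j, averaged over the true class.
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    best = min(range(n_classes), key=lambda j: expected[j])
    # Abstain when even the cheapest decision is too costly.
    if expected[best] > reject_threshold:
        return None
    return best


# Hypothetical costs: missing class 1 is ten times worse than missing class 0.
cost = [[0, 1], [10, 0]]

confident = decide([0.95, 0.05], cost, reject_threshold=0.6)
borderline = decide([0.8, 0.2], cost, reject_threshold=0.6)
```

In the second call class 0 is the more probable class, yet predicting it carries an expected cost of 2.0 against 0.8 for class 1, so the asymmetric costs shift the boundary toward class 1. Since 0.8 still exceeds the threshold of 0.6, the function abstains.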

Courses Taught

Course Projects