We address the challenge of zeroth-order online convex optimization where the objective function's gradient exhibits sparsity. Our aim is to leverage this sparsity to obtain estimates of the objective function's gradient when only a limited number of function samples are available. We tackle this problem by introducing the Compressive Online Gradient Optimization (CONGO) framework, which allows compressive sensing methods previously applied to stochastic optimization to achieve regret bounds with an optimal dependence on the time horizon. Moreover, the number of samples required is only logarithmic in the full problem dimension. We use simulations and a microservice autoscaling problem to demonstrate CONGO's superiority over gradient descent approaches that do not account for sparsity.
J. Carleton, P. Vijaykumar, D. Saxena, D. Narasimha, S. Shakkottai, A. Akella
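A minimal sketch of the core idea, assuming a standard compressive-sensing pipeline (not CONGO's actual implementation; the names `omp` and `estimate_sparse_gradient` are illustrative): directional finite differences give linear measurements of the sparse gradient, and a recovery routine reconstructs it from a number of samples logarithmic in the dimension.

```python
# Sketch: estimate an s-sparse gradient in R^d from m ~ O(s log d) function
# evaluations, via random directional finite differences + sparse recovery.
import numpy as np

def omp(A, y, s):
    """Orthogonal matching pursuit: recover an s-sparse x with A @ x ~ y."""
    support, r, x = [], y.copy(), np.zeros(A.shape[1])
    for _ in range(s):
        j = int(np.argmax(np.abs(A.T @ r)))   # column most correlated with residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef
    x[support] = coef
    return x

def estimate_sparse_gradient(f, x, m, s, delta=1e-4, rng=np.random.default_rng(0)):
    """Each measurement is a directional finite difference, so
    y_i = v_i @ grad f(x) + O(delta); OMP then exploits sparsity."""
    d = x.size
    V = rng.standard_normal((m, d)) / np.sqrt(m)
    y = np.array([(f(x + delta * v) - f(x)) / delta for v in V])
    return omp(V, y, s)

# Toy check: the gradient of f(x) = x_3^2 + 2 x_7 is 2-sparse in R^100.
f = lambda x: x[3]**2 + 2.0 * x[7]
g_hat = estimate_sparse_gradient(f, np.ones(100), m=25, s=2)
print(np.nonzero(np.round(g_hat, 3))[0])      # expect indices 3 and 7
```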
We prove that a single-layer softmax attention transformer (SAT) computes the optimal solution of canonical estimation problems in wireless communications in the limit of large prompt length. We also prove that the optimal configuration of such a transformer is indeed the minimizer of the corresponding training loss.
V. T. Kunde, V. Rajagopalan, C. S. K. Valmeekam, K. Narayanan, S. Shakkottai, D. Kalathil, J.-F. Chamberland
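The mechanism can be illustrated on a toy regression problem: a single softmax-attention pass over a prompt of (x_i, y_i) pairs is a kernel-weighted average of the values, which concentrates on the conditional mean as the prompt grows. This sketch uses a squared-distance attention score in place of trained query/key maps, and is not the paper's wireless setting.

```python
# Sketch: one softmax attention head over a prompt of (x_i, y_i) pairs acts as
# a Nadaraya-Watson estimator whose output approaches E[y | x] for long prompts.
import numpy as np

rng = np.random.default_rng(0)
n = 5000                                        # prompt length
x = rng.standard_normal(n)
y = np.sin(x) + 0.1 * rng.standard_normal(n)    # E[y | x] = sin(x)

def attention_estimate(xq, x, y, beta=50.0):
    """Query xq, keys x, values y; squared-distance score stands in for QK^T."""
    logits = -beta * (xq - x) ** 2
    w = np.exp(logits - logits.max())
    w /= w.sum()                                # softmax weights
    return w @ y                                # attention output = weighted mean

xq = 0.8
print(attention_estimate(xq, x, y), np.sin(xq)) # both close to sin(0.8) ~ 0.717
```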
We propose two novel distributionally robust direct preference optimization (DPO) algorithms, Wasserstein DPO (WDPO) and Kullback-Leibler DPO (KLDPO), that mitigate large language model (LLM) alignment failures arising from shifting user preferences. We prove theoretical sample complexity guarantees for these algorithms and also demonstrate empirically that WDPO and KLDPO significantly improve alignment performance under preference distribution shifts.
Z. Xu, S. Vemuri, K. Panaganti, D. Kalathil, R. Jain, D. Ramachandran
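As a rough illustration (not the paper's WDPO/KLDPO derivations), the sketch below pairs the standard DPO per-pair loss with the classical KL-DRO dual, sup over KL(Q||P) <= eps of E_Q[loss] = min over tau > 0 of tau*eps + tau*log E_P[exp(loss/tau)], which up-weights the hardest preference pairs; all names and parameters are illustrative.

```python
# Sketch: standard DPO loss + generic KL-DRO dual at a fixed temperature tau.
import numpy as np

def dpo_losses(logr_w, logr_l, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (log-ratio_win - log-ratio_lose)).
    logr_* are log pi_theta/pi_ref for the chosen / rejected responses."""
    z = beta * (logr_w - logr_l)
    return np.log1p(np.exp(-z))                 # -log sigmoid(z)

def kl_robust_objective(losses, tau=1.0, eps=0.1):
    """KL-DRO dual at fixed tau (one could additionally minimize over tau)."""
    m = losses.max()                            # log-sum-exp stabilization
    return m + tau * np.log(np.mean(np.exp((losses - m) / tau))) + tau * eps

rng = np.random.default_rng(0)
losses = dpo_losses(rng.normal(1.0, 2.0, 256), rng.normal(0.0, 2.0, 256))
print(np.mean(losses), kl_robust_objective(losses))  # robust value >= empirical mean
```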
We propose DiffusionBlend, which enables test-time alignment of diffusion models to user-defined objective and regularization weights without additional fine-tuning. DiffusionBlend leverages the closed-form solution of regularized fine-tuning to combine pre-finetuned drift terms for individual reward functions at test time, guided by user-specified objective and regularization weights.
M. Cheng, F. Doudi, D. Kalathil, M. Ghavamzadeh, P. R. Kumar
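One simple way such a test-time combination could look, under the assumption that each fine-tuned drift differs from the pretrained drift by a reward-specific correction that can be linearly blended (a hypothetical reading for illustration, not necessarily DiffusionBlend's exact rule):

```python
# ASSUMED blending rule for illustration only:
#   drift = drift_pre + sum_i w_i * (drift_i - drift_pre),
# i.e. linearly combining pre-finetuned corrections at sampling time,
# with no gradient steps or retraining.
import numpy as np

def blended_drift(x, t, drift_pre, finetuned_drifts, weights):
    base = drift_pre(x, t)
    return base + sum(w * (d(x, t) - base) for w, d in zip(weights, finetuned_drifts))

# Toy stand-ins for the pretrained and per-reward drift networks.
drift_pre = lambda x, t: -x
drift_r1  = lambda x, t: -x + 1.0               # pushes samples toward +1
drift_r2  = lambda x, t: -x - 2.0               # pushes samples toward -2
print(blended_drift(np.zeros(4), 0.5, drift_pre, [drift_r1, drift_r2], [0.7, 0.3]))
```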
We show that English text is substantially more compressible than previously believed. Our lossless compression algorithm combines predictions from modern large language models with arithmetic coding and compresses text to about 0.7 bits/character, outperforming state-of-the-art compressors such as BSC and PAQ.
C. S. K. Valmeekam, K. Narayanan, D. Kalathil, J.-F. Chamberland, S. Shakkottai
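A minimal sketch of the pipeline's core mechanism: an arithmetic coder whose interval splits are driven by a predictive model's next-symbol distribution. A fixed unigram model stands in for the LLM here, and exact rational arithmetic replaces the streaming, renormalized coder a production compressor would use.

```python
# Sketch: model-driven arithmetic coding; better predictions => wider kept
# intervals => fewer output bits (code length ~ -log2 of final interval width).
from fractions import Fraction

ALPHABET = "abc"
def model_probs(context):
    """Stand-in for an LLM's next-token distribution P(. | context)."""
    return {"a": Fraction(6, 10), "b": Fraction(3, 10), "c": Fraction(1, 10)}

def encode(text):
    lo, hi = Fraction(0), Fraction(1)
    for i, ch in enumerate(text):
        p, c = model_probs(text[:i]), Fraction(0)
        for sym in ALPHABET:                    # narrow [lo, hi) to ch's subinterval
            if sym == ch:
                lo, hi = lo + (hi - lo) * c, lo + (hi - lo) * (c + p[sym])
                break
            c += p[sym]
    return (lo + hi) / 2                        # any rational inside final interval

def decode(code, n):
    out, lo, hi = "", Fraction(0), Fraction(1)
    for _ in range(n):
        p, c = model_probs(out), Fraction(0)
        for sym in ALPHABET:
            nlo, nhi = lo + (hi - lo) * c, lo + (hi - lo) * (c + p[sym])
            if nlo <= code < nhi:
                out, lo, hi = out + sym, nlo, nhi
                break
            c += p[sym]
    return out

msg = "aababca"
assert decode(encode(msg), len(msg)) == msg
```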
We explore coverless steganography using a Large Language Model (LLM) to guide an arithmetic coding decoder in generating stego-texts, aiming to embed secret bits efficiently while maintaining natural language fluency. Our approach optimizes a replacement probability distribution under a KL divergence constraint, offering a closed-form solution and addressing practical challenges such as tokenization mismatch, vocabulary truncation, and integration with prior steganography methods such as Discop.
Y.S. Huang, P. Just, K. Narayanan, C. Tian
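The basic embedding mechanism can be sketched as follows (a simplified stand-in, not the paper's full scheme with the KL-constrained replacement distribution): the secret bits are read as a binary fraction, and an arithmetic decoder uses them to select each next token from the model's distribution, so the stego-text remains fluent under the model.

```python
# Sketch: arithmetic-decode secret bits into model-plausible tokens; a unigram
# model stands in for the LLM. The receiver re-runs the same model and interval
# computation on the stego-text to recover the embedded bits.
from fractions import Fraction

ALPHABET = "abc"
probs = {"a": Fraction(6, 10), "b": Fraction(3, 10), "c": Fraction(1, 10)}

def bits_to_fraction(bits):
    return Fraction(int(bits, 2), 2 ** len(bits)) + Fraction(1, 2 ** (len(bits) + 1))

def embed(bits, n_tokens):
    u, lo, hi, out = bits_to_fraction(bits), Fraction(0), Fraction(1), ""
    for _ in range(n_tokens):
        c = Fraction(0)
        for sym in ALPHABET:                    # pick the token whose interval holds u
            nlo, nhi = lo + (hi - lo) * c, lo + (hi - lo) * (c + probs[sym])
            if nlo <= u < nhi:
                out, lo, hi = out + sym, nlo, nhi
                break
            c += probs[sym]
    return out

print(embed("1011", 8))                         # fluent-looking stego-text
```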
In coded distributed matrix multiplication, a master node wishes to compute a large-scale matrix product with the help of several worker nodes. We propose a class of codes for this setting, called Squeezed Random Khatri-Rao Product (RKRP) codes. We show that squeezed RKRP codes are MDS with probability 1 and have optimal recovery thresholds with probability 1.
R. Ji, A. K. Pradhan, A. Heidarzadeh, K. Narayanan
IEEE Information Theory Workshop (ITW), 2021
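A small numerical sketch of the coding idea (illustrative, not the exact squeezed construction): each worker multiplies random combinations of row-blocks of A and column-blocks of B, so the effective generator rows are Kronecker/Khatri-Rao products of the random coefficient vectors, and any p*q worker results decode with probability 1.

```python
# Sketch: RKRP-style coded matrix multiplication with straggler tolerance.
import numpy as np

rng = np.random.default_rng(0)
p = q = 2                                       # A: p row-blocks, B: q col-blocks
A, B = rng.standard_normal((4, 3)), rng.standard_normal((3, 4))
A_blocks, B_blocks = np.split(A, p, axis=0), np.split(B, q, axis=1)

n_workers = p * q + 2                           # two extra workers = two stragglers ok
a = rng.standard_normal((n_workers, p))         # random encoding coefficients
b = rng.standard_normal((n_workers, q))

# Worker k computes (sum_i a_ki A_i) @ (sum_j b_kj B_j): one small product each.
results = [sum(a[k, i] * A_blocks[i] for i in range(p))
           @ sum(b[k, j] * B_blocks[j] for j in range(q)) for k in range(n_workers)]

survivors = [0, 2, 3, 4]                        # any p*q = 4 non-stragglers suffice
G = np.stack([np.kron(a[k], b[k]) for k in survivors])   # Khatri-Rao generator rows
Y = np.stack([results[k].ravel() for k in survivors])
X = np.linalg.solve(G, Y)                       # recover the p*q block products A_i B_j
C = np.block([[X[i * q + j].reshape(2, 2) for j in range(q)] for i in range(p)])
print(np.allclose(C, A @ B))                    # True
```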
Addressing the limitations of conventional scalar rewards, DOPL (Direct Online Preference Learning) leverages pairwise preference feedback to navigate restless multi-armed bandit problems. By learning directly from online preferences, the approach achieves sublinear regret of O(√(T ln T)), offering both theoretical guarantees and empirical validation for adaptive decision-making in environments where reward signals are scarce or imprecise.
G. Xiong, U. Dinesha, D. Mukherjee, J. Li, S. Shakkottai
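As background for how pairwise feedback can substitute for scalar rewards (a generic Bradley-Terry illustration, not the DOPL algorithm itself): empirical win rates between arms identify reward gaps through the logistic link, so the learner never needs to observe rewards directly.

```python
# Sketch: under the Bradley-Terry model, P(i preferred to j) = sigmoid(r_i - r_j),
# so inverting empirical win rates recovers reward differences from preferences.
import numpy as np

rng = np.random.default_rng(0)
r = np.array([0.0, 0.5, 1.5])                   # latent rewards (never observed)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

wins, plays = np.zeros((3, 3)), np.zeros((3, 3))
for _ in range(20000):                          # pairwise preference feedback only
    i, j = rng.choice(3, size=2, replace=False)
    plays[i, j] += 1
    wins[i, j] += rng.random() < sigmoid(r[i] - r[j])

p_hat = wins[0, 2] / plays[0, 2]                # empirical win rate of arm 0 vs arm 2
print(np.log(p_hat / (1 - p_hat)), r[0] - r[2]) # both approx -1.5
```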
We consider a wireless network where each user/sensor wishes to communicate an image to a central base station for classification under intrinsic class imbalance. We introduce a new framework for encoding images in which a single neural network performs joint source and channel coding for the purpose of classification. The main novelty in the proposed approach is that all users simultaneously transmit features extracted from their images over the air, and inference is performed directly on the received signal (the noisy sum of the transmitted signals) using neural group testing. We show that our framework achieves significantly lower communication and computational costs than traditional communication schemes while offering privacy and better classification performance.
C. S. K. Valmeekam, K. Narayanan, A. Sprintson
IEEE International Conference on Machine Learning for Communication and Networking (ICMLCN), 2024
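The over-the-air step can be sketched in a few lines (shapes and "networks" here are illustrative stand-ins, not the paper's trained models): users transmit their feature vectors simultaneously, the channel forms their noisy sum, and inference runs directly on that aggregate rather than on K separately decoded images.

```python
# Sketch: over-the-air feature aggregation for group-level inference.
import numpy as np

rng = np.random.default_rng(0)
K, d_img, d_feat = 8, 64, 16
W_enc = rng.standard_normal((d_feat, d_img)) / np.sqrt(d_img)  # shared encoder stand-in

images = rng.standard_normal((K, d_img))
features = images @ W_enc.T                     # each user extracts features locally
y = features.sum(axis=0) + 0.1 * rng.standard_normal(d_feat)   # superposition channel

# The receiver decides on the aggregate directly (e.g., "any positive in group?"),
# never reconstructing individual images -- the source of the privacy benefit.
W_cls = rng.standard_normal(d_feat)
print(float(W_cls @ y > 0))                     # group decision from one channel use
```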
We propose a semi-blind decision-directed joint channel estimation and decoding algorithm for low-density parity check (LDPC) coded Orthogonal Frequency Division Multiplexing (OFDM) systems. There are two novel aspects in our algorithm: (i) an enhanced semi-blind channel estimation algorithm that hypothesizes a small set of symbols as additional pilots, and (ii) an edge-strength-based metric that identifies a small subset of decisions at the output of the LDPC decoder as reliable. Our algorithm has an order of magnitude lower complexity than a recently proposed machine learning-based channel estimation algorithm while achieving comparable performance.
C. S. K. Valmeekam, K. Narayanan
IEEE Wireless Communications and Networking Conference (WCNC), 2022
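A simplified sketch of the decision-directed loop, with the LDPC decoder and the edge-strength metric replaced by stand-ins (hard BPSK decisions and a distance-to-decision confidence score): confident data decisions are promoted to virtual pilots, and the channel is re-estimated on the denser grid.

```python
# Sketch: pilot-based LS channel estimation, then decision-directed refinement.
import numpy as np

rng = np.random.default_rng(0)
N, step = 64, 8                                 # subcarriers; pilot spacing
t = np.arange(N)
h = np.exp(1j * 2 * np.pi * t / N) * (1.0 + 0.3 * np.cos(2 * np.pi * t / N))
x = 2 * rng.integers(0, 2, N) - 1.0             # BPSK symbols (known only at pilots)
y = h * x + 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

def interp_estimate(idx):
    """LS estimate at the known symbols idx, interpolated to all subcarriers."""
    h_ls = y[idx] / x[idx]
    return np.interp(t, idx, h_ls.real) + 1j * np.interp(t, idx, h_ls.imag)

pilots = t[t % step == 0]
h1 = interp_estimate(pilots)                    # first pass: pilots only

eq = y / h1                                     # equalize, then hard BPSK decisions
conf = np.abs(eq.real)                          # stand-in reliability metric
virtual = t[(t % step != 0) & (conf > 0.9)]     # promote confident decisions
x[virtual] = np.sign(eq.real[virtual])          # decision-directed symbol hypotheses
h2 = interp_estimate(np.sort(np.concatenate([pilots, virtual])))

print(np.abs(h1 - h).mean(), np.abs(h2 - h).mean())  # second pass should be closer
```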