publications
Publications by category, in reverse chronological order. Generated by jekyll-scholar.
2025
- [TMLR] CoDe: Blockwise Control for Denoising Diffusion Models
  Anuj Singh, Sayak Mukherjee, Ahmad Beirami, and Hadi J. Rad
  Transactions on Machine Learning Research, 2025
Aligning diffusion models to downstream tasks often requires finetuning new models or gradient-based guidance at inference time to enable sampling from the reward-tilted posterior. In this work, we explore a simple inference-time gradient-free guidance approach, called controlled denoising (CoDe), that circumvents the need for differentiable guidance functions and model finetuning. CoDe is a blockwise sampling method applied during intermediate denoising steps, allowing for alignment with downstream rewards. Our experiments demonstrate that, despite its simplicity, CoDe offers a favorable trade-off between reward alignment, prompt instruction following, and inference cost, achieving competitive performance against state-of-the-art baselines.
@article{singh2025code,
  title = {CoDe: Blockwise Control for Denoising Diffusion Models},
  author = {Singh, Anuj and Mukherjee, Sayak and Beirami, Ahmad and Rad, Hadi J.},
  journal = {Transactions on Machine Learning Research},
  issn = {2835-8856},
  year = {2025},
  url = {https://openreview.net/forum?id=DqPCWMiMU0}
}
- [ICMI] Towards Context-sensitive Emotion Recognition
  Sayak Mukherjee
  In Proceedings of the 27th International Conference on Multimodal Interaction, 2025
Achieving socially compatible human-AI interaction requires systems that can interpret and respond to human emotions appropriately in complex social environments. While traditional emotion recognition models rely heavily on facial or bodily expressions, a growing body of research demonstrates that such cues are insufficient without the dynamic, multimodal contextual cues. Positioned at the intersection of cognitive psychology and AI, this work identifies three essential qualities for context-sensitive emotion recognition (CSER): generalizability to unseen scenarios, data efficiency in adapting to new contexts, and reliability in predictive performance across contexts. We outline a research plan that systematically investigates the role of contextual factors, domain adaptation, and uncertainty quantification in building CSER models capable of robust performance across real-world settings. Our approach integrates computational rigour with ethical responsibility to lay the foundation for next-generation emotion-aware systems that are not only accurate but also trustworthy, transparent, and support human well-being in digital interactions.
@inproceedings{mukherjee2025towards,
  author = {Mukherjee, Sayak},
  title = {Towards Context-sensitive Emotion Recognition},
  year = {2025},
  isbn = {9798400714993},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3716553.3750824},
  doi = {10.1145/3716553.3750824},
  booktitle = {Proceedings of the 27th International Conference on Multimodal Interaction},
  pages = {730--734},
  numpages = {5},
  keywords = {Context-aware Emotion Recognition, Affective Computing, Context, Data-efficiency, Generalisation},
  series = {ICMI '25}
}
2024
- [arXiv] MAPL: Model Agnostic Peer-to-peer Learning
  Sayak Mukherjee, Andrea Simonetto, and Hadi Jamali-Rad
  arXiv preprint arXiv:2403.19792, 2024
Effective collaboration among heterogeneous clients in a decentralized setting is a rather unexplored avenue in the literature. To structurally address this, we introduce Model Agnostic Peer-to-peer Learning (MAPL), a novel approach to simultaneously learn heterogeneous personalized models as well as a collaboration graph through peer-to-peer communication among neighboring clients. MAPL comprises two main modules: (i) local-level Personalized Model Learning (PML), leveraging a combination of intra- and inter-client contrastive losses; and (ii) network-wide decentralized Collaborative Graph Learning (CGL), dynamically refining collaboration weights in a privacy-preserving manner based on local task similarities. Our extensive experimentation demonstrates the efficacy of MAPL and its competitive (or, in most cases, superior) performance compared to its centralized model-agnostic counterparts, without relying on any central server.
@article{mukherjee2024mapl,
  title = {MAPL: Model Agnostic Peer-to-peer Learning},
  author = {Mukherjee, Sayak and Simonetto, Andrea and Jamali-Rad, Hadi},
  journal = {arXiv preprint arXiv:2403.19792},
  year = {2024}
}