Home News Experience Research Education Achievements

I am a Research MSc student at MILA affiliated with University of Montréal supervised by Associate Professor Irina Rish.

I also am the Founder and President at Landskape AI, a theoretical and analytical deep learning non-profit research organization. Previously, I served as a Machine Learning Engineer at Weights & Biases where I worked on the Frameworks and Integration Team.

I broadly work on theoretical and analytical deep learning with focus on but not limited to the following domains:

  • Anytime Learning
  • Active Learning
  • Adversarial Robustness
  • Sparsity

Presently, I am working as a Visiting Research Scholar on topics of Sparsity at VITA, UT-Austin, under Dr. Zhangyang Wang.
In the past I have been fortunate to work with the likes of Dr. Amrita Chaturvedi from Indian Institute of Technology, Varanasi (IIT-BHU) in the field of biomedical data analysis and Vijay Kumar Verma from Indian Space Research Organization (ISRO) in the domain of Genetic Algorithms.

CV  /  Google Scholar  /  GitHub  /  Blog  /  Twitter


profile photo


   Research Experience

ResearcherApril. 2022 - Present
Morgan Stanley
Supervisor: Sahil Garg
Research Area: Continual Learning, Time Series, High-dimensional model based clustering


Visiting Research ScholarAug. 2021 - Present
VITA, University of Texas at Austin
Supervisor: Dr. Zhangyang Wang
Research Area: Sparsity, Robustness and Knowledge Distillation.


Research AssociateFeb. 2020 - Present
Laboratory of Space Research (LSR), University of Hong Kong
Supervisor: Dr. Quentin A. Parker
Research Area: Computer Vision applications in PNe Exploration.


Research InternJun. 2018 - Aug. 2018
NVIDIA AI Lab, Bennett University
Supervisors: Dr. Deepak Garg and Dr. Suneet Gupta
Research Area: Large Scale Visual Recognition.

   Industrial and Leadership Experience

Founder, President and ResearcherSept. 2019 - Present
Landskape AI
Mentors: Assc. Prof. Jaegul Choo, Javier Ideami and Federico Lois
Research Area: Analytical Deep Learning Theory.


Machine Learning EngineerDec. 2020 - Oct. 2021
Weights & Biases
Team: Frameworks and Integrations.


Technical Content DeveloperJun. 2020 - Jan. 2021
Topic Area: Computer Vision (Attention Mechanisms).

*indicates equal contribution

nthu Mish: A Self Regularized Non-Monotonic Neural Activation Function
Diganta Misra

BMVC, 2020
project / paper / abstract / bibtex

We propose Mish, a novel self-regularized non-monotonic activation function which can be mathematically defined as: $f(x)=xtanh(softplus(x))$. As activation functions play a crucial role in the performance and training dynamics in neural networks, we validated experimentally on several well-known benchmarks against the best combinations of architectures and activation functions. We also observe that data augmentation techniques have a favorable effect on benchmarks like ImageNet-1k and MS-COCO across multiple architectures. For example, Mish outperformed Leaky ReLU on YOLOv4 with a CSP-DarkNet-53 backbone on average precision ($AP^{val}_{50}$) by $2.1\%$ in MS-COCO object detection and ReLU on ResNet-50 on ImageNet-1k in Top-1 accuracy by $\approx 1 \%$ while keeping all other network parameters and hyperparameters constant. Furthermore, we explore the mathematical formulation of Mish in relation with the Swish family of functions and propose an intuitive understanding on how the first derivative behavior may be acting as a regularizer helping the optimization of deep neural networks.

title={Mish: A self regularized non-monotonic neural activation function},
author={Misra, Diganta},
journal={arXiv preprint arXiv:1908.08681},
CV Talk Episode / ML Cafe Episode / Sicara Talk / W&B Salon Episode

GitHub Repo starsGitHub forks
Run on Gradient
For those who are curious, the name Mish was coined by my girlfriend. 👩‍💻
nthu Rotate to Attend: Convolutional Triplet Attention Module
Diganta Misra*, Trikay Nalamada*, Ajay Uppili Arasanipalai*, Qibin Hou

WACV, 2021
project / paper / supplementary / video / abstract / bibtex

Benefiting from the capability of building interdependencies among channels or spatial locations, attention mechanisms have been extensively studied and broadly used in a variety of computer vision tasks recently. In this paper, we investigate light-weight but effective attention mechanisms and present triplet attention, a novel method for computing attention weights by capturing cross-dimension interaction using a three-branch structure. For an input tensor, triplet attention builds inter-dimensional dependencies by the rotation operation followed by residual transformations and encodes inter-channel and spatial information with negligible computational overhead. Our method is simple as well as efficient and can be easily plugged into classic backbone networks as an add-on module. We demonstrate the effectiveness of our method on various challenging tasks including image classification on ImageNet-1k and object detection on MSCOCO and PASCAL VOC datasets. Furthermore, we provide extensive insight into the performance of triplet attention by visually inspecting the GradCAM and GradCAM++ results. The empirical evaluation of our method supports our intuition on the importance of capturing dependencies across dimensions when computing attention weights.

title={Rotate to attend: Convolutional triplet attention module},
author={Misra, Diganta and Nalamada, Trikay and Arasanipalai, Ajay Uppili and Hou, Qibin},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
GitHub Repo starsGitHub forks
nthu APP: Anytime Progressive Pruning   New!
Diganta Misra*, Bharat Runwal*, Tianlong Chen, Zhangyang Wang, Irina Rish

Preprint, 2022
project / paper / webpage / abstract / bibtex

With the latest advances in deep learning, there has been a lot of focus on the online learning paradigm due to its relevance in practical settings. Although many methods have been investigated for optimal learning settings in scenarios where the data stream is continuous over time, sparse networks training in such settings have often been overlooked. In this paper, we explore the problem of training a neural network with a target sparsity in a particular case of online learning: the anytime learning at macroscale paradigm (ALMA). We propose a novel way of progressive pruning, referred to as \textit{Anytime Progressive Pruning} (APP); the proposed approach significantly outperforms the baseline dense and Anytime OSP models across multiple architectures and datasets under short, moderate, and long-sequence training. Our method, for example, shows an improvement in accuracy of $\approx 7\%$ and a reduction in the generalization gap by $\approx 22\%$, while being $\approx 1/3$ rd the size of the dense baseline model in few-shot restricted imagenet training. We further observe interesting nonmonotonic transitions in the generalization gap in the high number of megabatches-based ALMA. The code and experiment dashboards can be accessed at \url{} and \url{}, respectively.

title={APP: Anytime Progressive Pruning},
author={Diganta Misra and Bharat Runwal and Tianlong Chen and Zhangyang Wang and Irina Rish},
NSL presentation / MLC Research Jam #8

GitHub Repo stars
nthu Genetic Algorithm Optimized Inkjet Printed Electromagnetic Absorber on Paper Substrate
Diganta Misra, Rahul Pelluri, Vijay Kumar Verma, Bhargav Appasani, Nisha Gupta

paper / abstract / bibtex

Printable electronics based electromagnetic absorbers are receiving increasing attention of the electromagnetic community because of their unprecedented advantages. This paper presents the design of printable electromagnetic absorbers for the X band. The design of the absorber is optimized using the Genetic Algorithm (GA) to enhance the absorptivity and the absorption bandwidth. The design involves the placement of several square-shaped conductive ink at optimal locations on the paper substrate such that desired absorption characteristics are obtained. Simulations are carried out using the HFSS simulation software. The optimized structure offers an absorptivity of more than 90% in the X band thereby proving to be a viable solution for stealth applications.

title={Genetic Algorithm Optimized Inkjet Printed Electromagnetic Absorber on Paper Substrate},
author={Misra, Diganta and Pelluri, Rahul and Verma, Vijay Kumar and Appasani, Bhargav and Gupta, Nisha},
booktitle={2018 International Conference on Applied Electromagnetics, Signal Processing and Communication (AESPC)},
Research under progress/ under review
Ticket2Dense Hypothesis
Diganta Misra, Bharat Runwal, Gintare Karolina Dziugaite

Shapely Tickets
Diganta Misra, Bharat Runwal, Naomi Saphra, Annabelle Carrell

Robust Generalisation(task variance)>Robust Generalisation(config variance)
Diganta Misra, Bharat Runwal, Boris Knyazev, Marwa El Halabi, Yan Zhang

PSNR: Progressive Scaled Noisy Reconstruction
Richeek Das, Bharat Runwal, Diganta Misra

Active Anytime Learning at Macroscale
Diganta Misra, Bharat Runwal, Timothée Lesort, Lucas Caccia

Predicting classifier accuracy from Kolmogrov Smirnov statistics of minimal exit depth
Weijian Deng, Bharat Runwal, Diganta Misra, Liang Zheng

Learning Curves for Continual Learning: Trade-off in generalization and forgetting in replay methods
Diganta Misra, Oleksiy Ostapenko, Timothée Lesort, Irina Rish

Reprogrammers are robust learners
Pin Yu Chen, Huck C.-H. Yang, Diganta Misra, Bharat Runwal, Irina Rish

Progressive Pruning and Growing in sequential learning
Bharat Runwal, Richeek Das, Diganta Misra

HT-RMT insights of sparse models
Diganta Misra

Curriculum Learning Arithmetic Compositionally
Andrei Mircea, Diganta Misra

Watermark degradation in continual learning frameworks
Diganta Misra, Bharat Runwal

Pointwise Flips: Flipping under adversarial training
Bharat Runwal, Diganta Misra

Open Source Frameworks & Projects Sponsor
nthu Avalanche: an End-to-End Library for Continual Learning
Dec'20 - Present

I am an active lead maintainer of the Reproducible Continual Learning framework by Avalanche and also actively work on the evaluation framework of Avalanche mainly in the direction of integration of Weights & Biases API.

GitHub Repo starsGitHub forks

nthu Echo
Jun'19 - Present

Echo is an OSS deep learning package with support for TensorFlow, PyTorch and MegEngine, containing novel validated methods, components and building blocks used in deep learning.

GitHub Repo starsGitHub forks

nthu Evonorm

Created the most popular open source reimplementation of Evolving Normalization-Activation Layers by Liu. et. al.

GitHub Repo starsGitHub forks

eca ECANets

Reproduced the CVPR 2020 paper: ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks for the ML Reproducibility Challenge 2020. Integrated with Weights & Biases.

GitHub Repo stars

nthu Big Bench

Our fine grained tense modification task was accepted to Google's Big Bench for testing large LMs. In collaboration with Mukund Varma T.

GitHub Repo starsGitHub forks


Masters in Machine LearningSeptember 2021 - Present
Montréal Institute of Learning Algorithms (MILA)
Advisor: Associate Professor Irina Rish
Montréal, Canada


Masters of Science in Computer Science (MSc CS)September 2021 - Present
University of Montréal
Advisor: Associate Professor Irina Rish
Montréal, Canada


Bachelors of Technology (B.Tech) in EEEJun. 2016 - May. 2020
Kalinga Institute of Industrial Technology (KIIT)
Advisor: Asst. Prof. Dr. Bhargav Appasani
Bhubaneswar, India

   Internships and Exchange Programs

Data Science InternJun. 2018 - Feb. 2019

During this internship, I was involved in building the analytical pipeline, data collection, pre-processing of data, cleaning of data, Geo-spatial Analysis of data and Document writing for the project on understanding demographics of Venture Capital and Early Seed Investments. As a part of a team of three, I was advised and mentored by Dr. Sukant Khurana.



Summer InternMay. 2018 - Jun. 2018

Studied basic algorithmic techniques using functional programming languages - Lisp and Prolog under the guidance of Assc. Prof. Pawan Kumar.

Kharagpur, India


Summer Exchange InternJun. 2017 - Aug. 2017
Bangkok University

Served as a primary instructor for cultural engagements along with teaching basic english and computer science to primary grade students at RangsonWittaya School, Nakhon Sawan under the AIESEC SDG #4 programme. Was also part of culture exchange, entrepreneurship and social service programs at Bangkok University

Bangkok, Thailand

Initiatives and Academic Services
NeuroMatch Academy

I am responsible for developing the content for the Strategies section in the Continual Learning lecture of the Deep Learning Cohort of Neuromatch Academy 2021.
W&B ML Reproducibility Challenge

I am the lead organizer of the W&B MLRC 2021 where I actively support our challenge participants. Our mission of organizing this challenge is to make machine learning research reproducible, transparent and accessible to everyone. This initiative is also supported by our W&B MLRC Grant of $500 for each participant.
INF8225: Probabilistic Learning

I am a teaching assistant for the INF8225: Probabilistic Learning at Polytechnique University taught by Christopher J. Pal for the Winter 2022 semester.
Deep Learning Theory Reading Group, MILA

I am an organizer of the DL Theory Reading Group at MILA, Montreal.
MILA 2022 Entrepreneurs Cohort Program

I was selected as one of the entrepreneurs in residence and pitched my startup idea called 9CLEF (Elevator Pitch).
Served as a Program Committee member for:

Conference on Lifelong Learning Agents(CoLLA) 2022
(Complete list available upon request)
Quebec Ministry of Higher Education International Students Scholarship

I was awarded the DIRO x Quebec Ministry of Higher Education international students scholarship worth CAD$4,000 for the academic year 2022.
UNIQUE AI Excellence Scholarship

I was awarded the UNIQUE AI Excellence Scholarship worth CAD$10,000 for the academic year 2022. Under this scholarship, I will be working with Irina Rish and Pouya Bashivan on dynamic sparsity based research.
MILA Entrepreneurs Grant

I was awarded the MILA Entrepreneurs Grant worth CAD$5,000 to pursue my startup venture 9CLEF (Elevator Pitch) and build an early prototype.
AMII AI Week Travel Bursary

I was awarded the AMII AI Week 2022 Student Travel Bursary worth CAD$1,500.

Updated on: 12th May, 2022 Merci, Jon Barron!