
I am a Research MSc student at MILA, affiliated with the University of Montréal, supervised by Associate Professor Irina Rish. I am also the Founder and President of Landskape AI, a non-profit research organization devoted to theoretical and analytical deep learning. Previously, I served as a Machine Learning Engineer at Weights & Biases on the Frameworks and Integrations Team. I work broadly on theoretical and analytical deep learning, with a focus on (but not limited to) the following domains: Anytime Learning, Active Learning, Adversarial Robustness, and Sparsity. Presently, I am a Visiting Research Scholar working on sparsity at VITA, UT-Austin, under Dr. Zhangyang Wang. In the past, I have been fortunate to work with Dr. Amrita Chaturvedi of the Indian Institute of Technology, Varanasi (IIT-BHU) on biomedical data analysis, and with Vijay Kumar Verma of the Indian Space Research Organisation (ISRO) on Genetic Algorithms.


Research Experience
- Researcher (Apr. 2022 - Present), Morgan Stanley. Supervisor: Sahil Garg. Research Area: Continual Learning, Time Series, High-dimensional Model-based Clustering.
- Visiting Research Scholar (Aug. 2021 - Present), VITA, University of Texas at Austin. Supervisor: Dr. Zhangyang Wang. Research Area: Sparsity, Robustness and Knowledge Distillation.
- Research Associate (Feb. 2020 - Present), Laboratory of Space Research (LSR), University of Hong Kong. Supervisor: Dr. Quentin A. Parker. Research Area: Computer Vision applications in PNe exploration.
- Research Intern (Jun. 2018 - Aug. 2018), NVIDIA AI Lab, Bennett University. Supervisors: Dr. Deepak Garg and Dr. Suneet Gupta. Research Area: Large-Scale Visual Recognition.
- Founder, President and Researcher (Sept. 2019 - Present), Landskape AI. Mentors: Assoc. Prof. Jaegul Choo, Javier Ideami and Federico Lois. Research Area: Analytical Deep Learning Theory.
- Machine Learning Engineer (Dec. 2020 - Oct. 2021), Weights & Biases. Team: Frameworks and Integrations.
- Technical Content Developer (Jun. 2020 - Jan. 2021), Paperspace Blog. Topic Area: Computer Vision (Attention Mechanisms).
Publications (* indicates equal contribution)

Mish: A Self Regularized Non-Monotonic Neural Activation Function
Diganta Misra
BMVC, 2020
project / paper / abstract / bibtex
We propose Mish, a novel self-regularized non-monotonic activation function defined mathematically as $f(x) = x\tanh(\mathrm{softplus}(x))$. As activation functions play a crucial role in the performance and training dynamics of neural networks, we validate Mish experimentally on several well-known benchmarks against the best combinations of architectures and activation functions. We also observe that data augmentation techniques have a favorable effect on benchmarks like ImageNet-1k and MS-COCO across multiple architectures. For example, Mish outperforms Leaky ReLU on YOLOv4 with a CSP-DarkNet-53 backbone by $2.1\%$ average precision ($AP^{val}_{50}$) on MS-COCO object detection, and ReLU on ResNet-50 by $\approx 1\%$ Top-1 accuracy on ImageNet-1k, while keeping all other network parameters and hyperparameters constant. Furthermore, we explore the mathematical formulation of Mish in relation to the Swish family of functions and propose an intuitive understanding of how its first-derivative behavior may act as a regularizer that helps the optimization of deep neural networks.
@article{misra2019mish, title={Mish: A self regularized non-monotonic neural activation function}, author={Misra, Diganta}, journal={arXiv preprint arXiv:1908.08681}, volume={4}, pages={2}, year={2019}, publisher={CoRR}}
CV Talk Episode / ML Cafe Episode / Sicara Talk / W&B Salon Episode
For those who are curious, the name Mish was coined by my girlfriend.
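The formula above is simple enough to sketch in a few lines of NumPy. This is a minimal illustration of the definition, not the reference implementation; the numerically stable softplus variant is my own choice:

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)) = max(x, 0) + log1p(exp(-|x|)),
    # avoiding overflow for large positive x.
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

def mish(x):
    # Mish(x) = x * tanh(softplus(x)): smooth, non-monotonic,
    # bounded below and unbounded above.
    return x * np.tanh(softplus(x))

x = np.linspace(-5.0, 5.0, 11)
print(mish(x))  # Mish(0) == 0; Mish(x) -> x for large positive x
```

Note the small negative dip for moderately negative inputs (the non-monotonic part), which the abstract's first-derivative discussion refers to.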
Rotate to Attend: Convolutional Triplet Attention Module
Diganta Misra*, Trikay Nalamada*, Ajay Uppili Arasanipalai*, Qibin Hou
WACV, 2021
project / paper / supplementary / video / abstract / bibtex
Benefiting from their capability to build interdependencies among channels or spatial locations, attention mechanisms have recently been extensively studied and broadly used in a variety of computer vision tasks. In this paper, we investigate lightweight but effective attention mechanisms and present triplet attention, a novel method for computing attention weights by capturing cross-dimension interaction using a three-branch structure. For an input tensor, triplet attention builds inter-dimensional dependencies via a rotation operation followed by residual transformations, and encodes inter-channel and spatial information with negligible computational overhead. Our method is simple as well as efficient and can easily be plugged into classic backbone networks as an add-on module. We demonstrate the effectiveness of our method on various challenging tasks, including image classification on ImageNet-1k and object detection on the MS-COCO and PASCAL VOC datasets. Furthermore, we provide extensive insight into the performance of triplet attention by visually inspecting the GradCAM and GradCAM++ results. The empirical evaluation of our method supports our intuition on the importance of capturing dependencies across dimensions when computing attention weights.
@inproceedings{misra2021rotate, title={Rotate to attend: Convolutional triplet attention module}, author={Misra, Diganta and Nalamada, Trikay and Arasanipalai, Ajay Uppili and Hou, Qibin}, booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision}, pages={3139--3148}, year={2021}}

APP: Anytime Progressive Pruning   New!
Diganta Misra*, Bharat Runwal*, Tianlong Chen, Zhangyang Wang, Irina Rish
Preprint, 2022
project / paper / webpage / abstract / bibtex
With the latest advances in deep learning, there has been much focus on the online learning paradigm due to its relevance in practical settings. Although many methods have been investigated for optimal learning settings in scenarios where the data stream is continuous over time, sparse network training in such settings has often been overlooked. In this paper, we explore the problem of training a neural network with a target sparsity in a particular case of online learning: the anytime learning at macroscale paradigm (ALMA). We propose a novel way of progressive pruning, referred to as Anytime Progressive Pruning (APP); the proposed approach significantly outperforms the baseline dense and Anytime OSP models across multiple architectures and datasets under short, moderate, and long-sequence training. For example, our method shows an improvement in accuracy of $\approx 7\%$ and a reduction in the generalization gap of $\approx 22\%$, while being roughly one-third the size of the dense baseline model, in few-shot restricted-ImageNet training. We further observe interesting non-monotonic transitions in the generalization gap in ALMA with a high number of megabatches. The code and experiment dashboards can be accessed at https://github.com/landskape-ai/Progressive-Pruning and https://wandb.ai/landskape/APP, respectively.
@misc{misra2022app, title={APP: Anytime Progressive Pruning}, author={Diganta Misra and Bharat Runwal and Tianlong Chen and Zhangyang Wang and Irina Rish}, year={2022}, eprint={2204.01640}, archivePrefix={arXiv}, primaryClass={cs.LG}}
NSL presentation / MLC Research Jam #8

Genetic Algorithm Optimized Inkjet Printed Electromagnetic Absorber on Paper Substrate
Diganta Misra, Rahul Pelluri, Vijay Kumar Verma, Bhargav Appasani, Nisha Gupta
IEEE AESPC, 2018
paper / abstract / bibtex
Printable-electronics-based electromagnetic absorbers are receiving increasing attention from the electromagnetic community because of their unprecedented advantages. This paper presents the design of printable electromagnetic absorbers for the X band. The design of the absorber is optimized using a Genetic Algorithm (GA) to enhance the absorptivity and the absorption bandwidth. The design involves the placement of several square-shaped conductive ink patches at optimal locations on the paper substrate such that the desired absorption characteristics are obtained. Simulations are carried out using the HFSS simulation software. The optimized structure offers an absorptivity of more than 90% in the X band, thereby proving to be a viable solution for stealth applications.
@inproceedings{misra2018genetic, title={Genetic Algorithm Optimized Inkjet Printed Electromagnetic Absorber on Paper Substrate}, author={Misra, Diganta and Pelluri, Rahul and Verma, Vijay Kumar and Appasani, Bhargav and Gupta, Nisha}, booktitle={2018 International Conference on Applied Electromagnetics, Signal Processing and Communication (AESPC)}, volume={1}, pages={1--3}, year={2018}, organization={IEEE}}
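The GA-driven placement search described in the absorber paper can be sketched as a toy genetic algorithm: a genome is a binary grid marking which cells of the substrate receive an ink patch, and selection, crossover, and mutation evolve the layout. In the paper, candidates are scored with HFSS field simulations; the placeholder fitness below (rewarding patches on the grid diagonal) and all names and sizes here are purely illustrative assumptions:

```python
import random

random.seed(0)

GRID = 8          # hypothetical 8x8 placement grid (not from the paper)
POP, GENS = 30, 40

def fitness(genome):
    # Placeholder objective standing in for an HFSS absorptivity score:
    # count ink patches that lie on the grid diagonal (indices 0, 9, 18, ...).
    return sum(bit for i, bit in enumerate(genome) if bit and i % (GRID + 1) == 0)

def crossover(a, b):
    # Single-point crossover of two parent genomes.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.02):
    # Flip each bit independently with a small probability.
    return [1 - bit if random.random() < rate else bit for bit in genome]

pop = [[random.randint(0, 1) for _ in range(GRID * GRID)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    elite = pop[: POP // 2]          # truncation selection keeps the top half
    children = [mutate(crossover(random.choice(elite), random.choice(elite)))
                for _ in range(POP - len(elite))]
    pop = elite + children

best = max(pop, key=fitness)
print(fitness(best))  # at most GRID (= 8), i.e. the full diagonal covered
```

Swapping `fitness` for a call into an electromagnetic solver recovers the structure of the paper's optimization loop; elitism ensures the best layout found so far is never lost between generations.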
Research in progress / under review
- Ticket2Dense Hypothesis. Diganta Misra, Bharat Runwal, Gintare Karolina Dziugaite
- Shapely Tickets. Diganta Misra, Bharat Runwal, Naomi Saphra, Annabelle Carrell
- Robust Generalisation (task variance) > Robust Generalisation (config variance). Diganta Misra, Bharat Runwal, Boris Knyazev, Marwa El Halabi, Yan Zhang
- PSNR: Progressive Scaled Noisy Reconstruction. Richeek Das, Bharat Runwal, Diganta Misra
- Active Anytime Learning at Macroscale. Diganta Misra, Bharat Runwal, Timothée Lesort, Lucas Caccia
- Predicting classifier accuracy from Kolmogorov-Smirnov statistics of minimal exit depth. Weijian Deng, Bharat Runwal, Diganta Misra, Liang Zheng
- Learning Curves for Continual Learning: Trade-off in generalization and forgetting in replay methods. Diganta Misra, Oleksiy Ostapenko, Timothée Lesort, Irina Rish
- Reprogrammers are robust learners. Pin Yu Chen, Huck C.-H. Yang, Diganta Misra, Bharat Runwal, Irina Rish
- Progressive Pruning and Growing in sequential learning. Bharat Runwal, Richeek Das, Diganta Misra
- HT-RMT insights of sparse models. Diganta Misra
- Curriculum Learning Arithmetic Compositionally. Andrei Mircea, Diganta Misra
- Watermark degradation in continual learning frameworks. Diganta Misra, Bharat Runwal
- Pointwise Flips: Flipping under adversarial training. Bharat Runwal, Diganta Misra
Open Source Frameworks & Projects
- Avalanche: an End-to-End Library for Continual Learning (Dec '20 - Present): I am an active lead maintainer of Avalanche's Reproducible Continual Learning framework and also work on its evaluation framework, mainly on integrating the Weights & Biases API.
- Echo (Jun '19 - Present): Echo is an OSS deep learning package with support for TensorFlow, PyTorch and MegEngine, containing novel validated methods, components and building blocks used in deep learning.
- Evonorm (Apr '20): Created the most popular open-source reimplementation of Evolving Normalization-Activation Layers by Liu et al.
- ECANets (Jan '21): Reproduced the CVPR 2020 paper ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks for the ML Reproducibility Challenge 2020. Integrated with Weights & Biases.
- Big Bench (Aug '21): Our fine-grained tense-modification task was accepted to Google's Big Bench for testing large LMs. In collaboration with Mukund Varma T.

Education
- Master's in Machine Learning (September 2021 - Present), Montréal Institute of Learning Algorithms (MILA). Advisor: Associate Professor Irina Rish. Montréal, Canada.
- Master of Science in Computer Science (MSc CS) (September 2021 - Present), University of Montréal. Advisor: Associate Professor Irina Rish. Montréal, Canada.
- Bachelor of Technology (B.Tech) in EEE (Jun. 2016 - May 2020), Kalinga Institute of Industrial Technology (KIIT). Advisor: Asst. Prof. Dr. Bhargav Appasani. Bhubaneswar, India.
Internships and Exchange Programs
- Data Science Intern (Jun. 2018 - Feb. 2019), CSIR-CDRI: During this internship, I helped build the analytical pipeline and carried out data collection, pre-processing, cleaning, geo-spatial analysis, and document writing for a project on understanding the demographics of venture capital and early seed investments. As part of a team of three, I was advised and mentored by Dr. Sukant Khurana. Remote.
- Summer Intern (May 2018 - Jun. 2018), IIT-Kharagpur: Studied basic algorithmic techniques using the functional programming languages Lisp and Prolog under the guidance of Assoc. Prof. Pawan Kumar. Kharagpur, India.
- Summer Exchange Intern (Jun. 2017 - Aug. 2017), Bangkok University: Served as a primary instructor for cultural engagements, teaching basic English and computer science to primary-grade students at RangsonWittaya School, Nakhon Sawan, under the AIESEC SDG #4 programme. Also took part in culture exchange, entrepreneurship, and social service programs at Bangkok University. Bangkok, Thailand.
- NeuroMatch Academy: I am responsible for developing the content for the Strategies section of the Continual Learning lecture in the Deep Learning cohort of Neuromatch Academy 2021.
- W&B ML Reproducibility Challenge: I am the lead organizer of the W&B MLRC 2021, where I actively support our challenge participants. Our mission in organizing this challenge is to make machine learning research reproducible, transparent, and accessible to everyone. This initiative is also supported by our W&B MLRC Grant of $500 for each participant.
- INF8225: Probabilistic Learning: I am a teaching assistant for INF8225: Probabilistic Learning at Polytechnique Montréal, taught by Christopher J. Pal, for the Winter 2022 semester.
- Deep Learning Theory Reading Group, MILA: I am an organizer of the DL Theory Reading Group at MILA, Montréal.
- MILA 2022 Entrepreneurs Cohort Program: I was selected as one of the entrepreneurs in residence and pitched my startup idea, 9CLEF (Elevator Pitch).
- Program Committee member, Conference on Lifelong Learning Agents (CoLLA) 2022.

Achievements (Complete list available upon request)
- Quebec Ministry of Higher Education International Students Scholarship 2022: Awarded the DIRO x Quebec Ministry of Higher Education international students scholarship, worth CAD $4,000, for the academic year 2022.
- UNIQUE AI Excellence Scholarship 2022: Awarded the UNIQUE AI Excellence Scholarship, worth CAD $10,000, for the academic year 2022. Under this scholarship, I will be working with Irina Rish and Pouya Bashivan on dynamic-sparsity research.
- MILA Entrepreneurs Grant 2022: Awarded the MILA Entrepreneurs Grant, worth CAD $5,000, to pursue my startup venture 9CLEF (Elevator Pitch) and build an early prototype.
- AMII AI Week Travel Bursary 2022: Awarded the AMII AI Week 2022 Student Travel Bursary, worth CAD $1,500.