Anandharaju Durai Raju

Ph.D. Candidate @ Simon Fraser University
Vancouver, BC, Canada

An impact-driven AI/ML Applied Research Engineer solving real-world challenges with AI innovations.
I am passionate about classic (CNN, Recurrent) and advanced (multi-modal LLM, xLSTM) deep learning and their optimization.
I have 7+ years of previous professional experience as a backend lead in Retail and Telecom domains.

Email:    anandharaju@ieee.org
                 aduraira@sfu.ca
Phone:    (+1)-604-518-3116

Education

Simon Fraser University

Doctor of Philosophy, Computing Science

2019 Jan - 2025 (ETA)

Anna University

Bachelor of Engineering, Computer Science

2007 Aug - 2011 Apr

Skills

Languages

Python 2.x-3.x
Java 8.x
J2EE (JSP, Spring Boot)
JavaScript
Apache Spark
Bash Scripting
REST API
HTML
CSS

Parallelization

Python (multiprocessing)
Dask (Pandas)
MPI with C++
HF Accelerate

AI Agents

smolAgents
LangChain
LangGraph
LlamaIndex
Azure AI
AWS Bedrock

Containers & Cloud

Docker
AWS

Databases

Postgres
HBase
Oracle
DB2
MySQL

Operating Systems

Linux
Windows

Libraries & Frameworks

TensorFlow
Keras
PyTorch
HuggingFace
Unsloth AI

Tools/Packages

Ollama
SGLang
vLLM
Jinja
NLTK
spaCy
Postman
Logstash
Kafka
Grafana
Kibana
SLURM
Jupyter

IDE

PyCharm
Jupyter Notebook
Google Colab
Eclipse
IntelliJ
Visual Studio Code
Toad for Oracle
DbVisualizer for IBM DB2
PuTTY
MobaXterm

AI & ML Platforms

GCP
AWS
Microsoft Azure
Infosys NIA
IBM Watson

RPA Platforms

Blue Prism
UiPath
Automation Anywhere
EdgeVerve AssistEdge

Source Control & Management

Git
TFS
SVN
Jira
Confluence

Publications

Low Carbon Footprint Training for 1D-CNNs with Temporal Max-Pooling [Oct '24]

Highlights:

Applied “Sustainable AI” to malware detection in cybersecurity by addressing the time and space constraints of training CNNs over extremely long input sequences (2^28, or about 268 million, timesteps). Achieved 22 times lower GPU memory use, half the training time, and up to 7 times lower carbon footprint than existing approaches, all without sacrificing model performance.

CNN
GPU
Large Sequences
Sustainability

LockBoost: Detecting Malware Binaries by Locking False Alarms. [Jul '22]

Highlights:

Improved CNN performance by locking the false positive rate (FPR) and boosting the true positive rate (TPR) with a novel boosting method called 'LockBoost'. Surpassed SOTA performance by 2-9% on public malware datasets.

CNN
Large Sequences
TPR-FPR Tradeoff
Sequential
Tabular

A survey on cross-architectural IoT malware threat hunting [Jun '21]

Highlights:

Cross-Architectural IoT Malware Detection and Classification.

IoT
Malware

Echelon: Two-Tier Malware Detection for Raw Executables to Reduce False Alarms [Jan '21]

Highlights:

Improving malware detection for stringent false positive requirements.

Windows
Malware

On building machine learning pipelines for Android malware detection: a procedural survey of practices, challenges and opportunities. [Aug '22]

Highlights:

Survey of machine learning for Android malware detection.

Android
ML

FAT ALBERT: Finding Answers in Large Texts using Semantic Similarity Attention Layer based on BERT [Aug '19]

Highlights:

This paper was developed as part of the Machine Learning course taught by Prof. Greg Mori at SFU.

Machine Learning
Google BERT

Research Experience

Research Assistant

Data Mining Lab, SFU

I am working with Prof. Ke Wang, and my current research focus is on optimizing deep learning / LLM training and inference.

[ Current Research ]

Optimizing Transformers to accelerate LLM inference over long sequences

Low GPU learning of Transformers and xLSTMs on unlimited sequences with CNN extractors

[ Past Research ]

Reduced GPU memory (22x), time (50%), and carbon footprint (7x) "without performance loss" in training malware classification CNNs on ultra-long sequences (>250M timesteps), achieved via a novel retroactive pruning and custom backpropagation [ Published in ACM CIKM 2024 ]

Surpassed state-of-the-art performance by 2-9% TPR @ 0.1% FPR using a novel boosting method designed for learning sequential representations [ Published in IJCNN 2022 ]

Expertise in optimizing LLM/DL GPU usage via gradient checkpointing, offloading, quantization and LoRA/QLoRA

Applied “Sustainable AI” to malware detection in cybersecurity by addressing the time and space constraints of training CNNs over extremely long sequences.

Worked on a two-tiered architecture, built with the TensorFlow (Keras) library, to improve the malware detection performance of existing state-of-the-art convolutional neural network (CNN) models.

Devised and implemented three variants of the two-tiered architecture in Python, evaluated them on two datasets, and achieved significant performance improvements.

[ Academic ]

Built and pre-trained GPT and Llama models from scratch on FineWeb data

Fine-tuned multi-modal LLM for zero-shot audio classification and speech/visual QA

Topped leaderboard on MovieQA task by improving BERT via semantic sentence similarity-based input pruning

Fine-tuned TimeGPT, achieving 5x better multi-time-series electricity demand forecasting than LightGBM

Trained credit card fraud detection models (XGBoost, LightGBM, Variational AutoEncoder) with 97.6% accuracy

Completed a directed study on stealthy lateral-movement attacks, which pose a critical threat of compromising network-wide systems.

Served as a teaching assistant three times for the SFU data mining course, assisting 200+ students.

Jan 2019 - Present

Researcher, Intern

Data Privacy and Protection Technology Lab, Huawei Canada

Built and delivered a compact, top-performing neural network model based on handcrafted LIEF features, which ranked 1st on the leaderboard among team members. The delivered model achieved the highest malware detection rate (97%) at the lowest misclassification rate, all with an affordable memory footprint.
Wrote a research paper on advanced assembly learning for cross-architectural IoT malware threat hunting (submitted to the IEEE Symposium on Security & Privacy).
Published the first-ever survey paper on cross-architectural IoT malware threat hunting (published in IEEE Xplore).
Dynamically informed static analysis for advanced assembly learning to improve Opcode-based malware detection for hunting cross-architectural IoT ELF malware.
Model compression using Knowledge Distillation techniques.
Explored novel learning methods via 3D-CNNs and Grouped convolution for learning different feature types.
Deployed a CNN-based deep learning malware detection solution as a Docker release.
Worked on ML model compression using teacher-student distillation methods and performed optimization.

Jan 2021 - Dec 2021

Professional Experience

Technology Lead

Artificial Intelligence and Automation Services, Infosys Limited

Served as Feature Team Lead (onsite + offshore), driving development of a data provisioning platform for order visibility.
Managed a development team of 14 across the UI, API, and Spark modules.
Planned Agile sprints, including resource allocation and execution.
Developed Spark Streaming-based modules for processing real-time event data.
Handled Postgres database management for the dev and test regions.
Played a major role in designing and implementing application code changes and enhancements delivered as change requests. These efforts earned me the “AWARD OF EXCELLENCE” twice, as well as appreciation from senior management and the client.

Jan 2018 - Dec 2018

Technology Analyst

NIA Expert Services and Solutions, Infosys Limited

Developed solutions for a petroleum client to automate gas quality and weather monitoring activities using Infosys’ Nia and RPA platforms along with open source technologies.
Gathered knowledge of existing organizational data from people, processes, and legacy software systems to identify potential areas for automation.
Analyzed and designed solutions in accordance with the client landscape, and estimated manual effort and cost for an RPA solution. Implemented proofs of concept for the proposed automation solutions.
Engaged clients and account teams to drive requirements discussions, and managed a team of 3.
Provided web-based inventory management of various product assortments for a confectionery brand.
Performed root cause analysis as well as estimation of efforts for defect/bug fixes.
Framed test cases and performed unit testing, integration testing and regression testing.
Participated in knowledge management activities at the account level.
Performed database management and performance tuning to improve UI responsiveness.

Nov 2013 - Dec 2017

Senior Systems Engineer

Retail, Commerce and Logistics, Infosys Limited

Performed application development, maintenance, and 24x7 support for an e-commerce B2B solution, and participated in solution engineering for placing different types of orders such as standard, seasonal, and returns.
Served as configuration controller at the project level.
Developed SQL scripts and procedures for generating reports in PPMS.
Tuned the PPMS application to improve its performance.
Contributed to weekly and bi-weekly knowledge management mailers at the account level.
Provided B2B and B2C sales maintenance and support for the Webshop application’s SAP backend.

Jul 2012 - Oct 2013

Systems Engineer

Telecommunications and Networking, Infosys Limited

Worked on Infovista, a network performance management tool that provides a centralized and comprehensive view of performance data (voice, internet) across a network infrastructure.
Played a major role in upgrading the Infovista VFK kit on all production servers, and supported centralized monitoring of voice/internet data performance.
Prepared SQL scripts for module-level testing, rollout, and functionality testing support, and carried out configuration management activities for the VFK upgrade.

Sep 2011 - Jun 2012

Achievements

Won the "AWARD OF EXCELLENCE" twice, in 2016 and 2017, awarded by Infosys Limited at the unit level for adding business value to my projects through commitment, hard work, and competency.
Earned a "GOLD MEDAL" at the state level from Anna University for my undergraduate studies in 2011.
Served as "STUDENT CHAIRMAN" of the Computer Science department at Sona College of Technology during 2010 - 2011.
Participated in the "YOUNG STUDENT SCIENTIST PROGRAM", an initiative directly overseen by the then President of India, Dr. A.P.J. Abdul Kalam, for three consecutive years during my schooling.
Certified in ROBOTIC PROCESS AUTOMATION TOOLS: Automation Anywhere Advanced RPA Professional.
Certified in "IBM RATIONAL APPLICATION DEVELOPER" for WebSphere 6.0 in 2011.
Completed training in “IBM Watson V3 application development” and in “Perform cloud data science with Azure machine learning” during 2017.

Certifications

Natural Language Processing Specialization

Issuing Organization: Coursera

Learned both the classical machine learning skills and the state-of-the-art deep learning techniques needed to build NLP systems. Equipped to design applications that perform question-answering and sentiment analysis, create tools to translate languages, and summarize text!

Question Answering
Sentiment Analysis
Auto Correct
Auto Complete
Parts of speech tagging
Text Summarization
Sentence Completion

Deep Learning Specialization

Issuing Organization: Coursera

Build and train deep neural networks, identify key architecture parameters, implement vectorized neural networks, and apply deep learning to applications. Use train/test sets, analyze variance for DL applications, use standard techniques and optimization algorithms, and build neural networks in TensorFlow. Build a CNN and apply it to detection and recognition tasks, use neural style transfer to generate art, and apply algorithms to image and video data. Build and train RNNs, work with NLP and word embeddings, and use HuggingFace tokenizers and transformer models to perform NER and question answering.

Deep Learning
Neural Networks
Convolutional Neural Networks
Sequence Models

Machine Learning with Tensorflow on Google Cloud

Issuing Organization: Coursera

Frame a business use case as a machine learning problem. Gain a broad perspective of machine learning and where it can be used. Convert a candidate use case to be driven by machine learning. Recognize biases that machine learning can amplify.

Machine Learning

Projects

FAT ALBERT: Finding Answers in Large Texts using Semantic Similarity Attention Layer based on BERT

The goal is to perform multiple-choice question answering using BERT (a state-of-the-art transformer network). This is achieved by improving BERT's ability to handle large text corpora: a semantic similarity model extracts the most influential sentences as input. Our approach outperformed the leading models and ranked first on the MovieQA challenge leaderboard with a test accuracy of 87.79%.

BERT
PyTorch
CMPT726 Course

Learning to Rank Article Popularity

Implemented models from the Learning to Rank field of data mining to predict the most shared articles in a week using a provided data set.

Neural Networks
Keras
CMPT741 Course

Articles

Article 1

April 2020

A peek inside boosting classification performance by locking false alarms.

LockBoost
Medium.com