Anandharaju Durai Raju

(Anand)

Ph.D. Candidate @ Simon Fraser University, Vancouver, BC, Canada

A deep learning optimization enthusiast and Sustainable AI advocate.
An impactful applied researcher enabling AI innovations for real-world challenges.




Email:    anandharaju@ieee.org
                 aduraira@sfu.ca
Phone:    (+1)-604-518-3116

Education

Simon Fraser University

Doctor of Philosophy, Computing Science

Jan 2019 - Jan 2025 (expected)

Anna University

Bachelor of Engineering, Computer Science

Aug 2007 - Apr 2011

Skills

Languages
  • Python 2.x-3.x
  • Java 8.x
  • J2EE (JSP)
  • JavaScript
  • Apache Spark
  • Bash Scripting
  • REST API
  • HTML
  • CSS
Parallelization
  • Python (multiprocessing)
  • Dask (Pandas)
  • MPI with C++
Containers & Cloud
  • Docker
  • AWS
Databases
  • Postgres
  • HBase
  • Oracle
  • DB2
  • MySQL
Operating Systems
  • Linux
  • Windows
Libraries & Frameworks
  • TensorFlow
  • Keras
  • PyTorch
  • HuggingFace
Tools/Packages
  • Postman
  • Logstash
  • Kafka
  • Grafana
  • Kibana
  • SLURM
  • Jupyter
IDE
  • PyCharm
  • Jupyter Notebook
  • Google Colab
  • Eclipse
  • IntelliJ
  • Visual Studio Code
  • Toad for Oracle
  • DbVisualizer for IBM DB2
  • PuTTY
  • MobaXterm
AI & ML Platforms
  • GCP
  • AWS
  • Microsoft Azure
  • Infosys NIA
  • IBM Watson
RPA Platforms
  • Blue Prism
  • UiPath
  • Automation Anywhere
  • EdgeVerve AssistEdge
Source Control & Management
  • Git
  • TFS
  • SVN
  • Jira
  • Confluence

Publications

Low Carbon Footprint Training for 1D-CNNs with Temporal Max-Pooling  [Oct '24]


Highlights:

Applied “Sustainable AI” principles to the malware detection problem in cybersecurity by addressing the time and space constraints of training CNNs over extremely long input sequences (2^28, or about 268 million, timesteps). Used 22 times less GPU memory, half the training time, and up to 7 times lower carbon footprint than existing approaches, all without sacrificing model performance.

LockBoost: Detecting Malware Binaries by Locking False Alarms.  [Jul '22]


Highlights:

Improves CNN performance by locking the false positive rate (FPR) and boosting the true positive rate (TPR) with a novel boosting method called 'LockBoost'. Surpassed state-of-the-art (SOTA) performance by 2-9% on public malware datasets.

A survey on cross-architectural IoT malware threat hunting  [Jun '21]


Highlights:

Cross-Architectural IoT Malware Detection and Classification.

Echelon: Two-Tier Malware Detection for Raw Executables to Reduce False Alarms  [Jan '21]


Highlights:

Improving malware detection for stringent false positive requirements.

FAT ALBERT: Finding Answers in Large Texts using Semantic Similarity Attention Layer based on BERT  [Aug '19]


Highlights:

This paper was developed as part of the Machine Learning course taught by Prof. Greg Mori at SFU.

Research Experience

Research Assistant

Data Mining Lab, SFU

  • I work with Prof. Ke Wang, and my current research focuses on data mining and cybersecurity.
  • Worked on a two-tiered architecture, built with the TensorFlow (Keras) library, to improve the malware detection performance of existing state-of-the-art convolutional neural network (CNN) models.
  • Devised and implemented three variants of the two-tiered architecture in Python, evaluated them on two datasets, and achieved significant performance improvements.
  • Completed a directed study on stealthy lateral-movement attacks, which pose a critical threat of compromising network-wide systems.

  • Applied “Sustainable AI” principles to the malware detection problem in cybersecurity by addressing the time and space constraints of training CNNs over extremely long sequences.
  • Used 22 times less GPU memory, half the training time, and up to 7 times lower carbon footprint than existing approaches, all without sacrificing model performance.
  • Surpassed SOTA performance by 2-9% on public malware datasets using a novel boosting method, 'LockBoost'.
  • Enabled training of hybrid architectures such as CNN-Transformers with limited GPU resources.
  • Served as teaching assistant three times for SFU's data mining course, assisting 200+ students.

Jan 2019 - Present

    Researcher, Intern

    Data Privacy and Protection Technology Lab, Huawei Canada

    • Built and delivered a compact, top-performing neural network model based on handcrafted LIEF features, which ranked 1st on the team leaderboard. The model achieved the highest malware detection rate (97%) at the lowest misclassification rate, with an affordable memory footprint.
    • Wrote a research paper on advanced assembly learning for cross-architectural IoT malware threat hunting (submitted to the IEEE Symposium on Security & Privacy).
    • Published the first-ever survey paper on cross-architectural IoT malware threat hunting (available in IEEE Xplore).
    • Applied dynamically informed static analysis for advanced assembly learning to improve opcode-based detection of cross-architectural IoT ELF malware.
    • Compressed ML models using teacher-student knowledge distillation methods and performed model optimization.
    • Explored novel learning methods using 3D-CNNs and grouped convolution for learning different feature types.
    • Deployed a CNN-based deep learning malware detection solution as a Docker release.

    Jan 2021 - Dec 2021

Professional Experience

    Technology Lead

    Artificial Intelligence and Automation Services, Infosys Limited

    • Served as Feature Team Lead (onsite and offshore), driving development of a data provisioning platform for order visibility.
    • Managed a development team of 14 across the UI, API and Spark modules.
    • Planned Agile sprints, including resource allocation and execution.
    • Developed Spark Streaming based modules for processing real-time event data.
    • Managed Postgres databases for the dev and test environments.

    Jan 2018 - Dec 2018

    Technology Analyst

    NIA Expert Services and Solutions, Infosys Limited

    • Developed solutions for a petroleum client to automate gas quality and weather monitoring activities using Infosys’ Nia and RPA platforms along with open source technologies.
    • Gathered knowledge of existing organizational data from people, processes and legacy software systems to identify potential areas for automation.
    • Analyzed and designed solutions in accordance with the client landscape, and estimated manual effort and cost for RPA automation solutions. Implemented proofs of concept for the proposed solutions.
    • Engaged clients and account teams to drive requirements discussions, and managed a team of 3.
    • Provided web-based inventory management of various product assortments for a confectionery brand.
    • Played a major role in designing and implementing application code changes and enhancements raised as change requests. These efforts earned the “AWARD OF EXCELLENCE” recognition twice, as well as appreciation from senior management and the client.
    • Performed root cause analysis and effort estimation for defect/bug fixes.
    • Framed test cases and performed unit testing, integration testing and regression testing.
    • Contributed to knowledge management activities at the account level.
    • Performed database management and performance tuning activities to improve UI responsiveness.

    Nov 2013 - Dec 2017

    Senior Systems Engineer

    Retail, Commerce and Logistics, Infosys Limited

    • Performed application development, maintenance and 24x7 support for an e-commerce B2B solution, and participated in solution engineering for placing different types of orders, such as standard, seasonal and returns.
    • Served as configuration controller at the project level.
    • Developed SQL scripts and procedures for generating reports in PPMS.
    • Tuned the PPMS application to improve its performance.
    • Contributed to weekly and bi-weekly knowledge management mailers at the account level.
    • Provided B2B and B2C sales maintenance and support for the Webshop application’s SAP backend.

    Jul 2012 - Oct 2013

    Systems Engineer

    Telecommunications and Networking, Infosys Limited

    • Worked on Infovista, a network performance management tool that provides a centralized and comprehensive view of performance data (voice, internet) across a network infrastructure.
    • Played a major role in upgrading the Infovista VFK kit on all production servers, and provided support for centralized monitoring of voice/internet data performance.
    • Prepared SQL scripts for module-level testing, rollout and functionality testing support, and carried out configuration management activities for the VFK upgrade.

    Sep 2011 - Jun 2012

Achievements

    • Won the "AWARD OF EXCELLENCE" twice, in 2015 and 2016, awarded by Infosys Limited at the unit level for adding business value to my projects through commitment, hard work and competency.
    • Awarded the "GOLD MEDAL" at the state level by Anna University for my undergraduate studies in 2011.
    • Served as "STUDENT CHAIRMAN" of the Computer Science department at Sona College of Technology during 2010 - 2011.
    • Participated in the "YOUNG STUDENT SCIENTIST PROGRAM", an initiative directly overseen by the then President of India, Dr. A.P.J. Abdul Kalam, for three consecutive years during my schooling.
    • Certified in ROBOTIC PROCESS AUTOMATION TOOLS: Automation Anywhere Advanced RPA Professional.
    • Certified in "IBM RATIONAL APPLICATION DEVELOPER" for WebSphere 6.0 in 2011.
    • Completed training in “IBM Watson V3 application development” and in “Perform cloud data science with Azure machine learning” in 2017.

Certifications

    Natural Language Processing Specialization

    Issuing Organization: Coursera

    Learned both the classical machine learning skills and the state-of-the-art deep learning techniques needed to build NLP systems. Equipped to design applications that perform question-answering and sentiment analysis, create tools to translate languages, and summarize text!

    Deep Learning Specialization

    Issuing Organization: Coursera

    Build and train deep neural networks, identify key architecture parameters, implement vectorized neural networks and apply deep learning to applications. Train test sets, analyze variance for DL applications, use standard techniques and optimization algorithms, and build neural networks in TensorFlow. Build a CNN and apply it to detection and recognition tasks, use neural style transfer to generate art, and apply algorithms to image and video data. Build and train RNNs, work with NLP and word embeddings, and use HuggingFace tokenizers and transformer models to perform NER and question answering.

    Machine Learning with TensorFlow on Google Cloud

    Issuing Organization: Coursera

    Frame a business use case as a machine learning problem. Gain a broad perspective of machine learning and where it can be used. Convert a candidate use case to be driven by machine learning. Recognize biases that machine learning can amplify.

Projects

    FAT ALBERT: Finding Answers in Large Texts using Semantic Similarity Attention Layer based on BERT

    The project performs multiple-choice question answering using BERT (a state-of-the-art transformer network). It addresses BERT's limited ability to handle large text corpora by using a semantic similarity model to extract the most influential sentences. Our approach outperformed the leading models and ranked first on the MovieQA challenge leaderboard with a test accuracy of 87.79%.

    Learning to Rank Article Popularity

    Implemented models from the learning-to-rank field of data mining to predict the most shared articles in a week using a provided dataset.

Articles

    Article 1 

    April 2020

    A peek inside boosting classification performance by locking false alarms.