Arinjoy Basak


PhD Student, Dept. of Computer Science

Virginia Tech

Contact Me

About Me

I am a first-year PhD student in the Department of Computer Science at Virginia Tech. I am broadly interested in Data Mining, Big Data Analytics, Machine Learning, and Artificial Intelligence. I am currently working with Dr. Francisco Servant on Software Analytics.

Prior to joining CS@VT, I worked on dimensionality reduction algorithms for data mining, educational big data analytics, and cryptography. More details about my research work can be found here.

In my spare time, I like to pursue photography and music, and to catch up on my ever-increasing reading list. No need to go too far: you can find what I'm listening to right now by clicking here.

Education

Bachelor of Engineering, Computer Science and Technology

IIEST, Shibpur (formerly BESU, Shibpur) (August 2012 - May 2016)

Primary and Secondary School

St. Xavier's Collegiate School, Kolkata (2000-2012)

Teaching

Intro to Python (CS 1064)

Graduate Teaching Assistant, Fall 2016


Learning

Probability and Distribution (STAT 5104)

Dr. Leanna House, Dept. of Statistics, Virginia Tech, Fall 2016

Data Analytics 1 (CS 5525)

Dr. Chandan Reddy, Dept. of Computer Science, Virginia Tech, Fall 2016

Research Experiences

PhD Student - CS@VT - Virginia Tech (August 2016 - present)

Working with Dr. Francisco Servant on Software Analytics at SEALAB.

Bachelor's Degree Project Work - IIEST Shibpur, Howrah, India (August 2015 - May 2016)

I worked with Dr. Asit Kr. Das on developing graph-based feature selection algorithms for my final-year project. This work resulted in a conference paper accepted at the IEEE-IEMCON 2016 conference in Vancouver, Canada.

Ekalavya 2015 Summer Intern - IIT Bombay, India (May 2015 - July 2015) [link]

My main work focused on developing data analytics components for IITBombayX Insights. I developed an analytics module for Insights that identifies the regions of lecture videos that students find difficult to grasp. The work was supported by the MHRD (Ministry of Human Resource Development) National Mission on Education through ICT (Information and Communication Technology) undertaken by the institute, and all R&D contributions made by the students were released as open source.

During the first part of the internship, I also worked on the software specification of the blended MOOCs system IITBombayX, focusing on creating use cases for the MIS systems of IITBombayX.

Summer Research Internship - Indian Statistical Institute, Kolkata, India (May 2014 - July 2014)

I worked under Prof. Sushmita Ruj, in the group of Dr. Bimal Kr. Roy, Director of ISI and Head of the Cryptology group, on Unattended Wireless Sensor Networks. I worked on developing a non-cryptographic technique for achieving data survivability and confidentiality in wireless sensor networks. Our paper on this work was accepted at the IEEE International Conference on Advanced Information Networking and Applications (AINA 2015), held in Gwangju, Korea, from 24 to 27 March 2015, and for the IEEE CPS proceedings.

Project Work - Indian Institute of Engineering Science and Technology, Shibpur, Howrah, India (August 2013 - May 2014)

I worked on feature selection algorithms in data mining with Dr. Asit Kr. Das and Dr. Saptarshi Ghosh, Department of Computer Science and Technology, IIEST Shibpur. My work was divided into two parts: a study of Rough Set theory with an implementation of the Quick Reduct extraction algorithm, and the development of a graph-based algorithm for dynamically extracting the most relevant features from a dataset.

My Current Curriculum Vitae

Recent Publications and Reports:





A Graph Based Feature Selection Algorithm Utilizing Attribute Intercorrelation

Arinjoy Basak, Asit Kr. Das

The 7th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEEE - IEMCON 2016), University of British Columbia, Vancouver, Canada, October 2016


Recently, every enterprise generates large volumes of data on a regular basis. Complex data mining and analysis techniques are used to feasibly analyse high dimensional data. Feature selection aids in this by providing a reduced representation of this data while maintaining integrity. We propose a graph-based feature selection algorithm utilizing feature intercorrelation to construct a weighted attribute graph, from which attributes are iteratively removed to construct the reduct based on a scoring scheme. Disconnectivity of the graph serves as the point of termination for our algorithm. The performance of our algorithm on real valued and discretized datasets is evaluated statistically by generating the Receiver Operator Characteristic (ROC) curve for each reduced dataset, and by measuring accuracies for classification training tasks, for the datasets reduced by our method.

Check out the paper here





[Report Excerpt] Data Analytics for IITBombayX (Based on OpenEdx InSights) - Detection of difficulty regions in lecture videos of students

Arinjoy Basak

Ekalavya 2015, Department of Computer Science, IIT Bombay

This part of the project report covers the development of models for detecting difficulty regions in videos based on students' behaviour, as recorded through log events in IITBombayX. Such inferences can be made for a video in a particular course from the students' activity on the lecture video, and then reported to the course instructors or coordinators, who can take appropriate steps to address the problem. We provide a basic outline of our idea for dealing with the problem, and then describe how we designed the data model and the functionality, at both the processing and visualization levels, for a final implementation of the analytics module in a Big Data setting. After final implementation and testing, the module was integrated with the Open edX Insights analytics system for IITBombayX and made available to the course instructors participating in the Blended MOOCs model. The work for this part was completed in 3 weeks of the internship.

Check out the excerpt of my work here

Check out the FULL report here





Achieving Data Survivability and Confidentiality in Unattended Wireless Sensor Networks

Arpan Sen, Shrestha Ghosh, Arinjoy Basak, Harsh Parshuram Puria, Sushmita Ruj

The 29th IEEE International Conference on Advanced Information Networking and Applications (AINA-2015), Gwangju, Korea, March 2015


In Unattended Wireless Sensor Networks (UWSNs) the nodes are subjected to a hostile environment for sensing critical data. Due to the unattended nature of the network the sink is not always present. Hence, the nodes in the network are required to function in a distributed way in order to ensure Data Survivability and Data Confidentiality. In this work we address these two issues. We have proposed algorithm(s) to ensure Data Survivability by encryption and data replication. We propose a simple scheme for key management which ensures confidentiality by sharing the key among various nodes in the network so that the adversary cannot read the data by compromising a node in the network. We have compared our scheme with the existing ones, both mathematically and by simulations. Analysis shows that our scheme performs better in terms of overheads and efficiency.

Check out the paper here

My Google Scholar profile

Here's How to Contact Me

  • Location: GTA Office Hours, Room 128, McBryde Hall, Virginia Tech

  • Email: email id