Sohan Chowdhury

Workplace: Department of Computer Science, American International University-Bangladesh, Dhaka, Bangladesh

Research Interests: Computer Vision, Computational Learning Theory, Computer systems and computational processes

Biography

Sohan Chowdhury is an undergraduate (UG) student of Computer Science and Software Engineering in the Faculty of Science and Information Technology at American International University-Bangladesh. His research interests focus mainly on, but are not limited to, deep learning, computer vision, and image context extraction. He has won first and second place in multiple programming contests and hackathons, worked with successful startups, and contributed to government projects. He is currently working as an Associate Software Engineer at Telenor Health AS.

Author Articles
Category Specific Prediction Modules for Visual Relation Recognition

By Sohan Chowdhury, Tanbirul Hashan, Afif Abdur Rahman, A. F. M. Saifuddin Saif

DOI: https://doi.org/10.5815/ijmsc.2019.02.02, Pub. Date: 8 Apr. 2019

Classifying the objects in an image does not give a complete understanding of the information it contains. Visual relation information such as “person playing with dog” conveys substantially more than just “person, dog”, and the visual inter-relations of the objects offer real insight into the complete picture. Because such combinations are complex, conventional computer vision techniques have shown little promise on this problem. Monolithic approaches lack precision and accuracy because of the vast number of possible relation combinations. Solving this problem is crucial to the development of advanced computer vision applications that affect every sector of the modern world. We propose a model that combines recent advances in Convolutional Neural Networks (deep learning) with a divide-and-conquer approach to relation detection. The possible relations are broken down into categories such as spatial (left, right), vehicle-related (riding, driving), etc. The task is then divided into three steps: segmenting the objects, estimating the likely relation category, and performing recognition with a module built specifically for that category. Each module can be trained on a significantly smaller dataset with less computation. Additionally, this approach provides recall rates comparable to state-of-the-art research while remaining precise and accurate for the specific relation categories.
