A Data Scientist's Industry Perspective on Productizing AI/ML Models
Abstract
The transition from AI/ML models to production-ready AI-based systems is a challenge for both data scientists and software developers. In this study, we present the findings of a workshop held at a consulting firm to learn how practitioners perceive this transition. Starting from the need to make AI experiments repeatable, the key themes that emerged were the use of the Jupyter Notebook as the primary prototyping tool and the lack of support for software engineering best practices as well as for functionality specific to data science.