
of multiple modalities of data and crossmodal AI techniques in understanding the surrounding world. We also introduce our multimodal and crossmodal AI framework for smart data analytics. Moreover, we present three instances of this framework to tackle air pollution, traffic incident query, and congestion prediction problems. For each instance, we discuss the motivation and the hypothesis by which the general framework is adjusted to accommodate different multimodal datasets and crossmodal AI techniques. In the future, we will continue to extend the framework and develop accurate AI models to address further challenges. We also plan to investigate how to make the framework work in mobile environments (e.g., IoT devices) and distributed networks (e.g., federated learning).

Minh-Son DAO (ダオ ミン ソン)
Senior Researcher, Big Data Integration Center, Universal Communication Research Institute
Ph.D. in Information Technology

Smart Data Analytics with Multimodal and Crossmodal AI
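The "one general framework, three task instances" idea summarized above can be sketched in a few lines: modality-specific encoders feed a shared fusion step, and only the task head is swapped between instances (AQI estimation, incident query, congestion prediction). This is a minimal illustrative sketch with hypothetical names, not the actual framework implementation; real instances would use learned, attention-based crossmodal fusion rather than the simple averaging shown here.

```python
# Hypothetical sketch: a crossmodal pipeline whose encoders and task head
# are swapped per framework instance, while the fusion step is shared.
from typing import Callable, Dict, List

Vector = List[float]

class CrossmodalPipeline:
    def __init__(self,
                 encoders: Dict[str, Callable[[object], Vector]],
                 head: Callable[[Vector], object]):
        self.encoders = encoders  # one encoder per modality (image, sensor, text, ...)
        self.head = head          # task-specific head, swapped per instance

    def fuse(self, features: List[Vector]) -> Vector:
        # Late fusion by element-wise averaging (placeholder for
        # attention-based crossmodal fusion).
        n = len(features)
        return [sum(col) / n for col in zip(*features)]

    def predict(self, sample: Dict[str, object]) -> object:
        feats = [enc(sample[m]) for m, enc in self.encoders.items()]
        return self.head(self.fuse(feats))

# Toy "air quality" instance: a sensor reading plus an image-derived proxy.
aqi = CrossmodalPipeline(
    encoders={
        "sensor": lambda pm25: [float(pm25) / 100.0],
        "image":  lambda visibility: [1.0 - float(visibility)],
    },
    head=lambda v: "polluted" if v[0] > 0.5 else "clean",
)
print(aqi.predict({"sensor": 150, "image": 0.8}))  # prints "polluted"
```

A congestion-prediction instance would keep `fuse` unchanged and supply different encoders (e.g., for trajectory and weather data) and a regression head.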
