IEEE 2020 ICCE-TW

Smart Technologies for
Consumer Electronics - AI, IoT and More


SEP. 28-30, 2020
South Garden Hotels and Resorts
Taoyuan, Taiwan


KEYNOTE SPEECHES


  • Jenq-Neng Hwang
  • Yoshikazu Miyanaga
  • Kohei Shiomoto


Jenq-Neng Hwang

Professor, Associate Chair
Department of Electrical and Computer Engineering
University of Washington, USA
IEEE Fellow

When 5G Meets with Big IoT Data for Coordinated Mining of 3D World

Thanks to the ultra-reliable low-latency communication (URLLC) capability of emerging 5G mobile networks, the information derived from roadside static surveillance or on-board moving IoT sensors (e.g., video cameras, Radars and Lidars) can be jointly explored by mobile edge computing (MEC) and shared in real time with all locally connected users for various smart city applications. To achieve this goal of coordinated mining of different modalities of IoT data, all of the detected/segmented and tracked human/vehicle objects need to be localized in 3D world coordinates for effective 3D understanding of local dynamic evolutions. In this talk, I will mainly discuss some challenges and potential solutions. More specifically, a robust method for tracking and 3D localization of detected objects, from either static or moving monocular video cameras, is proposed. It is based on a variant of the Cascade R-CNN detector trained with triplet loss to obtain, in one shot, the accurate localization of each detected object and the corresponding discriminative identity-aware features for tracking association, even under long-term occlusion. When the cameras fail to reliably achieve these tasks due to poor lighting or adverse weather conditions, Radars and Lidars can offer more robust localization than monocular cameras. However, the semantic information provided by the radio or point cloud data is limited and difficult to extract. In this talk, I will also introduce a radio object detection network (RODNet) that detects objects purely from radio signals captured by Radar. It is based on an innovative cross-modal supervision framework, which utilizes the rich information extracted from the camera to teach object detection for Radar without tedious and laborious human labelling of ground truth on the Radar signals.
Moreover, to compensate for the disadvantage of Lidar detection on far-away small objects, Lidar-based detections are effectively integrated with 2D object detections and 3D localization from monocular images, based on 3D tracking associations, to achieve superior tracking and 3D localization performance. Finally, an efficient 3D human pose estimation method for action description of detected humans in natural monocular videos is also presented for finer-grained 3D scene understanding in smart city applications.
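
As a rough illustration of the triplet loss mentioned in the abstract above, the following minimal Python sketch computes the standard margin-based triplet loss on small embedding vectors. The function names, the Euclidean distance, and the margin value are assumptions made for illustration; the abstract does not specify how the detector in the talk formulates or trains this loss.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor and positive are embeddings of the same identity; negative is an
    embedding of a different identity. The margin value is an assumption.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Easy triplet: the positive is already much closer than the negative.
easy = triplet_loss([0.0, 0.0], [0.0, 0.1], [1.0, 0.0])
# Hard triplet: the negative is closer to the anchor than the positive.
hard = triplet_loss([0.0, 0.0], [1.0, 0.0], [0.0, 0.1])
print(easy, hard)
```

The loss is zero once the same-identity pair is closer than the different-identity pair by at least the margin, which is what makes the learned features useful for re-associating an object after occlusion.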

Biography:

Dr. Jenq-Neng Hwang received the BS and MS degrees, both in electrical engineering, from the National Taiwan University, Taipei, Taiwan, in 1981 and 1983, respectively. He then received his Ph.D. degree from the University of Southern California. In the summer of 1989, Dr. Hwang joined the Department of Electrical and Computer Engineering (ECE) of the University of Washington in Seattle, where he has been a Full Professor since 1999. He served as the Associate Chair for Research from 2003 to 2005 and from 2011 to 2015. He is currently the Associate Chair for Global Affairs and International Development in the ECE Department. He is the founder and co-director of the Information Processing Lab., which has won awards at the CVPR AI City Challenges in consecutive recent years. He has written more than 360 journal papers, conference papers and book chapters in the areas of machine learning, multimedia signal processing, and multimedia system integration and networking, including an authored textbook, "Multimedia Networking: from Theory to Practice," published by Cambridge University Press. Dr. Hwang has a close working relationship with industry on multimedia signal processing and multimedia networking.

Dr. Hwang received the 1995 IEEE Signal Processing Society's Best Journal Paper Award. He is a founding member of the Multimedia Signal Processing Technical Committee of the IEEE Signal Processing Society and was the Society's representative to the IEEE Neural Network Council from 1996 to 2000. He is currently a member of the Multimedia Technical Committee (MMTC) of the IEEE Communications Society and also a member of the Multimedia Signal Processing Technical Committee (MMSP TC) of the IEEE Signal Processing Society. He has served as an associate editor for IEEE T-SP, T-NN, T-CSVT, T-IP and Signal Processing Magazine (SPM). He is currently on the editorial boards of the ZTE Communications, ETRI, IJDMB and JSPS journals. He served as Program Co-Chair of IEEE ICME 2016, ISCAS 2009 and ICASSP 1998, and as General Co-Chair of IEEE MMSP 2019. Dr. Hwang has been an IEEE Fellow since 2001.


Yoshikazu Miyanaga

Prof. Dr.-Eng.
Graduate School of Information Science and Technology
Hokkaido University, Japan
IEICE FM’09, IEEE SM’03, APSIPA M’09

Psychoacoustic Masking Effect for Robust Speech Communication Robot


Abstract:

This talk introduces the design of a noise-robust speech recognition system suitable for speech communication robots, including AI-Robots. Almost all communication robots require strongly noise-robust speech recognition. For both continuous speech dialog-based and command-based automatic speech recognition (ASR) systems, we can design ASR systems that are strongly robust against various noise conditions.

In this presentation, advanced speech analysis techniques that exploit psychoacoustic masking effects are introduced. To improve robustness under low SNR, Dynamic Range Adjustment (DRA) and Modulation Spectrum Control (MSC) have been developed for robust speech features; both adjust the speech features so as to emphasize the important speech components. DRA normalizes dynamic ranges and MSC eliminates the noise corruption of speech feature parameters.
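
The abstract states only that DRA normalizes dynamic ranges. As a hedged sketch of what such a normalization could look like, the Python example below scales each feature dimension by its peak absolute value over an utterance, so that every dimension spans at most [-1, 1]. The per-dimension peak scaling, the function name `dynamic_range_adjust`, and the toy feature values are assumptions for illustration, not the speaker's published procedure.

```python
def dynamic_range_adjust(frames):
    """Scale each feature dimension by its peak absolute value over all frames.

    frames: list of equal-length feature vectors, one per time frame.
    Assumption: "normalizing the dynamic range" is taken here to mean dividing
    each dimension by its maximum absolute amplitude within the utterance.
    """
    if not frames:
        return []
    dims = len(frames[0])
    peaks = [max(abs(frame[d]) for frame in frames) for d in range(dims)]
    return [
        [frame[d] / peaks[d] if peaks[d] > 0 else 0.0 for d in range(dims)]
        for frame in frames
    ]


# Two frames of a hypothetical two-dimensional cepstral feature.
adjusted = dynamic_range_adjust([[2.0, -1.0], [-4.0, 0.5]])
print(adjusted)
```

Under this reading, features from clean and noisy recordings of the same utterance are brought to a comparable amplitude range before matching.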

In addition to DRA and MSC, the use of psychoacoustic masking effects for speech feature extraction in automatic speech recognition (ASR) is also introduced in this presentation. It is based on the human auditory system. Mel-frequency cepstral coefficients (MFCC) are the most widely used speech features in ASR systems; however, one of their main drawbacks is sensitivity to background noise, which can degrade recognition results. This presentation introduces noise-robust speech features that improve upon MFCC. A psychoacoustic model-based feature extraction, which simulates the perception of sound in the human auditory system, is investigated and integrated into the MFCC. The presentation shows that this new approach is useful for noise-robust speech recognition embedded in AI-Robots.

Biography:

Yoshikazu Miyanaga received the B.S., M.S., and Dr. Eng. degrees from Hokkaido University, Sapporo, Japan, in 1979, 1981, and 1986, respectively. Since 1983 he has been with Hokkaido University. He is now a Professor in the Division of Information Communication Systems in the Graduate School of Information Science and Technology, Hokkaido University. He was the dean of the Graduate School of Information Science and Technology (2014-2018). He is also the director of the Global Station for Big-Data and Cybersecurity (GSB), GI-CoRE, Hokkaido University (2018-present). From 1984 to 1985, he was a visiting researcher at the Department of Computer Science, University of Illinois, USA. His research interests are in the areas of speech signal processing, wireless communication signal processing and low-power consumption VLSI system design.

Dr. Miyanaga served as an associate editor of IEICE Transactions on Fundamentals of Electronics, Communications and Computer Science from 1996 to 1999, and as an editor of special issues of IEICE Transactions on Fundamentals. He was a delegate of the IEICE Engineering Sciences Society Steering Committee (IEICE ESS Officers) from 2004 to 2006. He was chair of the Technical Group on Smart Info-Media System, IEICE (IEICE TG-SIS) during the same period and is now a member of the advisory committee of IEICE TG-SIS. He was Vice-President of the IEICE Engineering Science (ES) Society (2010-2011), President-elect (2014) and President (2015). He was the Editor-in-Chief, IEICE ESS (2016-2018). He is now an auditor of IEICE. He is a Fellow of IEICE.

He served as a member of the board of directors of the IEEE Japan Council, as chair of the student activity committee, from 2002 to 2004. He was secretary of the IEEE Circuits and Systems Society Technical Committee on Digital Signal Processing (IEEE CASS DSP TC) (2004-2006) and was its chair (2006-2008). He was a distinguished lecturer (DL) of the IEEE CAS Society (2010-2011) and a member of the Board of Governors (BoG) of the IEEE CAS Society (2011-2013). He was an associate editor of the IEEE CAS Society's Transactions on Circuits and Systems II (2012-2014). He was Chair of the IEEE Sapporo Section (2017-2018).

He served as chair of the international steering committees of IEEE ISPACS (2005-2007) and IEEE ISCIT (2006-2011). He was also an international steering committee member of IEEE ICME, IEEE/EURASIP NSIP, IEICE SISA and others. He was an honorary chair and general chair/co-chair of international symposiums and workshops, including ISCIT 2005, NSIP 2005, ISCIT 2006, SISB 2007, ISPACS 2008, ISMAC 2009, ISMAC 2010, APSIPA ASC 2009, IEICE ITC-CSCC 2011, APSIPA ASC 2011, ISMAC 2011, ISPACS 2011, ISCIT 2012, ISMAC 2013, ISCIT 2013, ISMAC 2015, ISCIT 2015, ISMAC 2016, ISCIT 2016, ISMAC 2017, ISCIT 2017, ISCIT 2018, ISCAS 2019, ICCE-Asia 2019, ISMAC 2019 and ISCIT 2019.


Kohei Shiomoto

Professor, Department of Information and Engineering,
Tokyo City University, Japan
IEICE Fellow

Semi-supervised Learning and Few-shot Learning in Data-Driven Management of Computer Networks

We have witnessed remarkable advances in machine learning over the last two decades. Machine learning has been applied to process various kinds of data accumulated in hyper-scale data centers and has been successfully applied to tasks such as image recognition, voice recognition, and natural language processing. Recently, many network researchers have begun to apply machine learning to network management tasks such as traffic analysis and control, failure detection and root cause analysis, and security management. Existing supervised machine learning algorithms require a large number of label-annotated data samples for training the model. To minimize the labor-intensive and time-consuming task of dataset annotation, a data-efficient learning algorithm or technique is required to build a classifier model. We should also note that anomalies rarely occur in practice, so the anomaly classes are usually sparse in the dataset. As such, it is extremely important for operators to deal with unbalanced datasets in which a few classes have only a handful of data instances while others have many. In this talk, we discuss the issues in modern computer network management. We then introduce machine learning, including recent topics such as few-shot learning and semi-supervised learning. Few-shot learning is defined as a type of machine learning problem in which only a limited number of samples with supervised information for the target are available. Existing supervised machine learning algorithms need plenty of training data, while humans can recognize new object classes from very few samples. The goal of few-shot learning is to classify new data having seen only a few training samples. In semi-supervised learning, we use a small amount of label-annotated data to build an initial classifier and then improve the classifier using a large amount of unlabeled data.
We show how few-shot learning and semi-supervised learning can be applied to network management tasks by illustrating two use cases: network intrusion detection systems and LTE eNodeB KPI analysis.
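
The semi-supervised idea described in the abstract above (build an initial classifier from a few labeled samples, then improve it with unlabeled data) can be sketched with self-training, which is one common semi-supervised scheme. The nearest-centroid classifier, the confidence margin, the function names, and the "normal"/"attack" labels below are all assumptions for illustration; the abstract does not say which algorithm or data the talk uses.

```python
import math


def centroids(points, labels):
    """Mean feature vector per class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, x in enumerate(p):
            acc[i] += x
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(cents, p):
    """Return (label, margin): nearest centroid and the distance gap to the runner-up."""
    dists = sorted((math.dist(c, p), y) for y, c in cents.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def self_train(labeled, labels, unlabeled, min_margin=1.0, rounds=5):
    """Self-training: pseudo-label confident unlabeled points, then refit."""
    points, ys = list(labeled), list(labels)
    pool = list(unlabeled)
    for _ in range(rounds):
        cents = centroids(points, ys)
        remaining = []
        for p in pool:
            y, margin = predict(cents, p)
            if margin >= min_margin:
                points.append(p)  # confident: adopt the pseudo-label
                ys.append(y)
            else:
                remaining.append(p)  # ambiguous: leave unlabeled
        if len(remaining) == len(pool):
            break  # nothing new was labeled, so stop early
        pool = remaining
    return centroids(points, ys)


# One labeled sample per class plus four unlabeled samples (toy values).
model = self_train(
    [[0.0, 0.0], [10.0, 10.0]],
    ["normal", "attack"],
    [[1.0, 1.0], [9.0, 9.0], [2.0, 0.0], [5.0, 5.0]],
)
print(predict(model, [0.5, 0.5])[0], predict(model, [8.0, 9.0])[0])
```

Only points that are clearly closer to one class than the other receive pseudo-labels, which limits the risk of reinforcing early mistakes; the ambiguous point midway between the classes is never added to the training set.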

Biography:

Kohei Shiomoto is a Professor at Tokyo City University, Tokyo, Japan. He has published 70+ journal papers and 170+ reviewed international conference papers in the areas of network systems, routers, and network management for the Internet, mobile, and cloud computing. He has published 6 RFCs in the IETF. His current interests include Network Virtualization, Data-Mining for Network Management, and Traffic & QoE Management. He was engaged in R&D on computer networks for over 28 years at NTT laboratories, Japan. He produced many technologies for Internet, mobile, and cloud innovation, and carried out research projects that were successfully introduced into NTT Group’s production systems. He has been a full professor at Tokyo City University since 2017, where he is engaged in research and education in computer communications. He has served as associate editor for IEEE Transactions on Network and Service Management (TNSM) and the IEICE Trans. Commun., as series/guest editor for the IEEE Communications Magazine (ComMag), and as guest editor for IEEE TNSM. He has also served as TPC Co-Chair for various prestigious conferences, including IEEE NetSoft and Globecom/ICC symposia. He is a Fellow of IEICE, a Senior Member of IEEE, and a Member of ACM.