Publications
2024
Zhengyuan Xie, Haiquan Lu, Jia-wen Xiao, Enguang Wang, Le Zhang, Xialei Liu
Early Preparation Pays Off: New Classifier Pre-tuning for Class Incremental Semantic Segmentation Conference
ECCV, 2024.
@conference{xie2024early,
title = {Early Preparation Pays Off: New Classifier Pre-tuning for Class Incremental Semantic Segmentation},
author = {Zhengyuan Xie and Haiquan Lu and Jia-wen Xiao and Enguang Wang and Le Zhang and Xialei Liu},
url = { https://github.com/zhengyuan-xie/ECCV24_NeST },
year = {2024},
date = {2024-11-13},
urldate = {2024-11-13},
booktitle = {ECCV},
abstract = {Class incremental semantic segmentation aims to preserve old knowledge while learning new tasks; however, it is impeded by catastrophic forgetting and background shift issues. Prior works indicate the pivotal importance of initializing new classifiers and mainly focus on transferring knowledge from the background classifier or preparing classifiers for future classes, neglecting the flexibility and variance of new classifiers. In this paper, we propose a new classifier pre-tuning (NeST) method applied before the formal training process, learning a transformation from old classifiers to generate new classifiers for initialization rather than directly tuning the parameters of new classifiers. Our method can make new classifiers align with the backbone and adapt to the new data, preventing drastic changes in the feature extractor when learning new classes. Besides, we design a strategy considering the cross-task class similarity to initialize matrices used in the transformation, helping achieve the stability-plasticity trade-off. Experiments on Pascal VOC 2012 and ADE20K datasets show that the proposed strategy can significantly improve the performance of previous methods. The code is available at https://github.com/zhengyuan-xie/ECCV24_NeST.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
Cheng Gong, Yao Chen, Qiuyang Luo, Ye Lu, Tao Li, Yuzhi Zhang, Yufei Sun, Le Zhang
Deep Feature Surgery: Towards Accurate and Efficient Multi-Exit Networks Conference
ECCV, 2024.
@conference{gong2024deep,
title = {Deep Feature Surgery: Towards Accurate and Efficient Multi-Exit Networks},
author = {Cheng Gong and Yao Chen and Qiuyang Luo and Ye Lu and Tao Li and Yuzhi Zhang and Yufei Sun and Le Zhang},
year = {2024},
date = {2024-11-13},
urldate = {2024-11-13},
booktitle = {ECCV},
abstract = {Multi-exit network is a promising architecture for efficient model inference by sharing backbone networks and weights among multiple exits. However, the gradient conflict of the shared weights results in sub-optimal accuracy. This paper introduces Deep Feature Surgery (DFS), which consists of feature partitioning and feature referencing approaches to resolve gradient conflict issues during the training of multi-exit networks. The feature partitioning separates shared features along the depth axis among all exits to alleviate gradient conflict while simultaneously promoting joint optimization for each exit. Subsequently, feature referencing enhances multi-scale features for distinct exits across varying depths to improve the model accuracy. Furthermore, DFS reduces the training operations with the reduced complexity of backpropagation. Experimental results on CIFAR-100 and ImageNet datasets exhibit that DFS provides up to a 50.00% reduction in training time and attains up to a 6.94% enhancement in accuracy when contrasted with baseline methods across diverse models and tasks. Budgeted batch classification evaluation on MSDNet demonstrates that DFS uses fewer average FLOPs per image to achieve the same classification accuracy as baseline methods on CIFAR-100.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
Bing Li, Wei Cui, Le Zhang, Qi Yang, Min Wu, Joey Tianyi Zhou
Democratizing Federated WiFi-based Human Activity Recognition Using Hypothesis Transfer Journal Article
In: IEEE Transactions on Mobile Computing, 2024.
@article{li2024democratizing,
title = {Democratizing Federated WiFi-based Human Activity Recognition Using Hypothesis Transfer},
author = {Bing Li and Wei Cui and Le Zhang and Qi Yang and Min Wu and Joey Tianyi Zhou},
year = {2024},
date = {2024-11-12},
urldate = {2024-11-12},
journal = {IEEE Transactions on Mobile Computing},
abstract = {Human activity recognition (HAR) is a crucial task in IoT systems with applications ranging from surveillance and intruder detection to home automation and more. Recently, non-invasive HAR utilizing WiFi signals has gained considerable attention due to advancements in ubiquitous WiFi technologies. However, recent studies have revealed significant privacy risks associated with WiFi signals, raising concerns about bio-information leakage. To address these concerns, the decentralized paradigm, particularly federated learning (FL), has emerged as a promising approach for training HAR models while preserving data privacy. Nevertheless, FL models may struggle in end-user environments due to substantial domain discrepancies between the source training data and the target end-user environment. This discrepancy arises from the sensitivity of WiFi signals to environmental changes, resulting in notable domain shifts. As a consequence, FL-based HAR approaches often face challenges when deployed in real-world WiFi environments. Although there are pioneering attempts at federated domain adaptation, they typically require non-trivial communication and computation costs, which are prohibitively expensive, especially considering the edge-based hardware of end-user environments. In this paper, we propose a model to democratize the WiFi-based HAR system by enhancing recognition accuracy in unannotated end-user environments while prioritizing data privacy. Our model leverages the hypothesis transfer and a lightweight hypothesis ensemble to mitigate negative transfer. We prove a tighter theoretical upper bound compared to existing multi-source federated domain adaptation models. Extensive experiments show that our model improves the average accuracy by approximately 10 absolute percentage points in both cross-person and cross-environment settings compared with several state-of-the-art baselines.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Le Zhang, Qibin Hou, Yun Liu, Jia-Wang Bian, Xun Xu, Joey Tianyi Zhou, Ce Zhu
Deep Negative Correlation Classification Journal Article
In: Machine Learning, 2024.
@article{zhang2024deep,
title = {Deep Negative Correlation Classification},
author = {Le Zhang and Qibin Hou and Yun Liu and Jia-Wang Bian and Xun Xu and Joey Tianyi Zhou and Ce Zhu},
year = {2024},
date = {2024-11-11},
urldate = {2024-11-11},
journal = {Machine Learning},
abstract = {Ensemble learning serves as a straightforward way to improve the performance of almost any machine learning algorithm. Existing deep ensemble methods usually naïvely train many different models and then aggregate their predictions. This is not optimal in our view from two aspects: i) Naïvely training multiple models adds much more computational burden, especially in the deep learning era; ii) Purely optimizing each base model without considering their interactions limits the diversity of the ensemble and the performance gains. We tackle these issues by proposing deep negative correlation classification (DNCC), in which the accuracy and diversity trade-off is systematically controlled by decomposing the loss function seamlessly into individual accuracy and the “correlation” between individual models and the ensemble. DNCC yields a deep classification ensemble where the individual estimator is both accurate and “negatively correlated”. Thanks to the optimized diversities, DNCC works well even when utilizing a shared network backbone, which significantly improves its efficiency when compared with most existing ensemble systems. Extensive experiments on multiple benchmark datasets and network structures demonstrate the superiority of the proposed method.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
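A note for context: the accuracy-diversity decomposition described above generalizes classic negative correlation learning (Liu and Yao, 1999). A sketch of that classic regression-form loss for base model i in an ensemble of M models (the paper's deep classification variant differs in detail):
% lambda controls the accuracy-diversity trade-off; lambda = 0 recovers independent training
e_i = \frac{1}{2}\left(f_i(x) - y\right)^2 + \lambda \left(f_i(x) - \bar{f}(x)\right) \sum_{j \neq i} \left(f_j(x) - \bar{f}(x)\right), \qquad \bar{f}(x) = \frac{1}{M} \sum_{m=1}^{M} f_m(x)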
Boyuan Sun, Yuqi Yang, Le Zhang, Ming-Ming Cheng, Qibin Hou
CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation Inproceedings
In: CVPR, 2024.
@inproceedings{sun2024corrmatch,
title = {CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation},
author = {Boyuan Sun and Yuqi Yang and Le Zhang and Ming-Ming Cheng and Qibin Hou},
year = {2024},
date = {2024-02-01},
urldate = {2024-02-01},
booktitle = {CVPR},
abstract = {In this paper, we present a simple but performant semi-supervised semantic segmentation approach, termed CorrMatch. Our goal is to mine more high-quality regions from the unlabeled images to leverage the unlabeled data more efficiently via consistency regularization. The key contributions of our CorrMatch are two novel and complementary strategies. First, we introduce an adaptive threshold updating strategy with a relaxed initialization to expand the high-quality regions. Furthermore, we propose to propagate high-confidence predictions through measuring the pairwise similarities between pixels. Despite its simplicity, we show that CorrMatch achieves great performance on popular semi-supervised semantic segmentation benchmarks. Taking the DeepLabV3+ framework with ResNet-101 backbone as our segmentation model, we receive a 76%+ mIoU score on the Pascal VOC 2012 segmentation benchmark with only 92 annotated images provided. We also achieve a consistent improvement over previous semi-supervised semantic segmentation models. Code will be made publicly available.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
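The pairwise-similarity propagation described above can be sketched in a few lines. A minimal illustration, assuming cosine similarity between pixel embeddings as the affinity; the tensor names and temperature tau are illustrative, not the authors' implementation:
import torch
import torch.nn.functional as F

def propagate_by_correlation(feats, logits, tau=0.1):
    # feats: (B, C, H, W) pixel embeddings; logits: (B, K, H, W) class logits
    B, C, H, W = feats.shape
    K = logits.shape[1]
    f = F.normalize(feats.flatten(2), dim=1)           # (B, C, HW), unit-norm per pixel
    sim = torch.einsum('bcm,bcn->bmn', f, f) / tau     # (B, HW, HW) pairwise similarities
    weights = sim.softmax(dim=-1)                      # row-normalized affinity
    l = logits.flatten(2).transpose(1, 2)              # (B, HW, K)
    out = torch.einsum('bmn,bnk->bmk', weights, l)     # spread predictions along affinities
    return out.transpose(1, 2).reshape(B, K, H, W)
Note the HW x HW similarity matrix is memory-hungry; in practice one would propagate on a downsampled feature map.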
Zhiwei Lin, Zhe Liu, Zhongyu Xia, Xinhao Wang, Yongtao Wang, Shengxiang Qi, Yang Dong, Nan Dong, Le Zhang, Ce Zhu
RCBEVDet: Radar-camera Fusion in Bird’s Eye View for 3D Object Detection Inproceedings
In: CVPR, 2024.
@inproceedings{lin2024rcbevdet,
title = {RCBEVDet: Radar-camera Fusion in Bird’s Eye View for 3D Object Detection},
author = {Zhiwei Lin and Zhe Liu and Zhongyu Xia and Xinhao Wang and Yongtao Wang and Shengxiang Qi and Yang Dong and Nan Dong and Le Zhang and Ce Zhu},
year = {2024},
date = {2024-02-01},
urldate = {2024-02-01},
booktitle = {CVPR},
abstract = {Three-dimensional object detection is one of the key tasks in autonomous driving. To reduce costs in practice, low-cost multi-view cameras for 3D object detection are proposed to replace the expensive LiDAR sensors. However, relying solely on cameras makes it difficult to achieve highly accurate and robust 3D object detection. An effective solution to this issue is combining multi-view cameras with the economical millimeter-wave radar sensor to achieve more reliable multi-modal 3D object detection. In this paper, we introduce RCBEVDet, a radar-camera fusion 3D object detection method in the bird’s eye view (BEV). Specifically, we first design RadarBEVNet for radar BEV feature extraction. RadarBEVNet consists of a dual-stream radar backbone and a Radar Cross-Section (RCS) aware BEV encoder. In the dual-stream radar backbone, a point-based encoder and a transformer-based encoder are proposed to extract radar features, with an injection and extraction module to facilitate communication between the two encoders. The RCS-aware BEV encoder takes RCS as an object-size prior when scattering the point features in BEV. Besides, we present the Cross-Attention Multi-layer Fusion module to automatically align the multi-modal BEV features from radar and camera with the deformable attention mechanism, and then fuse the features with channel and spatial fusion layers. Experimental results show that RCBEVDet achieves new state-of-the-art radar-camera fusion results on the nuScenes and view-of-delft (VoD) 3D object detection benchmarks. Furthermore, RCBEVDet achieves better 3D detection results than all real-time camera-only and radar-camera 3D object detectors with a faster inference speed at 21∼28 FPS.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Tian Gao, Cheng-Zhong Xu, Le Zhang, Hui Kong
GSB: Group superposition binarization for vision transformer with limited training samples Journal Article
In: Neural Networks, 2024.
@article{gao2024gsb,
title = {GSB: Group superposition binarization for vision transformer with limited training samples},
author = {Tian Gao and Cheng-Zhong Xu and Le Zhang and Hui Kong},
url = {https://github.com/IMRL/GSB-Vision-Transformer},
year = {2024},
date = {2024-01-01},
urldate = {2024-01-01},
journal = {Neural Networks},
abstract = {Vision Transformer (ViT) has performed remarkably in various computer vision tasks. Nonetheless, affected by the massive amount of parameters, ViT usually suffers from serious overfitting problems with a relatively limited number of training samples. In addition, ViT generally demands heavy computing resources, which limits its deployment on resource-constrained devices. As a type of model-compression method, model binarization is potentially a good choice to solve the above problems. Compared with the full-precision one, the binarized model replaces complex tensor multiplication with simple bit-wise binary operations and represents full-precision model parameters and activations with only 1-bit ones, which potentially solves the problems of model size and computational complexity, respectively. In this paper, we investigate a binarized ViT model. Empirically, we observe that the existing binarization technology designed for Convolutional Neural Networks (CNNs) cannot migrate well to a ViT’s binarization task. We also find that the decline in accuracy of the binary ViT model is mainly due to the information loss of the Attention module and the Value vector. Therefore, we propose a novel model binarization technique, called Group Superposition Binarization (GSB), to deal with these issues. Furthermore, to further improve the performance of the binarized model, we have investigated the gradient calculation procedure in the binarization process and derived more proper gradient calculation equations for GSB to reduce the influence of gradient mismatch. Then, the knowledge distillation technique is introduced to alleviate the performance degradation caused by model binarization. Analytically, model binarization can limit the parameter search space during parameter updates while training a model. Therefore, the binarization process can actually play an implicit regularization role and help solve the problem of overfitting in the case of insufficient training data. Experiments on three datasets with limited numbers of training samples demonstrate that the proposed GSB model achieves state-of-the-art performance among binary quantization schemes and exceeds its full-precision counterpart on some indicators. Code and models are available at: https://github.com/IMRL/GSB-Vision-Transformer.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
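To give a feel for approximating a tensor with superposed binary bases, here is a minimal greedy residual-binarization sketch; the grouping scheme and gradient treatment in the paper's actual GSB method differ:
import torch

def superposed_binarize(x, num_groups=3):
    # Greedily approximate x by a sum of scaled sign tensors: each group
    # binarizes whatever the previous groups failed to capture.
    residual = x.clone()
    approx = torch.zeros_like(x)
    for _ in range(num_groups):
        alpha = residual.abs().mean()   # per-tensor scaling factor
        b = torch.sign(residual)        # 1-bit base
        approx = approx + alpha * b
        residual = residual - alpha * b
    return approx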
Aiping Huang, Lijian Li, Le Zhang, Yuzhen Niu, Tiesong Zhao, Chia-Wen Lin
Multi-View Graph Embedding Learning for Image Co-Segmentation and Co-Localization Journal Article
In: IEEE Transactions on Circuits and Systems for Video Technology, 2024.
@article{huang2024multiview,
title = {Multi-View Graph Embedding Learning for Image Co-Segmentation and Co-Localization},
author = {Aiping Huang and Lijian Li and Le Zhang and Yuzhen Niu and Tiesong Zhao and Chia-Wen Lin},
year = {2024},
date = {2024-01-01},
urldate = {2024-01-01},
journal = {IEEE Transactions on Circuits and Systems for Video Technology},
abstract = {Image co-segmentation and co-localization exploit inter-image information to identify and extract foreground objects with a batch mode. However, they remain challenging when confronted with large object variations or complex backgrounds. This paper proposes a multi-view graph embedding (MV-Gem) learning scheme which integrates diversity, robustness and discernibility of object features to alleviate this phenomenon. To encourage the diversity, the deep co-information containing both low-layer general representations and high-layer semantic information is generated to form a multi-view feature pool for comprehensive co-object description. To enhance the robustness, a multi-view adaptive weighted learning is formulated to fuse the deep co-information for feature complementation. To ensure the discernibility, the graph embedding and sparse constraint are embedded into the fusion formulation for feature selection. The former aims to inherit important structures from multiple views, and the latter further selects important features to restrain irrelevant backgrounds. With these techniques, MV-Gem gradually recovers all co-objects through optimization iterations. Extensive experimental results on real-world datasets demonstrate that MV-Gem is capable of locating and delineating co-objects in an image group.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2023
Wei Meng, Zhicong Liu, Bing Li, Wei Cui, Joey Tianyi Zhou, Le Zhang
GrapHAR: A Lightweight Human Activity Recognition Model by Exploring the Sub-carrier Correlations Journal Article
In: IEEE Transactions on Wireless Communications, 2023.
@article{meng2023graphar,
title = {GrapHAR: A Lightweight Human Activity Recognition Model by Exploring the Sub-carrier Correlations},
author = {Wei Meng and Zhicong Liu and Bing Li and Wei Cui and Joey Tianyi Zhou and Le Zhang},
year = {2023},
date = {2023-08-08},
journal = {IEEE Transactions on Wireless Communications},
abstract = {Human activity recognition (HAR) is an important task due to its far-reaching applications, such as surveillance, healthcare systems, and human-computer interaction. Recently, Channel State Information (CSI)-based HAR has attracted increasing attention in the research community due to its ubiquitous availability, good user privacy, and fewer constraints on working conditions. Most of the existing methods for CSI-based HAR use various deep learning models, such as Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM), and Transformers, to distinguish activities based on their temporal patterns. Despite their remarkable effectiveness, these methods solely focus on temporal patterns while ignoring the correlations among sub-carriers. This limitation prevents them from achieving further performance improvement. Moreover, recent works often involve advanced yet massive and inefficient neural architectures, like Transformers, to obtain satisfactory recognition accuracy. The performance gain is traded off with a steep increase in model complexity, which leads to low efficiency and high training/inference costs outside the small time window. To address these issues, we propose a lightweight CSI-based HAR model. Our model makes the first effort to explore the graphical correlations of CSI sub-carriers, working in conjunction with a temporal causal convolution module. The highly efficient design enables our model to be highly effective without requiring excessive model complexity. Extensive experiments conducted on four real-world datasets demonstrate that our model outperforms state-of-the-art methods, including a strong Transformer-based baseline. It achieves an average improvement of 8 percentage points in recognition accuracy, with only 10% of the parameters compared to the Transformer-based method (4.95M vs. 49.24M). Additionally, our model is significantly faster, with empirical training and execution times at least 2.07 times faster than the baseline.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Bing Li, Wei Cui, Le Zhang, Ce Zhu, Wei Wang, Ivor Tsang, Joey Tianyi Zhou
DifFormer: Multi-Resolutional Differencing Transformer With Dynamic Ranging for Time Series Analysis Journal Article
In: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
@article{li2023difformer,
title = {DifFormer: Multi-Resolutional Differencing Transformer With Dynamic Ranging for Time Series Analysis},
author = {Bing Li and Wei Cui and Le Zhang and Ce Zhu and Wei Wang and Ivor Tsang and Joey Tianyi Zhou},
year = {2023},
date = {2023-07-17},
urldate = {2023-07-17},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
abstract = {Time series analysis is essential to many far-reaching applications of data science and statistics including economic and financial forecasting, surveillance, and automated business processing. Though Transformer has been greatly successful in computer vision and natural language processing, its potential as the general backbone for analyzing the ubiquitous time series data has not been fully released yet. Prior Transformer variants on time series highly rely on task-dependent designs and pre-assumed "pattern biases", revealing their insufficiency in representing nuanced seasonal, cyclic, and outlier patterns which are highly prevalent in time series. As a consequence, they cannot generalize well to different time series analysis tasks. To tackle the challenges, we propose DifFormer, an effective and efficient Transformer architecture that can serve as a workhorse for a variety of time-series analysis tasks. DifFormer incorporates a novel multi-resolutional differencing mechanism, which is able to progressively and adaptively make nuanced yet meaningful changes prominent, meanwhile, the periodic or cyclic patterns can be dynamically captured with flexible lagging and dynamic ranging operations. Extensive experiments demonstrate DifFormer significantly outperforms state-of-the-art models on three essential time-series analysis tasks, including classification, regression, and forecasting. In addition to its superior performances, DifFormer also excels in efficiency -- a linear time/memory complexity with empirically lower time consumption.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
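The differencing mechanism above can be illustrated in its simplest form: computing series differences at several lags. A toy sketch only; DifFormer's learnable differencing, flexible lagging, and dynamic ranging go well beyond this:
import torch

def multi_resolution_differences(x, lags=(1, 2, 4)):
    # x: (B, T, C) batch of multivariate series; returns one (B, T - lag, C)
    # difference tensor per lag, exposing changes at several temporal resolutions.
    return [x[:, lag:, :] - x[:, :-lag, :] for lag in lags]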
Ao Li, Le Zhang, Yun Liu, Ce Zhu
Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution Inproceedings
In: ICCV, 2023.
@inproceedings{li2023feature,
title = {Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution},
author = {Ao Li and Le Zhang and Yun Liu and Ce Zhu},
year = {2023},
date = {2023-07-17},
urldate = {2023-07-17},
booktitle = {ICCV},
abstract = {Transformer-based methods have exhibited remarkable potential in Single Image Super-Resolution (SISR) by effectively extracting long-range dependencies. However, most of the current research in this area has prioritized the design of transformer blocks to capture global information, while overlooking the importance of incorporating high-frequency priors, which we believe could be beneficial. In our study, we conducted a series of experiments and found that transformer structures are more adept at capturing low-frequency information, but have limited capacity in constructing high-frequency representations when compared to their convolutional counterparts. Our proposed solution, the Cross-Refinement Adaptive Feature Modulation Transformer (CRAFT), integrates the strengths of both convolutional and transformer structures. It comprises three key components: the High-Frequency Enhancement Residual Block (HFERB) for extracting high-frequency information, the Shift Rectangle Window Attention Block (SRWAB) for capturing global information, and the Hybrid Fusion Block (HFB) for refining the global representation. Our experiments on multiple datasets demonstrate that CRAFT outperforms state-of-the-art methods by up to 0.29 dB while using fewer parameters.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2022
Fanxing Liu, Cheng Zeng, Le Zhang*, Yingjie Zhou*, Qing Mu, Yanru Zhang, Ling Zhang, Ce Zhu
FedTADBench: Federated Time-series Anomaly Detection Benchmark Inproceedings
In: IEEE HPCC (Best Paper Award), 2022.
@inproceedings{liu2022fedtadbench,
title = {FedTADBench: Federated Time-series Anomaly Detection Benchmark},
author = {Fanxing Liu and Cheng Zeng and Le Zhang* and Yingjie Zhou* and Qing Mu and Yanru Zhang and Ling Zhang and Ce Zhu},
url = {https://github.com/fanxingliu2020/FedTADBench},
year = {2022},
date = {2022-12-15},
urldate = {2022-12-15},
booktitle = {IEEE HPCC (Best Paper Award)},
abstract = {Time series anomaly detection strives to uncover potential abnormal behaviors and patterns from temporal data, and has fundamental significance in diverse application scenarios. Constructing an effective detection model usually requires adequate training data stored in a centralized manner; however, this requirement sometimes cannot be satisfied in realistic scenarios. As a prevailing approach to address the above problem, federated learning has demonstrated its power to cooperate with the distributed data available while protecting the privacy of data providers. However, it is still unclear how existing time series anomaly detection algorithms perform with decentralized data storage and privacy protection through federated learning. To study this, we conduct a federated time series anomaly detection benchmark, named FedTADBench, which involves five representative time series anomaly detection algorithms and four popular federated learning methods. We would like to answer the following questions: (1) How is the performance of time series anomaly detection algorithms when meeting federated learning? (2) Which federated learning method is the most appropriate one for time series anomaly detection? (3) How do federated time series anomaly detection approaches perform on different partitions of data in clients? Extensive results as well as corresponding analysis are provided from experiments with various settings. The source code of our benchmark is publicly available at https://github.com/fanxingliu2020/FedTADBench.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
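As a reference point for the federated setting above, here is a minimal FedAvg-style aggregation sketch. Treating FedAvg as one of the benchmarked methods is an assumption; the abstract does not name the four FL methods:
import numpy as np

def fedavg(client_params, client_sizes):
    # client_params: list over clients, each a list of np.ndarray layer weights;
    # client_sizes: number of local training samples per client.
    total = float(sum(client_sizes))
    return [
        sum(params[i] * (n / total) for params, n in zip(client_params, client_sizes))
        for i in range(len(client_params[0]))
    ]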
Guolei Sun, Yun Liu, Hao Tang, Ajad Chhatkuli, Le Zhang, Luc Van Gool
Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation Inproceedings
In: ECCV2022, 2022.
@inproceedings{sun2022mining,
title = {Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation},
author = {Guolei Sun and Yun Liu and Hao Tang and Ajad Chhatkuli and Le Zhang and Luc Van Gool},
url = {https://github.com/GuoleiSun/VSS-MRCFA},
year = {2022},
date = {2022-10-18},
booktitle = {ECCV2022},
abstract = {The essence of video semantic segmentation (VSS) is how to leverage temporal information for prediction. Previous efforts are mainly devoted to developing new techniques to calculate the cross-frame affinities such as optical flow and attention. Instead, this paper contributes from a different angle by mining relations among cross-frame affinities, upon which better temporal information aggregation could be achieved. We explore relations among affinities in two aspects: single-scale intrinsic correlations and multi-scale relations. Inspired by traditional feature processing, we propose Single-scale Affinity Refinement (SAR) and Multi-scale Affinity Aggregation (MAA). To make it feasible to execute MAA, we propose a Selective Token Masking (STM) strategy to select a subset of consistent reference tokens for different scales when calculating affinities, which also improves the efficiency of our method. At last, the cross-frame affinities strengthened by SAR and MAA are adopted for adaptively aggregating temporal information. Our experiments demonstrate that the proposed method performs favorably against state-of-the-art VSS methods. The code is publicly available at https://github.com/GuoleiSun/VSS-MRCFA.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Gang Xu, Qibin Hou, Le Zhang, Ming-Ming Cheng
FMNet: Frequency-Aware Modulation Network for SDR-to-HDR Translation Inproceedings
In: ACM MM, 2022.
@inproceedings{xu2022fmnet,
title = {FMNet: Frequency-Aware Modulation Network for SDR-to-HDR Translation},
author = {Gang Xu and Qibin Hou and Le Zhang and Ming-Ming Cheng},
url = {https://github.com/MCG-NKU/FMNet},
year = {2022},
date = {2022-10-13},
urldate = {2022-10-13},
booktitle = {ACM MM},
abstract = {High-dynamic-range (HDR) media resources that preserve high contrast and more details in shadow and highlight areas in television are becoming increasingly popular for modern display technology compared to the widely available standard-dynamic-range (SDR) media resources. However, due to the exorbitant price of HDR cameras, researchers have attempted to develop the SDR-to-HDR techniques to convert the abundant SDR media resources to the HDR versions for cost-saving. Recent SDR-to-HDR methods mostly apply the image-adaptive modulation scheme to dynamically modulate the local contrast. However, these methods often fail to properly capture the low-frequency cues, resulting in artifacts in the low-frequency regions and low visual quality. Motivated by the Discrete Cosine Transform (DCT), in this paper, we propose a Frequency-aware Modulation Network (FMNet) to enhance the contrast in a frequency-adaptive way for SDR-to-HDR translation. Specifically, we design a frequency-aware modulation block that can dynamically modulate the features according to its frequency-domain responses. This allows us to reduce the structural distortions and artifacts in the translated low-frequency regions and reconstruct high-quality HDR content in the translated results. Experimental results on the HDRTV1K dataset show that our FMNet outperforms previous methods and the perceptual quality of the generated HDR images can be largely improved. Our code is available at https://github.com/MCG-NKU/FMNet.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Yu-Huan Wu, Yun Liu, Le Zhang, Ming-Ming Cheng, Bo Ren
EDN: Salient object detection via extremely-downsampled network Journal Article
In: IEEE TIP, 2022.
@article{wu2022edn,
title = {EDN: Salient object detection via extremely-downsampled network},
author = {Yu-Huan Wu and Yun Liu and Le Zhang and Ming-Ming Cheng and Bo Ren},
url = {https://github.com/yuhuan-wu/EDN},
year = {2022},
date = {2022-06-08},
urldate = {2022-06-08},
journal = {IEEE TIP},
abstract = {Recent progress on salient object detection (SOD) mainly benefits from multi-scale learning, where the high-level and low-level features collaborate in locating salient objects and discovering fine details, respectively. However, most efforts are devoted to low-level feature learning by fusing multi-scale features or enhancing boundary representations. High-level features, although long proven effective for many other tasks, have been barely studied for SOD. In this paper, we tap into this gap and show that enhancing high-level features is essential for SOD as well. To this end, we introduce an Extremely-Downsampled Network (EDN), which employs an extreme downsampling technique to effectively learn a global view of the whole image, leading to accurate salient object localization. To accomplish better multi-level feature fusion, we construct the Scale-Correlated Pyramid Convolution (SCPC) to build an elegant decoder for recovering object details from the above extreme downsampling. Extensive experiments demonstrate that EDN achieves state-of-the-art performance with real-time speed. Our efficient EDN-Lite also achieves competitive performance with a speed of 316 fps. Hence, this work is expected to spark some new thinking in SOD. Code is available at https://github.com/yuhuan-wu/EDN.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Wei Cui, Le Zhang, Bing Li, Zhenghua Chen, Min Wu, Xiaoli Li, Jiawen Kang
Semi-Supervised Deep Adversarial Forest for Cross-Environment Localization Journal Article
In: IEEE Transactions on Vehicular Technology, 2022.
@article{cui2022semi,
title = {Semi-Supervised Deep Adversarial Forest for Cross-Environment Localization},
author = {Wei Cui and Le Zhang and Bing Li and Zhenghua Chen and Min Wu and Xiaoli Li and Jiawen Kang},
year = {2022},
date = {2022-05-04},
urldate = {2022-05-04},
journal = {IEEE Transactions on Vehicular Technology},
abstract = {Extracting channel state information (CSI) from WiFi signals has proven highly effective for locating humans in a device-free manner. However, existing localization/positioning systems are mainly trained and deployed in a fixed environment, and thus they are likely to suffer from substantial performance declines when migrating to new environments. In this paper, we address the fundamental problem of WiFi-based cross-environment indoor localization using a semi-supervised approach, in which we only have access to the annotations of the source environment while the data in the target environments are un-annotated. This problem is of high practical value in enabling a well-trained system to be scalable to new environments without tedious human annotations. To this end, a deep neural forest is introduced which unifies ensemble learning with the representation learning functionalities of deep neural networks in an end-to-end trainable fashion. On top of that, an adversarial training strategy is further employed to learn environment-invariant feature representations for facilitating more robust localization. Extensive experiments on real-world datasets demonstrate the superiority of the proposed methods over state-of-the-art baselines. Compared with the best-performing baseline, our model excels with an average 12.7% relative improvement on all six evaluation settings.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2021
Yun Liu, Ming-Ming Cheng, Deng-Ping Fan, Le Zhang, JiaWang Bian, Dacheng Tao
Semantic edge detection with diverse deep supervision Journal Article
In: IJCV, 2021.
@article{liu2018semantic,
title = {Semantic edge detection with diverse deep supervision},
author = {Yun Liu and Ming-Ming Cheng and Deng-Ping Fan and Le Zhang and JiaWang Bian and Dacheng Tao},
year = {2021},
date = {2021-06-03},
urldate = {2018-01-01},
journal = {IJCV},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Xun Xu, Loong-Fah Cheong, Zhuwen Li, Le Zhang, Ce Zhu
Learning Clustering for Motion Segmentation Journal Article
In: IEEE TCSVT, 2021.
@article{xu2021learning,
title = {Learning Clustering for Motion Segmentation},
author = {Xun Xu and Loong-Fah Cheong and Zhuwen Li and Le Zhang and Ce Zhu},
url = {https://alex-xun-xu.github.io/Doc/Publication/2021/XuEtAl_TCSVT21.pdf},
doi = {10.1109/TCSVT.2021.3069094},
year = {2021},
date = {2021-03-29},
urldate = {2021-03-29},
journal = {IEEE TCSVT},
abstract = {Subspace clustering has been extensively studied from the hypothesis-and-test, algebraic, and spectral clustering-based perspectives. Most assume that only a single type/class of subspace is present. Generalizations to multiple types are non-trivial, plagued by challenges such as the choice of types and numbers of models, sampling imbalance, and parameter tuning. In many real-world problems, data may not lie perfectly on a linear subspace, and hand-designed linear subspace models may not fit these situations. In this work, we formulate the multi-type subspace clustering problem as one of learning non-linear subspace filters via deep multi-layer perceptrons (MLPs). The responses to the learnt subspace filters serve as the feature embedding that is clustering-friendly, i.e., points of the same clusters will be embedded closer together through the network. For inference, we apply K-means to the network output to cluster the data. Experiments are carried out on synthetic data and real-world motion segmentation problems, producing state-of-the-art results.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Le Zhang, Wei Cui, Bing Li, Zhenghua Chen, Min Wu, Teo Sin Gee
Privacy-Preserving Cross-Environment Human Activity Recognition Journal Article
In: IEEE TCybernetics, 2021.
@article{zhang2021privacy,
title = {Privacy-Preserving Cross-Environment Human Activity Recognition},
author = {Le Zhang and Wei Cui and Bing Li and Zhenghua Chen and Min Wu and Teo Sin Gee},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
journal = {IEEE TCybernetics},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Yining Ma, Jingwen Li, Zhiguang Cao, Wen Song, Le Zhang, Zhenghua Chen, Jing Tang
Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer Inproceedings
In: NeurIPS, 2021.
@inproceedings{ma2021learning,
title = {Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer},
author = {Yining Ma and Jingwen Li and Zhiguang Cao and Wen Song and Le Zhang and Zhenghua Chen and Jing Tang},
url = {https://github.com/yining043/VRP-DACT},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
booktitle = {NeurIPS},
journal = {NeurIPS},
volume = {34},
abstract = {Recently, Transformer has become a prevailing deep architecture for solving vehicle routing problems (VRPs). However, the original Transformer is less effective in learning improvement models because its positional encoding (PE) method is not suitable in representing VRP solutions. This paper presents a novel Dual-Aspect Collaborative Transformer (DACT) to learn embeddings for the node and positional features separately, instead of fusing them together as done in the original PE, so as to avoid potential noises and incompatible attention scores. Moreover, the positional features are embedded through a novel cyclic positional encoding (CPE) method to capture the circularity and symmetry of VRP solutions. We train DACT using Proximal Policy Optimization, and design a curriculum learning strategy for better sample efficiency. We apply DACT to solve the traveling salesman problem (TSP) and capacitated vehicle routing problem (CVRP). Results show that DACT outperforms existing Transformer based improvement models, and exhibits better capability of generalizing across different problem sizes. Code is available at https://github.com/yining043/VRP-DACT.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
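The circularity idea behind the cyclic positional encoding (CPE) can be sketched by placing positions on a circle before applying sinusoids, so that the last position of a tour sits next to the first. An illustrative construction only, not the paper's exact CPE; assumes an even d_model:
import numpy as np

def cyclic_positional_encoding(n_positions, d_model):
    # Map each position to an angle on the unit circle, then expand with
    # sinusoids at several frequencies; position n-1 ends up adjacent to 0.
    angles = 2 * np.pi * np.arange(n_positions) / n_positions      # (n,)
    freqs = np.arange(1, d_model // 2 + 1)                         # (d/2,)
    phase = angles[:, None] * freqs[None, :]                       # (n, d/2)
    return np.concatenate([np.sin(phase), np.cos(phase)], axis=1)  # (n, d)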
Le Zhang, Zenglin Shi, Ming-Ming Cheng, Yun Liu, Jia-Wang Bian, Joey Tianyi Zhou, Guoyan Zheng, Zeng Zeng
Nonlinear Regression via Deep Negative Correlation Learning Journal Article
In: IEEE TPAMI, 43 (3), pp. 982-998, 2021.
@article{8850209,
title = {Nonlinear Regression via Deep Negative Correlation Learning},
author = {Le Zhang and Zenglin Shi and Ming-Ming Cheng and Yun Liu and Jia-Wang Bian and Joey Tianyi Zhou and Guoyan Zheng and Zeng Zeng},
url = { https://mmcheng.net/dncl/},
doi = {10.1109/TPAMI.2019.2943860},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
journal = {IEEE TPAMI},
volume = {43},
number = {3},
pages = {982-998},
abstract = {Nonlinear regression has been extensively employed in many computer vision problems (e.g., crowd counting, age estimation, affective computing). Under the umbrella of deep learning, two common solutions exist i) transforming nonlinear regression to a robust loss function which is jointly optimizable with the deep convolutional network, and ii) utilizing ensemble of deep networks. Although some improved performance is achieved, the former may be lacking due to the intrinsic limitation of choosing a single hypothesis and the latter may suffer from much larger computational complexity. To cope with those issues, we propose to regress via an efficient “divide and conquer” manner. The core of our approach is the generalization of negative correlation learning that has been shown, both theoretically and empirically, to work well for non-deep regression problems. Without extra parameters, the proposed method controls the bias-variance-covariance trade-off systematically and usually yields a deep regression ensemble where each base model is both “accurate” and “diversified.” Moreover, we show that each sub-problem in the proposed method has less Rademacher Complexity and thus is easier to optimize. Extensive experiments on several diverse and challenging tasks including crowd counting, personality analysis, age estimation, and image super-resolution demonstrate the superiority over challenging baselines as well as the versatility of the proposed method. The source code and trained models are available on our project page: https://mmcheng.net/dncl/.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Wanyue Zhang, Xun Xu, Fayao Liu, Le Zhang, Chuan-Sheng Foo
On Automatic Data Augmentation for 3D Point Cloud Classification Proceeding
BMVC, 2021.
@proceedings{zhang2021automatic,
title = {On Automatic Data Augmentation for 3D Point Cloud Classification},
author = {Wanyue Zhang and Xun Xu and Fayao Liu and Le Zhang and Chuan-Sheng Foo},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
howpublished = {BMVC},
keywords = {},
pubstate = {published},
tppubtype = {proceedings}
}
Yu-Huan Wu, Yun Liu, Le Zhang, Wang Gao, Ming-Ming Cheng
Regularized Densely-Connected Pyramid Network for Salient Instance Segmentation Journal Article
In: IEEE TIP, 30, pp. 3897-3907, 2021.
@article{9382868,
title = {Regularized Densely-Connected Pyramid Network for Salient Instance Segmentation},
author = {Yu-Huan Wu and Yun Liu and Le Zhang and Wang Gao and Ming-Ming Cheng},
url = {https://github.com/yuhuan-wu/RDPNet},
doi = {10.1109/TIP.2021.3065822},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
journal = {IEEE TIP},
volume = {30},
pages = {3897-3907},
abstract = {Many of the recent efforts on salient object detection (SOD) have been devoted to producing accurate saliency maps without being aware of their instance labels. To this end, we propose a new pipeline for end-to-end salient instance segmentation (SIS) that predicts a class-agnostic mask for each detected salient instance. To better use the rich feature hierarchies in deep networks and enhance the side predictions, we propose the regularized dense connections, which attentively promote informative features and suppress non-informative ones from all feature pyramids. A novel multi-level RoIAlign based decoder is introduced to adaptively aggregate multi-level features for better mask predictions. Such strategies can be well-encapsulated into the Mask R-CNN pipeline. Extensive experiments on popular benchmarks demonstrate that our design significantly outperforms existing state-of-the-art competitors by 6.3% (58.6% vs. 52.3%) in terms of the AP metric. The code is available at https://github.com/yuhuan-wu/RDPNet.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Wei Wang, Wei Cui, Bing Li, Min Wu
Two-Stream Convolution Augmented Transformer for Human Activity Recognition Inproceedings
In: AAAI, 2021.
@inproceedings{bing2021that,
title = {Two-Stream Convolution Augmented Transformer for Human Activity Recognition},
author = {Wei Wang and Wei Cui and Bing Li and Min Wu},
url = {https://github.com/windofshadow/THAT},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
booktitle = {AAAI},
abstract = {Recognition of human activities is an important task due to its far-reaching applications such as healthcare systems, context-aware applications, and security monitoring. Recently, WiFi-based human activity recognition (HAR) is becoming ubiquitous due to its non-invasiveness. Existing WiFi-based HAR methods regard WiFi signals as a temporal sequence of channel state information (CSI), and employ deep sequential models (e.g., RNN, LSTM) to automatically capture channel-over-time features. Although being remarkably effective, they suffer from two major drawbacks. Firstly, the granularity of a single temporal point is blindly elementary for representing meaningful CSI patterns. Secondly, the time-over-channel features are also important, and could be a natural data augmentation. To address the drawbacks, we propose a novel Two-stream Convolution Augmented Human Activity Transformer (THAT) model. Our model proposes to utilize a two-stream structure to capture both time-over-channel and channel-over-time features, and use the multi-scale convolution augmented transformer to capture range-based patterns. Extensive experiments on four real experiment datasets demonstrate that our model outperforms state-of-the-art models in terms of both effectiveness and efficiency.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Zhichao Li, Le Zhang, Chunhua Shen, Ming-Ming Cheng, Ian Reid
Unsupervised Scale-consistent Depth Learning from Video Journal Article
In: IJCV, 2021.
@article{bian2021ijcv,
title = {Unsupervised Scale-consistent Depth Learning from Video},
author = {Jia-Wang Bian and Huangying Zhan and Naiyan Wang and Zhichao Li and Le Zhang and Chunhua Shen and Ming-Ming Cheng and Ian Reid},
url = {https://github.com/JiawangBian/SC-SfMLearner-Release},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
journal = {IJCV},
abstract = {We propose a monocular depth estimation method SC-Depth, which requires only unlabelled videos for training and enables scale-consistent prediction at inference time. Our contributions include: (i) we propose a geometry consistency loss, which penalizes the inconsistency of predicted depths between adjacent views; (ii) we propose a self-discovered mask to automatically localize moving objects that violate the underlying static scene assumption and cause noisy signals during training; (iii) we demonstrate the efficacy of each component with a detailed ablation study and show high-quality depth estimation results on both the KITTI and NYUv2 datasets. Moreover, thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system for more robust and accurate tracking. The proposed hybrid Pseudo-RGBD SLAM shows compelling results on KITTI, and it generalizes well to the KAIST dataset without additional training. Finally, we provide several demos for qualitative evaluation. The source code is released on GitHub.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
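For reference, the geometry consistency loss in (i) and the self-discovered mask in (ii) are commonly written as follows, where D_b^a is the depth of view b warped from view a, D_b' the interpolated predicted depth of view b, and V the set of valid pixels (a sketch with simplified notation):
% depth inconsistency, the loss averaged over valid pixels, and the per-pixel mask
D_{\mathrm{diff}}(p) = \frac{\lvert D_b^a(p) - D_b'(p) \rvert}{D_b^a(p) + D_b'(p)}, \qquad L_{GC} = \frac{1}{\lvert V \rvert} \sum_{p \in V} D_{\mathrm{diff}}(p), \qquad M(p) = 1 - D_{\mathrm{diff}}(p)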
Yun Liu, Xin-Yu Zhang, Jia-Wang Bian, Le Zhang, Ming-Ming Cheng
SAMNet: Stereoscopically Attentive Multi-Scale Network for Lightweight Salient Object Detection Journal Article
In: IEEE TIP, 30, pp. 3804–3814, 2021.
@article{liu2021samnet,
title = {SAMNet: Stereoscopically Attentive Multi-Scale Network for Lightweight Salient Object Detection},
author = {Yun Liu and Xin-Yu Zhang and Jia-Wang Bian and Le Zhang and Ming-Ming Cheng},
url = {https://mmcheng.net/SAMNet/},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
journal = {IEEE TIP},
volume = {30},
pages = {3804--3814},
publisher = {IEEE},
abstract = {Recent progress on salient object detection (SOD) mostly benefits from the explosive development of Convolutional Neural Networks (CNNs). However, much of the improvement comes with the larger network size and heavier computation overhead, which, in our view, is not mobile-friendly and thus difficult to deploy in practice. To promote more practical SOD systems, we introduce a novel Stereoscopically Attentive Multi-scale (SAM) module, which adopts a stereoscopic attention mechanism to adaptively fuse the features of various scales. Embarking on this module, we propose an extremely lightweight network, namely SAMNet, for SOD. Extensive experiments on popular benchmarks demonstrate that the proposed SAMNet yields comparable accuracy with state-of-the-art methods while running at a GPU speed of 343 fps and a CPU speed of 5 fps for 336×336 inputs with only 1.33M parameters. Therefore, SAMNet paves a new path towards SOD. The source code is available on the project page https://mmcheng.net/SAMNet/},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Joey Tianyi Zhou, Le Zhang*, Jiawei Du, Xi Peng, Zhiwen Fang, Zhe Xiao, Hongyuan Zhu
Locality-Aware Crowd Counting Journal Article
In: IEEE TPAMI, 2021.
@article{zhou2021locality,
title = {Locality-Aware Crowd Counting},
author = {Joey Tianyi Zhou and Le Zhang* and Jiawei Du and Xi Peng and Zhiwen Fang and Zhe Xiao and Hongyuan Zhu},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
journal = {IEEE TPAMI},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2020
Le Zhang, Zhenghua Chen, Wei Cui, Bing Li, Cen Chen, Zhiguang Cao, Kaizhou Gao
Wifi-based indoor robot positioning using deep fuzzy forests Journal Article
In: IEEE Internet of Things Journal, 7 (11), pp. 10773–10781, 2020.
@article{zhang2020wifi,
title = {Wifi-based indoor robot positioning using deep fuzzy forests},
author = {Le Zhang and Zhenghua Chen and Wei Cui and Bing Li and Cen Chen and Zhiguang Cao and Kaizhou Gao},
year = {2020},
date = {2020-01-01},
journal = {IEEE Internet of Things Journal},
volume = {7},
number = {11},
pages = {10773--10781},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Zhiguang Cao, Hongliang Guo, Wen Song, Kaizhou Gao, Zhenghua Chen, Le Zhang, Xuexi Zhang
Using reinforcement learning to minimize the probability of delay occurrence in transportation Journal Article
In: IEEE Transactions on Vehicular Technology, 69 (3), pp. 2424–2436, 2020.
@article{cao2020using,
title = {Using reinforcement learning to minimize the probability of delay occurrence in transportation},
author = {Zhiguang Cao and Hongliang Guo and Wen Song and Kaizhou Gao and Zhenghua Chen and Le Zhang and Xuexi Zhang},
year = {2020},
date = {2020-01-01},
journal = {IEEE Transactions on Vehicular Technology},
volume = {69},
number = {3},
pages = {2424--2436},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Cen Chen, Xiaofeng Zou, Zeng Zeng, Zhongyao Cheng, Le Zhang, Steven CH Hoi
Exploring structural knowledge for automated visual inspection of moving trains Journal Article
In: IEEE Transactions on Cybernetics, 2020.
@article{chen2020exploring,
title = {Exploring structural knowledge for automated visual inspection of moving trains},
author = {Cen Chen and Xiaofeng Zou and Zeng Zeng and Zhongyao Cheng and Le Zhang and Steven CH Hoi},
year = {2020},
date = {2020-01-01},
journal = {IEEE Transactions on Cybernetics},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
JiaWang Bian, Wen-Yan Lin, Yun Liu, Le Zhang, Sai-Kit Yeung, Ming-Ming Cheng, Ian Reid
GMS: Grid-based Motion Statistics for Fast, Ultra-Robust Feature Correspondence Journal Article
In: IJCV, 2020.
@article{Bian2020gms,
title = {GMS: Grid-based Motion Statistics for Fast, Ultra-Robust Feature Correspondence},
author = {JiaWang Bian and Wen-Yan Lin and Yun Liu and Le Zhang and Sai-Kit Yeung and Ming-Ming Cheng and Ian Reid},
url = {https://github.com/JiawangBian/GMS-Feature-Matcher},
year = {2020},
date = {2020-01-01},
urldate = {2020-01-01},
journal = {IJCV},
abstract = {Feature matching aims at generating correspondences across images, which is widely used in many computer vision tasks. Although considerable progress has been made on feature descriptors and fast matching for initial correspondence hypotheses, selecting good ones from them is still challenging and critical to the overall performance. More importantly, existing methods often take a long computational time, limiting their use in real-time applications. This paper attempts to separate true correspondences from false ones at high speed. We term the proposed method GMS (Grid-based Motion Statistics), which incorporates the smoothness constraint into a statistic framework for separation and uses a grid-based implementation for fast calculation. GMS is robust to various challenging image changes, involving viewpoint, scale, and rotation. It is also fast, e.g., taking only 1 or 2 ms in a single CPU thread, even when 50K correspondences are processed. This has important implications for real-time applications. What’s more, we show that incorporating GMS into the classic feature matching and epipolar geometry estimation pipeline can significantly boost the overall performance. Finally, we integrate GMS into the well-known ORB-SLAM system for monocular initialization, resulting in a significant improvement.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
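As a usage note for this entry: below is a minimal sketch (an illustration under stated assumptions, not the authors' reference code at the URL above) of running GMS-style correspondence filtering through OpenCV's contrib module, which exposes an implementation as cv2.xfeatures2d.matchGMS. It assumes the opencv-contrib-python package and two hypothetical input images, frame1.png and frame2.png.

import cv2

# GMS works best with many cheap features; the paper pairs it with ORB.
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # hypothetical inputs
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create(10000)
orb.setFastThreshold(0)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force Hamming matching supplies the initial (noisy) hypotheses.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
all_matches = matcher.match(des1, des2)

# GMS keeps matches whose grid cells accumulate enough supporting
# neighbours, i.e. the statistical smoothness test described in the abstract.
good = cv2.xfeatures2d.matchGMS(
    img1.shape[::-1], img2.shape[::-1],  # (width, height) of each image
    kp1, kp2, all_matches,
    withRotation=False, withScale=False, thresholdFactor=6.0)
print(f"{len(good)} / {len(all_matches)} matches survive GMS")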
Le Zhang; Zenglin Shi; Joey Tianyi Zhou; Ming-Ming Cheng; Yun Liu; Jia-Wang Bian; Zeng Zeng; Chunhua Shen
Ordered or Orderless: A Revisit for Video based Person Re-Identification Journal Article
In: IEEE TPAMI, 2020.
Abstract | BibTeX | Tags: | Links:
@article{Zhang2020OrderlessReID,
title = {Ordered or Orderless: A Revisit for Video based Person Re-Identification},
author = {Le Zhang and Zenglin Shi and Joey Tianyi Zhou and Ming-Ming Cheng and Yun Liu and Jia-Wang Bian and Zeng Zeng and Chunhua Shen},
url = {https://github.com/ZhangLeUestc/VideoReid-TPAMI2020},
year = {2020},
date = {2020-01-01},
urldate = {2020-01-01},
journal = {IEEE TPAMI},
abstract = {Is a recurrent network really necessary for learning a good visual representation for video based person re-identification (VPRe-id)? In this paper, we first show that the common practice of employing recurrent neural networks (RNNs) to aggregate temporal-spatial features may not be optimal. Specifically, with a diagnostic analysis, we show that the recurrent structure may not be as effective at learning temporal dependencies as expected and implicitly yields an orderless representation. Based on this observation, we then present a simple yet surprisingly powerful approach for VPRe-id, where we treat VPRe-id as an efficient, orderless ensemble of image based person re-identification problems. More specifically, we divide videos into individual images and re-identify persons with an ensemble of image based rankers. Under the i.i.d. assumption, we provide an error bound that sheds light on how we can improve VPRe-id. Our work also presents a promising way to bridge the gap between video and image based person re-identification. Comprehensive experimental evaluations demonstrate that the proposed solution achieves state-of-the-art performance on multiple widely used datasets (iLIDS-VID, PRID 2011, and MARS).},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
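To make the "orderless ensemble" idea above concrete, here is a minimal sketch (an illustration, not the paper's released code, which lives at the URL above): a video pair is scored by averaging the distances that an image-based embedding network produces over individual frames, with no recurrent aggregation. The embedding model is a hypothetical stand-in.

import torch

def video_distance(model, query_frames, gallery_frames):
    """query_frames, gallery_frames: float tensors of shape (T, C, H, W);
    model: any image-based re-identification embedding (hypothetical here)."""
    with torch.no_grad():
        q = model(query_frames)    # (Tq, D) per-frame embeddings
        g = model(gallery_frames)  # (Tg, D)
    # Each (query frame, gallery frame) pair acts as one weak ranker;
    # the ensemble score is simply the mean pairwise Euclidean distance.
    return torch.cdist(q, g).mean()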
2019
Jia-Xing Zhao; Yang Cao; Deng-Ping Fan; Ming-Ming Cheng; Xuan-Yi Li; Le Zhang
Contrast prior and fluid pyramid integration for RGBD salient object detection Inproceedings
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3927–3936, 2019.
BibTeX | Tags:
@inproceedings{zhao2019contrast,
title = {Contrast prior and fluid pyramid integration for RGBD salient object detection},
author = {Jia-Xing Zhao and Yang Cao and Deng-Ping Fan and Ming-Ming Cheng and Xuan-Yi Li and Le Zhang},
year = {2019},
date = {2019-01-01},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages = {3927--3936},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Le Zhang; Songyou Peng; Stefan Winkler
PersEmoN: A deep network for joint analysis of apparent personality, emotion and their relationship Journal Article
In: IEEE Transactions on Affective Computing, 2019.
BibTeX | Tags:
@article{zhang2019persemon,
title = {PersEmoN: A deep network for joint analysis of apparent personality, emotion and their relationship},
author = {Le Zhang and Songyou Peng and Stefan Winkler},
year = {2019},
date = {2019-01-01},
journal = {IEEE Transactions on Affective Computing},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Joey Tianyi Zhou; Le Zhang; Zhiwen Fang; Jiawei Du; Xi Peng; Yang Xiao
Attention-driven loss for anomaly detection in video surveillance Journal Article
In: IEEE TCSVT, 30 (12), pp. 4639–4647, 2019.
Abstract | BibTeX | Tags: | Links:
@article{zhou2019attention,
title = {Attention-driven loss for anomaly detection in video surveillance},
author = {Joey Tianyi Zhou and Le Zhang and Zhiwen Fang and Jiawei Du and Xi Peng and Yang Xiao},
url = {https://github.com/joeyzhouty/Attention-driven-loss},
year = {2019},
date = {2019-01-01},
urldate = {2019-01-01},
journal = {IEEE TCSVT},
volume = {30},
number = {12},
pages = {4639--4647},
publisher = {IEEE},
abstract = {Recent video anomaly detection methods focus on reconstructing or predicting frames. In this setting, the long-standing inter-class data-imbalance problem manifests as an imbalance between foreground and stationary background objects, which has been little investigated by existing solutions. Naively optimizing the reconstruction loss yields an optimization biased towards reconstructing the background rather than the objects of interest in the foreground. To solve this, we propose a simple yet effective solution, termed attention-driven loss, to alleviate the foreground-background imbalance problem in anomaly detection. Specifically, we compute a single mask map that summarizes the frame evolution of moving foreground regions and suppresses the background in the training video clips. After that, we construct an attention map through the combination of the mask map and background to weight the foreground and background regions differently. The proposed attention-driven loss is independent of backbone networks and can be easily augmented in most existing anomaly detection models. Augmented with the attention-driven loss, the model achieves an AUC of 86.0% on the Avenue, 83.9% on the Ped1, and 96% on the Ped2 datasets. Extensive experimental results and ablation studies further validate the effectiveness of our model. The project page is available at https://github.com/joeyzhouty/Attention-driven-loss},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
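To illustrate the weighting scheme described in the abstract, a minimal sketch follows (assumed shapes and names, not the authors' code, which is linked above): the reconstruction error is re-weighted so that moving foreground pixels, identified by a motion mask, contribute more than the static background.

import torch

def attention_driven_loss(pred, target, fg_mask, fg_weight=2.0, bg_weight=1.0):
    """pred, target: (B, C, H, W) predicted and ground-truth frames;
    fg_mask: (B, 1, H, W) binary map of moving foreground regions, e.g.
    accumulated from frame differences over a training clip (assumed input)."""
    attention = fg_weight * fg_mask + bg_weight * (1.0 - fg_mask)
    per_pixel = (pred - target) ** 2           # plain reconstruction error
    return (attention * per_pixel).mean()      # re-weighted by the attention map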
2018
Zenglin Shi; Le Zhang; Yun Liu; Xiaofeng Cao; Yangdong Ye; Ming-Ming Cheng; Guoyan Zheng
Crowd counting with deep negative correlation learning Inproceedings
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5382–5390, 2018.
BibTeX | Tags:
@inproceedings{shi2018crowd,
title = {Crowd counting with deep negative correlation learning},
author = {Zenglin Shi and Le Zhang and Yun Liu and Xiaofeng Cao and Yangdong Ye and Ming-Ming Cheng and Guoyan Zheng},
year = {2018},
date = {2018-01-01},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages = {5382--5390},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Chen Wang; Le Zhang; Lihua Xie; Junsong Yuan
Kernel cross-correlator Inproceedings
In: Proceedings of the AAAI Conference on Artificial Intelligence, 2018.
BibTeX | Tags:
@inproceedings{wang2018kernel,
title = {Kernel cross-correlator},
author = {Chen Wang and Le Zhang and Lihua Xie and Junsong Yuan},
year = {2018},
date = {2018-01-01},
booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
volume = {32},
number = {1},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Yun Liu; Peng-Tao Jiang; Vahan Petrosyan; Shi-Jie Li; Jiawang Bian; Le Zhang; Ming-Ming Cheng
DEL: Deep Embedding Learning for Efficient Image Segmentation Inproceedings
In: IJCAI, pp. 864–870, 2018.
BibTeX | Tags:
@inproceedings{liu2018deep,
title = {DEL: Deep Embedding Learning for Efficient Image Segmentation},
author = {Yun Liu and Peng-Tao Jiang and Vahan Petrosyan and Shi-Jie Li and Jiawang Bian and Le Zhang and Ming-Ming Cheng},
year = {2018},
date = {2018-01-01},
booktitle = {IJCAI},
pages = {864--870},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Zenglin Shi; Guodong Zeng; Le Zhang; Xiahai Zhuang; Lei Li; Guang Yang; Guoyan Zheng
Bayesian VoxDRN: A probabilistic deep voxelwise dilated residual network for whole heart segmentation from 3D MR images Inproceedings
In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 569–577, Springer 2018.
BibTeX | Tags:
@inproceedings{shi2018bayesian,
title = {Bayesian VoxDRN: A probabilistic deep voxelwise dilated residual network for whole heart segmentation from 3D MR images},
author = {Zenglin Shi and Guodong Zeng and Le Zhang and Xiahai Zhuang and Lei Li and Guang Yang and Guoyan Zheng},
year = {2018},
date = {2018-01-01},
booktitle = {International Conference on Medical Image Computing and Computer-Assisted Intervention},
pages = {569--577},
organization = {Springer},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jufeng Yang; Liyi Chen; Le Zhang; Xiaoxiao Sun; Dongyu She; Shao-Ping Lu; Ming-Ming Cheng
Historical context-based style classification of painting images via label distribution learning Inproceedings
In: Proceedings of the 26th ACM International Conference on Multimedia, pp. 1154–1162, 2018.
BibTeX | Tags:
@inproceedings{yang2018historical,
title = {Historical context-based style classification of painting images via label distribution learning},
author = {Jufeng Yang and Liyi Chen and Le Zhang and Xiaoxiao Sun and Dongyu She and Shao-Ping Lu and Ming-Ming Cheng},
year = {2018},
date = {2018-01-01},
booktitle = {Proceedings of the 26th ACM International Conference on Multimedia},
pages = {1154--1162},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Zhenghua Chen; Le Zhang; Chaoyang Jiang; Zhiguang Cao; Wei Cui
WiFi CSI based passive human activity recognition using attention based BLSTM Journal Article
In: IEEE Transactions on Mobile Computing, 18 (11), pp. 2714–2724, 2018.
BibTeX | Tags:
@article{chen2018wifi,
title = {WiFi CSI based passive human activity recognition using attention based BLSTM},
author = {Zhenghua Chen and Le Zhang and Chaoyang Jiang and Zhiguang Cao and Wei Cui},
year = {2018},
date = {2018-01-01},
journal = {IEEE Transactions on Mobile Computing},
volume = {18},
number = {11},
pages = {2714--2724},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2017
Le Zhang; Jagannadan Varadarajan; Ponnuthurai Nagaratnam Suganthan; Narendra Ahuja; Pierre Moulin
Robust visual tracking using oblique random forests Inproceedings
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5589–5598, 2017.
BibTeX | Tags:
@inproceedings{zhang2017robust,
title = {Robust visual tracking using oblique random forests},
author = {Le Zhang and Jagannadan Varadarajan and Ponnuthurai Nagaratnam Suganthan and Narendra Ahuja and Pierre Moulin},
year = {2017},
date = {2017-01-01},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages = {5589--5598},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Le Zhang; Ponnuthurai Nagaratnam Suganthan
Benchmarking ensemble classifiers with novel co-trained kernel ridge regression and random vector functional link ensembles [research frontier] Journal Article
In: IEEE Computational Intelligence Magazine, 12 (4), pp. 61–72, 2017.
BibTeX | Tags:
@article{zhang2017benchmarking,
title = {Benchmarking ensemble classifiers with novel co-trained kernel ridge regression and random vector functional link ensembles [research frontier]},
author = {Le Zhang and Ponnuthurai Nagaratnam Suganthan},
year = {2017},
date = {2017-01-01},
journal = {IEEE Computational Intelligence Magazine},
volume = {12},
number = {4},
pages = {61--72},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {article}
}