马韬

马韬的个人简历

或许运气也是实力的一部分,
但我相信越努力一定越幸运!

就读本科院校 华中科技大学 985
所在城市 湖北 武汉
年级 22级本科生
论文 NeurIPS一作在投
Indin25 Conference一作在投
专利 一项实用新型专利在审
竞赛 两项国家级竞赛一等奖
一项国家级竞赛二等奖
众多省级奖项(详细看简历)
实习 笨AI公司-AI工程师(三个月)
Rank 16/121(13%)

计算机视觉
CNN、encoder、decoder

语义分割
deeplabv3、deeplabv3+、PSPnet

网络剪枝
结构化剪枝
filter剪枝、channal剪枝

github、git
Latex

医学图像处理
数字图像处理

注意力机制
Transformer

文章
01

置顶 题目待定
2022-09-14 15:43:23

题目待定

Currently, research on filter pruning primarily focuses on image classification tasks. While these methods have demonstrated remarkable performance in classification, their direct application to semantic segmentation has proven less effective. We attribute this limitation to two key factors: (1) Unlike classification, which mainly relies on semantic information, semantic segmentation requires rich spatial details. Existing pruning methods tend to overlook these spatial cues, making it difficult for pruned models to distinguish adjacent pixels and predict precise object boundaries. (2) The encoder-decoder architecture of segmentation models is inherently more complex. However, current pruning approaches lack an effective mechanism for determining layer-wise pruning ratios in segmentation models, leading to excessive pruning of critical layers and the loss of essential semantic or spatial information.

Question:Unlike classification, which mainly relies on semantic information, semantic segmentation requires rich spatial details. Existing pruning methods tend to overlook these spatial cues, making it difficult for pruned models to distinguish adjacent pixels and predict precise object boundaries.

Inspired by DecoupleSegNet and related works, we decouple semantic context from spatial information by applying Fourier Transform to feature maps, enabling separate assessments in the frequency domain. High-frequency components contain fine-grained details such as textures and edges, which are crucial for delineating object boundaries, whereas low-frequency components capture large-scale structures and global context, aiding in object classification. Based on this decomposition, we rank the importance of different channels within each layer and apply pruning accordingly.

Question:The encoder-decoder architecture of segmentation models is inherently more complex. However, current pruning approaches lack an effective mechanism for determining layer-wise pruning ratios in segmentation models, leading to excessive pruning of critical layers and the loss of essential semantic or spatial information.

Furthermore, leveraging concepts from subgraph approximation in graph theory, we assess the relative importance of different layers and dynamically allocate pruning ratios to adapt to the hierarchical structure of segmentation models. (MORE DETAILS maybe.) Since this is an NP-hard problem, conventional algorithms cannot provide an optimal solution in polynomial time. To overcome this, we propose an efficient pruning algorithm to accelerate the pruning process. Extensive experimental results demonstrate that our approach consistently achieves state-of-the-art performance across multiple semantic segmentation datasets and diverse model architectures.

文章
02

Indin25 Conference一作在投
2025-03-14

Bone Age Prediction using a Convolutional Neural Network-based Regression Algorithm employing Attention-Directing and Cluster

Bone age assessment (BAA) is a critical research topic in pediatric radiology, with growing interest in developing automated BAA methods. This study proposes a bone age prediction model integrating cluster analysis and convolutional neural network (CNN) regression, further enhanced by a multi-scale attention mechanism to construct a “divide-and-focus” dual-driven deep learning framework.

Innovation:A Hybrid Clustering-Regression Architecture: A pioneering multi-constrained K-means clustering algorithm generates data on age-specific subsets, enhancing recognition of developmental stages via subgroup-specific modeling.
Innovation:The Adaptive Spatial Attention Module (ASAM): This novel component detects saliencies in attention-guided regions of interests (ROIs), enabling hierarchical enhancement of anatomical structures via spatial-channel-mediated attention fusion.
Innovation:A Bayesian-optimized Cross-subset Ensemble Strategy: This probabilistic framework uses evidence-based weighting to aggregate dynamically the predictions made by base regression models including ResNet, DenseNet, and EfficientNetV2. This ensures optimal age estimation across the developmental phases.
© 2022 wttandroid