Project Overview - Data Science and Artificial Intelligence Research

Project Experience

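🏥 Dynamic Causal Event Graph System for Medical Diagnosis (National Natural Science Foundation of China)

Period: September 2022 - June 2025

Description: To improve the timeliness and accuracy of diagnosis in primary care, I took part in developing a general-practice clinical decision-support system based on the Dynamic Uncertain Causality Graph (DUCG). I used Pandas and NumPy to clean and analyze a dataset on the order of ten thousand records with 64 features, then applied Scikit-learn machine-learning models to extract 21 key pathological factors (AUC 0.72). I also developed the DUCG dynamic probabilistic inference engine and designed a graph traversal algorithm that prunes the search using disease manifestations, bringing the diagnosis time for a single disease below 200 ms with a diagnostic accuracy of up to 99%. The system is currently in trial operation at Beijing Chaoyang Hospital, where it diagnoses diseases from real data. The results have been submitted to an SCI Zone 1 journal (under revision).

Tech Stack: Java, graph algorithms, MySQL

Responsibilities: Software system development; design and implementation of the inference algorithm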
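
The idea of pruning by disease manifestations can be illustrated with a small sketch. This is not the DUCG engine itself (which was written in Java); it is a simplified Python illustration in which the knowledge base, the priors, the leak probability, and the function names `prune_candidates` and `rank_diseases` are all hypothetical. Scoring uses a naive prior-times-likelihood product rather than DUCG inference.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: disease -> {symptom: P(symptom | disease)}.
CAUSES = {
    "flu": {"fever": 0.9, "cough": 0.7},
    "pneumonia": {"fever": 0.8, "cough": 0.8, "chest_pain": 0.6},
    "migraine": {"headache": 0.95},
}
# Hypothetical prior probabilities of each disease.
PRIORS = {"flu": 0.05, "pneumonia": 0.01, "migraine": 0.03}


def prune_candidates(observed):
    """Keep only diseases linked to at least one observed symptom."""
    # Invert the graph so the traversal walks from symptoms back to diseases.
    by_symptom = defaultdict(set)
    for disease, effects in CAUSES.items():
        for symptom in effects:
            by_symptom[symptom].add(disease)
    candidates = set()
    for symptom in observed:
        candidates |= by_symptom.get(symptom, set())
    return candidates


def rank_diseases(observed):
    """Score surviving diseases by prior * symptom likelihoods, then normalize."""
    scores = {}
    for disease in prune_candidates(observed):
        score = PRIORS[disease]
        for symptom in observed:
            # 0.01 is an assumed leak probability for causes the graph does not model.
            score *= CAUSES[disease].get(symptom, 0.01)
        scores[disease] = score
    total = sum(scores.values())
    return {d: s / total for d, s in scores.items()} if total else {}


if __name__ == "__main__":
    print(rank_diseases({"fever", "cough"}))
```

Because unrelated diseases are dropped before any probabilities are multiplied, the work per query scales with the diseases reachable from the observed symptoms rather than with the whole knowledge base.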

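🔒 Network Security Situational Awareness Data Analysis Project (Graduation Project)

Period: November 2022 - June 2024

Description: Proposed the Dynamic Uncertain Causality Attack Graph (DUCAG), which uses historical alert data to map attack paths and potential risks. Also designed a causal-chain-based node risk probability inference algorithm (CCRP) for node risk assessment. In a comparison with the traditional variable elimination algorithm on large-scale networks, CCRP's average running time was under 100 ms. The project has since secured funding from the National Natural Science Foundation of China (General Program).

Tech Stack: Python, MySQL, Bayesian networks, PyTorch

Outcomes: 4 papers (SCI Zone 2, Chinese core journal, and EI conference; student first author), 1 under submission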
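
To give a feel for propagating risk along causal chains in an attack graph, here is a minimal sketch. It is not the CCRP algorithm; it substitutes a plain noisy-OR combination evaluated in topological order, and it treats parent nodes as independent, which is only an approximation. The graph, the edge strengths, the base risks, and the name `node_risks` are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical attack graph: node -> {parent: P(node compromised | parent compromised)}.
EDGES = {
    "web_server": {},
    "app_server": {"web_server": 0.6},
    "database": {"app_server": 0.5, "web_server": 0.2},
}
# Hypothetical base risk derived from alerts on entry nodes.
BASE_RISK = {"web_server": 0.3}


def node_risks(edges, base_risk):
    """Propagate compromise probability from parents to children with noisy-OR."""
    # TopologicalSorter takes node -> predecessors, so parents are visited first.
    order = TopologicalSorter({n: set(p) for n, p in edges.items()}).static_order()
    risk = {}
    for node in order:
        # Probability that the node stays safe from every independent cause.
        safe = 1.0 - base_risk.get(node, 0.0)
        for parent, strength in edges[node].items():
            safe *= 1.0 - risk[parent] * strength
        risk[node] = 1.0 - safe
    return risk


if __name__ == "__main__":
    for name, value in node_risks(EDGES, BASE_RISK).items():
        print(f"{name}: {value:.4f}")
```

Each node is visited once and each edge is used once, so the pass is linear in the size of the graph. That is the kind of cost profile that makes chain-based propagation attractive next to variable elimination on large networks.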

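📚 Communication University of China Artificial Intelligence Course Textbook Manuscript Project

Period: November 2022 - present

Description: To turn artificial intelligence theory into practice, this project implements hands-on case studies in a PyTorch environment across three areas: causal inference, reinforcement learning, and swarm intelligence. For causal inference, the cases include predictive analysis of hotel room bookings, using Pandas for data cleaning and analysis and Scikit-learn with DoWhy to build the classification model and causal graph, reaching 90% accuracy on the test set. For reinforcement learning, the project reproduces algorithms such as AlphaZero and applies them to board games. For swarm intelligence, it implements and compares eight algorithms, including ant colony optimization and particle swarm optimization, mainly for problems such as path planning.

Tech Stack: Causal inference, swarm intelligence optimization algorithms, reinforcement learning

Responsibilities: Manuscript writing and algorithm implementation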

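🧮 Comparative Experiments on Swarm Intelligence Algorithms

Description: A comparative study of how several swarm intelligence algorithms perform on different optimization problems.

Algorithms Covered:

Particle Swarm Optimization (PSO)
Genetic Algorithm (GA)
Ant Colony Optimization (ACO)
Artificial Bee Colony (ABC)

Tech Stack: Python, Matplotlib, optimization algorithm libraries

Results: Guidance on choosing an algorithm for different types of optimization problems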
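
As an illustration of one of the algorithms listed above, the following is a minimal particle swarm optimization sketch that minimizes the sphere function. It uses only the standard library; the function names, the test function, and the parameter values (inertia 0.7, acceleration coefficients 1.5) are assumptions chosen for the example, not settings taken from the experiments.

```python
import random


def sphere(x):
    """Simple convex benchmark with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=2, particles=20, iterations=100, seed=0,
        inertia=0.7, cognitive=1.5, social=1.5, bound=5.0):
    """Minimize `objective` inside the box [-bound, bound]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (g_pos[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


if __name__ == "__main__":
    position, value = pso(sphere)
    print(position, value)
```

Swapping `sphere` for another objective with the same signature is enough to compare behavior across problems, which is how a benchmark harness for several algorithms would typically be organized.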

Perfect integration of theory and practice

🚀 Core Research Projects

🧠 Dynamic Uncertain Causality Graph Theory Research

Description: Development and application of Dynamic Uncertain Causality Graph (DUCG) models for causal modeling and inference in complex systems.

Application Domains:

Legal Domain - Evidence reasoning and case analysis
Finance - Risk assessment and investment decisions
Software Reliability - System fault diagnosis and prediction
Cybersecurity - Threat detection and response

Tech Stack: Python, NetworkX, Probabilistic reasoning algorithms

Achievements: Validated DUCG effectiveness across multiple real-world scenarios

🎮 Intelligent Algorithms & Game AI

♟️ AlphaZero Gomoku AI Implementation

Description: A high-level Gomoku AI system based on the AlphaZero algorithm, combining Monte Carlo Tree Search with deep neural networks.

Core Technologies:

Deep Reinforcement Learning
Monte Carlo Tree Search (MCTS)
Convolutional Neural Networks (CNN)
Self-play training

Tech Stack: Python, TensorFlow/PyTorch, NumPy

Highlights: The AI reaches a professional level of play through self-play training alone
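
The way MCTS and the neural network interact can be sketched through the child-selection step. The snippet below shows one common PUCT-style rule, in which the network's prior for a move steers exploration and the running mean value steers exploitation. The `Node` class, the `select_child` function, and the constant `c_puct=1.5` are hypothetical names and values for illustration, not the project's actual code.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                 # move probability from the policy network
    visits: int = 0              # number of simulations through this node
    value_sum: float = 0.0       # sum of backed-up values
    children: dict = field(default_factory=dict)

    def q(self):
        """Mean value of this node, 0 if it has never been visited."""
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node, c_puct=1.5):
    """Return the (move, child) pair with the highest Q + U score."""
    # Sibling visit total; floored at 1 so priors still matter before any visit.
    total = max(1, sum(child.visits for child in node.children.values()))

    def score(child):
        exploration = c_puct * child.prior * math.sqrt(total) / (1 + child.visits)
        return child.q() + exploration

    return max(node.children.items(), key=lambda item: score(item[1]))


if __name__ == "__main__":
    root = Node(prior=1.0)
    root.children = {
        "a": Node(prior=0.6, visits=10, value_sum=5.0),
        "b": Node(prior=0.4, visits=1, value_sum=0.2),
    }
    move, _ = select_child(root)
    print(move)
```

In the example, move "b" is selected even though "a" has the higher mean value, because "b" has been visited far less and its exploration bonus dominates.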

🃏 Texas Hold'em with CFR Algorithm

Description: A Texas Hold'em playing agent built with the Counterfactual Regret Minimization (CFR) algorithm.

Core Technologies:

Counterfactual Regret Minimization (CFR)
Game theory and Nash equilibrium
Information set abstraction
Iterative strategy optimization

Tech Stack: Python, Game theory algorithm libraries

Value: Demonstrates CFR's power in imperfect information games
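
The core update inside CFR is regret matching, applied at every information set. The sketch below shows that update alone, on rock-paper-scissors rather than Texas Hold'em, since a full poker implementation would not fit in a short example. Both players update from exact expected utilities, and the average strategy approaches the uniform Nash equilibrium. The function names and the biased starting regrets are assumptions for illustration.

```python
ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors
# PAYOFF[a][b] is the payoff to a player choosing a against an opponent choosing b.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS


def train(iterations=50000):
    """Run regret matching in self-play and return both average strategies."""
    # Player 0 starts biased towards rock so the dynamics are not trivially uniform.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in range(2):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(ACTIONS))
                for a in range(ACTIONS)
            ]
            expected = sum(strategies[player][a] * action_values[a]
                           for a in range(ACTIONS))
            for a in range(ACTIONS):
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for average in train():
        print([round(p, 3) for p in average])
```

It is the average strategy, not the final one, that converges, which is why `strategy_sums` is tracked separately from the regrets.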

📊 Data Analysis & Machine Learning

🔍 Machine Learning Explainability Research

Description: Interpretability analysis of machine learning models using the SHAP library, to make model behavior more transparent.

Core Technologies:

SHAP (SHapley Additive exPlanations)
Feature importance analysis
Model interpretability visualization
Decision path tracking

Tech Stack: Python, SHAP, Scikit-learn, XGBoost

Applications: Financial risk control, medical diagnosis, recommendation systems
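
SHAP attributions are Shapley values, and for a handful of features they can be computed exactly by enumerating feature coalitions. The sketch below does this in pure Python without calling the SHAP library. The function name `shapley_values`, the toy model, and the all-zero baseline are assumptions for illustration. Absent features are replaced by baseline values, which is one common convention.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attribution of predict(instance) relative to predict(baseline)."""
    n = len(instance)

    def value(coalition):
        # Features outside the coalition are set to their baseline value.
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_model(x):
    """Hypothetical model with two linear terms and one interaction."""
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]


if __name__ == "__main__":
    print(shapley_values(toy_model, [1, 2, 4], [0, 0, 0]))
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split evenly between the two features involved. Enumeration is exponential in the number of features, which is why practical tools rely on approximations or model-specific algorithms.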

📈 Causal Inference Framework Applications

Description: Causal inference analysis in real business scenarios using the DoWhy and YLearn frameworks.

Core Methods:

Instrumental Variables
Regression Discontinuity
Difference-in-Differences
Propensity Score Matching (PSM)

Tech Stack: Python, DoWhy, YLearn, EconML

Case Studies: Hotel business analysis, marketing effectiveness evaluation
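
Of the methods listed above, difference-in-differences is the simplest to show by hand. The sketch below computes the basic two-group, two-period estimate from group means using only the standard library, without DoWhy, YLearn, or EconML. The function name `did_estimate` and the booking figures are made up for illustration, and the estimate is only meaningful under the parallel-trends assumption.

```python
from statistics import mean


def did_estimate(rows):
    """Difference-in-differences from (treated, post, outcome) records.

    `rows` must be a list, since it is scanned once per group.
    """
    def cell(treated, post):
        return mean(y for t, p, y in rows if t == treated and p == post)

    treated_change = cell(True, True) - cell(True, False)
    control_change = cell(False, True) - cell(False, False)
    # The control group's change stands in for what the treated group
    # would have done without the intervention.
    return treated_change - control_change


# Hypothetical daily bookings before and after a marketing campaign.
ROWS = [
    (True, False, 10.0), (True, False, 12.0),    # treated, before
    (True, True, 18.0), (True, True, 20.0),      # treated, after
    (False, False, 9.0), (False, False, 11.0),   # control, before
    (False, True, 12.0), (False, True, 14.0),    # control, after
]

if __name__ == "__main__":
    print(did_estimate(ROWS))
```

With this data the treated group rises by 8 and the control group by 3, so the estimated effect of the campaign is 5 bookings per day.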

🏢 Industry Data Analysis Practice

🚗 Ride-hailing Platform Analysis

Analysis Dimensions: User behavior, supply-demand matching, pricing strategy, operational efficiency

Key Metrics: DAU, order completion rate, driver online rate, revenue optimization

📱 Short Video Platform Analysis

Analysis Dimensions: Content consumption, user profiling, recommendation effectiveness, creator ecosystem

Key Metrics: Watch time, engagement rate, retention rate, content quality assessment

📚 Technical Skills Showcase

Programming: Python (Expert), R (Proficient), SQL (Proficient), JavaScript (Intermediate)

ML/DL: TensorFlow, PyTorch, Scikit-learn, XGBoost, LightGBM

Data Processing: Pandas, NumPy, Dask, Spark

Visualization: Matplotlib, Seaborn, Plotly, Tableau

Causal Inference: DoWhy, YLearn, CausalML, EconML

Deep Learning: CNN, RNN, LSTM, Transformer, GAN

Reinforcement Learning: Q-Learning, Policy Gradient, Actor-Critic, AlphaZero