Finch Science
Neuroscience · Nature Communications 2025

Gene expression signatures from whole blood predict amyotrophic lateral sclerosis case status and survival

Using whole-blood RNA-seq and machine learning, the research team explores ALS diagnostic classification, survival prediction, and clues to disease pathways.

Workflow overview (infographic): whole-blood RNA-seq is converted into gene-expression features, which then feed ALS classification, prognostic stratification, pathway analysis, and candidate-drug prioritization.
1. Blood RNA readout

Whole-blood samples were profiled across more than 22,000 genes, capturing ALS-related immune, metabolic, and cellular-stress signals.

2. ALS expression signature

Thousands of genes were differentially expressed between ALS cases and controls; the team then filtered them into smaller candidate gene panels.

3. Classification and prognosis

Models such as XGBoost separated ALS cases from controls; combined with clinical variables, the gene features also helped stratify patients into shorter-, intermediate-, and longer-survival groups.

4. Disease biology link

Core blood genes overlapped with data from ALS neurons and postmortem spinal cord, supporting pathway analysis and computational ranking of drug candidates.

Study design
Whole-blood RNA-seq from 422 ALS cases and 272 healthy controls, with an independent external dataset used to validate the classification models.
Classification result
Multiple machine-learning models predicted ALS case-control status; the best external-test AUC was 0.894.
Prognosis result
Gene features plus clinical variables separated shorter-, intermediate-, and longer-survival groups in an external dataset better than clinical variables alone.
Biological signal
Pathway analysis detected ALS-relevant pathway signals in blood transcriptomes, and core genes overlapped with nervous-tissue datasets.

Important caution: This remains a research-stage biomarker pipeline. It shows that blood RNA signals have diagnostic and prognostic potential, but it is not yet a blood test that can replace clinical diagnosis; validation in larger, prospective cohorts that include ALS mimics is still needed.

ALS diagnosis is often delayed by months because early symptoms are easily confused with other conditions. This study turns whole-blood RNA-seq into gene-expression features, uses machine learning to recognize ALS cases, and combines gene signals with clinical data to stratify survival. The key message is not that a clinical blood test already exists, but that the blood transcriptome carries enough signal to support classification, prognosis, and disease-pathway analysis.


1. Background

ALS is a fatal neurodegenerative disease, with median survival often only 2 to 4 years after diagnosis. One major clinical problem is diagnostic delay: confirmation can take 5 to 15 months or longer. Existing candidate biomarkers such as neurofilament light chain reflect nerve damage, but are not specific enough because they can also rise in other neurological diseases.

2. Research question

If a single biomarker is not enough, can large-scale gene expression from blood cells reveal a molecular ALS signature that machine learning can recognize? And can those blood signals also help predict survival and point toward disease pathways or drug-candidate hypotheses?

3. Study design

The team profiled whole-blood RNA-seq from 422 ALS cases and 272 healthy controls, measuring more than 22,000 genes. They identified differentially expressed genes, trained multiple machine-learning models, and tested whether the models generalized to an independent external dataset.
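
As a rough illustration of the differential-expression step described above, the sketch below ranks genes by Welch's t statistic computed on log2 expression values. It is a simplified stand-in and not the study's method: RNA-seq analyses typically rely on count-based statistical models and multiple-testing correction, and the function names, gene names, and numbers here are all invented for the example.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances.

    Assumes each sample has at least two values and nonzero variance.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_genes(expr_cases, expr_controls):
    """Rank genes by |t|. Inputs map gene name -> list of log2 expression values."""
    rows = []
    for gene in expr_cases:
        cases, controls = expr_cases[gene], expr_controls[gene]
        log2_fc = mean(cases) - mean(controls)  # difference of log2 means
        rows.append((gene, log2_fc, welch_t(cases, controls)))
    # Largest absolute t first: strongest case-control separation on top.
    return sorted(rows, key=lambda row: abs(row[2]), reverse=True)


# Toy data (hypothetical genes and values, not taken from the study).
cases = {"GENE_A": [8.1, 8.4, 8.3, 8.6], "GENE_B": [5.0, 5.2, 4.9, 5.1]}
controls = {"GENE_A": [7.0, 7.2, 6.9, 7.1], "GENE_B": [5.1, 4.9, 5.0, 5.2]}

ranked = rank_genes(cases, controls)
for gene, log2_fc, t in ranked:
    print(f"{gene}: log2FC={log2_fc:+.2f}, t={t:+.2f}")
```

In this toy input GENE_A is higher in cases by about 1.3 log2 units and ranks first, while GENE_B shows essentially no difference. A screen like this would only be the first filter before any candidate gene panel is passed to a classifier.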

4. Main finding

The models accurately separated ALS cases from controls, with the best external-test AUC reaching 0.894. The study also combined gene-expression features with clinical variables to stratify shorter-, intermediate-, and longer-survival groups. Finally, core blood genes overlapped with ALS neuron and postmortem spinal-cord datasets, supporting pathway and drug-perturbation analyses.
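
To make the AUC figure easier to interpret, here is a minimal sketch of how ROC-AUC can be computed from first principles: it is the fraction of case-control pairs in which the case receives the higher model score, with ties counted as half. The labels and scores are invented toy values and have no connection to the study's data; `roc_auc` is a name chosen for this example.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random case outscores a random control.

    labels: 1 for case, 0 for control. Ties between scores count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three cases and three controls with hypothetical model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 7 of 9 case-control pairs are ordered correctly
```

Read this way, an external-test AUC of 0.894 means that in roughly 89% of case-control pairs drawn from the external dataset, the model scored the ALS case higher than the control.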

5. Biological meaning

This work treats blood not merely as a convenient sample, but as a window into system-level disease signals. Although ALS primarily damages motor neurons, immune, metabolic, and cellular-stress changes may leave analyzable traces in the blood transcriptome.

6. Limitations

This is not yet a ready-to-use diagnostic blood test. Future prospective studies, broader populations, ALS-mimic cohorts, and early suspected-ALS cases are needed before we know whether this approach can shorten clinical diagnostic delay and improve care.

ALS
Amyotrophic lateral sclerosis, a disease in which motor neurons degenerate, causing progressive muscle weakness.
Whole-blood RNA-seq
Sequencing RNA from whole-blood samples to measure how strongly each of many genes is currently expressed.
Differentially expressed genes
Genes whose expression differs significantly between ALS cases and controls.
XGBoost
A gradient-boosted machine-learning model often used for classification on complex tabular data.
ROC-AUC
A metric for a classifier's ability to discriminate between classes; values closer to 1 indicate better separation.
Survival stratification
Grouping patients by predicted survival risk into shorter-, intermediate-, and longer-survival groups.
Core genes
Overlapping genes that show disease-associated changes in both blood and nervous-system datasets.
Drug perturbation analysis
A computational analysis that uses databases to predict which drugs might reverse disease-associated gene-expression patterns.
Biomarker
A biological signal that can help indicate disease status, risk, or prognosis.
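
The "Core genes" entry above can be made concrete with a small sketch: intersect a blood gene list with a nervous-system gene list, then ask how likely an overlap of that size would be by chance using a hypergeometric upper-tail probability. This is one common way to assess gene-list overlap and is only assumed here for illustration; the source does not say which test the authors used. The gene names, list sizes, and universe size are hypothetical.

```python
from math import comb


def overlap_p_value(universe_size, set_a_size, set_b_size, overlap):
    """Hypergeometric upper tail: P(X >= overlap).

    Models drawing set_b_size genes at random from a universe in which
    set_a_size genes are 'marked', and counting how many marked genes appear.
    """
    total = comb(universe_size, set_b_size)
    upper = min(set_a_size, set_b_size)
    tail = sum(
        comb(set_a_size, k) * comb(universe_size - set_a_size, set_b_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical gene lists; a real analysis would use the measured gene universe.
blood_genes = {"G1", "G2", "G3", "G4"}
neural_genes = {"G3", "G4", "G5"}
universe_size = 20

core = sorted(blood_genes & neural_genes)
p = overlap_p_value(universe_size, len(blood_genes), len(neural_genes), len(core))
print(core, f"p = {p:.4f}")
```

A small p-value would suggest the overlap is larger than expected from random gene lists of the same sizes, which is the kind of evidence that makes a shared "core" set worth following up.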

Zhao, Y., Savelieff, M.G., Li, X. et al. Gene expression signatures from whole blood predict amyotrophic lateral sclerosis case status and survival. Nat Commun 16, 9631 (2025). https://doi.org/10.1038/s41467-025-64622-5

This page is an educational summary, not a translation of the original article; copyright remains with the original publisher.