Pilot study presented on the clinical reliability of neurofilament light chain as a biomarker
At the German Congress of Laboratory Medicine (DKLM) 2025 in Leipzig, a pilot study on external quality assurance for the measurement of neurofilament light chain was presented. The biomarker is considered promising for detecting neuronal damage in diseases such as multiple sclerosis, Alzheimer’s disease and amyotrophic lateral sclerosis. However, the lack of standardized measurement methods, variability in sample preparation, differing assay platforms and missing calibration standards hinder clinical applicability and the comparability of results between laboratories.
The study, conducted by the Foundation for Pathobiochemistry and Molecular Diagnostics – Reference Institute for Bioanalytics in Bonn and the Institute of Clinical Chemistry of the Medical Faculty Mannheim of the University of Heidelberg, analyzes the analytical variability in an external quality assurance program. Lyophilized human serum served as the matrix and was prepared at two concentration levels: low at about 30 picograms per milliliter and high at about 500 picograms per milliliter. Homogeneity was checked according to international standards. The program runs twice a year, is not subject to mandatory guidelines, and is open for registration via a website. Samples are shipped at room temperature and stored at two to eight degrees Celsius. The test phase lasts twelve days, and results are submitted online. Samples are reconstituted with distilled water and then incubated for half an hour at room temperature in the dark. Short-term storage is at two degrees Celsius; long-term storage is at minus 20 degrees Celsius after a single freeze.

Twenty-five laboratories took part in the 2025 round, 18 of them from Germany and seven from other European countries. The overall success rate was reported as 88 percent; by sample, it was 64 percent for the high concentration and 88 percent for the low concentration. The mean for the high sample was 527 picograms per milliliter, with a standard deviation of 229 and a coefficient of variation of 43.4 percent. The mean for the low sample was 32 picograms per milliliter, with a standard deviation of 9.84 and a coefficient of variation of 30.8 percent. Target values were based on the median of each method-kit combination where at least four results were available, and otherwise on the method median or the overall median. The permissible deviation was plus or minus 30 percent.
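The evaluation scheme described above can be illustrated with a short Python sketch. It is not the organizers' implementation. The function names, the record layout and the example values are hypothetical, and applying the minimum of four results at the method level as well is an assumption, since the report only states it for the method-kit combination. The coefficient of variation is computed as the standard deviation divided by the mean, which approximately reproduces the reported figures (small differences are expected because the published means are rounded).

```python
from statistics import median

MIN_RESULTS = 4    # minimum group size stated for method-kit medians
TOLERANCE = 0.30   # permissible deviation of plus or minus 30 percent


def coefficient_of_variation(mean_value, sd):
    """Return the coefficient of variation in percent (SD / mean * 100)."""
    return 100.0 * sd / mean_value


def target_value(result, all_results):
    """Choose a consensus target: kit median, then method median, then overall median.

    Assumption: the same minimum group size is required at the method level.
    """
    same_kit = [r["value"] for r in all_results
                if r["method"] == result["method"] and r["kit"] == result["kit"]]
    if len(same_kit) >= MIN_RESULTS:
        return median(same_kit)
    same_method = [r["value"] for r in all_results
                   if r["method"] == result["method"]]
    if len(same_method) >= MIN_RESULTS:
        return median(same_method)
    return median(r["value"] for r in all_results)


def passes(value, target):
    """True if the value lies within plus or minus 30 percent of the target."""
    return abs(value - target) <= TOLERANCE * target


# Hypothetical results for the high sample (pg/mL); not data from the study.
results = [
    {"lab": 1, "method": "M1", "kit": "K1", "value": 480.0},
    {"lab": 2, "method": "M1", "kit": "K1", "value": 500.0},
    {"lab": 3, "method": "M1", "kit": "K1", "value": 520.0},
    {"lab": 4, "method": "M1", "kit": "K1", "value": 700.0},
    {"lab": 5, "method": "M2", "kit": "K2", "value": 300.0},
]

for r in results:
    t = target_value(r, results)
    print(r["lab"], t, passes(r["value"], t))

# Reported summary statistics from the 2025 round
print(coefficient_of_variation(527, 229))   # close to the reported 43.4 percent
print(coefficient_of_variation(32, 9.84))   # close to the reported 30.8 percent
```

In this made-up data set, laboratory 5 is the only user of its kit and method, so it is assessed against the overall median. This shows how a platform with a systematic offset can fail as a group when too few peers are available for a method-specific target.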
Success rates varied by method and manufacturer: electrochemiluminescence from one manufacturer achieved zero percent, chemiluminescent enzyme immunoassays and digital ELISA each achieved 100 percent, two direct chemiluminescence systems achieved 100 percent and 92 percent, and other methods achieved 100 percent. Graphs showed clear platform-dependent clusters, with greater heterogeneity for the high sample. This was attributed to different handling of the upper measurement range, such as dilutions or thresholds.
The discussion highlights platform-dependent systematic deviations that prevent results from being interchangeable. Possible causes include differences in calibration, antibody detection, fragment coverage and matrix effects, in addition to individual handling errors. The high-concentration sample showed greater dispersion, while the diagrams indicate consistent differences between methods with low intra-laboratory variability. Because reference methods are lacking, it is impossible to determine how close results are to the true value; the medians used are consensus values. Ongoing initiatives aim to establish reference methods and commutable materials. One manufacturer showed strong negative deviations, possibly due to matrix interactions.
In summary, serum values for neurofilament light chain are currently not transferable between platforms, so method-specific reference ranges and cut-offs are required. Clinical studies must document the methods used, and meta-analyses must provide for conversions between them. When changing methods, bridging studies with patient samples are essential. The priority is analytical harmonization through reference methods and adapted calibrations. Until then, strict quality controls, transparent method reporting and careful age-appropriate interpretation are recommended. The program is planned to expand into an educational format with disease classification based on patient data and metrics, linking methodological evaluation with clinical interpretation.




