There’s a Dataset Hiding in Your LIS. It’s Been There for Years.
In 2014/15, there was a clinical chemistry fellow at my job who was doing something I had never seen before. He had pulled a dataset of free-text cancellation comments out of the LIS and was running analysis on them using the R programming language. He tried to explain it to me, but I had no idea what he was doing or what any of it meant. I asked questions. I understood nothing. But I was thoroughly intrigued.
Intrigued enough that I installed R on my laptop. Intrigued enough to find a local meetup group called R-Ladies and try to figure out what this programming language even was. I was still largely clueless at the end of all of it, but something about watching him analyze free-text comments would not leave me alone. It sent me down a path I am still following.
What he was trying to solve was a real problem. When technologists document a test cancellation, they do not pick from a dropdown. They type. So one person writes “clotted, QNS” and another writes “specimen appears hemolyzed, unable to process” and another writes “wrong tube received, cannot run test,” and technically all of that goes into the same pile of unstructured text. If you want to know how many cancellations last quarter were due to specimen quality, you cannot run a count query. You have to read them, or figure out how to teach a computer to read them for you.
That is exactly what he was doing. He just had to know R to do it.
I have been thinking about that season a lot lately.
What Was Actually Happening Under the Hood
What the fellow was doing is called natural language processing, or NLP. It is the branch of artificial intelligence concerned with getting computers to understand human language, the messy kind, the kind people actually type when no one is standardizing their input. I did not know that term in 2014. I picked it up along the way, slowly, through meetups and YouTube tutorials and eventually a doctoral program.
Some of the most foundational NLP techniques involve normalizing text so a computer can recognize that different word forms mean the same thing. Stemming trims a word down to a crude root by stripping its endings. Lemmatization does something similar but more intelligently, using vocabulary and grammar to map “canceled,” “canceling,” and “cancels” back to the dictionary form “cancel.” Once you apply these techniques to a column of free text, the computer can start to see patterns that would be invisible to a simple keyword search.
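To make the difference concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `crude_stem` is a toy suffix stripper, not the Porter algorithm and not what the fellow ran, and `LEMMAS` is a small hand-written lookup table standing in for the dictionary a real lemmatizer ships with.

```python
# Toy suffix-stripping stemmer: illustration only, not the Porter algorithm.
SUFFIXES = ("ation", "ing", "ed", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Collapse a doubled final consonant, e.g. "cancell" -> "cancel".
            if len(stem) > 3 and stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    return word


# Hand-written lookup table standing in for a real lemmatizer's dictionary.
LEMMAS = {
    "canceled": "cancel",
    "canceling": "cancel",
    "cancels": "cancel",
    "ran": "run",
    "tubes": "tube",
}


def lemmatize(word):
    """Look the word up; fall back to the lowercased word if it is unknown."""
    word = word.lower()
    return LEMMAS.get(word, word)


for w in ["canceled", "canceling", "cancellation", "clotted", "ran"]:
    print(f"stem({w}) = {crude_stem(w)}")

for w in ["canceled", "canceling", "cancels", "ran"]:
    print(f"lemma({w}) = {lemmatize(w)}")
```

The stemmer gets “canceled,” “canceling,” and “cancellation” down to the same root by brute force, but it has no answer for an irregular form like “ran.” A lookup handles that, at the cost of needing a dictionary behind it.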
From there, you build what researchers call features: variables the algorithm can use to group similar comments together. “Hemolyzed,” “hemolysis,” “1+ hemolysis noted,” and “specimen hemolyzed” all end up in the same bucket. “QNS,” “quantity not sufficient,” “not enough volume,” and “too short to run” end up in another. What looked like a column of noise becomes a structured dataset you can count, chart, and report on. What that fellow was doing in our lab was legitimate data science, and at the time it was genuinely ahead of what most clinical labs were thinking about.
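A rough sketch of what features can look like in practice, using the example comments from the paragraph above. The patterns in `FEATURE_PATTERNS` are hand-written keyword rules invented for illustration; they are a simpler stand-in for features a model would learn, and a real lab would need to tune them against its own comments.

```python
import re
from collections import Counter

# Hypothetical keyword patterns; tune these against your own comments.
FEATURE_PATTERNS = {
    "hemolysis": re.compile(r"hemoly[sz]"),
    "qns": re.compile(r"\bqns\b|quantity not sufficient|not enough volume|too short"),
}


def extract_features(comment):
    """Turn one free-text comment into a dict of yes/no features."""
    text = comment.lower()
    return {name: bool(pattern.search(text)) for name, pattern in FEATURE_PATTERNS.items()}


comments = [
    "Hemolyzed",
    "hemolysis",
    "1+ hemolysis noted",
    "specimen hemolyzed",
    "QNS",
    "quantity not sufficient",
    "not enough volume",
    "too short to run",
]

# Count how many comments switch each feature on.
counts = Counter()
for comment in comments:
    for name, present in extract_features(comment).items():
        if present:
            counts[name] += 1

for name, n in counts.most_common():
    print(f"{name:10} {n}")
```

Each comment becomes a small row of true/false values, and a row of values is something you can count. With these eight comments, both buckets come out to four.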
The Research Has Caught Up
A comprehensive 2025 review in Computer Science Review (Alafari, Driss, and Cherif) analyzed 27 peer-reviewed studies on NLP in healthcare, covering research published from 2019 to 2023. The review organized NLP applications in healthcare into four major task categories: prediction and detection, text analysis and modeling, information processing, and clinical decision support and disease diagnosis.
Across all of those studies, the most common data source was electronic health records, which accounted for just over half of the datasets studied. The techniques the fellow was using in our lab, which the researchers classify as traditional machine learning methods (text classification and clustering among them), remain the most widely used approach in the published research. More advanced transformer-based models like BERT and GPT are present in the literature but still underdeployed in healthcare settings.
What that tells me is that the field has validated the basics and is now building on them. NLP has demonstrated real-world performance for risk prediction, clinical decision support, named entity recognition in clinical notes, and information extraction from radiology and pathology reports. Researchers have even shown validity for NLP-based adverse event detection from EHR data, an area of active investigation across healthcare broadly. The foundation is solid. The tools are catching up fast.
Your Lab Generates More Useful Free Text Than Almost Any Department in the Hospital
This is worth sitting with for a moment.
Every test cancellation comment your technologists have ever typed is sitting in your LIS. Every note a phlebotomist or tech wrote on a specimen rejection. Every free-text field completed during a critical value callback. Every narrative in an incident report. Every remark a supervisor entered during QC when something looked off before the run failed.
None of that is structured. None of it is queryable with a count or an aggregate. All of it has signal in it.
The 2025 review found that 52 percent of the research it covered drew primarily from EHR data. Clinical labs are generating EHR-adjacent free text constantly, and most of it sits completely unanalyzed because doing something with it has historically required either a clinical chemistry fellow who knows R or a data scientist who is not on your staffing model. That used to be the wall. I am here to tell you the wall has come down considerably.
Ten Ways NLP Can Work in Your Lab Today
The following ten use cases map directly to data sources that exist in most clinical lab systems right now. These are not theoretical. Every one of them involves text your lab is already producing.
- Cancellation reason analysis. Cluster free-text cancellation comments into categories like QNS, hemolysis, wrong tube, and patient unavailable. Find out which pre-analytical failure mode is actually driving your cancellation rate, not just what the code says.
- Specimen rejection comment mining. Your coded rejection reasons and your technologist notes often tell different stories. NLP can surface what the free text says versus what the dropdown captured, and the gap between those two things is where process improvement lives.
- Critical value callback documentation review. Analyze free-text callback notes for patterns in uncommunicable results, repeated contact attempts, or documentation language that may not meet your accreditation requirements.
- QC comment analysis. Mine technologist notes entered during QC events for language that tends to appear before an out-of-control result or an instrument flag. This turns reactive documentation into a predictive signal.
- Incident and near-miss report theme extraction. Surface recurring root cause themes from narrative incident reports without reading every one manually. Identify whether a cluster of events shares underlying language about staffing, process, or equipment.
- Turnaround time complaint categorization. Group incoming complaints or call log notes by topic to identify which test types, shifts, or workflows are generating the most friction for providers or patients.
- Pathology report information extraction. Pull structured data including diagnosis terms, margin status, and lymph node involvement from narrative pathology reports automatically. This is what researchers call named entity recognition, and it is one of the most studied NLP applications in healthcare right now.
- Order comment analysis. Extract patterns from providers' free-text order comments to find frequently missing clinical information, common urgent justifications, or test-specific clinical contexts that affect result interpretation.
- Patient portal message categorization. Group lab-related patient messages by topic for faster triage and to identify which results are generating the most follow-up questions.
- Competency assessment comment analysis. Analyze free-text evaluator comments on competency records to identify skill gap patterns across staff, shifts, or time periods. This one does not require any patient data at all.
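To give one of these a concrete shape, here is a minimal sketch of the pathology report item: pulling margin status and lymph node counts out of narrative text. It uses plain rule-based pattern matching, which is a much simpler stand-in for a trained named entity recognition model. The report text, the two patterns, and the field names are all invented for illustration, and real reports are worded in far more ways than two patterns can cover.

```python
import re

# Hypothetical report text, invented for illustration.
report = (
    "Invasive ductal carcinoma, grade 2. "
    "Surgical margins are negative for carcinoma. "
    "Metastatic carcinoma identified in 2 of 14 lymph nodes."
)

# Hypothetical patterns; real reports phrase these findings in many other ways.
MARGIN_RE = re.compile(r"margins?\s+(?:is|are)\s+(negative|positive)", re.IGNORECASE)
NODES_RE = re.compile(r"(\d+)\s+of\s+(\d+)\s+lymph nodes", re.IGNORECASE)


def extract_fields(text):
    """Return margin status and node counts, or None where nothing matched."""
    fields = {"margin_status": None, "nodes_positive": None, "nodes_examined": None}
    margin = MARGIN_RE.search(text)
    if margin:
        fields["margin_status"] = margin.group(1).lower()
    nodes = NODES_RE.search(text)
    if nodes:
        fields["nodes_positive"] = int(nodes.group(1))
        fields["nodes_examined"] = int(nodes.group(2))
    return fields


print(extract_fields(report))
```

Anything the patterns miss comes back as `None` rather than a guess, which makes the gaps easy to spot and review by hand.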
You Do Not Need to Know R Anymore
This is the part I find genuinely exciting.
The barrier to NLP ten years ago was the technical layer. You needed to know R or Python, understand the libraries, write the code, debug it, and interpret the output. That took either specialized training or a patient fellow who had the skills and the time to share them. Most labs had neither.
That barrier has not just lowered. For many use cases, it has effectively disappeared.
If your data is already de-identified, you can paste a column of free-text comments directly into Claude, ChatGPT, or Gemini and ask it to group them by theme and count each category. You will get a usable result in under a minute, with no code, no installation, and no tutorial.
If your data is not de-identified, or you want a repeatable process you can run every month, the AI can write you Python code to run locally in a Jupyter notebook. You describe the file and what you want in plain English, without pasting any of the data itself, the AI produces the script, and you run it in an environment that keeps your patient data private and on-premises. That way the data never leaves your machine. The AI does the programming. You do the interpretation.
What the fellow spent significant time building in 2014 and 2015, you can now describe in one sentence to a tool on your phone.
Try This Prompt
Here is a starting point you can use this week. Pull a sample of de-identified cancellation comments from your LIS, even just 50 to 100 rows, and bring them into any of the major AI tools. Then use this prompt:
I have a list of de-identified lab test cancellation comments from our laboratory information system. Please read through these comments, identify the main reasons for cancellation, group them into meaningful categories, and give me a table showing each category name, a few example comments from that group, and the total count. Here are the comments, one per line: [paste your list here]
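If your export is a CSV file, a few lines of Python can pull the sample and lay it out one comment per line, ready to paste. This is only a sketch: the function name `sample_comments`, the column name `comment_text`, and the demo rows are assumptions, and it does no de-identification of its own, so that step still has to happen first.

```python
import csv
import io
import random


def sample_comments(csv_file, column="comment_text", n=100, seed=0):
    """Return up to n non-empty comments from one column of an open CSV file."""
    reader = csv.DictReader(csv_file)
    comments = [(row.get(column) or "").strip() for row in reader]
    comments = [c for c in comments if c]
    if len(comments) <= n:
        return comments
    # A fixed seed gives the same sample every time the cell is re-run.
    return random.Random(seed).sample(comments, n)


# Demo with an in-memory file. For a real export you would use something like:
#   with open("your_export.csv", newline="", encoding="utf-8") as f:
#       comments = sample_comments(f)
demo = io.StringIO(
    'comment_text\n"clotted, QNS"\nspecimen appears hemolyzed\nwrong tube received\n'
)
print("\n".join(sample_comments(demo, n=2)))
```

The output is a plain block of text with one comment per line, which is the shape the prompt above asks for.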
If you want a repeatable process that runs on your full dataset locally, try this version instead:
I have a CSV file called cancellation_comments.csv with a column named “comment_text” containing free text lab test cancellation notes from our LIS. Please write Python code I can run in a Jupyter notebook or VS Code to: load the file, clean and normalize the text, use a clustering method to group the comments into meaningful categories, and produce a summary table and bar chart showing each category and its count. I am not an experienced programmer, so please include brief comments explaining what each section of the code does.
The second version keeps your data local. The AI writes the code. You run it on your machine.
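For a sense of what that kind of script does, here is a deliberately small sketch that runs with nothing but the Python standard library. It is not what any particular AI tool will hand you; a generated script would more likely lean on libraries such as pandas and scikit-learn. The technique here is a simple single-pass clustering on word overlap, and the stopword list, the six-letter truncation, the 0.3 threshold, and the sample rows are all assumptions chosen for illustration.

```python
import csv
import io
import re

# Stand-in for a real file; in practice you would open your CSV export instead.
CSV_TEXT = '''comment_text
hemolyzed
specimen hemolyzed
1+ hemolysis noted
QNS
"clotted, QNS"
QNS per tech
clotted
wrong tube received
wrong tube
'''

# Small hand-picked stopword list (an assumption; tune it for your own comments).
STOPWORDS = {"a", "the", "to", "for", "of", "per", "specimen", "sample", "test", "noted"}


def normalize(comment):
    """Lowercase, drop punctuation and stopwords, truncate tokens to 6 letters."""
    tokens = re.findall(r"[a-z0-9]+", comment.lower())
    return frozenset(t[:6] for t in tokens if t not in STOPWORDS)


def jaccard(a, b):
    """Share of tokens two comments have in common, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(comments, threshold=0.3):
    """Each comment joins the most similar cluster leader or starts a new cluster."""
    clusters = []  # list of (leader_tokens, members)
    for comment in comments:
        tokens = normalize(comment)
        scored = [(jaccard(tokens, leader), i) for i, (leader, _) in enumerate(clusters)]
        best_score, best_idx = max(scored, default=(0.0, -1))
        if best_score >= threshold:
            clusters[best_idx][1].append(comment)
        else:
            clusters.append((tokens, [comment]))
    return clusters


comments = [row["comment_text"] for row in csv.DictReader(io.StringIO(CSV_TEXT))]
clusters = cluster(comments)

# Summary table with a text bar chart, largest category first.
for _, members in sorted(clusters, key=lambda c: len(c[1]), reverse=True):
    print(f"{members[0]:22} {len(members):3} {'#' * len(members)}")
```

Each category is labeled with the first comment that landed in it. Notice that “clotted, QNS” ends up in only one group even though it names two problems, which is exactly the kind of judgment call you still have to review yourself.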
I have been thinking about that year a lot lately. There was something real happening in that lab in 2014 and I could not access it. The technology was real but the barrier was high enough that most of us could not climb it, and I spent years being on the outside of it. That gap has closed in a way that would have seemed implausible when I was sitting in a meetup group with a freshly downloaded copy of R on my laptop, trying to figure out what any of it meant.
The lab data is still there. The free text is still sitting in your LIS, unanalyzed, waiting. The difference now is that you have the tools to do something with it, and you do not need a fellowship in clinical chemistry to get started.
If you have tried any kind of text analysis on your lab data, or if you have a use case I did not cover, I want to hear about it. Drop a note down in the comments and tell me what you are working on.
See you next week.
Meredith.

