Tuesday, June 23, 2020

How Workflow Bottlenecks Are Choking the AI Deployment Tsunami.


The introduction of AI in medical imaging could not have come at a better time than during the COVID-19 pandemic, as AI applications for detection, diagnosis and acquisition support, especially when using telemedicine, have proven invaluable in managing these patients both at healthcare institutions and at home. There are a couple of caveats, however, to using this new technology: first, regulatory constraints limit the availability of new AI algorithms because the FDA needs to catch up with approvals; second, as with any deep learning algorithm, AI for healthcare needs lots of data to train on, which is a limiting factor for COVID cases even though several hospitals are making their COVID patient data files publicly available. Despite these limitations, institutions are ready to deploy AI for this particular use case, together with other applications that have been identified and are addressed by literally hundreds of companies developing these novel applications.

However, early implementations of AI have run into a major obstacle: how to integrate it into the workflow. It has caused a true “traffic jam” of data to be routed to several algorithms, and of results from these AI applications, in the form of annotations, reports, markers, screen saves and other indications, to be routed to their destinations such as the EMR, PACS, reporting systems or viewers. This orchestration has to be synchronized with other information flows; for example, an AI result has to be available either before or at the time the imaging study is reported, and has to be available together with lab or other results, which might require delaying or queuing these other, non-AI information flows to be effective.
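To make the queuing idea concrete, here is a minimal Python sketch, purely illustrative and with made-up function and field names, of a conductor holding a study until its AI result arrives before releasing both to the reporting system:

```python
# Illustrative only: hold a study until its AI result arrives, then release both together.
from datetime import datetime, timedelta

MAX_WAIT = timedelta(minutes=10)   # assumed cut-off before releasing without an AI result
pending = {}                       # accession number -> {"study": ..., "received": datetime}

def send_to_reporting(study, ai_result):
    """Hypothetical sink; in reality a DICOM/HL7 interface to the reporting system."""
    print(f"released {study} with AI result: {ai_result}")

def on_study_arrival(accession, study):
    """Queue the study instead of forwarding it straight to reporting."""
    pending[accession] = {"study": study, "received": datetime.utcnow()}

def on_ai_result(accession, result):
    """Release the study together with its AI result as soon as the result arrives."""
    entry = pending.pop(accession, None)
    if entry:
        send_to_reporting(entry["study"], result)

def flush_expired():
    """Release studies whose AI result never arrived within MAX_WAIT."""
    now = datetime.utcnow()
    for acc in [a for a, e in pending.items() if now - e["received"] > MAX_WAIT]:
        send_to_reporting(pending.pop(acc)["study"], None)

on_study_arrival("ACC001", "CT CHEST study")
on_ai_result("ACC001", "nodule detected, 6 mm")
```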

What is needed to manage this is an AI “conductor” that orchestrates the flow of images, results and reports between all the different parties, such as modalities, reporting systems, the EMR, and obviously the AI applications, the latter of which could be on-premise or in the cloud. Note that the number of AI apps could eventually reach into the hundreds if you take into account that an algorithm might be modality specific (CT, MR, US, etc.) and specialized for different body parts and/or diseases. Scalability is therefore a key requirement of this critical device, but there are many other required features as well.

A simple “DICOM router” will not be able to orchestrate this rather complex workflow. To assist users with identifying the required features, I created three levels of routers as shown in the figure.

Level 1 can do simple forwarding and multiplexing, provides queue management, and has a simple rules engine to determine what to send where.
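As an illustration of what such a rules engine does, here is a minimal Python sketch; the destination AE titles and matching criteria are made up, and a real router would read its rules from configuration rather than code:

```python
# Minimal level-1 routing rules engine (illustrative only).
RULES = [
    # (match function, list of destination AE titles) -- destinations are examples
    (lambda ds: ds.get("Modality") == "CT" and "CHEST" in ds.get("BodyPartExamined", ""),
     ["AI_CHEST_CT", "PACS"]),
    (lambda ds: ds.get("Modality") == "MG", ["AI_BREAST", "PACS"]),
    (lambda ds: True, ["PACS"]),  # default rule: everything goes to the PACS
]

def destinations(ds):
    """Return the destinations for a study, using the first matching rule."""
    for match, dests in RULES:
        if match(ds):
            return dests
    return []

# Example: a simplified header represented as a plain dict
print(destinations({"Modality": "CT", "BodyPartExamined": "CHEST"}))  # ['AI_CHEST_CT', 'PACS']
```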

The second level adds features: it can perform “fuzzy routing,” i.e. routing based on fuzzy logic, prefetch information using proxies (i.e. querying multiple sources while providing a single return), convert data and file formats, anonymize the data, and it is scalable.

The third level has all of the level 1 and 2 functionality and extends it to AI-specific routing: it can modify image headers and split studies, act as a worklist proxy (i.e. query multiple worklists while appearing as a single source), and provide secure connectivity to meet “zero-trust” requirements. It supports not only “traditional” DICOM and HL7 but also web services such as WADO and FHIR, and it supports the applicable IHE profiles. It can also perform static and dynamic routing, filter and normalize the data, and handle several different formats, including Structured Reports and annotations, to name a few. As a matter of fact, a fully featured AI conductor requires at least 25 distinctly different functions, as described in detail in this white paper (link).
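As a small example of the header modification and anonymization functions mentioned above, the sketch below uses the open source pydicom library; it is only a fragment, not a complete de-identification profile (a real conductor would follow DICOM PS3.15 Annex E), and the file names are placeholders:

```python
# Hedged sketch: header modification and basic de-identification with pydicom.
from pydicom import dcmread

ds = dcmread("input.dcm")            # any DICOM object received by the router
ds.remove_private_tags()             # drop vendor private tags
ds.PatientName = "ANON"              # overwrite direct identifiers
ds.PatientID = "STUDY0001"
for kw in ("PatientBirthDate", "PatientAddress", "OtherPatientIDs"):
    if hasattr(ds, kw):
        delattr(ds, kw)
# example of a header modification: tag the study so downstream systems know it was routed
ds.StudyDescription = (ds.get("StudyDescription", "") + " [AI-ROUTED]")[:64]
ds.save_as("output.dcm")
```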

In conclusion, there is a serious workflow issue in deploying AI, but the good news is that there are solutions available, some in the public domain with limited features and some as commercial products. Make sure you know what you need before shopping around; the comprehensive white paper on this subject has a handy checklist you can use when you are shopping at your (virtual) HIMSS, SIIM or RSNA trade shows or when “Zooming” with your favorite vendor. You can download the white paper here.




Friday, April 17, 2020

Open Source PACS solutions for LMIC regions.

Students at a PACS bootcamp in Tanzania, sponsored by RAD-AID

Using an open source PACS solution instead of a commercial PACS can be attractive in LMICs (Low and Middle Income Countries), as it provides a good start for gaining experience with managing digital medical images at a relatively low entry cost. In this post we’ll discuss the PACS features that open source providers can offer, implementation strategies, and lessons learned.

Why would someone want to use an open source PACS?

·         The most important reason is its lower cost: it is free (kind of), i.e. there are no software and/or licensing fees. The exception could be the operating system and, if applicable, other utilities such as a database, but these can be open source products as well, for example Linux or one of its variants and an open source database. There is still a significant cost for the hardware, i.e. servers, PCs, medical grade monitors for the radiologists, and the network infrastructure, i.e. cabling, routers and switches. The latter assumes that there is no reliable network in place, which is often the case in LMICs; therefore, a dedicated network is often a requirement.

·         An open source PACS allows an organization to find out what it needs as it changes from hardcopy film to a digital environment, with which it often has no experience and/or exposure. As many open source PACS systems have both a free and a commercial version, it is easy to migrate at a later date to the paid version, which provides upgrades and support, once the organization feels comfortable with the vendor.

·         This is not only applicable to LMIC regions: an open source PACS can also be used to address a missing feature in your current system. For example, it can be used as a DICOM router.
·         The open source PACS can function as a free back-up in case the commercial production PACS goes down during scheduled or unscheduled downtime.

·         It can be used as a “test-PACS” for troubleshooting, diagnostics and training.

But the main reason is still the cost advantage. If an LMIC hospital has to choose between purchasing a used CT or MRI for, let’s say, US $350k, which could have a major impact on patient care as it might be the only one in a large region serving a big population, and investing in a PACS system, the choice is clear: they will first get the modality and then use maybe another $50k or so to buy the hardware (servers, PCs and monitors), string cable to get a network in place, and install an open source PACS. One should also be aware that the argument of not having any vendor support for an open source PACS is grossly overrated. I have seen some good dealers and support but also some very poor service engineers, so even if you were to use a commercial PACS, the chance that you get any decent support in an LMIC region is often slim.

Let’s now talk about the PACS architecture, as there is a difference between a “bare-bones” PACS (BB-PACS), a typical PACS (T-PACS) and a fully featured PACS (FF-PACS). This is important because in many cases you might only need a BB-PACS to meet the immediate needs of an LMIC hospital or clinic.

A T-PACS takes in images from different modalities, indexes them in a database (aka the Image Manager), archives them in such a way that they can be returned to users, and provides a workflow manager that allows multiple radiology users to simultaneously access the studies using different worklist criteria. For example, the workflow manager allows studies to be filtered by specialty (neuro, pediatrics) and/or body part (extremities, breast, head), while indicating whether a study is being read by someone else, its priority, and whether it has been reported. The T-PACS also has a tight integration between its workstations, the PACS archive, and the database through the workflow manager, i.e. these workstations would typically be from the same vendor that provides the PACS archive and database.
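To illustrate what this worklist filtering looks like in practice, here is a toy Python sketch; the field names and values are invented, and a real workflow manager would of course query a database rather than an in-memory list:

```python
# Illustrative worklist filtering, the kind of logic a workflow manager applies.
studies = [
    {"acc": "A1", "modality": "MR", "body_part": "HEAD",  "priority": "STAT",    "status": "UNREAD", "locked_by": None},
    {"acc": "A2", "modality": "CR", "body_part": "CHEST", "priority": "ROUTINE", "status": "READ",   "locked_by": None},
    {"acc": "A3", "modality": "CT", "body_part": "HEAD",  "priority": "ROUTINE", "status": "UNREAD", "locked_by": "dr_jones"},
]

def neuro_worklist(studies):
    """Unread neuro (head) studies, STAT first, keeping the lock information."""
    neuro = [s for s in studies if s["body_part"] == "HEAD" and s["status"] == "UNREAD"]
    return sorted(neuro, key=lambda s: s["priority"] != "STAT")

for s in neuro_worklist(studies):
    flag = f" (being read by {s['locked_by']})" if s["locked_by"] else ""
    print(s["acc"], s["priority"] + flag)
```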

The FF-PACS is a T-PACS that also has reporting capability, preferably using voice recognition, and a Modality Worklist Provider that interfaces the digital modalities with an ordering system so the technologist at the modality can pick the patient and study from a list instead of having to re-enter the patient demographics and select the appropriate study.

A BB-PACS is merely a PACS database and archive. It does not have a workflow manager, and one would use an open source workstation from another provider. Almost all open source PACS systems are of the BB-PACS kind, which means that one has to select a suitable open source viewer to go with them as well.

How are these open source PACS systems implemented? In the developed world, it typically happens top-down, i.e. a hospital has a Radiology Information System (RIS) that places the orders, a role taken over in most institutions by an ordering feature in the EMR. These orders are then converted from HL7 into the DICOM Modality Worklist format by a worklist provider. The images that are acquired are sent to the PACS, and the radiologist uses a voice recognition system to create the reports.
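As a rough illustration of the HL7-order to DICOM Modality Worklist mapping such a worklist provider performs, here is a hedged Python sketch using pydicom; the message, the field choices and the Modality value are simplified assumptions, not a complete mapping:

```python
# Illustrative HL7 order -> DICOM Modality Worklist item mapping (greatly simplified).
from pydicom.dataset import Dataset

hl7_order = (
    "PID|||12345||DOE^JOHN||19700101|M\r"
    "OBR|1|ORD0001||71020^CHEST 2 VIEWS\r"
)

# naive segment/field split -- a production system would use a proper HL7 parser
segments = {line.split("|")[0]: line.split("|") for line in hl7_order.strip().split("\r")}

item = Dataset()
item.PatientName = segments["PID"][5]        # PID-5 -> (0010,0010); HL7 and DICOM share the ^ separator
item.PatientID = segments["PID"][3]          # PID-3 -> (0010,0020)
item.PatientBirthDate = segments["PID"][7]   # PID-7 -> (0010,0030)
item.PatientSex = segments["PID"][8]         # PID-8 -> (0010,0040)
item.AccessionNumber = segments["OBR"][2]    # OBR-2 (placer order number) used here as an example

step = Dataset()                             # Scheduled Procedure Step Sequence item
step.Modality = "CR"                         # assumed; normally derived from the ordered procedure
step.ScheduledProcedureStepDescription = segments["OBR"][4].split("^")[1]
item.ScheduledProcedureStepSequence = [step]

print(item)
```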

In LMIC regions, it typically starts bottom-up. The first step is converting the modalities from film to digital, by replacing film processors with CR reader technology or upgrading x-ray systems with a direct digital detector. They might also get a CT and/or MRI that prints studies on a film printer. They now have digital images that need to be viewed on a viewing station, archived and managed; therefore a PACS is needed. That is when vendors start pitching their commercial PACS products, usually an FF-PACS or T-PACS, which are typically unaffordable, hence the choice to implement an open source BB-PACS with a couple of open source viewing stations.

It is critical at this point to use a medical grade monitor for the radiologist to make a diagnosis, as commercial grade monitors are not calibrated to map each image pixel value into a greyscale value that can be distinguished by a user. These monitors do not need the high resolution (3MP or 5MP) commonly used in developed countries; a 2MP monitor will suffice, knowing that the user will have to zoom in or pan the image to see its full resolution. These 2MP monitors are at least three times less expensive than their high-resolution versions. The only disadvantage is that interpretation takes a little more time, as the user has to zoom to see the full spatial resolution.

After installing a BB-PACS and using it for a few years, the institution will have a much better idea of its specific PACS requirements and can make a much better decision about what to do next. There are three options:
1.       Expand the current open source BB-PACS, e.g. upgrade the storage capacity, replace the server, implement a more robust back-up solution, and add commercial workstations, a workflow manager, a Modality Worklist Provider and a reporting system. This assumes there is a mechanism to enter orders, i.e. through a RIS or EMR.
2.       Keep the BB-PACS and turn it into a Vendor Neutral Archive (VNA), and purchase a commercial T-PACS that serves as a front end for the radiologists. The new PACS might store images for 3-6 months while the “old” PACS functions as the permanent archive.
3.       Replace the BB-PACS with a commercial T-PACS or even an FF-PACS, assuming the funds are available and you can find a cost-effective solution.

Note that the advantage of options 1 and 2 is that you don’t need to migrate the images from the old to the new PACS, which can be a lengthy and potentially costly endeavor.

What are some of the open source PACS systems? The most common options are Conquest, the ClearCanvas server, Orthanc, DCM4CHEE and its variant Dicoogle. Conquest and ClearCanvas are Windows based, Orthanc can run on both Windows and Linux, and DCM4CHEE is Linux based. Conquest is the most popular for use as a router and for research, and it is the easiest to install (literally a few minutes). ClearCanvas is also relatively easy to install; DCM4CHEE is the most involved, but there is now a Docker image available that makes the process easier. DCM4CHEE is also the most scalable. For open source viewers, one can use the ClearCanvas viewer, which is the most popular, or a web-based viewer such as Oviyam with DCM4CHEE. RadiAnt is another option, and OsiriX is the primary choice for a Mac. There are several other viewer options; one can do a search and try them out, but be aware that they differ greatly with regard to functionality and robustness. Another consideration is continuing support: as an example, the gold standard for the open source viewer used to be eFilm, but that company was acquired by a commercial vendor who stopped supporting the open source version, which is a problem given the frequent OS upgrades, especially on Windows.
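As an example of how approachable these systems can be, the sketch below talks to an out-of-the-box Orthanc server through its REST API; it assumes the standard port 8042 and the demo orthanc/orthanc credentials, so adjust to your own configuration:

```python
# Hedged example: querying and feeding a default Orthanc installation via its REST API.
import requests

ORTHANC = "http://localhost:8042"
AUTH = ("orthanc", "orthanc")  # change to your configured credentials

# list the internal IDs of all stored studies
study_ids = requests.get(f"{ORTHANC}/studies", auth=AUTH).json()
print(f"{len(study_ids)} studies stored")

# fetch the main DICOM tags of the first study, if any
if study_ids:
    study = requests.get(f"{ORTHANC}/studies/{study_ids[0]}", auth=AUTH).json()
    print(study["MainDicomTags"].get("StudyDescription", "(no description)"))

# upload a DICOM file from disk
with open("image.dcm", "rb") as f:
    r = requests.post(f"{ORTHANC}/instances", data=f.read(), auth=AUTH)
print(r.json())
```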

What are some of the lessons learned from installing an open source PACS?
·         Be prepared to assign an in-house IT and/or clinical person who is computer literate to support the PACS. This person will be responsible for day-to-day support, back-ups, managing scheduled and unscheduled downtimes, and adding additional modalities and interfaces with a RIS, EMR or reporting system as they are introduced. This staff member will also be responsible for troubleshooting any issues that might occur, will be the go-to person for questions about usage, and will train incoming users. These so-called PACS administrators are a well-established profession in the developed world, but it will initially be a challenge to justify a designated position to the department and hospital administration in an LMIC region, as it is a new role.
·         How will these PACS administrators get their knowledge? There are fortunately many on-line resources, including on-line training, and organizations such as RAD-AID have been conducting PACS bootcamp training sessions in LMIC regions to educate these professionals.
·         PACS is a mission critical resource that has an impact on the infrastructure (power, network, HVAC, etc.). In most cases the existing network is not secure and reliable enough and/or does not have sufficient bandwidth, which requires a dedicated network with its own switches and routers.
·         It is preferable to use locally sourced hardware for the IT components, to allow for a service contract and access to parts. The only problem you might have is getting medical grade monitors in some regions, as they are not that common yet.
·         Pay attention to the reading environment for diagnostic work: I had to instruct people to switch off the lightboxes that were used to look at old films, and even paint over some outside windows to reduce the ambient light. Use medical grade monitors for diagnostic reading.
·         Use good IT practices, which include implementing cyber security measures, reliable back-ups and OS patch management.
·         Create a set of policies and procedures for the PACS that covers access control, who can import and export data on CDs and how that is done, unscheduled and scheduled downtime procedures, and everything else needed to manage a relatively complex healthcare imaging and IT system.

In conclusion, open source PACS systems are a very viable, if not the only, option in LMIC regions due to cost constraints, especially for the first phase. One should be aware that these open source PACS systems are very much a bare-bones solution with limited functionality; however, they allow the user to get started and to find out their specific requirements. If additional funds become available, one can upgrade later to enhance functionality, or replace the system with a commercial PACS, which can become either a “front end” to the existing PACS or a full replacement.


Thursday, March 19, 2020

Healthcare AI Regulatory Considerations.

Based on the information provided during the recent FDA-sponsored workshop, “The Evolving Role of Artificial Intelligence in Radiological Imaging,” here are the key US FDA regulatory considerations you should be aware of.

1. AI software applications are fundamentally different in that an AI algorithm is created and improved by feeding it data so it can learn, and eventually, if it implements deep learning, it can learn and improve autonomously based on new data. AI is also a big business opportunity.

According to an analysis by Accenture, the market for AI applications for preliminary diagnosis and automated diagnosis is $8 billion. The same analysis points out that there will be a 20 percent unmet demand for clinicians in the US by 2026, which could be addressed by AI.


It became clear during the conference that the prediction made in November 2016 by Geoffrey Hinton, that deep learning would put radiologists out of a job within 5 years, was a gross miscalculation. No jobs have been lost as of today; on the contrary, the number of studies to be reviewed is increasing, to almost 100 billion images per year to be read by approximately 34,000 radiologists, requiring more and more images to be read faster and more efficiently. The use of AI to eliminate “normal” cases, especially for screening exams such as breast cancer screening or TB in chest images, will be a big relief for radiologists.


2.       AI will not make radiologists obsolete but rather will change their focus, as the image by itself might become less important than the overall patient context. We spend a lot of time improving image quality by reducing image artifacts and increasing resolution so a physician can make a better diagnosis. However, as one of the speakers brought up, using autonomous AI could potentially eliminate the need for creating an image in the first place, by basing the diagnosis directly on the information in the raw data. Why would we need an image? Remember, the image was created to optimally present information to a human, ideally matching our eye-brain detection and interpretation. If we apply the AI algorithm to the acquired data without worrying about the image, we could use it on CT raw data streaming straight from the detector, the signals directly from the MR RF coils, the ultrasound sound waves, the EKG electrical signals, or whatever information comes from any kind of detector. Images have served physicians very well for many years, but in some cases “medical imaging” may be implemented without the need to produce an image, and we might need to rename it “medical diagnosing” instead. I believe that a radiologist is first and foremost an MD, and thinking that they will be out of a job when there is less of an emphasis on the images seems misguided.

3.       AI algorithms are often focused on a single characteristic, which is a problem when using them in an autonomous mode, as incidental findings can go unnoticed. Two good examples were given during the workshop. The first was an ultrasound of the heart of a fetus, which looked perfectly normal, so if one were to run an AI algorithm to look for cardiac defects, it would pass as being OK. However, in this particular case, as shown in the image, the heart was outside the chest, aka ectopia cordis, a rare condition, but one that, if present, should be diagnosed early so it can be treated accordingly. The other example was autonomous AI detection of fractures. Fractures are very common in children, as I can attest personally, having many grandkids who are very active. One of the speakers mentioned that in some cases when looking at a fracture there are incidental findings of bone cancer, something that a “fracture algorithm” would not detect. So maybe my earlier hypothesis that the image might eventually become obsolete is not quite correct, unless we have an all-encompassing AI detection algorithm that can identify every potential finding.
The problem with creating an all-encompassing AI is that there are some very rare findings and diseases for which relatively little data is available. It is easy to get access to tens of thousands of chest or breast images with lung or breast cancer from the public domain, for example from the NCI; however, for rare cases there might not be enough data available to be statistically significant for training and validating an AI algorithm.

4.       There are still many legal questions and concerns about AI applications. As an analogy, the electric car company Tesla is being sued right now by the surviving family of a person who died after his car crashed into a highway median because the autopilot misread the lane lines. Many people die crashing into medians because of human error; however, there is much less tolerance for errors made by machines than by humans. The question is who is accountable if an algorithm fails, with subsequent patient harm or even death: the hospital, the responsible physician, or the vendor of the AI algorithm?

5.       A discussion about any new technology would not be complete without a discussion about standards. How is an algorithm integrated into an existing PACS viewer or medical device software, and how is the output of the AI encoded? IHE has just released two profiles that address AI results encoding and AI workflow integration. Implementers are encouraged to support these standards, and potential users are encouraged to request them in their RFPs.
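For a flavor of what an encoded AI result can look like, here is a minimal pydicom skeleton of a DICOM Structured Report carrying a single, invented text finding; actual AI result objects, such as those based on the DICOM TID 1500 measurement report template, are considerably richer, so treat this purely as an illustration of the encoding idea:

```python
# Minimal, illustrative DICOM SR skeleton for an AI finding (not a complete/valid report).
from pydicom.dataset import Dataset
from pydicom.uid import generate_uid

ds = Dataset()
ds.SOPClassUID = "1.2.840.10008.5.1.4.1.1.88.11"   # Basic Text SR Storage
ds.SOPInstanceUID = generate_uid()
ds.Modality = "SR"
ds.PatientName = "DOE^JOHN"
ds.PatientID = "12345"
ds.StudyInstanceUID = generate_uid()    # in practice copied from the analyzed study
ds.SeriesInstanceUID = generate_uid()

ds.ValueType = "CONTAINER"              # root content item
ds.ContinuityOfContent = "SEPARATE"

finding = Dataset()                     # one text content item carrying the AI output
finding.RelationshipType = "CONTAINS"
finding.ValueType = "TEXT"
finding.TextValue = "AI algorithm X: suspected pneumothorax, confidence 0.87"
ds.ContentSequence = [finding]

print(ds)   # writing a Part 10 file would additionally require file meta information
```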

6.       There are three different US FDA regulatory approval and oversight classifications for medical devices and software:
      1.       Class 1: Low risk, such as an image router. This classification requires General Controls to be applied (Good Manufacturing Practices, complaint handling, etc.).
      2.       Class 2: Moderate risk, such as a PACS system or medical monitor, as well as Computer Aided Detection software. This classification requires both general and special controls to be applied. These devices and software require a 510(k) premarket clearance.
For a moderate risk device that does NOT have a predicate device, a new procedure has been developed aka a “de novo” filing. For example, the first Computer Aided Acquisition device which was approved in January 2020 followed the de novo process.
      3.       Class 3: High risk such as Computer Aided Diagnosis which requires general controls AND Premarket Approval (PMA).

7.       AI applications can be divided into the following categories:
a.       CADe or Computer Aided Detection - These algorithms aid in localizing and marking regions that may reveal specific abnormalities. The first application was breast CAD, initially approved in 1997, followed by several other organ CAD applications. CADe has recently (as of January 2020) been reclassified to NOT need a PMA, but rather to be class 2, needing only a 510(k).
b.       CADx or Computer Aided Diagnosis - Aids in characterizing and assessing disease type, severity, stage and progression
c.       CADe/x or Computer Aided Detection and Diagnosis - This is a combination of the first two classifications as it will do both localizing as well as characterizing the condition.
d.       CADt or Computer Aided Triage - This aids in prioritizing/triaging time sensitive patient detection and diagnosis. Based on a CADe and/or CADx finding, it could immediately alert a physician or put it on the top of a worklist to be evaluated.
e.       CADa/o or Computer Aided Acquisition/Optimization - Aids in the acquisition/optimization of images and diagnostic signals. The first CADa/o was approved in January 2020 for ultrasound to provide help to non-medical users to acquire images. Being first-in-class, it followed the de novo clearance process.

8.       Other dimensions that differentiate AI algorithms are:
·         Is the algorithm “locked” or is it continuously adaptive? An example of a locked algorithm is the first CADe application for digital mammography: its algorithm was locked and it is still basically the same as when the FDA cleared its initial filing in 1996. An adaptive algorithm will continue to learn and supposedly improve.
·         What is the reader paradigm? AI can serve as the first reader, which then possibly determines the triage; as a concurrent reader, e.g. doing image segmentation or annotation while a physician is looking at an image; as a secondary reader, such as when used to replace a double read for mammography; or it can be autonomous, with no human reader. The first clearance for a fully autonomous AI application, based on having a better specificity and sensitivity than a human reader, was for diabetic retinopathy, cleared in January of 2019.
·         What is the oversight? Is there no oversight, is it sporadic, or is it continuous? Note that this is different from the reader paradigm: a fully autonomous AI application might still require regular oversight as part of QA checking and post-market surveillance, especially if the algorithm is not locked but adaptive.

9.       The FDA has several product codes for AI applications. The labeling and relationship between these codes, the various CAD(n) definitions, and the corresponding Class 1, 2, 3 and “de novo” classifications is inconsistent and unclear. The majority of the products, i.e. more than 60 percent, are cleared under the PACS product code (LLZ), as that is the most logical place for any image processing and analysis related filings; the remainder are cleared under 6 different CAD categories (QAS, QFM, QDQ, POK, QBS, and the most recent QJU) and a handful of others. If a vendor wants to file a new algorithm, the easiest path is to convince the FDA that it fits under LLZ, as there are many predicates and a lot of examples, assuming that the FDA approves that approach. I would assume that the FDA wants to steer new submissions towards the new classifications; however, as you can see from the chart, there are very few predicates, sometimes only a single one.

10.   Choosing the correct size and type of the dataset used for learning is challenging:
·         There are no guidelines on the number of cases to be included in the datasets used to train the algorithm and validate its implementation. The unofficial FDA position is that the data should be “statistically significant,” which means that it requires intensive interaction with the FDA to make sure the dataset meets its criteria.
·         Techniques and image quality vary a lot between images, to the extent that certain images might not even be useful as part of the dataset.
·         One needs to make sure that the dataset is representative of the body part, disease, and population characteristics. It has been acknowledged that a dataset from, e.g., Chinese citizens might not be applicable to a population in the US, Europe or Africa. In addition, it became clear that an algorithm might need to be retrained based on the type of institution (compare a patient population at a VA medical center with the patients at a clinic in a suburb) and even geographic location (compare Cleveland with Portland, the youth in Cleveland being the most obese in all of the US).
·         There is a big difference between manufacturers in how they represent their data. This requires normalizing and/or preparing the data to make sure the algorithm can work on it (see the sketch after this list). Even for CR/DR there are different detector/plate characteristics, different noise patterns, vendor-applied image processing, different LUTs, etc.
The figure shows the intensity values for different MRIs.
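As a minimal illustration of this data preparation step, the sketch below applies a simple z-score normalization with NumPy; it is only one of many possible normalization schemes and not a recommendation for any specific modality or vendor:

```python
# Illustrative intensity normalization so data from different scanners lands on a comparable scale.
import numpy as np

def zscore_normalize(image: np.ndarray) -> np.ndarray:
    """Map an image to zero mean and unit standard deviation (ignoring a zero background)."""
    foreground = image[image > 0]
    mean, std = foreground.mean(), foreground.std()
    return (image - mean) / (std + 1e-8)

# two "scanners" with very different raw intensity ranges end up on a comparable scale
scanner_a = np.random.uniform(0, 400, size=(256, 256))
scanner_b = np.random.uniform(0, 4000, size=(256, 256))
print(zscore_normalize(scanner_a).std(), zscore_normalize(scanner_b).std())
```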

11.   There should be a clear distinction between the three different datasets that are used for different purposes (a minimal split sketch follows the list):
·         The training dataset that is used to train the AI algorithm.
·         After the initial training is done, one would use a tuning dataset to optimize the algorithm.
·         As soon as the algorithm development is complete, it becomes part of the overall architecture and is verified with an integration test, which tests against the detailed design specs. This is followed by a system test that verifies against the system requirements, and lastly by a final validation and verification, which tests against the user requirements using a separate test dataset.
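Here is the promised sketch of keeping these datasets separate, using scikit-learn's train_test_split twice; the proportions and the stand-in data are illustrative only:

```python
# Illustrative training / tuning / test split with a sequestered test set.
from sklearn.model_selection import train_test_split

cases = list(range(1000))           # stand-in for 1,000 annotated studies
labels = [i % 2 for i in cases]     # stand-in for ground-truth labels

# hold out 20% as the sequestered test set used only for final validation
train_tune, test, y_train_tune, y_test = train_test_split(
    cases, labels, test_size=0.20, stratify=labels, random_state=42)

# split the remainder into training (for learning) and tuning (for optimization)
train, tune, y_train, y_tune = train_test_split(
    train_tune, y_train_tune, test_size=0.25, stratify=y_train_tune, random_state=42)

print(len(train), len(tune), len(test))   # 600 / 200 / 200
```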

12.   AI clearance has changed the traditional process in that pre-clearance testing and validation and post-market surveillance are now required. The pre-clearance part is covered by the pre-submission, aka the Q-Submission program, which has a separate set of guidelines and is extensively used by AI vendors. It is basically a set of meetings with the FDA focused on determining that the clinical testing is statistically significant and that the filing strategy is acceptable. Last year, there were 2,200 pre-submissions out of 4,000 submissions, which shows that it has become common practice. The FDA strongly encourages this approach.

Post-market surveillance is very important for non-locked algorithms, i.e. the ones that are self-learning and supposedly continuously improving. The challenge is to make sure that the algorithms are getting better and not worse, which is exactly what post-market surveillance is for. There was a lot of discussion about post-market surveillance and a consensus that it is needed, but there are no guidelines available (yet) on how this would work.

13.   There are a couple of applicable documents that are useful when looking to get FDA clearance for an AI application: the Q-Submission process guidance, the De Novo classification request, and the regulatory framework discussion paper.

The FDA initiative to have an open discussion in the form of a workshop was an excellent idea and brought forth a lot of discussion and valuable information. You can find a link to the many presentations on their website. It was obvious that the regulatory framework for AI applications is still very much under discussion. Key take-aways are the use of pre-submissions to have an early dialogue with the FDA about the acceptable clinical data used for training and validation and about the regulatory product classification and approach, as well as the need for a post-market assessment, which is not defined (yet), especially for adaptive AI algorithms.

The de novo approach will also be very useful for the “to-be-defined” product definitions, and it can be expected that the list of product classifications will grow as more products are introduced. AI is here to stay, and the sooner the FDA has a well-defined process and approach, the faster these products can make an impact on the healthcare industry and patient care.