Software in Medical Devices, by MD101 Consulting - Tag - software failure. Blog about software medical devices and their regulatory compliance. Main subjects are software validation, IEC 62304, ISO 13485, ISO 14971, CE mark 93/42 directive and 21 CFR part 820.

FDA Guidance on Multiple Function Device Products (2020-09-04, Mitch - Regulations: FDA, Guidance, IEC 62304, mobile medical app, risk management, software failure)

<p>The FDA published in July the final version of the <a href="https://www.fda.gov/regulatory-information/search-fda-guidance-documents/multiple-function-device-products-policy-and-considerations">Guidance on Multiple Function Device Products</a>. Despite the absence of the word "software" in the title, it primarily addresses software medical devices. It also addresses hardware devices, but we will focus on software in this post.</p> <p>Typical examples of multiple function devices are platforms with software modules, within which some don't qualify as medical devices or are 510(k) exempt:
<img src="https://blog.cm-dm.com/public/27-FDA-Multiple-Function-Device/.sw-platform-multiple-function-device_m.png" alt="sw-platform-multiple-function-device.png, Aug 2020" style="display:table; margin:0 auto;" title="sw-platform-multiple-function-device.png, Aug 2020" />
<br />
An even simpler case is a mobile app with some MD functions and some non-MD functions:
<img src="https://blog.cm-dm.com/public/27-FDA-Multiple-Function-Device/.mobile-app-multiple-function-device_m.png" alt="mobile-app-multiple-function-device.png, Aug 2020" style="display:table; margin:0 auto;" title="mobile-app-multiple-function-device.png, Aug 2020" />
<br /></p>
<h4>Impact of 510k exempt or non-MD</h4>
<p>Qualifying software as a medical device is not the purpose of this guidance. <a href="https://blog.cm-dm.com/pages/The-essential-list-of-guidances-for-software-medical-devices#FDA">Other FDA guidance documents</a> are there to answer this far-from-simple question.<br />
This guidance focuses on the safety and effectiveness of MD functions / modules when they are coupled with non-MD or 510(k)-exempt functions. Practically speaking, the FDA wants to know the impact of these functions on the MD function under review:</p>
<ul>
<li>Your MD function is coupled with other software,</li>
<li>This other software may fail,</li>
<li>As a side effect, your MD function may also fail,</li>
<li>The consequences of this failure on safety and effectiveness have to be addressed.</li>
</ul>
<h4>Other function</h4>
<p>The guidance defines the term "other function" as functions not regulated by the FDA, or not subject to premarket review (i.e. 510(k) exempt). The FDA also includes the General Purpose Computing Platform in this definition.<br />
<em>General Purpose Computing Platform</em>: rings a bell! It sounds like SOUP from IEC 62304, or OTSS from other FDA guidance. This is true: a general-purpose computing platform can be a platform developed by a third-party software vendor, or developed by the manufacturer without applying an appropriate state-of-the-art development process (i.e. IEC 62304 or the FDA guidance on General Principles of Software Validation), which is precisely the definition of SOUP.<br />
<br />
But it can be more than that: this general-purpose computing platform can be a class I, 510(k)-exempt medical device, developed with appropriate standards according to design controls (21 CFR 820.30) by the manufacturer.<br />
This could be the case in example 1, where the manufacturer could claim that:</p>
<ul>
<li>either the software platform is an MDDS or a Medical Image Storage device not regulated by the FDA,</li>
<li>or it is a class I accessory to the MD module providing low-risk functions like data merging or alerts (not alarms!).</li>
</ul>
<p>So the "other function" is software not subject to a premarket review.</p>
<h4>Negative impact or positive impact of the other function</h4>
<p>The main message of this guidance is: the FDA wants the manufacturer to bring evidence that the other functions don't have a negative impact on the MD function under review. In risk-assessment wording: no sequence of events involving the other function shall be assessed as leading to an unacceptable risk of software failure.<br />
<br />
The second part of this main message is: if there is a positive impact on the MD function (e.g. the platform manages alerts), it shall be documented in the design control records. Finding positive impacts can be far-fetched with frameworks or OS'es. Usually, you choose a general-purpose computing platform because the technology is built that way (e.g. Android or iOS). Full stop. No positive impact there.<br />
So if there is no obvious positive impact, don't waste your time desperately seeking one. If you have a positive impact, lucky you!<br />
<br />
<em>Note to EU manufacturers: don't mix that positive impact with the benefit of MD found in EU MDR. Please, don't cross the streams. It would be bad.</em></p>
<h5>Is there an impact?</h5>
<p>The FDA proposes a flowchart as an aid to assess the impact of the other function. Though simple and not especially helpful, this diagram has the virtue of displaying the assessment process graphically.<br />
More interesting are the examples of questions to ask when assessing the impact. They give cues to determine whether and how the MD software exchanges data or shares resources with the other function. These questions are very technical and require documenting the architecture of the MD software with its interfaces, as in IEC 62304 clause 5.3.2.</p>
<h5>What is the impact - A taste of SOUP</h5>
<p>The guidance continues with recommendations on how to assess the impact when there is one. These recommendations look quite similar to what we do when we assess the impact of SOUP, as in IEC 62304 clause 7.1.2.c. The advantage of the guidance is that it gives a list of questions to clarify the impact assessment. These questions can be advantageously used to identify and evaluate risks arising from SOUP.<br />
Let's do a little exercise to show the analogy between the other function and SOUP: replace "other function" with "SOUP":<br />
E.g. #1 in section VI.B.1 bullet 1:</p>
<blockquote><p>The “SOUP” introduces a new hazardous situation or a new cause of an existing hazardous situation that is not otherwise present in the device function-under-review</p></blockquote>
<p>E.g. #2 in section VI.B.2 bullet 1:</p>
<blockquote><p>The performance or clinical functionality of the device function-under-review depends on the “SOUP” for the device function-under-review to perform as specified.</p></blockquote>
<p>Warning: it doesn't mean that "other functions" shall be treated as SOUP. It means that the process of risk assessment is similar. The big difference between the other function and SOUP arises when the other function is a class I device not subject to premarket review. In this case it is not a SOUP, since it is subject to design controls. But take note that the last paragraph about the general-purpose computing platform before section VII looks very typical of a SOUP risk assessment. Moreover, when there is no evidence of design controls, this general-purpose computing platform will be treated as SOUP in IEC 62304 processes.<br /></p>
<h5>Cybersecurity</h5>
<p>Ah, there it is! The cybersecurity risk assessment shall include the impacts of cyber risks arising from the other function. No surprise there: if a cyber risk can have a consequence for the patient, it shall be addressed in the impact assessment.</p>
<h4>Separation of design and implementation</h4>
<p>This is the second message of this guidance. Though separation of design is not discussed as much as the negative / positive impact, the former is as important as the latter. Here we find again the good old principles of architectural segregation from IEC 62304.<br />
Our little exercise shows the analogy once again: replace "device function" by "class B (or class C) software" and "other function" by "class A software".<br />
<br />
E.g. #1 in the sentence found in section V.A</p>
<blockquote><p>The <em>class B function</em> should, to the extent possible, be separated from <em>class A function</em> in its design and implementation (e.g., logical separation, architectural separation, code, and data partitioning).</p></blockquote>
<p>E.g. #2 in the sentence found in section VII.D</p>
<blockquote><p>an architecture diagram may demonstrate the independence of the <em>class B function</em> from the <em>class A function</em>, or design documents may demonstrate the use of shared resources</p></blockquote>
<p>It also works with the impact assessment with the sentence found in section VII.E</p>
<blockquote><p>The device hazard analysis (...) for the <em>class B software</em> should include the results of a risk-based assessment of any potential adverse impact (...) of the <em>class A software</em> to the safety or effectiveness of the device function-under-review.</p></blockquote>
<p>Thus, if your software is of major level of concern (or IEC 62304 class C), be prepared to demonstrate in your premarket submission documentation how your MD software is separated from the other functions. Namely by hardware segregation, or by fully documented logical segregation with common failure modes analysed in the risk assessment report.<br />
As a consequence, it won't work with web applications with intertwined services, or with monolithic mobile apps. Fortunately, a less strict logical segregation may work for MD software of moderate level of concern (or IEC 62304 class B), provided that a failure of the other function doesn't result in an unacceptable risk.</p>
<h4>Documentation requirements</h4>
<p>In case of negative or positive impact, you have to add documents about the other function to the premarket submission, starting with the indications for use, the device description and specific labelling for the other function. Then, practically speaking, you have to rely on the Guidance for the Content of Premarket Submissions for Software Contained in Medical Devices. It means that you have to assign a level of concern to the other function. The guidance doesn't, however, require all documents corresponding to this level of concern for the other function. Only a subset is required, but its content shall be in line with the guidance on premarket submissions for devices with software.</p>
<h4>Getting all your ducks in a row</h4>
<p>MD software under review, other function, OTSS, SOUP... Delineating the software components subject to premarket review, the other function, and combining that with OTSS/SOUP of IEC 62304 can be a challenging task!
Some cases will be simple:</p>
<ul>
<li>A post-processing medical imaging software exchanges images with a PACS server (the other function),</li>
<li>Software analyzing genetic data exchanges HL7 streams with a LIMS (the other function).</li>
</ul>
<p>Some other cases may be less simple:</p>
<ul>
<li>A module raising alerts to check for potential adverse drug events, based on patient data, is included in a larger Computerized physician order entry (CPOE, other function),</li>
<li>A legacy health information system (HIS, other function) contains some functions qualified as medical devices, intertwined in the system.</li>
</ul>
<p>Some cases may be a nightmare:<br />
When the other function contains the mitigation action for an unacceptable risk. What should we do? Leave it as an other function? Include it in the software under review? This will be a case-by-case answer.</p>
<h4>Conclusion</h4>
<p>It is quite practical for manufacturers to split their software platform into MD software and other functions, reducing the regulatory burden. This guidance puts an end to the situation where manufacturers legitimately exclude from their submission software not subject to premarket review, leaving blind spots in the submission. And blind spots are the reason for countless questions from the FDA!<br />
By requiring manufacturers to document the pedigree of the other functions in their submission, the FDA ceases to be blind and will be able to assess more straightforwardly the MD software integrated in its technical environment. There is a drawback for manufacturers: they are forced to document the architecture and the risk assessment of the other function in the premarket submission. <br />
<br />
<br />
<br />
<br />
<ins>1st EU MDR oriented remark</ins>: In the absence of such guidance for the EU MDR, make use of the principles of the FDA guidance when a part of your software is not qualified as SaMD per MDCG 2019-11. It is a way to address general safety and performance requirements 14.2.d, 14.5, 17.3, and 23.4.ab.<br />
<br />
<ins>2nd EU MDR totally grumpy and non-constructive remark</ins>: don't expect any equivalent guidance from the EU MDCG for a long time. The FDA is the horizon of other authorities for software, as usual.</p>

ISO/DIS 13485:2014 strengthens requirements about software - Part 2 (2014-06-27, Mitch - Standards: FDA, ISO 13485, ISO 14971, software failure, Software Validation)

<p>Continuing with ISO/DIS 13485:2014, after an overview of software-related changes <a href="https://blog.cm-dm.com/post/2014/06/13/ISO/DIS-13485%3A2014-strengthens-requirements-about-software-Part-1">in the last article</a>, let's focus on the new clause 4.1.6.</p> <h4>New clause 4.1.6</h4>
<p>The clause says:<br />
<em>The organization shall document procedures for the validation of the application of computer software used in the quality management system, including production and service provision.</em><br />
<br />
That's brand new and could require a lot of man-hours in companies where the QMS relies on computerized systems and produces lots of electronic documents and records.<br />
That's however not so new for companies that already implement 21 CFR regulations (see below).<br />
<br />
The clause 4.1.6 continues with:<br />
<em>Such software applications shall be validated for their intended use prior to initial use, and after any changes to such software and/or its application. Records of such activities shall be maintained.</em><br />
<br />
OK, that's logical. If we want to validate software, we need to validate it according to established criteria (the topmost one being the intended use), we need to revalidate when something has changed, and we need to record the validation results to prove that it was done.<br />
<br />
The clause 4.1.6 ends with:<br />
<em>For each application of computer software used in the quality management system, the organization shall determine and justify the specific approach and the level of effort to be applied for software validation activities based on the risk associated with the use of the software</em><br />
<br />
Phew! We can choose our own approach and fine-tune the level of effort demanded by validation! But it shall be done according to the results of a risk assessment.<br />
<br />
My two comments (my two cents):</p>
<h4>Comment 1: what kind of risk</h4>
<p>Validation is based on risk assessment: high risk = heavy validation, low risk = light validation.<br />
But what kind of risk assessment?<br />
In the definitions section, we find the definitions of risk and risk management, which both refer to ISO 14971. We can assume that the required risk management method is ISO 14971. However, there is no reference to ISO 14971 in clause 4.1.6, contrary to some other clauses dealing with risks elsewhere in the standard.<br />
And what kind of risks should be assessed? Probably those for which the root cause is a QMS software failure.<br />
Knowing by experience how uneasy people are with software-related risks, I bet this risk assessment is going to burn a lot of man-hours in quality departments!</p>
<h4>Comment 2: the least burdensome approach</h4>
<p>I like this expression: <em>least burdensome approach</em>, because this is exactly what everybody is going to do. Translated into pragmatic words: everybody is going to do as little as possible to get through the validation.<br />
This is the corollary of comment 1: if something is hard to achieve, I'd rather try not to do it.
For example:</p>
<ul>
<li>A manufacturer which uses simple Excel sheets (no formulas) to record CAPAs could argue that there is a very low risk of software failure, and as a consequence won't validate the sheets,</li>
<li>Another, which has used an Access database for 10 years without problems, could argue that there is no need to validate something with a long history of use,</li>
<li>A third one, which bought a license for QMS management software, could argue that the validation was covered by the supplier management process.</li>
</ul>
<p>These three examples could be adequate in some cases, and not adequate in others.</p>
<h4>And 21 CFR?</h4>
<p>To some extent, the new 4.1.6 clause is made to bring ISO 13485 more in line with the requirements of US 21 CFR regulations. More precisely, 21 CFR 820.70(i):<br />
<em>When computers or automated data processing systems are used as part of production <strong>or the quality system</strong>, the manufacturer shall validate computer software for its intended use according to an established protocol.</em><br />
I put <em>or the quality system</em> in bold: software used in production processes is already addressed by clause 7.5.2 of ISO 13485:2003, so it is software used in quality systems, addressed by 21 CFR 820.70(i), that is covered by the new clause 4.1.6.<br />
<br />
Therefore manufacturers, which already apply 21.CFR regulations, won't be surprised by clause 4.1.6.<br /></p>
<h4>Conclusion</h4>
<p>A lot of work is required to bring an existing and well-automated QMS in line with the new clause 4.1.6. But if it's done in the frame of a regulatory strategy, it's worth the effort.<br />
It makes the QMS more ready for changes in regulations (for example those expected in Europe) and more in line with 21 CFR requirements about software validation.</p>

Validation of compiler and IDE - Why, when and how to? - Part 2: compilers (2014-03-28, Mitch - Processes: critical software, development process, risk management, software failure)

<p>We saw <a href="https://blog.cm-dm.com/post/2014/03/13/Validation-of-compiler-and-IDE-Why%2C-when-and-how-to-Part-1">in the last post how to validate a software development tool</a>. But we also saw that validating a compiler this way is not a satisfactory task.<br />
Then: Why, when, and how to validate a compiler?</p> <h4>Why?</h4>
<p>A compiler is an assembly of finite state machines which transforms a programming language into assembly code or P-code. Given their complexity, there is a good chance that some bugs remain in compilers.<br />
Want to see a list of open bugs in a compiler? Have a look at <a href="http://gcc.gnu.org/bugzilla/buglist.cgi?component=c%2B%2B&product=gcc&resolution=---">gcc bugs in C++</a> or <a href="http://gcc.gnu.org/bugzilla/buglist.cgi?component=c&product=gcc&resolution=---">bugs in C</a>!<br />
<br />
Looking at these lists, are you convinced that bugs remain in compilers? And that a compiler should be validated?<br />
Fortunately, most of these bugs arise only when advanced language features are used! And fortunately, the effect of most of these bugs is that the code won't compile.<br />
This is why coding rules are important for critical software. Using simplified coding rules (e.g. no use of advanced language features) is the best way to avoid compiler bugs.</p>
<h4>When?</h4>
<p>Now that we understand that a compiler should be validated, when should we do it?<br />
When a compiler bug may generate an unacceptable risk for the patient or the operator using the compiled software.<br />
<br />
Examples (dummy, as usual): there's a bug in the rounding of floating-point variables under certain circumstances, and at runtime output values are randomly inconsistent. In which context is it unacceptable?</p>
<ul>
<li>The compiled software is a PACS viewer: displayed images are very rarely affected by the bug, with a few pixels in the wrong color. The practitioner will see that occasionally some pixels are inconsistent. Negligible risk for the patient (the manufacturer didn't even bother to fix the bug!).</li>
<li>The compiled software is in a perfusion pump: the computed volumes are inconsistent from time to time. The manufacturer will discover the bug in software tests (the software is class C, it has hundreds of unit tests). And in the very unlikely case that the bug is not found during tests, if it arises in real conditions, there is still a watchdog protection which rejects excessively high volume values.</li>
<li>The compiled software is in a pacemaker! The computed energy quantity of an electric shock is inconsistent from time to time. Accuracy of the energy quantity is absolutely vital. Oops! I think I'm going to validate my compiler!</li>
</ul>
<h4>How?</h4>
<p>Validating a compiler is not an easy task.<br />
Given its complexity, only formal validation is possible, namely by mathematical proof.<br />
There's a team at INRIA working on this enormous task; they created <a href="http://compcert.inria.fr">a compiler named CompCert</a>. CompCert is the result of their research in formal validation.<br />
<br />
Xavier Leroy, a researcher at INRIA, presented his team's results on C compiler validation at a symposium dedicated to code generation in 2011. Here is his presentation: <a href="http://www.cgo.org/cgo2011/Xavier_Leroy.pdf">http://www.cgo.org/cgo2011/Xavier_Leroy.pdf</a><br />
The presentation looks quite readable at the beginning but quickly becomes very difficult to follow. Just have a look at it to see the complexity of formal compiler validation!<br />
<br />
There are also commercial compiler validation suites which contain thousands of test cases to verify compiler compliance with C language standards. These products don't provide a formal validation like CompCert, but they reduce the probability of an error in a compiler to an extremely low level.<br />
They are also extremely expensive, because they contain a huge history of tests added day by day by their vendors.<br />
<br />
That's why, unless you are working on a very critical medical device, it's better to spend time testing the software than the chain of development tools: if there is a bug in the compiler, the generated code will be buggy as well, and software tests will reveal it.<br />
<br />
<br />
By the way: why shouldn't we validate processors as well?<br />
Remember the FDIV bug in the original Intel Pentium. :-)<br />
<br />
<br />
See also <a href="https://blog.cm-dm.com/post/2014/04/07/Validation-of-compiler-and-IDE-Why%2C-when-and-how-to-Part-3">the next article</a>, with additional comments on this topic.</p>

Goto Fail (2014-02-28, Mitch - Processes: critical software, development process, IEC 62304, software failure, Software Verification)

<p>If you haven't heard about Apple's security flaw registered as <a href="http://support.apple.com/kb/HT6150">CVE-2014-1266 on Apple's website</a>, you were probably on planet Mars.<br />
Basically, it was unsafe to use https connections. I couldn't help but write an article about this!<br />
Components dealing with secured connections are absolutely critical. Applying a rigorous development process is the best way to avoid any trouble with these components.</p> <h4>The guilty code</h4>
<p>Here is the code with the security flaw in Apple's ssl library:</p>
<p style="white-space: pre;">
static OSStatus
SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams,
uint8_t *signature, UInt16 signatureLen)
{
OSStatus err;
<i>...</i>
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
goto fail;
<span style="color: #FF0000;">goto fail;</span>
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
goto fail;
<i>...</i>
fail:
SSLFreeBuffer(&signedHashes);
SSLFreeBuffer(&hashCtx);
return err;
}</p>
<p>Quote from <a href="https://www.imperialviolet.org/2014/02/22/applebug.html">Adam Langley's ImperialViolet blog</a>, who quoted it from <a href="http://opensource.apple.com/source/Security/Security-55471/libsecurity_ssl/lib/sslKeyExchange.c">Apple's published source code</a>.</p>
<h4>How to reduce the probability of such flaw?</h4>
<p>Even if it's tempting to blame the developer for this code, that's not the right way to prevent such a situation from happening again. It's the development process as a whole that is in question here.<br />
Namely, every step of the process. Assuming that collecting user requirements, writing specifications and designing the architecture are not called into question (this is SSL, look at the RFC and related documents, OK?), the situation can be avoided by putting safeguards in place during coding and testing.<br /></p>
<h4>Testing</h4>
<p>Tests are a good way to find bugs, not all of them but most of them.<br />
There are plenty of ways to test software (see "the big picture" in <a href="https://blog.cm-dm.com/post/2012/12/13/En-route-to-Software-Verification%3A-one-goal%2C-many-methods-part-3">this article</a> about tests): unit tests, user tests and so on.<br />
<br />
According to <a href="https://www.imperialviolet.org/2014/02/22/applebug.html">Adam's article</a>, the problem could only be discovered by doing very specific tests with a custom-made TLS stack.<br />
Needless to say, such a tool would itself be subject to errors. How do you test a complex test tool? It would need <a href="http://en.wikipedia.org/wiki/Verification_and_validation">IQ, OQ and PQ</a> to be qualified as the right testing tool. Or, for ISO 13485 aficionados, a tool validation protocol according to section 7.5.2.1 of that ISO standard.<br />
<br />
Thus it appears that tests are probably not enough to catch this kind of bug.<br /></p>
<h4>Coding</h4>
<p>We only have the coding phase left!<br />
There are two possibilities to trap this kind of bug:</p>
<ul>
<li>Verifying what developers code,</li>
<li>Changing the way developers code.</li>
</ul>
<h5>Verifying code</h5>
<p>Here we have two possibilities:</p>
<ul>
<li>Human verification,</li>
<li>Automated verification.</li>
</ul>
<p>Human verification is achieved by doing peer code reviews, whereas automated verification is done with the help of static analysis tools or advanced compiler checks.<br />
Both have their advantages and drawbacks (dealing with humans, or dealing with machines designed by humans).<br />
<br />
BTW: these kinds of verification are in line with IEC 62304 requirements about software unit verification, found in sections 5.5.2, 5.5.3 and 5.5.4 of the standard.<br />
So if you want to be in line with IEC 62304 in class B (and class C for 5.5.4), I can only urge you to implement unit tests and/or plan code reviews.<br />
<br />
If you're skeptical about the benefits of code reviews, I invite you to read this <a href="https://blog.cm-dm.com/post/2013/10/27/Testing-is-overrated">previous article about code reviews vs tests</a>.</p>
<h5>Changing the way developers code</h5>
<p>Here we have two levers:</p>
<ul>
<li>Pair programming,</li>
<li>Coding standards.</li>
</ul>
<p>I can only urge you to impose coding standards and to try pair programming with your development teams!<br />
IMHO this is the most efficient way to avoid such bugs.<br />
<br />
Coding standards, however, are more efficient if there is a static analysis tool to verify that they are applied. They're usually a document with more than 10-20 rules, and they're difficult to know by heart!<br />
<br />
Pair programming is not easy to implement, especially when managers see it as doubling costs! That's the biggest obstacle to this method!</p>
<h4>For the hell of a goto</h4>
<p>Another remark: strongly ban goto's in your coding practices.<br />
If I were to configure a C compiler on a build server, a goto wouldn't compile.<br />
Just for the hell of it! :-)</p>
<h4>Conclusion</h4>
<p>The best way to put the odds on our side is to combine several methods:</p>
<ul>
<li>Peer coding reviews and/or pair coding,</li>
<li>Coding conventions,</li>
<li>Static code analysis,</li>
<li>Classical tests.</li>
</ul>
<p>You will be totally in line with sections 5.5, 5.6 and 5.7 of IEC 62304.<br />
<br />
The stricter the development process, the fewer the bugs. Hence the software safety classes.</p>

Got SOUP? - Part 3 - Runtimes, Frameworks (2013-05-31, Mitch - Standards: IEC 62304, risk management, software failure, SOUP)

<p>We saw in the <a href="https://blog.cm-dm.com/post/2013/05/10/Got-SOUP-Part-1-Because-every-good-software-starts-with-SOUP">first article</a> of this series what is a SOUP and what is not, according to IEC 62304.<br />
Then we continued in the <a href="https://blog.cm-dm.com/post/2013/05/24/Got-SOUP-Part-2-OS%2C-Drivers%2C-Runtimes">second article</a> by having a look at OS's and drivers.<br />
Let's now see how to deal with runtimes.</p> <h4>Reminder - What IEC 62304 expects for SOUP</h4>
<p>Requirements about SOUP are spread across IEC 62304 standard. Mainly these requirements are about:</p>
<ul>
<li>Software requirements: The manufacturer shall describe what requirements (functional or non-functional) are necessary to have the SOUP work.</li>
<li>Architecture: The manufacturer shall define the software architecture to have the SOUP work in appropriate conditions.</li>
<li>Maintenance: The manufacturer shall monitor the SOUP lifecycle: patches, new versions...</li>
<li>Risk analysis: It is mandatory to do a risk analysis related to the SOUP.</li>
</ul>
<h4>Runtime is a SOUP</h4>
<p>A runtime is a SOUP, but there is always a software development framework behind a runtime. Even the smallest runtime has its framework, like gcc, which links the C runtime library to an executable.<br /></p>
<h5>The Framework behind the runtime</h5>
<p>Frameworks provide services for the developer and the architect (they make things easier to design and code) or provide services of a higher level than those provided by OS's. Most of the time, frameworks impose the software architecture. It may be necessary to assess the consequences of the architectural constraints of the framework on the architecture.<br />
Contrary to OS's, services are very framework-dependent. Thus, the types of software failures and the risk analysis are very framework-dependent.<br />
This can be analyzed only on a case-by-case basis.</p>
<h5>Runtimes</h5>
<p>Runtimes are the libraries and executables provided by a framework, from the C runtime of the very limited C standard library framework, to the Java Runtime Environment of the plethoric Java Development Kit framework.<br />
Therefore the requirements of IEC 62304 about runtimes are mostly covered by the analysis of the framework. The risk analysis of the framework may be completed with the analysis of problems that can be specific to a runtime (memory consumption, number of threads instantiated, deployment methods...).<br /></p>
<h5>JIT compilers</h5>
<p>JIT compilers are a very peculiar case of framework runtimes. Therefore the requirements of IEC 62304 about JIT compilers are mostly covered by the analysis of the framework and its runtime. The risk analysis about JIT compilers may be limited to the analysis of specific problems that can arise when they are activated (delayed response time, behavior on different target platforms...).<br />
<br />
That said, we can rephrase what's above to present it in the same order as we did in the previous post about OS's.</p>
<h5>Requirements about frameworks and runtimes</h5>
<p>IEC 62304 requires defining which software requirements (functional or non-functional) are necessary to have the SOUP work properly.<br />
For frameworks, we can take it in reverse order, as for OS's. The job is to design the software to run inside the framework, and to define the hardware + software environment needed to have the framework runtimes work.<br />
We can then have two types of requirements:</p>
<ul>
<li>Those about services used by the software, e.g. web services, databases, scheduled tasks, security...</li>
<li>Those about the hardware + software environment, e.g. OS version, hardware requirements...</li>
</ul>
<h5>Architecture of frameworks and runtimes</h5>
<p>IEC 62304 requires defining the software architecture to have the SOUP work properly.<br />
Once again, we can take the problem in reverse order. We have to make the software run inside its framework, and this has consequences on the architecture of the software.<br />
To link this with the software requirements, the actual order of work is:</p>
<ol>
<li>Choosing the framework,</li>
<li>Defining the software functional requirements,</li>
<li>Defining the software architecture to have the software run within its functional perimeter,</li>
<li>Defining the software requirements (such as those listed above about services and so on) to fit the chosen architecture.</li>
</ol>
<p><em>You may find it a bit early to place the choice of the framework in first position, but this is actually what happens in most cases, except for brand-new devices.</em></p>
<h5>Risk Analysis of frameworks and runtimes</h5>
<p>As we said above, the risk analysis is always specific to each framework and to the software implementation.<br />
Having defined the list of framework services (or libraries, or anything else) used by the software, it is possible to guide the risk analysis of the framework with a list of <em>what if</em> questions.<br />
<em>What if this service provided by the framework fails?</em><br />
<em>What if this framework runtime component fails?...</em><br />
These questions are definitely not enough to do a risk analysis. They only consider what happens if something internal to the software fails. But they are a good starting point for thinking about framework risks.</p>
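<p>As a minimal sketch of this approach (the service names below are invented for illustration), the <em>what if</em> questions can be generated mechanically from the list of framework services actually used by the software:</p>

```python
# Sketch: generate starter "what if" risk-analysis questions from the
# list of framework services used by the software. Service names are
# hypothetical examples, not a real inventory.
SERVICES_USED = ["HTTP client", "XML parser", "task scheduler", "logging"]

def what_if_questions(services):
    """Return one starter question per service, plus one for the runtime."""
    questions = [
        f"What if the '{service}' service provided by the framework fails?"
        for service in services
    ]
    questions.append("What if a framework runtime component fails?")
    return questions

for question in what_if_questions(SERVICES_USED):
    print(question)
```

<p>The output is only a starting checklist; each question still has to be analyzed against the actual software implementation.</p>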
<h5>Maintenance of frameworks and runtimes</h5>
<p>Framework versions are usually frozen at the beginning of design, unless you know that a new version of the framework will be released during your development project, or unless you are agile enough to change framework versions on the fly (but that is not today's discussion)!<br />
Changing framework versions poses the same problems as for OS's. Depending on what's new in the version, migrating may impose a full verification and validation of the software.
<br />
<br />
In <a href="https://blog.cm-dm.com/post/2013/05/16/Got-SOUP-Part-4-Open-Source-Software">the next post</a>, we'll see how to deal with open-source libraries.</p>https://blog.cm-dm.com/post/2013/05/31/Got-SOUP-Part-3-Runtimes%2C-Frameworks#comment-formhttps://blog.cm-dm.com/feed/atom/comments/104Got SOUP? - Part 2 - OS, Drivers, Runtimesurn:md5:e356b245492c6d406604c87428d994c02013-05-24T14:03:00+02:002013-06-24T07:53:09+02:00MitchStandardsIEC 62304risk managementsoftware failureSOUP<p>We've seen <a href="https://blog.cm-dm.com/post/2013/05/10/Got-SOUP-Part-1-Because-every-good-software-starts-with-SOUP">in the last article</a> what a SOUP is and what is not a SOUP, according to IEC 62304.<br />
We've also seen that a lot of third-party software components are SOUPs, beginning with OS's, drivers, runtimes, Just-In-Time (JIT) compilers and frameworks.<br />
How to deal with those to be compliant with IEC 62304?</p> <h4>What IEC 62304 expects for SOUP</h4>
<p>Requirements about SOUP are spread across the IEC 62304 standard. Mainly, these requirements are about:</p>
<h5>Requirements</h5>
<p>The manufacturer shall describe which requirements (functional or non-functional) are necessary to have the SOUP work.</p>
<h5>Architecture</h5>
<p>The manufacturer shall define the software architecture to have the SOUP work in appropriate conditions.</p>
<h5>Maintenance</h5>
<p>The manufacturer shall monitor the SOUP lifecycle: patches, new versions...</p>
<h5>Risk Analysis</h5>
<p>It is mandatory to do a risk analysis related to the SOUP.</p>
<h5>Configuration management</h5>
<p>SOUPs have to be placed under configuration management.<br />
<em>OK, so take the right measures. I won't discuss this configuration management requirement any further.</em>
<br />
<br />
That said, this list doesn't tell you how to comply with these requirements. Let's see what can be done.</p>
<h4>OS is a SOUP</h4>
<p>OS's deliver dozens of services; some are basic (file system, IP stack), some are of a higher level (multi-screen management, default archiving tool…).<br />
Depending on its safety class, the integration of your software with the OS may or may not be taken for granted. For class C, it may be relevant to list the basic services that are used and to test them individually. For other classes, it is a case-by-case decision based on risk analysis.</p>
<h5>Requirements</h5>
<p>IEC 62304 requires that the manufacturer describes which requirements are necessary to have the SOUP work.<br />
In the case of an OS, we take the problem in reverse order. It is more relevant to define requirements to have your software work with the services delivered by the OS. Perhaps not all services (yes, you use the file system), but only those that are relevant and/or peculiar to your software.</p>
<h5>Architecture</h5>
<p>IEC 62304 requires that the manufacturer defines the software architecture to have the SOUP work in appropriate conditions.<br />
Once again, you may take the problem in reverse order and define your architecture to have it work on the OS. With this architecture, you may end up with hardware requirements, to make the OS + software run in appropriate conditions.<br /></p>
<h5>Maintenance</h5>
<p>IEC 62304 requires that the manufacturer monitors the SOUP lifecycle: patches, new versions…<br />
There are two main possible cases:</p>
<ul>
<li>Either the OS version is frozen at design time. E.g. your software works on Debian Linux 6.0.7 with the KDE desktop.</li>
<li>Or the OS version is user-dependent. For example, your software works on Windows XP SP3 or higher.</li>
</ul>
<p>In the frozen case, only cybersecurity updates may impose a change of version.<br />
In the open case, OS updates should be analyzed to verify that they neither degrade software performance nor introduce new safety issues.<br />
That may lead to a complete verification of your software when you update the OS. This verification depends on the types of changes in the OS. A complete verification is very probable when you upgrade to a newer version, whereas a few tests may be enough if you apply a security patch.</p>
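<p>One possible safeguard, sketched below under purely hypothetical assumptions (the validated configurations are invented), is to have the software check at startup that it runs on an OS configuration that was actually covered by verification:</p>

```python
import platform

# Sketch: warn when the software starts on an OS configuration that was
# not covered by verification. The validated list is hypothetical.
VALIDATED_CONFIGS = {
    ("Windows", "10"),     # e.g. the OS version frozen at design time
    ("Linux", "5.10"),     # e.g. a frozen Debian kernel series
}

def os_is_validated(system=None, release=None):
    """Return True if the given (or current) OS matches a validated config."""
    system = system or platform.system()
    release = release or platform.release()
    return any(system == s and release.startswith(r)
               for s, r in VALIDATED_CONFIGS)

if not os_is_validated():
    print("Warning: this OS version was not covered by verification.")
```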
<h5>Risk Analysis</h5>
<p>It is mandatory to do a risk analysis related to the SOUP. But how do you make a risk analysis related to an OS?<br />
The risk analysis is based on the probability and consequences of the OS not providing its services with the expected performance.<br />
<br />
As we wrote in the introduction, services provided by OS's are taken for granted. It's when your software works in limit conditions that OS failures can happen.<br />
If your software reads and writes a 5 kB file every minute, we know that it works on a standard PC. But if your software reads and writes a 5 GB file every minute, it may not work. In this case, some preliminary tests are necessary to verify that the OS can hold the load over time. And the SW/HW architecture may need to be updated as well.<br />
<br />
The safety class of the software also matters in the risk analysis. For class C software, it may be relevant to scrutinize each OS service, whereas for other classes, a macroscopic assessment of the HW + OS + SW integration may be enough.</p>
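<p>The preliminary tests mentioned above can be as simple as the sketch below: read-write a file of the target size repeatedly and record how long each cycle takes. The sizes and iteration counts are placeholders; a real test would run at the actual load, on the target hardware, over a long period:</p>

```python
import os
import tempfile
import time

# Sketch of a preliminary load test: write then read back a file of a
# given size, several times, and record the duration of each cycle.
def read_write_cycle_times(size_bytes, iterations):
    durations = []
    payload = b"\x00" * size_bytes
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "load_test.bin")
        for _ in range(iterations):
            start = time.perf_counter()
            with open(path, "wb") as f:
                f.write(payload)
            with open(path, "rb") as f:
                data = f.read()
            durations.append(time.perf_counter() - start)
            assert len(data) == size_bytes  # the OS returned what we wrote
    return durations

times = read_write_cycle_times(5 * 1024, 3)  # 5 kB, 3 cycles
print(f"max cycle time: {max(times):.6f} s")
```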
<h4>Driver is a SOUP</h4>
<p>Drivers are more or less additional services plugged into the OS. Even if the service they provide is peculiar to a dedicated piece of hardware, drivers should be treated in a very similar, albeit not identical, way to OS's.<br />
The main difference is the presence of the hardware peripheral controlled by the driver. That means that you also have to verify that the hardware works in appropriate conditions.</p>
<h5>Requirements</h5>
<p>As for the OS, it is relevant to define requirements to have your software work with the services delivered by the driver.</p>
<h5>Architecture</h5>
<p>Likewise, it is necessary to define your architecture to have it work with the driver and the underlying hardware. With this architecture, you may end up with hardware requirements, to make OS + drivers + software run in appropriate conditions.</p>
<h5>Risk Analysis</h5>
<p>The risk analysis is based on the probability and consequences of the driver not providing its services with the expected performance.<br />
But there is a new condition. The risk analysis shall also be based on the probability and consequences of the connected hardware not working with the expected performance. This hardware risk analysis may have been completed at the system level and may be an input to software design.
<br />
<br />
In <a href="https://blog.cm-dm.com/post/2013/05/31/Got-SOUP-Part-3-Runtimes%2C-Frameworks">the next post</a>, we'll continue our journey in the SOUP territory, by paying a visit to runtimes and frameworks.</p>https://blog.cm-dm.com/post/2013/05/24/Got-SOUP-Part-2-OS%2C-Drivers%2C-Runtimes#comment-formhttps://blog.cm-dm.com/feed/atom/comments/100Class A, B and C. Is it possible to reduce the documentation of detailed design of software medical devices?urn:md5:271f627c838c9f07b77aef61837b248b2013-01-21T15:34:00+01:002013-01-25T17:44:56+01:00MitchStandardsClassificationcritical softwareFDAIEC 62304software failuresoftware itemsoftware unit<p>In the last two posts, we've seen <a href="https://blog.cm-dm.com/post/2013/01/11/What-is-a-Software-Unit">what a software unit is</a>, and <a href="https://blog.cm-dm.com/post/2013/01/18/Class-A%2C-B-and-C.-When-to-do-detailed-design-of-software-medical-devices">when to do software detailed design</a>, according to IEC 62304 and FDA Guidances.</p> <p>For Class C software, detailed design can be a very burdensome and time consuming task.<br /></p>
<h4>Detailed design and its verification</h4>
<p>The IEC 62304 standard requires doing the detailed design of every software unit and verifying this detailed design.<br />
This means that a lot of time has to be devoted to detailed design documentation:</p>
<ul>
<li>Component/class/sequence/collaboration/deployment diagrams,</li>
<li>Comments that explain the design choices highlighted by these diagrams.</li>
</ul>
<p>This also means that reviews have to be planned to formally verify the design.<br />
I bet that your managers don't give you enough time and money to do these tasks with the level of scrutiny required by the standard!</p>
<h4>Verification of software units</h4>
<p>The IEC 62304 standard also requires verifying the software units: in concrete terms, doing unit tests to show that software units work the way they were designed, and formally reviewing these unit tests!<br />
OK, most software development teams do unit tests today; unit tests have become customary. But the problem is that for class C, the tests shall be systematic and should consider the following criteria:</p>
<ul>
<li>data flows,</li>
<li>memory allocation,</li>
<li>fault tolerance,</li>
<li>and a few others of the same type.</li>
</ul>
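<p>To illustrate the fault tolerance criterion, here is a hedged sketch (the <code>dose_rate</code> function is invented for illustration): the unit tests exercise not only the nominal path, but also invalid and non-finite inputs:</p>

```python
import math

# Hypothetical software unit under test: computes an infusion rate (ml/min).
def dose_rate(volume_ml, duration_min):
    """Return the rate in ml/min, or raise ValueError on unusable inputs."""
    if not all(map(math.isfinite, (volume_ml, duration_min))):
        raise ValueError("non-finite input")
    if volume_ml < 0 or duration_min <= 0:
        raise ValueError("out-of-range input")
    return volume_ml / duration_min

# Unit tests: nominal path plus fault-tolerance cases.
def test_nominal():
    assert dose_rate(100.0, 50.0) == 2.0

def test_zero_duration_rejected():
    try:
        dose_rate(100.0, 0.0)
        assert False, "zero duration must be rejected"
    except ValueError:
        pass

def test_nan_rejected():
    try:
        dose_rate(float("nan"), 10.0)
        assert False, "NaN input must be rejected"
    except ValueError:
        pass

for test in (test_nominal, test_zero_duration_rejected, test_nan_rejected):
    test()
```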
<p>So unit tests with this level of scrutiny can become a very burdensome task. Once again, I bet that managers don't give you the necessary time to do all of these tests.<br />
<br />
<br />
<em>I'm being a bit provocative; there are medical device manufacturers that apply the standard to the letter. Ask the infusion pump manufacturers what they think of <a href="http://www.fda.gov/MedicalDevices/ProductsandMedicalProcedures/GeneralHospitalDevicesandSupplies/InfusionPumps/ucm202511.htm">FDA research about software failures</a>!</em>
<br />
<br />
Fortunately, it's not necessary to do Class C documentation for the whole software, provided that its architecture has been designed the right way.<br /></p>
<h4>Software Architecture to isolate Class C software items</h4>
<p>Imagine a software system with a Class C software item inside larger software items, like in the diagram below.<br />
<img src="https://blog.cm-dm.com/public/18-SW-units/.Slide4_m.jpg" alt="Software in medical devices - Class C software item forces the whole topmost software item to be in Class C" style="display:block; margin:0 auto;" title="Software in medical devices - Class C software item forces the whole topmost software item to be in Class C, janv. 2013" />
The class C critical item in red propagates its critical characteristics to the whole topmost item and to its sibling items. Even a class A item has to be treated as a class C item.<br />
Why? Because it is known that a software failure that happens in one item can propagate to another item and have unexpected consequences. So a class A item, which is more subject to failures, can lead to the failure of a class C item. This makes the work on the class C item useless.<br />
To avoid dreadful consequences of software failures of non-critical items on critical items, it's necessary either to do everything in the highest class, or to isolate critical items.<br />
This is what is done in the diagram below:<br />
<img src="https://blog.cm-dm.com/public/18-SW-units/.Slide5_m.jpg" alt="Software in medical devices - Isolate Class C software items to reduce the scope of application of class C requirements" style="display:block; margin:0 auto;" title="Software in medical devices - Isolate Class C software items to reduce the scope of application of class C requirements, janv. 2013" />
The critical item is isolated in a topmost item, with an interface. It communicates with other topmost items through its interface.<br />
The isolation may also be physical, with the class C items running on a separate hardware.<br />
Doing so, the scope of class C items is reduced, and the other topmost items can be of lower classes.<br />
Though this work of architectural isolation is possible, it shall always be done as a consequence of a risk analysis:</p>
<ol>
<li>Risk analysis and definition of classes of items,</li>
<li>Risk mitigation and isolation with architectural measures.</li>
</ol>
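<p>Step 2 can be sketched as follows (all names are hypothetical, and real isolation may also be physical, with the class C item on separate hardware): the class C item sits behind a narrow, well-defined interface, and everything that crosses the boundary is validated before the critical code runs:</p>

```python
# Sketch: isolate a class C item behind a narrow interface so that
# sibling items can stay in a lower class. All names are hypothetical.
class DoseComputationInterface:
    """The only entry point into the class C item."""

    def compute_dose(self, patient_weight_kg, concentration):
        # Validate everything that crosses the class C boundary.
        if patient_weight_kg <= 0 or concentration <= 0:
            raise ValueError("invalid input crossing the class C boundary")
        return self._critical_compute(patient_weight_kg, concentration)

    def _critical_compute(self, weight, concentration):
        # Class C internals, developed and verified with full scrutiny.
        return weight * concentration

# A lower-class item talks to the critical item only through the interface.
dose = DoseComputationInterface().compute_dose(70.0, 0.5)
print(dose)  # → 35.0
```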
<p>It shall not be done the other way around!
<br />
<br />
As a conclusion, I would say that there are not many ways to escape the work overload due to class C items. The right solution is to isolate class C items in a subsystem with well-defined (and simple) interfaces. If you can't do this kind of architectural separation, you're stuck with class C tasks!</p>https://blog.cm-dm.com/post/2013/01/21/Class-A%2C-B-and-C.-Is-it-possible-to-reduce-the-documentation-of-detailed-design-of-software-medical-devices#comment-formhttps://blog.cm-dm.com/feed/atom/comments/80What is a Software Unit?urn:md5:e95c607e1be81134e9c4adadd4797ab12013-01-11T14:08:00+01:002019-07-25T09:15:03+02:00MitchStandardsIEC 62304risk managementsoftware failuresoftware itemsoftware unit<p>IEC 62304 requires splitting the architecture of class C (mission-critical) software into software items and software units. Software units are software items that can't be split into sub-items, according to the standard. Okay. But how do you decide that an item can't be split into sub-items, and is therefore a unit?</p> <p>There was a long and excellent discussion about this subject on the <a href="http://elsmar.com/Forums/showthread.php?t=47402&page=2">elsmar cove forum</a> a few years ago. I use some outputs of this discussion in my post. I draw different conclusions, however.<br /></p>
<h4>Short answer</h4>
<p>Here is the short answer that comes to the mind of any developer.
A software unit is:</p>
<ul>
<li>a set of procedures or functions, in a procedural or functional language,</li>
<li>a class and its nested classes, in an object-based or object-oriented language.</li>
</ul>
<p>Be it procedural or object-oriented, these procedures/functions/classes are grouped in a source file.<br />
This is certainly right in many cases. But I think it's not relevant in all cases. Let me argue!</p>
<h4>Software equals modeling</h4>
<p>Software design is modeling how data are processed. The least detailed model is the user requirements or use cases. The most refined model is ... the software code itself that does the job. In between, we have a few levels of granularity, usually:</p>
<ul>
<li>Systems and subsystems,</li>
<li>Main software items, which can be
<ul>
<li>Processes or services if we think about deployment,</li>
<li>Topmost packages or components if we think about development</li>
</ul></li>
<li>Software items of lower level (level 1) nested inside the main items,</li>
<li>Software items of lower level (level 2) nested inside level 1 items,</li>
<li>... and so on down to software units.</li>
</ul>
<p><img src="https://blog.cm-dm.com/public/18-SW-units/.Slide1_m.jpg" alt="Software in medical devices - software items and software units" style="display:table; margin:0 auto;" title="Software in medical devices - software items and software units, janv. 2013" /></p>
<h4>Don't go too low</h4>
<p>Delving into the details of software, one can argue that a software unit is:</p>
<ul>
<li>a processor instruction,</li>
<li>a language instruction,</li>
<li>a line of code,</li>
<li>a loop,</li>
<li>a branch of an <em>if then else</em> instruction.</li>
</ul>
<p>We don't go that far in modeling because we all know that there is nothing to gain from such a level of scrutiny. There is a level at which the modeling ends and the programming begins.<br />
But when does the programming begin?</p>
<h4>Think about patterns</h4>
<p>Software engineers and developers are used to thinking in patterns. There are lots of reusable patterns, like the very generic <a href="http://en.wikipedia.org/wiki/Design_Patterns_(book)">design patterns of the Gang of Four</a>, more specific patterns, or patterns found in software libraries like GUIs.<br />
So a set of classes/procedures/functions grouped to implement a pattern could be seen as a software unit.<br />
Another argument in favor of patterns appears when refactoring is necessary. Most of the time, we don't refactor a single class/procedure/function. We refactor a set of classes/procedures/functions to enhance their performance or their consistency.<br />
Thinking in patterns also brings a dynamic view of software behavior that doesn't appear in single classes/functions. The pattern is used for a specific purpose that can be highlighted. Thanks to the pattern, we can highlight interactions between software units. We can also communicate easily about the internal classes/procedures inside a well-known pattern.</p>
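<p>As a small illustration (the alarm classes below are invented), an Observer pattern can be treated as one software unit: the subject and its observers are designed, documented and tested together, and the unit's behavior is the interaction between them:</p>

```python
# Sketch: an Observer pattern treated as a single software unit. The
# classes are hypothetical; the unit is the pattern as a whole.
class AlarmSubject:
    """Subject: broadcasts alarm messages to registered observers."""
    def __init__(self):
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def notify(self, message):
        for observer in self._observers:
            observer.update(message)

class LogObserver:
    """Observer: records every message it receives."""
    def __init__(self):
        self.received = []

    def update(self, message):
        self.received.append(message)

subject = AlarmSubject()
log = LogObserver()
subject.attach(log)
subject.notify("pressure high")
print(log.received)  # → ['pressure high']
```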
<h4>Think about interfaces</h4>
<p>Interfaces are often a source of bugs and software failures. How deep into the software items is it necessary to go to describe the functioning of an interface?<br />
If an interface is not well described by the software units representing it, then they probably aren't software units. They are software items that need to be refined into software items/units that describe the interface well.<br />
<img src="https://blog.cm-dm.com/public/18-SW-units/.Slide3_m.jpg" alt="Software in medical devices - refining software items to describe an interface" style="display:table; margin:0 auto;" title="Software in medical devices - refining software items to describe an interface, janv. 2013" /></p>
<h4>Think about risks</h4>
<p>We always have to think about risks in the medical device industry. Risk mitigation can be a good clue to define the level of detail necessary in software modeling.<br />
Is it relevant to split a software item into one or more software units, to explain in which unit the risk resides and/or in which unit it is mitigated?<br />
If yes, then the software item has to be modeled in more detail, with items of lower levels, down to units.<br />
<img src="https://blog.cm-dm.com/public/18-SW-units/.Slide2_m.jpg" alt="software in medical devices - refining software items to describe risk and risk mitigation" style="display:table; margin:0 auto;" title="software in medical devices - refining software items to describe risk and risk mitigation, janv. 2013" />
<br />
Remember, we do all of this (software architecture, modeling, IEC 62304 and so on) to avoid software failures and mitigate risks.<br /></p>
<h4>Use programming language structure</h4>
<p>Programming languages offer many ways to organize sets of classes/procedures/functions:</p>
<ul>
<li>Files,</li>
<li>Modules,</li>
<li>Namespaces,</li>
<li>Packages,</li>
<li>...</li>
</ul>
<p>I personally like organizing software code with packages (mandatory in languages like Java or the .Net family), namespaces (optional, in C++) or modules (in procedural languages). Packages/namespaces/modules of the lowest level can be a good way to name software units and group classes/procedures/functions.
<br /></p>
<h4>Conclusion</h4>
<p>The way we do software modeling, using design patterns, and the way programming languages allow us to organize code, using packages or namespaces, speak in favor of software units of a higher level than classes or procedures in a source file.<br />
There are undoubtedly cases where classes or procedures are the right level for software units (like tiny embedded software in microcontrollers).<br />
But for most software, modeling down to single classes/procedures doesn't bring more information about its functioning and doesn't lead to a better risk analysis.<br />
<br />
<br />
<a href="https://blog.cm-dm.com/post/2013/01/18/Class-A%2C-B-and-C.-When-to-do-detailed-design-of-software-medical-devices">Next article</a> will be about when to do this job, given the software class (A, B, or C) according to IEC 62304.</p>https://blog.cm-dm.com/post/2013/01/11/What-is-a-Software-Unit#comment-formhttps://blog.cm-dm.com/feed/atom/comments/78Probability of occurrence of a software failureurn:md5:c1095e164ae81530e09005c3e3c1d2422012-09-28T12:10:00+02:002012-10-02T07:48:47+02:00MitchMisccritical softwareIEC 62304risk managementsoftware failure<p>In two previous articles, I talked about <a href="https://blog.cm-dm.com/post/2012/09/07/How-to-differenciate-Bugs%2C-Software-Risks-and-Software-Failures-Part-1">the differences between bugs, software failures, and risks</a>.<br />
I left the discussion unfinished regarding the probability of occurrence of a software failure or a defect.<br />
I think that assessing the probability of occurrence of a software failure is a hot subject. I've already seen many contradictory comments on it. It's also a hot subject for software manufacturers that are not well versed in risk assessment.</p> <h3>Risk: probability matters</h3>
<p>We've seen in those previous articles that the criticality of a risk is strongly linked to:</p>
<ul>
<li>the severity of the potential injury,</li>
<li>the probability of occurrence of the injury.</li>
</ul>
<p>The probability of the injury is the combination of the probabilities of events that lead to the injury, including the software failure.
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-7_m.jpg" alt="The probability of the injury is the combination of the probabilities of events that lead to the injury" style="display:block; margin:0 auto;" title="The probability of the injury is the combination of the probabilities of events that lead to the injury, sept. 2012" />
<br />
When the cause of the software failure is a defect, the diagram changes to this:
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-8_m.jpg" alt="The probability of occurrence of the injury is the combination of probabilities of events leading to the injury, including the defect and the software failure." style="display:block; margin:0 auto;" title="The probability of occurrence of the injury is the combination of probabilities of events leading to the injury, including the defect and the software failure., sept. 2012" /></p>
<h4>Probability of software failure</h4>
<p>In the case of a software failure that could lead to an injury, the probability of occurrence of the injury is directly linked to the probability of the software failure.<br />
But it is extremely difficult to set the probability of occurrence of software failures.<br />
So many components are involved in software that laying the foundations of a probabilistic model is seriously compromised.<br />
There are methods based on historical and statistical data, or even probabilistic calculations. But in general, they're not applicable to your case!<br />
<br />
Zooming in on the first diagram above, we're missing a serious piece of the puzzle:
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-9_m.jpg" alt="Software in medical devices - Probability of software failure is unknown" style="display:block; margin:0 auto;" title="Software in medical devices - Probability of software failure is unknown, sept. 2012" /></p>
<h4>Probability of defects</h4>
<p>It's even more difficult to evaluate the probability of defects, since defects come from human errors in coding and oversights in testing.
With bugs, the only thing that we can say is:</p>
<ul>
<li>We've done our best to eliminate bugs, but some may still be hidden,</li>
<li>We've done our best to prevent software failures for which the root cause is a bug, but perhaps there is a case that we haven't seen yet.</li>
</ul>
<p>By the way, the IEC 62304 standard helps us assert this. This is the main purpose of the standard! A better software development process, deeper risk assessment, better software, fewer bugs, fewer software failures!<br />
<br />
Controlled process or not, we can't know the probability of a defect. As with software failures, we're missing a key piece of the puzzle:
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-10_m.jpg" alt="Software in medical devices - Probability of defect is unknown" style="display:block; margin:0 auto;" title="Software in medical devices - Probability of defect is unknown, sept. 2012" /></p>
<h4>If you don't know, take the worst case</h4>
<p>The situation is not desperate, though. There is a very simple measure to take: if we don't know the probability, we should take the worst case. It means that a software failure will happen one day. Somewhere. Somehow.<br />
Thus the probability of occurrence of a software failure should be set to 1.<br />
There is an interesting discussion about that in section 4.4.3 of IEC/TR 80002-1. I invite you to read it.<br />
So our green boxes on the diagram have a probability of 1.
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-11_s.jpg" alt="Defects and software failures have a probability set to 1" style="display:block; margin:0 auto;" title="Defects and software failures have a probability set to 1, sept. 2012" /></p>
<h3>Probability = 1, also for risk???</h3>
<p>It doesn't mean that the probability of risks where software is involved shall be set to 1.
The final probability of a risk is the product of:</p>
<ul>
<li>The probability of the root cause(s) of the software failure, and</li>
<li>The probability of the software failure, when the root cause occurs, and</li>
<li>The probability of the events after the software failure.</li>
</ul>
<p>Since the probability of the software failure is 1, the final probability is equal to the combined probability of the root causes and of the events after the failure.<br />
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-12_m.jpg" alt="Probability of defects and software failure=1. Their "contribution" to the final probability of injury "disappears"" style="display:block; margin:0 auto;" title="Probability of defects and software failure=1. Their "contribution" to the final probability of injury "disappears", sept. 2012" />
<br />
Let me give you some examples.</p>
<h4>A software reads data from a CD-ROM</h4>
<p>I limit this example to one root cause and assume that there is no chain of hazardous events generating the hazardous situation.</p>
<ul>
<li>When the CD is burned, there is a probability <strong>P1</strong> that the burning process generates errors, and that data are corrupted.</li>
<li>When data are corrupted on the CD-ROM, the software may read wrong information or may not be able to read it at all,</li>
<li>When the CD-ROM is corrupted, it's possible that the software:
<ul>
<li>could read the corrupted data, but without any failure (the corrupted data are in a trailing block of unused bytes),</li>
<li>could read the corrupted data with a failure, like corrupted values of pixels displayed in an image,</li>
<li>could not read anything, because the file structure of the CD-ROM is totally corrupted.</li>
</ul></li>
<li>Since I can't determine when it goes right or wrong with my software, depending on which data are corrupted, I set the probability of software failure to 1.</li>
</ul>
<p>So the probability of the risk is equal to the probability of burning a corrupted CD-ROM.
P-risk = P1</p>
<h4>The software fails to monitor the hardware</h4>
<p>This example involves a chain of two hazardous events which generate the hazardous situation.</p>
<ul>
<li>A software program monitors hardware through an analog wire connection in a highly disturbed environment,</li>
<li>There are electromagnetic perturbations that, despite the EMC shielding, are likely to generate errors in the signal,</li>
<li>The analog signal is converted to digital data,</li>
<li>The digital data may have been corrupted by these EMC perturbations, or the converter may not have been able to convert anything at all,</li>
<li>Corrupted or not, the digital data are received by the software,</li>
<li>The software may fail to monitor the hardware (e.g. loss of the position of a motor, inability to measure a sensor value...),</li>
<li>Since I can't determine when it goes right or wrong with my software, depending on what kind of EMC perturbation took place, I set the probability of software failure to 1.</li>
</ul>
<p>So the probability of the risk is equal to the probability of two chained events:</p>
<ul>
<li>P1 = the probability of strong EMC perturbations occurring,</li>
<li>P2 = the probability of the analog-to-digital converter being unable to convert the disturbed analog data.</li>
</ul>
<p>A short probabilistic calculation gives:
P-risk = P1 x P2</p>
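<p>Numerically, with the probability of software failure fixed at 1, it drops out of the product. The probability values below are invented purely for illustration:</p>

```python
# Worst case: the probability of software failure given the root cause
# is unknown, so it is set to 1 and disappears from the product.
# The numeric values are invented for illustration only.
P_EMC_PERTURBATION = 1e-4   # P1: strong EMC perturbation occurs
P_CONVERTER_FAILS = 1e-2    # P2: A/D converter cannot convert the signal
P_SOFTWARE_FAILURE = 1.0    # unknown, therefore worst case

p_risk = P_EMC_PERTURBATION * P_CONVERTER_FAILS * P_SOFTWARE_FAILURE
print(p_risk)  # equal to P1 x P2
```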
<h3>What is the risk linked to a defect in the software?</h3>
<p>This kind of situation is very different from the examples given above, because the root cause is a pitfall in the design of the medical device. In short, the root cause is the software development team, which missed something. I can't be more specific, since there are thousands of reasons to have a defect. IMHO, this is the most difficult situation to grasp.<br />
Why is it so difficult to understand? Because we said that the probability of a software failure is 1. So there will be a defect in the software somewhere, and it will generate a software failure one day, somewhere, somehow.</p>
<h4>IEC 62304 reduces the probability of defects</h4>
<p>Don't look too far for a way to manage this kind of software failure. The main mitigation of risks linked to software failures generated by defects is applying the IEC 62304 standard!<br />
The purpose of the standard is to decrease as sharply as possible the probability of having a defect:</p>
<ul>
<li>Class C: defects are very unlikely,</li>
<li>Class B: defects are unlikely,</li>
<li>Class A: defects are occasional.</li>
</ul>
<p><img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-13_m.jpg" alt="Probability of defects decrease with controled software developement process compliant with IEC 62304" style="display:block; margin:0 auto;" title="Probability of defects decrease with controled software developement process compliant with IEC 62304, oct. 2012" />
Even if we apply IEC 62304 from A to Z, as in class C, there is still a very low probability that a defect occurs.<br />
Software developed under class C constraints is very robust, but we can't assert it is perfect.</p>
<h4>Acceptability of defects</h4>
<p>That being said, it doesn't mean that a software failure linked to a defect, even with a low probability, is acceptable in class C software.<br />
What if a defect generates a software failure that could lead to an extremely hazardous situation, like patient death? Even with an extremely low probability, the severity is so high that it leads to an unacceptable risk.<br />
In this case:</p>
<ol>
<li>either there is a hardware solution to avoid the extremely hazardous situation and decrease the probability even further,</li>
<li>or there is no solution. A software solution is not acceptable, since it might also be subject to defects (we'd be going in circles).</li>
</ol>
<p>Situation #1 is the most comfortable. The software risk is mitigated by something outside software.<br />
In situation #2, the risk remains. The benefit/risk balance has to be assessed with medical experts, to determine if the risk is acceptable.
<br />
<br /></p>
<h4>Conclusion</h4>
<p>It's not possible to assess the probability of a software failure, so it has to be set to 1. Thus the probability of the risk is assessed only from the probabilities of the chain of events that generate the software failure and of the events generated by the software failure.<br />
In the case of software failures generated by defects, the probability of defects is unknown. The purpose of IEC 62304 is to decrease that probability with more stringent development methods according to the class of software.<br />
Risks related to these software failures, however, will still be present. Only the manufacturer's policy on residual risks can determine whether the risk, weighed against the benefits, is acceptable or not.</p>
<h2>How to differentiate Bugs, Software Risks and Software Failures - Part 2</h2>
<p><em>By Mitch, 2012-09-14</em></p>
<p>In my <a href="https://blog.cm-dm.com/post/2012/09/07/How-to-differenciate-Bugs%2C-Software-Risks-and-Software-Failures-Part-1">previous post about Bugs, Software Risks and Software Failures</a>, I explained the concepts of bugs, defects or anomalies, and the concept of software failure.<br />
Let's continue now with <strong>Risks</strong>.</p> <h3>Risks</h3>
<p>Risks are the combination (or the consequence, I should say) of other concepts:</p>
<ul>
<li>Hazardous phenomenon</li>
<li>Hazardous situation</li>
<li>Injury</li>
<li>Gravity of injury</li>
<li>Probability of occurrence of injury</li>
</ul>
<p>These concepts are chained:</p>
<ul>
<li>When a hazardous phenomenon happens, the user or other people in the environment is/are placed in a hazardous situation.</li>
<li>This hazardous situation may lead to an injury of the user or of those other people.</li>
<li>The risk is the assessment of:
<ul>
<li>the gravity of the injury, and</li>
<li>the probability of occurrence of the injury.</li>
</ul></li>
</ul>
<p>Low gravity + low probability = low or negligible risk.<br />
High gravity + high probability = high or unacceptable risk.<br />
<br />
The diagram below sums up all of the above:
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-3_s.jpg" alt="Software in medical devices - Risks is the combination of gravity and probability of injury" style="display:block; margin:0 auto;" title="Software in medical devices - Risks is the combination of gravity and probability of injury, sept. 2012" />
There is a diagram in figure E.1 of the ISO 14971 standard that sums up all these concepts in a better way. I don't have the right to reproduce it here, but if you have a copy of the standard, it's worth having a look at it.<br />
<br />
OK, we've defined what a risk is, but not the link between risks and the other concepts.
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-4_m.jpg" alt="Software in Medical Devices - Risks vs Defects and Software Failures - link TBD" style="display:block; margin:0 auto;" title="Software in Medical Devices - Risks vs Defects and Software Failures - link TBD, sept. 2012" />
<br /></p>
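<p>To make the gravity/probability combination concrete, here is a toy risk matrix. The 1-to-3 scales and the acceptability thresholds are invented for illustration; real ones come from your ISO 14971 risk management plan:</p>

```python
# Toy risk estimation: gravity and probability are ranked 1 (low) to 3 (high);
# their product is mapped onto a qualitative risk level.
def risk_level(gravity, probability):
    score = gravity * probability
    if score <= 2:
        return "negligible"
    if score <= 4:
        return "moderate"
    return "unacceptable"

print(risk_level(1, 1))  # low gravity + low probability -> negligible
print(risk_level(3, 3))  # high gravity + high probability -> unacceptable
```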
<h4>Risks vs defects</h4>
<p>Let's now apply the chaining described above to critical or major defects:</p>
<ul>
<li>When a critical or major bug happens, the software crashes or gives wrong results. Thus people are placed in a hazardous situation.</li>
<li>This software crash or erroneous result may lead to an injury of people, like:
<ul>
<li>A lung ventilator is controlled by software and stops --> risk of severe injury!</li>
<li>A PACS viewer is buggy and shows wrong images --> risk of misinterpretation by the radiologist.</li>
</ul></li>
<li>The risk is the assessment of:
<ul>
<li>the gravity of the injury (see the two examples above),</li>
<li>the probability of occurrence of the injury --> the probability of occurrence of a defect.</li>
</ul></li>
</ul>
<p>Ouch! The probability of occurrence of a defect?<br />
<br />
I don't know the probability of occurrence of a defect. You don't know either. Nobody knows!<br />
We'll see later on how to deal with it.<br /></p>
<h4>Risks vs software failure</h4>
<p>Let's continue applying the chaining described above to software failures:</p>
<ul>
<li>When a software failure happens, the software crashes or gives wrong results. Thus people are placed in a hazardous situation.</li>
<li>This software crash or erroneous result may lead to an injury of people (same examples as for defects),</li>
<li>The risk is the assessment of:
<ul>
<li>the gravity of the injury,</li>
<li>the probability of occurrence of the injury --> the probability of occurrence of a software failure.</li>
</ul></li>
</ul>
<p>Ouch! The probability of occurrence of a software failure?<br />
<br />
We'll see later on how to deal with it.<br /></p>
<h5>Same chaining</h5>
<p>What's interesting here is that the chaining is the same for critical or major defects and for software failures.<br />
Remember that (as I said in my previous post):</p>
<ul>
<li>Critical and major defects can lead to a software failure,</li>
<li>Minor and cosmetic defects don't,</li>
<li>A software failure can happen without any defect, for other reasons such as wrong input data or a hardware failure.</li>
</ul>
<p>We can add to this that:</p>
<ul>
<li>A defect can lead to a hazardous situation and a risk,</li>
<li>A software failure can also lead to a hazardous situation and a risk.</li>
</ul>
<h5>Defects and software failures vs risks</h5>
<p>But not all defects and software failures represent a risk.<br />
For example, the archive function doesn't work on your desktop computer because:</p>
<ul>
<li>There's a defect in the software (e.g., a defect in the driver of the CDROM),</li>
<li>The USB port is dead (hardware failure).</li>
</ul>
<p>If the archive function is not used at all for medical purposes but only by the manufacturer for technical purposes, it's possible to say that there's no risk "attached" to this software failure.<br />
<br />
Beware, though: this is a very rare case.<br />
To sum up: in 99% of cases, critical and major defects, and software failures, lead to risks.<br />
<br />
This is represented in the diagram below.
<br />
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-5_m.jpg" alt="Software in Medical Devices - Risks vs Defects vs Software Failures" style="display:block; margin:0 auto;" title="Software in Medical Devices - Risks vs Defects vs Software Failures, sept. 2012" /></p>
<h3>The big picture</h3>
<p>We've seen that:</p>
<ul>
<li>Most defects (mainly the critical and major ones) generate software failures and represent a risk,</li>
<li>Most software failures (whether caused by a defect or not) represent a risk,</li>
<li>Minor defects don't represent a risk,</li>
<li>Some software failures (very few) don't represent a risk.</li>
</ul>
<p>To continue and review all cases:</p>
<ul>
<li>Some risks are linked neither to software failures nor to defects (those originating outside software ...),</li>
<li>Most minor and cosmetic defects don't represent a risk,</li>
<li>Some minor and cosmetic defects may represent a (very remote) risk,</li>
<li>Some defects (very few) may generate a software failure but don't represent a risk.</li>
</ul>
<p>The last assertion is a bit far-fetched but is still possible ...
<br />
<br />
The big picture below summarizes all these cases.
<br />
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/bugs-failure-risk-6.png" alt="Software in Medical Devices - Risks vs Defects vs Software Failures - the big picture" style="display:block; margin:0 auto;" title="Software in Medical Devices - Risks vs Defects vs Software Failures - the big picture, sept. 2012" /></p>
<h4>Changes in the status of defects, software failure and risks</h4>
<p>The status of a defect, a software failure, or a risk is not frozen during the software lifecycle.<br />
In particular, far-fetched or very rare cases may turn into more obvious situations: a critical defect or a software failure that doesn't represent a risk now may represent one in the near future.<br />
Example:</p>
<ul>
<li>A minor defect that represents an unacceptable risk shall be promoted to major or critical,</li>
<li>A defect that generates a software failure, previously not seen as a risk, may represent a risk now.</li>
</ul>
<p>The archive function I used as an example above may come to be used by physicians to store patient data. It is then used for medical purposes, and the software failure becomes risky.</p>
<h3>Conclusion</h3>
<p>I hope I managed to clearly differentiate all of these concepts. I guess reading this article gave you a headache. It gave me one writing it!<br />
Next week, I'll post something more fun.<br />
<br />
Oh, there is one remaining thing: <a href="https://blog.cm-dm.com/post/2012/09/14/Probability-of-occurence-of-a-software-failure">the probability of occurrence of a software failure</a>. We'll see that the week after.
Bye.</p>
<h2>How to differentiate Bugs, Software Risks and Software Failures - Part 1</h2>
<p><em>By Mitch, 2012-09-07</em></p>
<p>A bug can lead to a software failure.<br />
Having bugs is a risk.<br />
Having a software failure is a risk.<br />
A software failure is not necessarily a bug!<br />
<br />
Do you follow me?<br />
If not, let me give you some more explanations.</p> <h3>Bugs</h3>
<p>Let's begin with the easiest concept: bugs.<br />
With the prominence of digital appliances in our everyday life, everyone knows what a bug is:</p>
<div class="post-title" style="text-align:center;">
<p><strong>A bug, it's when it doesn't work.</strong><br /></p>
</div>
<h4>More than one concept behind bugs</h4>
<p>This simple concept encompasses several other concepts that software engineers have defined. It was necessary to do so because the way you solve the problems raised by each concept is different.<br /></p>
<h4>Problems resolution</h4>
<p>IEC 62304 introduces a process named <strong>Problem Resolution</strong>. The scope of this process fits the general concept of a bug: it aims at analysing and solving any bug, any issue that happens with the software in a medical device.<br />
Thus you should trigger the Problem Resolution process whenever you have to manage any of the concepts described below.</p>
<h3>Defects or anomalies</h3>
<p>While bug is a term used in everyday life, engineers - especially testers - prefer using the term <strong>defect</strong>. And they add this precision about the context: it's when it doesn't work in conditions where it's supposed to work.
You did everything you were told, and it doesn't work!
That's because:</p>
<ul>
<li>There is an error in the program, or</li>
<li>Something was missed, or</li>
<li>Something was forgotten.</li>
</ul>
<p><strong>Example</strong> (if by chance you needed one):<br />
A pocket calculator, supposed to work with 10-digit numbers, that gives this result: 2+2=5 --> Bug<br />
For sure there is an error in the program, or something was missed/forgotten.</p>
<h4>Anomalies</h4>
<p>IEC 62304 prefers using the term <strong>Anomaly</strong>. It is defined at the beginning of the document and used many times in the standard. That means IEC 62304 devotes a big part of its effort to reducing the number of anomalies that remain when software is delivered to end-users.</p>
<h4>Characterization of defect</h4>
<p>Defects can be so numerous, diverse and annoying that software engineers have defined a lot of attributes to characterize and categorize them. A bit like entomologists do with real bugs.<br />
A defect has a title and an explanation plus a bunch of criteria. The most used criteria are:</p>
<ul>
<li><ins>Severity</ins>: how it affects the use of software,</li>
<li><ins>Frequency</ins>: how often it appears,</li>
<li><ins>Version of Software</ins>: on which version of the software it appears,</li>
</ul>
<p>Other "administrative" criteria are also useful:</p>
<ul>
<li><ins>Status</ins>: is it open (still there), closed (fixed), pending (being fixed),</li>
<li><ins>Creation date</ins>: when it was found,</li>
<li><ins>Creator</ins>: who found it,</li>
<li><ins>Owner</ins>: who's in charge to fix it.</li>
</ul>
<p>Some companies like to add a looong list of criteria in their databases to characterize defects. This is not very useful, as most engineers only use a small set of them.</p>
<h4>Severity, the most important criterion</h4>
<p>Most of the time, severity is defined with values like:</p>
<ul>
<li><ins>Critical</ins>: software stops or can't be used,</li>
<li><ins>Major</ins>: software doesn't give expected results, but there is a workaround,</li>
<li><ins>Minor</ins>: software gives a result with a non-significant discrepancy compared to expected result.</li>
</ul>
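<p>Put together, the criteria above could be modeled like this. The field names and enum values are my own illustration, not taken from any standard or tool:</p>

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum

class Severity(Enum):
    CRITICAL = "critical"  # software stops or can't be used
    MAJOR = "major"        # wrong results, but a workaround exists
    MINOR = "minor"        # non-significant discrepancy
    COSMETIC = "cosmetic"  # e.g. a misspelled word in the UI

@dataclass
class Defect:
    title: str
    explanation: str
    severity: Severity
    software_version: str
    frequency: str = "unknown"  # how often it appears
    status: str = "open"        # open / pending / closed
    creation_date: date = field(default_factory=date.today)
    creator: str = ""
    owner: str = ""

bug = Defect("2+2=5", "Wrong addition result", Severity.CRITICAL, "1.0.3")
print(bug.status)  # open
```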
<p>Some engineers like to add a <ins>Cosmetic</ins> value, for very minor things. When a word is <a href="http://theoatmeal.com/comics/misspelling">misspelled in the user interface</a> but the whole thing remains understandable, that defect can be set as cosmetic.<br />
<br />
Severity is the most important criterion for software engineers because it is usually the one used to sort defects by priority. The critical defects are fixed first; the cosmetic ones are fixed if time remains.<br />
This way of sorting defects can lead to epic situations when beta testers and engineers review and choose the defects that have to be fixed first.<br />
If you are an end-user who has already participated in this kind of meeting, you have had to deal with a stubborn engineer who didn't want to give high priority to a damned annoying bug!<br />
If you are a software engineer, you have participated in this kind of meeting, and you have had to deal with a stubborn end-user who didn't understand anything about the way your software works and wanted to break something that works pretty well!<br />
If you are a project manager, you know how long this meeting is going to be! :-)<br />
<br />
But I digress.</p>
<h4>In short</h4>
<p>In short, defects are deviations of the software from the expected result. They are usually put into three categories: Critical, Major and Minor.<br />
The diagram below, which I'm going to complete later on, represents defects.
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-1_s.jpg" alt="Software in Medical Devices - Defects or Anomalies and their level of criticity" style="display:block; margin:0 auto;" title="Software in Medical Devices - Defects or Anomalies and their level of criticity, sept. 2012" /></p>
<h3>Software Failures</h3>
<p>Software failure and defect are different concepts for software engineers. The general public tends to confuse them: when it ain't working, it ain't working.<br />
When I gave the engineer-friendly definition of a defect, I added the context: <strong>in conditions where it's supposed to work</strong>.<br />
This context is what differentiates defects from software failures:<br />
A software failure is when it doesn't work, but it was not supposed to work in such conditions.</p>
<h4>Examples</h4>
<h5>Pocket calculator</h5>
<p>If my 10-digit pocket calculator can't compute:</p>
<ul>
<li>2+2=5 --> fails</li>
</ul>
<p>It's a defect.<br />
If my 10-digit pocket calculator can't compute:</p>
<ul>
<li>2000000000000000000000000000 + 2000000000000000000000000000 = <em>unexpected result</em> --> fails</li>
</ul>
<p>It's not a defect but a software failure: the input data are outside the expected range of values.</p>
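<p>The calculator example can be sketched as follows, with a hypothetical <code>add()</code> function. The point is that an input outside the 10-digit specification is not a defect but an out-of-spec condition, which the software may still detect and refuse:</p>

```python
# A "10-digit calculator": inputs within the specification must be handled
# correctly; inputs outside it are beyond the designed conditions of use.
MAX_OPERAND = 10**10 - 1  # largest 10-digit number

def add(a, b):
    if abs(a) > MAX_OPERAND or abs(b) > MAX_OPERAND:
        # Detecting out-of-spec input turns a silent wrong result
        # into a controlled, visible error.
        raise ValueError("operand outside the 10-digit specification")
    return a + b

print(add(2, 2))  # 4 -- if this printed 5, it would be a defect
```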
<h5>PACS</h5>
<p>If my PACS viewer can't read 100% DICOM-compliant data, it's a defect.<br />
If my PACS viewer can't read corrupted data, it's a software failure.<br />
You can see that in this case, the software failure doesn't mean the software is guilty. The data were corrupted over the network, or when burning the CDROM ... not by the PACS viewer.</p>
<h5>Digital thermometer</h5>
<p>The battery of my thermometer is empty and it gives a wrong value. No, my patient is not in hypothermia!<br />
In this case, the software failure comes from a hardware failure. It may (shall!) have been anticipated, with something like a flashing low-battery LED telling me to charge it.</p>
<h5>Human error</h5>
<p>I entered a wrong value and I got a wrong result.<br />
This is probably the type of failure that is the most difficult to avoid.</p>
<h4>Defect vs Failure</h4>
<p>But defects and failures are not so distinct; there is a strong relationship between them.<br />
When there is a defect that makes my software totally or partially unusable, I can say that my software fails! So, a defect can lead to a failure?<br />
The answer is yes. But not for all defects:</p>
<ul>
<li>The defects that make the software totally or partially unusable, i.e. critical and major defects, can lead to a software failure.<br /></li>
<li>The defects that don't make the software unusable, like a misspelled word in the graphical interface, don't lead to a software failure. At worst, they make the software a little bit annoying for the user.</li>
</ul>
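<p>The distinction above boils down to a simple predicate. The severity labels follow the post's own scale, not any standard vocabulary:</p>

```python
# Only defects that make the software totally or partially unusable,
# i.e. critical or major ones, can generate a software failure.
def may_cause_failure(severity):
    return severity in ("critical", "major")

print(may_cause_failure("critical"))  # True
print(may_cause_failure("cosmetic"))  # False
```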
<h4>In short</h4>
<p>In short, a critical or major defect can generate a software failure.<br />
Minor or cosmetic defects don't.<br />
Software failures can be generated by other types of root causes, like hardware failures or user errors.
<img src="https://blog.cm-dm.com/public/15-bug-failure-risk/.bugs-failure-risk-2_m.jpg" alt="Software in Medical Devices - Common set of Defects and Software Failures" title="Software in Medical Devices - Common set of Defects and Software Failures, sept. 2012" />
<br />
<br />
OK. But a minor defect that annoys a user sounds like a risk.<br />
Talking about risks, are there software related risks that are neither defects, nor failures?<br />
<br />
I'll explain this in the <a href="https://blog.cm-dm.com/post/2012/09/14/How-to-differenciate-Bugs%2C-Software-Risks-and-Software-Failures-Part-2">next article</a>. It will deal with the risks and their relationship with defects and failures.</p>