How to Evaluate AI Software for Clinical Practice

Author: Angela Mariani

Introduction

AI software can save clinicians hours of administrative work, but not every tool is appropriate for clinical practice. A product can be impressive in a demo and still be the wrong fit for your clients, your professional obligations, your privacy requirements, or your practice workflows.

For clinicians and business owners, the question is not simply: "Does this tool work?"

The better question is:

Can we use this tool safely, transparently, and responsibly in the context of real clinical care?

This guide is designed to help you evaluate AI software before you introduce it into your practice. It draws on guidance from the Australian Health Practitioner Regulation Agency (Ahpra) on using AI in healthcare, the Australian Commission on Safety and Quality in Health Care's AI Clinical Use Guide, the Therapeutic Goods Administration (TGA) guidance on digital scribes, and the Australian Privacy Principles.

A Note Before You Start

This guide is general information only. It is not legal, privacy, clinical governance, insurance, or professional indemnity advice.

Your obligations may vary depending on your profession, jurisdiction, service model, client group, funding requirements, workplace policies, and how the software is being used. Before implementing AI software across an organisation, consider seeking advice from appropriate legal, privacy, clinical governance, cyber security, and professional indemnity providers.

You should also check your relevant professional board, employer, funder, and state or territory requirements.

1. Start With The Intended Use

Before looking at features, pricing, or user experience, get clear on what the software is actually being used for.

AI tools can be used for many different purposes, including:

  • transcribing sessions
  • drafting progress notes
  • generating reports or letters
  • summarising client information
  • finding or summarising research
  • creating client handouts
  • supporting intake or triage
  • suggesting goals, recommendations, or treatment options
  • analysing clinical data
  • supporting diagnosis, risk assessment, or treatment planning

These uses do not all carry the same level of risk.

A tool that helps format a clinician's own notes is different from a tool that generates diagnostic suggestions. A transcription tool is different from a clinical decision support tool. A general-purpose chatbot is different from software built specifically for healthcare workflows.

The TGA's guidance on digital scribes explains that digital scribes intended only to transcribe and translate clinical conversations into written records are generally not medical devices. However, if a tool analyses or interprets clinical conversations, such as by generating a diagnosis, differential diagnosis, or treatment recommendation not explicitly stated by the healthcare practitioner, it may be regulated as a medical device.

Questions to ask

  • What is the tool designed to do?
  • What is it not designed to do?
  • Does it only support documentation, or could it influence clinical decisions?
  • Does the vendor clearly describe the tool's intended use and limitations?
  • Has the vendor considered whether the product is regulated by the TGA or another relevant regulator?

2. Keep Clinical Accountability With The Clinician

Ahpra's guidance is clear: health practitioners remain responsible for safe, quality care, regardless of what technology is used. Practitioners must apply human judgement to AI outputs and, when using AI scribing tools, are responsible for checking the accuracy and relevance of records created using generative AI.

This matters because AI can produce content that sounds confident but is incomplete, inaccurate, or not clinically appropriate.

When evaluating software, look for tools that support a clear human-in-the-loop workflow. The software should make it easy for clinicians to review, edit, approve, and reject outputs before anything is relied upon or added to the clinical record.

Questions to ask

  • Can clinicians easily review and edit AI-generated content?
  • Is the AI output clearly distinguishable from clinician-entered information?
  • Does the workflow encourage review before finalising notes, reports, or letters?
  • Does the tool make unsupported assumptions, or can it be instructed to use "not documented" where information is missing?
  • Does the vendor provide guidance on hallucinations, inaccuracies, and clinical review?

For example, Everbility's guide on handling suspected AI hallucinations explicitly tells users to double-check generated information, compare it with original documentation, and manually edit or remove inaccuracies before finalising a note.

3. Understand How The Tool Works, At Least Enough To Use It Safely

Clinicians do not need to become software engineers to use AI responsibly. But they do need to understand enough about a tool to know when it is appropriate, when it is risky, and when it should not be used.

Ahpra recommends reviewing product information, including how the tool is trained and tested, its intended use, its limitations, and the clinical contexts where it should not be used.

The Australian Commission's AI Clinical Use Guide also encourages clinicians to critically assess the scope of use, available evidence, safety, performance, risks, and limitations before using an AI tool in clinical care.

Questions to ask

  • Has the tool been tested in settings like yours?
  • Does the vendor explain known limitations?
  • Is there evidence to support the tool's safety, accuracy, and usefulness?
  • Does the tool work for your profession, client group, language, and documentation needs?
  • What happens when the underlying AI model changes?
  • How are updates communicated to users?

Be cautious of any vendor that cannot clearly explain what the tool is for, where it performs well, and where clinicians should be careful.

4. Review Privacy, Data Storage, And Data Training

Healthcare data is sensitive. In Australia, the Office of the Australian Information Commissioner (OAIC) explains that health information is sensitive information under the Privacy Act, meaning stricter requirements apply when handling it.

When AI is involved, you need to understand not only where data is stored, but how it moves through the system and whether it is used to train AI models.

The Australian Privacy Principles govern standards, rights, and obligations around the collection, use, disclosure, governance, security, access, and correction of personal information.

Questions to ask

  • What data does the tool collect?
  • Is audio recorded?
  • Are recordings stored?
  • Where is client data stored?
  • Is data encrypted at rest and in transit?
  • Is client data used to train AI models?
  • Is data shared with third parties or subprocessors?
  • Can client data be deleted?
  • What happens if you leave the platform?
  • Does the vendor comply with the Australian Privacy Principles and relevant health records legislation?
  • If you work internationally, does the tool also address relevant frameworks such as HIPAA or GDPR?

As one example, Everbility's privacy policy states that it complies with the Australian Privacy Principles, the Privacy Act, HIPAA, GDPR, and SOC 2 Type 2. It also states that client notes and report templates are encrypted at rest and in transit, and that data provided to Everbility is not used by OpenAI to train its AI models.

The key point is not that every practice must use the same vendor. The key point is that every practice should know exactly how its chosen tools handle sensitive client information.

5. Obtain Informed Consent

Ahpra states that health practitioners should inform patients and clients about their use of AI and consider any concerns raised. If an AI tool requires input of personal patient or client data, informed consent is generally required. This is particularly important for AI scribes or tools that record consultations.

Consent should not be treated as a vague checkbox. Clients should understand what the tool does, what information is collected, how their information is used, and what alternatives are available if they decline.

Questions to ask

  • When will clients be told AI is being used?
  • What information will they be given?
  • How will consent be recorded?
  • What happens if a client declines?
  • Does the tool support your consent workflow?
  • Does your team have a plain-language script for explaining AI use?

A practical consent explanation might include:

  • why the tool is being used
  • what information is collected
  • whether audio is recorded or stored
  • how the clinician will review outputs
  • how privacy is protected
  • the client's right to ask questions or decline

If your practice uses AI scribes or transcription, consent processes should be especially clear. Everbility provides an editable consent form resource that practices can adapt to suit their own legal, professional, and organisational requirements.

6. Check Security And Compliance Signals

Compliance labels do not remove your professional responsibility, but they can help you assess whether a vendor has invested in privacy and security foundations.

Useful signals may include:

  • Australian Privacy Principles alignment
  • Privacy Act compliance
  • HIPAA compliance where relevant
  • GDPR compliance where relevant
  • SOC 2 Type 2 or equivalent assurance
  • encryption at rest and in transit
  • access controls
  • audit logs
  • breach response processes
  • clear subprocessors list
  • staff confidentiality and privacy training
  • data deletion processes

These signals should be supported by accessible documentation, not just marketing claims.

Questions to ask

  • Can the vendor provide a clear privacy policy?
  • Does the vendor explain its subprocessors?
  • Does the vendor have documented security practices?
  • Does the vendor have a breach notification process?
  • Does your professional indemnity insurer cover your intended use of AI?

Ahpra specifically notes that practitioners should hold appropriate professional indemnity insurance arrangements for all aspects of their practice and consult their provider if unsure whether AI tools are covered.

7. Consider Bias, Equity, And Cultural Safety

AI systems can reflect the data they were trained on. If training data is incomplete, biased, or not representative of your client population, outputs may be less accurate or less appropriate for some people.

Ahpra highlights the need to support the health and safety of Aboriginal and Torres Strait Islander people and all patients and clients from diverse backgrounds by understanding the bias that can exist within data and algorithms.

This matters in allied health, disability, mental health, paediatrics, aged care, and community services, where context is often complex and deeply personal.

Questions to ask

  • Has the tool been tested with populations like yours?
  • Does the tool work well for clients with disability, neurodivergence, mental health complexity, communication differences, cultural and linguistic diversity, or low literacy?
  • Does it handle Australian terminology, funding systems, and practice contexts?
  • Can outputs be personalised for accessibility without changing clinical meaning?
  • How will clinicians identify and correct biased or inappropriate outputs?

Bias is not only a vendor issue. It is also a clinical review issue. Teams need to know when to slow down, question the output, and bring their own professional judgement back to the centre.

8. Evaluate Workflow Fit, Not Just Features

A tool may be technically impressive but still fail in practice if it does not fit how clinicians actually work.

Good clinical software should reduce friction, not add another layer of admin. It should fit around real workflows: sessions, reports, progress notes, team meetings, supervision, letters, templates, client handouts, and practice management systems.

Questions to ask

  • Does it support your actual documentation types?
  • Can you customise templates to your profession and funding context?
  • Can clinicians keep their own clinical voice?
  • Can teams share approved templates or resources?
  • Does it support supervision or quality review?
  • Does it integrate with your practice management system?
  • Can clinicians export or transfer information easily?
  • Does the tool make safe use easier than unsafe workarounds?

For practice owners, workflow fit is also a risk-management issue. If staff are frustrated by approved systems, they may drift toward unapproved tools, including general-purpose AI tools that are not appropriate for client information.

9. Build Governance Before Scaling

AI governance does not need to be complicated, but it does need to exist.

If your practice introduces AI without clear expectations, staff may use it inconsistently. Some may avoid it entirely. Others may use it in ways the business would not approve of if it knew.

This is sometimes called shadow AI: staff using AI tools outside official systems because there is no clear, safe, practical pathway.

A simple AI governance framework might include

  • approved and unapproved tools
  • acceptable and unacceptable use cases
  • consent requirements
  • privacy and data handling rules
  • documentation review expectations
  • staff training
  • incident reporting
  • supervision and quality assurance
  • review dates for policies and tools
  • process for assessing new AI features
  • process for responding to software updates

The goal is not to block innovation. The goal is to make responsible use easy.

Everbility's organisation features, such as organisation templates, organisation knowledge base, and rules, are examples of how a practice can make shared standards easier to apply across a team.

10. Use A Simple Red, Yellow, Green Decision Framework

When comparing tools, it can help to score each area as green, yellow, or red.

Green

The tool is likely suitable to trial or adopt with appropriate safeguards.

  • clear intended use
  • clear privacy policy
  • no client data used for model training without explicit agreement
  • supports informed consent
  • supports clinician review and editing
  • has appropriate security documentation
  • fits the practice workflow
  • staff can be trained to use it safely
  • vendor support is accessible

Yellow

The tool may be suitable, but more investigation is needed.

  • privacy wording is unclear
  • data retention is not fully explained
  • intended use is broader than your planned use
  • limited information about testing or limitations
  • consent workflow needs to be built separately
  • security documentation is incomplete
  • clinicians may need extra training or supervision

Red

The tool should not be used for client information or clinical workflows unless major concerns are resolved.

  • vague or missing privacy policy
  • client data used to train AI by default
  • no clear deletion process
  • no human review workflow
  • claims to replace clinical judgement
  • produces diagnoses or treatment recommendations without appropriate regulation
  • unclear data storage location
  • no meaningful support
  • no consent process
  • vendor cannot explain limitations
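
One way to apply this framework consistently is to record a rating for each area and let the most serious rating set the overall result. The Python sketch below is illustrative only: the area names and example ratings are hypothetical, and the "worst rating wins" rule is an assumption rather than something the framework prescribes.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Traffic-light rating; a higher value means more concern."""
    GREEN = 0
    YELLOW = 1
    RED = 2


def overall_rating(area_ratings: dict[str, Rating]) -> Rating:
    """Return the most severe rating across all areas (assumed rule: worst wins)."""
    if not area_ratings:
        raise ValueError("Rate at least one area before deciding.")
    return max(area_ratings.values())


def areas_to_follow_up(area_ratings: dict[str, Rating]) -> list[str]:
    """List the areas rated yellow or red, most severe first."""
    flagged = [area for area, rating in area_ratings.items() if rating > Rating.GREEN]
    return sorted(flagged, key=lambda area: area_ratings[area], reverse=True)


# Hypothetical ratings for an example tool under evaluation.
example = {
    "intended use": Rating.GREEN,
    "privacy policy": Rating.YELLOW,
    "model training on client data": Rating.GREEN,
    "clinician review workflow": Rating.GREEN,
    "data deletion": Rating.RED,
}

print(overall_rating(example).name)   # RED
print(areas_to_follow_up(example))    # ['data deletion', 'privacy policy']
```

Treating any single red area as blocking mirrors the guidance above that a red tool should not be used for client information until major concerns are resolved.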

11. Vendor Questions Checklist

Before adopting AI software, ask vendors direct questions and keep a record of their answers.

Intended use

  • What is the tool designed to do?
  • What should clinicians not use it for?
  • Does it provide clinical decision support, recommendations, or analysis?
  • Has the product been assessed for whether it is regulated by the TGA or another regulator?

Privacy and data

  • What client data is collected?
  • Is audio recorded or stored?
  • Where is data stored?
  • Is data encrypted?
  • Is client data used to train AI models?
  • Which third parties or subprocessors handle the data?
  • Can we delete client data?
  • What happens to our data if we cancel?

Clinical safety

  • How should clinicians review outputs?
  • What are the known limitations?
  • How does the tool reduce hallucinations or unsupported assumptions?
  • What support is available if an output appears incorrect?
  • How are model or product updates communicated?

Governance

  • Can practice owners manage users, templates, or shared resources?
  • Can the tool support team-wide documentation standards?
  • Is there training for staff?
  • Is there documentation we can use for internal policies?

Consent

  • What information should clients be told?
  • Does the tool provide consent resources?
  • What happens if a client declines AI use?
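
Keeping a record of vendor answers can be as simple as a structured list that shows which questions are still open. The sketch below is one possible approach in Python; the `VendorAnswer` fields, the sample answer, and the date are hypothetical placeholders, not a required format.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class VendorAnswer:
    """One checklist question and the vendor's response (hypothetical structure)."""
    category: str                        # e.g. "Privacy and data"
    question: str
    answer: str = ""                     # stays empty until the vendor responds
    answered_on: Optional[date] = None   # when the answer was received
    source: str = ""                     # e.g. "email" or "privacy policy"


def unanswered(records: list[VendorAnswer]) -> list[VendorAnswer]:
    """Return the questions that still have no recorded answer."""
    return [record for record in records if not record.answer.strip()]


# Hypothetical entries; replace with your own questions and the vendor's replies.
records = [
    VendorAnswer(
        "Privacy and data",
        "Is client data used to train AI models?",
        answer="No, according to the vendor's privacy policy.",
        answered_on=date(2025, 1, 15),
        source="privacy policy",
    ),
    VendorAnswer("Privacy and data", "What happens to our data if we cancel?"),
    VendorAnswer("Consent", "Does the tool provide consent resources?"),
]

# Print the questions that still need a vendor response.
for record in unanswered(records):
    print(f"[{record.category}] {record.question}")
```

Recording the date and source alongside each answer makes it easier to revisit a vendor's claims when the product or its policies change.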

12. Common Questions Clinicians Ask

Why can't I just use ChatGPT?

The short answer is: it depends on what you are using it for, what type of ChatGPT account or workspace you are using, what data you are entering, and whether your organisation has approved that use.

General-purpose AI tools can be useful for low-risk tasks, such as brainstorming generic education ideas, simplifying non-client-specific language, drafting internal checklists, or learning how prompting works.

The risk changes when you enter client information, session details, reports, referral letters, health information, or anything that could identify a person.

OpenAI's data controls guidance states that when individuals use services such as ChatGPT, content may be used to train models unless the user opts out. OpenAI separately states that for business products such as ChatGPT Business, ChatGPT Enterprise, ChatGPT Edu, and the API Platform, inputs and outputs are not used for model training by default. OpenAI's business data privacy page also describes business privacy, security, data retention, compliance, and access control features.

But model training is only one part of the evaluation.

Before using any general-purpose AI tool with client information, you still need to consider:

  • whether your organisation has approved it
  • whether client consent is required
  • whether the tool is appropriate for health information
  • whether the account has the right privacy and security controls
  • whether data retention is acceptable
  • whether there is a data processing agreement or equivalent where needed
  • whether your professional indemnity insurer covers the use
  • whether the workflow supports clinical review
  • whether outputs can be safely stored in the clinical record

If you are using AI for client-related work, a healthcare-specific tool may be more appropriate because it is more likely to be designed around consent, documentation workflows, privacy expectations, and clinical review.

Can I use ChatGPT if I remove the client's name?

Be careful. Removing a name, date of birth, or address does not automatically make information de-identified.

Clinical information can still be identifying when details are specific enough, especially in small communities, schools, workplaces, specialist services, complex disability contexts, or unusual clinical presentations.

Before entering any information into a general-purpose AI tool, ask:

  • Could this information reasonably identify the client?
  • Would the client expect this information to be used this way?
  • Has the organisation approved this tool?
  • Do I understand where the information goes and how it is handled?
  • Is there a safer approved tool or workflow?

When in doubt, treat the information as sensitive and do not enter it into an unapproved system.

My company has a strict no-AI policy. What can I do?

Do not work around the policy or use AI secretly with client information. That can create privacy, employment, clinical governance, and professional risk.

Instead, use the policy as the start of a conversation. Many strict AI bans are created because leaders are worried about real risks: privacy breaches, inaccurate outputs, client consent, reputational harm, unclear indemnity, or staff using public tools without oversight.

You could propose a safer pathway, such as:

  • a staff education session on responsible AI use
  • an internal survey to understand whether shadow AI use is already happening
  • a list of approved and unapproved AI use cases
  • a low-risk trial using no client information
  • a review of healthcare-specific tools
  • a privacy and security assessment
  • a consent workflow
  • an AI use policy that distinguishes admin, documentation, and clinical decision support

A useful starting point is: "Can we create an approved pathway for safe AI use, rather than leaving staff to guess?"

What if staff are already using AI without approval?

Assume this may already be happening and respond with governance, not panic.

Start by understanding what staff are using AI for. Some uses may be low risk, such as drafting generic emails or brainstorming non-client-specific resources. Others may be high risk, such as entering client notes into public tools or relying on AI for clinical reasoning.

Practical next steps:

  • create a short interim AI policy
  • define what staff can and cannot enter into AI tools
  • name approved tools and workflows
  • provide examples of safe and unsafe use
  • explain why client information needs extra care
  • create a process for staff to ask questions without fear
  • review whether a healthcare-specific tool would reduce unsafe workarounds

Blanket bans can sometimes push AI use underground. Clear rules, training, and approved tools are usually safer.

Do I need client consent to use AI?

It depends on the tool, the data involved, the purpose, and your legal and professional obligations.

If an AI tool records a session, transcribes a consultation, processes client information, or uses personal or health information, consent and transparency are likely to be important. Ahpra states that practitioners should inform patients or clients about their use of AI and consider any concerns raised, and that informed consent should be obtained when an AI tool requires input of personal patient or client data.

For lower-risk uses that do not involve client information, such as drafting a generic blog outline or creating a blank template, client consent may not be relevant. Your organisation may still have rules about approved tools.

The safest approach is to build a clear consent process into your workflow, especially for transcription, scribing, and documentation tools.

What if a client says no?

Respect the client's choice and use an alternative workflow.

That might mean typing notes manually, dictating without AI processing, using a non-recording workflow, or completing documentation after the session. The important thing is that clients should not feel pressured to accept AI use to receive care.

Document the discussion and the alternative approach according to your usual record-keeping requirements.

Can AI write clinical recommendations?

AI can help draft, organise, rephrase, or format content, but clinicians should be very cautious about using AI to generate recommendations that require clinical judgement.

If AI suggests goals, strategies, interventions, risk ratings, diagnoses, treatment plans, or recommendations, the clinician must review them carefully and take responsibility for anything that is used.

You should also consider whether the tool is moving from documentation support into clinical decision support. Depending on how the tool works and what it claims to do, different regulatory considerations may apply.

How do we introduce AI without overwhelming the team?

Start small.

A good first implementation is usually one focused use case, such as:

  • drafting session notes from clinician-approved information
  • turning dot points into a letter
  • summarising non-client-specific professional development notes
  • creating generic client education handouts
  • building internal templates

Then add:

  • staff training
  • a consent process
  • a review checklist
  • a feedback pathway
  • a named person responsible for AI governance
  • a date to review what is working and what is not

The goal is not to make every clinician use AI in the same way. The goal is to create a safe, supported environment where clinicians know what is allowed, what is not, and where to get help.

13. Final Takeaway

AI can be incredibly useful in clinical practice. It can reduce documentation burden, support clearer communication, and help clinicians spend more time on the work that matters most.

But responsible AI use starts before the first note is generated.

It starts with understanding the tool, protecting client information, obtaining meaningful consent, keeping clinical judgement at the centre, and building practice-level governance that helps staff use AI safely.

The safest AI tools are not the ones that promise to replace clinicians.

The safest tools are the ones that help clinicians do their work more sustainably, while keeping the clinician in control.

Want to see what responsible AI documentation support can look like in practice?

Everbility was founded by a clinician and built with clinicians in the team, with a focus on helping allied health professionals reduce documentation burden while keeping clinical judgement, privacy, and client care at the centre.

If you're evaluating AI tools for your practice, we can walk you through how Everbility approaches privacy, consent, clinical review, templates, transcription, and team workflows.