How to Train Generative AI Using Your Company’s Data

    Many companies are experimenting with AI programs such as ChatGPT and Claude. They realise that these programs are trained on third-party data and are not suited to answering queries about proprietary information.

    For companies, the ability to leverage proprietary data is crucial to staying ahead of competitors and coming up with new product ideas. Innovation is triggered by the effective use of approaches such as agile and new management ideas, and by new ways of using and combining knowledge assets and expertise. Knowledge is generated in many forms: in processes, policies, transactions, meetings, research papers and so on. These sources are difficult to integrate, which makes it hard to ensure that this body of knowledge is available to everyone in an effective way.

    AI-based applications provide an opportunity to manage this knowledge effectively, enhance organisational capabilities and facilitate innovation. Many companies are using AI systems to reference this treasure trove of knowledge within the organisation. The possibilities are exciting, but some factors make this kind of knowledge management difficult, and they have to be resolved before companies can benefit from AI applications.

    AI technology

    The technology to incorporate a proprietary knowledge base is still evolving and at the moment there are three distinct approaches.

    • Train the program from scratch: One approach is to create your own customised AI program and train the model on unique company data. This can involve a huge investment and is rarely adopted. It requires a huge body of data to train a large language model (LLM), and many companies may not have this resource. The computing power and talent needed are further constraints.
    • Customise an existing program: This involves adding customised content to an existing system that has already been trained on general data. It is much easier, as it only requires fine-tuning some parameters and does not need huge data sets for training. Finding quality talent to customise the program is a big constraint, and some vendors may not permit tinkering with their programs.
    • Prompt-tuning: This is the most common approach. The model itself is left unchanged, and prompts are used to steer its results towards domain-specific information. Many companies choose it because it requires minimal information and minimal investment in skilled talent.
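
    One simple way to picture the prompt-based approach is to leave the model untouched and place relevant company text in the prompt itself. The sketch below does only that: it selects excerpts by plain word overlap and assembles a prompt. It is an illustration under stated assumptions, not the method of any particular vendor. The KNOWLEDGE_BASE contents, the function names and the prompt wording are all hypothetical, and no AI service is called.

```python
import re

# Hypothetical in-memory knowledge base; a real one would sit in a document store.
KNOWLEDGE_BASE = {
    "travel-policy": "Employees must book flights at least 14 days before travel.",
    "expense-policy": "Expense claims must be submitted within 30 days of purchase.",
    "leave-policy": "Annual leave requests need manager approval.",
}


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    scored = []
    for doc_id, text in documents.items():
        overlap = len(question_words & tokenize(text))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def build_prompt(question, documents, top_k=2):
    """Assemble a prompt that carries the selected excerpts as context."""
    doc_ids = retrieve(question, documents, top_k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the company excerpts below. "
        "If they do not contain the answer, say so.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt("When must expense claims be submitted?", KNOWLEDGE_BASE)
print(prompt)
```

    The resulting text would then be sent to whichever AI program the company uses. Word overlap is a deliberately crude stand-in for the search step; it keeps the example self-contained.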

    Curation and Governance

    The content used to train LLMs needs to be of high quality. It must be curated to ensure its accuracy and timeliness and to eliminate duplicates.
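
    Part of this curation can be automated. The sketch below drops documents last updated before a cutoff date and removes copies that are identical once case and spacing are ignored. The record layout (id, text, updated) and the cutoff are assumptions made for illustration. Checking accuracy still needs human review, and near-duplicates with different wording would not be caught.

```python
import hashlib
from datetime import date


def normalise(text):
    """Collapse whitespace and lower-case so trivially different copies match."""
    return " ".join(text.lower().split())


def curate(documents, cutoff):
    """Keep the first copy of each document updated on or after the cutoff."""
    seen = set()
    kept = []
    for doc in documents:
        if doc["updated"] < cutoff:
            continue  # stale: last updated before the cutoff
        digest = hashlib.sha256(normalise(doc["text"]).encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # duplicate of a document already kept
        seen.add(digest)
        kept.append(doc)
    return kept


# Hypothetical records for illustration only.
docs = [
    {"id": "a", "text": "Refunds are issued within 10 days.", "updated": date(2023, 5, 1)},
    {"id": "b", "text": "refunds are issued   within 10 days.", "updated": date(2023, 6, 1)},
    {"id": "c", "text": "Refunds are issued within 30 days.", "updated": date(2019, 1, 1)},
]
curated = curate(docs, cutoff=date(2022, 1, 1))
print([doc["id"] for doc in curated])
```

    Here document b is dropped as a copy of a, and document c is dropped as out of date.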

    Quality assurance and evaluation

    One of the most important aspects of AI content is the quality of its results. Companies need to develop evaluation strategies that check the results of AI queries, so that incorrect information is not used to make business decisions, which could have disastrous consequences.
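
    One possible evaluation strategy is a small set of questions with known answers that is re-run whenever the system or its content changes. The sketch below assumes the AI system can be called as a function returning text; fake_assistant, the questions and the must_contain check are hypothetical stand-ins, and a substring match is only a rough proxy for correctness.

```python
def evaluate(answer_fn, test_cases):
    """Run each question through answer_fn and check for the expected fact."""
    if not test_cases:
        raise ValueError("at least one test case is required")
    failures = []
    for case in test_cases:
        answer = answer_fn(case["question"])
        if case["must_contain"].lower() not in answer.lower():
            failures.append({"question": case["question"], "answer": answer})
    passed = len(test_cases) - len(failures)
    return passed / len(test_cases), failures


# Stand-in for a call to the company's AI system (hypothetical).
def fake_assistant(question):
    canned = {
        "What is the refund window?": "Refunds are issued within 10 days.",
        "Who approves annual leave?": "The HR portal approves it automatically.",
    }
    return canned.get(question, "I do not know.")


# Hypothetical questions with a fact each correct answer should mention.
TEST_CASES = [
    {"question": "What is the refund window?", "must_contain": "10 days"},
    {"question": "Who approves annual leave?", "must_contain": "manager"},
]

score, failures = evaluate(fake_assistant, TEST_CASES)
print(f"pass rate: {score:.0%}")
for failure in failures:
    print("needs review:", failure["question"])
```

    Answers that fail the check are listed for human review rather than corrected automatically.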

    Legal and Governance issues

    The legal and governance issues are still evolving, and there are many aspects to consider, including intellectual property, data security, data bias and inaccuracy. Confidentiality is another major concern. Service providers are addressing it to some extent with features such as proper erasure of data, restrictions on certain knowledge areas, and filtering of proprietary data out of public LLMs.

    Shaping User behaviour

    User experience is critical to the quick and easy adoption of any application. AI systems are, broadly, easy to use and deliver useful results across various domains, and this has resulted in their broad adoption by employees.

    Companies need to benefit from this ease of use while managing the potential risks of AI-delivered results. To do so, transparency and accountability need to be built into the system to instil trust in the use of AI programs. Users need to understand how to incorporate AI capabilities safely into their work to enhance their performance. AI systems are especially useful for automating intensive searches across high-volume tasks, which frees employees to focus on complex problem-solving and decision-making.

    Users need to be trained on:

    • The types of content available on AI systems
    • Creating efficient prompts
    • The types of prompts and queries permitted
    • Proper use of AI results with external stakeholders
    • Requesting additional knowledge content
    • Creating new content

    While there are many challenges involved in building and maintaining AI systems, the overall benefits far outweigh them. The vision of any company should be to allow easy access to the knowledge repository both within and outside the organisation. Generative AI is a technology that shows a lot of promise in this direction.

    How to Train Generative AI Using Your Company’s Data
    by Tom Davenport and Maryam Alavi
    HBR July 06, 2023  