WO2024184738A1 - Validation and maintenance of electronic life sciences industries data - Google Patents
- Publication number
- WO2024184738A1 (PCT/IB2024/051916)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- document
- information
- ingredient
- processors
- validation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06395—Quality analysis or management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/018—Certifying business or products
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H15/00—ICT specially adapted for medical reports, e.g. generation or transmission thereof
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
Definitions
- 63/461,698, entitled “Confidential Disclosures of Pharmaceutical Data” (filed April 25, 2023); U.S. Provisional Application No. 63/449,682, entitled “File Creation Using Artificial Intelligence (AI), Natural Language Processing (NLP), And Other Techniques” (filed March 3, 2023); U.S. Provisional Application No. 63/449,719, entitled “Validation And Maintenance Of Electronic Pharmaceutical Data” (filed March 3, 2023); and U.S. Provisional Application No. 63/449,704, entitled “Confidential Disclosures of Pharmaceutical Data” (filed March 3, 2023), each of which is incorporated by reference herein in its entirety.
- the following discloses a system that validates and/or identifies ingredients.
- the system may start with a single document (or file), such as a document in a standardized format, such as a standardized life sciences document (e.g., a standardized pharmaceutical, biopharmaceutical, nutritional or aroma-ingredient document), thus greatly simplifying the quality and regulation process.
- the validation may be done according to the regulations/laws/standards/rules of different countries and/or companies.
- a computer-implemented method for validating ingredients may be provided.
- the method may include: (1) receiving, via one or more processors, a document including ingredient information; (2) receiving, via the one or more processors, validation information comprising (i) regulatory information from a government entity, and/or (ii) compliance information from a company; and (3) validating, via the one or more processors, an ingredient indicated by the ingredient information based on the validation information.
- a computer system for validating ingredients may be provided.
- the computer system may include one or more processors configured to: (1) receive a document including ingredient information; (2) receive validation information comprising (i) regulatory information from a government entity, and/or (ii) compliance information from a company or third-party organization; and (3) validate an ingredient indicated by the ingredient information based on the validation information.
- Figure 1 illustrates an example system for validating ingredients.
- Figure 2 illustrates an example method for validating ingredients.
- Figure 3 illustrates an example standardized document with a document category of declaration.
- Figure 4 illustrates an example standardized document with a document category of certificate, and a subcategory of certificate of analysis.
- Figure 5 illustrates an example standardized document with a document category of specification.
- Figure 6 illustrates an example method for validating ingredients, including forwarding validated ingredient information.
- Figure 7 depicts a combined block and logic diagram for training an ML chatbot model, in which the techniques described herein may be implemented, according to some embodiments.
- the present embodiments relate to, inter alia, validating ingredients.
- FIG. 1 depicts an exemplary computing environment 100 in which the techniques disclosed herein may be implemented, according to some aspects.
- the computing environment 100 may include a validation computing device 102, which, in some aspects, may implement the techniques described herein.
- the validation computing device 102 may validate an ingredient.
- the validation computing device 102 may include one or more processors 120, such as one or more microprocessors, controllers, and/or any other suitable type of processor.
- the validation computing device 102 may further include a memory 122 (e.g., volatile memory, non-volatile memory) accessible by the one or more processors 120, (e.g., via a memory controller).
- the one or more processors 120 may interact with the memory 122 to obtain and execute, for example, computer-readable instructions stored in the memory 122.
- computer-readable instructions may be stored on one or more removable media (e.g., a compact disc, a digital versatile disc, removable flash memory, etc.) that may be coupled to the validation computing device 102 to provide access to the computer-readable instructions stored thereon.
- the computer-readable instructions stored on the memory 122 may include applications, such as validator 124.
- the validator 124 may include instructions for, inter alia, validating ingredients.
- the ingredients may be any kind of pharmaceutical, biopharmaceutical, nutritional, or aroma ingredient of a product.
- examples of the ingredients include active pharmaceutical ingredients, inactive pharmaceutical ingredients, carbohydrates, sugars, celluloses, starches, sugar alcohols, artificial sweeteners, microcrystalline cellulose, cellulose ethers, hydroxypropyl methylcellulose, cellulose esters, carboxymethyl cellulose, croscarmellose, glycerine, propylene glycol, polyethylene glycol, polyoxyethylene, polyoxypropylene, poloxamers, povidone, crospovidone, copovidone, petrolatum, mineral oil, acrylic cyclodextrin, beta-cyclodextrin, hydroxypropyl betacyclodextrin, lactose monohydrate, anhydrous lactose, polyethylene glycol-polyvinyl alcohol graft copolymer, methacrylic acid-ethyl acrylate copo
- an ingredient may be selected from a group of a type of ingredients.
- groups of types of ingredients include: active pharmaceutical ingredients, inactive pharmaceutical ingredients, fillers, diluents, suspension agent, coatings, binders, flavoring agents, colorants, lubricants, glidants, preservatives, sweeteners, emollients, consistency factors, viscosity agent, solubilizers, solvents, disintegrants, matrix-formers, thickener, vehicle, metal oxides, emulsifiers, surfactants, oleochemicals, lipids, waxes, fats, fatty acids, fatty alcohols, penetration enhancer.
- these can be naturally derived (animal or plant), vegetable oils, mineral-derived, or synthetic.
- the internal database 118 may hold any suitable information.
- the internal database 118 may hold: standardized or unstandardized life sciences documents (e.g., pharmaceutical, biopharmaceutical, nutritional, or aroma-ingredient documents); pharmaceutical or biopharmaceutical product information; nutritional product information; aroma-ingredient information; regulatory information (e.g., information based on laws of a particular jurisdiction, etc.); compliance information (e.g., information from a company, such as a manufacturer of the product including the ingredient); information of companies (e.g., a company corresponding to the laboratory computing device 140, corresponding to the manufacturer computing device 150, etc.); ingredient information; etc.
- the regulatory information may be based on laws of a particular jurisdiction.
- Examples of the regulatory information include maximum amounts of ingredients in a product, information of ingredients not allowed to be mixed with each other, information of manufacturing practices, information of time periods that the ingredients are required to be current with and/or certified for, etc.
- Examples of the compliance information include maximum amounts of ingredients in a product, information of ingredients not allowed to be mixed with each other, information of manufacturing practices, etc.
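The two rule types named above (maximum amounts and ingredients that may not be mixed) can be sketched as a small check. This is a minimal illustration, not the disclosed implementation: `ValidationInfo`, `find_violations`, the percent-by-weight unit, and all numeric limits are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ValidationInfo:
    """Hypothetical container for regulatory or compliance rules."""
    # Maximum allowed amount per ingredient (unit assumed: percent by weight).
    max_amounts: dict[str, float] = field(default_factory=dict)
    # Unordered pairs of ingredients that must not be combined.
    forbidden_pairs: frozenset[frozenset[str]] = frozenset()


def find_violations(amounts: dict[str, float], info: ValidationInfo) -> list[str]:
    """Return one message per rule that the given ingredient amounts break."""
    violations = []
    # Rule type 1: an ingredient exceeds its maximum allowed amount.
    for name, amount in amounts.items():
        limit = info.max_amounts.get(name)
        if limit is not None and amount > limit:
            violations.append(f"{name}: {amount} exceeds maximum {limit}")
    # Rule type 2: two ingredients that may not be mixed are both present.
    present = set(amounts)
    for pair in info.forbidden_pairs:
        if pair <= present:
            violations.append("must not be combined: " + ", ".join(sorted(pair)))
    return violations


# Illustrative, made-up limits; real values would come from a regulator or company.
info = ValidationInfo(
    max_amounts={"povidone": 5.0},
    forbidden_pairs=frozenset({frozenset({"ingredient A", "ingredient B"})}),
)
print(find_violations({"povidone": 7.5, "lactose monohydrate": 80.0}, info))
```

An empty result would mean that none of the encoded rules were violated; it says nothing about rules that were never encoded.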
- the external database 180 may also hold any suitable information.
- the external database 180 may hold: standardized or unstandardized life sciences documents; nutritional product information; aroma-ingredient information; pharmaceutical or biopharmaceutical product information; regulatory information (e.g., information based on laws of a particular jurisdiction, etc.); compliance information (e.g., information from a company, such as a manufacturer of the product including the ingredient); information of companies (e.g., a company corresponding to the laboratory computing device 140, corresponding to the manufacturer computing device 150, etc.); ingredient information; etc.
- the exemplary computing environment 100 may further include laboratory computing device 140, which may include one or more processors 141 such as one or more microprocessors, controllers, and/or any other suitable type of processor.
- the laboratory computing device 140 may correspond to a laboratory that tests ingredients, and/or proposes ingredients for products. Additionally or alternatively, the laboratory computing device 140 may correspond to a laboratory that is inspecting a product (e.g., inspecting the product to issue a certificate for a standardized document with a category of certificate).
- the exemplary computing environment 100 may further include manufacturer computing device 150, which may include one or more processors 151 such as one or more microprocessors, controllers, and/or any other suitable type of processor.
- the manufacturer computing device 150 may correspond to a manufacturer that manufactures ingredients.
- the exemplary computing environment 100 may further include government computing device 160, which may include one or more processors 161 such as one or more microprocessors, controllers, and/or any other suitable type of processor.
- the government computing device 160 corresponds to a government entity or other regulator.
- the exemplary computing environment 100 may further include administrator computing device 170, which may include one or more processors 171 such as one or more microprocessors, controllers, and/or any other suitable type of processor.
- the network 104 may be a single communication network, or may include multiple communication networks of one or more types (e.g., one or more wired and/or wireless local area networks (LANs), and/or one or more wired and/or wireless wide area networks (WANs), such as the Internet).
- Figure 1 illustrates only one of each of many of the components, such as the validation computing device 102, internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.
- any number of each of the components illustrated in Figure 1 may be included in a system (e.g., multiple validation computing devices 102, internal databases 118, external databases 180, laboratory computing devices 140, manufacturer computing devices 150, government computing devices 160, administrator computing devices 170, etc.).
- FIG. 2 illustrates an example method 200 for validating ingredients.
- the blocks of the example method 200 may be performed by the one or more processors 120.
- although the example description below refers to blocks of the method as performed by the one or more processors 120, it should be understood that any of the blocks may be performed by any suitable component (e.g., the one or more processors 141, the one or more processors 151, the one or more processors 161, the one or more processors 171, etc.).
- the example method 200 begins at block 210 when the one or more processors 120 receive a document (or file; although this discussion refers to receiving a document, this should not be construed as limiting, and it should be understood that the discussion applies equally for receiving a file rather than a document) including ingredient information.
- the document may be received from any suitable source.
- the document may be received from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.
- the document is a standardized life sciences document.
- the standardized document may include a plurality of standardized document elements, and the standardized document elements of the plurality of standardized document elements may include respective: element names; element types comprising (i) selection start, (ii) field, or (iii) selection end; field types; a mandatory or optional marking; and/or element values.
- the element names and/or element values indicate names of ingredients.
- the standardized document may have a category (e.g., specification, declaration, or certificate), and a subcategory.
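The element structure described above (element name, element type, field type, mandatory or optional marking, element value, plus a document category and subcategory) might be modelled as follows. This is a sketch under stated assumptions: the class names, the `field_type` labels, and the helper methods are hypothetical and do not come from the disclosure.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ElementType(Enum):
    """The three element types named in the description."""
    SELECTION_START = "selection start"
    FIELD = "field"
    SELECTION_END = "selection end"


@dataclass
class Element:
    name: str
    element_type: ElementType
    field_type: Optional[str] = None  # e.g. "text" or "number" (assumed labels)
    mandatory: bool = False
    value: Optional[str] = None


@dataclass
class StandardizedDocument:
    category: str  # "specification", "declaration", or "certificate"
    subcategory: str
    elements: list[Element]

    def value_of(self, name: str) -> Optional[str]:
        """Look up an element value directly by its element name."""
        for element in self.elements:
            if element.name == name:
                return element.value
        return None

    def missing_mandatory(self) -> list[str]:
        """Names of mandatory field elements that have no value yet."""
        return [
            e.name
            for e in self.elements
            if e.element_type is ElementType.FIELD and e.mandatory and not e.value
        ]


doc = StandardizedDocument(
    category="declaration",
    subcategory="allergens",
    elements=[
        Element("ingredient name", ElementType.FIELD, "text", True, "lactose monohydrate"),
        Element("electronic signature", ElementType.FIELD, "text", True),
    ],
)
print(doc.value_of("ingredient name"), doc.missing_mandatory())
```

Because every value is addressed by a known element name, a validator can read it directly instead of searching free text for it.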
- a standardized document (rather than an unstandardized document, such as an unstandardized email, PDF, PowerPoint file, etc.) streamlines the validation process. For example, the system is able to determine validity more quickly and more accurately because the system “knows” exactly where to pull the information from rather than having to search for the information (e.g., using an NLP algorithm, etc.) or guess where the information is; that is, the systems described herein reduce the likelihood of erroneous acquisitions of information that would come from an unstandardized document.
- a standardized document (e.g., with a category of specification, declaration, or certificate) may be created from a plurality of unstandardized documents (e.g., emails, PDF files, etc.) sent by a company; thus, for validation, the one or more processors 120 will have to review fewer documents than a system that reviews unstandardized documents.
- the ingredient information (received at block 210 via the document) comprises ingredient type information
- the standardized life sciences document includes a document category and a document subcategory.
- the document subcategory indicates the ingredient type information.
- the standardized document may have a subcategory corresponding to: active pharmaceutical ingredients, excipients, fillers, diluents, suspension agent, coatings, binders, flavoring agents, colorants, lubricants, glidants, preservatives, sweeteners, emollients, consistency factors, viscosity agent, solubilizers, solvents, disintegrants, matrix-formers, thickener, vehicle, metal oxides, emulsifiers, surfactants, oleochemicals, lipids, waxes, fats, fatty acids, fatty alcohols, and penetration enhancers.
- the ingredient information may also include amounts of ingredient(s) in a product, concentrations of ingredient(s) in a product, and/or properties of the ingredient (e.g., chemical properties, acidity levels, etc.).
- the document comprises a standardized life sciences document having a document category of declaration (for example, a declaration about an ingredient that is required by applicable law) and a subcategory of: aflatoxins, allergens, Genetically Modified Organisms (GMO), Good Manufacturing Practice (GMP), Manufacturing Procedure, melamine, nitrosamine, and Transmissible Spongiform Encephalopathy (TSE) / Bovine Spongiform Encephalopathy (BSE).
- Aflatoxin declarations may include information on the absence, or the levels present, of specific undesirable substances produced by categorized molds. Allergen declarations may include information on the occurrence of specified allergens as referred to, for example, in European and US regulatory provisions. GMO declarations may include information on the genetically modified organism status of the ingredient as described by the manufacturing party. GMP declarations may include information on the status of Good Manufacturing Practice as described by the manufacturing party. Manufacturing Procedure declarations may include basic details of an ingredient’s manufacturing process. Melamine declarations may include information on the non-use of melamine in the manufacturing process. Nitrosamine declarations may include information on a risk evaluation for nitrosamines following international industry associations and authorities. TSE and BSE declarations may include information on the non-use of BSE/TSE-relevant material of animal or human origin in the production process.
- subcategories of declaration documents include: GMP Compliance, Manufacturing Flow Chart, Site Information, Quality Summary, Regulatory Summary, Technical Information, Stability Report, Specification, Aflatoxin Statement, Allergens Statement, BSE/TSE Statement, Contaminant Statement, Dioxin Statement, GMO Statement, Halal statement or certificate, FDA inspection letter or statement, melamine statement, Gluten statement, Natural Latex statement, Nano Statement, Microplastic Statement, Microbial Information Note or Statement, Nitrosamine Risk Assessment, REACH Statement, Phthalate Statement, Prop 65 Statement, Residual Solvents Statement, Pharmacopeial compliance statement or note, Product Carbon Footprint Statement or Document, Social and/or Environmental Statements, Dossier, Technical Packet, and/or drug master file (DMF).
- some of the standardized document elements apply to each subcategory.
- examples include elements with element names indicating: an ingredient name, file information (e.g., information of the standardized document, including a version of the standard), an email address (e.g., an email address for questions related to the file information and format, or an email address for questions related to the data content), a document generation date, a document generation time, a company disclaimer, an electronic signature, etc.
- standardized document elements may correspond to specific subcategories.
- for the allergen subcategory, there may be standardized document element names indicating, inter alia, allergen name (or identifier), allergen text, allergen footnote, etc.
- Figure 3 illustrates a working example of a standardized document 300 with a document category of a declaration.
- the line 310 indicates the product ID (e.g., name of the product, product identification number, etc.). Put another way, the line 310 shows an element with an element name of product ID and an element value of “Product XYZ.”
- the example standardized document 300 further includes line 320, which shows the declaration(s) being made (e.g., shows an element with an element name of declaration description; and an element value of a description of the declaration being made).
- the declarations 320 include declaration topic 321A; declaration body 321B; declaration data 321C; declaration reference 321D; declaration comment 321E; and declaration footnote 321F.
- Line 330 shows an element with an element name of company name, and an element value of “ABC Corp.”
- Line 340 shows an element with an element name of Company Address, and an element value of “123 Main St.; Montpelier, VT 05601.”
- Line 350 shows an element with an element name of Company Email, and an element value of “Jane.Doe@ABCCorp.com.” It should be appreciated that any or all of the information from lines 310, 320, 330, 340, 350 may be taken by the one or more processors 120 from unstandardized document(s). Additionally or alternatively, the information from lines 310, 320, 330, 340, 350 may be taken from other standardized documents.
- Line 360 shows an element with an element name of electronic signature, which has been signed by the company representative, Jane Doe.
- the one or more processors 120 may send and/or present a link (e.g., to the laboratory computing device 140, the manufacturer computing device 150, the government computing device 160, the administrator computing device 170, the validation computing device 102, etc.) so that the standardized document 300 may be electronically signed. Additionally or alternatively, the standardized document 300 may be signed by any other suitable technique.
- the document (received at block 210) comprises a standardized life sciences document having a document category of a certificate (for example, a certificate about an ingredient that is required by applicable law) and a subcategory of: a certificate of analysis; a third-party certification of a good manufacturing practice; a third-party certification of quality management; or a third-party certification of Kosher, Halal, or RSPO (Roundtable on Sustainable Palm Oil) compliance.
- the validating may include validating the ingredient based on the document category and/or subcategory.
- the certification category and/or corresponding subcategories may be used subsequently in the validation process (e.g., at block 230).
- Figure 4 illustrates another working example of a standardized document 400.
- the example standardized document 400 has a document category of a certificate, and a subcategory of certificate of analysis.
- the line 410 indicates the product ID (e.g., name of the product, product identification number, etc.). Put another way, the line 410 shows an element with an element name of product ID and an element value of “Product XYZ.”
- the example certificate of analysis 400 includes line 420, which shows the certification(s) being made (e.g., shows an element with an element name of certification description, and an element value of a description of the certification being made).
- the certifications include parameters, such as parameters 421A, 421B, 421C, 421D, 421E, 421F, 422A, 422B, 422C, 422D, 422E, 422F.
- the example certifications 420 include: Tested Parameter 1 - Name [text]: Water; Tested Parameter 1 - Description [text]: Humidity; Tested Parameter 1 - Dimension [Unit of measure]: g/100g; Tested Parameter 1 - Numeric Value [number]: 4.2; Tested Parameter 1 - Method of Analysis [text]: Ph.Eur.
- Line 430 shows an element with an element name of certifier, and an element value of “Bob Smith.”
- Line 440 shows an element with an element name of Certifying Company, and an element value of “ZZZ Inspection Corp.”
- Line 450 shows an element with an element name of Certificate Number, and an element value of “12345.”
- Line 460 shows an element with an element name of Date of First Certification, and an element value of “1/1/2024.” It should be appreciated that any or all of the information from lines 410, 420, 430, 440, 450 may be taken by the one or more processors 120 from unstandardized document(s). Additionally or alternatively, the information from lines 410, 420, 430, 440, 450 may be taken from other standardized documents (e.g., document(s) sent by Bob Smith of ZZZ Inspection Corp., etc.).
- Line 470 shows an element with an element name of electronic signature, which has been signed by the inspection company representative, Bob Smith.
- the one or more processors 120 may send a link to Bob Smith (e.g., send the link to the laboratory computing device 140, etc.) so that the standardized document 400 may be electronically signed. Additionally or alternatively, the standardized document 400 may be signed by any other suitable technique.
- subcategories of certificate include: aflatoxins; allergens; genetically modified organisms (GMO); good manufacturing practice (GMP); manufacturing procedure; melamine; nitrosamine; and Transmissible Spongiform Encephalopathy (TSE) or Bovine Spongiform Encephalopathy (BSE).
- a third party may inspect and/or certify that a particular toxin is not present in the product, or present below a certain threshold amount.
- a third party may inspect and/or certify that a particular allergen is not present in the product, or present below a certain threshold amount.
- a third party may inspect and/or certify that the company is adhering to a GMP. Analogous inspections and/or certifications may be made for the other listed example subcategories.
- the document (received at block 210) comprises a standardized life sciences document having a document category of specification and a subcategory of product specification.
- Product specifications may include information on criteria, their dimensions and guaranteed levels or statements for a defined product.
- the standardized document may have document elements with names indicating: specification data, specification parameters, test method, description, reference standard, etc.
- Figure 5 illustrates one working example of a standardized document 500.
- the example standardized document 500 has a document category of a specification.
- the line 510 indicates the product ID (e.g., name of the product, product identification number, etc.). Put another way, the line 510 shows an element with an element name of product ID and an element value of “Product XYZ.”
- Line 520 shows an element with an element name of specification data, and an element value to be filled in with information from standardized or unstandardized documents in the creation of standardized document 500.
- the specification data includes specification parameters, such as specification parameters 521A, 521B, 521C, 521D, 521E, 521F, 522A, 522B, 522C, 522D, 522E, 522F.
- the example specification data 520 includes: Specification Parameter 1 - Name [text]: Water; Specification Parameter 1 - Description [text]: Humidity; Specification Parameter 1 - Dimension [Unit of measure]: w% (g/100g); Specification Parameter 1 - Numeric Value [number]: NMT 5; Specification Parameter 1
- Line 530 shows an element with an element name of company name, and an element value of “ABC Corp.”
- Line 540 shows an element with an element name of Company Address, and an element value of “123 Main St.; Montpelier, VT 05601.”
- Line 550 shows an element with an element name of Company Email, and an element value of “Jane.Doe@ABCCorp.com.” It should be appreciated that any or all of the information from lines 510, 520, 530, 540, 550 may be taken by the one or more processors 120 from the unstandardized document(s). Additionally or alternatively, the information from lines 510, 520, 530, 540, 550, 560 may be taken from other standardized document(s).
- Line 560 shows an element with an element name of quality signature, which has been signed by the company representative, Jane Doe.
- the one or more processors 120 may send and/or present a link (e.g., to the laboratory computing device 140, the manufacturer computing device 150, the government computing device 160, the administrator computing device 170, the validation computing device 102, etc.) so that the standardized document 500 may be electronically signed. Additionally or alternatively, the standardized document 500 may be signed by any other suitable technique.
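The example values in Figures 4 and 5 suggest one plausible check: comparing a tested parameter from a certificate of analysis (Water, 4.2 g/100g) against the matching specification limit (Water, NMT 5). The text above does not state that the system performs this comparison, so the sketch below is an assumption. `within_limit` is a hypothetical helper; it reads "NMT" as "not more than" and also accepts "NLT" ("not less than"), and it assumes both values are already in the same unit.

```python
import re


def within_limit(measured: float, limit: str) -> bool:
    """Check a measured value against a limit such as 'NMT 5' or 'NLT 98.0'.

    Assumes the measured value and the limit use the same unit of measure.
    """
    match = re.fullmatch(r"\s*(NMT|NLT)\s+(\d+(?:\.\d+)?)\s*", limit)
    if match is None:
        raise ValueError(f"unsupported limit expression: {limit!r}")
    kind, bound = match.group(1), float(match.group(2))
    # NMT is an upper bound; NLT is a lower bound.
    return measured <= bound if kind == "NMT" else measured >= bound


# Figure 4 reports Water = 4.2 g/100g; Figure 5 specifies Water as NMT 5.
print(within_limit(4.2, "NMT 5"))
```

A production check would also need to reconcile unit labels (the two figures write the dimension as "g/100g" and "w% (g/100g)") before comparing numbers.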
- standardized documents with a category of declaration are made and/or signed by the manufacturer of the ingredient (e.g., via the manufacturer computing device 150); whereas, standardized documents with a category of certificate are made and/or signed by a third party (e.g., an inspector; and/or signed via the laboratory computing device 140).
- the one or more processors 120 receive validation information.
- the validation information may be received from any suitable source.
- the validation information may be received from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.
- the validation information comprises: (i) regulatory information from a government entity (e.g., a government corresponding to government computing device 160; in some examples, the government entity may be a government entity of a federal, state, or local government), and/or (ii) compliance information from a company (e.g., a company corresponding to the laboratory computing device 140, corresponding to the manufacturer computing device 150, etc.).
- Validation information may be aggregated together.
- the one or more processors 120 may: receive first validation information (e.g., from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.); then receive second validation information (e.g., from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.); and then aggregate the first validation information with the second validation information to thereby create the validation information to be used.
- the first validation information may be received from the government entity and comprise regulatory information
- the second validation information may be received from the company and comprise compliance information.
- the validation information may also be updated.
- the one or more processors 120 may receive third validation information (e.g., from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.), and then update the validation information with the third validation information.
- the one or more processors 120 may replace either the first validation information or the second validation information with the third validation information.
- the one or more processors 120 may replace a list of ingredients not to combine with a new list of ingredients not to combine.
- the one or more processors 120 may replace a previous maximum amount allowed of an ingredient in a product with a new maximum amount allowed of the ingredient in the product.
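- The aggregation and update steps above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `ValidationInfo` container, its field names, and the rule of keeping the stricter limit when two sources disagree are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ValidationInfo:
    """Hypothetical container; the field names are illustrative assumptions."""
    max_amounts: dict[str, float] = field(default_factory=dict)      # ingredient -> maximum amount allowed
    do_not_combine: set[frozenset[str]] = field(default_factory=set)  # pairs of ingredients not to combine


def aggregate(first: ValidationInfo, second: ValidationInfo) -> ValidationInfo:
    """Merge two sources; when both set a limit, keep the lower one (an assumption)."""
    merged = dict(first.max_amounts)
    for name, limit in second.max_amounts.items():
        merged[name] = min(limit, merged.get(name, limit))
    return ValidationInfo(merged, first.do_not_combine | second.do_not_combine)


def update(current: ValidationInfo, third: ValidationInfo) -> ValidationInfo:
    """Replace superseded entries: new limits overwrite old ones, and a
    non-empty new do-not-combine list replaces the previous list."""
    limits = {**current.max_amounts, **third.max_amounts}
    pairs = third.do_not_combine or current.do_not_combine
    return ValidationInfo(limits, set(pairs))


# First validation information (regulatory) and second (compliance).
regulatory = ValidationInfo({"ingredient_x": 0.05}, {frozenset({"ingredient_x", "ingredient_y"})})
compliance = ValidationInfo({"ingredient_x": 0.03, "ingredient_z": 0.10})
validation_info = aggregate(regulatory, compliance)

# Third validation information replaces the previous maximum for ingredient_x.
validation_info = update(validation_info, ValidationInfo({"ingredient_x": 0.02}))
```

- In this sketch the update replaces a limit outright, mirroring the "replace a previous maximum amount allowed" example, whereas aggregation reconciles two sources that are both still in force.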
- Updates may also be requested by the one or more processors 120.
- the one or more processors 120 may: determine a validity date of the document from the validation information, wherein the validity date is comprised in an element of the document, and wherein the validity date comprises: (i) a valid from date, or (ii) a valid to date; compare the validity date of the document to a current date to thereby determine that the document is not valid; in response to the determination that the document is not valid, request an additional document; in response to requesting the additional document, receive the additional document; and update the validation information based on the additional document.
- the additional document may be requested and/or received from any suitable source (e.g., from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.).
- the validation computing device 102 may request confirmation from the originator of the document that there have been no updates to the document. If the originator of the document confirms that there have been no updates to the document, the validation computing device 102 may extend the validity date of the document by a predetermined amount (e.g., extend the validity date by one year, two years, etc.).
- At block 230, the one or more processors 120 validate an ingredient indicated by the ingredient information (e.g., based on the validation information).
- the one or more processors 120 determine a maximum amount of the ingredient based on the validation information (e.g., a maximum amount from either the regulatory information, and/or the compliance information); and compare the maximum amount of the ingredient to an amount of the ingredient from the ingredient information.
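- As a rough sketch of the maximum-amount comparison, the function below takes a limit from the regulatory information and/or the compliance information and compares it to the amount from the ingredient information. The function names are hypothetical, and choosing the lower of the two limits when both exist is an assumption; the description only says the maximum may come from either source.

```python
from typing import Optional


def max_allowed(regulatory_max: Optional[float], compliance_max: Optional[float]) -> Optional[float]:
    """Return the strictest limit that is defined, or None if neither source sets one."""
    limits = [m for m in (regulatory_max, compliance_max) if m is not None]
    return min(limits) if limits else None


def amount_is_valid(amount: float,
                    regulatory_max: Optional[float],
                    compliance_max: Optional[float]) -> bool:
    """Compare the ingredient amount to the applicable maximum amount."""
    limit = max_allowed(regulatory_max, compliance_max)
    # With no limit from either source, this sketch treats the amount as acceptable.
    return limit is None or amount <= limit
```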
- the one or more processors 120 determine a validity date of the document from the validation information; and compare the validity date to a current date. However, in some scenarios, the document does not include a validity date.
- the one or more processors 120 may handle this scenario in any of the following ways.
- the one or more processors 120 may request (e.g., from the laboratory computing device 140, the manufacturer computing device 150, the external database 180, the internal database 118, etc.) an additional document with a validity date. Additionally or alternatively, the one or more processors 120 may automatically determine the ingredient to be invalid. Additionally or alternatively, the one or more processors 120 may simply make the validation determination on other factors rather than making the validation based on a validity date (e.g., essentially waiving any requirement for a validity date).
- the validation computing device 102 may periodically (e.g., once a year, once every three months, etc.) check in with the originator of the document to ensure that the document is current. If the document is not current, the validation computing device 102 may request an additional document.
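- The validity-date handling described above, including the three ways of treating a document with no validity date and the extension by a predetermined amount, might look like the following sketch. The enum names, the `check_validity` signature, and the one-year default are assumptions introduced for illustration.

```python
from datetime import date
from enum import Enum
from typing import Optional


class MissingDatePolicy(Enum):
    REQUEST_DOCUMENT = "request"   # ask for an additional document with a validity date
    INVALID = "invalid"            # automatically treat the ingredient as invalid
    WAIVE = "waive"                # decide on other factors, waiving the date requirement


class Outcome(Enum):
    VALID = "valid"
    INVALID = "invalid"
    REQUEST_ADDITIONAL_DOCUMENT = "request_additional_document"


def check_validity(valid_from: Optional[date], valid_to: Optional[date],
                   today: date, policy: MissingDatePolicy) -> Outcome:
    """Compare a document's validity date(s) to the current date."""
    if valid_from is None and valid_to is None:
        return {
            MissingDatePolicy.REQUEST_DOCUMENT: Outcome.REQUEST_ADDITIONAL_DOCUMENT,
            MissingDatePolicy.INVALID: Outcome.INVALID,
            MissingDatePolicy.WAIVE: Outcome.VALID,
        }[policy]
    if valid_from is not None and today < valid_from:
        return Outcome.REQUEST_ADDITIONAL_DOCUMENT
    if valid_to is not None and today > valid_to:
        return Outcome.REQUEST_ADDITIONAL_DOCUMENT
    return Outcome.VALID


def extend_validity(valid_to: date, years: int = 1) -> date:
    """Extend a valid-to date after the originator confirms there were no updates."""
    try:
        return valid_to.replace(year=valid_to.year + years)
    except ValueError:  # 29 February mapped onto a non-leap year
        return valid_to.replace(year=valid_to.year + years, day=28)
```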
- the one or more processors 120 validate the ingredient based on an electronic signature of the document.
- the one or more processors 120 validate the ingredient based on the standardized life sciences document comprising a category of certificate and/or its corresponding subcategory.
- the standardized life sciences document comprises a category of certificate, and includes elements with element names, such as: certificate number; date of first certification; company name (e.g., name of company making the certification); electronic signature; etc.
- the one or more processors 120 may also identify a particular ingredient for validation. For example, the one or more processors 120 may determine properties (e.g., pharmaceutical, biopharmaceutical, nutritional, and/or aroma-ingredient properties) of the ingredient by analyzing text of the document; and identify the ingredient based on the determined properties. In another example, the one or more processors 120 determine two ingredients from the validation information, and then perform the validation by determining if the two ingredients may be combined based on the validation information.
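- The second example above, checking whether two ingredients may be combined, can be illustrated with a small lookup against a do-not-combine list. This is only a sketch; representing each forbidden combination as an unordered pair (`frozenset`) and the name `forbidden_pairs` are assumptions.

```python
from itertools import combinations
from typing import Iterable, List, Set, FrozenSet


def forbidden_pairs(ingredients: Iterable[str],
                    do_not_combine: Set[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Return every pair of the given ingredients that is on the do-not-combine list."""
    unique = sorted(set(ingredients))
    return [
        pair
        for pair in map(frozenset, combinations(unique, 2))
        if pair in do_not_combine
    ]


# Hypothetical validation information: ingredients "a" and "b" may not be combined.
rules = {frozenset({"a", "b"})}
conflicts = forbidden_pairs(["a", "b", "c"], rules)
```

- An empty result would mean that no listed combination rule is violated by the ingredients checked.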
- a certificate may be constructed certifying the validation.
- the one or more processors 120 may construct a standardized life sciences document having a document category of a certificate and a subcategory of: a certificate of analysis; a third-party certification of a good manufacturing practice; a third-party certification of quality management; or a third-party certification of Kosher compliance.
- the constructed certificate/standardized document may include an electronic signature. Additionally or alternatively, the constructed certificate/standardized document may include the ingredient information and/or validation information.
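- One way to picture the constructed certificate is as a small record with a category, a subcategory, and named elements. The sketch below is an assumption-laden illustration: the `StandardizedDocument` class, its fields, and the element names are hypothetical, and only the four subcategories are taken from the description.

```python
from dataclasses import dataclass, field
from typing import Optional

CERTIFICATE_SUBCATEGORIES = {
    "certificate of analysis",
    "third-party certification of a good manufacturing practice",
    "third-party certification of quality management",
    "third-party certification of Kosher compliance",
}


@dataclass
class StandardizedDocument:
    category: str
    subcategory: str
    elements: dict = field(default_factory=dict)      # element name -> element value
    electronic_signature: Optional[str] = None


def construct_certificate(subcategory: str, ingredient_info: dict,
                          validation_info: dict,
                          signature: Optional[str] = None) -> StandardizedDocument:
    """Build a certificate that records the validated ingredient information."""
    if subcategory not in CERTIFICATE_SUBCATEGORIES:
        raise ValueError("unknown certificate subcategory: " + subcategory)
    elements = {
        "ingredient information": ingredient_info,
        "validation information": validation_info,
    }
    return StandardizedDocument("certificate", subcategory, elements, signature)
```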
- the one or more processors 120 may also cause a display device to display an indication of whether the ingredient has been validated.
- the display device may be any display device, such as a display device of any of the laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.
- Figure 6 illustrates an example method 600 for validating ingredients, including forwarding validated ingredient information.
- the blocks of the example method 600 may be performed by the one or more processors 120.
- although the example description below refers to blocks of the method as performed by the one or more processors 120, it should be understood that any of the blocks may be performed by any suitable component (e.g., the one or more processors 141, the one or more processors 151, the one or more processors 161, the one or more processors 171, etc.).
- the example method 600 begins at block 610 when the one or more processors 120 receive a document (e.g., similarly to block 210 of Figure 2).
- the one or more processors 120 receive validation information (e.g., similarly to block 220 of Figure 2).
- the one or more processors 120 may attempt to validate the ingredient information (e.g., as described elsewhere herein, for example, with respect to block 230 of Figure 2).
- the one or more processors 120 may send an indication of the unsuccessful validation (e.g., to an entity that sent the document at block 210/610, such as the laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.).
- the indication may cause a display (e.g., a display of any of the validation computing device 102, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.) to display information indicating that the validation was unsuccessful.
- the display may also display information indicating why the validation was not successful; for example, the display may display “ingredient XYZ was present in too high a level to comply with the laws of state ABC,” or “the system was not able to verify the electronic signature on the document.”
- the one or more processors 120 determine if the document is a certificate or a declaration (e.g., a standardized document with a document category of a certificate or declaration) at block 650. If the document is a certificate or a declaration, at block 660, the one or more processors 120 may forward the certificate or declaration (e.g., to the government computing device 160, the laboratory computing device 140, the manufacturer computing device 150, the administrator computing device 170, etc.). However, in some embodiments, the one or more processors 120 may check the certificate or declaration based on a confidentiality rule and/or a privacy rule, and refrain from forwarding if a rule is violated.
- the one or more processors 120 may construct, at block 670, a certificate or declaration (e.g., a standardized document with a category of certificate or declaration).
- Examples of documents that are not certificates or declarations include: a standardized document with a category of a specification, a document that an inspector has input into the system that indicates levels of ingredients, etc.
- the constructed certificate or declaration may include the (validated) ingredient information from the document received at block 210/610.
- the one or more processors 120 may facilitate the ingredient manufacturer electronically signing the declaration (e.g., by allowing the ingredient manufacturer to electronically sign the declaration via the manufacturer computing device 150).
- the one or more processors 120 may facilitate a third party (e.g., an inspector) electronically signing the certificate (e.g., by allowing the third party to electronically sign the certificate via the laboratory computing device 140).
- the one or more processors 120 may forward the constructed certificate or declaration (e.g., to the government computing device 160, the manufacturer computing device 150, the laboratory computing device 140, the administrator computing device 170, etc.). However, in some embodiments, the one or more processors 120 may check the certificate or declaration based on a confidentiality rule and/or a privacy rule, and refrain from forwarding if a rule is violated.
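- The control flow of example method 600 can be summarized in a short sketch. Everything named here is hypothetical (the `Document` class, the `confidential` flag standing in for a confidentiality and/or privacy rule, and the `validate` callback); the block numbers in the comments are the ones named in the description.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Document:
    category: str                 # e.g., "certificate", "declaration", "specification"
    ingredient_info: dict
    confidential: bool = False    # hypothetical stand-in for a confidentiality/privacy rule


def process_document(
    doc: Document,
    validate: Callable[[dict], Optional[str]],
    construct_as: str = "declaration",
) -> Tuple[str, Optional[Document]]:
    """Return (action, document) for one received document.

    `validate` returns None on success or a human-readable failure reason.
    """
    reason = validate(doc.ingredient_info)
    if reason is not None:
        # Unsuccessful validation: report the reason back to the sender.
        return ("unsuccessful: " + reason, None)

    if doc.category in ("certificate", "declaration"):     # block 650
        outgoing = doc                                     # forwarded at block 660
    else:
        # Block 670: construct a certificate or declaration carrying the
        # validated ingredient information.
        outgoing = Document(construct_as, doc.ingredient_info, doc.confidential)

    if outgoing.confidential:
        return ("withheld", outgoing)   # refrain from forwarding if a rule is violated
    return ("forwarded", outgoing)


def check_level(info: dict) -> Optional[str]:
    """Toy validator: fails when the amount exceeds a hypothetical 5% limit."""
    if info["amount"] > 0.05:
        return "ingredient %s was present in too high a level" % info["name"]
    return None
```

- For instance, a specification with a compliant amount would be converted into a declaration and forwarded, while a certificate with an excessive amount would produce only the failure indication.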
- Chatbot aspects including example training of a chatbot
- a chatbot may converse with a party to obtain the necessary information to validate an ingredient.
- examples of the chatbot may include a generative AI chatbot, a generative pre-trained transformer chatbot (ChatGPT), a large language model (LLM)-based chatbot, etc.
- the chatbot may be trained by validation computing device 102 using large training datasets of text and/or data which may provide sophisticated capability for natural-language tasks, such as answering questions and/or holding conversations.
- the chatbot may include a general-purpose pretrained LLM which, when provided with a starting set of words (prompt) as an input, may attempt to provide an output (response) of the most likely set of words that follow from the input.
- the prompt may be provided to, and/or the response received from, the chatbot and/or any other ML model, via a user interface of the validation computing device 102.
- This may include a user interface device operably connected to the server via an I/O module.
- Exemplary user interface devices may include a touchscreen, a keyboard, a mouse, a microphone, a speaker, a display, and/or any other suitable user interface devices.
- Multi-turn (i.e., back-and-forth) conversations may require LLMs to maintain context and coherence across multiple user utterances, which may require the chatbot to keep track of an entire conversation history as well as the current state of the conversation.
- the chatbot may rely on various techniques to engage in conversations with users, which may include the use of short-term and long-term memory.
- Short-term memory may temporarily store information (e.g., in the memory 122 of the validation computing device 102) that may be required for immediate use, and may be used to keep track of the current state of the conversation and/or to understand the user’s latest input in order to generate an appropriate response.
- Long-term memory may include persistent storage of information (e.g., the internal database 118 of the validation computing device 102) which may be accessed over an extended period of time.
- the long-term memory may be used by the chatbot to store information about the user (e.g., preferences, chat history, etc.) and may be useful for improving an overall user experience by enabling the chatbot to personalize and/or provide more informed responses.
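- The split between short-term and long-term memory can be illustrated with a small class. This is a sketch under assumptions: the class and method names are hypothetical, a bounded `deque` stands in for short-term memory, and an in-memory dictionary stands in for persistent storage such as a database.

```python
from collections import deque


class ConversationMemory:
    def __init__(self, max_turns: int = 6):
        # Short-term memory: only the most recent turns of the current conversation.
        self.short_term = deque(maxlen=max_turns)
        # Long-term memory: per-user information kept across conversations.
        self.long_term = {}

    def add_turn(self, speaker: str, text: str) -> None:
        self.short_term.append((speaker, text))

    def remember(self, user_id: str, key: str, value: str) -> None:
        self.long_term.setdefault(user_id, {})[key] = value

    def build_prompt(self, user_id: str, new_message: str) -> str:
        """Combine stored user information, recent turns, and the latest input."""
        profile = self.long_term.get(user_id, {})
        lines = ["[profile] %s: %s" % (k, v) for k, v in sorted(profile.items())]
        lines += ["%s: %s" % (speaker, text) for speaker, text in self.short_term]
        lines.append("user: " + new_message)
        return "\n".join(lines)
```

- The prompt built this way gives the model both the current state of the conversation and whatever is known about the user, which is the role the two kinds of memory play in the description above.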
- the systems and methods to generate and/or train an ML chatbot model which may be used in the chatbot may include three steps:
- (1) a supervised fine-tuning (SFT) step, in which a pretrained language model (e.g., an LLM) may be fine-tuned on a relatively small amount of demonstration data curated by human labelers to learn a supervised policy (SFT ML model), which may generate responses/outputs from a selected list of prompts/inputs. The SFT ML model may represent a cursory model for what may be later developed and/or configured as the ML chatbot model;
- (2) a reward model step, in which human labelers may rank numerous SFT ML model responses to evaluate the responses which best mimic preferred human responses, thereby generating comparison data. The reward model may be trained on the comparison data; and/or
- (3) a policy optimization step, in which the reward model may further fine-tune and improve the SFT ML model. The outcome of this step may be the ML chatbot model using an optimized policy.
- step one may take place only once, while steps two and three may be iterated continuously, e.g., more comparison data is collected on the current ML chatbot model, which may be used to optimize/update the reward model and/or further optimize/update the policy.
- Figure 7 depicts a combined block and logic diagram 700 for training an ML chatbot model, in which the techniques described herein may be implemented, according to some embodiments. It should be understood that Figure 7 may apply to training any chatbot described herein. In addition, the chatbot may be trained in accordance with any of the other techniques described herein; and the training of the chatbot should not be considered restricted to the teachings of Figure 7.
- Some of the blocks in Figure 7 may represent hardware and/or software components, other blocks may represent data structures or memory storing these data structures, registers, or state variables (e.g., 712), and other blocks may represent output data (e.g., 725). Input and/or output signals may be represented by arrows labeled with corresponding signal names and/or other identifiers.
- the methods and systems may include one or more blocks 702, 704, 706, which will be described in further detail below.
- a pretrained language model 710 may be finetuned.
- the pretrained language model 710 may be obtained at block 702 and be stored in a memory, such as memory 122 and/or internal database 118.
- the pretrained language model 710 may be loaded into an ML training module at block 702 for retraining/fine-tuning.
- a supervised training dataset 712 may be used to fine-tune the pretrained language model 710 wherein each data input prompt to the pretrained language model 710 may have a known output response for the pretrained language model 710 to learn from.
- the supervised training dataset 712 may be stored in a memory at block 702, e.g., the memory 122 or the internal database 118.
- the data labelers may create the supervised training dataset 712 prompts and appropriate responses.
- the pretrained language model 710 may be fine-tuned using the supervised training dataset 712 resulting in the SFT ML model 715 which may provide appropriate responses to user prompts once trained.
- the trained SFT ML model 715 may be stored in a memory, such as the memory 122 or the internal database 118.
- the supervised training dataset 712 includes historical data (e.g., held by the validation computing device 102, etc.).
- the historical data may include, for example: (a) historical ingredient information, (b) historical validation information, (c) historical standardized documents, (d) historical communications from validation requestors, and/or (e) historical communications from validating parties.
- the chatbot may be trained using the above (a)-(d) as input (e.g., also referred to as independent variables, or explanatory variables), and the above (e) used as the output (e.g., also referred to as a dependent variable, or response variable).
- the chatbot may be trained to generate the above (e) (e.g., generate communications to send to validation requestors).
- the generated communications may be sent in the form of text message, email, or as part of a chat session (e.g., the chatbot generates multiple communications as part of a chat session/conversation), etc.
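- To make the input/output split concrete, the sketch below turns one historical record into a supervised prompt/response pair, with items (a)-(d) forming the prompt and item (e) the target response. The record keys and the JSON serialization are assumptions chosen for the example; the description does not prescribe a data format.

```python
import json

# Hypothetical keys for the historical data items (a)-(d).
PROMPT_FIELDS = (
    "ingredient_information",    # (a)
    "validation_information",    # (b)
    "standardized_document",     # (c)
    "requestor_message",         # (d)
)


def to_sft_example(record: dict) -> dict:
    """Build one supervised fine-tuning example from a historical record."""
    prompt = json.dumps({k: record[k] for k in PROMPT_FIELDS}, sort_keys=True)
    # (e): the historical communication from the validating party is the known output.
    return {"prompt": prompt, "response": record["validator_message"]}


record = {
    "ingredient_information": {"name": "XYZ", "amount": 0.04},
    "validation_information": {"max_amount": 0.03},
    "standardized_document": {"category": "specification"},
    "requestor_message": "Can ingredient XYZ be validated?",
    "validator_message": "Please provide a certificate of analysis for XYZ.",
}
example = to_sft_example(record)
```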
- examples of the historical ingredient information include historical ingredient information corresponding to ingredient information discussed elsewhere herein.
- the historical ingredient information may include historical: ingredient type information, amounts of ingredient(s) in a product, concentrations of ingredient(s) in a product, and/or properties of the ingredient (e.g., chemical properties, acidity levels, etc.), etc.
- the historical validation information may include information corresponding to the validation information discussed elsewhere herein (e.g., may include historical regulatory information from a government entity, historical compliance information from a company, etc.).
- examples of the historical standardized documents include historical standardized documents corresponding to the standardized documents discussed elsewhere herein.
- the historical communications from validation requestors may be taken, for example, from historical conversations (e.g., text conversations, audio conversations, etc.) between validation requestors and validating parties (e.g., a user of validation computing device 102, etc.).
- the historical communications from validation requestors may be the statements made by the validation requestor during the conversation.
- historical communications from validating parties may be taken, for example, from historical conversations (e.g., text conversations, audio conversations, etc.) between validation requestors and validating parties.
- the historical communications from validating parties may be the statements made by a validating party during the conversation.
- Training The Reward Model
- training the ML chatbot model 750 may include, at block 704, training a reward model 720 to provide as an output a scalar value/reward 725.
- the reward model 720 may be required to leverage reinforcement learning with human feedback (RLHF) in which a model (e.g., ML chatbot model 750) learns to produce outputs which maximize its reward 725, and in doing so may provide improved responses.
- Training the reward model 720 may include, at block 704, providing a single prompt 722 to the SFT ML model 715 as an input.
- the input prompt 722 (e.g., any of the above (a)-(d)) may be provided via an input device (e.g., a keyboard) of the validation computing device 102.
- the prompt 722 may be previously unknown to the SFT ML model 715, e.g., the labelers may generate new prompt data, the prompt 722 may include testing data stored on internal database 118, and/or any other suitable prompt data.
- the SFT ML model 715 may generate multiple, different output responses 724A, 724B, 724C, 724D to the single prompt 722. In some embodiments, the different output responses 724A, 724B, 724C, 724D include suggested responses.
- the validation computing device 102 may output the responses 724A, 724B, 724C, 724D via any suitable technique, such as outputting via a display (e.g., as text responses), a speaker (e.g., as audio/voice responses), etc., for review by the data labelers.
- the data labelers may provide feedback (e.g., via the validation computing device 102, etc.) on the responses 724A, 724B, 724C, 724D when ranking 726 them from best to worst based upon the prompt-response pairs.
- the data labelers may rank 726 the responses 724A, 724B, 724C, 724D by labeling the associated data.
- the ranked prompt-response pairs 728 may be used to train the reward model 720.
- the validation computing device 102 may load the reward model 720 and train the reward model 720 using the ranked prompt-response pairs 728 as input.
- the reward model 720 may provide as an output the scalar reward 725.
- the scalar reward 725 may include a value numerically representing a human preference for the best and/or most expected response to a prompt, i.e., a higher scalar reward value may indicate the user is more likely to prefer that response, and a lower scalar reward may indicate that the user is less likely to prefer that response.
- inputting the “winning” prompt-response (i.e., input-output) pair data to the reward model 720 may generate a winning reward.
- Inputting a “losing” prompt-response pair data to the same reward model 720 may generate a losing reward.
- the reward model 720 and/or scalar reward 725 may be updated based upon labelers ranking 726 additional prompt-response pairs generated in response to additional prompts 722.
- a data labeler may provide to the SFT ML model 715 as an input prompt 722, “Describe the sky.”
- the input may be provided by the labeler (e.g., via the administrator computing device 170, etc.) to the validation computing device 102 running the chatbot utilizing the SFT ML model.
- the SFT ML model 715 may provide as output responses to the labeler (e.g., via their respective devices): (i) “the sky is above” 724A; (ii) “the sky includes the atmosphere and may be considered a place between the ground and outer space” 724B; and (iii) “the sky is heavenly” 724C.
- the data labeler may rank 726, via labeling the prompt-response pairs, prompt-response pair 722/724B as the most preferred answer; prompt-response pair 722/724A as a less preferred answer; and prompt-response 722/724C as the least preferred answer.
- the labeler may rank 726 the prompt-response pair data in any suitable manner.
- the ranked prompt-response pairs 728 may be provided to the reward model 720 to generate the scalar reward 725. It should be appreciated that this facilitates training the chatbot to correspond with a user (e.g., a validation requestor) to, for example, request additional information that may be used to validate the ingredient.
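- The ranking in the "Describe the sky" example can be turned into winning/losing comparisons as sketched below. The pairwise logistic loss shown is a common choice in RLHF reward modeling and is an assumption here; the description does not specify a loss function, and the function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Tuple


def ranked_to_pairs(prompt: str, ranked_responses: List[str]) -> List[Tuple[str, str, str]]:
    """Expand a best-to-worst ranking into (prompt, winner, loser) comparisons."""
    return [(prompt, winner, loser) for winner, loser in combinations(ranked_responses, 2)]


def pairwise_loss(winning_reward: float, losing_reward: float) -> float:
    """-log(sigmoid(r_w - r_l)): small when the winner is scored well above the loser."""
    return math.log1p(math.exp(-(winning_reward - losing_reward)))


# Responses 724B, 724A, 724C in the labeler's order of preference.
ranking = [
    "the sky includes the atmosphere and may be considered a place between the ground and outer space",
    "the sky is above",
    "the sky is heavenly",
]
pairs = ranked_to_pairs("Describe the sky.", ranking)
```

- Training a reward model on such comparisons would push its scalar reward higher for the winning response in each pair than for the losing one.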
- although the reward model 720 may provide the scalar reward 725 as an output, the reward model 720 may not generate a response (e.g., text). Rather, the scalar reward 725 may be used by a version of the SFT ML model 715 to generate more accurate responses to prompts, i.e., the SFT model 715 may generate the response such as text to the prompt, and the reward model 720 may receive the response to generate a scalar reward 725 of how well humans perceive it. Reinforcement learning may optimize the SFT model 715 with respect to the reward model 720 which may realize the configured ML chatbot model 750.
- the validation computing device 102 may train the ML chatbot model 750 to generate a response 734 to a random, new and/or previously unknown user prompt 732.
- the ML chatbot model 750 may use a policy 735 (e.g., algorithm) which it learns during training of the reward model 720, and in doing so may advance from the SFT model 715 to the ML chatbot model 750.
- the policy 735 may represent a strategy that the ML chatbot model 750 learns to maximize its reward 725.
- a human labeler may continuously provide feedback to assist in determining how well the ML chatbot’s 750 responses match expected responses to determine rewards 725.
- the rewards 725 may feed back into the ML chatbot model 750 to evolve the policy 735.
- the policy 735 may adjust the parameters of the ML chatbot model 750 based upon the rewards 725 it receives for generating good responses.
- the policy 735 may update as the ML chatbot model 750 provides responses 734 to additional prompts 732.
- the response 734 of the ML chatbot model 750 using the policy 735 based upon the reward 725 may be compared 738 to the SFT ML model 715 (which may not use a policy) response 736 of the same prompt 732.
- the validation computing device 102 may compute a penalty 740 based upon the comparison 738 of the responses 734, 736.
- the penalty 740 may reduce the distance between the responses 734, 736, i.e., a statistical distance measuring how one probability distribution is different from a second, in one aspect the response 734 of the ML chatbot model 750 versus the response 736 of the SFT model 715.
- without the penalty 740, the ML chatbot model 750 optimizations may result in generating responses 734 which are unreasonable but may still result in the reward model 720 outputting a high reward 725.
- the responses 734 of the ML chatbot model 750 using the current policy 735 may be passed, at block 706, to the reward model 720, which may return the scalar reward 725.
- the ML chatbot model 750 response 734 may be compared 738 to the SFT ML model 715 response 736 to compute the penalty 740.
- a final reward 742 may be generated which may include the scalar reward 725 offset and/or restricted by the penalty 740.
- the final reward 742 may be provided to the ML chatbot model 750 and may update the policy 735, which in turn may improve the functionality of the ML chatbot model 750.
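- A minimal numerical sketch of the final reward follows. It assumes the statistical distance is a Kullback-Leibler divergence between the two models' output distributions and that the penalty is subtracted after scaling by a coefficient `beta`; both are common RLHF choices but are assumptions here, as the description only says the scalar reward is offset and/or restricted by the penalty.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same outcomes.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def final_reward(scalar_reward: float,
                 policy_probs: Sequence[float],
                 sft_probs: Sequence[float],
                 beta: float = 0.1) -> float:
    """Offset the reward model's score by a penalty for drifting from the SFT model."""
    penalty = beta * kl_divergence(policy_probs, sft_probs)
    return scalar_reward - penalty
```

- With identical distributions the penalty is zero and the final reward equals the scalar reward; the further the policy's outputs move from the SFT model's, the more the reward is reduced.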
- RLHF via the human labeler feedback may continue ranking 726 responses of the ML chatbot model 750 versus outputs of earlier/other versions of the SFT ML model 715, i.e., providing positive or negative rewards 725.
- the RLHF may allow the validation computing device 102 to continue iteratively updating the reward model 720 and/or the policy 735.
- the ML chatbot model 750 may be retrained and/or fine-tuned based upon the human feedback via the RLHF process, and throughout continuing conversations may become increasingly efficient.
- although each of the three steps of the overall ML chatbot model 750 training may be provided by a separate server, fewer and/or additional servers may be utilized and/or may provide the one or more steps of the chatbot training. In some embodiments, one server may provide the entire ML chatbot model 750 training.
- routines, subroutines, applications, or instructions may constitute either software (code embodied on a non-transitory, tangible machine-readable medium) or hardware.
- routines, etc. are tangible units capable of performing certain operations and may be configured or arranged in a certain manner.
- one or more computer systems (e.g., a standalone, client or server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) to perform certain operations.
- a hardware module may be implemented mechanically or electronically.
- a hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations.
- a hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
- the term “hardware module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein.
- in embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time.
- for example, where the hardware modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware modules at different times.
- Software may accordingly configure a processor, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.
- Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple of such hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
- processors may constitute processor-implemented modules that operate to perform one or more operations or functions.
- the modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
- the methods or routines described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented hardware modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of geographic locations.
Abstract
The following relates generally to validating ingredients in life sciences systems. In some examples, the validation may be done by validating ingredient information from a standardized life sciences document. The validation may be done based on regulatory information from a government entity, and/or compliance information from a company. The regulatory information and/or compliance information may be updated in real time.
Description
VALIDATION AND MAINTENANCE OF ELECTRONIC LIFE SCIENCE INDUSTRIES DATA
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to all of: U.S. Provisional Application No. 63/461,707, entitled “File Creation Using Artificial Intelligence (AI), Natural Language Processing (NLP), And Other Techniques For Product, Quality And Regulatory Information The In Life Science Industries” (filed April 25, 2023); U.S. Provisional Application No. 63/461,689, entitled “Validation And Maintenance Of Electronic Pharmaceutical Data” (filed April 25, 2023); U.S. Provisional Application No. 63/461,698, entitled “Confidential Disclosures of Pharmaceutical Data” (filed April 25, 2023); U.S. Provisional Application No. 63/449,682, entitled “File Creation Using Artificial Intelligence (AI), Natural Language Processing (NLP), And Other Techniques” (filed March 3, 2023); U.S. Provisional Application No. 63/449,719, entitled “Validation And Maintenance Of Electronic Pharmaceutical Data” (filed March 3, 2023); and U.S. Provisional Application No. 63/449,704, entitled “Confidential Disclosures of Pharmaceutical Data” (filed March 3, 2023), each of which is incorporated by reference herein in its entirety.
BACKGROUND
[0002] In the pharmaceutical, biopharmaceutical, human & animal nutrition, and aroma ingredient manufacturing and supplying industries, current systems for determining compliance with governmental regulations and/or company rules are cumbersome, inefficient, and/or inaccurate.
[0003] The systems and methods disclosed herein provide solutions to this problem and others.
SUMMARY
[0004] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
[0005] In one aspect, the following discloses a system that validates and/or identifies ingredients. The system may start with a single document (or file), such as a document in a standardized format, such as a standardized life sciences document (e.g., a standardized pharmaceutical, biopharmaceutical, nutritional or aroma-ingredient document), thus greatly simplifying the quality and regulation process. The validation may be done according to the regulations/laws/standards/rules of different countries and/or companies.
[0006] In one example, a computer-implemented method for validating ingredients may be provided. The method may include: (1) receiving, via one or more processors, a document including ingredient information; (2) receiving, via the one or more processors, validation information comprising (i) regulatory information from a government entity, and/or (ii) compliance information from a company; and (3) validating, via the one or more processors, an ingredient indicated by the ingredient information based on the validation information.
[0007] In another example, a computer system for validating ingredients may be provided. The computer system may include one or more processors configured to: (1) receive a document including ingredient information; (2) receive validation information comprising (i) regulatory information from a government entity, and/or (ii) compliance information from a company or third-party organization; and (3) validate an ingredient indicated by the ingredient information based on the validation information.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] Figure 1 illustrates an example system for validating ingredients.
[0009] Figure 2 illustrates an example method for validating ingredients.
[0010] Figure 3 illustrates an example standardized document with a document category of declaration.
[0011] Figure 4 illustrates an example standardized document with a document category of certificate, and a subcategory of certificate of analysis.
[0012] Figure 5 illustrates an example standardized document with a document category of specification.
[0013] Figure 6 illustrates an example method for validating ingredients, including forwarding validated ingredient information.
[0014] Figure 7 depicts a combined block and logic diagram for training an ML chatbot model, in which the techniques described herein may be implemented, according to some embodiments.
[0015] Advantages will become more apparent to those skilled in the art from the following description of the preferred embodiments which have been shown and described by way of illustration. As will be realized, the present embodiments may be capable of other and different embodiments, and their details are capable of modification in various respects. Accordingly, the drawings and description are to be regarded as illustrative in nature and not as restrictive.
DETAILED DESCRIPTION
[0016] The present embodiments relate to, inter alia, validating ingredients.
Example system
[0017] Figure 1 depicts an exemplary computing environment 100 in which the techniques disclosed herein may be implemented, according to some aspects. The computing environment 100 may include a validation computing device 102, which, in some aspects, may implement the techniques described herein. For example, the validation computing device 102 may validate an ingredient.
[0018] The validation computing device 102 may include one or more processors 120, such as one or more microprocessors, controllers, and/or any other suitable type of processor. The validation computing device 102 may further include a memory 122 (e.g., volatile memory, non-volatile memory) accessible by the one or more processors 120 (e.g., via a memory controller). The one or more processors 120 may interact with the memory 122 to obtain and execute, for example, computer-readable instructions stored in the memory 122. Additionally or alternatively, computer-readable instructions may be stored on one or more removable media (e.g., a compact disc, a digital versatile disc, removable flash memory, etc.) that may be coupled to the validation computing device 102 to provide access to the computer-readable instructions stored thereon. In particular, the computer-readable instructions stored on the memory 122 may include applications, such as validator 124.
[0019] The validator 124 may include instructions for, inter alia, validating ingredients. The ingredients may be any kind of pharmaceutical, biopharmaceutical, nutritional, or aroma ingredient of a product. Examples of the ingredients include active pharmaceutical ingredients, inactive pharmaceutical ingredients, carbohydrates, sugars, celluloses, starches, sugar alcohols, artificial sweeteners, microcrystalline cellulose, cellulose ethers, hydroxypropyl methylcellulose, cellulose esters, carboxymethyl cellulose, croscarmellose, glycerine, propylene glycol, polyethylene glycol, polyoxyethylene, polyoxypropylene, poloxamers, povidone, crospovidone, copovidone, petrolatum, mineral oil, acrylic cyclodextrin, beta-cyclodextrin, hydroxypropyl beta-cyclodextrin, lactose monohydrate, anhydrous lactose, polyethylene glycol-polyvinyl alcohol graft copolymer, methacrylic acid-ethyl acrylate copolymer, methyl methacrylate-diethylaminoethyl methacrylate copolymer, polyvinyl alcohol, polyvinyl acetate, saturated fatty acids, stearic acid, unsaturated fatty acids, oleic acid, fatty alcohols, stearyl alcohol, cetyl alcohol, long chain triglycerides, medium chain triglycerides, short chain triglycerides, polyvinylpyrrolidone-vinyl acetate copolymer, ethoxylated fatty alcohols, polyethylene glycol 15 hydroxystearate, sodium lauryl sulfate, polysorbates, triacetin, 2-pyrrolidone, castor oil, corn starch, sulfobutyl ether beta-cyclodextrin sodium, metal stearates, magnesium stearate, sodium stearyl fumarate, lanolin, beeswax, carnauba wax, petroleum jelly, paraffin wax, microcrystalline wax, phospholipids, sesame oil, corn oil, palm oil, palm kernel oil, soybean oil, gelatin, pectin, albumin, citric acid, lactic acid, polysaccharide gum, shellac, calcium phosphate, calcium carbonate, silica, sodium chloride, titanium dioxide, zinc oxide, polyvinyl caprolactam-polyvinyl acetate-polyethylene glycol graft copolymer, polymers, graft copolymers with a PEG backbone, and water. Moreover, it should be appreciated that the ingredients may include combinations of these examples (e.g., obtained by physical mixing and/or co-processing, etc.).
[0020] In some embodiments, an ingredient may be selected from a group of ingredient types. Examples of such groups include: active pharmaceutical ingredients, inactive pharmaceutical ingredients, fillers, diluents, suspension agents, coatings, binders, flavoring agents, colorants, lubricants, glidants, preservatives, sweeteners, emollients, consistency factors, viscosity agents, solubilizers, solvents, disintegrants, matrix-formers, thickeners, vehicles, metal oxides, emulsifiers, surfactants, oleochemicals, lipids, waxes, fats, fatty acids, fatty alcohols, and penetration enhancers. In some examples, these can be naturally derived (animal or plant), vegetable oils, mineral-derived, or synthetic.
[0021] The internal database 118 may hold any suitable information. For example, the internal database 118 may hold: standardized or unstandardized life sciences documents (e.g., pharmaceutical, biopharmaceutical, nutritional, or aroma-ingredient documents); pharmaceutical or biopharmaceutical product information; nutritional product information; aroma-ingredient information; regulatory information (e.g., information based on laws of a particular jurisdiction, etc.); compliance information (e.g., information from a company, such as a manufacturer of the product including the ingredient); information of companies (e.g., a company corresponding to the laboratory computing device 140, corresponding to the manufacturer computing device 150, etc.); ingredient information; etc.
[0022] As mentioned above, the regulatory information may be based on laws of a particular jurisdiction. Examples of the regulatory information include maximum amounts of ingredients in a product, information of ingredients not allowed to be mixed with each other, information of manufacturing practices, information of time periods that the ingredients are required to be current with and/or certified for, etc.
[0023] Examples of the compliance information include maximum amounts of ingredients in a product, information of ingredients not allowed to be mixed with each other, information of manufacturing practices, etc.
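As a minimal sketch of how regulatory and compliance information of the kinds listed above might be applied, the following Python example checks ingredient amounts against maximum limits and against combinations that are not allowed. The names `ValidationInfo` and `validate_ingredients`, the percentage unit, and all limit values are hypothetical illustrations; they are not taken from this disclosure or from any actual regulation.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationInfo:
    """Hypothetical container for one set of regulatory or compliance rules."""
    source: str  # illustrative label, e.g. "government" or "company"
    max_amounts: dict[str, float] = field(default_factory=dict)  # ingredient -> max % (assumed unit)
    forbidden_pairs: set[frozenset[str]] = field(default_factory=set)


def validate_ingredients(amounts: dict[str, float],
                         rules: list[ValidationInfo]) -> list[str]:
    """Return human-readable violations; an empty list means no rule was broken."""
    violations = []
    for rule in rules:
        # Maximum-amount check for every ingredient that has a limit.
        for name, amount in amounts.items():
            limit = rule.max_amounts.get(name)
            if limit is not None and amount > limit:
                violations.append(
                    f"{rule.source}: {name} at {amount} exceeds maximum {limit}")
        # Combination check: both members of a forbidden pair are present.
        for pair in rule.forbidden_pairs:
            if pair.issubset(amounts):
                violations.append(
                    f"{rule.source}: {' and '.join(sorted(pair))} may not be combined")
    return violations


# Illustrative data only; the limit and the pair are made up.
example_rules = [
    ValidationInfo(
        source="government",
        max_amounts={"titanium dioxide": 1.0},
        forbidden_pairs={frozenset({"ingredient A", "ingredient B"})},
    )
]
```

Keeping government and company rules in the same structure lets one function apply both, which mirrors the "and/or" relationship between regulatory and compliance information described above.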
[0024] The external database 180 may also hold any suitable information. For example, the external database 180 may hold: standardized or unstandardized life sciences documents; nutritional product information; aroma-ingredient information; pharmaceutical or biopharmaceutical product information; regulatory information (e.g., information based on laws of a particular jurisdiction, etc.); compliance information (e.g., information from a company, such as a manufacturer of the product including the ingredient); information of companies (e.g., a company corresponding to the laboratory computing device 140, corresponding to the manufacturer computing device 150, etc.); ingredient information; etc.
[0025] The exemplary computing environment 100 may further include laboratory computing device 140, which may include one or more processors 141 such as one or more microprocessors, controllers, and/or any other suitable type of processor. The laboratory computing device 140 may correspond to a laboratory that tests ingredients, and/or proposes ingredients for products. Additionally or alternatively, the laboratory computing device 140 may correspond to a laboratory that is inspecting a product (e.g., inspecting the product to issue a certificate for a standardized document with a category of certificate).
[0026] The exemplary computing environment 100 may further include manufacturer computing device 150, which may include one or more processors 151 such as one or more microprocessors, controllers, and/or any other suitable type of processor. The manufacturer computing device 150 may correspond to a manufacturer that manufactures ingredients.
[0027] The exemplary computing environment 100 may further include government computing device 160, which may include one or more processors 161 such as one or more microprocessors, controllers, and/or any other suitable type of processor. In some embodiments, the government computing device 160 corresponds to a government entity or other regulator.
[0028] The exemplary computing environment 100 may further include administrator computing device 170, which may include one or more processors 171 such as one or more microprocessors, controllers, and/or any other suitable type of processor.
[0029] Any of the components in the exemplary computing environment 100 may communicate via the network 104 as illustrated. The network 104 may be a single communication network, or may include multiple communication networks of one or more types (e.g., one or more wired and/or wireless local area networks (LANs), and/or one or more wired and/or wireless wide area networks (WANs), such as the Internet).
[0030] Moreover, although the example of Figure 1 illustrates only one of each of many of the components, such as the validation computing device 102, internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc., any number of each of the components illustrated in Figure 1 may be included in a system (e.g., multiple validation computing devices 102, internal databases 118, external databases 180, laboratory computing devices 140, manufacturer computing devices 150, government computing devices 160, administrator computing devices 170, etc.).
Example methods
[0031] Figure 2 illustrates an example method 200 for validating ingredients. In some embodiments, the blocks of the example method 200 may be performed by the one or more processors 120. However, although the example description below refers to blocks of the method as performed by the one or more processors 120, it should be understood that any of the blocks may be performed by any suitable component (e.g., the one or more processors 141, the one or more processors 151, the one or more processors 161, the one or more processors 171, etc.).
[0032] The example method 200 begins at block 210 when the one or more processors 120 receive a document (or file; although this discussion refers to receiving a document, this should not be construed as limiting, and it should be understood that the discussion applies equally for receiving a file rather than a document) including ingredient information. The document may be received from any suitable source. For example, the document may be received from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.
[0033] In some embodiments, the document is a standardized life sciences document. The standardized document may include a plurality of standardized document elements, and the standardized document elements of the plurality of standardized document elements may include respective: element names; element types comprising (i) selection start, (ii) field, or (iii) selection end; field types; a mandatory or optional marking; and/or element values. In some examples, the element names and/or element values indicate names of ingredients.
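One way to picture the standardized document elements described above is as a small record type. The following Python sketch is illustrative only: the class names `ElementType` and `DocumentElement`, the helper `missing_mandatory`, and the field-type labels are assumptions, and the disclosure does not prescribe any particular encoding.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ElementType(Enum):
    """The three element types named in the description."""
    SELECTION_START = "selection start"
    FIELD = "field"
    SELECTION_END = "selection end"


@dataclass
class DocumentElement:
    """Hypothetical in-memory form of one standardized document element."""
    name: str
    element_type: ElementType
    field_type: Optional[str] = None  # e.g. "text" or "number" (assumed labels)
    mandatory: bool = False
    value: Optional[str] = None


def missing_mandatory(elements: list[DocumentElement]) -> list[str]:
    """Return the names of mandatory field elements that carry no value."""
    return [
        e.name
        for e in elements
        if e.element_type is ElementType.FIELD and e.mandatory and not e.value
    ]


# Illustrative document fragment; the element names and values are examples only.
example_elements = [
    DocumentElement("Declaration", ElementType.SELECTION_START),
    DocumentElement("Product ID", ElementType.FIELD, "text", True, "Product XYZ"),
    DocumentElement("Company Email", ElementType.FIELD, "text", True, None),
    DocumentElement("Declaration", ElementType.SELECTION_END),
]
```

Because every element carries its own name and type, a validator can look up a value by name instead of searching free text for it.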
[0034] As will be described elsewhere herein, the standardized document may have a category (e.g., specification, declaration, or certificate), and a subcategory. Advantageously, receiving the document as a standardized document (rather than an unstandardized document, such as an unstandardized email, PDF, PowerPoint file, etc.) streamlines the validation process. For example, the system is able to determine validity more quickly and more accurately because the system “knows” exactly where to pull the information from rather than having to search for the information (e.g., using an NLP algorithm, etc.) or guess where the information is; that is, the systems described herein reduce the likelihood of erroneous acquisitions of information that would come from an unstandardized document. Further advantageously, a standardized document (e.g., with a category of specification, declaration, or certificate) may be created from a plurality of unstandardized documents (e.g., emails, PDF files, etc.) sent by a company; thus, for validation, the one or more processors 120 will have to review fewer documents than a system that reviews unstandardized documents.
[0035] In some embodiments, the ingredient information (received at block 210 via the document) comprises ingredient type information, and the standardized life sciences document includes a document category and a document subcategory. In some examples, the document subcategory indicates the ingredient type information. For example, the standardized document may have a subcategory corresponding to: active pharmaceutical ingredients, excipients, fillers, diluents, suspension agents, coatings, binders, flavoring agents, colorants, lubricants, glidants, preservatives, sweeteners, emollients, consistency factors, viscosity agents, solubilizers, solvents, disintegrants, matrix-formers, thickeners, vehicles, metal oxides, emulsifiers, surfactants, oleochemicals, lipids, waxes, fats, fatty acids, fatty alcohols, and penetration enhancers. The ingredient information may also include amounts of ingredient(s) in a product, concentrations of ingredient(s) in a product, and/or properties of the ingredient (e.g., chemical properties, acidity levels, etc.).
[0036] In some embodiments, the document (received at block 210) comprises a standardized life sciences document having a document category of declaration (for example, a declaration about an ingredient that is required by applicable law) and a subcategory of: aflatoxins, allergens, Genetically Modified Organisms (GMO), Good Manufacturing Practice (GMP), Manufacturing Procedure, melamine, nitrosamine, and Transmissible Spongiform Encephalopathy (TSE) / Bovine Spongiform Encephalopathy (BSE). Aflatoxin declarations may include information on the absence or present levels of specific undesirable substances produced by categorized molds. Allergen declarations may include information on the occurrence of specified allergens as referred to, for example, in European and US regulatory provisions. GMO declarations may include information on the genetically modified organism status as described by the manufacturing party. GMP declarations may include information on the status of Good Manufacturing Practice as described by the manufacturing party. Manufacturing Procedure declarations may include basic details of an ingredient’s manufacturing process. Melamine declarations may include information on the disuse of melamine in the manufacturing process. Nitrosamine declarations may include information on a risk evaluation for nitrosamines following international industry associations and authorities. TSE and BSE declarations may include information on the disuse of BSE/TSE-relevant material of animal or human origin in the production process.
Further examples of subcategories of declaration documents include: GMP Compliance, Manufacturing Flow Chart, Site Information, Quality Summary, Regulatory Summary, Technical Information, Stability Report, Specification, Aflatoxin Statement, Allergens Statement, BSE/TSE Statement, Contaminant Statement, Dioxin Statement, GMO Statement, Halal statement or certificate, FDA inspection letter or statement, melamine statement, Gluten statement, Natural Latex statement, Nano Statement, Microplastic Statement, Microbial Information Note or Statement, Nitrosamine Risk Assessment, REACH Statement, Phthalate Statement, Prop 65 Statement, Residual Solvents Statement, Pharmacopeial compliance statement or note, Product Carbon Footprint Statement or Document, Social and/or Environmental Statements, Dossier, Technical Packet, and/or drug master file (DMF).
[0037] Furthermore, some of the standardized document elements apply to each subcategory. For example, there may be elements with element names indicating: an ingredient name, file information (e.g., information of the standardized document, including a version of the standard), an email address (e.g., an email address for questions related to the file information and format, or an email address for questions related to the data content), a document generation date, a document generation time, a company disclaimer, an electronic signature, etc.
[0038] However, some of the standardized document elements may correspond to specific subcategories. For example, for the allergen subcategory, there may be standardized document element names indicating, inter alia, allergen name (or identifier), allergen text, allergen footnote, etc.
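The split between elements common to every subcategory and elements specific to one subcategory can be sketched as a simple lookup. In the Python example below, the element names come from the two preceding paragraphs, while the constant names, the function `expected_elements`, and the subcategory key `"allergens"` are hypothetical.

```python
# Element names that, per the description, may apply to every subcategory.
COMMON_ELEMENTS = (
    "ingredient name",
    "file information",
    "email address",
    "document generation date",
    "document generation time",
    "company disclaimer",
    "electronic signature",
)

# Element names tied to a specific subcategory; only the allergen
# example from the description is filled in here.
SUBCATEGORY_ELEMENTS = {
    "allergens": ("allergen name", "allergen text", "allergen footnote"),
}


def expected_elements(subcategory: str) -> tuple[str, ...]:
    """Return common element names followed by any subcategory-specific ones."""
    return COMMON_ELEMENTS + SUBCATEGORY_ELEMENTS.get(subcategory, ())
```

A subcategory without specific elements in the table simply falls back to the common set.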
[0039] Figure 3 illustrates a working example of a standardized document 300 with a document category of a declaration. In this example, the line 310 indicates the product ID (e.g., name of the product, product identification number, etc.). Put another way, the line 310 shows an element with an element name of product ID and an element value of “Product XYZ.”
[0040] The example standardized document 300 further includes line 320, which shows the declaration(s) being made (e.g., shows an element with an element name of declaration description; and an element value of a description of the declaration being made).
[0041] In the example of Figure 3, the declarations 320 include declaration topic 321A; declaration body 321B; declaration data 321C; declaration reference 321D; declaration comment 321E; and declaration footnote 321F.
[0042] Line 330 shows an element with an element name of company name, and an element value of “ABC Corp.” Line 340 shows an element with an element name of Company Address, and an element value of “123 Main St.; Montpelier, VT 05601.” Line 350 shows an element with an element name of Company Email, and an element value of “Jane.Doe@ABCCorp.com.” It should be appreciated that any or all of the information from lines 310, 320, 330, 340, 350 may be taken by the one or more processors 120 from unstandardized document(s). Additionally or alternatively, the information from lines 310, 320, 330, 340, 350 may be taken from other standardized documents.
[0043] Line 360 shows an element with an element name of electronic signature, which has been signed by the company representative, Jane Doe. In some examples, to electronically sign the document, the one or more processors 120 may send and/or present a link (e.g., to the laboratory computing device 140, the manufacturer computing device 150, the government computing device 160, the administrator computing device 170, the validation computing device 102, etc.) so that the standardized document 300 may be electronically signed. Additionally or alternatively, the standardized document 300 may be signed by any other suitable technique.
[0044] In some embodiments, the document (received at block 210) comprises a standardized life sciences document having a document category of a certificate (for example, a certificate about an ingredient that is required by applicable law) and a subcategory of: a certificate of analysis; a third-party certification of a good manufacturing practice; a third-party certification of quality management; or a third-party certification of Kosher, Halal, or RSPO (Roundtable on Sustainable Palm Oil) compliance. The validating may include validating the ingredient based on the document category and/or subcategory. In some examples, the certification category and/or corresponding subcategories may be used subsequently in the validation process (e.g., at block 230).
[0045] Figure 4 illustrates another working example of a standardized document 400. The example standardized document 400 has a document category of a certificate, and a subcategory of certificate of analysis. In this example, the line 410 indicates the product ID (e.g., name of the product, product identification number, etc.). Put another way, the line 410 shows an element with an element name of product ID and an element value of “Product XYZ.”
[0046] The example certificate of analysis 400 includes line 420, which shows the certification(s) being made (e.g., shows an element with an element name of certification description, and an element value of a description of the certification being made).
[0047] In some examples, the certifications include parameters, such as parameters 421A, 421B, 421C, 421D, 421E, 421F, 422A, 422B, 422C, 422D, 422E, 422F. For instance, the example certifications 420 include: Tested Parameter 1 - Name [text]: Water; Tested Parameter 1 - Description [text]: Humidity; Tested Parameter 1 - Dimension [Unit of measure]: g/100g; Tested Parameter 1 - Numeric Value [number]: 4.2; Tested Parameter 1 - Method of Analysis [text]: Ph.Eur. 123-XYZ-1998; Tested Parameter 1 - Comment [text]: fresh sample; Tested Parameter 2 - Name [text]: IR; Tested Parameter 2 - Description [text]: Identification; Tested Parameter 2 - Analytical test [Type of measurement]: qualitative; Tested Parameter 2 - Descriptive Value [text]: complies; Tested Parameter 2 - Method of Analysis [text]: USP (current version); and Tested Parameter 2 - Comment [text]: test A.
[0048] Line 430 shows an element with an element name of certifier, and an element value of “Bob Smith.” Line 440 shows an element with an element name of Certifying Company, and an element value of “ZZZ Inspection Corp.” Line 450 shows an element with an element name of Certificate Number, and an element value of “12345.” Line 460 shows an element with an element name of Date of First Certification, and an element value of “1/1/2024.” It should be appreciated that any or all of the information from lines 410, 420, 430, 440, 450 may be taken by the one or more processors 120 from unstandardized document(s). Additionally or alternatively, the information from lines 410, 420, 430, 440, 450 may be taken from other standardized documents (e.g., document(s) sent by Bob Smith of ZZZ Inspection Corp., etc.).
[0049] Line 470 shows an element with an element name of electronic signature, which has been signed by the inspection company representative, Bob Smith. In some examples, to electronically sign the document, the one or more processors 120 may send a link to Bob Smith (e.g., send the link to the laboratory computing device 140, etc.) so that the standardized document 400 may be electronically signed. Additionally or alternatively, the standardized document 400 may be signed by any other suitable technique.
[0050] In addition, although the above has noted examples of the subcategories of certificate as being certificates of analysis, third-party certifications of a good manufacturing practice, third-party certifications of quality management, and third-party certifications of Kosher compliance, other subcategories are possible as well. For instance, other example subcategories of certificates include: aflatoxins; allergens; genetically modified organisms (GMO); good manufacturing practice (GMP); manufacturing procedure; melamine; nitrosamine; and Transmissible Spongiform Encephalopathy (TSE) or Bovine Spongiform Encephalopathy (BSE). For example, in a certificate with a subcategory of aflatoxin, a third party may inspect and/or certify that a particular toxin is not present in the product, or present below a certain threshold amount. In another example, in a certificate with a subcategory of allergen, a third party may inspect and/or certify that a particular allergen is not present in the product, or present below a certain threshold amount. In yet another example, in a certificate with a subcategory of GMP, a third party may inspect and/or certify that the company is adhering to a GMP. Analogous inspections and/or certifications may be made for the other listed example subcategories.
[0051] In some embodiments, the document (received at block 210) comprises a standardized life sciences document having a document category of specification and a subcategory of product specification. Product specifications may include information on criteria, their dimensions and guaranteed levels or statements for a defined product. In some examples, for the product specification subcategory, the standardized document may have document elements with names indicating: specification data, specification parameters, test method, description, reference standard, etc.
[0052] Figure 5 illustrates one working example of a standardized document 500. The example standardized document 500 has a document category of a specification. In this example, the line 510 indicates the product ID (e.g., name of the product, product identification number, etc.). Put another way, the line 510 shows an element with an element name of product ID and an element value of “Product XYZ.”
[0053] Line 520 shows an element with an element name of specification data, and an element value to be filled in with information from standardized or unstandardized documents in the creation of standardized document 500.
[0054] The specification data, in some examples, includes specification parameters, such as specification parameters 521A, 521B, 521C, 521D, 521E, 521F, 522A, 522B, 522C, 522D, 522E, 522F. For instance, the example specification data 520 includes: Specification Parameter 1 - Name [text]: Water; Specification Parameter 1 - Description [text]: Humidity; Specification Parameter 1 - Dimension [Unit of measure]: w% (g/100g); Specification Parameter 1 - Numeric Value [number]: NMT 5; Specification Parameter 1 - Method of Analysis [text]: Ph.Eur.; Specification Parameter 1 - Comment [text]: Karl-Fischer-Titration; Specification Parameter 2 - Name [text]: IR; Specification Parameter 2 - Description [text]: Identification; Specification Parameter 2 - Analytical test [Type of measurement]: qualitative; Specification Parameter 2 - Descriptive Value [text]: must comply; Specification Parameter 2 - Method of Analysis [text]: USP; and Specification Parameter 2 - Comment [text]: test A.
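By way of illustration only, the two specification parameters listed above could be held as structured records. The following is a minimal sketch; the class name `SpecificationParameter` and its field names are hypothetical, chosen to mirror the element labels of the example, and no particular data format is prescribed by this description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpecificationParameter:
    """One specification parameter; field names are illustrative assumptions."""
    name: str
    description: str
    method_of_analysis: str
    comment: str
    dimension: Optional[str] = None          # unit of measure, if quantitative
    numeric_value: Optional[str] = None      # e.g. "NMT 5"
    analytical_test: Optional[str] = None    # e.g. "qualitative"
    descriptive_value: Optional[str] = None  # e.g. "must comply"


# The two parameters from the Figure 5 example, transcribed as records.
specification_data = [
    SpecificationParameter(
        name="Water", description="Humidity", dimension="w% (g/100g)",
        numeric_value="NMT 5", method_of_analysis="Ph.Eur.",
        comment="Karl-Fischer-Titration",
    ),
    SpecificationParameter(
        name="IR", description="Identification", analytical_test="qualitative",
        descriptive_value="must comply", method_of_analysis="USP",
        comment="test A",
    ),
]
```

Optional fields are used here because a quantitative parameter carries a dimension and numeric value, while a qualitative one carries an analytical test type and a descriptive value.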
[0055] Line 530 shows an element with an element name of company name, and an element value of “ABC Corp.” Line 540 shows an element with an element name of Company Address, and an element value of “123 Main St.; Montpelier, VT 05601.” Line 550 shows an element with an element name of Company Email, and an element value of Jane.Doe@ABCCorp.com. It should be appreciated that any or all of the information from lines 510, 520, 530, 540, 550 may be taken by the one or more processors 120 from the unstandardized document(s). Additionally or alternatively, the information from lines 510, 520, 530, 540, 550, 560 may be taken from other standardized document(s).
[0056] Line 560 shows an element with an element name of quality signature, which has been signed by the company representative, Jane Doe. In some examples, to electronically sign the document, the one or more processors 120 may send and/or present a link (e.g., to the laboratory computing device 140, the manufacturer computing device 150, the government computing device 160, the administrator computing device 170, the validation computing device 102, etc.) so that the standardized document 500 may be electronically signed. Additionally or alternatively, the standardized document 500 may be signed by any other suitable technique.
[0057] It should further be appreciated that, in some examples, standardized documents with a category of declaration are made and/or signed by the manufacturer of the ingredient (e.g., via the manufacturer computing device 150); whereas, standardized documents with a category of certificate are made and/or signed by a third party (e.g., an inspector; and/or signed via the laboratory computing device 140).
[0058] At block 220, the one or more processors 120 receive validation information. The validation information may be received from any suitable source. For example, the validation information may be received from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.
[0059] In some embodiments, the validation information comprises: (i) regulatory information from a government entity (e.g., a government corresponding to government computing device 160; in some examples, the government entity may be a government entity of a federal, state, or local government), and/or (ii) compliance information from a company (e.g., a company corresponding to the laboratory computing device 140, corresponding to the manufacturer computing device 150, etc.).
[0060] Validation information, in some examples, may be aggregated together. For example, the one or more processors 120 may: receive first validation information (e.g., from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.); then receive second validation information (e.g., from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.); and then aggregate the first validation information with the second validation information to thereby create the validation information to be used. In one such example, the first validation information may be received from the government entity and comprise regulatory information; and the second validation information may be received from the company and comprise compliance information.
[0061] The validation information may also be updated. For instance, in some implementations of the example of the preceding paragraph, subsequent to the aggregating, the one or more processors 120 may receive third validation information (e.g., from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.), and then update the validation information with the third validation information. For example, the one or more processors 120 may replace either the first validation information or the second validation information with the third validation information. For instance, the one or more processors 120 may replace a list of ingredients not to combine with a new list of ingredients not to combine. In another example, the one or more processors 120 may replace a previous maximum amount allowed of an ingredient in a product with a new maximum amount allowed of the ingredient in the product.
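The aggregation and update described above may be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary layout, the keys `regulatory`, `compliance`, `max_amount`, and `do_not_combine`, and the function names are all hypothetical and are not taken from this description.

```python
def aggregate(regulatory: dict, compliance: dict) -> dict:
    """Combine first (regulatory) and second (compliance) validation information."""
    return {"regulatory": dict(regulatory), "compliance": dict(compliance)}


def update(validation_info: dict, source: str, third: dict) -> dict:
    """Return a copy in which the entry for `source` is replaced by `third`."""
    updated = dict(validation_info)
    updated[source] = dict(third)
    return updated


# First and second validation information are aggregated ...
info = aggregate(
    {"max_amount": {"ingredient_x": 5.0}},
    {"do_not_combine": [("ingredient_x", "ingredient_y")]},
)
# ... and third validation information later replaces the regulatory part,
# here lowering a previous maximum amount to a new maximum amount.
info = update(info, "regulatory", {"max_amount": {"ingredient_x": 3.0}})
```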
[0062] Updates may also be requested by the one or more processors 120. In one illustrative such example, the one or more processors 120 may: determine a validity date of the document from the validation information, wherein the validity date is comprised in an element of the document, and wherein the validity date comprises: (i) a valid from date, or (ii) a valid to date; compare the validity date of the document to a current date to thereby determine that the document is not valid; in response to the determination that the document is not valid, request an additional document; in response to requesting the additional document, receive the additional document; and update the validation information based on the additional document. The additional document may be requested and/or received from any suitable source (e.g., from the internal database 118, external database 180, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.).
[0063] In some embodiments, if the validation computing device 102 determines that the document is out-of-date based on the validity date, the validation computing device 102 may request confirmation from the originator of the document that there have been no updates to the document. If the originator of the document confirms that there have been no updates to the document, the validation computing device 102 may extend the validity date of the document by a predetermined amount (e.g., extend the validity date by one year, two years, etc.).
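One possible way to express the validity-date comparison and the extension by a predetermined amount is sketched below. The function names and the handling of 29 February are assumptions made for illustration only.

```python
from datetime import date
from typing import Optional


def is_valid_on(today: date, valid_from: Optional[date],
                valid_to: Optional[date]) -> bool:
    """True if `today` falls inside whichever validity bounds are present."""
    if valid_from is not None and today < valid_from:
        return False
    if valid_to is not None and today > valid_to:
        return False
    return True


def extend_validity(valid_to: date, years: int = 1) -> date:
    """Extend a valid-to date by a predetermined number of years."""
    try:
        return valid_to.replace(year=valid_to.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to the 28th.
        return valid_to.replace(year=valid_to.year + years, day=28)
```

In such a sketch, a `False` result from `is_valid_on` would correspond to the determination that the document is not valid, after which an additional document or a confirmation from the originator may be requested.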
[0064] At block 230, the one or more processors 120 validate an ingredient indicated by the ingredient information (e.g., based on the validation information).
[0065] In one example of the validation, the one or more processors 120: determine a maximum amount of the ingredient based on the validation information (e.g., a maximum amount from either the regulatory information, and/or the compliance information); and compare the maximum amount of the ingredient to an amount of the ingredient from the ingredient information.
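A minimal sketch of this comparison follows. It assumes, purely for illustration, that when both a regulatory maximum and a compliance maximum are available the stricter of the two governs, and that the absence of any maximum means the amount check does not fail; neither choice is stated in this description.

```python
from typing import Optional


def within_maximum(amount: float, regulatory_max: Optional[float],
                   compliance_max: Optional[float]) -> bool:
    """Compare an ingredient amount against the strictest available maximum."""
    limits = [m for m in (regulatory_max, compliance_max) if m is not None]
    if not limits:
        return True  # assumption: no maximum on record, so nothing to exceed
    return amount <= min(limits)
```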
[0066] In another example of the validation, the one or more processors 120: determine a validity date of the document from the validation information; and compare the validity date to a current date. However, in some scenarios, the document does not include a validity date. The one or more processors 120 may handle this scenario in any of the following ways. The one or more processors 120 may request (e.g., from the laboratory computing device 140, the manufacturer computing device 150, the external database 180, the internal database 118, etc.) an additional document with a validity date. Additionally or alternatively, the one or more processors 120 may automatically determine the ingredient to be invalid. Additionally or alternatively, the one or more processors 120 may simply make the validation determination on other factors rather than making the validation based on a validity date (e.g., essentially waiving any requirement for a validity date).
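The three ways of handling a missing validity date may be sketched as a selectable policy. The enumeration `MissingDatePolicy`, the function name, and the string outcomes are hypothetical names introduced only for this illustration.

```python
from datetime import date
from enum import Enum
from typing import Optional


class MissingDatePolicy(Enum):
    REQUEST_DOCUMENT = "request_document"  # ask for a document with a date
    TREAT_AS_INVALID = "treat_as_invalid"  # automatically invalid
    WAIVE = "waive"                        # decide on other factors


def check_validity_date(valid_to: Optional[date], today: date,
                        policy: MissingDatePolicy) -> str:
    """Return 'pass', 'fail', or 'request_document' for the date check."""
    if valid_to is not None:
        return "pass" if today <= valid_to else "fail"
    if policy is MissingDatePolicy.REQUEST_DOCUMENT:
        return "request_document"
    if policy is MissingDatePolicy.TREAT_AS_INVALID:
        return "fail"
    return "pass"  # WAIVE: the date requirement is not applied
```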
[0067] In still other scenarios where the document does not have a validity date, the validation computing device 102 may periodically (e.g., once a year, once every three months, etc.) check in with the originator of the document to ensure that the document is current. If the document is not current, the validation computing device 102 may request an additional document.
[0068] In yet another example of the validation, the one or more processors 120 validate the ingredient based on an electronic signature of the document.
[0069] In yet another example of the validation, the one or more processors 120 validate the ingredient based on the standardized life sciences document comprising a category of certificate and/or its corresponding subcategory. In one such example, the standardized life sciences document comprises a category of certificate, and includes elements with element names, such as: certificate number; date of first certification; company name (e.g., name of company making the certification); electronic signature; etc.
[0070] In some embodiments, the one or more processors 120 may also identify a particular ingredient for validation. For example, the one or more processors 120 may determine properties (e.g., pharmaceutical, biopharmaceutical, nutritional, and/or aroma-ingredient properties) of the ingredient by analyzing text of the document; and identify the ingredient based on the determined properties. In another example, the one or more processors 120 determine two ingredients from the validation information, and then perform the validation by determining if the two ingredients may be combined based on the validation information.
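The determination of whether two ingredients may be combined could, for example, be a lookup against a list of ingredients not to combine. The sketch below assumes such a list is available as a set of unordered pairs; the names `may_combine` and `prohibited` and the placeholder ingredient names are illustrative.

```python
def may_combine(a: str, b: str, do_not_combine: set) -> bool:
    """True unless the unordered pair (a, b) appears on the prohibited list."""
    return frozenset((a, b)) not in do_not_combine


# Each entry is an unordered pair, so the order of the arguments does not matter.
prohibited = {frozenset(("ingredient_x", "ingredient_y"))}
```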
[0071] In response to the validation, in some examples, a certificate may be constructed certifying the validation. For example, in response to the validation, the one or more processors 120 may construct a standardized life sciences document having a document category of a certificate and a subcategory of: a certificate of analysis; a third-party certification of a good manufacturing practice; a third-party certification of quality management; or a third-party certification of Kosher compliance. The constructed certificate/standardized document may include an electronic signature. Additionally or alternatively, the constructed certificate/standardized document may include the ingredient information and/or validation information.
[0072] In some embodiments, the one or more processors 120 may also cause a display device to display an indication of whether the ingredient has been validated. The display device may be any display device, such as a display device of any of the laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.
[0073] Figure 6 illustrates an example method 600 for validating ingredients, including forwarding validated ingredient information. In some embodiments, the blocks of the example method 600 may be performed by the one or more processors 120. However, although the example description below refers to blocks of the method as performed by the one or more processors 120, it should be understood that any of the blocks may be performed by any suitable component (e.g., the one or more processors 141, the one or more processors 151, the one or more processors 161, the one or more processors 171, etc.).
[0074] The example method 600 begins at block 610 when the one or more processors 120 receive a document (e.g., similarly to block 210 of Figure 2). At block 620, the one or more processors 120 receive validation information (e.g., similarly to block 220 of Figure 2). At block 630, the one or more processors 120 may attempt to validate the ingredient information (e.g., as described elsewhere herein, for example, with respect to block 230 of Figure 2).
[0075] If the validation is not successful, at block 640, the one or more processors 120 may send an indication of the unsuccessful validation (e.g., to an entity that sent the document at block 210/610, such as the laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.). For example, the indication may cause a display (e.g., a display of any of the validation computing device 102, laboratory computing device 140, manufacturer computing device 150, government computing device 160, administrator computing device 170, etc.) to display information indicating that the validation was unsuccessful. The display may also display information indicating why the validation was not successful; for example, the display may display “ingredient XYZ was present in too high a level to comply with the laws of state ABC,” or “the system was not able to verify the electronic signature on the document.”
[0076] If the validation is successful, the one or more processors 120 determine if the document is a certificate or a declaration (e.g., a standardized document with a document category of a certificate or declaration) at block 650. If the document is a certificate or a declaration, at block 660, the one or more processors 120 may forward the certificate or declaration (e.g., to the government computing device 160, the laboratory computing device 140, the manufacturer computing device 150, the administrator computing device 170, etc.). However, in some embodiments, the one or more processors 120 may check the certificate or declaration based on a confidentiality rule and/or a privacy rule, and refrain from forwarding if a rule is violated.
[0077] If the document is not a certificate or declaration, the one or more processors 120 may construct, at block 670, a certificate or declaration (e.g., a standardized document with a category of certificate or declaration). Examples of documents that are not certificates or declarations include: a standardized document with a category of a specification, a document that an inspector has input into the system that indicates levels of ingredients, etc.
[0078] The constructed certificate or declaration may include the (validated) ingredient information from the document received at block 210/610.
[0079] In some examples, as part of the construction, the one or more processors 120 may facilitate the ingredient manufacturer electronically signing the declaration (e.g., by allowing the ingredient manufacturer to electronically sign the declaration via the manufacturer computing device 150).
[0080] In some examples, as part of the construction, the one or more processors 120 may facilitate a third party (e.g., an inspector) electronically signing the certificate (e.g., by allowing the third party to electronically sign the certificate via the laboratory computing device 140).
[0081] At block 680, the one or more processors 120 may forward the constructed certificate or declaration (e.g., to the government computing device 160, the manufacturer computing device 150, the laboratory computing device 140, the administrator computing device 170, etc.). However, in some embodiments, the one or more processors 120 may check the certificate or declaration based on a confidentiality rule and/or a privacy rule, and refrain from forwarding if a rule is violated.
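The branching of example method 600 from block 630 onward may be summarized in the following sketch. It is a simplification under stated assumptions: documents are plain dictionaries, the validation and the confidentiality/privacy check are supplied as hypothetical callables `validate` and `may_forward`, a certificate (rather than a declaration) is constructed at block 670, and the returned strings merely label the outcome.

```python
from typing import Callable


def method_600(document: dict, validate: Callable[[dict], bool],
               may_forward: Callable[[dict], bool]) -> str:
    """Walk blocks 630-680 for one document and report the outcome."""
    if not validate(document):                          # block 630
        return "unsuccessful_validation_sent"           # block 640
    if document.get("category") in ("certificate", "declaration"):  # block 650
        outgoing = document                             # block 660: forward as received
    else:
        # Block 670: construct a certificate carrying the validated ingredient information.
        outgoing = {
            "category": "certificate",
            "ingredient_information": document.get("ingredient_information"),
        }
    # Blocks 660/680: refrain from forwarding if a confidentiality or privacy rule is violated.
    return "forwarded" if may_forward(outgoing) else "withheld"
```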
[0082] Further regarding the example flowcharts provided above, it should be noted that not all blocks are necessarily required to be performed. Moreover, additional blocks may be performed although they are not specifically illustrated in the example flowcharts. Moreover, the exemplary signal diagrams and/or flowcharts are not mutually exclusive (e.g., block(s)/events from each example signal diagram and/or flowchart may be performed in any other signal diagram and/or flowchart). The flowcharts are illustrative, and not limiting.
Chatbot aspects including example training of a chatbot
[0083] Some embodiments leverage a chatbot to improve functionality. For example, the chatbot may converse with a party to obtain the necessary information to validate an ingredient. It should be appreciated that although the following discussion may refer to a machine learning (ML) chatbot or an ML model, the following discussion is equally applicable to any artificial intelligence (AI) and/or ML chatbot, voicebot, and/or model. Furthermore, examples of the chatbot may include a generative AI chatbot, a generative pre-trained transformer chatbot (ChatGPT), a large language model (LLM)-based chatbot, etc.
[0084] The chatbot may be trained by validation computing device 102 using large training datasets of text and/or data which may provide sophisticated capability for natural-language tasks, such as answering questions and/or holding conversations. The chatbot may include a general-purpose pretrained LLM which, when provided with a starting set of words (prompt) as an input, may attempt to provide an output (response) of the most likely set of words that follow from the input. In one aspect, the prompt may be provided to, and/or the response received from, the chatbot and/or any other ML model, via a user interface of the validation computing device 102. This may include a user interface device operably connected to the server via an I/O module. Exemplary user interface devices may include a touchscreen, a keyboard, a mouse, a microphone, a speaker, a display, and/or any other suitable user interface devices.
[0085] Multi-turn (i.e., back-and-forth) conversations may require LLMs to maintain context and coherence across multiple user utterances, which may require the chatbot to keep track of an entire conversation history as well as the current state of the conversation. The chatbot may rely on various techniques to engage in conversations with users, which may include the use of short-term and long-term memory. Short-term memory may temporarily store information (e.g., in the memory 122 of the validation computing device 102) that may be required for immediate use, and may be used to keep track of the current state of the conversation and/or to understand the user’s latest input in order to generate an appropriate response. Long-term memory may include persistent storage of information (e.g., the internal database 118 of the validation computing device 102) which may be accessed over an extended period of time. The long-term memory may be used by the chatbot to store information about the user (e.g., preferences, chat history, etc.) and may be useful for improving an overall user experience by enabling the chatbot to personalize and/or provide more informed responses.
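As a rough illustration of the short-term/long-term distinction, the sketch below keeps a bounded buffer of recent turns alongside a per-user history. The class `ConversationMemory`, its methods, and the use of an in-process dictionary as a stand-in for persistent storage are all assumptions made for this example.

```python
from collections import deque


class ConversationMemory:
    """Short-term turn buffer plus a long-term per-user store (illustrative)."""

    def __init__(self, max_turns: int = 4):
        self.short_term = deque(maxlen=max_turns)  # only the most recent turns
        self.long_term = {}                        # stand-in for a database

    def add_turn(self, user_id: str, speaker: str, text: str) -> None:
        """Record a turn in both the recent buffer and the user's full history."""
        self.short_term.append((speaker, text))
        self.long_term.setdefault(user_id, []).append((speaker, text))

    def prompt_context(self) -> str:
        """Join the recent turns into text to prepend to the next prompt."""
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.short_term)
```

Bounding the short-term buffer reflects that only a limited amount of recent context may be needed to interpret the latest input, while the full history remains available for personalization.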
[0086] In some embodiments, the systems and methods to generate and/or train an ML chatbot model, which may be used in the chatbot, may include three steps: (1) a supervised fine-tuning (SFT) step where a pretrained language model (e.g., an LLM) may be fine-tuned on a relatively small amount of demonstration data curated by human labelers to learn a supervised policy (SFT ML model) which may generate responses/outputs from a selected list of prompts/inputs. The SFT ML model may represent a cursory model for what may be later developed and/or configured as the ML chatbot model; (2) a reward model step where human labelers may rank numerous SFT ML model responses to evaluate the responses which best mimic preferred human responses, thereby generating comparison data. The reward model may be trained on the comparison data; and/or (3) a policy optimization step in which the reward model may further fine-tune and improve the SFT ML model. The outcome of this step may be the ML chatbot model using an optimized policy. In one aspect, step one may take place only once, while steps two and three may be iterated continuously, e.g., more comparison data is collected on the current ML chatbot model, which may be used to optimize/update the reward model and/or further optimize/update the policy.
Supervised Fine-Tuning ML Model
[0087] Figure 7 depicts a combined block and logic diagram 700 for training an ML chatbot model, in which the techniques described herein may be implemented, according to some embodiments. It should be understood that Figure 7 may apply to training any chatbot described herein. In addition, the chatbot may be trained in accordance with any of the other techniques described herein; and the training of chatbot should not be considered restricted to the teachings of Figure 7.
[0088] Some of the blocks in Figure 7 may represent hardware and/or software components, other blocks may represent data structures or memory storing these data structures, registers, or state variables (e.g., 712), and other blocks may represent output data (e.g., 725). Input and/or output signals may be represented by arrows labeled with corresponding signal names and/or other identifiers. The methods and systems may include one or more blocks 702, 704, 706, which will be described in further detail below.
[0089] In one aspect, at block 702, a pretrained language model 710 may be fine-tuned. The pretrained language model 710 may be obtained at block 702 and be stored in a memory, such as memory 122 and/or internal database 118. The pretrained language model 710 may be loaded into an ML training module at block 702 for retraining/fine-tuning. A supervised training dataset 712 may be used to fine-tune the pretrained language model 710 wherein each data input prompt to the pretrained language model 710 may have a known output response for the pretrained language model 710 to learn from. The supervised training dataset 712 may be stored in a memory at block 702, e.g., the memory 122 or the internal database 118. In one aspect, the data labelers may create the supervised training dataset 712 prompts and appropriate responses. The pretrained language model 710 may be fine-tuned using the supervised training dataset 712 resulting in the SFT ML model 715 which may provide appropriate responses to user prompts once trained. The trained SFT ML model 715 may be stored in a memory, such as the memory 122 or the internal database 118.
[0090] In some examples, the supervised training dataset 712 includes historical data (e.g., held by the validation computing device 102, etc.). The historical data may include, for example: (a) historical ingredient information, (b) historical validation information, (c) historical standardized documents, (d) historical communications from validation requestors, and/or (e) historical communications from validating parties. In some embodiments, the chatbot may be trained using the above (a)-(d) as input (e.g., also referred to as independent variables, or explanatory variables), and the above (e) used as the output (e.g., also referred to as a dependent variable, or response variable). Put another way, based upon the above (a)-(d), the chatbot may be trained to generate the above (e) (e.g., generate communications to send to validation requestors). The generated communications may be sent in the form of text message, email, or as part of a chat session (e.g., the chatbot generates multiple communications as part of a chat session/conversation), etc.
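One way the inputs (a)-(d) and the output (e) might be arranged into a supervised prompt/response example is sketched below. The record keys, the prompt template, and the function name `build_sft_example` are hypothetical; an actual supervised training dataset 712 may be organized quite differently.

```python
def build_sft_example(record: dict) -> dict:
    """Map one historical record to a prompt (inputs a-d) and a target (output e)."""
    prompt = "\n".join([
        f"Ingredient information: {record['ingredient_information']}",  # (a)
        f"Validation information: {record['validation_information']}",  # (b)
        f"Standardized document: {record['standardized_document']}",    # (c)
        f"Requestor message: {record['requestor_message']}",            # (d)
    ])
    return {"prompt": prompt, "response": record["validator_message"]}  # (e)
```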
[0091] Regarding (a) above, examples of the historical ingredient information include historical ingredient information corresponding to ingredient information discussed elsewhere herein. For example, the historical ingredient information may include historical: ingredient type information, amounts of ingredient(s) in a product, concentrations of ingredient(s) in a product, and/or properties of the ingredient (e.g., chemical properties, acidity levels, etc.), etc.
[0092] Regarding (b) above, the historical validation information may include information corresponding to the validation information discussed elsewhere herein (e.g., may include historical regulatory information from a government entity, historical compliance information from a company, etc.).
[0093] Regarding (c) above, examples of the historical standardized documents include historical standardized documents corresponding to the standardized documents discussed elsewhere herein.
[0094] Regarding (d) above, the historical communications from validation requestors (e.g., a party requesting validation of an ingredient, such as a user of laboratory computing device 140, a user of manufacturer computing device 150, etc.) may be taken, for example, from historical conversations (e.g., text conversations, audio conversations, etc.) between validation requestors and validating parties (e.g., a user of validation computing device 102, etc.). For example, the historical communications from validation requestors may be the statements made by the validation requestor during the conversation.
[0095] Regarding (e) above, historical communications from validating parties may be taken, for example, from historical conversations (e.g., text conversations, audio conversations, etc.) between validation requestors and validating parties. For example, the historical communications from validating parties may be the statements made by a validating party during the conversation.
Training The Reward Model
[0096] In one aspect, training the ML chatbot model 750 may include, at block 704, training a reward model 720 to provide as an output a scalar value/reward 725. The reward model 720 may be required to leverage reinforcement learning with human feedback (RLHF) in which a model (e.g., ML chatbot model 750) learns to produce outputs which maximize its reward 725, and in doing so may provide improved responses.
[0097] Training the reward model 720 may include, at block 704, providing a single prompt 722 to the SFT ML model 715 as an input. The input prompt 722 (e.g., any of the above (a)-(d)) may be provided via an input device (e.g., a keyboard) of the validation computing device 102. The prompt 722 may be previously unknown to the SFT ML model 715, e.g., the labelers may generate new prompt data, the prompt 722 may include testing data stored on internal database 118, and/or any other suitable prompt data. The SFT ML model 715 may generate multiple, different output responses 724A, 724B, 724C, 724D to the single prompt 722. In some embodiments, the different output responses 724A, 724B, 724C, 724D include suggested responses.
[0098] At block 704, the validation computing device 102 may output the responses 724A, 724B, 724C, 724D via any suitable technique, such as outputting via a display (e.g., as text responses), a speaker (e.g., as audio/voice responses), etc., for review by the data labelers.
[0099] The data labelers may provide feedback (e.g., via the validation computing device 102, etc.) on the responses 724A, 724B, 724C, 724D when ranking 726 them from best to worst based upon the prompt-response pairs. The data labelers may rank 726 the responses 724A, 724B, 724C, 724D by labeling the associated data. The ranked prompt-response pairs 728 may be used to train the reward model 720. In one aspect, the validation computing device 102 may load the reward model 720 and train the reward model 720 using the ranked prompt-response pairs 728 as input. The reward model 720 may provide as an output the scalar reward 725.
[0100] In one aspect, the scalar reward 725 may include a value numerically representing a human preference for the best and/or most expected response to a prompt, i.e., a higher scalar reward value may indicate the user is more likely to prefer that response, and a lower scalar reward may indicate that the user is less likely to prefer that response. For example, inputting the “winning” prompt-response (i.e., input-output) pair data to the reward model 720 may generate a winning reward. Inputting a “losing” prompt-response pair data to the same reward model 720 may generate a losing reward. The reward model 720 and/or scalar reward 725 may be updated based upon labelers ranking 726 additional prompt-response pairs generated in response to additional prompts 722.
[0101] In one example, a data labeler may provide to the SFT ML model 715 as an input prompt 722, “Describe the sky.” The input may be provided by the labeler (e.g., via the administrator computing device 170, etc.) to the validation computing device 102 running the chatbot utilizing the SFT ML model. The SFT ML model 715 may provide as output responses to the labeler (e.g., via their respective devices): (i) “the sky is above” 724A; (ii) “the sky includes the atmosphere and may be considered a place between the ground and outer space” 724B; and (iii) “the sky is heavenly” 724C. The data labeler may rank 726, via labeling the prompt-response pairs, prompt-response pair 722/724B as the most preferred answer; prompt-response pair 722/724A as a less preferred answer; and prompt-response 722/724C as the least preferred answer. The labeler may rank 726 the prompt-response pair data in any suitable manner. The ranked prompt-response pairs 728 may be provided to the reward model 720 to generate the scalar reward 725. It should be appreciated that this facilitates training the chatbot to correspond with a user (e.g., a validation requestor) to, for example, request additional information that may be used to validate the ingredient.
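To make the use of a ranking concrete, the sketch below expands a best-to-worst ranking such as the one in the preceding example into preferred/less-preferred pairs, and evaluates a pairwise loss on two reward values. The loss shown, the negative log-sigmoid of the reward difference, is a commonly used choice for training on comparison data and is an assumption here; this description does not specify a particular loss. The function names are likewise illustrative.

```python
import math
from itertools import combinations


def ranked_pairs(responses_best_to_worst: list) -> list:
    """Expand one ranking into (preferred, less_preferred) pairs."""
    return list(combinations(responses_best_to_worst, 2))


def pairwise_loss(reward_preferred: float, reward_other: float) -> float:
    """-log(sigmoid(r_preferred - r_other)), written as log(1 + exp(-difference))."""
    return math.log1p(math.exp(-(reward_preferred - reward_other)))


# Ranking from the example: 724B most preferred, then 724A, then 724C.
pairs = ranked_pairs(["724B", "724A", "724C"])
```

The loss is small when the preferred response receives the higher reward and grows when the ordering is reversed, which is the behavior a reward model trained on such pairs would be expected to learn.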
[0102] While the reward model 720 may provide the scalar reward 725 as an output, the reward model 720 may not generate a response (e.g., text). Rather, the scalar reward 725 may be used by a version of the SFT ML model 715 to generate more accurate responses to prompts, i.e., the SFT model 715 may generate the response such as text to the prompt, and the reward model 720 may receive the response to generate a scalar reward 725 of how well humans perceive it. Reinforcement learning may optimize the SFT model 715 with respect to the reward model 720 which may realize the configured ML chatbot model 750.
RLHF To Train The ML Chatbot Model
[0103] In one aspect, the validation computing device 102 may train the ML chatbot model 750 to generate a response 734 to a random, new and/or previously unknown user prompt 732. To generate the response 734, the ML chatbot model 750 may use a policy 735 (e.g., algorithm) which it learns during training of the reward model 720, and in doing so may advance from the SFT model 715 to the ML chatbot model 750. The policy 735 may represent a strategy that the ML chatbot model 750 learns to maximize its reward 725. As discussed herein, based upon prompt-response pairs, a human labeler may continuously provide feedback to assist in determining how well the ML chatbot’s 750 responses match expected responses to determine rewards 725. The rewards 725 may feed back into the ML chatbot model 750 to evolve the policy 735. Thus, the policy 735 may adjust the parameters of the ML chatbot model 750 based upon the rewards 725 it receives for generating good responses. The policy 735 may update as the ML chatbot model 750 provides responses 734 to additional prompts 732.
[0104] In one aspect, the response 734 of the ML chatbot model 750 using the policy 735 based upon the reward 725 may be compared 738 to the SFT ML model 715 (which may not use a policy) response 736 of the same prompt 732. The validation computing device 102 may compute a penalty 740 based upon the comparison 738 of the responses 734, 736. The penalty 740 may reduce the distance between the responses 734, 736, i.e., a statistical distance measuring how one probability distribution is different from a second, in one aspect the response 734 of the ML chatbot model 750 versus the response 736 of the SFT model 715. Using the penalty 740 to reduce the distance between the responses 734, 736 may avoid a server overoptimizing the reward model 720 and deviating too drastically from the human-intended/preferred response. Without the penalty 740, the ML chatbot model 750 optimizations may result in generating responses 734 which are unreasonable but may still result in the reward model 720 outputting a high reward 725.
[0105] In one aspect, the responses 734 of the ML chatbot model 750 using the current policy 735 may be passed, at block 706, to the rewards model 720, which may return the scalar reward 725. The ML chatbot model 750 response 734 may be compared 738 to the SFT ML model 715 response 736 to compute the penalty 740. A final reward 742 may be generated which may include the scalar reward 725 offset and/or restricted by the penalty 740. The final reward 742 may be provided to the ML chatbot model 750 and may update the policy 735, which in turn may improve the functionality of the ML chatbot model 750.
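By way of illustration only, the combination of the scalar reward 725 and the penalty 740 into the final reward 742 might be sketched as follows. The sketch assumes the statistical distance is a Kullback-Leibler divergence between discrete output distributions of the ML chatbot model 750 and the SFT model 715, and that the penalty is scaled by a coefficient `beta`; the disclosure names neither, and the function names and default value are hypothetical.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same outcomes.

    Assumes q[i] > 0 wherever p[i] > 0 (an assumption of this sketch).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def final_reward(
    scalar_reward: float,
    policy_probs: Sequence[float],
    sft_probs: Sequence[float],
    beta: float = 0.1,  # hypothetical penalty weight, not from the disclosure
) -> float:
    """Offset the reward-model score by a distance penalty.

    The penalty grows as the policy's distribution drifts away from the
    SFT model's distribution for the same prompt.
    """
    penalty = beta * kl_divergence(policy_probs, sft_probs)
    return scalar_reward - penalty
```

Under these assumptions, a response whose distribution matches the SFT model incurs no penalty, while a response that drifts away receives a final reward lower than the raw scalar reward.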
[0106] To optimize the ML chatbot 750 over time, RLHF via the human labeler feedback may continue ranking 726 responses of the ML chatbot model 750 versus outputs of earlier/other versions of the SFT ML model 715, i.e., providing positive or negative rewards 725. The RLHF may allow the validation computing device 102 to continue iteratively updating the reward model 720 and/or the policy 735. As a result, the ML chatbot model 750 may be retrained and/or fine-tuned based upon the human feedback via the RLHF process, and throughout continuing conversations may become increasingly efficient.
[0107] Although multiple blocks 702, 704, 706 are depicted in the exemplary block and logic diagram 700, each providing one of the three steps of the overall ML chatbot model 750 training, fewer and/or additional servers may be utilized and/or may provide the one or more steps of the chatbot training. In some embodiments, one server may provide the entire ML chatbot model 750 training.
Other Matters
[0108] Additionally, certain embodiments are described herein as including logic or a number of routines, subroutines, applications, or instructions. These may constitute either software (code embodied on a non-transitory, tangible machine-readable medium) or hardware. In hardware, the routines, etc., are tangible units capable of performing certain operations and may be configured or arranged in a certain manner. In example embodiments, one or more computer systems (e.g., a standalone, client or server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.
[0109] In various embodiments, a hardware module may be implemented mechanically or electronically. For example, a hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
[0110] Accordingly, the term “hardware module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where the hardware modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware modules at different times. Software may accordingly configure a processor, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.
[0111] Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple of such hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
[0112] The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions. The modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
[0113] Similarly, the methods or routines described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented hardware modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of geographic locations.
[0114] Furthermore, the patent claims at the end of this patent application are not intended to be construed under 35 U.S.C. § 112(f) unless traditional means-plus-function language is expressly recited, such as “means for” or “step for” language being explicitly recited in the claim(s). The systems and methods described herein are directed to an improvement to computer functionality, and improve the functioning of conventional computers.
Claims
1. A computer-implemented method for validating ingredients, the method comprising: receiving, via one or more processors, a document including ingredient information; receiving, via the one or more processors, validation information comprising (i) regulatory information from a government entity, and/or (ii) compliance information from a company; and validating, via the one or more processors, an ingredient indicated by the ingredient information based on the validation information.
2. The computer-implemented method of claim 1, wherein the receiving the validation information comprises: receiving, via the one or more processors, first validation information; receiving, via the one or more processors, second validation information; and aggregating, via the one or more processors, the first validation information with the second validation information to thereby create the validation information.
3. The computer-implemented method of claim 2, wherein: the first validation information is received from the government entity and comprises the regulatory information; and the second validation information is received from the company and comprises the compliance information.
4. The computer-implemented method of any one of claims 2-3, further comprising, subsequent to the aggregating: receiving, via one or more processors, third validation information; and updating, via the one or more processors, the validation information by replacing either the first validation information or the second validation information with the third validation information.
5. The computer-implemented method of any one of claims 1-4, further comprising: determining, via the one or more processors, a validity date of the document from the validation information, wherein the validity date is comprised in an element of the document, and wherein the validity date comprises: (i) a valid from date, or (ii) a valid to date; comparing, via the one or more processors, the validity date of the document to a current date to thereby determine that the document is not valid; in response to the determination that the document is not valid, requesting, via the one or more processors, an additional document; in response to requesting the additional document, receiving, via the one or more processors, the additional document; and updating, via the one or more processors, the validation information based on the additional document.
6. The computer-implemented method of any one of claims 1-5, wherein the document comprises a standardized life sciences document.
7. The computer-implemented method of claim 6, further comprising: in response to the validating of the ingredient, forwarding, via the one or more processors, the standardized life sciences document to a government computing device.
8. The computer-implemented method of any one of claims 6-7, wherein: the ingredient information included in the standardized life sciences document includes an element including an element name and/or element value; and the element name and/or element value indicates a name of the ingredient.
9. The computer-implemented method of any one of claims 6-8, wherein: the ingredient information comprises ingredient type information; the standardized life sciences document comprises a document category and a document subcategory; and the document subcategory indicates the ingredient type information.
10. The computer-implemented method of any one of claims 1-9, further comprising: based on the validating, constructing, via the one or more processors, a standardized life sciences document having a document category of a certificate and a subcategory of: a certificate of analysis; a third-party certification of a good manufacturing practice; a third-party certification of quality management; or a third party certification of Kosher compliance.
11. The computer-implemented method of any one of claims 1-10, wherein the validating comprises: determining, via the one or more processors, a maximum amount of the ingredient based on the validation information; and comparing, via the one or more processors, the maximum amount of the ingredient to an amount of the ingredient from the ingredient information.
12. The computer-implemented method of any one of claims 1-11, wherein the validating comprises: determining, via the one or more processors, a validity date of the document from the validation information; and comparing, via the one or more processors, the validity date to a current date.
13. The computer-implemented method of any one of claims 1-12, wherein the validating comprises validating the ingredient based on an electronic signature of the document.
14. The computer-implemented method of any one of claims 1-13, wherein the document comprises a standardized life sciences document having a document category of a certificate and a subcategory of: a certificate of analysis; a third-party certification of a good manufacturing practice; a third-party certification of quality management; or a third party certification of Kosher compliance; and wherein the validating comprises validating the ingredient based on the document category and/or subcategory.
15. The computer-implemented method of any one of claims 1-14, further comprising identifying the ingredient by: determining, via the one or more processors, properties of the ingredient by analyzing text of the document; and identifying, via the one or more processors, the ingredient based on the determined properties.
16. A computer system for validating ingredients, the computer system comprising one or more processors configured to: receive a document including ingredient information; receive validation information comprising (i) regulatory information from a government entity, and/or (ii) compliance information from a company; and validate an ingredient indicated by the ingredient information based on the validation information.
17. The system of claim 16, wherein: the ingredient comprises a first ingredient; the one or more processors are further configured to perform the validate by: determining a second ingredient from the ingredient information; and determining if the first and second ingredients may be combined based on the validation information.
18. The computer system of any one of claims 16-17, further comprising a display device, and wherein the one or more processors are further configured to display an indication of if the ingredient has been validated on the display device.
19. The computer system of any one of claims 16-18, wherein the document comprises a standardized life sciences document comprising a standardized pharmaceutical document, a standardized biopharmaceutical document, a standardized nutritional document, or a standardized aroma-ingredient document.
20. The computer system of any one of claims 16-19, wherein the one or more processors are further configured to perform the validate by: determining, via an artificial intelligence (AI) or machine learning (ML) chatbot, additional information necessary to validate the ingredient based on: (i) the ingredient information, and (ii) the validation information; composing, via the AI or ML chatbot, a communication requesting the additional information necessary to validate the ingredient; and sending the communication to a laboratory computing device or a manufacturer computing device.
21. The computer system of claim 20, wherein the one or more processors are configured to train the AI or ML chatbot using historical data comprising: (a) historical ingredient information, (b) historical validation information, (c) historical standardized documents, (d) historical communications from validation requestors, and/or (e) historical communications from validating parties.
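As a non-limiting illustration of the validation steps recited above (aggregating validation information, comparing a validity date to a current date, and comparing an ingredient amount to a maximum amount), a minimal sketch might look as follows. All class names, field names, and sample values are hypothetical. The rule of keeping the lower of two limits when aggregating is an assumption of this sketch, since the claims do not specify how the first and second validation information are combined.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass
class IngredientInfo:
    name: str
    amount: float  # assumed to be in the same unit as the limits


@dataclass
class Document:
    ingredient: IngredientInfo
    valid_from: Optional[date] = None
    valid_to: Optional[date] = None


def aggregate(regulatory: Dict[str, float], compliance: Dict[str, float]) -> Dict[str, float]:
    """Combine two limit tables; keeps the stricter limit (sketch assumption)."""
    merged = dict(regulatory)
    for name, limit in compliance.items():
        merged[name] = min(limit, merged.get(name, limit))
    return merged


def is_document_current(doc: Document, today: date) -> bool:
    """Compare the document's validity dates to the current date."""
    if doc.valid_from is not None and today < doc.valid_from:
        return False
    if doc.valid_to is not None and today > doc.valid_to:
        return False
    return True


def validate(doc: Document, max_amounts: Dict[str, float], today: date) -> bool:
    """Validate the ingredient against the aggregated validation information."""
    if not is_document_current(doc, today):
        return False
    limit = max_amounts.get(doc.ingredient.name)
    if limit is None:
        return False  # no applicable rule: treated as not validated here
    return doc.ingredient.amount <= limit


# Usage with made-up sample values.
limits = aggregate({"caffeine": 200.0}, {"caffeine": 150.0, "taurine": 1000.0})
doc = Document(IngredientInfo("caffeine", 120.0), valid_to=date(2030, 1, 1))
result = validate(doc, limits, date(2024, 2, 28))
```

In this sketch an expired document fails validation before any amount is checked; a fuller implementation might instead request an additional document, as described for the expired-document case.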
Applications Claiming Priority (12)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363449704P | 2023-03-03 | 2023-03-03 | |
| US202363449719P | 2023-03-03 | 2023-03-03 | |
| US202363449682P | 2023-03-03 | 2023-03-03 | |
| US63/449,682 | 2023-03-03 | | |
| US63/449,704 | 2023-03-03 | | |
| US63/449,719 | 2023-03-03 | | |
| US202363461689P | 2023-04-25 | 2023-04-25 | |
| US202363461707P | 2023-04-25 | 2023-04-25 | |
| US202363461698P | 2023-04-25 | 2023-04-25 | |
| US63/461,698 | 2023-04-25 | | |
| US63/461,707 | 2023-04-25 | | |
| US63/461,689 | 2023-04-25 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024184738A1 (en) | 2024-09-12 |
Family
ID=90195454
Family Applications (3)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IB2024/051916 Ceased WO2024184738A1 (en) | 2023-03-03 | 2024-02-28 | Validation and maintenance of electronic life sciences industries data |
| PCT/IB2024/051920 Ceased WO2024184740A1 (en) | 2023-03-03 | 2024-02-28 | Confidential disclosures of life sciences industries data |
| PCT/IB2024/051914 Ceased WO2024184737A1 (en) | 2023-03-03 | 2024-02-28 | File creation using artificial intelligence (ai), natural language processing (nlp), and other techniques for product, quality and regulatory information in the life science industries |
Family Applications After (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IB2024/051920 Ceased WO2024184740A1 (en) | 2023-03-03 | 2024-02-28 | Confidential disclosures of life sciences industries data |
| PCT/IB2024/051914 Ceased WO2024184737A1 (en) | 2023-03-03 | 2024-02-28 | File creation using artificial intelligence (ai), natural language processing (nlp), and other techniques for product, quality and regulatory information in the life science industries |
Country Status (1)
| Country | Link |
|---|---|
| WO (3) | WO2024184738A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20040098282A1 (en) * | 2001-01-16 | 2004-05-20 | Menachem Levy | Data retrieval and report generation system for foodstuffs |
| US20150186900A1 (en) * | 2011-02-17 | 2015-07-02 | Ithos Global, Inc. | Product safety assessment information management system |
| US20200080980A1 (en) * | 2017-05-22 | 2020-03-12 | Valisure Llc | Methods for validating medication |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9537650B2 (en) * | 2009-12-15 | 2017-01-03 | Microsoft Technology Licensing, Llc | Verifiable trust for data through wrapper composition |
| US20210295031A1 (en) * | 2019-03-01 | 2021-09-23 | Iqvia Inc. | Automated classification and interpretation of life science documents |
| US20210049239A1 (en) * | 2019-08-16 | 2021-02-18 | Microsoft Technology Licensing, Llc | Multi-layer document structural info extraction framework |
| WO2021178689A1 (en) * | 2020-03-04 | 2021-09-10 | nference, inc. | Systems and methods for computing with private healthcare data |
| US11367008B2 (en) * | 2020-05-01 | 2022-06-21 | Cognitive Ops Inc. | Artificial intelligence techniques for improving efficiency |
| EP4009194A1 (en) * | 2020-12-04 | 2022-06-08 | IQVIA Inc. | Automated classification and interpretation of life science documents |
-
2024
- 2024-02-28 WO PCT/IB2024/051916 patent/WO2024184738A1/en not_active Ceased
- 2024-02-28 WO PCT/IB2024/051920 patent/WO2024184740A1/en not_active Ceased
- 2024-02-28 WO PCT/IB2024/051914 patent/WO2024184737A1/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024184740A1 (en) | 2024-09-12 |
| WO2024184737A1 (en) | 2024-09-12 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24709194; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |