RFI Questions & Answers (updated July 23, 2018)*
*most recent updates colored yellow at top of list
Referring to Req. SR-DT-006, please specify how the attribute column should be added: Option 1: add an empty named column; Option 2: fill it with provided content; Option 3: call some service to gather the appropriate data.
Response: The concept for manipulating datasets is to permit the user to create, manage, and execute R scripts either directly or through Shiny applications developed and deployed through the application. Dedicated UI elements should not be implemented for this purpose; we are planning to rely on R and Shiny capabilities to extend the capabilities of the application.
Referring to Req. SR-MM-015, we would like to know whether the list of data types provided is final, or whether it can be extended and should be configurable.
Response: This capability should be fully configurable by organizations adopting the system or by end users as needed. The list of file types, formats, etc. should not be limited in any way.
Referring to Req. SR-MM-016 please clarify what you mean by “entity status”.
Response: This particular requirement refers to data quality status, which is a property of the data contained within the file being loaded for raw bioanalytical concentration or clinical data (i.e., data elements supporting the concentration data). Generally this refers to whether the data being provided or loaded to the system is of draft or final confirmed quality status. However, these quality status meta-data can appear in the data file itself or can be applied as a meta-data item, and should be completely flexible, i.e., not necessarily required and able to use differing terms, because not all organizations track such data in the same way.
Referring to Req. SR-DT-031 please specify the terms "samples", "parameters" and "identification" in context of this requirement.
Response: This is an application that will work with pharmacokinetic bioanalytical data. Samples refer to individual bioanalytical samples drawn from test subjects, animal or human, for which a bioanalytical test method is applied to determine the concentration of one or more chemical compounds contained within the sample. Each of these concentration results will be associated with one data record. One or more data records make up a grouping of data generally referred to as a pharmacokinetic profile. Parameters refer to the pharmacokinetic parameters generated from the pharmacokinetic analysis of a profile of data, for example, TMAX, CMAX, AUC, AUCINF, RACC, etc. Identification of profiles with fewer than 3 records or samples associated is generally a data quality consideration with regards to generation of the pharmacokinetic parameters.
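To make these terms concrete, here is a minimal Python sketch of per-profile parameter generation (the function name, the 3-record minimum check, and the linear trapezoidal AUC rule are illustrative assumptions; the authoritative computation rules are in the Computation Engine Specifications):

```python
def nca_parameters(times, concs):
    """Compute basic NCA parameters (CMAX, TMAX, AUClast) for one profile.

    times, concs: parallel lists of sampling times and concentrations for
    the records that make up a single pharmacokinetic profile. Uses the
    linear trapezoidal rule for AUClast. Illustrative sketch only.
    """
    if len(times) < 3:
        # Profiles with fewer than 3 records are flagged as a data
        # quality concern for parameter generation (see SR-DT-031).
        raise ValueError("profile has fewer than 3 samples")
    cmax = max(concs)
    tmax = times[concs.index(cmax)]  # first time at which CMAX is observed
    auc = sum((t2 - t1) * (c1 + c2) / 2.0
              for t1, t2, c1, c2 in zip(times, times[1:], concs, concs[1:]))
    return {"CMAX": cmax, "TMAX": tmax, "AUCLAST": auc}
```

For example, a 4-point profile at times 0, 1, 2, 4 with concentrations 0, 10, 8, 2 yields CMAX 10 at TMAX 1.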
Referring to Req. SR-GU-010 please explain, what do you mean by "configurable identifier for frame/step"?
Response: The envisioned system will have “entities” associated with certain activities within the system. DT refers to a “Data Transformation” step, which can be used to modify the loaded data, merge various data elements (concentration with demography data, concentration with clinical Case Report Form data, concentration with protein binding data), transform data (scale concentration, dosing, or other data to appropriate units), etc. These transformations are expected to be implemented via previously generated, qualified, and appropriately parameterized R scripts and/or Shiny applications, or by permitting the user to create ad hoc scripts if necessary. The scripts will be stored either in system-wide managed global library locations or within a user’s library for scripts. DL refers to a “Data Load” step, which is associated with importing data into the system via loading from disk or via integration with data services that may be integrated with other local sources of data within an organization. AN or “Analysis” steps refer to those steps within the system that implement primary and secondary or derived pharmacokinetic parameter generation via the Computation Engine library of R code. The “steps” are intended to represent elements of graphical relationships between the entities, permitting the user to recognize the data flow between DL, DT, AN, RP (Reporting), etc. Each of these “relationships” is considered to be a “lineage”. Each of the entities in the lineage must have “identifiers” that identify the entity type, allow for multiple entities of the same type within a lineage, permit relationships between lineages (representing DT entities that merge data from disparate DLs), and permit sufficiently configurable meta-data that allows the nature of the particular entity to be determined via visual or programmatic inspection, i.e., meta-data that can be surfaced in the lineage map and obtained by inspection from a properties list of the entity.
Generally the idea is that the entities will have identifiers such as DL01, DT01, AN03, RP02, etc. and that the lineage map can both be represented graphically and in tabular fashion to permit identification of important elements of the data lineage.
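The identifier scheme described above might be sketched as follows (the class name, meta-data fields, and zero-padded numbering are assumptions for illustration; the real scheme is intended to be configurable):

```python
from collections import defaultdict

class Lineage:
    """Minimal sketch of lineage entity identifiers (DL01, DT01, AN03, ...)
    with both graphical (edges) and tabular views of the relationships."""

    def __init__(self):
        self._counts = defaultdict(int)
        self.entities = {}   # identifier -> meta-data dict
        self.edges = []      # (parent_id, child_id) relationships

    def add(self, entity_type, parents=(), **metadata):
        # Number entities of the same type sequentially within the lineage.
        self._counts[entity_type] += 1
        ident = f"{entity_type}{self._counts[entity_type]:02d}"
        self.entities[ident] = metadata
        for p in parents:
            self.edges.append((p, ident))
        return ident

    def tabular(self):
        """Tabular representation: one row per relationship, with the
        child entity's meta-data available for inspection."""
        return [(p, c, self.entities[c]) for p, c in self.edges]
```

A usage sketch: `dl = lin.add("DL")` yields `DL01`, then `dt = lin.add("DT", parents=[dl])` yields `DT01`, and `lin.tabular()` lists the DL01 → DT01 relationship.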
Referring to Req. SR-DT-026, we assume, that the system should be able to work in "sandbox" mode with ability to evaluate the workflow and stop on selected step, is this the correct understanding of the requirement?
Response: The system is not intended to be rigidly adherent to a specific workflow, but it is anticipated that the system will be able to “replay” data transformation, analysis, and reporting steps/entities/nodes in stepwise fashion if needed. One use case in particular would be to permit re-execution if a lineage or portion of a lineage were copied to apply to a differing data load node/entity/dataset. This use case is particularly useful in cases where a single data load may have a single analyte but other data loads contain other analytes, perhaps metabolites, that have strong relationships to the original lineage, specifically since they may emanate from a single sample. In this case, the user may want to reapply all of the data transformations, analyses, and reporting steps with some modifications to customize them to the alternative analyte. The intent of this requirement was to be able to “replay” the sequence of a lineage made up of the individual entities: DataLoad, DataTransformation, Analysis, Report entities, etc. This is important when copying a lineage to substitute a DataLoad and re-apply DataTransformations. The use case for this is when a lineage has been developed for one analyte and is then copied and re-applied to other analytes to replicate the same types of DataTransformations.
Referring to Req. SR-AN-045, please double check if this is the complete list of data filters/controls? If it is not complete, please specify, what are other controls the system shall support?
Response: No, the list is not complete. At a minimum (others may be defined in the solution-definition process for this project), the following controls are required:
- TAUi, with a scope control (all profiles within the selected subset of data, all profiles for the selected subject, or all profiles for all subjects)
- TOLDi, with the same scope control
- Partial area selections
- Axis (x/y) limit controls
- Switching between log/linear y-axis scales
- Switching between graph and tabular views of the data (or maintaining the tabular view as a slide-out drawer or similar device)
- Selection of the points defining the terminal elimination phase, with the same scope control
- Selection of points to be excluded from summary statistics, with the same scope control
- Selection of points to be excluded from AUC calculations, with the same scope control
- Selection of the parameters displayed in the Analysis module for interactive assessment of the goodness of fit of the terminal elimination phase rate constant and other parameters of interest
- Model selection control
- AUC calculation method control
- Variable input mapping controls for selection of concentration, time (nominal/actual), dose, sample amount, and the units for each
- Controls to apply selections from other lineages
- Controls to estimate the regression for the terminal elimination phase rate constant automatically, based upon configurable quality criteria (R^2, adjusted R^2, minimum number of points, span ratio, etc.)
- Optional display of the “regression line”/line connecting the points subject to AUC calculations, and the line connecting all points
- Markers for data points below LLOQ/BLQ on the log-linear scale (since the log of zero cannot be taken)
- Full-screen view control
- Controls to allow the analyst to apply/edit/delete comments on the profile as a whole or on individual points (these are stored in a data variable and provided for reporting)
- Controls to inspect other data items (these can include items that provide context to the analysis, such as a data item indicating that a subject reported emesis within or prior to the dosing interval)
- Controls to permit on-the-fly visualizations of the computed parameters (generally aggregated tables and plots of the parameters computed up to the point at which the visualization is generated)
- Controls to select the next profile, next subject, or next subset of data
Referring to Req. SR-AN-057, we assume, that the system should graphically display the full chain of loading, transformations, calculations and publications, is it correct understanding?
Response: This is correct. The graphical display should support a method for the user to inspect the properties of each entity, i.e., the meta-data of the entity. Note that a tabular representation of the graphical display will also be required, in order to be able to discern the properties or meta-data of each entity and associated lineage in comparison to those of other lineages. It is anticipated that the graphical view alone will not permit an integrated understanding of all of the properties of each entity simultaneously, nor always permit selection of lineages or individual entity objects; thus a tabular view will also be necessary.
Referring to Req. SR-DT-030, please clarify, what "externally available" means? Should such data be available via REST API? Or it should be published as a public FTP/HTTP resource?
Response: This requirement means that transformed datasets, along with input or raw datasets (concentration or otherwise), analysis results or parameter datasets, and reporting items, will be shareable via a web or other interface with controls that limit access according to access rights for data by project/protocol/lineage or individual dataset, in order to protect restricted data or data subject to blinding controls.
Referring to Req. SR-DT-008, please clarify the meaning of the term "profile" in the context of this RFI.
Response: Please refer to the response to the prior question concerning SR-DT-031, i.e. question 4 in this grouping.
Referring to Req. SR-DT-012, please specify, how or to what the system shall automatically apply the default profile.
Response: Individual profiles can be defined from input datasets by automated means applying default specifications by input file type. These specifications should determine which grouping of variables are used to define the profile. For example, the data can be defined by Study Design Elements that can be composed of data fields such as Study, Site ID, Subject ID, Randomization ID, Sample Matrix, Treatment Label or ID, Treatment Code, Period, Period Unit, Visit, Visit Unit, Cycle or Phase of Study, PK Sample Collection Type (Interval/Point), Analyte ID, PCMethod (bioanalytical method utilized to determine the concentration of the analyte) and others. Unique combinations of these variables/data fields will identify groups of concentration – time data points that can be considered a profile. Individual data points will then be identified by unique sample IDs and nominal or planned time of collection. Each individual group of records would then be assigned a unique profile identifier. This particular example is by no means the only scenario for profile identification. Each individual file specification, which should be configurable and managed through the application database, can define an approach for which variables may be used for profile identification. Further, the analyst may define and re-assign profile membership through DataTransformations manipulating which records belong to which profile identifier to facilitate groupings of data that define profiles to perform analyses as necessary. Note that for some analyses, records can belong to more than one profile/profile identifier.
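As an illustration of profile identification by Study Design Elements, a Python/pandas sketch follows (the column list and identifier format are hypothetical; each file specification would define its own grouping variables):

```python
import pandas as pd

# Hypothetical Study Design Element columns used by a file specification;
# the real list is configurable per input file type.
SDE_COLUMNS = ["STUDY", "SUBJID", "TRT", "PERIOD", "ANALYTE"]

def assign_profile_ids(df, sde_columns=SDE_COLUMNS):
    """Assign a unique profile identifier to each unique combination of
    Study Design Element variables (one group of concentration-time
    records per combination)."""
    out = df.copy()
    out["PROFILE_ID"] = (
        out.groupby(sde_columns, sort=False)
           .ngroup()            # group number in order of appearance
           .add(1)
           .map("P{:04d}".format)
    )
    return out
```

Records sharing the same combination of the grouping variables receive the same `PROFILE_ID`; individual points within a profile would then be distinguished by sample ID and nominal collection time.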
Referring to Req. SR-ML-016, please specify the system behavior for the objects that were shared and then deleted
Response: If the entity was shared externally, i.e. published, then it cannot be deleted without removing the sharing status. Note that an entity can be versioned/cloned/copied while shared/published but the specific version of the entity cannot be deleted while published. Groups of lineages or entities should be optionally identified via a publishing or reporting event identifier that allows them to share the content contained as members. These members are maintained as a group within that event but can be members of other sharing/publishing/reporting events. Scenarios of these types of events are assembling membership that defines all of the content for a Clinical Study Report OR for a Regulatory Submission OR for a Data Monitoring Committee review. Across these events, it will be possible to share some of the same and disparate entities or entire lineages containing the entities of interest.
Referring to Req. UR-DL-FV-002, please specify if the file validator itself can be a standalone tool.
Response: Yes; however, there are core functions that should be accessible to the user when running the application in standalone mode, i.e., incorporated into the application itself, that would allow the user to verify/validate a file import/data load relative to a file specification for that file type. The need for file validation in both modes may exist to permit bulk loading and validation of data to a database/repository/warehouse as well as to a local standalone application instance.
Referring to Req. SR-AA-011, how should anonymous access to the Global library be provided? What are the access limitations for the anonymous user group? Please provide more details on this requirement.
Response: Further consideration of this requirement may be necessary. It refers to the need to be able to execute the application in standalone mode. How the library functions would operate, or whether they would be viable in this mode, has not been fully considered. One approach would be to maintain default content of a global library as a configurable option for the application, so that the application could be delivered with a minimally sufficient set of functions required for the “global library” in standalone mode.
Referring to Req. SR-AA-016, please specify the meaning of the term "inactive"; shall we read it as "offline" or as "deactivated"?
Response: This refers to users who have been deactivated in the system. If a user has been deactivated, their identity must not “disappear”. All lineages, entities, annotations, and log entries associated with this user must remain available even if the user is no longer a valid user in the system.
Referring to Req. SR-MM-023, please clarify the meaning of the term "Data Characteristic".
Response: Meta-data characteristics refer to whether the metadata allows 1) multiple values, 2) is user editable or is fixed/static 3) is stored in the NCA application repository or is reference to an external source of meta-data 4) whether it can have downstream dependencies, particularly if the metadata is an external reference.
Referring to Req. SR-AL-049, please clarify whether the auto-loading should be scheduled or event-based, or both.
Response: It is anticipated that queue based loading of data loading events with separable prioritization for individual queues would be provisioned. This would allow organizations adopting the system to stage data loading for historical data in separate queue(s) than those for live or current data that may have higher processing priority. Scheduling of queues may be a desirable property for some organizations that wish to stage higher volume but lower priority data loads to off peak periods.
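A minimal sketch of queue-based loading with per-queue prioritization, using Python's standard `heapq` (the queue semantics and the smaller-number-wins priority convention are assumptions for illustration):

```python
import heapq
import itertools

class LoadQueue:
    """Sketch of queue-based data loading with separable prioritization.

    An organization might route live/current data to a high-priority
    queue and historical backfill to a low-priority one; a scheduler
    could additionally defer low-priority work to off-peak periods.
    Smaller priority numbers are served first.
    """
    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order

    def submit(self, priority, load_event):
        heapq.heappush(self._heap, (priority, next(self._order), load_event))

    def next_event(self):
        return heapq.heappop(self._heap)[2]
```

With a live load at priority 1 and historical batches at priority 10, the live load is always dispatched first while the batches retain submission order among themselves.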
Referring to Req. SR-ML-001.1, we suspect there is a typo in WEBDEV, maybe it should be WEBDAV?
Response: That is correct, this was a typographical error and was intended to be WEBDAV.
Referring to Req. SR-SR-003, is it assumed, that a visual query builder should be used, or user should enter a query directly?
Response: Both capabilities are desirable depending on the specific search type.
Referring to Req. SR-SR-011 and Req. SR-SR-012, the entities mentioned in these requirements look rather like MetaData, not like columns. Is this the correct interpretation?
Response: Yes this is correct. These are typical “project” or “clinical protocol” meta-data. These are only examples and should be configurable by installation, potentially by data source.
Referring to Req. SR-SR-036, we assume, that "operators to be grouped" means creating complex expressions using multiple operators, is this correct?
Response: This is correct. For example: GENERIC.NAME != “metformin” AND (PKCNCN > 100 AND DOSE < 3.0) AND AGE < 17 AND AGEU == “YR”, and so on.
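Assuming a pandas-style backing store, grouped operators could be evaluated roughly like this (a sketch only, not the proposed implementation; column names follow the example above, with `GENERIC.NAME` written as `GENERIC_NAME` because pandas identifiers cannot contain dots):

```python
import pandas as pd

df = pd.DataFrame({
    "GENERIC_NAME": ["metformin", "drugx", "drugx"],
    "PKCNCN": [150.0, 120.0, 90.0],
    "DOSE": [2.5, 2.0, 2.0],
    "AGE": [15, 16, 16],
    "AGEU": ["YR", "YR", "YR"],
})

# Parenthesized grouping controls operator precedence, as in the
# RFI example expression.
criteria = ('GENERIC_NAME != "metformin" '
            'and (PKCNCN > 100 and DOSE < 3.0) '
            'and AGE < 17 and AGEU == "YR"')
subset = df.query(criteria)
```

Only the second row survives: the first fails the analyte-name clause and the third fails the parenthesized concentration clause.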
Referring to Req. SR-DT-004, can we read it as "concatenate" or "merge" instead of “append”?
Response: In this particular context, append can be read as either concatenate or merge. Both capabilities are required in the context of Data Transformation. Two or more datasets may be concatenated into a final complete aggregated data load because the individual data loads may represent incremental data loads provided during the lifecycle of a clinical trial. Merging individual columns may be necessary in numerous contexts; as an example, if the protein binding data for one or more analytes or the demography for subjects is made available in separate datasets, these may need to be merged to permit analysis and reporting within the system.
Referring to Req. SR-DT-007, please specify the meaning of the term "Study Design Elements"?
Response: Please see the response for SR-DT-012 where the concept of “Study Design Elements” is referred to.
Referring to Req. SR-DT-013, does this requirement mean creation of subsets of distinct values? Should they be stored or calculated on the fly?
Response: Generally, subsets of distinct values, but these could be expressed in a more complex manner. For example, DOSE==100 & SEX==MALE could be a user-specified subset specification. Another example is to create automated subsets based upon distinct values, i.e., a set for SEX==MALE, another for SEX==FEMALE, others for each value of DOSE or TREATMENT, etc. Subsets should be managed as “criteria” and only be materialized (“calculated on the fly”) as needed for each action (Visualization/Reporting/Analysis/Secondary-Derived Analysis/etc.).
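A small Python/pandas sketch of this criteria-based approach (function names are hypothetical; the point is that subsets live as criterion strings and rows are only materialized when an action needs them):

```python
import pandas as pd

def automated_subsets(df, column):
    """Generate one named subset criterion per distinct value of a column,
    e.g. a set for SEX == 'MALE' and another for SEX == 'FEMALE'.
    Criteria are stored as strings, not as materialized row sets."""
    return {f"{column}=={v!r}": f"{column} == {v!r}"
            for v in df[column].unique()}

def materialize(df, criterion):
    """Materialize a stored subset criterion 'on the fly' for an action
    (visualization, reporting, analysis, ...)."""
    return df.query(criterion)
```

A user-specified criterion works the same way, e.g. `materialize(df, "DOSE == 100 and SEX == 'MALE'")`.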
Referring to Req. SR-DT-018, please clarify what "conversion to CDISC standard" means here? Transform only values to CDISC standard units? Or it means conversion of whole dataset to SDTM or SEND formats?
Response: This requirement refers to data unit management and transformation (conversion of units for concentration from mg/L to ng/mL for example) regardless the “data standard”. It does not refer to conversion to differing data formats but allows for units management regardless the native units provided in the input datasets.
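Unit management of this kind reduces to configurable conversion factors; a minimal sketch follows, assuming a hard-coded factor table that a real installation would replace with managed unit meta-data:

```python
# Hypothetical conversion-factor table keyed to a canonical ng/mL unit;
# a real installation would manage this as configurable unit meta-data.
TO_NG_PER_ML = {
    "ng/mL": 1.0,
    "ug/mL": 1_000.0,
    "mg/L": 1_000.0,        # 1 mg/L == 1 ug/mL == 1000 ng/mL
    "mg/mL": 1_000_000.0,
}

def convert_concentration(value, from_unit, to_unit="ng/mL"):
    """Convert a concentration between mass-per-volume units by routing
    through the canonical ng/mL representation."""
    return value * TO_NG_PER_ML[from_unit] / TO_NG_PER_ML[to_unit]
```

For example, the mg/L to ng/mL conversion mentioned in the response: 2.5 mg/L converts to 2500 ng/mL, and the round trip recovers 2.5.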
Referring to Req. SR-DT-067, please specify the meaning of "audit user settings on data visualization on the last dataset in the data lineage".
Response: The system will permit multiple Data Transformations in a lineage. The intent is that the system will capture the results of each of the Data Transformations as an aggregated result in the system audit/log.
Referring to Req. SR-DT-069, shall the system be compliant with the EU GDPR standard as well?
Response: This is a data-centric scientific application that may, depending upon how data are presented to the system by individual organizations that adopt the POSSC NCA Application, incorporate individually identifiable details of subjects. In such cases, the system, the associated project files for local storage, and the integration with associated database/warehouse environments should facilitate the identification of Sensitive Personally Identifiable data, its removal, and the other operations that EU GDPR regulations require.
Referring to Req. SR-DT-071, please specify the meaning of "capture exclusions".
Response: This is an audit trail/logging requirement associated with modifications implemented to input datasets, i.e. DataLoads. All actions taken to modify DataLoads in subsequent entities identified as DataTransformations as well as the results of those actions will be captured in the audit trail/log for the system and identified by the lineage. Some of those DataTransformations can be identified as “data exclusions” from processing, i.e. exclusion or removal of input data records from analysis and reporting. The reasons for exclusion would be captured in the system audit trail/log. Example scenario is that if there is a data item or other independent information that indicates that a subject experienced emesis during or prior to a dosing interval, the analyst may exclude that data from analyses for concern that incomplete dose exposure or administration was a result.
Please double check if Req. SR-DT-073 is complete.
Response: This requirement was inadvertently corrupted. The final requirement is as follows: “The system shall provide the ability to produce a report of all data manipulations associated with a dataset, in the same order as performed, including the full data lineage. ”
Referring to Req. SR-FV-011, please specify the reason to allow the user to specify log file location.
Response: The rationale is to permit the configuration of the File Validator facility to be set as required for differing sources and queues to support automated processing. Example scenarios include supporting a location/queue for the standard production sources of data files as well as a location/queue for processing of legacy or bulk data for input. These locations/queues can be configured to execute at differing priorities and from differing sources of data for processing.
Referring to Req. UR-LR-05, are there any data retention policies applicable to logs? E.g.: "remove logs records older than 1 year".
Response: It must be possible to configure log and audit record retention to comply with individual organizations’ internal SOPs and policies. The one-year retention period cited is simply an example.
Please specify Req. SR-FV-014.
Response: This requires a configuration to the File Validation facility to configure logging of File Validation execution results as a default condition.
Referring to Req. SR-FV-013, does this requirement relate to transformation logs or audit logs?
Response: Neither. This refers to logging supporting a File Validation facility that can run independently of the primary NCA application and permits configuration of logging for execution of this facility.
Referring to Req. SR-AP-06, Is this requirement describing global settings for system or for particular data loading or transformation?
Response: The system must, at a minimum for restricted data or data otherwise subject to blinded access controls, record the details of READ access to the associated datasets for both the first and the most recent access of each dataset.
Referring to Req. SR-AP-014, please clarify the meaning of "maintain an audit trail for all resolutions of user requests"?
Response: This requirement indicates that the system should maintain records of the identity and outcome/resolution of each transaction that the user initiates through the system.
Referring to Req. SR-AN-031 and Req. SR-AN-042, please clarify the meaning of "KEL range"?
Response: The “KEL range” is the range of selected data points from an individual concentration-time profile that is subjected to log-linear regression to calculate the KEL, or terminal elimination phase rate constant. Note that while the range may be presented as a number of data points from times T1 to Tn, data points may be excluded from that range interactively by the user setting individual FLAG variables that select or deselect individual data points from the regression.
Referring to Req. SR-DT-024, please specify, if the system shall support the preview on portion of data or on whole dataset?
Response: It should be possible to preview the entire dataset.
Referring to Req. SR-DT-022, please clarify if the system shall support deletion of the transformation itself or of the transformation result?
Response: Both should be possible.
Referring to Req. SR-ML-004, shall the system support preview before data loading?
Response: Yes. Preview of data file and field contents during the process of data loading should be possible to support mapping of data to an internal representation of the data fields.
Referring to Req. SR-PD-006, please provide the complete list of external standards the system shall support.
Response: This requirement is intended to require the system to provide facilities to “map” file content from input files to an internal standard and thus be configurable as necessary both for import and export. The initial list of file formats include ASCII (csv, tab delimited), sas transport, Microsoft Excel spreadsheet (xls, xlsx), R (.Rdata) file types.
Referring to Req. SR-AN-077, how the analysis dependencies mentioned in this requirement can be defined in the system?
Response: These types of derived analyses can be driven by characteristics incorporated in the setup of the primary analyses and by characteristics of the input concentration data itself. For example, computation of metabolite ratios can be done when the system detects that the dataset has multiple analytes with one of the analytes defined as the parent analyte. The remainder would be considered metabolites, and the ratio of metabolite to parent for selected parameters would be computed automatically. There are a number of these scenarios identified in some detail in additional documentation, the Computation Engine Specifications document, which will be provided separately.
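The metabolite-to-parent ratio computation described above could be sketched as follows (the data shapes and names are assumptions; the authoritative scenarios are in the Computation Engine Specifications):

```python
def metabolite_ratios(parameters, parent, parameter="AUCLAST"):
    """Compute metabolite-to-parent ratios for a selected parameter.

    parameters: dict mapping analyte -> {parameter_name: value}.
    Every analyte other than the declared parent is treated as a
    metabolite, mirroring the automatic detection described above.
    Illustrative sketch only.
    """
    parent_value = parameters[parent][parameter]
    return {
        analyte: vals[parameter] / parent_value
        for analyte, vals in parameters.items()
        if analyte != parent
    }
```

For example, with a parent AUCLAST of 200 and metabolite AUCLAST values of 50 and 100, the derived ratios are 0.25 and 0.5.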
Please specify Req. SR-AN-001.2, in particular the term "terminal elimination phase constant".
Response: The terminal elimination phase rate constant is a standard pharmacokinetic parameter that is generated in the course of performing the pharmacokinetic analysis of a single pharmacokinetic concentration-time profile. The details of the computation for this parameter are contained within the Computation Engine Specification document and involve the log-linear regression of the selected data points from the profile to generate a slope parameter, i.e. the KEL or terminal elimination phase rate constant.
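A bare-bones sketch of that log-linear regression yielding KEL (point selection, FLAG-based exclusions, and acceptance criteria such as R^2 and span ratio are omitted here; those are governed by the Computation Engine Specification):

```python
import math

def kel_regression(times, concs):
    """Fit ln(C) = a + b*t by ordinary least squares over the selected
    KEL range; KEL is the negative slope, and the terminal half-life
    follows as ln(2)/KEL. Illustrative sketch only."""
    n = len(times)
    logs = [math.log(c) for c in concs]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
             / sum((t - mean_t) ** 2 for t in times))
    kel = -slope
    return kel, math.log(2) / kel
```

On a perfectly mono-exponential profile C = 100·exp(-0.1·t), the fit recovers KEL = 0.1 exactly.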
Referring to Req. SR-AN-088, should the flags mentioned in this requirement be visible for other users?
Response: In general, unless further restricted or protected from unblinding, all of the data in the system should be generally accessible to other users. However, individual organizations that adopt this system should be able to configure the system according to corporate or local policy that may indeed restrict access from user to user without regard to formal data restrictions or study/data blinding/unblinding policies. This should be configurable by the institution using the system.
Referring to Req. SR-RR-003, is "Pharmacology Guidance Document, Chapter 23" a public document or some proprietary property? Please provide a link to this document or specify the content of Chapter 23 you are referring to.
Response: The Chapter 23 document referred to is, at the moment, proprietary to one of the consortium member organizations. Suitable alternative information and context will be made available through the Computation Engine Specification document.
Referring to Req. SR-RR-006, please specify the "BA/BE/FE" abbreviations in this context.
Response: These are standard Clinical Pharmacology Clinical Trial experimental designs:
BA = Bioavailability study design
BE = Bioequivalence study design
FE = Food Effect study design
There are a number of other experimental designs that will generate data that this system will support; these include but are not limited to SAD (Single Ascending Dose), MAD (Multiple Ascending Dose), ADME (Absorption, Distribution, Metabolism, Excretion), Special Population (hepatic insufficiency, renal insufficiency, pediatric, micro-dosing, TQT) and so on. Further, this system/application will permit management of data that both require and do not require non-compartmental analyses (NCA) and thus can be the central application (depending upon how an organization implements or chooses to deploy it) that an organization uses both to analyze data through NCA and to manage sparse PK sample collection design studies and data.
Referring to Req. SR-RR-047, please specify the format "Test vs. Reference defined by user".
Response: The system will need to accept multiple file formats as defined by the user. The specific formats will be defined during the scoping phase of the project.
Referring to Req. SR-QCR-026, please specify the kind of notification, shall it be e-mail notifications?
Response: Notifications are expected to be possible via email or via RSS feed. Other options will be considered in addition.
Please specify the following requirement (no number): “NCA tool to integrate with other tools e.g. Spotfire”.
Response: The NCA Tools must at a minimum integrate with a locally installed and controlled (for example, a Docker installation) instance of R and/or RStudio, and meet minimum requirements for integration with Shiny apps, i.e., a web server such as NGINX or Apache. Other integrations may be desirable, and provision of web services APIs or APIs that provide integration with other visualization tools such as Spotfire can be considered.
Please comment on the commercialization of the POSSC NCA Application System
Response: During the webinar, POSSC indicated that commercialization would be desirable and potentially even encouraged at some point in the project. However, a specific policy and approach with regards to the potential for commercial utilization of the product of POSSC’s efforts has not yet been formalized, and will likely be determined in the future as a component of selecting a specific open source license for the product or products that POSSC “owns” or manages.
Do you foresee that one or more product managers from the consortium will be available as product owners during the implementation phase?
The basic idea is that the consortium and the representatives from the member organizations will play an active role in design, testing, and implementation.
On the last session, you mentioned that the Excel file with requirements needs to be revised. Is the new version available?
It will be made available via the website ASAP. Please note that the requirements are accurate, it is only the categories taken from the internal documents that have been jumbled.
Do you envision some kind of push notification or messaging capability to enable communication around QC requests, data updates, etc.?
Response: Capabilities like that should be possible; there are requirements that refer to this. It could be role based. The underlying technology could be email, an RSS feed per project or event, etc. The specifics are open to consideration.
On transformation: do you plan to use R for transformation only, or for initial data loading and parsing as well?
Response: The use of R throughout the environment is feasible, but there may be situations where other approaches/methods are more appropriate; this is open to discussion, based upon cost, ease of use, etc. R will likely be used only for transforming data, not necessarily for loading data.
There have been a number of mentions regarding utilization of Shiny applications as services to build out independent functionality. Will it be up to the vendor(s) to determine the data interchange format(s), or will guidance be provided as to the implementation? This is especially relevant given that Shiny does not really support traditional mechanisms for external interaction through RPC/REST, owing to the stateful, websocket-based communication within the applications.
Response: This comes down to implementation. There needs to be a way to get data in and out, whether via Docker or another approach; we haven't gotten to that level of detail in the requirements at this point to propose a particular path. Some of these suggestions aim to maintain a fully open-source product. Other solutions that require the purchase of licenses aren't as interesting to POSSC.
Would a basic solution that is open-source and more robust solution that requires licensing be appealing to the consortium?
Response: Yes, that is something the consortium would consider, but the consortium would default to open-source solutions to make the solution broadly available.
When you think about a minimal viable product, would you consider limiting data formats and data organization, for example to CDISC-based datasets?
Response: That wouldn't meet the characteristics of an MVP. CDISC data alone can be transformed outside the system, there is a lot of legacy data, and some organizations don't utilize CDISC, so it is important to support formats beyond CDISC.
As per the provided material, it seems there is a plan to use a visual programming language (like Pipeline Pilot or LabVIEW) for workflow creation. Is that a correct understanding? Do you have any reference visual language in mind that fits the current requirements?
Response: The system isn't meant to be workflow oriented but to capture data lineage. We have not had specific discussions about using Pipeline Pilot or other visual languages.
Many companies used to implement their solutions as standalone applications, but nowadays they are moving away from standalone software and migrating from desktop to web-based and cloud solutions for many reasons. What do you think about skipping the standalone step of this evolution and implementing the solution in a modern (web) way?
Response: The application we are hoping to build should have utility in a variety of use cases. We want to be able to use the application in the context of a workflow-based application where one of the transformations is an NCA that "feels" like a desktop application. There are other circumstances where a simple desktop application makes sense. With that in mind, the solution could be developed with web-based technologies and be hosted on the desktop. There are organizations in the consortium that lobbied for a desktop application. A desktop application also makes it easier to share the application with academics and regulators without requiring additional infrastructure.
In going over the Excel spreadsheet, are you sure the requirements match? It looks like only one row was sorted and not the others. Please verify the integrity of the spreadsheet.
Response: We are not sure how that happened. We will fix it and repost the spreadsheet to the POSSC website.
Regarding open-source licensing and ownership: would something that already exists but isn't licensable be considered?
Response: The consortium will serve as project manager for the open-source project. POSSC does not necessarily need to own the output, but needs to ensure that the product is open source and well controlled from a quality perspective. POSSC will consider how to handle licensing challenges. Incorporation of components with varying licensing models will be made transparent, to ensure that it is possible to clearly understand all considerations for adopting the software, either for use or for potential extension.
There were a lot of requirements about databases, sources of data, data warehouses, reporting, mappings, and workflows. The documentation provided describes more of an enterprise solution than just a PK solution.
Response: What we are hoping to do is develop a modular system that can plug into databases and data warehouses, either as standalone databases or using APIs to connect to existing resources, with authentication and access control and logging for audit purposes. The PK software is the core, but there is interest in building a software solution that supports multiple use cases and environments, including standalone operation, incorporation into diverse pharmacometrics workflows, and interfacing with an enterprise environment that supports data repositories, controlled access, audit and logging, and data access controls.
There are signs you want to use R and Shiny Server as part of your solution; is this a goal?
Response: Yes. POSSC has decided that basing the core pharmacokinetic underpinnings of the solution on a widely utilized statistical software environment, such as R, that the discipline already has significant exposure to and experience with has substantial benefits. Using R will also provide capabilities to extend the solution without substantial development effort, i.e. enabling the creation and execution of user-defined R scripts/programs, as well as the incorporation and execution of Shiny applications that provide new modules to extend the base capabilities.
What is the expectation towards validation, since this tool will be used in a controlled environment?
Response: We are looking at well-defined and accepted software development lifecycle practices, including appropriate Installation and Operational Qualification test protocols that can and should be used by adopters of the new application. Further, there will be extensive qualification of the pharmacokinetic R-based computations with reference/test cases, with alignment to the literature where possible and to industry practices.
It will be hard to keep the whole system validated if the user is adding his/her own scripts.
Response: We will have a validation test suite that tests the system, installed according to specifications, against reference results. POSSC project governance will enforce qualification and documentation of the system components and control of the base code. A carefully implemented and controlled approach leveraging application containerization is envisioned to ensure that the R computation environment, with the delivered base computation code, remains valid. If users or organizations develop their own scripts and Shiny applications, they will be required to qualify those scripts and appropriately evaluate the quality of the results; there is no substitute for that qualification.
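To make the idea of testing against reference results concrete, a suite of this kind can be sketched as follows. This is purely illustrative Python; the delivered suite would exercise the R-based computation engine, and the case ID, values, and `auc_last` helper here are all hypothetical:

```python
# Illustrative qualification-suite sketch: each reference case pairs an input
# concentration-time profile with an independently verified parameter value,
# and the run fails if any computed value drifts outside tolerance.
import math

def auc_last(times, concs):
    """Linear trapezoidal AUC to the last sample (stand-in for the engine call)."""
    return sum((concs[i] + concs[i + 1]) / 2 * (times[i + 1] - times[i])
               for i in range(len(times) - 1))

REFERENCE_CASES = [
    {"id": "REF-001",                      # hypothetical case identifier
     "times": [0, 1, 2, 4, 8],             # hours
     "concs": [0.0, 10.0, 8.0, 4.0, 1.0],  # ng/mL
     "expected_auc_last": 36.0},           # independently verified value
]

def run_suite(rel_tol=1e-6):
    """Return a list of (case_id, computed_value) for every failing case."""
    failures = []
    for case in REFERENCE_CASES:
        got = auc_last(case["times"], case["concs"])
        if not math.isclose(got, case["expected_auc_last"], rel_tol=rel_tol):
            failures.append((case["id"], got))
    return failures
```

An empty list from `run_suite()` would signal that all reference cases pass at the chosen tolerance.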
If some parts of the system use open-source components, who is responsible for producing the validation of the components that are "dragged in" to build the system?
Response: It would be a shared responsibility, incorporated as a qualification effort within the SDLC process, with the specifics worked out through engagement with the selected vendor(s) and contributors. The consortium will play a significant role in influencing the scientific integrity of the system (e.g. involving the definition of test cases).
I had assumed the validation was related to a given level of the software for given use cases (i.e. the R component would be validated based upon a set of use cases), then the next layer validated, and then the next layer on top of that… There may be more components that may come together.
Response: What you suggest is clear. Each of the modules would need its own validation documentation and testing. The system as a whole, of course, would be delivered with IQ/OQ and reference testing for each component, to ensure that an appropriately qualified installation of the whole system is achievable.
There is a consortium working on an NCA data structure within CDISC, which will likely be released next year. The draft of this standard would be good to use on this project; having a standard data structure will make things much easier.
Response: The intent is to support multiple ways of getting data into the system, since it will support industry and academia, for example. The CDISC ADaM standard for NCA may be a good target. There is interest in the consortium in reading in, and being able to output, CDISC data formats. It is a good suggestion to also incorporate elements of the CDISC NCA data structure when it becomes available.
How do you envision the process of selecting the scope? How big is the group, and how will decisions be made?
Response: We are building out the RFI criteria to score responses. Every member of the consortium has a vote in making decisions. Exactly how we will do it hasn't been fully worked out, but there will likely be technical steering committees and working groups that will work with the provider to make recommendations to the Board.
On the funding side, the consortium is focused on steps toward technical feasibility. Can you say anything about the participation model for the consortium members, assuming you get through the technical gates? This will be a project of considerable scope.
Response: The actual funding ask to the consortium members will depend on the final scope and the vendor proposals that come through. The participation model is focused on pharma in order to build something that is broadly applicable. We have 11 companies participating at the moment and anticipate that number will grow. Longer term it won't just be pharma; we anticipate and will encourage broader participation (academia, regulators). Funding will come from the member organizations.
Is there a list of examples of R-scripts or Shiny applications that will be used for the final solution?
Response: Yes, there is. We will make that list available shortly.
Would you please explain the rationale behind your suggestion to open source the platform?
Response: POSSC is committed long-term to community contribution to its efforts. The goal is to support a code base developed and made available in the open, with broad opportunity for users, adopters, organizations, and reviewers to inspect exactly how the software is constructed, what assumptions are made, etc. This is critical for discipline and community acceptance and promotes scientific integrity. Ultimately, we anticipate that the community will extend from pharma to pharmaceutical regulators, academia (for research and didactic purposes), CROs, and other interested parties.
Could you provide more details about the desktop application usage scenario, so we can understand the need for a desktop application?
Response: The membership of POSSC utilizes a wide range of use-case scenarios. Some have highly integrated existing environments that incorporate web-based applications and backend infrastructure (databases, authorization systems, etc.), and others utilize applications executed from the desktop without databases. The intent is to support:
Standalone desktop utilization
Execution with integration through web-services for audit/logging, database, authentication/access controls, etc.
Execution in alternative pharmacometrics workflows incorporating the application into other environments as a standalone tool
Who would be the main users of the NCA Platform?
Response: Clinical and pre-clinical scientists with a requirement to analyze pharmacokinetic data utilizing non-compartmental pharmacokinetic analysis methodology.
What user scenarios will be the most important and popular according to your plan?
Response: These will be broad, due to the anticipated interest in the project and the projected user community, but the initial focus is on utilization within common pharmaceutical development and discovery environments, i.e. for the analysis of clinical and pre-clinical pharmacokinetic data.
Do you have any timelines planned for the first version of the NCA Platform go-live?
Response: The purpose of the RFI is to elicit the scope of work and cost in order to benchmark the development of a first version of the NCA application. POSSC will formalize business plans based on its assessment of the RFI responses.
Are requirements 91-94 applicable to this RFI?
Response: Yes. These requirements provide some of the details concerning the R package that will implement the Computation Engine, which will address data handling and the core pharmacokinetic parameter computations.
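For illustration only, the kind of core parameter computation the Computation Engine would perform can be sketched in a few lines. This sketch is in Python with an invented example profile; the actual engine described in requirements 91-94 would be an R package:

```python
# Illustrative NCA parameter sketch: Cmax/Tmax extraction and a terminal
# half-life from a log-linear least-squares fit of the final samples.
import math

def cmax_tmax(times, concs):
    """Peak concentration and its time of occurrence."""
    cmax = max(concs)
    return cmax, times[concs.index(cmax)]

def terminal_half_life(times, concs, n_points=3):
    """Half-life from a log-linear fit of the last n_points samples."""
    t = times[-n_points:]
    y = [math.log(c) for c in concs[-n_points:]]
    mt, my = sum(t) / len(t), sum(y) / len(y)
    slope = (sum((ti - mt) * (yi - my) for ti, yi in zip(t, y))
             / sum((ti - mt) ** 2 for ti in t))
    lambda_z = -slope                # terminal elimination rate constant
    return math.log(2) / lambda_z

# Invented example profile (hours, ng/mL) for demonstration only.
profile_t = [0, 1, 2, 4, 8]
profile_c = [0.0, 10.0, 8.0, 4.0, 1.0]
```

With this profile, Cmax is 10 ng/mL at 1 h, and the terminal half-life evaluates to 2 h, since the concentration halves every 2 h over the final samples.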
What will be the main business processes implemented?
Response: This question will be discussed within POSSC during our next teleconference, and a response will be posted on the POSSC website.
Are there any project examples that can be provided to illustrate the request?
Response: This question will be discussed within POSSC during our next teleconference, and a response will be posted on the POSSC website.
In terms of the request for interim software versions, could you describe the minimal viable product?
Response: POSSC is focused on implementing the common set of NCA requirements for the full solution, which are reflected in the RFI. It is important for developing our business case for this project to have proposals that reflect the full solution, so the concept of a minimal viable product (MVP) has not been discussed broadly in the group beyond the consensus requirements. However, we anticipate that the vendor chosen to implement this software will release interim versions as part of the development process, both to receive feedback and to deliver intermediate solutions that will be of value to our members prior to completion of the project. The prioritization of requirements in the interim solutions would constitute our MVP and would be decided in collaboration with the vendor during the scoping phase.