How to integrate machine learning into technical design patterns

Dr. Steven Gustafson is CTO of Noonum and an AI scientist passionate about solving tough problems while having fun and building great teams.

As an AI scientist, AI product owner, and senior scientist, I have developed many end-to-end machine learning (ML) and artificial intelligence (AI) systems, and I have seen the nuances of ML systems that software engineering managers often do not consider. As the CTO of a startup for the past three years, I’ve had the opportunity to explore the integration of ML into core design patterns. With a CTO or engineering manager who understands both AI and software, the best design architectures and management patterns can go beyond traditional software applications, such as databases and web applications, to better govern and optimize an ML factory.

Continuous integration and delivery

Reduce the risk of releasing buggy applications: always build with unit and integration tests, and deploy with verification tests using code that is itself under version control. After a commit to the development branch, the system is deployed to a development environment.

Once all end-to-end and manual smoke testing is complete, a manual action promotes the release to production. This process should include the ML models as well as the pipelines that run them. Gold-standard data is used to verify that the ML models and pipelines remain accurate. Rolling back to the software and database versions of a known-good release should include rolling back to the corresponding ML models and their data, all of which can be integrated and deployed automatically.
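As a sketch of what that verification could look like, the following CI step checks a candidate model against gold-standard data before promotion. The file names, the scikit-learn-style `predict` call, and the 0.95 accuracy floor are illustrative assumptions, not a prescribed setup.

```python
# ci_verify_model.py - hypothetical CI verification step run before promotion.
import json
import pickle

ACCURACY_FLOOR = 0.95  # assumed minimum quality bar; tune per model


def load_gold_standard(path="gold_standard.json"):
    """Load curated (features, label) pairs kept under version control."""
    with open(path) as f:
        return json.load(f)


def verify(model_path="model.pkl"):
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    examples = load_gold_standard()
    correct = sum(
        model.predict([ex["features"]])[0] == ex["label"] for ex in examples
    )
    accuracy = correct / len(examples)
    # Fail the build (non-zero exit) if the model regressed on known answers.
    assert accuracy >= ACCURACY_FLOOR, f"accuracy {accuracy:.3f} below floor"


if __name__ == "__main__":
    verify()
```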

Infrastructure as code

You should also try to avoid improper infrastructure deployment or configuration. Use code to specify the infrastructure, and run scripts to rebuild and verify the infrastructure the system requires. Likewise, the infrastructure required to build and test the ML models and to run them in production should be defined as code. Once all infrastructure associated with ML model development and deployment is specified as code, it can be updated to reflect changes in the ML models or their usage, or rolled back to match the last working ML model as needed.
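In practice this is done with a dedicated infrastructure-as-code tool, but the core idea can be sketched in plain Python: declare the resources the ML pipeline needs as version-controlled data, then verify the live environment against that declaration. The resource fields and the `describe_live_instance` stub below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingInfra:
    """Declared, version-controlled infrastructure for model training."""
    instance_type: str
    gpu_count: int
    storage_gb: int


# The declaration lives in the repo next to the model code, so infrastructure
# changes are reviewed, tested, and rolled back together with model changes.
DECLARED = TrainingInfra(instance_type="gpu-large", gpu_count=2, storage_gb=500)


def describe_live_instance() -> TrainingInfra:
    """Hypothetical stand-in for querying the cloud provider's API."""
    return TrainingInfra(instance_type="gpu-large", gpu_count=1, storage_gb=500)


def verify() -> None:
    live = describe_live_instance()
    if live != DECLARED:
        raise RuntimeError(f"infrastructure drift: declared={DECLARED}, live={live}")


verify()
```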

End-to-end testing

Manual smoke testing is always useful, and keeping the tests fresh and current with new features, use cases, and data is an ongoing task. ML model predictions are no different. If part of the app is driven by an ML model's recommendations, identify the assertions that can be made, e.g., that there should be at least five suggestions in the app, that an email notification is constructed correctly, or that the model handles missing data as expected. A broken ML model or pipeline should not be released, as it will produce bad results in the app.
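As one illustration, an end-to-end test suite for a recommendation endpoint might assert those claims directly. The staging URL, request shape, and response fields here are assumptions for the sketch.

```python
# test_recommendations_e2e.py - hypothetical end-to-end checks for an
# ML-backed recommendation endpoint, runnable under pytest.
import requests

BASE_URL = "http://staging.example.com"  # assumed staging deployment


def test_returns_at_least_five_suggestions():
    resp = requests.post(f"{BASE_URL}/recommendations", json={"user_id": 42})
    assert resp.status_code == 200
    assert len(resp.json()["suggestions"]) >= 5


def test_handles_missing_fields():
    # The pipeline should degrade gracefully, not crash, on missing data.
    resp = requests.post(f"{BASE_URL}/recommendations", json={})
    assert resp.status_code in (200, 400)  # anything but a 5xx crash
```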

Alerts and notifications about processes

In an ML system, data comes in, models are run, the output of the models is stored and analyzed, and application tables are built. All of these processes either run as regularly scheduled jobs or are driven by a queue or eventing system.

If a script fails, log the error, push it to an alerting dashboard (to debug later), and notify collaborators via email, Slack, or another method. If a notification proves redundant, adjust the alert: every alert should point to a cause that requires action. Just as you would track latency in an application, store and analyze the results of the ML model so you can detect data changes, infrastructure capacity pressures, or simply unexpected shifts in the kinds of predictions being made.
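A minimal sketch of that failure path, assuming a Slack incoming-webhook URL (the webhook address and the job being run are placeholders):

```python
import json
import logging
import urllib.request

SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX"  # placeholder URL
logger = logging.getLogger("ml_pipeline")


def notify_slack(text: str) -> None:
    """Post a message to a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode()
    req = urllib.request.Request(
        SLACK_WEBHOOK, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)


def run_job(job):
    try:
        job()
    except Exception:
        # Log with traceback for the alerting dashboard, then page collaborators.
        logger.exception("scheduled ML job failed")
        notify_slack("ML pipeline job failed - see alerting dashboard")
        raise
```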

Model testing and version control

Unit testing and version control are standard for most software programs, but not for ML model development or the underlying data. ML models are notorious for producing unintended results on new data. First, apply version control to the code used to generate the model from a specific data set, and put that data set under version control as well. The data and model must be aligned to meet replicability and rollback requirements.
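One lightweight way to keep model and data aligned, sketched here with hypothetical file names, is to record a content hash of the training data in a manifest that is committed alongside the model artifact:

```python
import hashlib
import json


def sha256_of(path: str) -> str:
    """Content hash of a file, streamed in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


# Written at training time and committed with the model artifact, so a
# rollback can always retrieve the exact data the model was built from.
manifest = {
    "model": "model_v12.pkl",          # hypothetical artifact name
    "data": "training_2023_q1.csv",    # hypothetical data set
    "data_sha256": sha256_of("training_2023_q1.csv"),
}
with open("model_v12.manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)
```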

To deploy a new model, version the model change as a “commit,” much like any repository update; the new model is then fetched and inserted into the development pipeline. The deployment process must run prediction tests on validation data (used only at this step) to ensure that the expected level of quality has been maintained, and alerts should be issued when accuracy degrades or falls below a minimum tolerance.
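The quality gate itself can be small. In the sketch below, `evaluate`, the tolerances, and the reuse of the earlier hypothetical `notify_slack` helper are all illustrative assumptions:

```python
MIN_TOLERANCE = 0.90       # hard floor: never deploy below this accuracy
PREVIOUS_ACCURACY = 0.94   # recorded for the currently deployed model


def evaluate(model, validation_set) -> float:
    """Hypothetical metric: fraction of validation examples predicted correctly."""
    correct = sum(model.predict([x])[0] == y for x, y in validation_set)
    return correct / len(validation_set)


def gate_deployment(model, validation_set) -> bool:
    accuracy = evaluate(model, validation_set)
    if accuracy < MIN_TOLERANCE:
        notify_slack(f"deploy blocked: accuracy {accuracy:.3f} below tolerance")
        return False
    if accuracy < PREVIOUS_ACCURACY:
        # Degraded but above the floor: deployment proceeds, humans are alerted.
        notify_slack(f"accuracy degraded: {accuracy:.3f} < {PREVIOUS_ACCURACY:.3f}")
    return True
```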

Functional style architecture

ML systems require a significant amount of data processing and transformation. ML systems are usually developed incrementally, and the result tends to be very procedural: data is assembled in a file, cleaned and loaded into a spreadsheet, processed and clustered, passed through a model, and stored in a database. This initial design is useful for building the model, understanding the data, and getting a handle on performance. However, when this pattern is replicated in the deployed application, the application can become very complex and difficult to maintain.

Instead, a functional approach performs discrete transformations on data and passes the results to the next stage, which makes processes easier to optimize and manage and reduces memory and storage requirements while increasing efficiency. To prevent the data system from being overloaded by storing predictions from the ML system, queuing and messaging systems are used to manage the volume of events. Additionally, since elastic cloud systems can lose nodes from time to time, queuing and messaging can buffer data coming into the ML system and ensure everything is processed.
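The functional style might look like the following sketch, where each stage is a small pure function and the stage bodies are placeholders:

```python
from functools import reduce


def clean(records):
    """Drop records with missing fields (placeholder rule)."""
    return [r for r in records if all(v is not None for v in r.values())]


def featurize(records):
    """Turn raw records into model-ready feature dicts (placeholder)."""
    return [{"length": len(r["text"])} for r in records]


def predict(features):
    """Stand-in for a real model call."""
    return [{"score": f["length"] % 2} for f in features]


PIPELINE = [clean, featurize, predict]


def run(records):
    # Each stage consumes the previous stage's output and holds no state,
    # so stages can be tested, replaced, or scaled independently.
    return reduce(lambda data, stage: stage(data), PIPELINE, records)


print(run([{"text": "hello"}, {"text": None}]))
```

Because each stage is stateless, any stage can also be placed behind a queue or messaging system without touching the rest of the pipeline.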

By supporting ML model building, updates, and releases as part of the overall software development process, you create a more robust system overall. In this way, you can better serve customers and users while making the lives of your engineers and scientists more satisfying and efficient.

